Re: [PATCH] make MADV_FREE lazily free memory

2007-04-11 Thread Eric Dumazet

Rik van Riel wrote:

Eric Dumazet wrote:

Rik van Riel wrote:

Make it possible for applications to have the kernel free memory
lazily.  This reduces a repeated free/malloc cycle from freeing
pages and allocating them, to just marking them freeable.  If the
application wants to reuse them before the kernel needs the memory,
not even a page fault will happen.


I don't understand this last sentence. If not even a page fault
happens, how does the kernel know that the page was eventually reused by
the application, and should not be freed in case of memory pressure?


Before maybe freeing the page, the kernel checks the referenced
and dirty bits of the page table entries mapping that page.


ptr = mmap(some space);
madvise(ptr, length, MADV_FREE);
/* kernel may free the pages */


All this call does is:
- clear the accessed and dirty bits
- move the page to the far end of the inactive list,
  where it will be the first to be reclaimed


sleep(10);

/* what must the application do now before reusing the space? */
memset(ptr, data, 1);
/* kernel should not free ptr[0..1] now */


Two things can happen here.

If this program used the pages before the kernel needed
them, the program will be reusing its old pages.


Ah ok, this is because the accessed/dirty bits are set by hardware and not by
a page fault. Is that true for all architectures?




If the kernel got there first, you will get page faults
and the kernel will fill in the memory with new pages.


perfect



Both of these alternatives are transparent to userspace.
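
A minimal userspace sketch of the pattern discussed above, assuming a kernel
and libc that expose MADV_FREE. The fallback define and the helper name
lazy_free_roundtrip are made up for illustration and are not part of the
patch:

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Assumption: on headers without MADV_FREE, fall back to MADV_DONTNEED so
 * the sketch still builds; that variant frees eagerly rather than lazily. */
#ifndef MADV_FREE
#define MADV_FREE MADV_DONTNEED
#endif

/* Hypothetical helper: map, dirty, hint, reuse. Returns 0 on success. */
static int lazy_free_roundtrip(size_t len)
{
	unsigned char *p;
	int ok;

	assert(len > 0);
	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	memset(p, 0xaa, len);			/* dirty the pages */
	(void)madvise(p, len, MADV_FREE);	/* only a hint; failure is ignored */

	/* Reuse the range without asking the kernel for anything. */
	memset(p, 0x55, len);
	ok = (p[0] == 0x55 && p[len - 1] == 0x55);

	munmap(p, len);
	return ok ? 0 : -1;
}
```

Whether or not the kernel reclaimed the pages in between, the caller just
writes to the range again and sees ordinary memory.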



Thanks a lot for these clarifications. This will fly :)
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: kernel BUG at net/core/skbuff.c in linux-2.6.21-rc6

2007-04-11 Thread Patrick McHardy
Bartek wrote:
> Hopefully, this time my bug report should be ok :):
> 
> Apr 11 23:53:38 localhost pppd[31289]: rcvd [proto=0x7689] e1 cd 33 f6
> fd f7 52 e6 58 c9 73 98 bc ff ad d5 b5 a3 e5 d9 1e 77 76 0a 1c 87 59
> bf 44 cc ac 3b ...
> Apr 11 23:53:38 localhost pppd[31289]: Unsupported protocol 0x7689 received
> Apr 11 23:53:38 localhost pppd[31289]: sent [LCP ProtRej id=0x9 76 89
> e1 cd 33 f6 fd f7 52 e6 58 c9 73 98 bc ff ad d5 b5 a3 e5 d9 1e 77 76
> 0a 1c 87 59 bf 44 cc ...]
> Apr 11 23:53:38 localhost pppd[31289]: rcvd [proto=0xda7d] 15 19 45 3c
> e0 ac 44 92 3b c4 8e 75 6b b8 4a 9f 4a 3a 22 63 d3 a1 56 98 47 62 bc
> cd a6 8e d5 77 ...
> Apr 11 23:53:38 localhost pppd[31289]: Unsupported protocol 0xda7d received
> Apr 11 23:53:38 localhost pppd[31289]: sent [LCP ProtRej id=0xa da 7d
> 15 19 45 3c e0 ac 44 92 3b c4 8e 75 6b b8 4a 9f 4a 3a 22 63 d3 a1 56
> 98 47 62 bc cd a6 8e ...]
> Apr 11 23:53:40 localhost kernel: skb_under_panic: text:f8c62c0e
> len:291 put:1 head:ddc94800 data:ddc947ff tail:ddc94922 end:ddc94e00
> dev:


It seems we fail to reserve enough headroom for the case
buf[0] == PPP_ALLSTATIONS and buf[1] != PPP_UI.

Can you try this patch please?

diff --git a/drivers/net/ppp_async.c b/drivers/net/ppp_async.c
index 933e2f3..c68e37f 100644
--- a/drivers/net/ppp_async.c
+++ b/drivers/net/ppp_async.c
@@ -890,6 +890,8 @@ ppp_async_input(struct asyncppp *ap, const unsigned char *buf,
 			ap->rpkt = skb;
 		}
 		if (skb->len == 0) {
+			int headroom = 0;
+
 			/* Try to get the payload 4-byte aligned.
 			 * This should match the
 			 * PPP_ALLSTATIONS/PPP_UI/compressed tests in
@@ -897,7 +899,10 @@ ppp_async_input(struct asyncppp *ap, const unsigned char *buf,
 			 * enough chars here to test buf[1] and buf[2].
 			 */
 			if (buf[0] != PPP_ALLSTATIONS)
-				skb_reserve(skb, 2 + (buf[0] & 1));
+				headroom += 2;
+			if (buf[0] & 1)
+				headroom += 1;
+			skb_reserve(skb, headroom);
 		}
 		if (n > skb_tailroom(skb)) {
 			/* packet overflowed MRU */
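
The effect of the change on the reserved headroom can be sketched in plain C.
This assumes the buf[0] & 1 test is independent of the PPP_ALLSTATIONS test,
as the description implies; the two helper names are made up for
illustration:

```c
#include <assert.h>
#include <stddef.h>

#define PPP_ALLSTATIONS 0xff	/* all-stations address byte from the PPP headers */

/* Headroom reserved by the code before the patch. */
static int headroom_before(const unsigned char *buf)
{
	assert(buf != NULL);
	if (buf[0] != PPP_ALLSTATIONS)
		return 2 + (buf[0] & 1);
	return 0;
}

/* Headroom reserved with the patch applied. */
static int headroom_after(const unsigned char *buf)
{
	int headroom = 0;

	assert(buf != NULL);
	if (buf[0] != PPP_ALLSTATIONS)
		headroom += 2;
	if (buf[0] & 1)
		headroom += 1;
	return headroom;
}
```

For a frame starting with 0xff the old code reserves nothing, while the new
code reserves one byte, which is the one-byte push (put:1) that
skb_under_panic reports in the log above.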


Re: tmpfs and the OOM killer

2007-04-11 Thread Pedro
On Thursday 12 April 2007 02:04, Al Boldi wrote:
> > Pedro wrote:
> >   2) How should an application be written to not be killed by OOM?
>
> Try this:
>
> # echo -17 > /proc//oom_adj

  I should have known that running a fail-safe application is a superuser
  privilege.

  Sorry for wasting your time.


[PATCH 4/4] Pte simplify ops.patch

2007-04-11 Thread Zachary Amsden
Add a comment and condense the code to make use of the
native_local_ptep_get_and_clear function.  Also, it turns out the 2-level and
3-level paging definitions were identical, so move the common definition into
pgtable.h.

Signed-off-by: Zachary Amsden <[EMAIL PROTECTED]>

diff -r b3bbc1b5e085 include/asm-i386/pgtable-2level.h
--- a/include/asm-i386/pgtable-2level.h Wed Apr 11 18:23:44 2007 -0700
+++ b/include/asm-i386/pgtable-2level.h Wed Apr 11 18:24:07 2007 -0700
@@ -39,16 +39,6 @@ static inline void native_pte_clear(stru
 static inline void native_pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *xp)
 {
 	*xp = __pte(0);
-}
-
-/* local pte updates need not use xchg for locking */
-static inline pte_t native_local_ptep_get_and_clear(pte_t *ptep)
-{
-   pte_t res;
-
-   res = *ptep;
-   native_pte_clear(NULL, 0, ptep);
-   return res;
 }
 
 #ifdef CONFIG_SMP
diff -r b3bbc1b5e085 include/asm-i386/pgtable-3level.h
--- a/include/asm-i386/pgtable-3level.h Wed Apr 11 18:23:44 2007 -0700
+++ b/include/asm-i386/pgtable-3level.h Wed Apr 11 18:23:49 2007 -0700
@@ -139,16 +139,6 @@ static inline void pud_clear (pud_t * pu
 #define pmd_offset(pud, address) ((pmd_t *) pud_page(*(pud)) + \
pmd_index(address))
 
-/* local pte updates need not use xchg for locking */
-static inline pte_t native_local_ptep_get_and_clear(pte_t *ptep)
-{
-   pte_t res;
-   
-   res = *ptep;
-   native_pte_clear(NULL, 0, ptep);
-   return res;
-}
-
 #ifdef CONFIG_SMP
 static inline pte_t native_ptep_get_and_clear(pte_t *ptep)
 {
diff -r b3bbc1b5e085 include/asm-i386/pgtable.h
--- a/include/asm-i386/pgtable.h Wed Apr 11 18:23:44 2007 -0700
+++ b/include/asm-i386/pgtable.h Wed Apr 11 18:23:49 2007 -0700
@@ -267,6 +267,16 @@ static inline pte_t pte_mkhuge(pte_t pte
 #define pte_update_defer(mm, addr, ptep)   do { } while (0)
 #endif
 
+/* local pte updates need not use xchg for locking */
+static inline pte_t native_local_ptep_get_and_clear(pte_t *ptep)
+{
+   pte_t res = *ptep;
+
+   /* Pure native function needs no input for mm, addr */
+   native_pte_clear(NULL, 0, ptep);
+   return res;
+}
+
 /*
  * We only update the dirty/accessed state if we set
  * the dirty bit by hand in the kernel, since the hardware
@@ -348,8 +358,11 @@ static inline pte_t ptep_get_and_clear_f
 {
 	pte_t pte;
 	if (full) {
-		pte = *ptep;
-		native_pte_clear(mm, addr, ptep);
+		/*
+		 * Full address destruction in progress; paravirt does not
+		 * care about updates and native needs no locking
+		 */
+		pte = native_local_ptep_get_and_clear(ptep);
 	} else {
 		pte = ptep_get_and_clear(mm, addr, ptep);
 	}


[PATCH 3/4] Pte xchg optimization.patch

2007-04-11 Thread Zachary Amsden
In situations where page table updates need only be made locally, and there
are no cross-processor A/D bit races involved, we need not use the heavyweight
xchg instruction to atomically fetch and clear page table entries.  Instead,
we can just read and clear them directly.

This introduces a neat optimization for non-SMP kernels; drop the atomic
xchg operations from page table updates.

Thanks to Michel Lespinasse for noting this potential optimization.

Signed-off-by: Zachary Amsden <[EMAIL PROTECTED]>

diff -r 47495b2532b3 include/asm-i386/pgtable-2level.h
--- a/include/asm-i386/pgtable-2level.h Wed Apr 11 18:23:01 2007 -0700
+++ b/include/asm-i386/pgtable-2level.h Wed Apr 11 18:23:39 2007 -0700
@@ -41,10 +41,24 @@ static inline void native_pte_clear(stru
*xp = __pte(0);
 }
 
+/* local pte updates need not use xchg for locking */
+static inline pte_t native_local_ptep_get_and_clear(pte_t *ptep)
+{
+   pte_t res;
+
+   res = *ptep;
+   native_pte_clear(NULL, 0, ptep);
+   return res;
+}
+
+#ifdef CONFIG_SMP
 static inline pte_t native_ptep_get_and_clear(pte_t *xp)
 {
 	return __pte(xchg(&xp->pte_low, 0));
 }
+#else
+#define native_ptep_get_and_clear(xp) native_local_ptep_get_and_clear(xp)
+#endif
 
 #define pte_page(x)pfn_to_page(pte_pfn(x))
 #define pte_none(x)(!(x).pte_low)
diff -r 47495b2532b3 include/asm-i386/pgtable-3level.h
--- a/include/asm-i386/pgtable-3level.h Wed Apr 11 18:23:01 2007 -0700
+++ b/include/asm-i386/pgtable-3level.h Wed Apr 11 18:23:05 2007 -0700
@@ -139,6 +139,17 @@ static inline void pud_clear (pud_t * pu
 #define pmd_offset(pud, address) ((pmd_t *) pud_page(*(pud)) + \
pmd_index(address))
 
+/* local pte updates need not use xchg for locking */
+static inline pte_t native_local_ptep_get_and_clear(pte_t *ptep)
+{
+   pte_t res;
+   
+   res = *ptep;
+   native_pte_clear(NULL, 0, ptep);
+   return res;
+}
+
+#ifdef CONFIG_SMP
 static inline pte_t native_ptep_get_and_clear(pte_t *ptep)
 {
pte_t res;
@@ -150,6 +161,9 @@ static inline pte_t native_ptep_get_and_
 
return res;
 }
+#else
+#define native_ptep_get_and_clear(xp) native_local_ptep_get_and_clear(xp)
+#endif
 
 #define __HAVE_ARCH_PTE_SAME
 static inline int pte_same(pte_t a, pte_t b)


[PATCH 1/4] Pte drop ptep_get_and_clear paravirt op.patch

2007-04-11 Thread Zachary Amsden
In shadow mode hypervisors, ptep_get_and_clear achieves the desired
purpose of keeping the shadows in sync by issuing a native_get_and_clear,
followed by a call to pte_update, which indicates the PTE has been
modified.

Direct mode hypervisors (Xen) have no need for this anyway, and will trap
the update using writable pagetables.

This means no hypervisor makes use of ptep_get_and_clear; there is no
reason to have it in the paravirt-ops structure.  Change confusing
terminology about raw vs. native functions into consistent use of
native_pte_xxx for operations which do not invoke paravirt-ops.
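
The resulting call structure can be sketched with toy types: a native
get-and-clear that never goes through the ops table, followed by the
pte_update hook that a shadow mode hypervisor would use. All toy_* names
are made up for illustration and are not the kernel definitions:

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long toy_pte_t;

/* Toy ops table keeping only the notification hook. */
struct toy_paravirt_ops {
	void (*pte_update)(toy_pte_t *ptep);
};

static int toy_updates_seen;

/* What a shadow mode backend might do: note that the PTE changed. */
static void toy_shadow_pte_update(toy_pte_t *ptep)
{
	(void)ptep;
	toy_updates_seen++;
}

static struct toy_paravirt_ops toy_ops = {
	.pte_update = toy_shadow_pte_update,
};

/* Native operation: no indirection through toy_ops. */
static toy_pte_t toy_native_ptep_get_and_clear(toy_pte_t *ptep)
{
	toy_pte_t old = *ptep;

	*ptep = 0;
	return old;
}

/* Generic operation: native clear first, then tell the hypervisor. */
static toy_pte_t toy_ptep_get_and_clear(toy_pte_t *ptep)
{
	toy_pte_t pte;

	assert(ptep != NULL);
	pte = toy_native_ptep_get_and_clear(ptep);
	toy_ops.pte_update(ptep);
	return pte;
}
```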

Signed-off-by: Zachary Amsden <[EMAIL PROTECTED]>

diff -r c02c6f5e882c arch/i386/kernel/paravirt.c
--- a/arch/i386/kernel/paravirt.c   Wed Apr 11 16:25:09 2007 -0700
+++ b/arch/i386/kernel/paravirt.c   Wed Apr 11 17:09:55 2007 -0700
@@ -315,8 +315,6 @@ struct paravirt_ops paravirt_ops = {
.pte_update = paravirt_nop,
.pte_update_defer = paravirt_nop,
 
-   .ptep_get_and_clear = native_ptep_get_and_clear,
-
 #ifdef CONFIG_HIGHPTE
.kmap_atomic_pte = kmap_atomic,
 #endif
diff -r c02c6f5e882c include/asm-i386/paravirt.h
--- a/include/asm-i386/paravirt.h   Wed Apr 11 16:25:09 2007 -0700
+++ b/include/asm-i386/paravirt.h   Wed Apr 11 17:12:03 2007 -0700
@@ -187,8 +187,6 @@ struct paravirt_ops
 	void (*pte_update)(struct mm_struct *mm, unsigned long addr, pte_t *ptep);
void (*pte_update_defer)(struct mm_struct *mm,
 unsigned long addr, pte_t *ptep);
-
-   pte_t (*ptep_get_and_clear)(pte_t *ptep);
 
 #ifdef CONFIG_HIGHPTE
void *(*kmap_atomic_pte)(struct page *page, enum km_type type);
@@ -859,12 +857,8 @@ static inline void pmd_clear(pmd_t *pmdp
PVOP_VCALL1(pmd_clear, pmdp);
 }
 
-static inline pte_t raw_ptep_get_and_clear(pte_t *p)
-{
-	unsigned long long val = PVOP_CALL1(unsigned long long, ptep_get_and_clear, p);
-   return (pte_t) { val, val >> 32 };
-}
 #else  /* !CONFIG_X86_PAE */
+
 static inline pte_t __pte(unsigned long val)
 {
return (pte_t) { PVOP_CALL1(unsigned long, make_pte, val) };
@@ -899,11 +893,6 @@ static inline void set_pmd(pmd_t *pmdp, 
 static inline void set_pmd(pmd_t *pmdp, pmd_t pmdval)
 {
PVOP_VCALL2(set_pmd, pmdp, pmdval.pud.pgd.pgd);
-}
-
-static inline pte_t raw_ptep_get_and_clear(pte_t *p)
-{
-   return (pte_t) { PVOP_CALL1(unsigned long, ptep_get_and_clear, p) };
 }
 #endif /* CONFIG_X86_PAE */
 
diff -r c02c6f5e882c include/asm-i386/pgtable.h
--- a/include/asm-i386/pgtable.h Wed Apr 11 16:25:09 2007 -0700
+++ b/include/asm-i386/pgtable.h Wed Apr 11 17:11:22 2007 -0700
@@ -265,8 +265,6 @@ static inline pte_t pte_mkhuge(pte_t pte
  */
 #define pte_update(mm, addr, ptep) do { } while (0)
 #define pte_update_defer(mm, addr, ptep)   do { } while (0)
-
-#define raw_ptep_get_and_clear(xp) native_ptep_get_and_clear(xp)
 #endif
 
 /*
@@ -340,7 +338,7 @@ do { \
 #define __HAVE_ARCH_PTEP_GET_AND_CLEAR
 static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
 {
-	pte_t pte = raw_ptep_get_and_clear(ptep);
+	pte_t pte = native_ptep_get_and_clear(ptep);
 	pte_update(mm, addr, ptep);
 	return pte;
 }


[PATCH 2/4] Pte clear optimization.patch

2007-04-11 Thread Zachary Amsden
When exiting from an address space, no special hypervisor notification of page
table updates needs to occur; direct page table hypervisors, such as Xen,
switch to another address space first (init_mm) and unprotect the page tables
to avoid the cost of trapping to the hypervisor for each pte_clear.  Shadow
mode hypervisors, such as VMI and lhype, don't need to do the extra work of
calling through paravirt-ops, and can just directly clear the page table
entries without notifying the hypervisor, since all the page tables are
about to be freed.

So introduce native_pte_clear functions which bypass any paravirt-ops
notification.  This results in a significant performance win for VMI
and removes some indirect calls from zap_pte_range.

Note the 3-level paging already had a native_pte_clear function, thus
demanding argument conformance and extra args for the 2-level definition.

Signed-off-by: Zachary Amsden <[EMAIL PROTECTED]>

diff -r 1478ce4ec9e3 include/asm-i386/pgtable-2level.h
--- a/include/asm-i386/pgtable-2level.h Wed Apr 11 17:13:10 2007 -0700
+++ b/include/asm-i386/pgtable-2level.h Wed Apr 11 18:22:51 2007 -0700
@@ -35,6 +35,11 @@ static inline void native_set_pmd(pmd_t 
 
 #define pte_clear(mm,addr,xp)  do { set_pte_at(mm, addr, xp, __pte(0)); } while (0)
 #define pmd_clear(xp)  do { set_pmd(xp, __pmd(0)); } while (0)
+
+static inline void native_pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *xp)
+{
+	*xp = __pte(0);
+}
 
 static inline pte_t native_ptep_get_and_clear(pte_t *xp)
 {
diff -r 1478ce4ec9e3 include/asm-i386/pgtable.h
--- a/include/asm-i386/pgtable.h Wed Apr 11 17:13:10 2007 -0700
+++ b/include/asm-i386/pgtable.h Wed Apr 11 18:21:43 2007 -0700
@@ -349,7 +349,7 @@ static inline pte_t ptep_get_and_clear_f
 	pte_t pte;
 	if (full) {
 		pte = *ptep;
-		pte_clear(mm, addr, ptep);
+		native_pte_clear(mm, addr, ptep);
 	} else {
 		pte = ptep_get_and_clear(mm, addr, ptep);
 	}


[PATCH 0/4] i386 - pte update optimizations

2007-04-11 Thread Zachary Amsden
Some PTE optimizations for native and paravirt-ops kernels; this
provides a huge win for shadow mode hypervisors and gets rid of
some unnecessary atomic instructions in native kernels, saving
even more on UP by getting rid of the implicit LOCK on the xchg instruction.

Zach


Re: [PATCH] NET: [UPDATED] Multiqueue network device support implementation.

2007-04-11 Thread Zhu Yi
On Wed, 2007-04-11 at 19:03 +0200, Patrick McHardy wrote:
> 
> You bring up a good point, it would be good to hear the opinion from
> one of the wireless people on this since they have their own
> multiqueue scheduler in the wireless-dev tree. 

The one in wireless-dev is pretty much like this one. It exists
only because there was no such multiqueue-aware qdisc available at
that time.

The requirement for wireless is the same as for strict PRIO, with the
addition that the hardware queue corresponding to the dequeued SKB must
be active (this is also true for other devices I think; otherwise the SKB
has to be requeued, which leads to a busy or dead loop in the end). In other
words, the dequeue method should select the SKB with the highest
priority from all the ACTIVE hardware queues (not all queues). The
wireless hardware then schedules all the packets from its 4 hardware TX
queues based on the priority and network environment.
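
The selection rule can be sketched as follows. The struct and function names
are made up for illustration and this is not the qdisc code; band 0 is
assumed to be the highest priority:

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of one priority band and its hardware TX queue. */
struct toy_band {
	int active;	/* hardware queue can currently accept packets */
	int pending;	/* packets waiting in this band */
};

/* Return the band to dequeue from: the highest-priority band that has
 * packets AND whose hardware queue is active, or -1 if there is none. */
static int toy_pick_band(const struct toy_band *bands, size_t n)
{
	size_t i;

	assert(bands != NULL);
	for (i = 0; i < n; i++) {
		if (bands[i].active && bands[i].pending > 0)
			return (int)i;
	}
	return -1;
}
```

Skipping inactive bands up front is what avoids the dequeue/requeue loop
described above.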

Thanks,
-yi


Re: tmpfs and the OOM killer

2007-04-11 Thread Pedro
On Wednesday 11 April 2007 19:39, Alan Cox wrote:
> >   2) How should an application be written to not be killed by OOM?
>
> OOM isn't an application matter. The kernel has to choose between
> allowing overcommit on the basis it might run out of memory and have to
> kill stuff, or that it won't in which case an application which correctly
> handles malloc() and similar failures will not be killed (unless it is
> out of space on a stack grow which is a C language flaw as you can't
> catch that event in C)
>
> It's configured by /proc/sys/vm/overcommit_memory
>
> 0 - try and spot obviously dumb allocations
> 1 - anything goes
> 2 - strictly control resource commit

  I deduce that a fail-safe application must scanf overcommit_memory, warn 
the user and waitpid.
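
A rough sketch of the "scanf overcommit_memory" step, with made-up helper
names; the meaning of the values 0, 1 and 2 is taken from the list quoted
above:

```c
#include <assert.h>
#include <stdio.h>

/* Parse the contents of overcommit_memory; returns 0, 1 or 2, or -1 if
 * the text is not one of the documented modes. */
static int parse_overcommit_mode(const char *text)
{
	int mode;

	assert(text != NULL);
	if (sscanf(text, "%d", &mode) != 1 || mode < 0 || mode > 2)
		return -1;
	return mode;
}

/* Read the mode from a file such as /proc/sys/vm/overcommit_memory;
 * returns -1 if the file cannot be read or parsed. */
static int read_overcommit_mode(const char *path)
{
	char buf[16];
	int mode = -1;
	FILE *f;

	assert(path != NULL);
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (fgets(buf, sizeof(buf), f))
		mode = parse_overcommit_mode(buf);
	fclose(f);
	return mode;
}
```

An application could warn the user when the result is anything other than 2,
since only strict commit accounting makes a checked malloc() failure the
normal way to run out of memory.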


Re: kernel BUG at net/core/skbuff.c in linux-2.6.21-rc6

2007-04-11 Thread Bartek

It is not enough to unload proprietary modules. As long as they have
ever been loaded at all the kernel is tainted.
You need to ensure that the proprietary modules never get loaded at
all. I guess you probably already worked that out, just wanted to
point it out just in case :-)


Hopefully, this time my bug report should be ok :):

Apr 11 23:53:38 localhost pppd[31289]: rcvd [proto=0x7689] e1 cd 33 f6
fd f7 52 e6 58 c9 73 98 bc ff ad d5 b5 a3 e5 d9 1e 77 76 0a 1c 87 59
bf 44 cc ac 3b ...
Apr 11 23:53:38 localhost pppd[31289]: Unsupported protocol 0x7689 received
Apr 11 23:53:38 localhost pppd[31289]: sent [LCP ProtRej id=0x9 76 89
e1 cd 33 f6 fd f7 52 e6 58 c9 73 98 bc ff ad d5 b5 a3 e5 d9 1e 77 76
0a 1c 87 59 bf 44 cc ...]
Apr 11 23:53:38 localhost pppd[31289]: rcvd [proto=0xda7d] 15 19 45 3c
e0 ac 44 92 3b c4 8e 75 6b b8 4a 9f 4a 3a 22 63 d3 a1 56 98 47 62 bc
cd a6 8e d5 77 ...
Apr 11 23:53:38 localhost pppd[31289]: Unsupported protocol 0xda7d received
Apr 11 23:53:38 localhost pppd[31289]: sent [LCP ProtRej id=0xa da 7d
15 19 45 3c e0 ac 44 92 3b c4 8e 75 6b b8 4a 9f 4a 3a 22 63 d3 a1 56
98 47 62 bc cd a6 8e ...]
Apr 11 23:53:40 localhost kernel: skb_under_panic: text:f8c62c0e
len:291 put:1 head:ddc94800 data:ddc947ff tail:ddc94922 end:ddc94e00
dev:
Apr 11 23:53:40 localhost kernel: [ cut here ]
Apr 11 23:53:40 localhost kernel: kernel BUG at net/core/skbuff.c:111!
Apr 11 23:53:40 localhost kernel: invalid opcode:  [#1]
Apr 11 23:53:40 localhost kernel: Modules linked in: nfs nfsd exportfs
lockd nfs_acl sunrpc button xt_TCPMSS xt_limit xt_tcpudp nf_nat_irc
nf_nat_ftp iptable_nat iptable_mangle ipt_LOG ipt_MASQUERADE nf_nat
ipt_TOS ipt_REJECT nf_conntrack_irc nf_conntrack_ftp nf_conntrack_ipv4
xt_state nf_conntrack nfnetlink iptable_filter ip_tables x_tables
ppp_async ipv6 ppp_generic slhc xfs fuse eeprom w83781d w83627hf
hwmon_vid i2c_isa ide_generic parport_pc parport i2c_viapro floppy
i2c_core serio_raw snd_via82xx snd_ac97_codec ac97_bus snd_pcm
snd_timer snd_page_alloc snd_mpu401_uart via_ircc snd_rawmidi
snd_seq_device irda rtc psmouse via_agp agpgart snd soundcore pcspkr
crc_ccitt evdev ext3 jbd mbcache usbhid ide_cd cdrom ide_disk generic
uhci_hcd usbcore via82cxxx ide_core e100 mii thermal processor fan
Apr 11 23:53:40 localhost kernel: CPU:0
Apr 11 23:53:40 localhost kernel: EIP:0060:[]Not tainted VLI
Apr 11 23:53:40 localhost kernel: EFLAGS: 00010096   (2.6.21-rc6 #3)
Apr 11 23:53:40 localhost kernel: EIP is at skb_under_panic+0x59/0x5d
Apr 11 23:53:40 localhost kernel: eax: 0072   ebx: ddc94800   ecx:
   edx: 
Apr 11 23:53:40 localhost kernel: esi:    edi: ddc94924   ebp:
ddc9491e   esp: c1ce5ed8
Apr 11 23:53:40 localhost kernel: ds: 007b   es: 007b   fs: 00d8  gs:
  ss: 0068
Apr 11 23:53:40 localhost kernel: Process events/0 (pid: 3,
ti=c1ce4000 task=dfd02030 task.ti=c1ce4000)
Apr 11 23:53:40 localhost kernel: Stack: c02c47d0 f8c62c0e 0123
0001 ddc94800 ddc947ff ddc94922 ddc94e00
Apr 11 23:53:40 localhost kernel:c02b7ed8 dfd23a60 00ff
f8c62c13 0282 dfff5c20 f7e67c00 0208
Apr 11 23:53:40 localhost kernel:f0e45d34 f0e45c34 f7e67c00
0202 e0f7d600 0006 f0e45c00 f7e67c0c
Apr 11 23:53:40 localhost kernel: Call Trace:
Apr 11 23:53:40 localhost kernel:  []
ppp_asynctty_receive+0x3b0/0x584 [ppp_async]
Apr 11 23:53:40 localhost kernel:  []
ppp_asynctty_receive+0x3b5/0x584 [ppp_async]
Apr 11 23:53:40 localhost kernel:  [] flush_to_ldisc+0xe6/0x124
Apr 11 23:53:40 localhost kernel:  [] flush_to_ldisc+0x0/0x124
Apr 11 23:53:40 localhost kernel:  [] run_workqueue+0x70/0x101
Apr 11 23:53:40 localhost kernel:  [] worker_thread+0x105/0x12e
Apr 11 23:53:40 localhost kernel:  [] default_wake_function+0x0/0xc
Apr 11 23:53:40 localhost kernel:  [] worker_thread+0x0/0x12e
Apr 11 23:53:40 localhost kernel:  [] kthread+0xa0/0xc8
Apr 11 23:53:40 localhost kernel:  [] kthread+0x0/0xc8
Apr 11 23:53:40 localhost kernel:  [] kernel_thread_helper+0x7/0x10
Apr 11 23:53:40 localhost kernel:  ===
Apr 11 23:53:40 localhost kernel: Code: 00 00 89 5c 24 14 8b 98 a0 00
00 00 89 54 24 0c 89 5c 24 10 8b 40 60 89 4c 24 04 c7 04 24 d0 47 2c
c0 89 44 24 08 e8 af c5 ef ff <0f> 0b eb fe 56 53 bb d8 7e 2b c0 83 ec
24 8b 70 14 85 f6 0f 45
Apr 11 23:53:40 localhost kernel: EIP: []
skb_under_panic+0x59/0x5d SS:ESP 0068:c1ce5ed8
Apr 11 23:54:01 localhost /USR/SBIN/CRON[32147]: (root) CMD
(/usr/local/bin/pppd_test.sh)
Apr 11 23:54:31 localhost pppd[31289]: No response to 5 echo-requests
Apr 11 23:54:31 localhost pppd[31289]: Serial link appears to be disconnected.
Apr 11 23:54:31 localhost pppd[31289]: Connect time 34.0 minutes.
Apr 11 23:54:31 localhost pppd[31289]: Sent 6451377 bytes, received
21004296 bytes.
Apr 11 23:54:31 localhost pppd[31289]: Script /etc/ppp/ip-down started
(pid 32149)
Apr 11 23:54:31 localhost pppd[31289]: sent [LCP TermReq id=0xb "Peer
not responding"]
Apr 11 23:54:31 

Re: tmpfs and the OOM killer

2007-04-11 Thread Al Boldi
Pedro wrote:
> On Wednesday 11 April 2007 16:48, Willy Tarreau wrote:
> > On Wed, Apr 11, 2007 at 02:23:31AM -0300, Pedro wrote:
> > >
> > >   As the OOM killer is not Posix,
> >
> > If you cannot control your application's memory usage, you'll have to
> > finely tune the overcommit_ratio.
>
>   2) How should an application be written to not be killed by OOM?


Try this:

# echo -17 > /proc//oom_adj

Or this:

# echo 2 > /proc/sys/vm/overcommit_memory
# echo 95 > /proc/sys/vm/overcommit_ratio

Or this:

# ulimit -v [max vm]
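
The last option can also be applied from inside the application. A minimal
sketch using setrlimit(RLIMIT_AS), which is the limit the shell's ulimit -v
controls; the helper name is made up for illustration, and it restores the
old soft limit afterwards so it can be tried safely:

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/resource.h>

/* Cap the address space at `limit` bytes, try one allocation of `request`
 * bytes, then restore the previous soft limit.
 * Returns 1 if malloc succeeded, 0 if it failed, -1 if the limit could
 * not be read or changed. */
static int alloc_succeeds_under_limit(rlim_t limit, size_t request)
{
	struct rlimit old, capped;
	void *p;
	int ok;

	assert(request > 0);
	if (getrlimit(RLIMIT_AS, &old) != 0)
		return -1;

	capped = old;
	capped.rlim_cur = limit;	/* soft limit only, so it can be raised again */
	if (setrlimit(RLIMIT_AS, &capped) != 0)
		return -1;

	p = malloc(request);
	ok = (p != NULL);
	if (p)
		((volatile char *)p)[0] = 0;	/* keep the allocation from being optimized away */
	free(p);

	(void)setrlimit(RLIMIT_AS, &old);
	return ok;
}
```

With such a limit in place an oversized request fails in malloc(), where the
application can handle it, instead of succeeding and leaving the process
exposed to the OOM killer later.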


Thanks, and good luck with the OOM killer!


--
Al



[KJ][PATCH] ROUND_UP macro cleanup in arch/sh64/kernel/pci_sh5.c

2007-04-11 Thread Milind Arun Choudhary
ROUND_UP macro cleanup, use ALIGN wherever appropriate.
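
For reference, a small userspace check of the rounding involved. ROUND_UP is
copied from the diff below; TOY_ALIGN is a made-up stand-in with the same
behaviour for power-of-two alignments, not the kernel's own definition of
ALIGN:

```c
#include <assert.h>

/* The local macro this patch removes, as it appears in the diff. */
#define ROUND_UP(x, a) (((x) + (a) - 1) & ~((a) - 1))

/* Hypothetical stand-in for ALIGN(): round x up to a multiple of a. */
#define TOY_ALIGN(x, a) (((x) + ((a) - 1)) & ~((a) - 1))

/* Round x up to the next multiple of a, where a must be a power of two. */
static unsigned long align_up(unsigned long x, unsigned long a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return TOY_ALIGN(x, a);
}
```

For example, align_up(0x1234, 4096) gives 0x2000, and an already aligned
value such as 0x1000 is returned unchanged.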

Signed-off-by: Milind Arun Choudhary <[EMAIL PROTECTED]>

---
 pci_sh5.c |   12 
 1 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/arch/sh64/kernel/pci_sh5.c b/arch/sh64/kernel/pci_sh5.c
index 9dae689..11d1fef 100644
--- a/arch/sh64/kernel/pci_sh5.c
+++ b/arch/sh64/kernel/pci_sh5.c
@@ -376,8 +376,6 @@ irqreturn_t pcish5_serr_irq(int irq, void *dev_id, struct 
pt_regs *regs)
return IRQ_NONE;
 }
 
-#define ROUND_UP(x, a) (((x) + (a) - 1) & ~((a) - 1))
-
 static void __init
 pcibios_size_bridge(struct pci_bus *bus, struct resource *ior,
struct resource *memr)
@@ -434,8 +432,8 @@ pcibios_size_bridge(struct pci_bus *bus, struct resource 
*ior,
mem_res.end -= mem_res.start;
 
/* Align the sizes up by bridge rules */
-   io_res.end = ROUND_UP(io_res.end, 4*1024) - 1;
-   mem_res.end = ROUND_UP(mem_res.end, 1*1024*1024) - 1;
+   io_res.end = ALIGN(io_res.end, 4*1024) - 1;
+   mem_res.end = ALIGN(mem_res.end, 1*1024*1024) - 1;
 
/* Adjust the bridge's allocation requirements */
bridge->resource[0].end = bridge->resource[0].start + io_res.end;
@@ -448,18 +446,16 @@ pcibios_size_bridge(struct pci_bus *bus, struct resource 
*ior,
 
/* adjust parent's resource requirements */
if (ior) {
-   ior->end = ROUND_UP(ior->end, 4*1024);
+   ior->end = ALIGN(ior->end, 4*1024);
ior->end += io_res.end;
}
 
if (memr) {
-   memr->end = ROUND_UP(memr->end, 1*1024*1024);
+   memr->end = ALIGN(memr->end, 1*1024*1024);
memr->end += mem_res.end;
}
 }
 
-#undef ROUND_UP
-
 static void __init pcibios_size_bridges(void)
 {
struct resource io_res, mem_res;

-- 
Milind Arun Choudhary


Re: Linux 2.6.21-rc6

2007-04-11 Thread Jeff Chua

On 4/10/07, Linus Torvalds <[EMAIL PROTECTED]> wrote:

On Tue, 10 Apr 2007, Jeff Chua wrote:
> I couldn't get suspend-to-disk to work with 2.6.21-rc6. I've tried
> set/unset CONFIG_NO_HZ/CONFIG_HPET_TIMER, but nothing worked.

Do you think you could bisect it? You'd have to apply maxim's patch by
hand at each bisection step (up until the point where it's already applied
in the git tree, of course), so it's not a totally mindless bisection, but
it should still be fairly painless, since there are only 277 commits
between -rc5 and -rc6 (so bisection should rather quickly narrow it down)


Linus,

I did that last night and realized that I can suspend to disk/ram
with 2.6.21-rc6 when CONFIG_NO_HZ is unset. I must have done something
wrong before.

Thank you,
Jeff.


Re: [PATCH 3/7] [RFC] Battery monitoring class

2007-04-11 Thread Greg KH
On Thu, Apr 12, 2007 at 03:25:03AM +0400, Anton Vorontsov wrote:
> Here is the battery monitor class. According to the first copyright string,
> we have been maintaining it since 2003. I took a few days and cleaned it up
> to be more suitable for mainline inclusion.
> 
> It differs from battery class at git://git.infradead.org/battery-2.6.git:

Why fork from David's work?  Does he not like these changes for some
reason?

> +static int battery_create_attrs(struct battery *bat)
> +{
> + int rc;
> +
> + #define create_bat_attr_conditional(name)\
> + if(bat->get_##name) {\
> + rc = device_create_file(bat->dev, &dev_attr_##name); \
> + if (rc) goto name##_failed;  \
> + }
> +
> + create_bat_attr_conditional(status);
> + create_bat_attr_conditional(min_voltage);
> + create_bat_attr_conditional(min_current);
> + create_bat_attr_conditional(min_capacity);
> + create_bat_attr_conditional(max_voltage);
> + create_bat_attr_conditional(max_current);
> + create_bat_attr_conditional(max_capacity);
> + create_bat_attr_conditional(temp);
> + create_bat_attr_conditional(voltage);
> + create_bat_attr_conditional(current);
> + create_bat_attr_conditional(capacity);

Use an attribute group please.  It's much simpler and will be created at
the proper time so your userspace tools don't have to sit and spin in
order to properly wait for them to show up.

Ok, yes, you want a conditional type of attribute group, like the
new firewire code does.  I have no problem adding that if you like.

thanks,

greg k-h


Re: [PATCH 3/7] [RFC] Battery monitoring class

2007-04-11 Thread Randy Dunlap
On Thu, 12 Apr 2007 03:25:03 +0400 Anton Vorontsov wrote:

> Here is the battery monitor class. According to the first copyright string,
> we have been maintaining it since 2003. I took a few days and cleaned it up
> to be more suitable for mainline inclusion.
> 
> ---
>  drivers/Kconfig   |2 +
>  drivers/Makefile  |1 +
>  drivers/battery/Kconfig   |   11 ++
>  drivers/battery/Makefile  |1 +
>  drivers/battery/battery.c |  303 
> +
>  include/linux/battery.h   |   98 +++
>  6 files changed, 416 insertions(+), 0 deletions(-)
>  create mode 100644 drivers/battery/Kconfig
>  create mode 100644 drivers/battery/Makefile
>  create mode 100644 drivers/battery/battery.c
>  create mode 100644 include/linux/battery.h
> 
> diff --git a/drivers/battery/battery.c b/drivers/battery/battery.c
> new file mode 100644
> index 000..32b8288
> --- /dev/null
> +++ b/drivers/battery/battery.c
> @@ -0,0 +1,303 @@
> +
> +void battery_status_changed(struct battery *bat)
> +{
> + pr_debug("%s\n", __FUNCTION__);
> + #ifdef CONFIG_LEDS_TRIGGERS

Please don't indent preprocessor controls (ifdef/endif etc.).

> + switch(bat->get_status(bat))
> + {
> + case BATTERY_STATUS_FULL:
> + led_trigger_event(bat->charging_trig, LED_OFF);
> + led_trigger_event(bat->full_trig, LED_FULL);
> + break;
> + case BATTERY_STATUS_CHARGING:
> + led_trigger_event(bat->charging_trig, LED_FULL);
> + led_trigger_event(bat->full_trig, LED_OFF);
> + break;
> + default:
> + led_trigger_event(bat->charging_trig, LED_OFF);
> + led_trigger_event(bat->full_trig, LED_OFF);
> + break;

Place 'switch' and 'case' at the same indent level.  This prevents
the "double-indent" for the code statements.
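
As an illustration of that layout, here is the same decision logic as a
standalone function with 'switch' and 'case' at one indent level. The toy_*
names and the struct are made up so that it compiles outside the kernel:

```c
#include <assert.h>
#include <stddef.h>

enum toy_battery_status {
	TOY_STATUS_UNKNOWN,
	TOY_STATUS_CHARGING,
	TOY_STATUS_FULL,
};

/* Stand-in for the two LED triggers in the patch. */
struct toy_leds {
	int charging;
	int full;
};

static void toy_status_changed(int status, struct toy_leds *leds)
{
	assert(leds != NULL);

	switch (status) {
	case TOY_STATUS_FULL:
		leds->charging = 0;
		leds->full = 1;
		break;
	case TOY_STATUS_CHARGING:
		leds->charging = 1;
		leds->full = 0;
		break;
	default:
		leds->charging = 0;
		leds->full = 0;
		break;
	}
}
```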

> + }
> + #endif /* CONFIG_LEDS_TRIGGERS */
> + return;
> +}
> +
> +static char *status_text[] = {
> + "Unknown", "Charging", "Discharging", "Not charging", "Full"
> +};
> +
> +static ssize_t battery_show_status(struct device *dev,
> +   struct device_attribute *attr, char *buf)
> +{
> + struct battery *bat = dev_get_drvdata(dev);
> + int status = 0;

We usually try to place a blank line between local data and code.

> + if (bat->get_status) {
> + status = bat->get_status(bat);
> + if (status > 4)
> + status = 0;
> + return sprintf(buf, "%s\n", status_text[status]);
> + }
> + return 0;
> +}
> +
> +static int battery_create_attrs(struct battery *bat)
> +{
> + int rc;
> +
> + #define create_bat_attr_conditional(name)\
> + if(bat->get_##name) {\

space after "if"

> + rc = device_create_file(bat->dev, &dev_attr_##name); \
> + if (rc) goto name##_failed;  \
> + }
> +
> + create_bat_attr_conditional(status);
> + create_bat_attr_conditional(min_voltage);
> + create_bat_attr_conditional(min_current);
> + create_bat_attr_conditional(min_capacity);
> + create_bat_attr_conditional(max_voltage);
> + create_bat_attr_conditional(max_current);
> + create_bat_attr_conditional(max_capacity);
> + create_bat_attr_conditional(temp);
> + create_bat_attr_conditional(voltage);
> + create_bat_attr_conditional(current);
> + create_bat_attr_conditional(capacity);
> +
> + #define remove_bat_attr_conditional(name)   \
> + if(bat->get_##name) \

ditto.

> + device_remove_file(bat->dev, &dev_attr_##name);
> +
> + goto success;
> +
> +capacity_failed: remove_bat_attr_conditional(current);
> +current_failed:  remove_bat_attr_conditional(voltage);
> +voltage_failed:  remove_bat_attr_conditional(temp);
> +temp_failed: remove_bat_attr_conditional(max_capacity);
> +max_capacity_failed: remove_bat_attr_conditional(max_current);
> +max_current_failed:  remove_bat_attr_conditional(max_voltage);
> +max_voltage_failed:  remove_bat_attr_conditional(min_capacity);
> +min_capacity_failed: remove_bat_attr_conditional(min_current);
> +min_current_failed:  remove_bat_attr_conditional(min_voltage);
> +min_voltage_failed:  remove_bat_attr_conditional(status);

I thought there was a class_remove() or something like that,
but I'm not sure.

> +status_failed:
> +success:
> + return rc;
> +}
> +
> +static void battery_remove_attrs(struct battery *bat)
> +{
> + remove_bat_attr_conditional(capacity);
> + remove_bat_attr_conditional(current);
> + remove_bat_attr_conditional(voltage);
> + remove_bat_attr_conditional(temp);
> + remove_bat_attr_conditional(max_capacity);
> + remove_bat_attr_conditional(max_current);
> + 

Re: [PATCH 0/12] Pass MAP_FIXED down to get_unmapped_area

2007-04-11 Thread Benjamin Herrenschmidt
> Has any consideration been given to nommu architectures such as Blackfin,
> which is in the -mm tree now?
> 
> It would be very kind of you to point out some ideas about MAP_FIXED for
> the Blackfin arch; I will help with this.

Right now, my understanding is that nommu archs just reject MAP_FIXED
outright... we might be able to be smarter, especially if we bring a
better infrastructure which I'm still thinking about.

Ben.




[PATCH 6/17] hfs: remove redundant read_mapping_page error check

2007-04-11 Thread Nate Diller
Now that read_mapping_page() does error checking internally, there is no
need to check PageError here.

Signed-off-by: Nate Diller <[EMAIL PROTECTED]>

---

diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/hfs/bnode.c linux-2.6.21-rc6-mm1-test/fs/hfs/bnode.c
--- linux-2.6.21-rc6-mm1/fs/hfs/bnode.c 2007-04-09 17:20:13.0 -0700
+++ linux-2.6.21-rc6-mm1-test/fs/hfs/bnode.c 2007-04-10 21:28:03.0 -0700
@@ -282,10 +282,6 @@ static struct hfs_bnode *__hfs_bnode_cre
page = read_mapping_page(mapping, block++, NULL);
if (IS_ERR(page))
goto fail;
-   if (PageError(page)) {
-   page_cache_release(page);
-   goto fail;
-   }
page_cache_release(page);
node->page[i] = page;
}
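
For readers following the series, here is a rough userspace model of the calling convention this cleanup relies on: once the read helper reports I/O failure through its return value, the caller needs only a single IS_ERR() test. The ERR_PTR()/PTR_ERR()/IS_ERR() definitions below are stand-ins written for this sketch, and struct toy_page, toy_read_page() and toy_use_page() are hypothetical names, not kernel code.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's ERR_PTR()/PTR_ERR()/IS_ERR() helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical page object: only the flag this sketch needs. */
struct toy_page {
	int io_error;
};

/*
 * Hypothetical reader that folds the I/O error check into the read itself,
 * so a failed read comes back as ERR_PTR(-EIO) rather than as a page the
 * caller must inspect separately.
 */
static struct toy_page *toy_read_page(struct toy_page *backing)
{
	if (backing == NULL)
		return ERR_PTR(-ENOMEM);
	if (backing->io_error)
		return ERR_PTR(-EIO);
	return backing;
}

/* Caller side: one IS_ERR() test covers every failure mode. */
static long toy_use_page(struct toy_page *backing)
{
	struct toy_page *page = toy_read_page(backing);

	if (IS_ERR(page))
		return PTR_ERR(page);
	return 0;
}
```

In this model a second, PageError-style check in toy_use_page() would be dead code, which is the situation the patch description refers to.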


[PATCH 7/17] jffs2: convert jffs2_gc_fetch_page to read_cache_page

2007-04-11 Thread Nate Diller
Replace jffs2_gc_fetch_page() and jffs2_gc_release_page() with the
read_cache_page() and put_kmapped_page() calls, and update the call site
accordingly.  Explicit calls to kmap()/kunmap() make the code clearer.

Signed-off-by: Nate Diller <[EMAIL PROTECTED]>

---

diff -urpN -X dontdiff linux-2.6.21-rc5-mm4/fs/jffs2/fs.c linux-2.6.21-rc5-mm4-test/fs/jffs2/fs.c
--- linux-2.6.21-rc5-mm4/fs/jffs2/fs.c  2007-04-05 17:14:25.0 -0700
+++ linux-2.6.21-rc5-mm4-test/fs/jffs2/fs.c 2007-04-06 01:59:19.0 -0700
@@ -621,33 +621,6 @@ struct jffs2_inode_info *jffs2_gc_fetch_
return JFFS2_INODE_INFO(inode);
 }
 
-unsigned char *jffs2_gc_fetch_page(struct jffs2_sb_info *c,
-  struct jffs2_inode_info *f,
-  unsigned long offset,
-  unsigned long *priv)
-{
-   struct inode *inode = OFNI_EDONI_2SFFJ(f);
-   struct page *pg;
-
-   pg = read_cache_page(inode->i_mapping, offset >> PAGE_CACHE_SHIFT,
-(void *)jffs2_do_readpage_unlock, inode);
-   if (IS_ERR(pg))
-   return (void *)pg;
-
-   *priv = (unsigned long)pg;
-   return kmap(pg);
-}
-
-void jffs2_gc_release_page(struct jffs2_sb_info *c,
-  unsigned char *ptr,
-  unsigned long *priv)
-{
-   struct page *pg = (void *)*priv;
-
-   kunmap(pg);
-   page_cache_release(pg);
-}
-
 static int jffs2_flash_setup(struct jffs2_sb_info *c) {
int ret = 0;
 
diff -urpN -X dontdiff linux-2.6.21-rc5-mm4/fs/jffs2/gc.c linux-2.6.21-rc5-mm4-test/fs/jffs2/gc.c
--- linux-2.6.21-rc5-mm4/fs/jffs2/gc.c  2007-04-05 17:13:10.0 -0700
+++ linux-2.6.21-rc5-mm4-test/fs/jffs2/gc.c 2007-04-06 01:59:19.0 -0700
@@ -1078,7 +1078,7 @@ static int jffs2_garbage_collect_dnode(s
uint32_t alloclen, offset, orig_end, orig_start;
int ret = 0;
unsigned char *comprbuf = NULL, *writebuf;
-   unsigned long pg;
+   struct page *page;
unsigned char *pg_ptr;
 
memset(&ri, 0, sizeof(ri));
@@ -1219,12 +1219,16 @@ static int jffs2_garbage_collect_dnode(s
 *page OK. We'll actually write it out again in commit_write, which is a little
 *suboptimal, but at least we're correct.
 */
-   pg_ptr = jffs2_gc_fetch_page(c, f, start, &pg);
+   page = read_cache_page(OFNI_EDONI_2SFFJ(f)->i_mapping,
+   start >> PAGE_CACHE_SHIFT,
+   (void *)jffs2_do_readpage_unlock,
+   OFNI_EDONI_2SFFJ(f));
 
-   if (IS_ERR(pg_ptr)) {
+   if (IS_ERR(page)) {
-   printk(KERN_WARNING "read_cache_page() returned error: %ld\n", PTR_ERR(pg_ptr));
+   printk(KERN_WARNING "read_cache_page() returned error: %ld\n", PTR_ERR(page));
-   return PTR_ERR(pg_ptr);
+   return PTR_ERR(page);
}
+   pg_ptr = kmap(page);
 
offset = start;
while(offset < orig_end) {
@@ -1287,6 +1291,7 @@ static int jffs2_garbage_collect_dnode(s
}
}
 
-   jffs2_gc_release_page(c, pg_ptr, &pg);
+   kunmap(page);
+   page_cache_release(page);
return ret;
 }


[PATCH 11/17] ntfs: convert ntfs_map_page to read_kmap_page

2007-04-11 Thread Nate Diller
Replace ntfs_map_page() and ntfs_unmap_page() using the new read_kmap_page()
and put_kmapped_page() calls, and their locking variants, and remove
unneeded PageError checking.

Signed-off-by: Nate Diller <[EMAIL PROTECTED]>

---

diff -urpN -X dontdiff linux-2.6.21-rc5-mm4/fs/ntfs/aops.h linux-2.6.21-rc5-mm4-test/fs/ntfs/aops.h
--- linux-2.6.21-rc5-mm4/fs/ntfs/aops.h 2007-04-05 17:14:25.0 -0700
+++ linux-2.6.21-rc5-mm4-test/fs/ntfs/aops.h 2007-04-06 01:59:19.0 -0700
@@ -31,73 +31,6 @@
 
 #include "inode.h"
 
-/**
- * ntfs_unmap_page - release a page that was mapped using ntfs_map_page()
- * @page:  the page to release
- *
- * Unpin, unmap and release a page that was obtained from ntfs_map_page().
- */
-static inline void ntfs_unmap_page(struct page *page)
-{
-   kunmap(page);
-   page_cache_release(page);
-}
-
-/**
- * ntfs_map_page - map a page into accessible memory, reading it if necessary
- * @mapping:   address space for which to obtain the page
- * @index: index into the page cache for @mapping of the page to map
- *
- * Read a page from the page cache of the address space @mapping at position
- * @index, where @index is in units of PAGE_CACHE_SIZE, and not in bytes.
- *
- * If the page is not in memory it is loaded from disk first using the readpage
- * method defined in the address space operations of @mapping and the page is
- * added to the page cache of @mapping in the process.
- *
- * If the page belongs to an mst protected attribute and it is marked as such
- * in its ntfs inode (NInoMstProtected()) the mst fixups are applied but no
- * error checking is performed.  This means the caller has to verify whether
- * the ntfs record(s) contained in the page are valid or not using one of the
- * ntfs_is__record{,p}() macros, where  is the record type you are
- * expecting to see.  (For details of the macros, see fs/ntfs/layout.h.)
- *
- * If the page is in high memory it is mapped into memory directly addressible
- * by the kernel.
- *
- * Finally the page count is incremented, thus pinning the page into place.
- *
- * The above means that page_address(page) can be used on all pages obtained
- * with ntfs_map_page() to get the kernel virtual address of the page.
- *
- * When finished with the page, the caller has to call ntfs_unmap_page() to
- * unpin, unmap and release the page.
- *
- * Note this does not grant exclusive access. If such is desired, the caller
- * must provide it independently of the ntfs_{un}map_page() calls by using
- * a {rw_}semaphore or other means of serialization. A spin lock cannot be
- * used as ntfs_map_page() can block.
- *
- * The unlocked and uptodate page is returned on success or an encoded error
- * on failure. Caller has to test for error using the IS_ERR() macro on the
- * return value. If that evaluates to 'true', the negative error code can be
- * obtained using PTR_ERR() on the return value of ntfs_map_page().
- */
-static inline struct page *ntfs_map_page(struct address_space *mapping,
-   unsigned long index)
-{
-   struct page *page = read_mapping_page(mapping, index, NULL);
-
-   if (!IS_ERR(page)) {
-   kmap(page);
-   if (!PageError(page))
-   return page;
-   ntfs_unmap_page(page);
-   return ERR_PTR(-EIO);
-   }
-   return page;
-}
-
 #ifdef NTFS_RW
 
 extern void mark_ntfs_record_dirty(struct page *page, const unsigned int ofs);
diff -urpN -X dontdiff linux-2.6.21-rc5-mm4/fs/ntfs/bitmap.c linux-2.6.21-rc5-mm4-test/fs/ntfs/bitmap.c
--- linux-2.6.21-rc5-mm4/fs/ntfs/bitmap.c   2006-11-29 13:57:37.0 -0800
+++ linux-2.6.21-rc5-mm4-test/fs/ntfs/bitmap.c  2007-04-06 12:40:53.0 -0700
@@ -72,7 +72,7 @@ int __ntfs_bitmap_set_bits_in_run(struct
 
/* Get the page containing the first bit (@start_bit). */
mapping = vi->i_mapping;
-   page = ntfs_map_page(mapping, index);
+   page = read_kmap_page(mapping, index);
if (IS_ERR(page)) {
if (!is_rollback)
ntfs_error(vi->i_sb, "Failed to map first page (error "
@@ -123,8 +123,8 @@ int __ntfs_bitmap_set_bits_in_run(struct
/* Update @index and get the next page. */
flush_dcache_page(page);
set_page_dirty(page);
-   ntfs_unmap_page(page);
-   page = ntfs_map_page(mapping, ++index);
+   put_kmapped_page(page);
+   page = read_kmap_page(mapping, ++index);
if (IS_ERR(page))
goto rollback;
kaddr = page_address(page);
@@ -159,7 +159,7 @@ done:
/* We are done.  Unmap the page and return success. */
flush_dcache_page(page);
set_page_dirty(page);
-   ntfs_unmap_page(page);
+   put_kmapped_page(page);
ntfs_debug("Done.");
return 0;
 rollback:
diff -urpN -X dontdiff linux-2.6.21-rc5-mm4/fs/ntfs/dir.c 

[PATCH 1/17] cramfs: use read_mapping_page

2007-04-11 Thread Nate Diller
read_mapping_page_async() is going away, so convert its only user to
read_mapping_page().  This change has not been benchmarked; however, to
get real parallelism this wants something completely different, like
__do_page_cache_readahead(), which is not currently exported.

Signed-off-by: Nate Diller <[EMAIL PROTECTED]>

---

diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/cramfs/inode.c linux-2.6.21-rc6-mm1-test/fs/cramfs/inode.c
--- linux-2.6.21-rc6-mm1/fs/cramfs/inode.c  2007-04-09 17:24:03.0 -0700
+++ linux-2.6.21-rc6-mm1-test/fs/cramfs/inode.c 2007-04-09 21:37:09.0 -0700
@@ -180,8 +180,7 @@ static void *cramfs_read(struct super_bl
struct page *page = NULL;
 
if (blocknr + i < devsize) {
-   page = read_mapping_page_async(mapping, blocknr + i,
-   NULL);
+   page = read_mapping_page(mapping, blocknr + i, NULL);
/* synchronous error? */
if (IS_ERR(page))
page = NULL;


[PATCH 5/17] hfsplus: remove redundant read_mapping_page error check

2007-04-11 Thread Nate Diller
Now that read_mapping_page() does error checking internally, there is no
need to check PageError here.

Signed-off-by: Nate Diller <[EMAIL PROTECTED]> 

--- 

diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/hfsplus/bnode.c linux-2.6.21-rc6-mm1-test/fs/hfsplus/bnode.c
--- linux-2.6.21-rc6-mm1/fs/hfsplus/bnode.c 2007-04-09 17:20:13.0 -0700
+++ linux-2.6.21-rc6-mm1-test/fs/hfsplus/bnode.c 2007-04-10 21:28:45.0 -0700
@@ -442,10 +442,6 @@ static struct hfs_bnode *__hfs_bnode_cre
page = read_mapping_page(mapping, block, NULL);
if (IS_ERR(page))
goto fail;
-   if (PageError(page)) {
-   page_cache_release(page);
-   goto fail;
-   }
page_cache_release(page);
node->page[i] = page;
}


[PATCH 14/17] reiserfs: convert reiserfs_get_page to read_kmap_page

2007-04-11 Thread Nate Diller
Replace reiserfs_get_page() and reiserfs_put_page() using the new
read_kmap_page() and put_kmapped_page() calls and their locking variants. 
Also, propagate the gfp_mask() deadlock comment to callsites.

Signed-off-by: Nate Diller <[EMAIL PROTECTED]>

---

diff -urpN -X dontdiff linux-2.6.21-rc5-mm4/fs/reiserfs/xattr.c linux-2.6.21-rc5-mm4-test/fs/reiserfs/xattr.c
--- linux-2.6.21-rc5-mm4/fs/reiserfs/xattr.c 2007-04-05 17:14:25.0 -0700
+++ linux-2.6.21-rc5-mm4-test/fs/reiserfs/xattr.c   2007-04-06 14:41:34.0 -0700
@@ -438,33 +438,6 @@ int xattr_readdir(struct file *file, fil
return res;
 }
 
-/* Internal operations on file data */
-static inline void reiserfs_put_page(struct page *page)
-{
-   kunmap(page);
-   page_cache_release(page);
-}
-
-static struct page *reiserfs_get_page(struct inode *dir, unsigned long n)
-{
-   struct address_space *mapping = dir->i_mapping;
-   struct page *page;
-   /* We can deadlock if we try to free dentries,
-  and an unlink/rmdir has just occured - GFP_NOFS avoids this */
-   mapping_set_gfp_mask(mapping, GFP_NOFS);
-   page = read_mapping_page(mapping, n, NULL);
-   if (!IS_ERR(page)) {
-   kmap(page);
-   if (PageError(page))
-   goto fail;
-   }
-   return page;
-
-  fail:
-   reiserfs_put_page(page);
-   return ERR_PTR(-EIO);
-}
-
 static inline __u32 xattr_hash(const char *msg, int len)
 {
return csum_partial(msg, len, 0);
@@ -537,13 +510,15 @@ reiserfs_xattr_set(struct inode *inode, 
else
chunk = buffer_size - buffer_pos;
 
-   page = reiserfs_get_page(xinode, file_pos >> PAGE_CACHE_SHIFT);
+   /* We can deadlock if we try to free dentries,
+  and an unlink/rmdir has just occurred - GFP_NOFS avoids this */
+   mapping_set_gfp_mask(mapping, GFP_NOFS);
+   page = __read_kmap_page(mapping, file_pos >> PAGE_CACHE_SHIFT);
if (IS_ERR(page)) {
err = PTR_ERR(page);
goto out_filp;
}
 
-   lock_page(page);
data = page_address(page);
 
if (file_pos == 0) {
@@ -566,8 +541,7 @@ reiserfs_xattr_set(struct inode *inode, 
 page_offset + chunk +
 skip);
}
-   unlock_page(page);
-   reiserfs_put_page(page);
+   put_locked_page(page);
buffer_pos += chunk;
file_pos += chunk;
skip = 0;
@@ -646,13 +620,15 @@ reiserfs_xattr_get(const struct inode *i
else
chunk = isize - file_pos;
 
-   page = reiserfs_get_page(xinode, file_pos >> PAGE_CACHE_SHIFT);
+   /* We can deadlock if we try to free dentries,
+  and an unlink/rmdir has just occurred - GFP_NOFS avoids this */
+   mapping_set_gfp_mask(xinode->i_mapping, GFP_NOFS);
+   page = __read_kmap_page(xinode->i_mapping, file_pos >> PAGE_CACHE_SHIFT);
if (IS_ERR(page)) {
err = PTR_ERR(page);
goto out_dput;
}
 
-   lock_page(page);
data = page_address(page);
if (file_pos == 0) {
struct reiserfs_xattr_header *rxh =
@@ -661,8 +637,7 @@ reiserfs_xattr_get(const struct inode *i
chunk -= skip;
/* Magic doesn't match up.. */
if (rxh->h_magic != cpu_to_le32(REISERFS_XATTR_MAGIC)) {
-   unlock_page(page);
-   reiserfs_put_page(page);
+   put_locked_page(page);
reiserfs_warning(inode->i_sb,
 "Invalid magic for xattr (%s) "
 "associated with %k", name,
@@ -673,8 +648,7 @@ reiserfs_xattr_get(const struct inode *i
hash = le32_to_cpu(rxh->h_hash);
}
memcpy(buffer + buffer_pos, data + skip, chunk);
-   unlock_page(page);
-   reiserfs_put_page(page);
+   put_locked_page(page);
file_pos += chunk;
buffer_pos += chunk;
skip = 0;


[PATCH 17/17] vxfs: convert vxfs_get_page to read_kmap_page

2007-04-11 Thread Nate Diller
Replace vxfs_get_page() with the new read_kmap_page().

Signed-off-by: Nate Diller <[EMAIL PROTECTED]>

---

diff -urpN -X dontdiff linux-2.6.21-rc5-mm4/fs/freevxfs/vxfs_extern.h linux-2.6.21-rc5-mm4-test/fs/freevxfs/vxfs_extern.h
--- linux-2.6.21-rc5-mm4/fs/freevxfs/vxfs_extern.h  2007-04-05 17:13:29.0 -0700
+++ linux-2.6.21-rc5-mm4-test/fs/freevxfs/vxfs_extern.h 2007-04-06 01:59:19.0 -0700
@@ -69,7 +69,6 @@ extern const struct file_operations   vxfs
 extern int vxfs_read_olt(struct super_block *, u_long);
 
 /* vxfs_subr.c */
-extern struct page *   vxfs_get_page(struct address_space *, u_long);
 extern voidvxfs_put_page(struct page *);
 extern struct buffer_head *vxfs_bread(struct inode *, int);
 
diff -urpN -X dontdiff linux-2.6.21-rc5-mm4/fs/freevxfs/vxfs_inode.c linux-2.6.21-rc5-mm4-test/fs/freevxfs/vxfs_inode.c
--- linux-2.6.21-rc5-mm4/fs/freevxfs/vxfs_inode.c   2007-04-05 17:14:25.0 -0700
+++ linux-2.6.21-rc5-mm4-test/fs/freevxfs/vxfs_inode.c  2007-04-06 01:59:19.0 -0700
@@ -138,7 +138,7 @@ __vxfs_iget(ino_t ino, struct inode *ili
u_long  offset;
 
offset = (ino % (PAGE_SIZE / VXFS_ISIZE)) * VXFS_ISIZE;
-   pp = vxfs_get_page(ilistp->i_mapping, ino * VXFS_ISIZE / PAGE_SIZE);
+   pp = read_kmap_page(ilistp->i_mapping, ino * VXFS_ISIZE / PAGE_SIZE);
 
if (!IS_ERR(pp)) {
struct vxfs_inode_info  *vip;
diff -urpN -X dontdiff linux-2.6.21-rc5-mm4/fs/freevxfs/vxfs_lookup.c linux-2.6.21-rc5-mm4-test/fs/freevxfs/vxfs_lookup.c
--- linux-2.6.21-rc5-mm4/fs/freevxfs/vxfs_lookup.c  2007-04-05 17:13:29.0 -0700
+++ linux-2.6.21-rc5-mm4-test/fs/freevxfs/vxfs_lookup.c 2007-04-06 01:59:19.0 -0700
@@ -125,7 +125,7 @@ vxfs_find_entry(struct inode *ip, struct
caddr_t kaddr;
struct page *pp;
 
-   pp = vxfs_get_page(ip->i_mapping, page);
+   pp = read_kmap_page(ip->i_mapping, page);
if (IS_ERR(pp))
continue;
kaddr = (caddr_t)page_address(pp);
@@ -280,7 +280,7 @@ vxfs_readdir(struct file *fp, void *retp
caddr_t kaddr;
struct page *pp;
 
-   pp = vxfs_get_page(ip->i_mapping, page);
+   pp = read_kmap_page(ip->i_mapping, page);
if (IS_ERR(pp))
continue;
kaddr = (caddr_t)page_address(pp);
diff -urpN -X dontdiff linux-2.6.21-rc5-mm4/fs/freevxfs/vxfs_subr.c linux-2.6.21-rc5-mm4-test/fs/freevxfs/vxfs_subr.c
--- linux-2.6.21-rc5-mm4/fs/freevxfs/vxfs_subr.c 2007-04-05 17:14:25.0 -0700
+++ linux-2.6.21-rc5-mm4-test/fs/freevxfs/vxfs_subr.c   2007-04-06 01:59:19.0 -0700
@@ -56,39 +56,6 @@ vxfs_put_page(struct page *pp)
 }
 
 /**
- * vxfs_get_page - read a page into memory.
- * @ip:inode to read from
- * @n: page number
- *
- * Description:
- *   vxfs_get_page reads the @n th page of @ip into the pagecache.
- *
- * Returns:
- *   The wanted page on success, else a NULL pointer.
- */
-struct page *
-vxfs_get_page(struct address_space *mapping, u_long n)
-{
-   struct page *   pp;
-
-   pp = read_mapping_page(mapping, n, NULL);
-
-   if (!IS_ERR(pp)) {
-   kmap(pp);
-   /** if (!PageChecked(pp)) **/
-   /** vxfs_check_page(pp); **/
-   if (PageError(pp))
-   goto fail;
-   }
-   
-   return (pp);
-
-fail:
-   vxfs_put_page(pp);
-   return ERR_PTR(-EIO);
-}
-
-/**
  * vxfs_bread - read buffer for a give inode,block tuple
  * @ip:inode
  * @block: logical block


[PATCH 15/17] sysv: convert dir_get_page to read_kmap_page

2007-04-11 Thread Nate Diller
Replace sysv dir_get_page() with the new read_kmap_page().

Signed-off-by: Nate Diller <[EMAIL PROTECTED]>

---

diff -urpN -X dontdiff linux-2.6.21-rc5-mm4/fs/sysv/dir.c linux-2.6.21-rc5-mm4-test/fs/sysv/dir.c
--- linux-2.6.21-rc5-mm4/fs/sysv/dir.c  2007-04-05 17:14:25.0 -0700
+++ linux-2.6.21-rc5-mm4-test/fs/sysv/dir.c 2007-04-06 01:59:19.0 -0700
@@ -50,15 +50,6 @@ static int dir_commit_chunk(struct page 
return err;
 }
 
-static struct page * dir_get_page(struct inode *dir, unsigned long n)
-{
-   struct address_space *mapping = dir->i_mapping;
-   struct page *page = read_mapping_page(mapping, n, NULL);
-   if (!IS_ERR(page))
-   kmap(page);
-   return page;
-}
-
 static int sysv_readdir(struct file * filp, void * dirent, filldir_t filldir)
 {
unsigned long pos = filp->f_pos;
@@ -77,7 +68,7 @@ static int sysv_readdir(struct file * fi
for ( ; n < npages; n++, offset = 0) {
char *kaddr, *limit;
struct sysv_dir_entry *de;
-   struct page *page = dir_get_page(inode, n);
+   struct page *page = read_kmap_page(inode->i_mapping, n);
 
if (IS_ERR(page))
continue;
@@ -149,7 +140,7 @@ struct sysv_dir_entry *sysv_find_entry(s
 
do {
char *kaddr;
-   page = dir_get_page(dir, n);
+   page = read_kmap_page(dir->i_mapping, n);
if (!IS_ERR(page)) {
kaddr = (char*)page_address(page);
de = (struct sysv_dir_entry *) kaddr;
@@ -191,7 +182,7 @@ int sysv_add_link(struct dentry *dentry,
 
/* We take care of directory expansion in the same loop */
for (n = 0; n <= npages; n++) {
-   page = dir_get_page(dir, n);
+   page = read_kmap_page(dir->i_mapping, n);
err = PTR_ERR(page);
if (IS_ERR(page))
goto out;
@@ -299,7 +290,7 @@ int sysv_empty_dir(struct inode * inode)
for (i = 0; i < npages; i++) {
char *kaddr;
struct sysv_dir_entry * de;
-   page = dir_get_page(inode, i);
+   page = read_kmap_page(inode->i_mapping, i);
 
if (IS_ERR(page))
continue;
@@ -353,7 +344,7 @@ void sysv_set_link(struct sysv_dir_entry
 
 struct sysv_dir_entry * sysv_dotdot (struct inode *dir, struct page **p)
 {
-   struct page *page = dir_get_page(dir, 0);
+   struct page *page = read_kmap_page(dir->i_mapping, 0);
struct sysv_dir_entry *de = NULL;
 
if (!IS_ERR(page)) {


Re: [PATCH 0/12] Pass MAP_FIXED down to get_unmapped_area

2007-04-11 Thread Wu, Bryan
On Thu, 2007-04-12 at 12:20 +1000, Benjamin Herrenschmidt wrote:
> This is a "first step" as there are still cleanups to be done in various
> areas touched by that code but I think it's probably good to go as is and
> at least enables me to implement what I need for PowerPC.
> 
> (Andrew, this is also candidate for 2.6.22 since I haven't had any real
> objection, mostly suggestion for improving further, which I'll try to
> do later, and I have further powerpc patches that rely on this).
> 
> The current get_unmapped_area code calls the f_ops->get_unmapped_area or
> the arch one (via the mm) only when MAP_FIXED is not passed. That makes
> it impossible for archs to impose proper constraints on regions of the
> virtual address space. To work around that, get_unmapped_area() then
> calls some hugetlbfs specific hacks.
> 
> This causes several problems, among others:
> 
>  - It makes it impossible for a driver or filesystem to do the same thing
> that hugetlbfs does (for example, to allow a driver to use larger page
> sizes to map external hardware) if that requires applying a constraint
> on the addresses (constraining that mapping in certain regions and other
> mappings out of those regions).
> 
>  - Some archs like arm, mips, sparc, sparc64, sh and sh64 already want
> MAP_FIXED to be passed down in order to deal with aliasing issues.
> The code is there to handle it... but is never called.
> 

Has any consideration been given to nommu architectures such as Blackfin,
which is in the -mm tree now?

It would be very kind of you to point out some ideas about MAP_FIXED for
the Blackfin arch; I will help with this.

Thanks 
-Bryan
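
The dispatch problem described in the quoted text can be modeled in userspace roughly as follows. This is only a toy: TOY_MAP_FIXED, the toy_area_hook type and the toy_* functions are hypothetical names invented for the sketch, and the "reserved window" hook merely stands in for whatever constraint an arch or driver might want to impose.

```c
#define TOY_MAP_FIXED 0x10UL	/* hypothetical flag value for this sketch only */

/* Hypothetical per-arch (or per-file) hook that may veto or adjust an address. */
typedef unsigned long (*toy_area_hook)(unsigned long addr, unsigned long flags);

/* Behaviour described in the quoted text: the hook is skipped for MAP_FIXED. */
static unsigned long toy_get_area_old(toy_area_hook hook, unsigned long addr,
				      unsigned long flags)
{
	if (flags & TOY_MAP_FIXED)
		return addr;	/* the hook never sees the request */
	return hook(addr, flags);
}

/* Direction of the series: always call the hook and let it see MAP_FIXED. */
static unsigned long toy_get_area_new(toy_area_hook hook, unsigned long addr,
				      unsigned long flags)
{
	return hook(addr, flags);
}

/* Example hook: refuse anything inside a reserved window [0x1000, 0x2000). */
static unsigned long toy_reserved_hook(unsigned long addr, unsigned long flags)
{
	(void)flags;
	if (addr >= 0x1000UL && addr < 0x2000UL)
		return 0;	/* 0 stands in for an error in this sketch */
	return addr;
}
```

In the "old" variant a fixed request at 0x1800 slips past the hook untouched, while the "new" variant lets the hook reject it; a nommu arch could in principle use the same hook to reject MAP_FIXED outright.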


[PATCH 16/17] ufs: convert ufs_get_page to read_kmap_page

2007-04-11 Thread Nate Diller
Replace ufs_get_page()/ufs_get_locked_page() and
ufs_put_page()/ufs_put_locked_page() using the new read_kmap_page() and
put_kmapped_page() calls and their locking variants.  Also, change the
ufs_check_page() call to return the page's error status, and update the
call sites accordingly.

Signed-off-by: Nate Diller <[EMAIL PROTECTED]>

---

diff -urpN -X dontdiff linux-2.6.21-rc5-mm4/fs/ufs/balloc.c linux-2.6.21-rc5-mm4-test/fs/ufs/balloc.c
--- linux-2.6.21-rc5-mm4/fs/ufs/balloc.c 2007-04-05 17:13:29.0 -0700
+++ linux-2.6.21-rc5-mm4-test/fs/ufs/balloc.c   2007-04-06 12:46:02.0 -0700
@@ -272,7 +272,7 @@ static void ufs_change_blocknr(struct in
index = i >> (PAGE_CACHE_SHIFT - inode->i_blkbits);
 
if (likely(cur_index != index)) {
-   page = ufs_get_locked_page(mapping, index);
+   page = __read_mapping_page(mapping, index, NULL);
if (!page)/* it was truncated */
continue;
if (IS_ERR(page)) {/* or EIO */
@@ -325,8 +325,10 @@ static void ufs_change_blocknr(struct in
bh = bh->b_this_page;
} while (bh != head);
 
-   if (likely(cur_index != index))
-   ufs_put_locked_page(page);
+   if (likely(cur_index != index)) {
+   unlock_page(page);
+   page_cache_release(page);
+   }
}
UFSD("EXIT\n");
 }
diff -urpN -X dontdiff linux-2.6.21-rc5-mm4/fs/ufs/truncate.c linux-2.6.21-rc5-mm4-test/fs/ufs/truncate.c
--- linux-2.6.21-rc5-mm4/fs/ufs/truncate.c  2007-04-05 17:13:29.0 -0700
+++ linux-2.6.21-rc5-mm4-test/fs/ufs/truncate.c 2007-04-06 12:46:14.0 -0700
@@ -395,8 +395,9 @@ static int ufs_alloc_lastblock(struct in
 
lastfrag--;
 
-   lastpage = ufs_get_locked_page(mapping, lastfrag >>
-  (PAGE_CACHE_SHIFT - inode->i_blkbits));
+   lastpage = __read_mapping_page(mapping, lastfrag >>
+  (PAGE_CACHE_SHIFT - inode->i_blkbits),
+  NULL);
if (IS_ERR(lastpage)) {
err = -EIO;
goto out;
@@ -441,7 +442,8 @@ static int ufs_alloc_lastblock(struct in
   }
}
 out_unlock:
-   ufs_put_locked_page(lastpage);
+   unlock_page(lastpage);
+   page_cache_release(lastpage);
 out:
return err;
 }
diff -urpN -X dontdiff linux-2.6.21-rc5-mm4/fs/ufs/util.c linux-2.6.21-rc5-mm4-test/fs/ufs/util.c
--- linux-2.6.21-rc5-mm4/fs/ufs/util.c  2007-04-05 17:14:25.0 -0700
+++ linux-2.6.21-rc5-mm4-test/fs/ufs/util.c 2007-04-06 12:40:53.0 -0700
@@ -232,55 +232,3 @@ ufs_set_inode_dev(struct super_block *sb
ufsi->i_u1.i_data[0] = cpu_to_fs32(sb, fs32);
 }
 
-/**
- * ufs_get_locked_page() - locate, pin and lock a pagecache page, if not exist
- * read it from disk.
- * @mapping: the address_space to search
- * @index: the page index
- *
- * Locates the desired pagecache page, if not exist we'll read it,
- * locks it, increments its reference
- * count and returns its address.
- *
- */
-
-struct page *ufs_get_locked_page(struct address_space *mapping,
-pgoff_t index)
-{
-   struct page *page;
-
-   page = find_lock_page(mapping, index);
-   if (!page) {
-   page = read_mapping_page(mapping, index, NULL);
-
-   if (IS_ERR(page)) {
-   printk(KERN_ERR "ufs_change_blocknr: "
-  "read_mapping_page error: ino %lu, index: %lu\n",
-  mapping->host->i_ino, index);
-   goto out;
-   }
-
-   lock_page(page);
-
-   if (unlikely(page->mapping == NULL)) {
-   /* Truncate got there first */
-   unlock_page(page);
-   page_cache_release(page);
-   page = NULL;
-   goto out;
-   }
-
-   if (!PageUptodate(page) || PageError(page)) {
-   unlock_page(page);
-   page_cache_release(page);
-
-   printk(KERN_ERR "ufs_change_blocknr: "
-  "can not read page: ino %lu, index: %lu\n",
-  mapping->host->i_ino, index);
-
-   page = ERR_PTR(-EIO);
-   }
-   }
-out:
-   return page;
-}
diff -urpN -X dontdiff linux-2.6.21-rc5-mm4/fs/ufs/util.h linux-2.6.21-rc5-mm4-test/fs/ufs/util.h
--- linux-2.6.21-rc5-mm4/fs/ufs/util.h  2007-04-05 17:13:29.0 -0700
+++ linux-2.6.21-rc5-mm4-test/fs/ufs/util.h 2007-04-06 12:46:36.0 -0700
@@ -251,16 +251,6 @@ extern void _ubh_ubhcpymem_(struct ufs_s
 #define ubh_memcpyubh(ubh,mem,size) 

[PATCH 13/17] reiser4: remove redundant read_mapping_page error checks

2007-04-11 Thread Nate Diller
read_mapping_page() is now fully synchronous, so there's no need to wait for
the page lock or check for I/O errors.

Signed-off-by: Nate Diller <[EMAIL PROTECTED]>

---

diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/reiser4/plugin/file/tail_conversion.c linux-2.6.21-rc6-mm1-test/fs/reiser4/plugin/file/tail_conversion.c
--- linux-2.6.21-rc6-mm1/fs/reiser4/plugin/file/tail_conversion.c   2007-04-09 17:24:03.0 -0700
+++ linux-2.6.21-rc6-mm1-test/fs/reiser4/plugin/file/tail_conversion.c  2007-04-10 21:33:47.0 -0700
@@ -608,14 +608,6 @@ int extent2tail(unix_file_info_t *uf_inf
break;
}
 
-   wait_on_page_locked(page);
-
-   if (!PageUptodate(page)) {
-   page_cache_release(page);
-   result = RETERR(-EIO);
-   break;
-   }
-
/* cut part of file we have read */
start_byte = (__u64) (i << PAGE_CACHE_SHIFT);
set_key_offset(, start_byte);
diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/reiser4/plugin/item/extent_file_ops.c linux-2.6.21-rc6-mm1-test/fs/reiser4/plugin/item/extent_file_ops.c
--- linux-2.6.21-rc6-mm1/fs/reiser4/plugin/item/extent_file_ops.c   2007-04-10 19:41:14.0 -0700
+++ linux-2.6.21-rc6-mm1-test/fs/reiser4/plugin/item/extent_file_ops.c  2007-04-10 21:38:41.0 -0700
@@ -1220,15 +1220,8 @@ int reiser4_read_extent(struct file *fil
page = read_mapping_page(mapping, cur_page, file);
if (IS_ERR(page))
return PTR_ERR(page);
-   lock_page(page);
-   if (!PageUptodate(page)) {
-   unlock_page(page);
-   page_cache_release(page);
-   warning("jmacd-97178", "extent_read: page is not up to date");
-   return RETERR(-EIO);
-   }
+
mark_page_accessed(page);
-   unlock_page(page);
 
/* If users can be writing to this page using arbitrary virtual
   addresses, take care about potential aliasing before reading


[PATCH 12/17] partition: remove redundant read_mapping_page error checks

2007-04-11 Thread Nate Diller
Remove unneeded PageError checking in read_dev_sector(), and clean up the
code a bit.

Can anyone point out why it's OK to use page_address() here on a page which
has not been kmapped?  If it's not OK, then a good number of callers need to
be fixed.

Signed-off-by: Nate Diller <[EMAIL PROTECTED]>

---

diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/partitions/check.c linux-2.6.21-rc6-mm1-test/fs/partitions/check.c
--- linux-2.6.21-rc6-mm1/fs/partitions/check.c  2007-04-09 17:24:03.0 -0700
+++ linux-2.6.21-rc6-mm1-test/fs/partitions/check.c 2007-04-10 21:59:01.0 -0700
@@ -568,16 +568,12 @@ unsigned char *read_dev_sector(struct bl
 
page = read_mapping_page(mapping, (pgoff_t)(n >> (PAGE_CACHE_SHIFT-9)),
 NULL);
-   if (!IS_ERR(page)) {
-   if (PageError(page))
-   goto fail;
-   p->v = page;
-   return (unsigned char *)page_address(page) +  ((n & ((1 << (PAGE_CACHE_SHIFT - 9)) - 1)) << 9);
-fail:
-   page_cache_release(page);
+   if (IS_ERR(page)) {
+   p->v = NULL;
+   return NULL;
}
-   p->v = NULL;
-   return NULL;
+   p->v = page;
+   return (unsigned char *)page_address(page) +  ((n & ((1 << 
(PAGE_CACHE_SHIFT - 9)) - 1)) << 9);
 }
 
 EXPORT_SYMBOL(read_dev_sector);

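The offset arithmetic in read_dev_sector() can be checked in userspace. What follows is a minimal sketch, not kernel code: it assumes 4 KiB pages (a shift of 12) and 512-byte sectors, and the SKETCH_* macros and both function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for illustration only: 4 KiB pages, 512-byte sectors. */
#define SKETCH_PAGE_SHIFT   12
#define SKETCH_SECTOR_SHIFT 9

/* Index of the page-cache page that holds sector n. */
static unsigned long sector_to_page_index(unsigned long long n)
{
	return (unsigned long)(n >> (SKETCH_PAGE_SHIFT - SKETCH_SECTOR_SHIFT));
}

/* Byte offset of sector n inside that page. */
static size_t sector_to_page_offset(unsigned long long n)
{
	unsigned long long sectors_per_page =
		1ULL << (SKETCH_PAGE_SHIFT - SKETCH_SECTOR_SHIFT);
	size_t off = (size_t)((n & (sectors_per_page - 1)) << SKETCH_SECTOR_SHIFT);

	/* The offset must always land inside one page. */
	assert(off < ((size_t)1 << SKETCH_PAGE_SHIFT));
	return off;
}
```

Under these assumptions sectors 0 to 7 share page 0, and sector 9 sits 512 bytes into page 1.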

[PATCH 9/17] minix: convert dir_get_page to read_kmap_page

2007-04-11 Thread Nate Diller
Replace minix dir_get_page() and dir_put_page() using the new
read_kmap_page() and put_kmapped_page()/put_locked_page() calls.  Also, use
__read_kmap_page() instead of re-taking the page_lock.

Signed-off-by: Nate Diller <[EMAIL PROTECTED]>

---

diff -urpN -X dontdiff linux-2.6.21-rc5-mm4/fs/minix/dir.c 
linux-2.6.21-rc5-mm4-test/fs/minix/dir.c
--- linux-2.6.21-rc5-mm4/fs/minix/dir.c 2007-04-05 17:14:25.0 -0700
+++ linux-2.6.21-rc5-mm4-test/fs/minix/dir.c2007-04-06 02:31:55.0 
-0700
@@ -23,12 +23,6 @@ const struct file_operations minix_dir_o
.fsync  = minix_sync_file,
 };
 
-static inline void dir_put_page(struct page *page)
-{
-   kunmap(page);
-   page_cache_release(page);
-}
-
 /*
  * Return the offset into page `page_nr' of the last valid
  * byte in that page, plus one.
@@ -60,22 +54,6 @@ static int dir_commit_chunk(struct page 
return err;
 }
 
-static struct page * dir_get_page(struct inode *dir, unsigned long n)
-{
-   struct address_space *mapping = dir->i_mapping;
-   struct page *page = read_mapping_page(mapping, n, NULL);
-   if (!IS_ERR(page)) {
-   kmap(page);
-   if (!PageUptodate(page))
-   goto fail;
-   }
-   return page;
-
-fail:
-   dir_put_page(page);
-   return ERR_PTR(-EIO);
-}
-
 static inline void *minix_next_entry(void *de, struct minix_sb_info *sbi)
 {
return (void*)((char*)de + sbi->s_dirsize);
@@ -102,7 +80,7 @@ static int minix_readdir(struct file * f
 
for ( ; n < npages; n++, offset = 0) {
char *p, *kaddr, *limit;
-   struct page *page = dir_get_page(inode, n);
+   struct page *page = read_kmap_page(inode->i_mapping, n);
 
if (IS_ERR(page))
continue;
@@ -128,12 +106,12 @@ static int minix_readdir(struct file * f
(n << PAGE_CACHE_SHIFT) | offset,
inumber, DT_UNKNOWN);
if (over) {
-   dir_put_page(page);
+   put_kmapped_page(page);
goto done;
}
}
}
-   dir_put_page(page);
+   put_kmapped_page(page);
}
 
 done:
@@ -177,7 +155,7 @@ minix_dirent *minix_find_entry(struct de
for (n = 0; n < npages; n++) {
char *kaddr, *limit;
 
-   page = dir_get_page(dir, n);
+   page = read_kmap_page(dir->i_mapping, n);
if (IS_ERR(page))
continue;
 
@@ -198,7 +176,7 @@ minix_dirent *minix_find_entry(struct de
if (namecompare(namelen, sbi->s_namelen, name, namx))
goto found;
}
-   dir_put_page(page);
+   put_kmapped_page(page);
}
return NULL;
 
@@ -233,11 +211,10 @@ int minix_add_link(struct dentry *dentry
for (n = 0; n <= npages; n++) {
char *limit, *dir_end;
 
-   page = dir_get_page(dir, n);
+   page = __read_kmap_page(dir->i_mapping, n);
err = PTR_ERR(page);
if (IS_ERR(page))
goto out;
-   lock_page(page);
kaddr = (char*)page_address(page);
dir_end = kaddr + minix_last_byte(dir, n);
limit = kaddr + PAGE_CACHE_SIZE - sbi->s_dirsize;
@@ -265,8 +242,7 @@ int minix_add_link(struct dentry *dentry
if (namecompare(namelen, sbi->s_namelen, name, namx))
goto out_unlock;
}
-   unlock_page(page);
-   dir_put_page(page);
+   put_locked_page(page);
}
BUG();
return -EINVAL;
@@ -288,13 +264,12 @@ got_it:
err = dir_commit_chunk(page, from, to);
dir->i_mtime = dir->i_ctime = CURRENT_TIME_SEC;
mark_inode_dirty(dir);
-out_put:
-   dir_put_page(page);
+   put_kmapped_page(page);
 out:
return err;
 out_unlock:
-   unlock_page(page);
-   goto out_put;
+   put_locked_page(page);
+   return err;
 }
 
 int minix_delete_entry(struct minix_dir_entry *de, struct page *page)
@@ -314,7 +289,7 @@ int minix_delete_entry(struct minix_dir_
} else {
unlock_page(page);
}
-   dir_put_page(page);
+   put_kmapped_page(page);
inode->i_ctime = inode->i_mtime = CURRENT_TIME_SEC;
mark_inode_dirty(inode);
return err;
@@ -378,7 +353,7 @@ int minix_empty_dir(struct inode * inode
for (i = 0; i < npages; i++) {
char *p, *kaddr, *limit;
 
-   page = dir_get_page(inode, i);
+   page = read_kmap_page(inode->i_mapping, i);
if 

[PATCH 10/17] mtd: convert page_read to read_kmap_page

2007-04-11 Thread Nate Diller
Replace page_read() with read_kmap_page()/__read_kmap_page().  This probably
fixes behaviour on highmem systems, since page_address() was being used
without kmap().  Also eliminate the need to re-take the page lock during
writes to the page.

Signed-off-by: Nate Diller <[EMAIL PROTECTED]>

---

diff -urpN -X dontdiff linux-2.6.21-rc5-mm4/drivers/mtd/devices/block2mtd.c 
linux-2.6.21-rc5-mm4-test/drivers/mtd/devices/block2mtd.c
--- linux-2.6.21-rc5-mm4/drivers/mtd/devices/block2mtd.c2007-04-05 
17:14:24.0 -0700
+++ linux-2.6.21-rc5-mm4-test/drivers/mtd/devices/block2mtd.c   2007-04-06 
01:59:19.0 -0700
@@ -39,12 +39,6 @@ struct block2mtd_dev {
 /* Static info about the MTD, used in cleanup_module */
 static LIST_HEAD(blkmtd_device_list);
 
-
-static struct page *page_read(struct address_space *mapping, int index)
-{
-   return read_mapping_page(mapping, index, NULL);
-}
-
 /* erase a specified part of the device */
 static int _block2mtd_erase(struct block2mtd_dev *dev, loff_t to, size_t len)
 {
@@ -56,23 +50,19 @@ static int _block2mtd_erase(struct block
u_long *max;
 
while (pages) {
-   page = page_read(mapping, index);
-   if (!page)
-   return -ENOMEM;
+   page = __read_kmap_page(mapping, index);
if (IS_ERR(page))
return PTR_ERR(page);
 
max = page_address(page) + PAGE_SIZE;
for (p=page_address(page); pblkdev->bd_inode->i_mapping, index);
-   if (!page)
-   return -ENOMEM;
+   page = read_kmap_page(dev->blkdev->bd_inode->i_mapping, index);
if (IS_ERR(page))
return PTR_ERR(page);
 
memcpy(buf, page_address(page) + offset, cpylen);
-   page_cache_release(page);
+   put_kmapped_page(page);
 
if (retlen)
*retlen += cpylen;
@@ -163,19 +151,15 @@ static int _block2mtd_write(struct block
cpylen = len;   // this page
len = len - cpylen;
 
-   page = page_read(mapping, index);
-   if (!page)
-   return -ENOMEM;
+   page = __read_kmap_page(mapping, index);
if (IS_ERR(page))
return PTR_ERR(page);
 
if (memcmp(page_address(page)+offset, buf, cpylen)) {
-   lock_page(page);
memcpy(page_address(page) + offset, buf, cpylen);
set_page_dirty(page);
-   unlock_page(page);
}
-   page_cache_release(page);
+   put_locked_page(page);
 
if (retlen)
*retlen += cpylen;

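On the dropped `if (!page) return -ENOMEM;` tests: the read helpers report failure through an errno value encoded in the returned pointer, so a NULL test presumably never fires. A rough userspace sketch of that encoding follows. The sketch_* names, the 4095 limit and the fake reader are assumptions made for illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_ERRNO 4095   /* assumed size of the reserved error range */

/* Encode a negative errno value as a pointer in the top of the address space. */
static void *sketch_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Recover the errno value from an encoded pointer. */
static long sketch_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True only for pointers inside the reserved error range; NULL is not one. */
static int sketch_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

static int sketch_backing_page;  /* stands in for a struct page */

/* Hypothetical reader: a valid pointer on success, an encoded errno on
 * failure, and never NULL. */
static void *sketch_read_page(int fail)
{
	assert(fail == 0 || fail == 1);
	if (fail)
		return sketch_err_ptr(-EIO);
	return &sketch_backing_page;
}
```

A caller written against this convention needs only the error-range test followed by the errno extraction, which is the shape the converted call sites have.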

[PATCH 4/17] ext2: convert ext2_get_page to read_kmap_page

2007-04-11 Thread Nate Diller
Replace ext2_get_page() and ext2_put_page() using the new read_kmap_page()
and put_kmapped_page() calls.  Also, change the ext2_check_page() call to
return the page's error status, and update the call sites accordingly.

Signed-off-by: Nate Diller <[EMAIL PROTECTED]>

---

diff -urpN -X dontdiff linux-2.6.21-rc5-mm4/fs/ext2/dir.c 
linux-2.6.21-rc5-mm4-test/fs/ext2/dir.c
--- linux-2.6.21-rc5-mm4/fs/ext2/dir.c  2007-04-06 12:27:03.0 -0700
+++ linux-2.6.21-rc5-mm4-test/fs/ext2/dir.c 2007-04-06 14:34:23.0 
-0700
@@ -35,12 +35,6 @@ static inline unsigned ext2_chunk_size(s
return inode->i_sb->s_blocksize;
 }
 
-static inline void ext2_put_page(struct page *page)
-{
-   kunmap(page);
-   page_cache_release(page);
-}
-
 static inline unsigned long dir_pages(struct inode *inode)
 {
return (inode->i_size+PAGE_CACHE_SIZE-1)>>PAGE_CACHE_SHIFT;
@@ -74,7 +68,7 @@ static int ext2_commit_chunk(struct page
return err;
 }
 
-static void ext2_check_page(struct page *page)
+static int ext2_check_page(struct page *page)
 {
struct inode *dir = page->mapping->host;
struct super_block *sb = dir->i_sb;
@@ -86,6 +80,14 @@ static void ext2_check_page(struct page 
ext2_dirent *p;
char *error;
 
+   if (likely(PageChecked(page))) {
+   if (likely(!PageError(page)))
+   return 0;
+
+   put_kmapped_page(page);
+   return -EIO;
+   }
+
if ((dir->i_size >> PAGE_CACHE_SHIFT) == page->index) {
limit = dir->i_size & ~PAGE_CACHE_MASK;
if (limit & (chunk_size - 1))
@@ -112,7 +114,7 @@ static void ext2_check_page(struct page 
goto Eend;
 out:
SetPageChecked(page);
-   return;
+   return 0;
 
/* Too bad, we had an error */
 
@@ -153,24 +155,8 @@ Eend:
 fail:
SetPageChecked(page);
SetPageError(page);
-}
-
-static struct page * ext2_get_page(struct inode *dir, unsigned long n)
-{
-   struct address_space *mapping = dir->i_mapping;
-   struct page *page = read_mapping_page(mapping, n, NULL);
-   if (!IS_ERR(page)) {
-   kmap(page);
-   if (!PageChecked(page))
-   ext2_check_page(page);
-   if (PageError(page))
-   goto fail;
-   }
-   return page;
-
-fail:
-   ext2_put_page(page);
-   return ERR_PTR(-EIO);
+   put_kmapped_page(page);
+   return -EIO;
 }
 
 /*
@@ -262,9 +248,9 @@ ext2_readdir (struct file * filp, void *
for ( ; n < npages; n++, offset = 0) {
char *kaddr, *limit;
ext2_dirent *de;
-   struct page *page = ext2_get_page(inode, n);
+   struct page *page = read_kmap_page(inode->i_mapping, n);
 
-   if (IS_ERR(page)) {
+   if (IS_ERR(page) || ext2_check_page(page)) {
ext2_error(sb, __FUNCTION__,
   "bad page in #%lu",
   inode->i_ino);
@@ -286,7 +272,7 @@ ext2_readdir (struct file * filp, void *
if (de->rec_len == 0) {
ext2_error(sb, __FUNCTION__,
"zero-length directory entry");
-   ext2_put_page(page);
+   put_kmapped_page(page);
return -EIO;
}
if (de->inode) {
@@ -301,13 +287,13 @@ ext2_readdir (struct file * filp, void *
(nf_pos += le16_to_cpu(de->rec_len);
}
-   ext2_put_page(page);
+   put_kmapped_page(page);
}
return 0;
 }
@@ -344,8 +330,8 @@ struct ext2_dir_entry_2 * ext2_find_entr
n = start;
do {
char *kaddr;
-   page = ext2_get_page(dir, n);
-   if (!IS_ERR(page)) {
+   page = read_kmap_page(dir->i_mapping, n);
+   if (!IS_ERR(page) && !ext2_check_page(page)) {
kaddr = page_address(page);
de = (ext2_dirent *) kaddr;
kaddr += ext2_last_byte(dir, n) - reclen;
@@ -353,14 +339,14 @@ struct ext2_dir_entry_2 * ext2_find_entr
if (de->rec_len == 0) {
ext2_error(dir->i_sb, __FUNCTION__,
"zero-length directory entry");
-   ext2_put_page(page);
+  

[PATCH 8/17] jfs: use locking read_mapping_page

2007-04-11 Thread Nate Diller
Use the new locking variant of read_mapping_page to avoid doing extra work.

Signed-off-by: Nate Diller <[EMAIL PROTECTED]>

---

diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/jfs/jfs_metapage.c 
linux-2.6.21-rc6-mm1-test/fs/jfs/jfs_metapage.c
--- linux-2.6.21-rc6-mm1/fs/jfs/jfs_metapage.c  2007-04-09 17:23:48.0 
-0700
+++ linux-2.6.21-rc6-mm1-test/fs/jfs/jfs_metapage.c 2007-04-09 
21:37:09.0 -0700
@@ -632,12 +632,11 @@ struct metapage *__get_metapage(struct i
}
SetPageUptodate(page);
} else {
-   page = read_mapping_page(mapping, page_index, NULL);
-   if (IS_ERR(page) || !PageUptodate(page)) {
+   page = __read_mapping_page(mapping, page_index, NULL);
+   if (IS_ERR(page)) {
jfs_err("read_mapping_page failed!");
return NULL;
}
-   lock_page(page);
}
 
mp = page_to_mp(page, page_offset);


[PATCH 2/17] fs: introduce new read_cache_page interface

2007-04-11 Thread Nate Diller
Export a single version of read_cache_page, which returns with a locked,
Uptodate page or a synchronous error, and use inline helper functions to
replicate the old behavior.  Also, introduce new helper functions for the
most common file system uses, which include kmapping the page, as well as
needing to keep the page locked.  These changes collectively eliminate a
substantial amount of private fs logic in favor of generic code.

It also simplifies filemap.c significantly, by assuming that callers want
synchronous behavior, which is true for all callers anyway except one.

Signed-off-by: Nate Diller <[EMAIL PROTECTED]>

---

diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/include/linux/pagemap.h 
linux-2.6.21-rc6-mm1-test/include/linux/pagemap.h
--- linux-2.6.21-rc6-mm1/include/linux/pagemap.h2007-04-11 
14:22:19.0 -0700
+++ linux-2.6.21-rc6-mm1-test/include/linux/pagemap.h   2007-04-11 
14:29:31.0 -0700
@@ -108,21 +108,30 @@ static inline struct page *grab_cache_pa
 
 extern struct page * grab_cache_page_nowait(struct address_space *mapping,
unsigned long index);
-extern struct page * read_cache_page_async(struct address_space *mapping,
-   unsigned long index, filler_t *filler,
-   void *data);
-extern struct page * read_cache_page(struct address_space *mapping,
+extern struct page *__read_cache_page(struct address_space *mapping,
unsigned long index, filler_t *filler,
void *data);
 extern int read_cache_pages(struct address_space *mapping,
struct list_head *pages, filler_t *filler, void *data);
 
-static inline struct page *read_mapping_page_async(
-   struct address_space *mapping,
+void fastcall unlock_page(struct page *page);
+static inline struct page *read_cache_page(struct address_space *mapping,
+   unsigned long index, filler_t *filler,
+   void *data)
+{
+   struct page *page;
+
+   page = __read_cache_page(mapping, index, filler, data);
+   if (!IS_ERR(page))
+   unlock_page(page);
+   return page;
+}
+
+static inline struct page *__read_mapping_page(struct address_space *mapping,
 unsigned long index, void *data)
 {
filler_t *filler = (filler_t *)mapping->a_ops->readpage;
-   return read_cache_page_async(mapping, index, filler, data);
+   return __read_cache_page(mapping, index, filler, data);
 }
 
 static inline struct page *read_mapping_page(struct address_space *mapping,
@@ -132,6 +141,36 @@ static inline struct page *read_mapping_
return read_cache_page(mapping, index, filler, data);
 }
 
+static inline struct page *__read_kmap_page(struct address_space *mapping,
+ unsigned long index)
+{
+   struct page *page = __read_mapping_page(mapping, index, NULL);
+   if (!IS_ERR(page))
+   kmap(page);
+   return page;
+}
+
+static inline struct page *read_kmap_page(struct address_space *mapping,
+ unsigned long index)
+{
+   struct page *page = read_mapping_page(mapping, index, NULL);
+   if (!IS_ERR(page))
+   kmap(page);
+   return page;
+}
+
+static inline void put_kmapped_page(struct page *page)
+{
+   kunmap(page);
+   page_cache_release(page);
+}
+
+static inline void put_locked_page(struct page *page)
+{
+   unlock_page(page);
+   put_kmapped_page(page);
+}
+
 int add_to_page_cache(struct page *page, struct address_space *mapping,
unsigned long index, gfp_t gfp_mask);
 int add_to_page_cache_lru(struct page *page, struct address_space *mapping,
diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/mm/filemap.c 
linux-2.6.21-rc6-mm1-test/mm/filemap.c
--- linux-2.6.21-rc6-mm1/mm/filemap.c   2007-04-11 14:26:42.0 -0700
+++ linux-2.6.21-rc6-mm1-test/mm/filemap.c  2007-04-10 21:46:03.0 
-0700
@@ -1600,115 +1600,53 @@ int generic_file_readonly_mmap(struct fi
 EXPORT_SYMBOL(generic_file_mmap);
 EXPORT_SYMBOL(generic_file_readonly_mmap);
 
-static struct page *__read_cache_page(struct address_space *mapping,
-   unsigned long index,
-   int (*filler)(void *,struct page*),
-   void *data)
-{
-   struct page *page, *cached_page = NULL;
-   int err;
-repeat:
-   page = find_get_page(mapping, index);
-   if (!page) {
-   if (!cached_page) {
-   cached_page = page_cache_alloc_cold(mapping);
-   if (!cached_page)
-   return ERR_PTR(-ENOMEM);
-   }
-   err = add_to_page_cache_lru(cached_page, mapping,
-   index, 

[PATCH 3/17] afs: convert afs_dir_get_page to read_kmap_page

2007-04-11 Thread Nate Diller
Replace afs_dir_get_page() and afs_dir_put_page() using the new
read_kmap_page() and put_kmapped_page() calls, and eliminate unnecessary
PageError checks.  Also, change the afs_dir_check_page() call to return
the page's error status, and update the call site accordingly.

Signed-off-by: Nate Diller <[EMAIL PROTECTED]>

---

diff -urpN -X dontdiff linux-2.6.21-rc5-mm4/fs/afs/dir.c 
linux-2.6.21-rc5-mm4-test/fs/afs/dir.c
--- linux-2.6.21-rc5-mm4/fs/afs/dir.c   2007-04-06 12:27:03.0 -0700
+++ linux-2.6.21-rc5-mm4-test/fs/afs/dir.c  2007-04-06 14:30:22.0 
-0700
@@ -115,12 +115,15 @@ struct afs_dir_lookup_cookie {
 /*
  * check that a directory page is valid
  */
-static inline void afs_dir_check_page(struct inode *dir, struct page *page)
+static inline int afs_dir_check_page(struct inode *dir, struct page *page)
 {
struct afs_dir_page *dbuf;
loff_t latter;
int tmp, qty;
 
+   if (likely(PageChecked(page)))
+   return PageError(page);
+
 #if 0
/* check the page count */
qty = desc.size / sizeof(dbuf->blocks[0]);
@@ -154,52 +157,16 @@ static inline void afs_dir_check_page(st
}
 
SetPageChecked(page);
-   return;
+   return 0;
 
  error:
SetPageChecked(page);
SetPageError(page);
-
+   return 1;
 } /* end afs_dir_check_page() */
 
 /*/
 /*
- * discard a page cached in the pagecache
- */
-static inline void afs_dir_put_page(struct page *page)
-{
-   kunmap(page);
-   page_cache_release(page);
-
-} /* end afs_dir_put_page() */
-
-/*/
-/*
- * get a page into the pagecache
- */
-static struct page *afs_dir_get_page(struct inode *dir, unsigned long index)
-{
-   struct page *page;
-
-   _enter("{%lu},%lu", dir->i_ino, index);
-
-   page = read_mapping_page(dir->i_mapping, index, NULL);
-   if (!IS_ERR(page)) {
-   kmap(page);
-   if (!PageChecked(page))
-   afs_dir_check_page(dir, page);
-   if (PageError(page))
-   goto fail;
-   }
-   return page;
-
- fail:
-   afs_dir_put_page(page);
-   return ERR_PTR(-EIO);
-} /* end afs_dir_get_page() */
-
-/*/
-/*
  * open an AFS directory file
  */
 static int afs_dir_open(struct inode *inode, struct file *file)
@@ -344,11 +311,16 @@ static int afs_dir_iterate(struct inode 
blkoff = *fpos & ~(sizeof(union afs_dir_block) - 1);
 
/* fetch the appropriate page from the directory */
-   page = afs_dir_get_page(dir, blkoff / PAGE_SIZE);
+   page = read_kmap_page(dir->i_mapping, blkoff / PAGE_SIZE);
if (IS_ERR(page)) {
ret = PTR_ERR(page);
break;
}
+   if (afs_dir_check_page(dir, page)) {
+   ret = -EIO;
+   put_kmapped_page(page);
+   break;
+   }
 
limit = blkoff & ~(PAGE_SIZE - 1);
 
@@ -361,7 +333,7 @@ static int afs_dir_iterate(struct inode 
ret = afs_dir_iterate_block(fpos, dblock, blkoff,
cookie, filldir);
if (ret != 1) {
-   afs_dir_put_page(page);
+   put_kmapped_page(page);
goto out;
}
 
@@ -369,7 +341,7 @@ static int afs_dir_iterate(struct inode 
 
} while (*fpos < dir->i_size && blkoff < limit);
 
-   afs_dir_put_page(page);
+   put_kmapped_page(page);
ret = 0;
}
 
diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/afs/mntpt.c 
linux-2.6.21-rc6-mm1-test/fs/afs/mntpt.c
--- linux-2.6.21-rc6-mm1/fs/afs/mntpt.c 2007-04-09 17:24:03.0 -0700
+++ linux-2.6.21-rc6-mm1-test/fs/afs/mntpt.c2007-04-10 21:22:07.0 
-0700
@@ -74,11 +74,6 @@ int afs_mntpt_check_symlink(struct afs_v
ret = PTR_ERR(page);
goto out;
}
-
-   ret = -EIO;
-   if (PageError(page))
-   goto out_free;
-
buf = kmap(page);
 
/* examine the symlink's contents */
@@ -98,7 +93,6 @@ int afs_mntpt_check_symlink(struct afs_v
ret = 0;
 
kunmap(page);
- out_free:
page_cache_release(page);
  out:
_leave(" = %d", ret);
@@ -180,10 +174,6 @@ static struct vfsmount *afs_mntpt_do_aut
goto error;
}
 
-   ret = -EIO;
-   if (PageError(page))
-   goto error;
-
buf = kmap(page);
memcpy(devname, buf, size);
kunmap(page);

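The checker functions in the ext2 and afs patches share one idea: validate a directory page once, record the verdict in the page flags, and answer later calls from the flags alone. A minimal userspace sketch of that pattern, assuming a toy page structure (the sketch_* names, the flag values and the `corrupt` field are invented for illustration):

```c
#include <assert.h>
#include <errno.h>

#define SKETCH_PG_CHECKED 0x1u  /* validation has already run */
#define SKETCH_PG_ERROR   0x2u  /* validation found a problem */

struct sketch_dir_page {
	unsigned int flags;
	int corrupt;      /* pretend outcome of the expensive validation */
	int validations;  /* how many times the slow path actually ran */
};

/* Returns 0 for a good page and -EIO for a bad one; the slow path runs at
 * most once per page. */
static int sketch_check_page(struct sketch_dir_page *p)
{
	assert(p);

	/* Fast path: reuse the cached verdict. */
	if (p->flags & SKETCH_PG_CHECKED)
		return (p->flags & SKETCH_PG_ERROR) ? -EIO : 0;

	/* Slow path: validate, then cache the result in the flags. */
	p->validations++;
	p->flags |= SKETCH_PG_CHECKED;
	if (p->corrupt) {
		p->flags |= SKETCH_PG_ERROR;
		return -EIO;
	}
	return 0;
}
```

Note that the posted patches differ in one detail that the sketch does not model: the ext2 checker also releases the page on failure, while the afs checker leaves that to the caller.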
[PATCH 0/17] fs: cleanup single page synchronous read interface

2007-04-11 Thread Nate Diller
Nick Piggin recently changed the read_cache_page interface to be
synchronous, which is pretty much what the file systems want anyway.  Turns
out that they have more in common than that, though, and some of them want
to be able to get an uptodate *locked* page.  Many of them want a kmapped
page, which is uptodate and unlocked, and they all have their own individual
helper functions to achieve this.

Since the helper functions are so similar, this patch just combines them
into a small number of simple library functions, which call read_cache_page
(renamed to __read_cache_page because it now returns a locked page).  The
immediate result is a vast reduction in the number of fs-specific helper
functions.  The secondary goal is to reduce the number of places the page
lock is taken, and eliminate a lot of PageUptodate and PageError checks.

The file systems that still use PageChecked now have checker functions that
return an error if the page is corrupted or has some other error.  This
simplifies the logic since the checker function is not part of any helper
function anymore.

Compile tested on x86_64.

Signed-off-by: Nate Diller <[EMAIL PROTECTED]>

---

 drivers/mtd/devices/block2mtd.c  |   28 +--
 fs/afs/dir.c |   56 +++---
 fs/afs/mntpt.c   |   10 --
 fs/cramfs/inode.c|3 
 fs/ext2/dir.c|   82 -
 fs/freevxfs/vxfs_extern.h|1 
 fs/freevxfs/vxfs_inode.c |2 
 fs/freevxfs/vxfs_lookup.c|4 -
 fs/freevxfs/vxfs_subr.c  |   33 
 fs/hfs/bnode.c   |4 -
 fs/hfsplus/bnode.c   |4 -
 fs/jffs2/fs.c|   27 ---
 fs/jffs2/gc.c|   15 ++-
 fs/jfs/jfs_metapage.c|5 -
 fs/minix/dir.c   |   59 ---
 fs/ntfs/aops.h   |   67 -
 fs/ntfs/bitmap.c |8 +-
 fs/ntfs/dir.c|   65 ++---
 fs/ntfs/index.c  |   12 +--
 fs/ntfs/lcnalloc.c   |6 -
 fs/ntfs/logfile.c|   12 +--
 fs/ntfs/mft.c|   53 +
 fs/ntfs/super.c  |   38 -
 fs/ntfs/usnjrnl.c|4 -
 fs/partitions/check.c|   14 +--
 fs/reiser4/plugin/file/tail_conversion.c |8 --
 fs/reiser4/plugin/item/extent_file_ops.c |9 --
 fs/reiserfs/xattr.c  |   48 ++--
 fs/sysv/dir.c|   19 +---
 fs/ufs/balloc.c  |8 +-
 fs/ufs/dir.c |   90 +--
 fs/ufs/truncate.c|8 +-
 fs/ufs/util.c|   52 -
 fs/ufs/util.h|   10 --
 include/linux/pagemap.h  |   53 -
 mm/filemap.c |  118 +++
 36 files changed, 315 insertions(+), 720 deletions(-)

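The layering described in this cover letter can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions: `struct sketch_page` and every sketch_* function are invented stand-ins that track lock, kmap and reference state with counters, and the comments name the helper from the series that each one imitates.

```c
#include <assert.h>

/* Toy stand-in for struct page; the fields only track state for the sketch. */
struct sketch_page {
	int locked;
	int kmapped;
	int refcount;
};

/* Imitates __read_cache_page(): returns a referenced, locked page. */
static struct sketch_page *sketch_read_locked(struct sketch_page *p)
{
	p->refcount++;
	p->locked = 1;
	return p;
}

/* Imitates read_cache_page(): same read, but the lock is dropped. */
static struct sketch_page *sketch_read(struct sketch_page *p)
{
	sketch_read_locked(p);
	p->locked = 0;
	return p;
}

/* Imitates __read_kmap_page(): locked and kmapped. */
static struct sketch_page *sketch_read_kmap_locked(struct sketch_page *p)
{
	sketch_read_locked(p);
	p->kmapped++;
	return p;
}

/* Imitates read_kmap_page(): unlocked and kmapped. */
static struct sketch_page *sketch_read_kmap(struct sketch_page *p)
{
	sketch_read(p);
	p->kmapped++;
	return p;
}

/* Imitates put_kmapped_page(): undo the kmap, then drop the reference. */
static void sketch_put_kmapped(struct sketch_page *p)
{
	assert(p->kmapped > 0 && p->refcount > 0);
	p->kmapped--;
	p->refcount--;
}

/* Imitates put_locked_page(): unlock first, then release as above. */
static void sketch_put_locked(struct sketch_page *p)
{
	assert(p->locked);
	p->locked = 0;
	sketch_put_kmapped(p);
}
```

Each acquire helper pairs with exactly one release helper, so a caller that wants the lock uses the double-underscore read and put_locked_page(), and everyone else uses the plain read and put_kmapped_page().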

Re: If not readdir() then what?

2007-04-11 Thread Jörn Engel
On Thu, 12 April 2007 11:46:41 +1000, Neil Brown wrote:
> 
> I could argue that nfs came before ext3+dirindex, so ext3 should have
> been designed to work properly with NFS.  You could argue that fixing
> it in nfsd fixes it for all filesystems.  But I'm not sure either of
> those arguments are likely to be at all convincing...

Caring about a non-ext3 filesystem, I sure would like an nfs solution as
well. :)

> Hmmm. I wonder.  Which is more likely?
>   - That two 64bit hashes from some set are the same
>   - or that 65536 48bit hashes from a set of equal size are the same.

The former.  Each bit going from hash strength to collision chain length
reduces the likelihood of an overflow.  In the extreme case of a 0bit
hash and 64bit collision chain, you need 2^64 entries compared to 2^32
for the other extreme.

However, the collision chain gives me quite a bit of headache.  One
would have to store each entry's position on the chain, deal with older
entries getting deleted, newer entries getting removed, etc.  All this
requires a lot of complicated code that basically never gets tested in
the wild.

Just settling for a 64bit hash and returning -EEXIST when someone causes
a collision on creat() sounds more appealing.  Directories with 4
billion entries will cause problems, but that is hardly news to anyone.

Jörn

-- 
Fantasy is more important than knowledge. Knowledge is limited,
while fantasy embraces the whole world.
-- Albert Einstein

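The remark about 4 billion entries can be put into numbers with the usual birthday estimate. The sketch below assumes uniformly distributed hashes, which a real directory hash only approximates, and the function name is made up for illustration. It computes the expected number of colliding pairs, C(n, 2) / 2^bits, which approximates the collision probability while it stays well below 1.

```c
#include <assert.h>

/* Expected number of colliding pairs among n uniformly random hashes of
 * the given width: n * (n - 1) / 2 / 2^bits. */
static double expected_colliding_pairs(double n, unsigned int bits)
{
	double space = 1.0;
	unsigned int i;

	assert(bits <= 1023);  /* keep 2^bits representable as a double */
	for (i = 0; i < bits; i++)
		space *= 2.0;
	return n * (n - 1.0) / 2.0 / space;
}
```

For a 64bit hash the estimate is about 1.2e-10 with 65536 entries and just under 0.5 with 2^32 entries, so collisions only become a practical concern somewhere around the 4 billion mark.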

Announce: New release of Linux-ready Firmware Dev Kit - Release 2

2007-04-11 Thread Selbak, Rolla N

The Linux-ready Firmware Developer Kit team is pleased to announce the
release R2 of the kit. 

This release consists mostly of bug fixes, an infrastructure
reorganization that makes it easier for outside developers to write and
contribute plugins, and, of course, tons of documentation. A few new
tests and features have been added, including things you've asked for,
such as ssh-upload and a text-based version of the results.

The Linux-ready Firmware Developer Kit is a tool to test how well Linux
works together with the firmware (BIOS or EFI) of your machine, and is
designed for use by both firmware development teams and Linux kernel
hackers to prevent and diagnose firmware bugs.

Summary
===

Enhancements
* ssh upload of results
* globalized DSDT & SSDT lists for standalone and .so plugins
* better logically implemented dmesg and e820 functions
* text-based results
* documentation of each plugin and the meaning of its results
(Documentation/TestsInfo)
* bug-fixes


New Tests
=
* ia64 error injection tool
* fan test (now functional)
* SUN test (now functional)
* ebda test
* cpufreq: added test for Ingo's _PSS bug
* dmesg: added detection for bugzilla.kernel.org bug 6859


You can download this latest release of the kit from

http://www.linuxfirmwarekit.org


The Linux-ready Firmware Developer Kit team
Jacob Pan
Rolla Selbak
Arjan van de Ven


[Please cc my email on any replies or comments]

Thanks,

rs



[PATCH 7/12] get_unmapped_area handles MAP_FIXED on parisc

2007-04-11 Thread Benjamin Herrenschmidt
Handle MAP_FIXED in parisc arch_get_unmapped_area(): just return the
address. We might also want to check for possible cache aliasing
issues now that we get called in that case (like ARM or MIPS), so
leave a comment for the maintainers to pick up.

Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>

 arch/parisc/kernel/sys_parisc.c |5 +
 1 file changed, 5 insertions(+)

Index: linux-cell/arch/parisc/kernel/sys_parisc.c
===
--- linux-cell.orig/arch/parisc/kernel/sys_parisc.c 2007-03-22 
15:28:05.0 +1100
+++ linux-cell/arch/parisc/kernel/sys_parisc.c  2007-03-22 15:29:08.0 
+1100
@@ -106,6 +106,11 @@ unsigned long arch_get_unmapped_area(str
 {
if (len > TASK_SIZE)
return -ENOMEM;
+   /* Might want to check for cache aliasing issues for MAP_FIXED case
+* like ARM or MIPS ??? --BenH.
+*/
+   if (flags & MAP_FIXED)
+   return addr;
if (!addr)
addr = TASK_UNMAPPED_BASE;
 


[PATCH 6/12] get_unmapped_area handles MAP_FIXED on ia64

2007-04-11 Thread Benjamin Herrenschmidt
Handle MAP_FIXED in ia64 arch_get_unmapped_area() and
hugetlb_get_unmapped_area(): just call prepare_hugepage_range()
in the latter and is_hugepage_only_range() in the former.

Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>

 arch/ia64/kernel/sys_ia64.c |7 +++
 arch/ia64/mm/hugetlbpage.c  |8 
 2 files changed, 15 insertions(+)

Index: linux-cell/arch/ia64/kernel/sys_ia64.c
===
--- linux-cell.orig/arch/ia64/kernel/sys_ia64.c 2007-03-22 15:10:45.0 
+1100
+++ linux-cell/arch/ia64/kernel/sys_ia64.c  2007-03-22 15:10:47.0 
+1100
@@ -33,6 +33,13 @@ arch_get_unmapped_area (struct file *fil
if (len > RGN_MAP_LIMIT)
return -ENOMEM;
 
+   /* handle fixed mapping: prevent overlap with huge pages */
+   if (flags & MAP_FIXED) {
+   if (is_hugepage_only_range(mm, addr, len))
+   return -EINVAL;
+   return addr;
+   }
+
 #ifdef CONFIG_HUGETLB_PAGE
if (REGION_NUMBER(addr) == RGN_HPAGE)
addr = 0;
Index: linux-cell/arch/ia64/mm/hugetlbpage.c
===
--- linux-cell.orig/arch/ia64/mm/hugetlbpage.c  2007-03-22 15:12:32.0 
+1100
+++ linux-cell/arch/ia64/mm/hugetlbpage.c   2007-03-22 15:12:39.0 
+1100
@@ -148,6 +148,14 @@ unsigned long hugetlb_get_unmapped_area(
return -ENOMEM;
if (len & ~HPAGE_MASK)
return -EINVAL;
+
+   /* Handle MAP_FIXED */
+   if (flags & MAP_FIXED) {
+   if (prepare_hugepage_range(addr, len, pgoff))
+   return -EINVAL;
+   return addr;
+   }
+
/* This code assumes that RGN_HPAGE != 0. */
if ((REGION_NUMBER(addr) != RGN_HPAGE) || (addr & (HPAGE_SIZE - 1)))
addr = HPAGE_REGION_BASE;
 


[PATCH 5/12] get_unmapped_area handles MAP_FIXED on i386

2007-04-11 Thread Benjamin Herrenschmidt
Handle MAP_FIXED in i386 hugetlb_get_unmapped_area(), just call
prepare_hugepage_range.

Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>

 arch/i386/mm/hugetlbpage.c |6 ++
 1 file changed, 6 insertions(+)

Index: linux-cell/arch/i386/mm/hugetlbpage.c
===
--- linux-cell.orig/arch/i386/mm/hugetlbpage.c  2007-03-22 16:08:12.0 
+1100
+++ linux-cell/arch/i386/mm/hugetlbpage.c   2007-03-22 16:14:19.0 
+1100
@@ -367,6 +367,12 @@ hugetlb_get_unmapped_area(struct file *f
if (len > TASK_SIZE)
return -ENOMEM;
 
+   if (flags & MAP_FIXED) {
+   if (prepare_hugepage_range(addr, len, pgoff))
+   return -EINVAL;
+   return addr;
+   }
+
if (addr) {
addr = ALIGN(addr, HPAGE_SIZE);
vma = find_vma(mm, addr);

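The MAP_FIXED branch that these hugetlb patches add has the same shape on each architecture: reject a fixed request that cannot hold huge pages, otherwise hand the address back untouched. A minimal userspace sketch of that control flow follows. All SKETCH_* constants (2 MiB huge pages, a 1 GiB task size, the flag value, the fallback base) and the alignment-only range check are assumptions for illustration, not the kernel's values or its prepare_hugepage_range().

```c
#include <assert.h>
#include <errno.h>

#define SKETCH_MAP_FIXED   0x10UL
#define SKETCH_HPAGE_SIZE  (2UL << 20)                /* assumed 2 MiB */
#define SKETCH_HPAGE_MASK  (~(SKETCH_HPAGE_SIZE - 1))
#define SKETCH_TASK_SIZE   (1UL << 30)                /* assumed 1 GiB */
#define SKETCH_SEARCH_BASE (256UL << 20)              /* assumed fallback */

/* Alignment-only stand-in for prepare_hugepage_range(). */
static int sketch_prepare_hugepage_range(unsigned long addr, unsigned long len)
{
	if (len & ~SKETCH_HPAGE_MASK)
		return -EINVAL;
	if (addr & ~SKETCH_HPAGE_MASK)
		return -EINVAL;
	return 0;
}

/* Returns the chosen address, or a negative errno value on failure. */
static long sketch_hugetlb_get_unmapped_area(unsigned long addr,
					     unsigned long len,
					     unsigned long flags)
{
	if (len > SKETCH_TASK_SIZE)
		return -ENOMEM;

	/* Fixed request: validate it, then return the address as passed in. */
	if (flags & SKETCH_MAP_FIXED) {
		if (sketch_prepare_hugepage_range(addr, len))
			return -EINVAL;
		return (long)addr;
	}

	/* Otherwise addr is only a hint; real code searches the VMA list,
	 * which this sketch replaces with a constant. */
	assert((SKETCH_SEARCH_BASE & ~SKETCH_HPAGE_MASK) == 0);
	return (long)SKETCH_SEARCH_BASE;
}
```

Putting the check here keeps the policy next to the code that knows the architecture's huge page rules, which is what lets the last patch in the series delete the generic special case.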

[PATCH 4/12] get_unmapped_area handles MAP_FIXED on frv

2007-04-11 Thread Benjamin Herrenschmidt
Handle MAP_FIXED in arch_get_unmapped_area on frv. Trivial case, just
return the address.

Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>

 arch/frv/mm/elf-fdpic.c |4 
 1 file changed, 4 insertions(+)

Index: linux-cell/arch/frv/mm/elf-fdpic.c
===
--- linux-cell.orig/arch/frv/mm/elf-fdpic.c 2007-03-22 15:00:50.0 
+1100
+++ linux-cell/arch/frv/mm/elf-fdpic.c  2007-03-22 15:01:06.0 +1100
@@ -64,6 +64,10 @@ unsigned long arch_get_unmapped_area(str
if (len > TASK_SIZE)
return -ENOMEM;
 
+   /* handle MAP_FIXED */
+   if (flags & MAP_FIXED)
+   return addr;
+
/* only honour a hint if we're not going to clobber something doing so 
*/
if (addr) {
addr = PAGE_ALIGN(addr);


[PATCH 9/12] get_unmapped_area handles MAP_FIXED on x86_64

2007-04-11 Thread Benjamin Herrenschmidt
Handle MAP_FIXED in x86_64 arch_get_unmapped_area(), simple case, just
return the address as passed in

Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>

 arch/x86_64/kernel/sys_x86_64.c |3 +++
 1 file changed, 3 insertions(+)

Index: linux-cell/arch/x86_64/kernel/sys_x86_64.c
===
--- linux-cell.orig/arch/x86_64/kernel/sys_x86_64.c 2007-03-22 
16:10:10.0 +1100
+++ linux-cell/arch/x86_64/kernel/sys_x86_64.c  2007-03-22 16:11:06.0 
+1100
@@ -93,6 +93,9 @@ arch_get_unmapped_area(struct file *filp
unsigned long start_addr;
unsigned long begin, end;

+   if (flags & MAP_FIXED)
+   return addr;
+
find_start_end(flags, &begin, &end);
 
if (len > end)


[PATCH 8/12] get_unmapped_area handles MAP_FIXED on sparc64

2007-04-11 Thread Benjamin Herrenschmidt
Handle MAP_FIXED in hugetlb_get_unmapped_area on sparc64
by just using prepare_hugepage_range()

Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>

 arch/sparc64/mm/hugetlbpage.c |6 ++
 1 file changed, 6 insertions(+)

Index: linux-cell/arch/sparc64/mm/hugetlbpage.c
===
--- linux-cell.orig/arch/sparc64/mm/hugetlbpage.c   2007-03-22 16:12:57.0 +1100
+++ linux-cell/arch/sparc64/mm/hugetlbpage.c    2007-03-22 16:15:33.0 +1100
@@ -175,6 +175,12 @@ hugetlb_get_unmapped_area(struct file *f
if (len > task_size)
return -ENOMEM;
 
+   if (flags & MAP_FIXED) {
+   if (prepare_hugepage_range(addr, len, pgoff))
+   return -EINVAL;
+   return addr;
+   }
+
if (addr) {
addr = ALIGN(addr, HPAGE_SIZE);
vma = find_vma(mm, addr);


[PATCH 12/12] get_unmapped_area doesn't need hugetlbfs hacks anymore

2007-04-11 Thread Benjamin Herrenschmidt
Remove the hugetlbfs specific hacks in toplevel get_unmapped_area() now
that all archs and hugetlbfs itself do the right thing for both cases.

Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>

 mm/mmap.c |   16 
 1 file changed, 16 deletions(-)

Index: linux-cell/mm/mmap.c
===
--- linux-cell.orig/mm/mmap.c   2007-04-12 12:14:46.0 +1000
+++ linux-cell/mm/mmap.c    2007-04-12 12:14:47.0 +1000
@@ -1381,22 +1381,6 @@ get_unmapped_area(struct file *file, uns
if (addr & ~PAGE_MASK)
return -EINVAL;
 
-   if (file && is_file_hugepages(file))  {
-   /*
-* Check if the given range is hugepage aligned, and
-* can be made suitable for hugepages.
-*/
-   ret = prepare_hugepage_range(addr, len, pgoff);
-   } else {
-   /*
-* Ensure that a normal request is not falling in a
-* reserved hugepage range.  For some archs like IA-64,
-* there is a separate region for hugepages.
-*/
-   ret = is_hugepage_only_range(current->mm, addr, len);
-   }
-   if (ret)
-   return -EINVAL;
return addr;
 }
 


[PATCH 10/12] get_unmapped_area handles MAP_FIXED in hugetlbfs

2007-04-11 Thread Benjamin Herrenschmidt
Generic hugetlb_get_unmapped_area() now handles MAP_FIXED by just
calling prepare_hugepage_range()

Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>

 fs/hugetlbfs/inode.c |6 ++
 1 file changed, 6 insertions(+)

Index: linux-cell/fs/hugetlbfs/inode.c
===
--- linux-cell.orig/fs/hugetlbfs/inode.c    2007-03-22 16:12:56.0 +1100
+++ linux-cell/fs/hugetlbfs/inode.c 2007-03-22 16:16:02.0 +1100
@@ -115,6 +115,12 @@ hugetlb_get_unmapped_area(struct file *f
if (len > TASK_SIZE)
return -ENOMEM;
 
+   if (flags & MAP_FIXED) {
+   if (prepare_hugepage_range(addr, len, pgoff))
+   return -EINVAL;
+   return addr;
+   }
+
if (addr) {
addr = ALIGN(addr, HPAGE_SIZE);
vma = find_vma(mm, addr);


[PATCH 11/12] get_unmapped_area handles MAP_FIXED in generic code

2007-04-11 Thread Benjamin Herrenschmidt
generic arch_get_unmapped_area() now handles MAP_FIXED. Now that
all implementations have been fixed, change the toplevel
get_unmapped_area() to call into arch or drivers for the MAP_FIXED
case.

Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>

 mm/mmap.c |   25 +++--
 1 file changed, 15 insertions(+), 10 deletions(-)

Index: linux-cell/mm/mmap.c
===
--- linux-cell.orig/mm/mmap.c   2007-03-22 16:29:22.0 +1100
+++ linux-cell/mm/mmap.c    2007-03-22 16:30:06.0 +1100
@@ -1199,6 +1199,9 @@ arch_get_unmapped_area(struct file *filp
if (len > TASK_SIZE)
return -ENOMEM;
 
+   if (flags & MAP_FIXED)
+   return addr;
+
if (addr) {
addr = PAGE_ALIGN(addr);
vma = find_vma(mm, addr);
@@ -1272,6 +1275,9 @@ arch_get_unmapped_area_topdown(struct fi
if (len > TASK_SIZE)
return -ENOMEM;
 
+   if (flags & MAP_FIXED)
+   return addr;
+
/* requesting a specific address */
if (addr) {
addr = PAGE_ALIGN(addr);
@@ -1360,22 +1366,21 @@ get_unmapped_area(struct file *file, uns
unsigned long pgoff, unsigned long flags)
 {
unsigned long ret;
+   unsigned long (*get_area)(struct file *, unsigned long,
+ unsigned long, unsigned long, unsigned long);
 
-   if (!(flags & MAP_FIXED)) {
-   unsigned long (*get_area)(struct file *, unsigned long, unsigned long, unsigned long, unsigned long);
-
-   get_area = current->mm->get_unmapped_area;
-   if (file && file->f_op && file->f_op->get_unmapped_area)
-   get_area = file->f_op->get_unmapped_area;
-   addr = get_area(file, addr, len, pgoff, flags);
-   if (IS_ERR_VALUE(addr))
-   return addr;
-   }
+   get_area = current->mm->get_unmapped_area;
+   if (file && file->f_op && file->f_op->get_unmapped_area)
+   get_area = file->f_op->get_unmapped_area;
+   addr = get_area(file, addr, len, pgoff, flags);
+   if (IS_ERR_VALUE(addr))
+   return addr;
 
if (addr > TASK_SIZE - len)
return -ENOMEM;
if (addr & ~PAGE_MASK)
return -EINVAL;
+
if (file && is_file_hugepages(file))  {
/*
 * Check if the given range is hugepage aligned, and


[PATCH 2/12] get_unmapped_area handles MAP_FIXED on alpha

2007-04-11 Thread Benjamin Herrenschmidt
Handle MAP_FIXED in alpha's arch_get_unmapped_area(), simple case, just
return the address as passed in

Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>

 arch/alpha/kernel/osf_sys.c |3 +++
 1 file changed, 3 insertions(+)

Index: linux-cell/arch/alpha/kernel/osf_sys.c
===
--- linux-cell.orig/arch/alpha/kernel/osf_sys.c 2007-03-22 14:58:33.0 +1100
+++ linux-cell/arch/alpha/kernel/osf_sys.c  2007-03-22 14:58:44.0 +1100
@@ -1267,6 +1267,9 @@ arch_get_unmapped_area(struct file *filp
if (len > limit)
return -ENOMEM;
 
+   if (flags & MAP_FIXED)
+   return addr;
+
/* First, see if the given suggestion fits.
 
   The OSF/1 loader (/sbin/loader) relies on us returning an


[PATCH 3/12] get_unmapped_area handles MAP_FIXED on arm

2007-04-11 Thread Benjamin Herrenschmidt
ARM already had a case for MAP_FIXED in arch_get_unmapped_area() though
it was not called before. Fix the comment to reflect that it will now
be called.

Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>

 arch/arm/mm/mmap.c |3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

Index: linux-cell/arch/arm/mm/mmap.c
===
--- linux-cell.orig/arch/arm/mm/mmap.c  2007-03-22 14:59:51.0 +1100
+++ linux-cell/arch/arm/mm/mmap.c   2007-03-22 15:00:01.0 +1100
@@ -49,8 +49,7 @@ arch_get_unmapped_area(struct file *filp
 #endif
 
/*
-* We should enforce the MAP_FIXED case.  However, currently
-* the generic kernel code doesn't allow us to handle this.
+* We enforce the MAP_FIXED case.
 */
if (flags & MAP_FIXED) {
if (aliasing && flags & MAP_SHARED && addr & (SHMLBA - 1))


[PATCH 1/12] get_unmapped_area handles MAP_FIXED on powerpc

2007-04-11 Thread Benjamin Herrenschmidt
Handle MAP_FIXED in powerpc's arch_get_unmapped_area() in all 3
implementations of it.

Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>

 arch/powerpc/mm/hugetlbpage.c |   21 +
 1 file changed, 21 insertions(+)

Index: linux-cell/arch/powerpc/mm/hugetlbpage.c
===
--- linux-cell.orig/arch/powerpc/mm/hugetlbpage.c   2007-03-22 14:52:07.0 +1100
+++ linux-cell/arch/powerpc/mm/hugetlbpage.c    2007-03-22 14:57:40.0 +1100
@@ -572,6 +572,13 @@ unsigned long arch_get_unmapped_area(str
if (len > TASK_SIZE)
return -ENOMEM;
 
+   /* handle fixed mapping: prevent overlap with huge pages */
+   if (flags & MAP_FIXED) {
+   if (is_hugepage_only_range(mm, addr, len))
+   return -EINVAL;
+   return addr;
+   }
+
if (addr) {
addr = PAGE_ALIGN(addr);
vma = find_vma(mm, addr);
@@ -647,6 +654,13 @@ arch_get_unmapped_area_topdown(struct fi
if (len > TASK_SIZE)
return -ENOMEM;
 
+   /* handle fixed mapping: prevent overlap with huge pages */
+   if (flags & MAP_FIXED) {
+   if (is_hugepage_only_range(mm, addr, len))
+   return -EINVAL;
+   return addr;
+   }
+
/* dont allow allocations above current base */
if (mm->free_area_cache > base)
mm->free_area_cache = base;
@@ -829,6 +843,13 @@ unsigned long hugetlb_get_unmapped_area(
/* Paranoia, caller should have dealt with this */
BUG_ON((addr + len)  < addr);
 
+   /* Handle MAP_FIXED */
+   if (flags & MAP_FIXED) {
+   if (prepare_hugepage_range(addr, len, pgoff))
+   return -EINVAL;
+   return addr;
+   }
+
if (test_thread_flag(TIF_32BIT)) {
curareas = current->mm->context.low_htlb_areas;
 


[PATCH 0/12] Pass MAP_FIXED down to get_unmapped_area

2007-04-11 Thread Benjamin Herrenschmidt
This is a "first step" as there are still cleanups to be done in various
areas touched by that code but I think it's probably good to go as is and
at least enables me to implement what I need for PowerPC.

(Andrew, this is also candidate for 2.6.22 since I haven't had any real
objection, mostly suggestion for improving further, which I'll try to
do later, and I have further powerpc patches that rely on this).

The current get_unmapped_area code calls the f_ops->get_unmapped_area or
the arch one (via the mm) only when MAP_FIXED is not passed. That makes
it impossible for archs to impose proper constraints on regions of the
virtual address space. To work around that, get_unmapped_area() then
calls some hugetlbfs specific hacks.

This causes several problems, among others:

 - It makes it impossible for a driver or filesystem to do the same thing
that hugetlbfs does (for example, to allow a driver to use larger page
sizes to map external hardware) if that requires applying a constraint
on the addresses (constraining that mapping in certain regions and other
mappings out of those regions).

 - Some archs like arm, mips, sparc, sparc64, sh and sh64 already want
MAP_FIXED to be passed down in order to deal with aliasing issues.
The code is there to handle it... but is never called.

This series of patches moves the logic to handle MAP_FIXED down to the
various arch/driver get_unmapped_area() implementations, and then changes
the generic code to always call them. The hugetlbfs hacks then disappear
from the generic code.
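
As a rough illustration of the control-flow change described above, here is a
userspace model, not kernel code. Every model_* name, the constants, and the
reserved range are assumptions made up for the sketch, and the old hugetlbfs
checks in the toplevel function are left out for brevity.

```c
/* Userspace model only: none of these names exist in the kernel. */
#define MODEL_MAP_FIXED  0x10UL
#define MODEL_PAGE_SIZE  4096UL
#define MODEL_TASK_SIZE  0x80000000UL
#define MODEL_EINVAL     22UL
#define MODEL_ENOMEM     12UL

typedef unsigned long (*model_get_area_fn)(unsigned long addr,
                                           unsigned long len,
                                           unsigned long flags);

/* Mirrors the convention of encoding errors as small negative values. */
int model_is_err(unsigned long v)
{
    return v >= (unsigned long)-4095L;
}

/* Hypothetical arch hook that keeps normal mappings out of
 * [0x40000000, 0x50000000), e.g. a region reserved for huge pages. */
unsigned long model_arch_get_area(unsigned long addr, unsigned long len,
                                  unsigned long flags)
{
    if (flags & MODEL_MAP_FIXED) {
        if (addr < 0x50000000UL && addr + len > 0x40000000UL)
            return -MODEL_EINVAL;
        return addr;
    }
    return 0x10000000UL;    /* stand-in for a real free-area search */
}

/* Checks shared by both variants once an address has been chosen. */
unsigned long model_common_checks(unsigned long addr, unsigned long len)
{
    if (addr > MODEL_TASK_SIZE - len)
        return -MODEL_ENOMEM;
    if (addr & (MODEL_PAGE_SIZE - 1))
        return -MODEL_EINVAL;
    return addr;
}

/* Before the series: the hook is skipped when MAP_FIXED is set. */
unsigned long model_gua_old(model_get_area_fn hook, unsigned long addr,
                            unsigned long len, unsigned long flags)
{
    if (!(flags & MODEL_MAP_FIXED)) {
        addr = hook(addr, len, flags);
        if (model_is_err(addr))
            return addr;
    }
    return model_common_checks(addr, len);
}

/* After the series: the hook is always called and sees MAP_FIXED. */
unsigned long model_gua_new(model_get_area_fn hook, unsigned long addr,
                            unsigned long len, unsigned long flags)
{
    addr = hook(addr, len, flags);
    if (model_is_err(addr))
        return addr;
    return model_common_checks(addr, len);
}
```

In this model a MAP_FIXED request at 0x40000000 slips past the arch
constraint with model_gua_old() but is rejected by model_gua_new(), which is
the behaviour the series is after.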

Since I need to do some special 64K pages mappings for SPEs on cell, I need
to work around the first problem at least. I have further patches thus
implementing a "slices" layer that handles multiple page sizes through
slices of the address space for use by hugetlbfs, the SPE code, and possibly
others, but it requires that series of patches first.

There is still a potential (but not practical) issue due to the fact that
filesystems/drivers implementing g_u_a will effectively bypass all arch
checks. This is not an issue in practice as the only filesystems/drivers
using that hook are doing so for arch specific purposes in the first place.

There is also a problem with mremap that will completely bypass all arch
checks. I'll try to address that separately, I'm not 100% certain yet how,
possibly by making it not work when the vma has a file whose f_ops has a
get_unmapped_area callback, and by making it use is_hugepage_only_range()
before expanding into a new area.

Also, I want to turn is_hugepage_only_range() into a more generic
is_normal_page_range() as that's really what it will end up meaning
when used in stack grow, brk grow and mremap.

None of the above "issues" however are introduced by this patch, they are
already there, so I think the patch can go in.

Cheers,
Ben.



Re: [PATCH 7/8] Clean up workqueue.c with respect to the freezer based cpu-hotplug

2007-04-11 Thread Srivatsa Vaddagiri
On Tue, Apr 03, 2007 at 10:48:20PM +0530, Srivatsa Vaddagiri wrote:
> > Actually, we should do this before destroy_workqueue() calls 
> > flush_workqueue().
> > Otherwise flush_cpu_workqueue() can hang forever in a similar manner.
> 
> Yep. I guess these are a class of freezer deadlocks very similar to vfork
> parent waiting on child case. I get a feeling these should become common
> outside of kthread too (A waits on B for something, B gets frozen, which
> means A won't freeze causing freezer to fail). Can freezer detect this
> dependency somehow and thaw B automatically? Probably not that easy ..

I wonder if there is some value in "enforcing" an order in which
processes get frozen i.e freeze A first before B. That may solve the
deadlocks we have been discussing wrt kthread_stop and flush_workqueue
as well.

The idea is similar to how deadlocks involving multiple locks are solved -
where an ordering is enforced. Take Lock A first before Lock B.

If process A waits on B (like in kthread_stop or flush_workqueue), then if we:

1. Insert A and B in a list (freeze_me_first_list)
2. Have freezer scan freeze_me_first_list before the master
   task-list, so that it:
2a. "freezes A and waits for A to get frozen" first
2b. "freezes B and waits for B to get frozen" next

then we would avoid the nastiness of "B getting frozen first and A doesn't
freeze because of that" with fewer code changes?
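
The ordering argument above can be sketched with a toy model. This is not
freezer code: the model_* names are hypothetical, and the only rule assumed is
that a task blocked on an already-frozen task cannot reach its own freeze
point.

```c
/* Toy model of the freeze-ordering idea; not kernel code. */
struct model_task {
    int waits_on;   /* index of the task this one blocks on, or -1 */
    int frozen;
};

/* A task blocked on an already-frozen task cannot reach its own
 * freeze point, so the attempt fails. */
int model_try_freeze(struct model_task *tasks, int i)
{
    int dep = tasks[i].waits_on;

    if (dep >= 0 && tasks[dep].frozen)
        return 0;
    tasks[i].frozen = 1;
    return 1;
}

/* Freeze tasks in the given order; return how many failed to freeze. */
int model_freeze_in_order(struct model_task *tasks, const int *order, int n)
{
    int i, failed = 0;

    for (i = 0; i < n; i++)
        if (!model_try_freeze(tasks, order[i]))
            failed++;
    return failed;
}

/* Task 0 is A (waiting on B), task 1 is B. */
int model_demo_failures(int a_first)
{
    struct model_task tasks[2] = { { 1, 0 }, { -1, 0 } };
    int order[2];

    order[0] = a_first ? 0 : 1;
    order[1] = a_first ? 1 : 0;
    return model_freeze_in_order(tasks, order, 2);
}
```

Freezing A before B leaves no failures in this model, while freezing B first
leaves A unable to freeze, which is the case the freeze_me_first_list is meant
to avoid.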

-- 
Regards,
vatsa


[ANNOUNCE] GIT 1.5.1.1

2007-04-11 Thread Junio C Hamano
The latest maintenance release GIT 1.5.1.1 is available at the
usual places:

  http://www.kernel.org/pub/software/scm/git/

  git-1.5.1.1.tar.{gz,bz2}  (tarball)
  git-htmldocs-1.5.1.1.tar.{gz,bz2} (preformatted docs)
  git-manpages-1.5.1.1.tar.{gz,bz2} (preformatted docs)
  RPMS/$arch/git-*-1.5.1.1-1.$arch.rpm  (RPM)

GIT v1.5.1.1 Release Notes
==========================

Fixes since v1.5.1
------------------

* Documentation updates

  - The --left-right option of rev-list and friends is documented.

  - The documentation for cvsimport has been majorly improved.

  - "git-show-ref --exclude-existing" was documented.

* Bugfixes

  - The implementation of -p option in "git cvsexportcommit" had
the meaning of -C (context reduction) option wrong, and
loosened the context requirements when it was told to be
strict.

  - "git cvsserver" did not behave like the real cvsserver when
client side removed a file from the working tree without
doing anything else on the path.  In such a case, it should
restore it from the checked out revision.

  - "git fsck" issued an alarming error message on detached
HEAD.  It is not an error since at least 1.5.0.

  - "git send-email" produced a References header of unbounded length;
fixed this with line-folding.

  - "git archive" to download from remote site should not
require you to be in a git repository, but it incorrectly
did.

  - "git apply" ignored -p for "diff --git" formatted
patches.

  - "git rerere" recorded a conflict that had one side empty
(the other side adds) incorrectly; this made merging in the
other direction fail to use previously recorded resolution.

  - t4200 test was broken where "wc -l" pads its output with
spaces.

  - "git branch -m old new" to rename branch did not work
without a configuration file in ".git/config".

  - The sample hook for notification e-mail was misnamed.

  - gitweb did not show type-changing patch correctly in the
blobdiff view.

  - git-svn did not error out with incorrect command line options.

  - git-svn fell into an infinite loop when insanely long commit
message was found.

  - git-svn dcommit and rebase was confused by patches that were
merged from another branch that is managed by git-svn.



Changes since v1.5.1 are as follows:

Arjen Laarhoven (4):
  usermanual.txt: some capitalization nits
  t3200-branch.sh: small language nit
  t5300-pack-object.sh: portability issue using /usr/bin/stat
  Makefile: iconv() on Darwin has the old interface

Brian Gernhardt (3):
  Fix t4200-rerere for white-space from "wc -l"
  Document --left-right option to rev-list.
  Distinguish branches by more than case in tests.

Dana How (1):
  Fix lseek(2) calls with args 2 and 3 swapped

Eric Wong (3):
  git-svn: bail out on incorrect command-line options
  git-svn: dcommit/rebase confused by patches with git-svn-id: lines
  git-svn: fix log command to avoid infinite loop on long commit messages

Frank Lichtenheld (7):
  cvsimport: sync usage lines with existing options
  cvsimport: Improve documentation of CVSROOT and CVS module determination
  cvsimport: Improve usage error reporting
  cvsimport: Reorder options in documentation for better understanding
  cvsimport: Improve formating consistency
  cvsserver: small corrections to asciidoc documentation
  cvsserver: Fix handling of diappeared files on update

Geert Bosch (1):
  Fix renaming branch without config file

Gerrit Pape (1):
  rename contrib/hooks/post-receieve-email to contrib/hooks/post-receive-email.

Jakub Narebski (1):
  gitweb: Fix bug in "blobdiff" view for split (e.g. file to symlink) patches

Jim Meyering (1):
  (encode_85, decode_85): Mark source buffer pointer as "const".

Julian Phillips (1):
  Documentation: show-ref: document --exclude-existing

Junio C Hamano (7):
  rerere: make sorting really stable.
  Fix dependency of common-cmds.h
  Documentation: tighten dependency for git.{html,txt}
  Prepare for 1.5.1.1
  Add Documentation/cmd-list.made to .gitignore
  fsck: do not complain on detached HEAD.
  GIT 1.5.1.1

Lars Hjemli (2):
  rename_ref(): only print a warning when config-file update fails
  Make builtin-branch.c handle the git config file

René Scharfe (1):
  Revert "builtin-archive: use RUN_SETUP"

Shawn O. Pearce (1):
  Honor -p when applying git diffs

Tomash Brechko (1):
  cvsexportcommit -p : fix the usage of git-apply -C.

Ville Skyttä (1):
  DESTDIR support for git/contrib/emacs

YOSHIFUJI Hideaki (1):
  Avoid composing too long "References" header.



Re: [PATCH, take4] FUTEX : new PRIVATE futexes

2007-04-11 Thread Nick Piggin

Eric Dumazet wrote:

On Wed, 11 Apr 2007 19:23:26 +1000
Nick Piggin <[EMAIL PROTECTED]> wrote:



As this external thing certainly is not doing the check itself, to be on the 
safe side we should enforce it in get_futex_key(). I agree with you : If we 
want to maximize performance, we could say : The check *must* be done by the 
caller.


Well we _control_ the API, so let's make it as clean and performant as possible
from the start.



Take a look at do_futex().
Adding checks in callers just increases code size. I tried this and got only 
bad results.
This would speed up only the slow path (i.e. when some user code wants to give 
us non-aligned addrs).
A single factorized check is cleaner and not slower, since we reduce icache 
pressure.


1 extra check versus all that additional argument passing? I don't think it
is conclusive.

--
SUSE Labs, Novell Inc.


Re: If not readdir() then what?

2007-04-11 Thread Neil Brown
On Wednesday April 11, [EMAIL PROTECTED] wrote:
> 
> Actually, no, we can't keep the collision chain count stable across a
> create/delete even while the tree is cached.  At least, not without
> storing a huge amount of state associated with each page.  (It would
> be a lot more work than simply having nfsd keep a fd cache for
> directory streams ;-).

Well, there's the rub, isn't it :-)
You think it is easier to fix the problem in nfsd, and I think it is
easier to fix the problem in ext3.  Both positions are quite
understandable and quite genuine.
And I am quite sure that all the issues that have been raised can be
solved with a bit of effort providing the motivation is there.

I could argue that nfs came before ext3+dirindex, so ext3 should have
been designed to work properly with NFS.  You could argue that fixing
it in nfsd fixes it for all filesystems.  But I'm not sure either of
those arguments are likely to be at all convincing...

Maybe the best compromise is to both fix the 'problem' :-?

Let me explore some designs a bit more..

NFS:
   All we have to do is cache the open files.  This should be only a
   performance issue, not a correctness issue (once we get 64bit
   cookies from ext3).  So the important thing is to cache them for
   a reasonable period of time.

   We currently cache the read-ahead info from regular files (though
   I was hoping that would go away when the page-cache-based readahead
   became a reality).  We can reasonably replace this with caching
   the open files if we are going to do it for directories anyway.

   So the primary key is "struct inode * + loff_t".  This is suitable
   both for file-readahead and for ext3-directory-caching.  Might also
   be useful for filesystem that stores pre-allocation information in
   the struct-file.
   We keep these in an LRU list and a hash table.  We register a
   callback with register_shrinker (or whatever it is called today) so
   that VM pressure can shrink the cache, and also arrange a timer to
   remove entries older than -- say -- 5 seconds.

   I think that placing a fixed size on the cache based on number of
   active clients would be a mistake, as it is virtually impossible to
   know how many active clients there are, and the number can change
   very dynamically.

   When a filesystem is un-exported, (rare event) we walk the whole
   list and discard entries for that filesystem.

   Look into the possibility of a callback on unlink to drop the
   cached open when the link count hits zero.  I wonder if inotify
   or leases can help with that.

   To help with large NUMA machines, we probably don't want a single
   hash LRU chain, but rather a number of LRU chains.  That way the
   action of moving an entry to the end of the chain is less likely to
   conflict with another processor trying to do the same thing to a
   different entry.  This is the sort of consideration that is already
   handled in the page cache, and having to handle it in every other
   cache is troublesome because the next time a need like that
   arises, the page cache will get fixed but other little caches won't
   until someone like Greg Banks comes along with a big hammer...
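
A very reduced sketch of the cache keyed on "struct inode * + loff_t" might
look like the following. It is an illustration only: the model_* names are
hypothetical, the table is direct-mapped (one entry per bucket, replaced on
conflict) rather than the LRU list plus hash table proposed above, and time
is passed in as plain seconds instead of using a timer or shrinker callback.

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_BUCKETS  64
#define MODEL_MAX_AGE  5    /* seconds; the "say 5 seconds" figure above */

struct model_entry {
    const void *inode;      /* stands in for struct inode * */
    int64_t     offset;     /* stands in for loff_t */
    long        stamp;      /* time of last use, in seconds */
    int         in_use;
};

struct model_entry model_table[MODEL_BUCKETS];

/* Mix the two halves of the key into a bucket index. */
size_t model_hash(const void *inode, int64_t offset)
{
    uintptr_t h = (uintptr_t)inode ^ ((uintptr_t)offset * 2654435761u);

    return (size_t)(h >> 4) % MODEL_BUCKETS;
}

/* Remember an open file for this (inode, offset) pair. */
void model_cache_put(const void *inode, int64_t offset, long now)
{
    struct model_entry *e = &model_table[model_hash(inode, offset)];

    e->inode = inode;
    e->offset = offset;
    e->stamp = now;
    e->in_use = 1;
}

/* Return 1 on a fresh hit (and refresh its timestamp), 0 otherwise. */
int model_cache_get(const void *inode, int64_t offset, long now)
{
    struct model_entry *e = &model_table[model_hash(inode, offset)];

    if (!e->in_use || e->inode != inode || e->offset != offset)
        return 0;
    if (now - e->stamp > MODEL_MAX_AGE) {
        e->in_use = 0;      /* too old: drop it, as the timer would */
        return 0;
    }
    e->stamp = now;
    return 1;
}

/* Insert at time 100, look up `age` seconds later. */
int model_cache_demo(long age)
{
    static int fake_inode;
    size_t i;

    for (i = 0; i < MODEL_BUCKETS; i++)
        model_table[i].in_use = 0;
    model_cache_put(&fake_inode, 4096, 100);
    return model_cache_get(&fake_inode, 4096, 100 + age);
}
```

A lookup two seconds after the insert hits, while one six seconds later
misses because the entry has aged out.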

EXT3:
   Have an rbtree for storing directory entries. This is attached to a
   page via the ->private page field.
   Normally each page of a directory has its own rbtree, but when two
   pages contain entries with the same hash, the one rbtree is shared
   between the pages.
   Thus when you load a block you must also load other blocks under
   the same hash, but I think you do that already.

   When you split a block (because it has become too big) the rbtree
   attached to that block is dismantled and each entry is inserted
   into the appropriate new rbtree, one for each of the two blocks.
   The entries are unchanged - they just get placed in a different
   tree - so cursors in the struct file will still be valid.

   Each entry has a count of the number of cursors pointing to it, and
   when this is non-zero, a refcount on the page is held, thus making
   sure the page doesn't get freed and the rbtree lost.  The entry
   should possibly also contain a pointer to the page.. not sure if
   that is needed.

   Each entry in the rbtree contains (in minor_hash) a sequence number
   that is used when multiple entries hash to the same value.  We
   store a 'current-seq-number' in the root of the rbtree and when an
   attempt to insert an entry finds a collision, we increase
   current-seq-number, set the minor_hash to that, and retry the
   insert.
   This minor_hash is combined with some bits of the major hash to
   form the fpos/cookie.

   The releasepage address_space_operation will check that all pages
   which share the same major hash are treated as a unit, all released
   at the same time.  So it will fail if any of the pages in the
   group are in use.  If they can all be freed, it will free the
   rbtree for that group.
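
The sequence-number scheme described above (minor_hash combined with bits of
the major hash to form the fpos/cookie) could be sketched as follows. The bit
split is an assumption chosen for illustration, as are all the model_* names;
the mail does not say how many bits of each hash would be used.

```c
#include <stdint.h>

/* Assumed split: low 16 bits carry the per-tree sequence number,
 * the bits above them carry the 32-bit major hash. */
#define MODEL_SEQ_BITS  16
#define MODEL_SEQ_MASK  ((1u << MODEL_SEQ_BITS) - 1)

struct model_tree_root {
    uint32_t current_seq;   /* the 'current-seq-number' kept in the root */
};

/* On a collision, take the next sequence number and retry the insert. */
uint32_t model_next_seq(struct model_tree_root *root)
{
    return ++root->current_seq;
}

/* Build the 64-bit cookie from the major hash and the sequence number. */
uint64_t model_make_cookie(uint32_t major_hash, uint32_t minor_seq)
{
    return ((uint64_t)major_hash << MODEL_SEQ_BITS) |
           (minor_seq & MODEL_SEQ_MASK);
}

uint32_t model_cookie_major(uint64_t cookie)
{
    return (uint32_t)(cookie >> MODEL_SEQ_BITS);
}

uint32_t model_cookie_seq(uint64_t cookie)
{
    return (uint32_t)(cookie & MODEL_SEQ_MASK);
}
```

With this layout two entries that share a major hash still get distinct
cookies, and cookies sort by major hash first, so a directory walk in hash
order stays monotonic.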


   This not only benefits nfsd, which opens 

[PATCH] fix bogon in /dev/mem mmap'ing on nommu

2007-04-11 Thread Benjamin Herrenschmidt
While digging through my MAP_FIXED changes, I found that rather obvious
bug in /dev/mem mmap implementation for nommu archs. get_unmapped_area()
is expected to return an address, not a pfn.

Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
---

I suppose that can go in anytime, and probably in stable too, Dave ?

Index: linux-cell/drivers/char/mem.c
===
--- linux-cell.orig/drivers/char/mem.c  2007-02-12 10:36:14.0 +1100
+++ linux-cell/drivers/char/mem.c   2007-04-12 11:38:44.0 +1000
@@ -248,7 +248,7 @@ static unsigned long get_unmapped_area_m
 {
if (!valid_mmap_phys_addr_range(pgoff, len))
return (unsigned long) -EINVAL;
-   return pgoff;
+   return pgoff << PAGE_SHIFT;
 }
 
 /* can't do an in-place private mapping if there's no MMU */
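
To make the one-line fix above concrete: pgoff is a page frame number, so it
has to be shifted by the page shift to become an address. A minimal sketch,
assuming 4 KiB pages; MODEL_PAGE_SHIFT and the function names are
illustrative, not the kernel's definitions.

```c
/* Assumption for the sketch: 4 KiB pages, i.e. a shift of 12. */
#define MODEL_PAGE_SHIFT 12

/* What the old code returned: the page frame number itself. */
unsigned long model_pgoff_as_is(unsigned long pgoff)
{
    return pgoff;
}

/* What the fixed code returns: the address of that page frame. */
unsigned long model_pgoff_to_addr(unsigned long pgoff)
{
    return pgoff << MODEL_PAGE_SHIFT;
}
```

For page frame 0x100 the fixed version yields 0x100000, whereas the old
version handed back 0x100 as if it were an address.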




Re: [PATCH 9/10] Vmi timer update.patch

2007-04-11 Thread Zachary Amsden

Chris Wright wrote:

* Zachary Amsden ([EMAIL PROTECTED]) wrote:
  

+void __init vmi_time_init(void)
+{
+   /* Disable PIT: BIOSes start PIT CH0 with 18.2hz periodic. */
+   outb_p(0x3a, PIT_MODE); /* binary, mode 5, LSB/MSB, ch 0 */


That shouldn't be necessary using clockevents.
  
Actually, I'm not so sure.  If clockevents simply masks the PIT when 
disabling it, we still have overhead of keeping the latch in sync, which 
requires a timer at the PIT frequency.  I can instrument to see how 
exactly the PIT gets disabled.



It should switch from pit to vmi-timer, and the switch should do the state
transitions on pit to go to unused mode.
  


Ok, here's why we need it: the reason is even more basic.  PIT 
clockevents never get set up; the time_init paravirt-op makes it 
conditional whether the PIT or VMI timer gets invoked.  But our BIOS 
still sets it up to run at 18.2 Hz, like any good BIOS would.  We need 
the disable hack, in fact it is actually a good thing to do for native 
hardware.  Why leave the PIT enabled with junk programming from the BIOS 
once we are in the protected mode kernel?  Eventually, on hardware that 
doesn't want to use the PIT at all, this might be wanted to conserve 
power (casually joking but potentially correct argument).


Zach


Re: Security computation within Linux kernel

2007-04-11 Thread H. Peter Anvin

Carlo Florendo wrote:


IIRC, The kernel does some encryption functions, involving TCP, NFS, and 
IPsec since all these are part of the kernel itself.




Yes, but key management is done in userspace.

-hpa


Re: Security computation within Linux kernel

2007-04-11 Thread Carlo Florendo

JanuGerman wrote:

Hi every one,

I have one question regarding the security libraries already shipped with the Linux kernel. Are all the PKI and RSA libraries provided by OpenSSL already integrated into the Linux kernel source code, or does one have to use OpenSSL separately?


IIRC, The kernel does some encryption functions, involving TCP, NFS, and 
IPsec since all these are part of the kernel itself.


If you intend to write your own apps that have to use encryption functions, 
it would be best to use the relevant encryption libraries, such as OpenSSL.


Thank you very much.

Best Regards,

Carlo


--
Carlo Florendo
Softare Engineer/Network Co-Administrator
Astra Philippines Inc.
UP-Ayala Technopark, Diliman 1101, Quezon City
Philippines
http://www.astra.ph

--
The Astra Group of Companies
5-3-11 Sekido, Tama City
Tokyo 206-0011, Japan
http://www.astra.co.jp


Re: "menu" versus "menuconfig" -- they're *both* a bad idea

2007-04-11 Thread Carlo Florendo

Robert P. J. Day wrote:

  (in short, if i, the builder, explicitly choose *not* to add a
certain feature to my build, i think i have every right to expect that
some other part of my configuration isn't quietly going to put some
sub-choice of that feature back in behind my back.)


I agree with this.  However, if another feature actually depends on another 
explicitly unselected feature, there should at least be a warning prompt 
that such is the case.


It probably would be hard though to track all dependencies.

Best Regards,

Carlo


--
Carlo Florendo
Softare Engineer/Network Co-Administrator
Astra Philippines Inc.
UP-Ayala Technopark, Diliman 1101, Quezon City
Philippines
http://www.astra.ph

--
The Astra Group of Companies
5-3-11 Sekido, Tama City
Tokyo 206-0011, Japan
http://www.astra.co.jp


Re: [PATCH 1/7] [RFC] External power framework

2007-04-11 Thread Anton Vorontsov
> diff --git a/drivers/Kconfig b/drivers/Kconfig
> index 050323f..c546de3 100644

I forgot to pass the -s flag to git-format-patch. :-/

Please count it for the whole x/7 patch set:

Signed-off-by: Anton Vorontsov <[EMAIL PROTECTED]>


RE: [PATCH] kernel-doc: fix plist.h comments

2007-04-11 Thread Perez-Gonzalez, Inaky
>From: Randy Dunlap <[EMAIL PROTECTED]>
>
>Make kernel-doc comments match macro names.
>Correct parameter names in a few places.
>Remove '#' from beginning of kernel-doc comment macro names.
>Remove extra (erroneous) blank lines in kernel-doc.
>
> ...
>
>cc: Inaky Perez-Gonzalez <[EMAIL PROTECTED]>
>cc: Daniel Walker <[EMAIL PROTECTED]>
>cc: Thomas Gleixner <[EMAIL PROTECTED]>
>cc: Oleg Nesterov <[EMAIL PROTECTED]>
>
>Signed-off-by: Randy Dunlap <[EMAIL PROTECTED]>

Acked-by: Inaky Perez-Gonzalez <[EMAIL PROTECTED]>


Re: [PATCH] i386 tsc: remove xtime_lock'ing around cpufreq notifier

2007-04-11 Thread Andrew Morton
On Wed, 11 Apr 2007 14:33:57 -0700
Andrew Morton <[EMAIL PROTECTED]> wrote:

> > Here is the call ordering ,
> > 
> > ktime_get()
> >  ktime_get_ts() -> read_seqretry(&xtime_lock, seq)
> >   getnstimeofday()
> >    __get_realtime_clock_ts() -> read_seqretry(&xtime_lock, seq)
> > 
> > 
> > I wonder if there is a weird case which causes this to loop forever .. But
> > as said , it's just something I noticed so I don't know if it's
> > related .
> > 
> 
> hm.
> 
> Bear in mind that printk calls sched_clock() for each line of output. 
> (with the "time" kernel boot parameter).
> 
> If we're doing a read_seqretry() in sched_clock() then basically any printk
> inside the write_seqlock() will cause a lockup.
> 
> So in fact, this explains my hang: I was debugging it with printk and I
> noticed that the printk before the write_seqlock() came out and the one
> after it did not.  Presumably if I wasn't using "time", that hang wouldn't
> have happened.
> 
> Which means that I still don't have a clue why Andi's patch is locking up
> the Vaio.
> 
> It's a bad idea to make sched_clock() this complex - we've gone and
> degraded kernel debuggability somewhat.
> 
> We have provision for fixing this: the architecture can provide its own
> printk_clock().  We should do something quick-n-dirty in printk_clock()
> which doesn't require any locks.
> 

OK, so I resurrected x86_64-mm-sched-clock-share.patch and
x86_64-mm-sched-clock64.patch.  The x86_64 box hangs on boot when using
netconsole and printk timestamps too.  Removing "time" from the kernel boot
command line prevents that.

This explains why the hang only happens with
x86_64-mm-log-reason-why-tsc-was-marked-unstable.patch applied, too: that
patch must be triggering a printk inside xtime_lock.

Does someone want to cook up a lockless printk_clock() for i386 and x86_64?
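
The lockup described above can be modelled in userspace. What follows is a minimal single-threaded sketch, not kernel code: the toy_* names are hypothetical, and the reader's retry loop is bounded so that the stuck case can be observed instead of hanging. It shows why a seqlock read issued from inside the write section of the same lock never completes, which is what a timestamped printk under xtime_lock amounts to when the clock read takes that lock.

```c
/* Toy single-threaded seqlock model (hypothetical names, not kernel code). */
struct toy_seqlock {
    unsigned seq;   /* odd while a writer holds the lock */
};

static void toy_write_lock(struct toy_seqlock *s)   { s->seq++; }
static void toy_write_unlock(struct toy_seqlock *s) { s->seq++; }

/*
 * Try to take a consistent snapshot of *data.  A real reader retries
 * forever; this loop is bounded so the stuck case becomes visible.
 * Returns 1 on success, 0 if every attempt saw a writer in progress.
 */
static int toy_read(const struct toy_seqlock *s, const long *data,
                    long *out, int max_tries)
{
    for (int i = 0; i < max_tries; i++) {
        unsigned start = s->seq;

        if (start & 1)
            continue;           /* writer active: retry */
        *out = *data;
        if (s->seq == start)
            return 1;           /* no writer interfered */
    }
    return 0;
}

/* Stand-in for a timestamped printk issued inside the write section. */
static int timestamp_inside_write_section(void)
{
    struct toy_seqlock lock = { 0 };
    long clock_ns = 1000, snap = 0;
    int ok;

    toy_write_lock(&lock);
    /* Same context as the writer: the sequence stays odd, so no try succeeds. */
    ok = toy_read(&lock, &clock_ns, &snap, 1000);
    toy_write_unlock(&lock);
    return ok;
}

/* The same read with no writer active succeeds on the first try. */
static int timestamp_outside_write_section(void)
{
    struct toy_seqlock lock = { 0 };
    long clock_ns = 1000, snap = 0;

    return toy_read(&lock, &clock_ns, &snap, 1000) && snap == 1000;
}
```

A lockless printk_clock() avoids the problem by never entering the retry loop at all, at the cost of a less precise timestamp.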



[PATCH] kernel-doc: fix plist.h comments

2007-04-11 Thread Randy Dunlap
From: Randy Dunlap <[EMAIL PROTECTED]>

Make kernel-doc comments match macro names.
Correct parameter names in a few places.
Remove '#' from beginning of kernel-doc comment macro names.
Remove extra (erroneous) blank lines in kernel-doc.

Warning(plist.h:100): Cannot understand  * #PLIST_HEAD_INIT - static struct plist_head initializer on line 100 - I thought it was a doc line
Warning(plist.h:112): Cannot understand  * #PLIST_NODE_INIT - static struct plist_node initializer on line 112 - I thought it was a doc line
Warning(plist.h:103): No description found for parameter '_lock'
Warning(plist.h:129): No description found for parameter 'lock'
Warning(plist.h:158): No description found for parameter 'pos'
Warning(plist.h:169): No description found for parameter 'pos'
Warning(plist.h:169): No description found for parameter 'n'
Warning(plist.h:179): No description found for parameter 'mem'

This still leaves one warning & one error that need attention:
Error(plist.h:219): cannot understand prototype: '('
Warning(plist.h): no structured comments found

cc: Inaky Perez-Gonzalez <[EMAIL PROTECTED]>
cc: Daniel Walker <[EMAIL PROTECTED]>
cc: Thomas Gleixner <[EMAIL PROTECTED]>
cc: Oleg Nesterov <[EMAIL PROTECTED]>

Signed-off-by: Randy Dunlap <[EMAIL PROTECTED]>
---
 include/linux/plist.h |   54 +-
 1 file changed, 23 insertions(+), 31 deletions(-)

--- linux-2621-rc6.orig/include/linux/plist.h
+++ linux-2621-rc6/include/linux/plist.h
@@ -97,9 +97,9 @@ struct plist_node {
 #endif
 
 /**
- * #PLIST_HEAD_INIT - static struct plist_head initializer
- *
+ * PLIST_HEAD_INIT - static struct plist_head initializer
  * @head:  struct plist_head variable name
+ * @_lock: lock to initialize for this list
  */
 #define PLIST_HEAD_INIT(head, _lock)   \
 {  \
@@ -109,8 +109,7 @@ struct plist_node {
 }
 
 /**
- * #PLIST_NODE_INIT - static struct plist_node initializer
- *
+ * PLIST_NODE_INIT - static struct plist_node initializer
  * @node:  struct plist_node variable name
  * @__prio:initial node priority
  */
@@ -122,8 +121,8 @@ struct plist_node {
 
 /**
  * plist_head_init - dynamic struct plist_head initializer
- *
  * @head:   plist_head pointer
+ * @lock:  list spinlock, remembered for debugging
  */
 static inline void
 plist_head_init(struct plist_head *head, spinlock_t *lock)
@@ -137,7 +136,6 @@ plist_head_init(struct plist_head *head,
 
 /**
  * plist_node_init - Dynamic struct plist_node initializer
- *
  * @node:   plist_node pointer
  * @prio:  initial node priority
  */
@@ -152,49 +150,46 @@ extern void plist_del(struct plist_node 
 
 /**
  * plist_for_each - iterate over the plist
- *
- * @pos1:  the type * to use as a loop counter.
- * @head:  the head for your list.
+ * @pos:   the type * to use as a loop counter
+ * @head:  the head for your list
  */
 #define plist_for_each(pos, head)  \
 list_for_each_entry(pos, &(head)->node_list, plist.node_list)
 
 /**
- * plist_for_each_entry_safe - iterate over a plist of given type safe
- * against removal of list entry
+ * plist_for_each_safe - iterate safely over a plist of given type
+ * @pos:   the type * to use as a loop counter
+ * @n: another type * to use as temporary storage
+ * @head:  the head for your list
  *
- * @pos1:  the type * to use as a loop counter.
- * @n1:another type * to use as temporary storage
- * @head:  the head for your list.
+ * Iterate over a plist of given type, safe against removal of list entry.
  */
 #define plist_for_each_safe(pos, n, head)  \
 list_for_each_entry_safe(pos, n, &(head)->node_list, plist.node_list)
 
 /**
  * plist_for_each_entry- iterate over list of given type
- *
- * @pos:   the type * to use as a loop counter.
- * @head:  the head for your list.
- * @member:the name of the list_struct within the struct.
+ * @pos:   the type * to use as a loop counter
+ * @head:  the head for your list
+ * @mem:   the name of the list_struct within the struct
  */
 #define plist_for_each_entry(pos, head, mem)   \
 list_for_each_entry(pos, &(head)->node_list, mem.plist.node_list)
 
 /**
- * plist_for_each_entry_safe - iterate over list of given type safe against
- * removal of list entry
- *
- * @pos:   the type * to use as a loop counter.
+ * plist_for_each_entry_safe - iterate safely over list of given type
+ * @pos:   the type * to use as a loop counter
  * @n: another type * to use as temporary storage
- * @head:  the head for your list.
- * @m: the name of the list_struct within the struct.
+ * @head:  the head for your list
+ * @m: the name of the list_struct within the struct
+ *
+ * Iterate over list of given type, safe against removal of list entry.
  */
 #define plist_for_each_entry_safe(pos, n, head, m) \

RE: Help Understanding Linux memory management

2007-04-11 Thread David Schwartz

> 1) When physical memory runs low, the memory manager will try to use
> memory currently allocated to the pagecache.  Is this true?

Yes.

> 2) When vm.overcommit_memory = 2 (overcommit disabled), and memory runs
> low, it appears that the memory manager does not try to use memory
> currently allocated to pagecache.  Is this true?

It does try; that doesn't mean it will succeed. If overcommit is disabled,
the OS must have enough (RAM+swap) to handle the maximum memory consumption
it has allowed to take place.

Perhaps you are laboring under the incorrect assumption that the pagecache
can always be shrunk to zero? Not all the data in the pagecache is
discardable. For example, any page that has been modified from its disk copy
cannot be discarded.

> 3) Is it possible to disable the pagecache?

No, because huge amounts of capability would become impossible. It is not
even clear to me how you could execute self-modifying code without a
pagecache.

DS




Re: [PATCH] markers-linker-generic

2007-04-11 Thread Vara Prasad

Jim Keniston wrote:


On Wed, 2007-04-11 at 15:21 -0400, Mathieu Desnoyers wrote:
 


* Andrew Morton ([EMAIL PROTECTED]) wrote:
   


On Wed, 11 Apr 2007 13:51:11 -0400
Mathieu Desnoyers <[EMAIL PROTECTED]> wrote:

 


What's this marker stuff about?

 


Hi Russel,

Here is an overview :
   


I am told that the systemtap developers plan to (or are) using this
infrastructure.

 


Quoting Frank Ch. Eigler, from the SystemTAP team :

"The LTTng user-space programs use it today.  Systemtap used to support
the earlier marker prototype and will be rapidly ported over to this
new API upon acceptance."


   


If correct: what is their reason for preferring it over kprobes?
 

Markers are not a substitute for kprobes, nor preferred over them; they
augment kprobes by enabling additional functionality.


 


I will let them answer on this one..

   



I'll take a shot at this one.

First of all, kprobes remains a vital foundation for SystemTap.  But
markers are attractive as an alternate source of trace/debug info.
Here's why:

1. Markers will live in the kernel and presumably be kept up to date by
the maintainers of the enclosing code.  We have a growing set of tapsets
(probe libraries), each of which "knows" the source code for a certain
area of the kernel.  Whenever the underlying kernel code changes (e.g.,
a function or one of its args disappears or is renamed), there's a
chance that the tapset will become invalid until we bring it back in
sync with the kernel.  As you can imagine, maintaining tapsets separate
from the kernel source is a maintenance headache.  Markers could
mitigate this.
 

Jim's above-stated reason is not a consideration for markers. We don't 
plan to convert the current tapsets to use markers. We do need to 
augment tapsets with a few markers in the kernel code where it is not 
easy to put a kprobe in a maintainable fashion -- e.g., in the middle of a 
function.



2. Because the kernel code is highly optimized, the kernel's dwarf info
doesn't always accurately reflect which variables have which values on
which lines (sometimes even upon entry to a function).  A marker is a
way to ensure that values of interest are available to SystemTap at
marked points.
 


Agreed


3. Sometimes the overhead of a kprobe probepoint is too much (either in
terms of time or locking) for the particular hotspot we want to probe.

 


Agreed


Jim

 


bye,
Vara Prasad



One odd thing about Synaptics

2007-04-11 Thread Pete Zaitcev
Hi, Peter:

There's one thing I wanted to report, just in case... It does not affect
anything, but it's odd.

Every time I lift a finger from the pad, the driver sends an event with
odd values (X is 1 and Y is 5855):

Event: time 1174695694.561806, type 1 (Key), code 330 (Touch), value 0
Event: time 1174695694.561809, type 3 (Absolute), code 0 (X), value 1425
Event: time 1174695694.561811, type 3 (Absolute), code 1 (Y), value 1223
Event: time 1174695694.561813, type 3 (Absolute), code 24 (Pressure), value 20
Event: time 1174695694.561816, -- Report Sync 
Event: time 1174695694.573918, type 3 (Absolute), code 0 (X), value 1500
Event: time 1174695694.573921, type 3 (Absolute), code 1 (Y), value 1265
Event: time 1174695694.573922, type 3 (Absolute), code 24 (Pressure), value 5
Event: time 1174695694.573924, type 3 (Absolute), code 28 (Tool Width), value 5
Event: time 1174695694.573926, -- Report Sync 
Event: time 1174695694.585575, type 3 (Absolute), code 0 (X), value 1
Event: time 1174695694.585578, type 3 (Absolute), code 1 (Y), value 5855
Event: time 1174695694.585580, type 3 (Absolute), code 24 (Pressure), value 2
Event: time 1174695694.585582, type 1 (Key), code 325 (ToolFinger), value 0
Event: time 1174695694.585584, type 1 (Key), code 333 (Tool Doubletap), value 1
Event: time 1174695694.585587, -- Report Sync 
Event: time 1174695694.622685, type 3 (Absolute), code 24 (Pressure), value 1

This corresponds to hw.x=1, hw.y=0 in the driver. Looks like a bug somewhere.

Cheers,
-- Pete


Re: If not readdir() then what?

2007-04-11 Thread Neil Brown
On Wednesday April 11, [EMAIL PROTECTED] wrote:
> On Thu, 12 Apr 2007, Neil Brown wrote:
> 
> > For the second.
> >  You say that you " would need at least 96 bits in order to make that
> >  guarantee; 64 bits of hash, plus a 32-bit count value in the hash
> >  collision chain".  I think 96 is a bit greedy.  Surely 48 bits of
> >  hash and 16 bits of collision-chain-position would plenty.  You would
> >  need 65537 entries before a collision was even possible, and
> >  billions before it was at all likely. (How big does a set of 48bit
> >  numbers have to get before the probability that "No subset of 65536
> >  numbers are all the same" drops below 0.95?)
> 
> Neil,
>you can get a hash collision with two entries.

You need at least 65537 entries before there is any possibility of
collision between two
   "48-bit-hash ++ 16-bit-sequence-number"
objects where the 16-bit-sequence-number is chosen to be different from all
other 16 bit sequence numbers combined with the same 48 bit hash.
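
A minimal sketch of the "48-bit-hash ++ 16-bit-sequence-number" object described above, assuming it is packed into one 64-bit value with the hash in the high bits. The names make_cookie, cookie_hash and cookie_seq are hypothetical and do not come from any filesystem; the sketch only shows that two entries with the same hash but different sequence numbers yield different values.

```c
#include <stdint.h>

#define HASH_BITS 48
#define SEQ_BITS  16
#define HASH_MASK ((UINT64_C(1) << HASH_BITS) - 1)
#define SEQ_MASK  ((UINT64_C(1) << SEQ_BITS) - 1)

/* Pack a 48-bit hash and a 16-bit sequence number into one 64-bit value. */
static uint64_t make_cookie(uint64_t hash, uint16_t seq)
{
    return ((hash & HASH_MASK) << SEQ_BITS) | seq;
}

/* Recover the 48-bit hash part. */
static uint64_t cookie_hash(uint64_t cookie)
{
    return cookie >> SEQ_BITS;
}

/* Recover the 16-bit position within the collision chain. */
static uint16_t cookie_seq(uint64_t cookie)
{
    return (uint16_t)(cookie & SEQ_MASK);
}
```

As long as each entry sharing a hash is given a sequence number not used by any other entry with that hash, the packed values stay unique.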

NeilBrown


[PATCH 7/7] [RFC] APM emulation driver for class batteries

2007-04-11 Thread Anton Vorontsov
This driver finds the battery with the "main_battery" flag set (or the one
with the largest max_capacity if no battery is marked as main) and converts
its values to APM form.

---
 drivers/battery/Kconfig |7 +++
 drivers/battery/Makefile|1 +
 drivers/battery/apm_power.c |  121 +++
 3 files changed, 129 insertions(+), 0 deletions(-)
 create mode 100644 drivers/battery/apm_power.c

diff --git a/drivers/battery/Kconfig b/drivers/battery/Kconfig
index 0c14ae0..bbf8283 100644
--- a/drivers/battery/Kconfig
+++ b/drivers/battery/Kconfig
@@ -15,4 +15,11 @@ config BATTERY_DS2760
help
  Say Y here to enable support for batteries with ds2760 chip.
 
+config APM_POWER
+   tristate "APM emulation"
+   depends on BATTERY && APM
+   help
+ Say Y here to enable support for APM status emulation using
+ battery class devices.
+
 endmenu
diff --git a/drivers/battery/Makefile b/drivers/battery/Makefile
index 9902513..cea5807 100644
--- a/drivers/battery/Makefile
+++ b/drivers/battery/Makefile
@@ -1,2 +1,3 @@
 obj-$(CONFIG_BATTERY)  += battery.o
 obj-$(CONFIG_BATTERY_DS2760)   += ds2760_battery.o
+obj-$(CONFIG_APM_POWER)+= apm_power.o
diff --git a/drivers/battery/apm_power.c b/drivers/battery/apm_power.c
new file mode 100644
index 000..5e741c1
--- /dev/null
+++ b/drivers/battery/apm_power.c
@@ -0,0 +1,121 @@
+/*
+ * Copyright (c) 2007 Eugeny Boger
+ *
+ * Use consistent with the GNU GPL is permitted,
+ * provided that this copyright notice is
+ * preserved in its entirety in all copies and derived works.
+ */
+
+#include 
+#include 
+#include 
+#include 
+
+#define BATTERY_PROPERTY(property) (main_battery->get_##property ? \
+   main_battery->get_##property(main_battery) : 0)
+
+static struct battery *main_battery;
+
+static void (*old_apm_get_power_status)(struct apm_power_info*);
+
+static void apm_battery_find_main_battery(void)
+{
+   struct device *dev;
+   struct battery *bat, *batm;
+   int max_capacity = 0;
+
+   main_battery = NULL;
+   batm = NULL;
+   list_for_each_entry(dev, _class->devices, node) {
+   bat = dev_get_drvdata(dev);
+   /* If none of the battery devices has the 'main_battery' flag set,
+  choose the one with max capacity */
+   if (bat->get_max_capacity)
+   if (bat->get_max_capacity(bat) > max_capacity) {
+   batm = bat;
+   max_capacity = bat->get_max_capacity(bat);
+   }
+
+   if (bat->main_battery)
+   main_battery = bat;
+   }
+   if (!main_battery)
+   main_battery = batm;
+}
+
+static void apm_battery_apm_get_power_status(struct apm_power_info *info)
+{
+   int bat_current;
+
+   down(_class->sem);
+   apm_battery_find_main_battery();
+   if (!main_battery) {
+   up(_class->sem);
+   return;
+   }
+
+   if (BATTERY_PROPERTY(status) == BATTERY_STATUS_FULL)
+   info->battery_life = 100;
+   else
+   if (BATTERY_PROPERTY(max_capacity) -
+   BATTERY_PROPERTY(min_capacity))
+   info->battery_life = ((BATTERY_PROPERTY(capacity) -
+BATTERY_PROPERTY(min_capacity)) * 100) /
+(BATTERY_PROPERTY(max_capacity) -
+ BATTERY_PROPERTY(min_capacity));
+   else
+   info->battery_life = -1;
+   if ((BATTERY_PROPERTY(status) == BATTERY_STATUS_CHARGING)
+   || (BATTERY_PROPERTY(status) == BATTERY_STATUS_NOT_CHARGING)
+   || (BATTERY_PROPERTY(status) == BATTERY_STATUS_FULL))
+   info->ac_line_status = APM_AC_ONLINE;
+   else
+   info->ac_line_status = APM_AC_OFFLINE;
+
+   if (BATTERY_PROPERTY(status) == BATTERY_STATUS_CHARGING)
+   info->battery_status = APM_BATTERY_STATUS_CHARGING;
+   else {
+   if (info->battery_life > 50)
+   info->battery_status = APM_BATTERY_STATUS_HIGH;
+   else if (info->battery_life > 5)
+   info->battery_status = APM_BATTERY_STATUS_LOW;
+   else
+   info->battery_status = APM_BATTERY_STATUS_CRITICAL;
+   }
+   info->battery_flag = info->battery_status;
+
+   bat_current = BATTERY_PROPERTY(current);
+   if (bat_current)
+   info->time = ((BATTERY_PROPERTY(capacity) - 
+ BATTERY_PROPERTY(min_capacity)) * 60) /
+bat_current;
+   else
+   info->time = -1;
+
+   info->units = APM_UNITS_MINS;
+
+   up(_class->sem);
+   return;
+}
+
+static int __init apm_battery_init(void)
+{
+   printk(KERN_INFO "APM Battery Driver\n");
+
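
The conversion arithmetic in apm_battery_apm_get_power_status() can be tried in userspace. This is a minimal sketch under stated assumptions: the helper names are hypothetical, capacities are taken to be in mAh and the current in mA (the patch does not state units), and -1 stands for "unknown" as in the patch.

```c
/* Percentage of usable capacity left, or -1 if the usable range is empty. */
static int battery_life_percent(int capacity, int min_capacity, int max_capacity)
{
    int range = max_capacity - min_capacity;

    if (!range)
        return -1;
    return ((capacity - min_capacity) * 100) / range;
}

/*
 * Remaining time in minutes, or -1 if no current reading is available.
 * Assumes mAh divided by mA gives hours, hence the factor of 60.
 */
static int battery_time_minutes(int capacity, int min_capacity, int current_now)
{
    if (!current_now)
        return -1;
    return ((capacity - min_capacity) * 60) / current_now;
}
```

For example, a battery at 600 with a usable range of 100 to 1100 reports 50 percent, and at a draw of 250 it has 120 minutes left.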

[PATCH 6/7] [RFC] ds2760 battery driver

2007-04-11 Thread Anton Vorontsov
This is a driver for batteries with a ds2760 chip inside. Such batteries
are used in almost every HP iPaq and in HTC PDAs/phones.

---
 drivers/battery/Kconfig  |7 +
 drivers/battery/Makefile |1 +
 drivers/battery/ds2760_battery.c |  466 ++
 include/linux/ds2760_battery.h   |   32 +++
 4 files changed, 506 insertions(+), 0 deletions(-)
 create mode 100644 drivers/battery/ds2760_battery.c
 create mode 100644 include/linux/ds2760_battery.h

diff --git a/drivers/battery/Kconfig b/drivers/battery/Kconfig
index c386593..0c14ae0 100644
--- a/drivers/battery/Kconfig
+++ b/drivers/battery/Kconfig
@@ -8,4 +8,11 @@ config BATTERY
  Say Y here to enable generic battery status reporting in
  the /sys filesystem.
 
+config BATTERY_DS2760
+   tristate "DS2760 battery driver (HP iPAQ & others)"
+   depends on BATTERY && W1
+   select W1_SLAVE_DS2760
+   help
+ Say Y here to enable support for batteries with ds2760 chip.
+
 endmenu
diff --git a/drivers/battery/Makefile b/drivers/battery/Makefile
index a2239cb..9902513 100644
--- a/drivers/battery/Makefile
+++ b/drivers/battery/Makefile
@@ -1 +1,2 @@
 obj-$(CONFIG_BATTERY)  += battery.o
+obj-$(CONFIG_BATTERY_DS2760)   += ds2760_battery.o
diff --git a/drivers/battery/ds2760_battery.c b/drivers/battery/ds2760_battery.c
new file mode 100644
index 000..a686304
--- /dev/null
+++ b/drivers/battery/ds2760_battery.c
@@ -0,0 +1,466 @@
+/*
+ * Driver for batteries with DS2760 chips inside.
+ *
+ * Copyright (c) 2007 Anton Vorontsov
+ *   2004 Matt Reimer
+ *   2004 Szabolcs Gyurko
+ *
+ * Use consistent with the GNU GPL is permitted,
+ * provided that this copyright notice is
+ * preserved in its entirety in all copies and derived works.
+ *
+ * Author:  Anton Vorontsov <[EMAIL PROTECTED]>
+ *  February 2007
+ *
+ *  Matt Reimer <[EMAIL PROTECTED]>
+ *  April 2004, 2005
+ *
+ *  Szabolcs Gyurko <[EMAIL PROTECTED]>
+ *  September 2004
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "../w1/w1.h"
+#include "../w1/slaves/w1_ds2760.h"
+
+struct ds2760_device_info {
+   struct battery_info *bi;
+
+   /* DS2760 data, valid after calling ds2760_battery_read_status() */
+   unsigned long update_time;  /* jiffies when data read */
+   char raw[DS2760_DATA_SIZE]; /* raw DS2760 data */
+   int voltage_raw;/* units of 4.88 mV */
+   int voltage_mV; /* units of mV */
+   int current_raw;/* units of 0.625 mA */
+   int current_mA; /* units of mA */
+   int accum_current_raw;  /* units of 0.25 mAh */
+   int accum_current_mAh;  /* units of mAh */
+   int temp_raw;   /* units of 0.125 C */
+   int temp_C; /* units of 0.1 C */
+   int rated_capacity; /* units of mAh */
+   int rem_capacity;   /* percentage */
+   int full_active_mAh;/* units of mAh */
+   int empty_mAh;  /* units of mAh */
+   int life_min;   /* units of minutes */
+   int charge_status;  /* BATTERY_STATUS_* */
+
+   int full_counter;
+   struct battery batt_cdev;
+   struct device *w1_dev;
+   struct workqueue_struct *monitor_wqueue;
+   struct delayed_work monitor_work;
+};
+
+static unsigned int cache_time = 1000;
+module_param(cache_time, uint, 0644);
+MODULE_PARM_DESC(cache_time, "cache time in milliseconds");
+
+/* Some batteries have their rated capacity stored as N * 10 mAh, while
+ * others use an index into this table. */
+static int rated_capacities[] = {
+   0,
+   920,/* Samsung */
+   920,/* BYD */
+   920,/* Lishen */
+   920,/* NEC */
+   1440,   /* Samsung */
+   1440,   /* BYD */
+   1440,   /* Lishen */
+   1440,   /* NEC */
+   2880,   /* Samsung */
+   2880,   /* BYD */
+   2880,   /* Lishen */
+   2880/* NEC */
+};
+
+/* array is level at temps 0C, 10C, 20C, 30C, 40C
+ * temp is in Celsius */
+static int battery_interpolate(int array[], int temp)
+{
+   int index, dt;
+
+   if (temp <= 0)
+   return array[0];
+   if (temp >= 40)
+   return array[4];
+
+   index = temp / 10;
+   dt= temp % 10;
+
+   return array[index] + (((array[index + 1] - array[index]) * dt) / 10);
+}
+
+static int ds2760_battery_read_status(struct ds2760_device_info *di)
+{
+   int ret, i, start, count, scale[5];
+
+   if (di->update_time && time_before(jiffies, di->update_time +
+  msecs_to_jiffies(cache_time)))
+   return 0;
+
+   if (!di->w1_dev)
+   return 0;
+
+   /* The first time 
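
The interpolation helper from the patch above can be exercised on its own. This is a userspace sketch with hypothetical names (interpolate_level, example_at) and made-up table values; the arithmetic follows battery_interpolate() as posted: clamp below 0 C and above 40 C, otherwise interpolate linearly between the two neighbouring 10-degree steps.

```c
/* level[] holds the value at 0, 10, 20, 30 and 40 degrees Celsius. */
static int interpolate_level(const int level[5], int temp_c)
{
    int index, dt;

    if (temp_c <= 0)
        return level[0];
    if (temp_c >= 40)
        return level[4];

    index = temp_c / 10;    /* lower table entry, 0..3 here */
    dt    = temp_c % 10;    /* degrees above that entry */

    return level[index] + ((level[index + 1] - level[index]) * dt) / 10;
}

/* Made-up table, used only to illustrate the arithmetic. */
static int example_at(int temp_c)
{
    static const int level[5] = { 100, 200, 300, 400, 500 };

    return interpolate_level(level, temp_c);
}
```

With this table, 25 C falls halfway between the 20 C and 30 C entries and gives 350; anything at or below 0 C gives 100 and anything at or above 40 C gives 500.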

[PATCH] [Trivial] [Doc] Add webpages' URL and summarize 3 lines.

2007-04-11 Thread Miguel Ojeda

Trivial patch, against -rc6. Please apply, thanks.
---

CREDITS:
- Summarize 3 lines into one.
- Add webpage.

MAINTAINERS:
- Add auxdisplay drivers/tree webpages.

CREDITS |7 +++
MAINTAINERS |4 
2 files changed, 7 insertions(+), 4 deletions(-)

Signed-off-by: Miguel Ojeda Sandonis <[EMAIL PROTECTED]>
---
diff --git a/CREDITS b/CREDITS
index 6bd8ab8..f990730 100644
--- a/CREDITS
+++ b/CREDITS
@@ -2573,10 +2573,9 @@ S: Australia

N: Miguel Ojeda Sandonis
E: [EMAIL PROTECTED]
-D: Author: Auxiliary LCD Controller driver (ks0108)
-D: Author: Auxiliary LCD driver (cfag12864b)
-D: Author: Auxiliary LCD framebuffer driver (cfag12864bfb)
-D: Maintainer: Auxiliary display drivers tree (drivers/auxdisplay/*)
+W: http://maxextreme.googlepages.com/
+D: Author of the ks0108, cfag12864b and cfag12864bfb auxiliary display drivers.
+D: Maintainer of the auxiliary display drivers tree (drivers/auxdisplay/*)
S: C/ Mieses 20, 9-B
S: Valladolid 47009
S: Spain
diff --git a/MAINTAINERS b/MAINTAINERS
index 829407f..2a658ef 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -672,6 +672,7 @@ AUXILIARY DISPLAY DRIVERS
P:  Miguel Ojeda Sandonis
M:  [EMAIL PROTECTED]
L:  linux-kernel@vger.kernel.org
+W: http://auxdisplay.googlepages.com/
S:  Maintained

AVR32 ARCHITECTURE
@@ -884,12 +885,14 @@ CFAG12864B LCD DRIVER
P:  Miguel Ojeda Sandonis
M:  [EMAIL PROTECTED]
L:  linux-kernel@vger.kernel.org
+W: http://auxdisplay.googlepages.com/
S:  Maintained

CFAG12864BFB LCD FRAMEBUFFER DRIVER
P:  Miguel Ojeda Sandonis
M:  [EMAIL PROTECTED]
L:  linux-kernel@vger.kernel.org
+W: http://auxdisplay.googlepages.com/
S:  Maintained

COMMON INTERNET FILE SYSTEM (CIFS)
@@ -2020,6 +2023,7 @@ KS0108 LCD CONTROLLER DRIVER
P:  Miguel Ojeda Sandonis
M:  [EMAIL PROTECTED]
L:  linux-kernel@vger.kernel.org
+W: http://auxdisplay.googlepages.com/
S:  Maintained

LAPB module

--
Miguel Ojeda
http://maxextreme.googlepages.com/index.htm


Re: If not readdir() then what?

2007-04-11 Thread Jörn Engel
On Wed, 11 April 2007 16:23:21 -0700, H. Peter Anvin wrote:
> David Lang wrote:
> >On Thu, 12 Apr 2007, Neil Brown wrote:
> >
> >>For the second.
> >> You say that you " would need at least 96 bits in order to make that
> >> guarantee; 64 bits of hash, plus a 32-bit count value in the hash
> >> collision chain".  I think 96 is a bit greedy.  Surely 48 bits of
> >> hash and 16 bits of collision-chain-position would plenty.  You would
> >> need 65537 entries before a collision was even possible, and
> >> billions before it was at all likely. (How big does a set of 48bit
> >> numbers have to get before the probability that "No subset of 65536
> >> numbers are all the same" drops below 0.95?)
> >
> >  you can get a hash collision with two entries.
> 
> Yes, but the probability is 2^-n for an n-bit hash, assuming it's 
> uniformly distributed.
> 
> The probability approaches 1/2 as the number of entries hashes 
> approaches 2^(n/2) (birthday number.)

I believe you are both barking up the wrong tree.  Neil proposed a 16bit
collision chain.  With that, it takes 65537 entries before a collision
chain overflow is possible.

Calling a collision chain overflow "collision" is inviting confusion, of
course. :)
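
A small sketch of the distinction, assuming a 16-bit chain position per hash value. The names chain_add and entries_until_overflow are hypothetical. It counts how many entries with an identical hash can be given distinct positions before the chain overflows.

```c
#include <stdint.h>

#define CHAIN_SLOTS 65536u      /* 2^16 positions per hash value */

struct chain {
    uint32_t used;              /* positions handed out so far */
};

/* Hand out the next free position; returns 0 on success, -1 on overflow. */
static int chain_add(struct chain *c, uint16_t *pos)
{
    if (c->used >= CHAIN_SLOTS)
        return -1;
    *pos = (uint16_t)c->used++;
    return 0;
}

/* Number of same-hash entries that fit before the chain overflows. */
static uint32_t entries_until_overflow(void)
{
    struct chain c = { 0 };
    uint16_t pos;
    uint32_t n = 0;

    while (chain_add(&c, &pos) == 0)
        n++;
    return n;
}
```

The first 65536 entries all fit, so entry number 65537 is the earliest one that can fail, and only if every one of them hashes to the same 48-bit value.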

Jörn

-- 
The competent programmer is fully aware of the strictly limited size of
his own skull; therefore he approaches the programming task in full
humility, and among other things he avoids clever tricks like the plague.
-- Edsger W. Dijkstra


[PATCH 5/7] [RFC] ds2760 W1 slave

2007-04-11 Thread Anton Vorontsov
This is a W1 slave driver for the ds2760 chip, found inside almost every
HP iPaq and in HTC PDAs/phones.

---
 drivers/w1/slaves/Kconfig |   13 +++
 drivers/w1/slaves/Makefile|1 +
 drivers/w1/slaves/w1_ds2760.c |  162 +
 drivers/w1/slaves/w1_ds2760.h |   52 +
 drivers/w1/w1_family.h|1 +
 5 files changed, 229 insertions(+), 0 deletions(-)
 create mode 100644 drivers/w1/slaves/w1_ds2760.c
 create mode 100644 drivers/w1/slaves/w1_ds2760.h

diff --git a/drivers/w1/slaves/Kconfig b/drivers/w1/slaves/Kconfig
index 904e5ae..df95d6c 100644
--- a/drivers/w1/slaves/Kconfig
+++ b/drivers/w1/slaves/Kconfig
@@ -35,4 +35,17 @@ config W1_SLAVE_DS2433_CRC
  Each block has 30 bytes of data and a two byte CRC16.
  Full block writes are only allowed if the CRC is valid.
 
+config W1_SLAVE_DS2760
+   tristate "Dallas 2760 battery monitor chip (HP iPAQ & others)"
+   depends on W1
+   help
+ If you enable this you will have the DS2760 battery monitor
+ chip support.
+
+ The battery monitor chip is used in many batteries/devices
+ as the one who is responsible for charging/discharging/monitoring
+ Li+ batteries.
+
+ If you are unsure, say N.
+
 endmenu
diff --git a/drivers/w1/slaves/Makefile b/drivers/w1/slaves/Makefile
index 725dcfd..a8eb752 100644
--- a/drivers/w1/slaves/Makefile
+++ b/drivers/w1/slaves/Makefile
@@ -5,4 +5,5 @@
 obj-$(CONFIG_W1_SLAVE_THERM)   += w1_therm.o
 obj-$(CONFIG_W1_SLAVE_SMEM)+= w1_smem.o
 obj-$(CONFIG_W1_SLAVE_DS2433)  += w1_ds2433.o
+obj-$(CONFIG_W1_SLAVE_DS2760)  += w1_ds2760.o
 
diff --git a/drivers/w1/slaves/w1_ds2760.c b/drivers/w1/slaves/w1_ds2760.c
new file mode 100644
index 000..21b0ef6
--- /dev/null
+++ b/drivers/w1/slaves/w1_ds2760.c
@@ -0,0 +1,162 @@
+/*
+ * 1-Wire implementation for the ds2760 chip
+ *
+ * Copyright (c) 2004-2005, Szabolcs Gyurko <[EMAIL PROTECTED]>
+ *
+ * Use consistent with the GNU GPL is permitted,
+ * provided that this copyright notice is
+ * preserved in its entirety in all copies and derived works.
+ *
+ */
+
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "../w1.h"
+#include "../w1_int.h"
+#include "../w1_family.h"
+#include "w1_ds2760.h"
+
+static int w1_ds2760_io(struct device *dev, char *buf, int addr, size_t count,
+int io)
+{
+   struct w1_slave *sl = container_of(dev, struct w1_slave, dev);
+
+   if (!dev)
+   return 0;
+
+   mutex_lock(&sl->master->mutex);
+
+   if (addr > DS2760_DATA_SIZE || addr < 0) {
+   count = 0;
+   goto out;
+   }
+   if (addr + count > DS2760_DATA_SIZE)
+   count = DS2760_DATA_SIZE - addr;
+
+   if (!w1_reset_select_slave(sl)) {
+   if (!io) {
+   w1_write_8(sl->master, W1_DS2760_READ_DATA);
+   w1_write_8(sl->master, addr);
+   count = w1_read_block(sl->master, buf, count);
+   } else {
+   w1_write_8(sl->master, W1_DS2760_WRITE_DATA);
+   w1_write_8(sl->master, addr);
+   w1_write_block(sl->master, buf, count);
+   /* XXX w1_write_block returns void, not n_written */
+   }
+   }
+
+out:
+   mutex_unlock(&sl->master->mutex);
+
+   return count;
+}
+
+int w1_ds2760_read(struct device *dev, char *buf, int addr, size_t count)
+{
+   return w1_ds2760_io(dev, buf, addr, count, 0);
+}
+
+int w1_ds2760_write(struct device *dev, char *buf, int addr, size_t count)
+{
+   return w1_ds2760_io(dev, buf, addr, count, 1);
+}
+
+/* io = 0 means copy from EEPROM to SRAM, 1 means from SRAM to EEPROM */
+static int w1_ds2760_eeprom(struct device *dev, int addr, int io)
+{
+   struct w1_slave *sl = container_of(dev, struct w1_slave, dev);
+   int ret = 0;
+
+   mutex_lock(&sl->master->mutex);
+
+   if (!w1_reset_select_slave(sl)) {
+   if (!io)
+   w1_write_8(sl->master, W1_DS2760_RECALL_DATA);
+   else
+   w1_write_8(sl->master, W1_DS2760_COPY_DATA);
+   w1_write_8(sl->master, addr);
+   }
+
+   mutex_unlock(&sl->master->mutex);
+
+   return ret;
+}
+
+int w1_ds2760_recall(struct device *dev, int addr)
+{
+   return w1_ds2760_eeprom(dev, addr, 0);
+}
+
+int w1_ds2760_copy(struct device *dev, int addr)
+{
+   return w1_ds2760_eeprom(dev, addr, 1);
+}
+
+static ssize_t w1_ds2760_read_bin(struct kobject *kobj, char *buf, loff_t off,
+  size_t count)
+{
+   struct device *dev = container_of(kobj, struct device, kobj);
+   return w1_ds2760_read(dev, buf, off, count);
+}
+
+static struct bin_attribute w1_ds2760_bin_attr = {
+   .attr = {
+   .name = "w1_slave",
+   .mode = S_IRUGO,
+   .owner = THIS_MODULE,
+  
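
The address and length clamping at the top of w1_ds2760_io() can be sketched in isolation. Assumptions: DATA_SIZE is a made-up stand-in for DS2760_DATA_SIZE, whose value is not shown in this message, and clamp_count is a hypothetical name. The comparison is written as a subtraction so that a very large count cannot wrap around; the accepted ranges are otherwise the same as in the patch.

```c
#include <stddef.h>

#define DATA_SIZE 64    /* hypothetical; stands in for DS2760_DATA_SIZE */

/* Return how many bytes starting at addr may be transferred. */
static size_t clamp_count(int addr, size_t count)
{
    if (addr > DATA_SIZE || addr < 0)
        return 0;                           /* address out of range */
    if (count > DATA_SIZE - (size_t)addr)
        count = DATA_SIZE - (size_t)addr;   /* truncate at end of data */
    return count;
}
```

A request for 10 bytes at address 60 is cut to 4, and a request starting exactly at the end of the data area is cut to 0.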

[patch] x86_64: more fixes to node_possible_map runtime setup

2007-04-11 Thread Siddha, Suresh B
On Mon, Apr 09, 2007 at 04:13:28PM -0700, Siddha, Suresh B wrote:
> On Mon, Apr 09, 2007 at 03:05:01PM -0700, [EMAIL PROTECTED] wrote:
> > Subject: x86_64-set-node_possible_map-at-runtime fix
> > From: David Rientjes <[EMAIL PROTECTED]>
> > 
> > Clear node_possible_map if numa_emulation() fails for some reason, such as
> > a failed hash shift, but setup_node_range() has already set some fake nodes
> > as online.
> 
> David, Looking at your fix, I think we require more fixes in this area.
> Please review the appended patch. Thanks.

Andrew, Please apply the appended patch. Goes on top of the
x86_64-set-node_possible_map-at-runtime-fix.patch

thanks, suresh
---

Subject: [patch] x86_64: more fixes to node_possible_map runtime setup
From: Suresh Siddha <[EMAIL PROTECTED]>

More fixes in the failure cases and a small cleanup in numa emulation case.

Signed-off-by: Suresh Siddha <[EMAIL PROTECTED]>
Acked-by: David Rientjes <[EMAIL PROTECTED]>
---

--- linux-2.6.21-rc6/arch/x86_64/mm/numa.c~ 2007-04-09 15:59:03.0 -0700
+++ linux-2.6.21-rc6/arch/x86_64/mm/numa.c  2007-04-09 17:44:38.0 -0700
@@ -298,7 +298,6 @@ static int __init setup_node_range(int n
ret = -1;
}
nodes[nid].end = *addr;
-   node_set_online(nid);
node_set(nid, node_possible_map);
printk(KERN_INFO "Faking node %d at %016Lx-%016Lx (%LuMB)\n", nid,
   nodes[nid].start, nodes[nid].end,
@@ -483,7 +482,7 @@ out:
 * SRAT.
 */
remove_all_active_ranges();
-   for_each_online_node(i) {
+   for_each_node_mask(i, node_possible_map) {
e820_register_active_regions(i, nodes[i].start >> PAGE_SHIFT,
nodes[i].end >> PAGE_SHIFT);
setup_node_bootmem(i, nodes[i].start, nodes[i].end);
@@ -510,11 +509,13 @@ void __init numa_initmem_init(unsigned l
if (!numa_off && !acpi_scan_nodes(start_pfn << PAGE_SHIFT,
  end_pfn << PAGE_SHIFT))
return;
+   nodes_clear(node_possible_map);
 #endif
 
 #ifdef CONFIG_K8_NUMA
if (!numa_off && !k8_scan_nodes(start_pfn

[PATCH 4/7] [RFC] remove "#if 0" around find_bus function, export it.

2007-04-11 Thread Anton Vorontsov
This function was placed under "#if 0" because nobody was using it. We do use it.

See http://lwn.net/Articles/210610/

---
 drivers/base/bus.c |5 ++---
 include/linux/device.h |2 ++
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/base/bus.c b/drivers/base/bus.c
index 253868e..971efa2 100644
--- a/drivers/base/bus.c
+++ b/drivers/base/bus.c
@@ -667,14 +667,13 @@ void put_bus(struct bus_type * bus)
  *
  * Note that kset_find_obj increments bus' reference count.
  */
-#if 0
+
 struct bus_type * find_bus(char * name)
 {
struct kobject * k = kset_find_obj(&bus_subsys.kset, name);
return k ? to_bus(k) : NULL;
 }
-#endif  /*  0  */
-
+EXPORT_SYMBOL_GPL(find_bus);
 
 /**
  * bus_add_attrs - Add default attributes for this bus.
diff --git a/include/linux/device.h b/include/linux/device.h
index 5cf30e9..4015b39 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -68,6 +68,8 @@ extern void bus_unregister(struct bus_type * bus);
 
 extern int __must_check bus_rescan_devices(struct bus_type * bus);
 
+extern struct bus_type *find_bus(char *name);
+
 /* iterator helpers for buses */
 
 int bus_for_each_dev(struct bus_type * bus, struct device * start, void * data,
-- 
1.5.0.5-dirty


[PATCH 3/7] [RFC] Battery monitoring class

2007-04-11 Thread Anton Vorontsov
Here is the battery monitor class. According to the first copyright string,
we have been maintaining it since 2003. I took a few days and cleaned it up
to be more suitable for mainline inclusion.

It differs from the battery class at git://git.infradead.org/battery-2.6.git:

* It uses the external power kernel interface, i.e. it does not fake external
  power supplies as batteries. (The same thing David Woodhouse planned last
  year.)

* It has a predefined set of attributes, which eliminates code duplication
  across battery drivers. It also gives the opportunity to write emulation
  drivers for legacy interfaces (an APM emulation driver follows).

  If a driver can't provide some attribute, it will not appear in sysfs.

* It insists on reusing its predefined attributes *and* their units,
  so userspace gets the expected values for any battery.

  Common units are also required for APM/ACPI emulation.

  Our battery class insists on reuse, but does not force it. If some
  battery driver can't convert its own raw values (can't imagine why), then
  the driver is free to implement its own attributes *and* an additional
  _units attribute. This scheme is discouraged, though.

* LED support. Each battery registers its triggers, and gadgets with LEDs
  can quickly bind to the battery-charging / battery-full triggers.

Here is how it looks from user space:

# ls /sys/class/battery/main-battery/
capacity  max_capacity  max_voltage   min_current  power   subsystem  uevent
current   max_current   min_capacity  min_voltage  status  temp   voltage
# cat /sys/class/battery/main-battery/status
Full
# cat /sys/class/leds/h5400\:green-right/trigger
none h5400-radio timer hwtimer main-battery-charging [main-battery-full]
# cat /sys/class/leds/h5400\:green-right/brightness
255

---
 drivers/Kconfig   |2 +
 drivers/Makefile  |1 +
 drivers/battery/Kconfig   |   11 ++
 drivers/battery/Makefile  |1 +
 drivers/battery/battery.c |  303 +
 include/linux/battery.h   |   98 +++
 6 files changed, 416 insertions(+), 0 deletions(-)
 create mode 100644 drivers/battery/Kconfig
 create mode 100644 drivers/battery/Makefile
 create mode 100644 drivers/battery/battery.c
 create mode 100644 include/linux/battery.h

diff --git a/drivers/Kconfig b/drivers/Kconfig
index c546de3..c3a0038 100644
--- a/drivers/Kconfig
+++ b/drivers/Kconfig
@@ -56,6 +56,8 @@ source "drivers/w1/Kconfig"
 
 source "drivers/power/Kconfig"
 
+source "drivers/battery/Kconfig"
+
 source "drivers/hwmon/Kconfig"
 
 source "drivers/mfd/Kconfig"
diff --git a/drivers/Makefile b/drivers/Makefile
index 2bdaae7..7cbfd37 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -61,6 +61,7 @@ obj-$(CONFIG_RTC_LIB) += rtc/
 obj-$(CONFIG_I2C)  += i2c/
 obj-$(CONFIG_W1)   += w1/
 obj-$(CONFIG_EXTERNAL_POWER)   += power/
+obj-$(CONFIG_BATTERY)  += battery/
 obj-$(CONFIG_HWMON)+= hwmon/
 obj-$(CONFIG_PHONE)+= telephony/
 obj-$(CONFIG_MD)   += md/
diff --git a/drivers/battery/Kconfig b/drivers/battery/Kconfig
new file mode 100644
index 000..c386593
--- /dev/null
+++ b/drivers/battery/Kconfig
@@ -0,0 +1,11 @@
+
+menu "Battery support"
+
+config BATTERY
+   tristate "Battery monitoring support"
+   select EXTERNAL_POWER
+   help
+ Say Y here to enable generic battery status reporting in
+ the /sys filesystem.
+
+endmenu
diff --git a/drivers/battery/Makefile b/drivers/battery/Makefile
new file mode 100644
index 000..a2239cb
--- /dev/null
+++ b/drivers/battery/Makefile
@@ -0,0 +1 @@
+obj-$(CONFIG_BATTERY)  += battery.o
diff --git a/drivers/battery/battery.c b/drivers/battery/battery.c
new file mode 100644
index 000..32b8288
--- /dev/null
+++ b/drivers/battery/battery.c
@@ -0,0 +1,303 @@
+/*
+ *  Universal battery monitor class
+ *
+ *  Copyright (c) 2007  Anton Vorontsov <[EMAIL PROTECTED]>
+ *  Copyright (c) 2004  Szabolcs Gyurko
+ *  Copyright (c) 2003  Ian Molton <[EMAIL PROTECTED]>
+ *
+ *  Modified: 2004, Oct Szabolcs Gyurko
+ *
+ *  You may use this code as per GPL version 2
+ *
+ * All voltages, currents, capacities and temperatures in mV, mA, mAh and
+ * tenths of a degree unless otherwise stated. It's driver's job to convert
+ * its raw values to which this class operates. If for some reason driver
+ * can't afford this requirement, then it have to create its own attributes,
+ * plus additional "XYZ_units" for each of them.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/* If we have hwtimer trigger, then use it to blink charging LED */
+#if defined(CONFIG_LEDS_TRIGGER_HWTIMER) || \
+defined(CONFIG_LEDS_TRIGGER_HWTIMER_MODULE)
+   #define led_trigger_register_charging led_trigger_register_hwtimer
+   #define led_trigger_unregister_charging led_trigger_unregister_hwtimer
+#else
+   #define led_trigger_register_charging led_trigger_register_simple
+   #define 

[PATCH 1/7] [RFC] External power framework

2007-04-11 Thread Anton Vorontsov
External power framework: power supplies and power supplicants.

Supplicants (batteries so far) may ask to be notified when a power supply
arrives or goes away. This framework is used by the battery class (next
patches).

A supply is permitted to be bound to several supplicants (think main
and backup batteries).

Supplicants are also permitted to consume power from several
external supplies (say AC and USB).

Here is how it looks from userspace:

# pwd
/sys/class/power_supply
# ls
ac  usb
# cat ac/online usb/online
1
0

---
 drivers/Kconfig|2 +
 drivers/Makefile   |1 +
 drivers/power/Kconfig  |   13 ++
 drivers/power/Makefile |1 +
 drivers/power/external_power.c |  318 
 include/linux/external_power.h |   54 +++
 6 files changed, 389 insertions(+), 0 deletions(-)
 create mode 100644 drivers/power/Kconfig
 create mode 100644 drivers/power/Makefile
 create mode 100644 drivers/power/external_power.c
 create mode 100644 include/linux/external_power.h

diff --git a/drivers/Kconfig b/drivers/Kconfig
index 050323f..c546de3 100644
--- a/drivers/Kconfig
+++ b/drivers/Kconfig
@@ -54,6 +54,8 @@ source "drivers/spi/Kconfig"
 
 source "drivers/w1/Kconfig"
 
+source "drivers/power/Kconfig"
+
 source "drivers/hwmon/Kconfig"
 
 source "drivers/mfd/Kconfig"
diff --git a/drivers/Makefile b/drivers/Makefile
index 3a718f5..2bdaae7 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -60,6 +60,7 @@ obj-$(CONFIG_I2O) += message/
 obj-$(CONFIG_RTC_LIB)  += rtc/
 obj-$(CONFIG_I2C)  += i2c/
 obj-$(CONFIG_W1)   += w1/
+obj-$(CONFIG_EXTERNAL_POWER)   += power/
 obj-$(CONFIG_HWMON)+= hwmon/
 obj-$(CONFIG_PHONE)+= telephony/
 obj-$(CONFIG_MD)   += md/
diff --git a/drivers/power/Kconfig b/drivers/power/Kconfig
new file mode 100644
index 000..17349c1
--- /dev/null
+++ b/drivers/power/Kconfig
@@ -0,0 +1,13 @@
+
+menu "External power support"
+
+config EXTERNAL_POWER
+   tristate "External power kernel interface"
+   help
+ Say Y here to enable kernel external power detection interface,
+ like AC or USB. Information also will exported to userspace via
+ /sys/class/external_power/ directory.
+
+ This interface is mandatory for battery class support.
+
+endmenu
diff --git a/drivers/power/Makefile b/drivers/power/Makefile
new file mode 100644
index 000..c303b45
--- /dev/null
+++ b/drivers/power/Makefile
@@ -0,0 +1 @@
+obj-$(CONFIG_EXTERNAL_POWER)  += external_power.o
diff --git a/drivers/power/external_power.c b/drivers/power/external_power.c
new file mode 100644
index 000..21c25a4
--- /dev/null
+++ b/drivers/power/external_power.c
@@ -0,0 +1,318 @@
+/* 
+ * Linux kernel interface for external power suppliers/supplicants
+ *
+ * Copyright (c) 2007  Anton Vorontsov <[EMAIL PROTECTED]>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static struct class *power_supply_class;
+
+static LIST_HEAD(supplicants);
+static struct rw_semaphore supplicants_sem;
+
+struct bound_supply {
+   struct power_supply *psy;
+   struct list_head node;
+};
+
+struct bound_supplicant {
+   struct power_supplicant *pst;
+   struct list_head node;
+};
+
+int power_supplicant_am_i_supplied(struct power_supplicant *pst)
+{
+   int ret = 0;
+   struct bound_supply *bpsy;
+
+   pr_debug("%s\n", __FUNCTION__);
+   down(&power_supply_class->sem);
+   list_for_each_entry(bpsy, &pst->bound_supplies, node) {
+   if (bpsy->psy->is_online(bpsy->psy)) {
+   ret = 1;
+   goto out;
+   }
+   }
+out:
+   up(&power_supply_class->sem);
+
+   return ret;
+}
+
+static void unbind_pst_from_psys(struct power_supplicant *pst)
+{
+   struct bound_supply *bpsy, *bpsy_tmp;
+   struct bound_supplicant *bpst, *bpst_tmp;
+
+   list_for_each_entry_safe(bpsy, bpsy_tmp, &pst->bound_supplies, node) {
+   list_for_each_entry_safe(bpst, bpst_tmp,
+   &bpsy->psy->bound_supplicants, node) {
+   if (bpst->pst == pst) {
+   list_del(&bpst->node);
+   kfree(bpst);
+   break;
+   }
+   }
+   list_del(&bpsy->node);
+   kfree(bpsy);
+   }
+
+   return;
+}
+
+static void unbind_psy_from_psts(struct power_supply *psy)
+{
+   struct bound_supply *bpsy, *bpsy_tmp;
+   struct bound_supplicant *bpst, *bpst_tmp;
+
+   list_for_each_entry_safe(bpst, bpst_tmp, &psy->bound_supplicants,
+   

[PATCH 2/7] [RFC] Common power driver for Linux gadgets

2007-04-11 Thread Anton Vorontsov
This driver is meant to stop code/logic duplication across the different
machines we are porting at handhelds.org. pda_power registers a machine's
power supplies and takes care of notifying batteries about power
changes through the external power interface.

This driver should be suitable for almost every Linux gadget today.


Here is a brief example of how we use it:

static int h5000_is_ac_online(void)
{
return !!(samcop_get_gpio_a(_samcop.dev) &
 SAMCOP_GPIO_GPA_ADP_IN_STATUS);
}

static int h5000_is_usb_online(void)
{
return !!(samcop_get_gpio_a(_samcop.dev) &
 SAMCOP_GPIO_GPA_USB_DETECT);
}

static void h5000_set_charge(int flags)
{
SET_H5400_GPIO(CHG_EN, !!flags);
SET_H5400_GPIO(EXT_CHG_RATE, !!(flags & PDA_POWER_CHARGE_AC));
SET_H5400_GPIO(USB_CHG_RATE, !!(flags & PDA_POWER_CHARGE_USB));
return;
}

static struct pda_power_pdata h5000_power_pdata = {
.is_ac_online = h5000_is_ac_online,
.is_usb_online = h5000_is_usb_online,
.set_charge = h5000_set_charge,
};

static struct resource h5000_power_resourses[] = {
[0] = {
.name = "ac",
.flags = IORESOURCE_IRQ | IORESOURCE_IRQ_HIGHEDGE |
 IORESOURCE_IRQ_LOWEDGE,
},
[1] = {
.name = "usb",
.flags = IORESOURCE_IRQ | IORESOURCE_IRQ_HIGHEDGE |
 IORESOURCE_IRQ_LOWEDGE,
},
};

static struct platform_device h5000_power_pdev = {
.name = "pda-power",
.id = -1,
.resource = h5000_power_resourses,
.num_resources = ARRAY_SIZE(h5000_power_resourses),
.dev = {
.platform_data = &h5000_power_pdata,
},
};

---
 drivers/power/Kconfig |8 ++
 drivers/power/Makefile|1 +
 drivers/power/pda_power.c |  218 +
 include/linux/pda_power.h |   27 ++
 4 files changed, 254 insertions(+), 0 deletions(-)
 create mode 100644 drivers/power/pda_power.c
 create mode 100644 include/linux/pda_power.h

diff --git a/drivers/power/Kconfig b/drivers/power/Kconfig
index 17349c1..b87779e 100644
--- a/drivers/power/Kconfig
+++ b/drivers/power/Kconfig
@@ -10,4 +10,12 @@ config EXTERNAL_POWER
 
  This interface is mandatory for battery class support.
 
+config PDA_POWER
+   tristate "Generic PDA/phone power driver"
+   depends on EXTERNAL_POWER
+   help
+ Say Y here to enable generic power driver for PDAs and phones with
+ one or two external power supplies (AC/USB) connected to main and
+ backup batteries, and optional builtin charger.
+
 endmenu
diff --git a/drivers/power/Makefile b/drivers/power/Makefile
index c303b45..6f084e7 100644
--- a/drivers/power/Makefile
+++ b/drivers/power/Makefile
@@ -1 +1,2 @@
 obj-$(CONFIG_EXTERNAL_POWER)  += external_power.o
+obj-$(CONFIG_PDA_POWER)   += pda_power.o
diff --git a/drivers/power/pda_power.c b/drivers/power/pda_power.c
new file mode 100644
index 000..0256ee4
--- /dev/null
+++ b/drivers/power/pda_power.c
@@ -0,0 +1,218 @@
+/*
+ * Common power driver for PDAs and phones with one or two external
+ * power supplies (AC/USB) connected to main and backup batteries,
+ * and optional builtin charger.
+ *
+ * Copyright 2007 Anton Vorontsov <[EMAIL PROTECTED]>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include 
+#include 
+#include 
+#include 
+
+/* 
+ * include/linux/ioport.h does not provide flags for generic IRQ trigger
+ * types. So, we're using "ISA PnP IRQ specific bits", and converting them.
+ */
+static unsigned int get_irq_flags(struct resource *res)
+{
+   unsigned int flags = IRQF_DISABLED;
+
+   if (res->flags & IORESOURCE_IRQ_HIGHEDGE)
+   flags |= IRQF_TRIGGER_RISING;
+   if (res->flags & IORESOURCE_IRQ_LOWEDGE)
+   flags |= IRQF_TRIGGER_FALLING;
+   if (res->flags & IORESOURCE_IRQ_HIGHLEVEL)
+   flags |= IRQF_TRIGGER_HIGH;
+   if (res->flags & IORESOURCE_IRQ_LOWLEVEL)
+   flags |= IRQF_TRIGGER_LOW;
+   if (res->flags & IORESOURCE_IRQ_SHAREABLE)
+   flags |= IRQF_SHARED;
+
+   return flags;
+}
+
+static struct resource *ac_irq, *usb_irq;
+static struct pda_power_pdata *pdata;
+
+static int pda_power_is_ac_online(struct power_supply *psy)
+{
+   return pdata->is_ac_online ? pdata->is_ac_online() : 0;
+}
+
+static int pda_power_is_usb_online(struct power_supply *psy)
+{
+   return pdata->is_usb_online ? pdata->is_usb_online() : 0;
+}
+
+static char *pda_power_supplied_to[] = {
+   "main-battery",
+   "backup-battery",
+};
+
+static struct power_supply pda_power_supplies[] = {
+   {
+   .name = "ac",
+   .type = "ac",
+   .supplied_to = pda_power_supplied_to,
+ 

Re: If not readdir() then what?

2007-04-11 Thread H. Peter Anvin

David Lang wrote:

On Thu, 12 Apr 2007, Neil Brown wrote:


For the second.
 You say that you " would need at least 96 bits in order to make that
 guarantee; 64 bits of hash, plus a 32-bit count value in the hash
 collision chain".  I think 96 is a bit greedy.  Surely 48 bits of
 hash and 16 bits of collision-chain-position would plenty.  You would
 need 65537 entries before a collision was even possible, and
 billions before it was at all likely. (How big does a set of 48bit
 numbers have to get before the probability that "No subset of 65536
 numbers are all the same" drops below 0.95?)


Neil,
  you can get a hash collision with two entries.



Yes, but the probability is 2^-n for an n-bit hash, assuming it's 
uniformly distributed.


The probability approaches 1/2 as the number of entries hashed
approaches 2^(n/2) (the birthday number).


-hpa
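
For anyone who wants to put numbers on the birthday bound above, here is a
minimal sketch, assuming an ideal uniformly distributed hash. The function
name is made up for illustration. It computes the exact collision probability
as 1 - prod(1 - i/2^n) rather than an approximation, so it needs no libm:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Probability that at least two of `entries` uniformly distributed
 * hash_bits-wide hash values are equal:
 *     1 - prod_{i=0}^{entries-1} (1 - i / 2^hash_bits)
 * Hypothetical helper, not taken from any kernel source.
 */
double collision_probability(unsigned int hash_bits, unsigned long entries)
{
	double space = 1.0;        /* number of possible hash values */
	double no_collision = 1.0; /* probability that all values differ */
	unsigned long i;

	assert(hash_bits >= 1 && hash_bits <= 63);

	for (i = 0; i < hash_bits; i++)
		space *= 2.0;

	for (i = 0; i < entries; i++) {
		if ((double)i >= space)
			return 1.0; /* more entries than values: pigeonhole */
		no_collision *= 1.0 - (double)i / space;
	}

	return 1.0 - no_collision;
}
```

With two entries this gives exactly 2^-n. At exactly 2^(n/2) entries the
result is about 0.39, and it crosses 1/2 at roughly 1.18 * 2^(n/2) entries,
so "approaches 1/2" is the right order of magnitude.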


Re: 2.6.21-rc6-mm1 USB related boot hang

2007-04-11 Thread Andrew Morton
On Thu, 12 Apr 2007 01:07:00 +0200
Helge Hafting <[EMAIL PROTECTED]> wrote:

> On Wed, Apr 11, 2007 at 01:43:46PM -0700, Andrew Morton wrote:
> > 
> > OK.  If you add initcall_debug to the kernel boot command line, what's the
> > last thing we call?
> 
> The last messages (handwritten, somewhat shortened)
> calling hid_init+0x0/0x10()
> returned 0
> ran for 0 msec
> calling hid_init+0x0/0x50()
> usbcore registered new interface driver hiddev
> 
> and then it hangs completely.
> 

OK, thanks.  If it happens to be, I'll bisect it down.  Chances are it
won't, and it gets merged, and we get to futz around with it for a week or
two while holding up 2.6.22.  I can only think we must enjoy doing it this way.


Re: If not readdir() then what?

2007-04-11 Thread Theodore Tso
On Thu, Apr 12, 2007 at 08:32:05AM +1000, Neil Brown wrote:
> For the first:
>   You are storing an internal tree representation of part of the
>   directory attached to the 'struct file'.
>   Would it be possible to store this attached to the page via the
>   ->private pointer?  What would avoid the allocate/create/free
>   overhead on every request.

The reason why we are storing it associated with the file pointer
instead of the page/block is because a filename insertion might
cause a node split, in which case half of the 4k block gets copied to
another block.  We need a stable pointer to where we are in the tree
that can cope with hash collisions, and that's the reason for creating
the red/black tree in the first place, since it *doesn't* get split and
reorganized when the directory's hash tree gets reorg'ed.  So
attaching the tree to the page breaks the reason why we have the
separate data structure in the first place.

>   You suggest caching the open files in nfsd.  While that is probably
>   possible (I've thought of it a number of times) is would also be
>   quite complex, e.g. requiring some sort of call-back to close all
>   those files when the filesystem is unexported.  And it is very easy
>   to get caching heuristics wrong.  Leveraging the page-cache which is
>   a very mature cache seems to make a lot of sense.

Is it really that complex?  The simplest way of handling it is simply
keeping an open directory fd cache in a per-filesystem rbtree index
which is indexed by file handle and contains the file pointer.  When
you unexport the filesystem, you simply walk the rbtree and close all
of the file descriptors; no callback is required.

The caching heuristics are an issue, but a fixed-size cache with a
simple LFU replacement strategy isn't all that complex to implement.
If 95% of the time, the readdir's come in quick succession, even a
small cache will probably provide huge performance gains, and
increasing the cache size past some critical point will probably only
provide marginal improvements.  
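
To make the fixed-size LFU idea concrete, here is a rough userspace sketch.
It is not nfsd code: every name in it is hypothetical, a 64-bit integer stands
in for the file handle, a void pointer stands in for the struct file, and a
small linear array replaces the per-filesystem rbtree to keep it short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DIRCACHE_SLOTS 4

struct dircache_entry {
	uint64_t handle;    /* stand-in for an NFS file handle */
	void *file;         /* stand-in for the cached open directory */
	unsigned long hits; /* use count driving LFU replacement */
	int used;
};

struct dircache {
	struct dircache_entry slot[DIRCACHE_SLOTS];
};

/* Return the cached file for this handle and bump its use count. */
void *dircache_lookup(struct dircache *c, uint64_t handle)
{
	size_t i;

	for (i = 0; i < DIRCACHE_SLOTS; i++) {
		struct dircache_entry *e = &c->slot[i];

		if (e->used && e->handle == handle) {
			e->hits++;
			return e->file;
		}
	}
	return NULL;
}

/*
 * Cache a file. If the cache is full, the least frequently used entry
 * is replaced and its file is returned so the caller can close it.
 * Assumes the caller already tried dircache_lookup() for this handle.
 */
void *dircache_insert(struct dircache *c, uint64_t handle, void *file)
{
	struct dircache_entry *victim = &c->slot[0];
	void *evicted;
	size_t i;

	assert(file != NULL);

	for (i = 0; i < DIRCACHE_SLOTS; i++) {
		struct dircache_entry *e = &c->slot[i];

		if (!e->used) { /* free slot: nothing to evict */
			victim = e;
			break;
		}
		if (e->hits < victim->hits)
			victim = e;
	}

	evicted = victim->used ? victim->file : NULL;
	victim->handle = handle;
	victim->file = file;
	victim->hits = 1;
	victim->used = 1;
	return evicted;
}

/* Unexport: walk the cache and close everything. Returns entries dropped. */
size_t dircache_flush(struct dircache *c, void (*close_file)(void *file))
{
	size_t i, n = 0;

	for (i = 0; i < DIRCACHE_SLOTS; i++) {
		if (!c->slot[i].used)
			continue;
		if (close_file)
			close_file(c->slot[i].file);
		c->slot[i].used = 0;
		n++;
	}
	return n;
}
```

The flush routine is the "walk and close" step on unexport, so no callback
from the filesystem is needed. Locking and aging of old use counts are left
out on purpose.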

> For the second.
>   You say that you " would need at least 96 bits in order to make that
>   guarantee; 64 bits of hash, plus a 32-bit count value in the hash
>   collision chain".  I think 96 is a bit greedy.  Surely 48 bits of 
>   hash and 16 bits of collision-chain-position would plenty.  You would
>   need 65537 entries before a collision was even possible, and
>   billions before it was at all likely. (How big does a set of 48bit
>   numbers have to get before the probability that "No subset of 65536
>   numbers are all the same" drops below 0.95?)
> 
>   This would really require that the collision-chain-index was stable
>   across create/delete.  Doing that while you have the tree in the
>   page cache is probably easy enough.  Doing it across reboots is
>   probably not possible without on-disk changes.

Actually, no, we can't keep the collision chain count stable across a
create/delete even while the tree is cached.  At least, not without
storing a huge amount of state associated with each page.  (It would
be a lot more work than simply having nfsd keep a fd cache for
directory streams ;-).

If we need create/delete stability, probably our only sane
implementation choice is to just stick with a 63-bit hash, and cross
our fingers and hope for the best.  
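
For reference, the 48+16 split Neil proposed would pack into a 64-bit cookie
roughly as sketched below. This is only an illustration of the bit layout with
made-up names, not ext3 code, and it does nothing about the stability problem
described above: the chain position it stores is exactly the value that cannot
be kept stable across create/delete.

```c
#include <assert.h>
#include <stdint.h>

#define COOKIE_HASH_BITS  48
#define COOKIE_INDEX_BITS 16
#define COOKIE_HASH_MASK  ((UINT64_C(1) << COOKIE_HASH_BITS) - 1)
#define COOKIE_INDEX_MASK ((UINT64_C(1) << COOKIE_INDEX_BITS) - 1)

/* Hash goes in the high bits so cookies sort by hash, then chain position. */
uint64_t cookie_pack(uint64_t hash, unsigned int chain_pos)
{
	assert(chain_pos <= COOKIE_INDEX_MASK);
	return ((hash & COOKIE_HASH_MASK) << COOKIE_INDEX_BITS) | chain_pos;
}

/* Recover the 48-bit hash from a cookie. */
uint64_t cookie_hash(uint64_t cookie)
{
	return cookie >> COOKIE_INDEX_BITS;
}

/* Recover the position within the hash collision chain. */
unsigned int cookie_chain_pos(uint64_t cookie)
{
	return (unsigned int)(cookie & COOKIE_INDEX_MASK);
}
```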

If nfsd caches the last N used directory caches, where N is roughly
proportional to the number of active clients, and the clients all only
use the last cookie returned in the readdir entry (since it would be
stupid to use one of the earlier ones and request the server to
re-send something which the client already has), at least in the
absence of telldir/seekdir calls, then that might be quite sufficient,
even if we return multiple directory entries which contain hash
collisions to the client.  As long as the directory fd is cached, and
the client just uses the last cookie to fetch the next batch of
dirents, we'll be fine.

- Ted


[patch 10/31] HID: Do not discard truncated input reports

2007-04-11 Thread Greg KH
-stable review patch.  If anyone has any objections, please let us know.

--
From: Adam Kropelin <[EMAIL PROTECTED]>

HID: Do not discard truncated input reports

Truncated reports should not be discarded since it prevents buggy
devices from communicating with userspace.

Prior to the regression introduced in 2.6.20, a shorter-than-expected
report in hid_input_report() was passed thru after having the missing
bytes cleared. This behavior was established over a few patches in the
2.6.early-teens days, including commit
cd6104572bca9e4afe0dcdb8ecd65ef90b01297b.

This patch restores the previous behavior and fixes the regression.

Signed-off-by: Adam Kropelin <[EMAIL PROTECTED]>
Signed-off-by: Jiri Kosina <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 drivers/hid/hid-core.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/hid/hid-core.c
+++ b/drivers/hid/hid-core.c
@@ -975,7 +975,7 @@ int hid_input_report(struct hid_device *
 
if (size < rsize) {
dbg("report %d is too short, (%d < %d)", report->id, size, rsize);
-   return -1;
+   memset(data + size, 0, rsize - size);
}
 
if ((hid->claimed & HID_CLAIMED_HIDDEV) && hid->hiddev_report_event)
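
The restored behavior can be tried outside the kernel with a small standalone
sketch like the one below. The helper name is hypothetical, and it assumes, as
the patch does, that the buffer behind data has room for rsize bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Zero-fill the missing tail of a truncated report instead of rejecting it.
 * `size` is what the device sent, `rsize` is what the descriptor promised.
 * Returns the number of bytes that were cleared (0 if nothing was missing).
 */
size_t pad_short_report(unsigned char *data, size_t size, size_t rsize)
{
	assert(data != NULL);

	if (size >= rsize)
		return 0;

	memset(data + size, 0, rsize - size);
	return rsize - size;
}
```

A short report then reaches the parser with the bytes it did send intact and
the rest reading as zero, which is what buggy devices relied on before 2.6.20.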

-- 


Re: 2.6.21-rc6-mm1 USB related boot hang

2007-04-11 Thread Helge Hafting
On Wed, Apr 11, 2007 at 01:43:46PM -0700, Andrew Morton wrote:
> 
> OK.  If you add initcall_debug to the kernel boot command line, what's the
> last thing we call?

The last messages (handwritten, somewhat shortened)
calling hid_init+0x0/0x10()
returned 0
ran for 0 msec
calling hid_init+0x0/0x50()
usbcore registered new interface driver hiddev

and then it hangs completely.

Helge Hafting


[patch 11/31] Fix calculation for size of filemap_attr array in md/bitmap.

2007-04-11 Thread Greg KH
-stable review patch.  If anyone has any objections, please let us know.

--
From: Neil Brown <[EMAIL PROTECTED]>

If 'num_pages' were ever 1 more than a multiple of 8 (32-bit platforms)
or of 16 (64-bit platforms), filemap_attr would be allocated one
'unsigned long' shorter than required.  We need a round-up in there.


Signed-off-by: Neil Brown <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 drivers/md/bitmap.c |4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

--- a/drivers/md/bitmap.c
+++ b/drivers/md/bitmap.c
@@ -863,9 +863,7 @@ static int bitmap_init_from_disk(struct 
 
/* We need 4 bits per page, rounded up to a multiple of sizeof(unsigned long) */
bitmap->filemap_attr = kzalloc(
-   (((num_pages*4/8)+sizeof(unsigned long)-1)
-/sizeof(unsigned long))
-   *sizeof(unsigned long),
+   roundup( DIV_ROUND_UP(num_pages*4, 8), sizeof(unsigned long)),
GFP_KERNEL);
if (!bitmap->filemap_attr)
goto out;
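
The off-by-one is easy to check in userspace. The sketch below restates the
old and new size expressions as plain functions. The function names are made
up, and the word size is passed in as a parameter (standing in for
sizeof(unsigned long)) so both the 32-bit and 64-bit cases can be tried on
one machine.

```c
#include <assert.h>
#include <stddef.h>

/* Old expression: num_pages*4/8 truncates before the round-up. */
size_t filemap_attr_bytes_old(size_t num_pages, size_t word)
{
	assert(word > 0);
	return ((num_pages * 4 / 8 + word - 1) / word) * word;
}

/* New expression: roundup(DIV_ROUND_UP(num_pages*4, 8), word). */
size_t filemap_attr_bytes_new(size_t num_pages, size_t word)
{
	size_t bytes = (num_pages * 4 + 7) / 8; /* DIV_ROUND_UP(bits, 8) */

	assert(word > 0);
	return ((bytes + word - 1) / word) * word; /* roundup(bytes, word) */
}
```

For 9 pages with a 4-byte word, 36 bits are needed, i.e. 5 bytes. The old
expression yields 4 bytes and the new one yields 8. For 17 pages with an
8-byte word the old expression yields 8 bytes where 9 are needed, and the new
one yields 16.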

-- 


[patch 09/31] DVB: pluto2: fix incorrect TSCR register setting

2007-04-11 Thread Greg KH
-stable review patch.  If anyone has any objections, please let us know.

--
From: Andreas Oberritter <[EMAIL PROTECTED]>

DVB: pluto2: fix incorrect TSCR register setting

The ADEF bits in the TSCR register have different meanings in read
and write mode. For this reason ADEF has to be reset on every
read-modify-write operation.

This patch introduces a special write function for this register, which
takes care of it.

Thanks to Holger Magnussen for pointing my nose at this problem.

(cherry picked from commit 1489f90a49f0603a393e1800d729050f6e332bec)

Signed-off-by: Andreas Oberritter <[EMAIL PROTECTED]>
Signed-off-by: Mauro Carvalho Chehab <[EMAIL PROTECTED]>
Signed-off-by: Michael Krufky <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 drivers/media/dvb/pluto2/pluto2.c |   22 ++
 1 file changed, 14 insertions(+), 8 deletions(-)

--- a/drivers/media/dvb/pluto2/pluto2.c
+++ b/drivers/media/dvb/pluto2/pluto2.c
@@ -149,6 +149,15 @@ static inline void pluto_rw(struct pluto
writel(val, &pluto->io_mem[reg]);
 }
 
+static void pluto_write_tscr(struct pluto *pluto, u32 val)
+{
+   /* set the number of packets */
+   val &= ~TSCR_ADEF;
+   val |= TS_DMA_PACKETS / 2;
+
+   pluto_writereg(pluto, REG_TSCR, val);
+}
+
 static void pluto_setsda(void *data, int state)
 {
struct pluto *pluto = data;
@@ -213,11 +222,11 @@ static void pluto_reset_ts(struct pluto 
 
if (val & TSCR_RSTN) {
val &= ~TSCR_RSTN;
-   pluto_writereg(pluto, REG_TSCR, val);
+   pluto_write_tscr(pluto, val);
}
if (reenable) {
val |= TSCR_RSTN;
-   pluto_writereg(pluto, REG_TSCR, val);
+   pluto_write_tscr(pluto, val);
}
 }
 
@@ -339,7 +348,7 @@ static irqreturn_t pluto_irq(int irq, vo
}
 
/* ACK the interrupt */
-   pluto_writereg(pluto, REG_TSCR, tscr | TSCR_IACK);
+   pluto_write_tscr(pluto, tscr | TSCR_IACK);
 
return IRQ_HANDLED;
 }
@@ -348,9 +357,6 @@ static void __devinit pluto_enable_irqs(
 {
u32 val = pluto_readreg(pluto, REG_TSCR);
 
-   /* set the number of packets */
-   val &= ~TSCR_ADEF;
-   val |= TS_DMA_PACKETS / 2;
/* disable AFUL and LOCK interrupts */
val |= (TSCR_MSKA | TSCR_MSKL);
/* enable DMA and OVERFLOW interrupts */
@@ -358,7 +364,7 @@ static void __devinit pluto_enable_irqs(
/* clear pending interrupts */
val |= TSCR_IACK;
 
-   pluto_writereg(pluto, REG_TSCR, val);
+   pluto_write_tscr(pluto, val);
 }
 
 static void pluto_disable_irqs(struct pluto *pluto)
@@ -370,7 +376,7 @@ static void pluto_disable_irqs(struct pl
/* clear pending interrupts */
val |= TSCR_IACK;
 
-   pluto_writereg(pluto, REG_TSCR, val);
+   pluto_write_tscr(pluto, val);
 }
 
 static int __devinit pluto_hw_init(struct pluto *pluto)

-- 


[patch 07/31] sky2: phy workarounds for Yukon EC-U A1

2007-04-11 Thread Greg KH
-stable review patch.  If anyone has any objections, please let us know.

--
From: Stephen Hemminger <[EMAIL PROTECTED]>

The workaround for Yukon EC-U wasn't comparing against the correct
version and wasn't doing the correct setup. Without it, 88e8056
throws all sorts of errors.

Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 drivers/net/sky2.c |6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/drivers/net/sky2.c
+++ b/drivers/net/sky2.c
@@ -524,9 +524,9 @@ static void sky2_phy_init(struct sky2_hw
ledover &= ~PHY_M_LED_MO_RX;
}
 
-   if (hw->chip_id == CHIP_ID_YUKON_EC_U && hw->chip_rev == CHIP_REV_YU_EC_A1) {
+   if (hw->chip_id == CHIP_ID_YUKON_EC_U &&
+   hw->chip_rev == CHIP_REV_YU_EC_U_A1) {
/* apply fixes in PHY AFE */
-   pg = gm_phy_read(hw, port, PHY_MARV_EXT_ADR);
gm_phy_write(hw, port, PHY_MARV_EXT_ADR, 255);
 
/* increase differential signal amplitude in 10BASE-T */
@@ -538,7 +538,7 @@ static void sky2_phy_init(struct sky2_hw
gm_phy_write(hw, port, 0x17, 0x2002);
 
/* set page register to 0 */
-   gm_phy_write(hw, port, PHY_MARV_EXT_ADR, pg);
+   gm_phy_write(hw, port, PHY_MARV_EXT_ADR, 0);
} else {
gm_phy_write(hw, port, PHY_MARV_LED_CTRL, ledctrl);
 

-- 


[patch 14/31] Fix IFB net driver input device crashes

2007-04-11 Thread Greg KH
-stable review patch.  If anyone has any objections, please let us know.

--
From: Patrick McHardy <[EMAIL PROTECTED]>

[IFB]: Fix crash on input device removal

The input_device pointer is not refcounted, which means the device may
disappear while packets are queued, causing a crash when ifb passes packets
with a stale skb->dev pointer to netif_rx().

Fix by storing the interface index instead and doing a lookup where necessary.

Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]>
Acked-by: Jamal Hadi Salim <[EMAIL PROTECTED]>
Signed-off-by: David S. Miller <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 drivers/net/ifb.c  |   35 +--
 include/linux/skbuff.h |5 +++--
 include/net/pkt_cls.h  |7 +--
 net/core/dev.c |8 
 net/core/skbuff.c  |2 +-
 net/sched/act_mirred.c |2 +-
 6 files changed, 27 insertions(+), 32 deletions(-)

--- a/drivers/net/ifb.c
+++ b/drivers/net/ifb.c
@@ -96,17 +96,24 @@ static void ri_tasklet(unsigned long dev
skb->tc_verd = SET_TC_NCLS(skb->tc_verd);
stats->tx_packets++;
stats->tx_bytes +=skb->len;
+
+   skb->dev = __dev_get_by_index(skb->iif);
+   if (!skb->dev) {
+   dev_kfree_skb(skb);
+   stats->tx_dropped++;
+   break;
+   }
+   skb->iif = _dev->ifindex;
+
if (from & AT_EGRESS) {
dp->st_rx_frm_egr++;
dev_queue_xmit(skb);
} else if (from & AT_INGRESS) {
-
dp->st_rx_frm_ing++;
+   skb_pull(skb, skb->dev->hard_header_len);
netif_rx(skb);
-   } else {
-   dev_kfree_skb(skb);
-   stats->tx_dropped++;
-   }
+   } else
+   BUG();
}
 
if (netif_tx_trylock(_dev)) {
@@ -157,26 +164,10 @@ static int ifb_xmit(struct sk_buff *skb,
stats->rx_packets++;
stats->rx_bytes+=skb->len;
 
-   if (!from || !skb->input_dev) {
-dropped:
+   if (!(from & (AT_INGRESS|AT_EGRESS)) || !skb->iif) {
dev_kfree_skb(skb);
stats->rx_dropped++;
return ret;
-   } else {
-   /*
-* note we could be going
-* ingress -> egress or
-* egress -> ingress
-   */
-   skb->dev = skb->input_dev;
-   skb->input_dev = dev;
-   if (from & AT_INGRESS) {
-   skb_pull(skb, skb->dev->hard_header_len);
-   } else {
-   if (!(from & AT_EGRESS)) {
-   goto dropped;
-   }
-   }
}
 
if (skb_queue_len(&dp->rq) >= dev->tx_queue_len) {
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -188,7 +188,7 @@ enum {
  * @sk: Socket we are owned by
  * @tstamp: Time we arrived
  * @dev: Device we arrived on/are leaving by
- * @input_dev: Device we arrived on
+ * @iif: ifindex of device we arrived on
  * @h: Transport layer header
  * @nh: Network layer header
  * @mac: Link layer header
@@ -235,7 +235,8 @@ struct sk_buff {
struct sock *sk;
struct skb_timeval  tstamp;
struct net_device   *dev;
-   struct net_device   *input_dev;
+   int iif;
+   /* 4 byte hole on 64 bit*/
 
union {
struct tcphdr   *th;
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -352,10 +352,13 @@ tcf_change_indev(struct tcf_proto *tp, c
 static inline int
 tcf_match_indev(struct sk_buff *skb, char *indev)
 {
+   struct net_device *dev;
+
if (indev[0]) {
-   if  (!skb->input_dev)
+   if  (!skb->iif)
return 0;
-   if (strcmp(indev, skb->input_dev->name))
+   dev = __dev_get_by_index(skb->iif);
+   if (!dev || strcmp(indev, dev->name))
return 0;
}
 
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1741,8 +1741,8 @@ static int ing_filter(struct sk_buff *sk
if (dev->qdisc_ingress) {
__u32 ttl = (__u32) G_TC_RTTL(skb->tc_verd);
if (MAX_RED_LOOP < ttl++) {
-   printk(KERN_WARNING "Redir loop detected Dropping packet (%s->%s)\n",
-   skb->input_dev->name, skb->dev->name);
+   printk(KERN_WARNING "Redir loop detected Dropping packet (%d->%d)\n",
+   skb->iif, skb->dev->ifindex);
return TC_ACT_SHOT;
}
 
@@ -1775,8 +1775,8 @@ int netif_receive_skb(struct sk_buff *sk
if (!skb->tstamp.off_sec)

[patch 12/31] 8139too: RTNL and flush_scheduled_work deadlock

2007-04-11 Thread Greg KH
-stable review patch.  If anyone has any objections, please let us know.

--
From: Francois Romieu <[EMAIL PROTECTED]>

Your usual dont-flush_scheduled_work-with-RTNL-held stuff.

It is a bit different here since the thread runs permanently
or is only occasionally kicked for recovery depending on the
hardware revision.

Signed-off-by: Francois Romieu <[EMAIL PROTECTED]>
Cc: Ben Greear <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 drivers/net/8139too.c |   40 +---
 1 file changed, 17 insertions(+), 23 deletions(-)

--- a/drivers/net/8139too.c
+++ b/drivers/net/8139too.c
@@ -1109,6 +1109,8 @@ static void __devexit rtl8139_remove_one
 
assert (dev != NULL);
 
+   flush_scheduled_work();
+
unregister_netdev (dev);
 
__rtl8139_cleanup_dev (dev);
@@ -1603,18 +1605,21 @@ static void rtl8139_thread (struct work_
struct net_device *dev = tp->mii.dev;
unsigned long thr_delay = next_tick;
 
+   rtnl_lock();
+
+   if (!netif_running(dev))
+   goto out_unlock;
+
if (tp->watchdog_fired) {
tp->watchdog_fired = 0;
rtl8139_tx_timeout_task(work);
-   } else if (rtnl_trylock()) {
-   rtl8139_thread_iter (dev, tp, tp->mmio_addr);
-   rtnl_unlock ();
-   } else {
-   /* unlikely race.  mitigate with fast poll. */
-   thr_delay = HZ / 2;
-   }
+   } else
+   rtl8139_thread_iter(dev, tp, tp->mmio_addr);
 
-   schedule_delayed_work(&tp->thread, thr_delay);
+   if (tp->have_thread)
+   schedule_delayed_work(&tp->thread, thr_delay);
+out_unlock:
+   rtnl_unlock ();
 }
 
 static void rtl8139_start_thread(struct rtl8139_private *tp)
@@ -1626,19 +1631,11 @@ static void rtl8139_start_thread(struct 
return;
 
tp->have_thread = 1;
+   tp->watchdog_fired = 0;
 
schedule_delayed_work(&tp->thread, next_tick);
 }
 
-static void rtl8139_stop_thread(struct rtl8139_private *tp)
-{
-   if (tp->have_thread) {
-   cancel_rearming_delayed_work(&tp->thread);
-   tp->have_thread = 0;
-   } else
-   flush_scheduled_work();
-}
-
 static inline void rtl8139_tx_clear (struct rtl8139_private *tp)
 {
tp->cur_tx = 0;
@@ -1696,12 +1693,11 @@ static void rtl8139_tx_timeout (struct n
 {
struct rtl8139_private *tp = netdev_priv(dev);
 
+   tp->watchdog_fired = 1;
if (!tp->have_thread) {
-   INIT_DELAYED_WORK(&tp->thread, rtl8139_tx_timeout_task);
+   INIT_DELAYED_WORK(&tp->thread, rtl8139_thread);
schedule_delayed_work(&tp->thread, next_tick);
-   } else
-   tp->watchdog_fired = 1;
-
+   }
 }
 
 static int rtl8139_start_xmit (struct sk_buff *skb, struct net_device *dev)
@@ -2233,8 +2229,6 @@ static int rtl8139_close (struct net_dev
 
netif_stop_queue (dev);
 
-   rtl8139_stop_thread(tp);
-
if (netif_msg_ifdown(tp))
printk(KERN_DEBUG "%s: Shutting down ethercard, status was 0x%4.4x.\n",
dev->name, RTL_R16 (IntrStatus));



[patch 06/31] sky2: turn on clocks when doing resume

2007-04-11 Thread Greg KH
-stable review patch.  If anyone has any objections, please let us know.

--
From: Stephen Hemminger <[EMAIL PROTECTED]>

Some of these chips are disabled until clock is enabled.
This fixes:
 http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=404107

Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 drivers/net/sky2.c |7 +++
 1 file changed, 7 insertions(+)

--- a/drivers/net/sky2.c
+++ b/drivers/net/sky2.c
@@ -2421,6 +2421,10 @@ static int sky2_reset(struct sky2_hw *hw
return -EOPNOTSUPP;
}
 
+   /* Make sure and enable all clocks */
+   if (hw->chip_id == CHIP_ID_YUKON_EC_U)
+   sky2_pci_write32(hw, PCI_DEV_REG3, 0);
+
hw->chip_rev = (sky2_read8(hw, B2_MAC_CFG) & CFG_CHIP_R_MSK) >> 4;
 
/* This rev is really old, and requires untested workarounds */
@@ -3639,6 +3643,9 @@ static int sky2_resume(struct pci_dev *p
 
pci_restore_state(pdev);
pci_enable_wake(pdev, PCI_D0, 0);
+
+   if (hw->chip_id == CHIP_ID_YUKON_EC_U)
+   sky2_pci_write32(hw, PCI_DEV_REG3, 0);
sky2_set_power_state(hw, PCI_D0);
 
err = sky2_reset(hw);



[patch 05/31] sky2: turn carrier off when down

2007-04-11 Thread Greg KH
-stable review patch.  If anyone has any objections, please let us know.

--
From: Stephen Hemminger <[EMAIL PROTECTED]>

Driver needs to turn off carrier when down.

Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 drivers/net/sky2.c |1 +
 1 file changed, 1 insertion(+)

--- a/drivers/net/sky2.c
+++ b/drivers/net/sky2.c
@@ -1506,6 +1506,7 @@ static int sky2_down(struct net_device *
 
/* Stop more packets from being queued */
netif_stop_queue(dev);
+   netif_carrier_off(dev);
 
/* Disable port IRQ */
imask = sky2_read32(hw, B0_IMSK);



[patch 20/31] Fix TCP slow_start_after_idle sysctl

2007-04-11 Thread Greg KH
-stable review patch.  If anyone has any objections, please let us know.

--
From: David Miller <[EMAIL PROTECTED]>

[TCP]: slow_start_after_idle should influence cwnd validation too

For the cases that slow_start_after_idle is meant to deal
with, it is almost a certainty that the congestion window
tests will think the connection is application limited and
we'll thus decrease the cwnd there too.  This defeats the
whole point of setting slow_start_after_idle to zero.

So test it there too.

We do not cancel out the entire tcp_cwnd_validate() function
so that if the sysctl is changed we still have the validation
state maintained.

Signed-off-by: David S. Miller <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 net/ipv4/tcp_output.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -943,7 +943,8 @@ static void tcp_cwnd_validate(struct soc
if (tp->packets_out > tp->snd_cwnd_used)
tp->snd_cwnd_used = tp->packets_out;
 
-   if ((s32)(tcp_time_stamp - tp->snd_cwnd_stamp) >= inet_csk(sk)->icsk_rto)
+   if (sysctl_tcp_slow_start_after_idle &&
+   (s32)(tcp_time_stamp - tp->snd_cwnd_stamp) >= inet_csk(sk)->icsk_rto)
tcp_cwnd_application_limited(sk);
}
 }



[patch 22/31] knfsd: allow nfsd READDIR to return 64bit cookies

2007-04-11 Thread Greg KH
-stable review patch.  If anyone has any objections, please let us know.

--
From: Neil Brown <[EMAIL PROTECTED]>

[PATCH] knfsd: allow nfsd READDIR to return 64bit cookies

->readdir passes loff_t offsets (used as nfs cookies) to
nfs3svc_encode_entry{,_plus}, but when they pass it on to encode_entry it
becomes an 'off_t', which isn't good.

So filesystems that returned 64bit offsets would lose.
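
To make the loss concrete, here is a small sketch of what happens when a
64-bit cookie is squeezed through a 32-bit type (as off_t is on 32-bit
builds), next to a hyper-style encoding that keeps all eight bytes.  The
toy_* helpers are illustrative only and are not the nfsd or sunrpc
functions.

```c
#include <assert.h>
#include <stdint.h>

/* Pass a 64-bit cookie through a 32-bit variable: only the low half survives. */
uint64_t toy_cookie_via_32bit(uint64_t cookie)
{
    uint32_t narrowed = (uint32_t)cookie;

    return narrowed;
}

/* Write a 64-bit value as eight big-endian bytes, most significant first. */
void toy_encode_hyper(unsigned char out[8], uint64_t val)
{
    assert(out != 0);
    for (int i = 0; i < 8; i++)
        out[i] = (unsigned char)(val >> (56 - 8 * i));
}

/* Read the eight big-endian bytes back into a 64-bit value. */
uint64_t toy_decode_hyper(const unsigned char in[8])
{
    uint64_t val = 0;

    for (int i = 0; i < 8; i++)
        val = (val << 8) | in[i];
    return val;
}
```

With a cookie of 0x100000002, the narrowed path hands back 2, while the
encode/decode pair returns the original value.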

Signed-off-by: Neil Brown <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>
Cc: Chuck Ebbert <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>
---
 fs/nfsd/nfs3xdr.c |6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/fs/nfsd/nfs3xdr.c
+++ b/fs/nfsd/nfs3xdr.c
@@ -844,8 +844,8 @@ compose_entry_fh(struct nfsd3_readdirres
 #define NFS3_ENTRY_BAGGAGE (2 + 1 + 2 + 1)
 #define NFS3_ENTRYPLUS_BAGGAGE (1 + 21 + 1 + (NFS3_FHSIZE >> 2))
 static int
-encode_entry(struct readdir_cd *ccd, const char *name,
-int namlen, off_t offset, ino_t ino, unsigned int d_type, int plus)
+encode_entry(struct readdir_cd *ccd, const char *name, int namlen,
+loff_t offset, ino_t ino, unsigned int d_type, int plus)
 {
struct nfsd3_readdirres *cd = container_of(ccd, struct nfsd3_readdirres,
common);
@@ -865,7 +865,7 @@ encode_entry(struct readdir_cd *ccd, con
*cd->offset1 = htonl(offset64 & 0xffffffff);
cd->offset1 = NULL;
} else {
-   xdr_encode_hyper(cd->offset, (u64) offset);
+   xdr_encode_hyper(cd->offset, offset64);
}
}
 



[patch 18/31] Fix IPSEC replay window handling

2007-04-11 Thread Greg KH
-stable review patch.  If anyone has any objections, please let us know.

--
From: Herbert Xu <[EMAIL PROTECTED]>

[IPSEC]: Reject packets within replay window but outside the bit mask

Up until this point we've accepted replay window settings greater than
32 but our bit mask can only accommodate 32 packets.  Thus any packet
with a sequence number within the window but outside the bit mask would
be accepted.

This patch causes those packets to be rejected instead.
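
A simplified sketch of the check after this change, written as plain
userspace C.  The struct and function names are hypothetical and the
statistics counters are left out; the point is only that the effective
window is capped at the number of bits the mask can hold.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified replay state; not the xfrm structures. */
struct toy_replay {
    uint32_t seq;       /* highest sequence number seen so far */
    uint32_t bitmap;    /* bit i set: sequence number (seq - i) was received */
    uint32_t window;    /* configured replay window, may exceed 32 */
};

/* Returns 0 if the packet may be accepted, -1 if it must be rejected. */
int toy_replay_check(const struct toy_replay *r, uint32_t seq)
{
    uint32_t bits = (uint32_t)(sizeof(r->bitmap) * 8);
    uint32_t limit = r->window < bits ? r->window : bits;
    uint32_t diff;

    assert(r != 0);
    if (seq == 0)
        return -1;              /* sequence number 0 is never valid */
    if (seq > r->seq)
        return 0;               /* newer than anything seen so far */

    diff = r->seq - seq;
    if (diff >= limit)
        return -1;              /* too old for the mask to track */
    if (r->bitmap & (UINT32_C(1) << diff))
        return -1;              /* already received: a replay */
    return 0;
}
```

With seq = 100 and a configured window of 64, a packet numbered 60 lies
inside the window but 40 positions back, which the 32-bit mask cannot
represent, so it is rejected.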

Signed-off-by: Herbert Xu <[EMAIL PROTECTED]>
Signed-off-by: David S. Miller <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 net/xfrm/xfrm_state.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/net/xfrm/xfrm_state.c
+++ b/net/xfrm/xfrm_state.c
@@ -1220,7 +1220,8 @@ int xfrm_replay_check(struct xfrm_state 
return 0;
 
diff = x->replay.seq - seq;
-   if (diff >= x->props.replay_window) {
+   if (diff >= min_t(unsigned int, x->props.replay_window,
+ sizeof(x->replay.bitmap) * 8)) {
x->stats.replay_window++;
return -EINVAL;
}



[patch 21/31] ide: use correct IDE error recovery

2007-04-11 Thread Greg KH
-stable review patch.  If anyone has any objections, please let us know.

--
From: Suleiman Souhlal <[EMAIL PROTECTED]>

ide: use correct IDE error recovery

IDE error recovery is using IDLE IMMEDIATE if the drive is busy or has DRQ set.
This violates the ATA spec (can only send IDLE IMMEDIATE when drive is not
busy) and really hoses up some drives (modern drives will not be able to
recover using this error handling).  The correct thing to do is issue a SRST
followed by a SET FEATURES command.  This is what Western Digital recommends
for error recovery and what Western Digital says Windows does.  It also does
not violate the ATA spec as far as I can tell.

Bart:
* port the patch over the current tree
* undo the recalibration code removal
* send SET FEATURES command after checking for good drive status
* don't check whether the current request is of REQ_TYPE_ATA_{CMD,TASK}
  type because we need to send SET FEATURES before handling any requests
* some pre-ATA4 drives require INITIALIZE DEVICE PARAMETERS command before
  other commands (except IDENTIFY) so send SET FEATURES only if there are
  no pending drive->special requests
* update comments and patch description
* any bugs introduced by this patch are mine and not Suleiman's :-)

Signed-off-by: Suleiman Souhlal <[EMAIL PROTECTED]>
Acked-by: Alan Cox <[EMAIL PROTECTED]>
Cc: Chuck Ebbert <[EMAIL PROTECTED]>
Signed-off-by: Bartlomiej Zolnierkiewicz <[EMAIL PROTECTED]>

---
 drivers/ide/ide-io.c   |   32 +---
 drivers/ide/ide-iops.c |3 +++
 include/linux/ide.h|1 +
 3 files changed, 25 insertions(+), 11 deletions(-)

--- a/drivers/ide/ide-io.c
+++ b/drivers/ide/ide-io.c
@@ -519,21 +519,24 @@ static ide_startstop_t ide_ata_error(ide
if ((stat & DRQ_STAT) && rq_data_dir(rq) == READ && hwif->err_stops_fifo == 0)
try_to_flush_leftover_data(drive);
 
+   if (rq->errors >= ERROR_MAX || blk_noretry_request(rq)) {
+   ide_kill_rq(drive, rq);
+   return ide_stopped;
+   }
+
if (hwif->INB(IDE_STATUS_REG) & (BUSY_STAT|DRQ_STAT))
-   /* force an abort */
-   hwif->OUTB(WIN_IDLEIMMEDIATE, IDE_COMMAND_REG);
+   rq->errors |= ERROR_RESET;
 
-   if (rq->errors >= ERROR_MAX || blk_noretry_request(rq))
-   ide_kill_rq(drive, rq);
-   else {
-   if ((rq->errors & ERROR_RESET) == ERROR_RESET) {
-   ++rq->errors;
-   return ide_do_reset(drive);
-   }
-   if ((rq->errors & ERROR_RECAL) == ERROR_RECAL)
-   drive->special.b.recalibrate = 1;
+   if ((rq->errors & ERROR_RESET) == ERROR_RESET) {
++rq->errors;
+   return ide_do_reset(drive);
}
+
+   if ((rq->errors & ERROR_RECAL) == ERROR_RECAL)
+   drive->special.b.recalibrate = 1;
+
+   ++rq->errors;
+
return ide_stopped;
 }
 
@@ -1025,6 +1028,13 @@ static ide_startstop_t start_request (id
if (!drive->special.all) {
ide_driver_t *drv;
 
+   /*
+* We reset the drive so we need to issue a SETFEATURES.
+* Do it _after_ do_special() restored device parameters.
+*/
+   if (drive->current_speed == 0xff)
+   ide_config_drive_speed(drive, drive->desired_speed);
+
if (rq->cmd_type == REQ_TYPE_ATA_CMD ||
rq->cmd_type == REQ_TYPE_ATA_TASK ||
rq->cmd_type == REQ_TYPE_ATA_TASKFILE)
--- a/drivers/ide/ide-iops.c
+++ b/drivers/ide/ide-iops.c
@@ -1123,6 +1123,9 @@ static void pre_reset(ide_drive_t *drive
if (HWIF(drive)->pre_reset != NULL)
HWIF(drive)->pre_reset(drive);
 
+   if (drive->current_speed != 0xff)
+   drive->desired_speed = drive->current_speed;
+   drive->current_speed = 0xff;
 }
 
 /*
--- a/include/linux/ide.h
+++ b/include/linux/ide.h
@@ -607,6 +607,7 @@ typedef struct ide_drive_s {
 u8 init_speed; /* transfer rate set at boot */
 u8 pio_speed;  /* unused by core, used by some drivers for fallback from DMA */
 u8 current_speed;  /* current transfer rate set */
+   u8  desired_speed;  /* desired transfer rate set */
 u8 dn; /* now wide spread use */
 u8 wcache; /* status of write cache */
u8  acoustic;   /* acoustic management */



[patch 19/31] Fix tcindex classifier ABI borkage...

2007-04-11 Thread Greg KH
-stable review patch.  If anyone has any objections, please let us know.

--
From: Patrick McHardy <[EMAIL PROTECTED]>

[NET_SCHED]: cls_tcindex: fix compatibility breakage

Userspace uses an integer for TCA_TCINDEX_SHIFT, the kernel was changed
to expect and use a u16 value in 2.6.11, which broke compatibility on
big endian machines. Change back to use int.
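
Why only big endian breaks can be seen with a few bytes.  The sketch below
lays a 32-bit value out in a chosen byte order and then reads just the first
two bytes, which is roughly what a u16 load from the start of an int payload
amounts to.  The helpers are invented for the illustration and do not touch
netlink at all.

```c
#include <assert.h>
#include <stdint.h>

enum toy_order { TOY_LITTLE, TOY_BIG };

/* Store a 32-bit value into four bytes using the given byte order. */
void toy_store32(unsigned char buf[4], uint32_t v, enum toy_order o)
{
    assert(buf != 0);
    for (int i = 0; i < 4; i++) {
        int shift = (o == TOY_BIG) ? 24 - 8 * i : 8 * i;

        buf[i] = (unsigned char)(v >> shift);
    }
}

/* Read only the first two bytes as a 16-bit value in the same byte order. */
uint16_t toy_load16(const unsigned char buf[4], enum toy_order o)
{
    if (o == TOY_BIG)
        return (uint16_t)((buf[0] << 8) | buf[1]);
    return (uint16_t)(buf[0] | (buf[1] << 8));
}
```

A shift of 5 stored little endian still reads back as 5 through the 16-bit
load, because the low-order bytes come first.  Stored big endian, the first
two bytes are the high-order ones and the load yields 0.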

Reported by Ole Reinartz <[EMAIL PROTECTED]>

Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]>
Signed-off-by: David S. Miller <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 net/sched/cls_tcindex.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/net/sched/cls_tcindex.c
+++ b/net/sched/cls_tcindex.c
@@ -245,9 +245,9 @@ tcindex_set_parms(struct tcf_proto *tp, 
}
 
if (tb[TCA_TCINDEX_SHIFT-1]) {
-   if (RTA_PAYLOAD(tb[TCA_TCINDEX_SHIFT-1]) < sizeof(u16))
+   if (RTA_PAYLOAD(tb[TCA_TCINDEX_SHIFT-1]) < sizeof(int))
goto errout;
-   cp.shift = *(u16 *) RTA_DATA(tb[TCA_TCINDEX_SHIFT-1]);
+   cp.shift = *(int *) RTA_DATA(tb[TCA_TCINDEX_SHIFT-1]);
}
 
err = -EBUSY;



[patch 30/31] fix page leak during core dump

2007-04-11 Thread Greg KH
-stable review patch.  If anyone has any objections, please let us know.

--

From: Brian Pomerantz <[EMAIL PROTECTED]>

When the dump cannot occur, most likely because of a full file system, and
the page to be written is the zero page, the call to page_cache_release()
is missed.
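
The shape of the bug, reduced to a toy reference count: a reference is
taken before the write attempt, and the failure path has to drop it just
like the success path does.  None of these names exist in the kernel; they
only mirror the pattern.

```c
#include <assert.h>

/* Hypothetical page with a bare reference count. */
struct toy_page {
    int refcount;
};

void toy_get(struct toy_page *pg)
{
    pg->refcount++;
}

void toy_put(struct toy_page *pg)
{
    assert(pg->refcount > 0);
    pg->refcount--;
}

/* Pretend seek: returns nonzero on success, 0 when the file system is full. */
int toy_seek(int fs_full)
{
    return !fs_full;
}

/* Dump one page; returns 0 on success, -1 on failure. */
int toy_dump_page(struct toy_page *pg, int fs_full)
{
    toy_get(pg);
    if (!toy_seek(fs_full)) {
        toy_put(pg);    /* the release that the error path used to skip */
        return -1;
    }
    toy_put(pg);
    return 0;
}
```

Whichever way toy_dump_page() returns, the count is back where it started.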

Signed-off-by: Brian Pomerantz <[EMAIL PROTECTED]>
Cc: Hugh Dickins <[EMAIL PROTECTED]>
Cc: Nick Piggin <[EMAIL PROTECTED]>
Cc: David Howells <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 fs/binfmt_elf.c   |5 -
 fs/binfmt_elf_fdpic.c |2 +-
 2 files changed, 5 insertions(+), 2 deletions(-)

--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -1704,7 +1704,10 @@ static int elf_core_dump(long signr, str
DUMP_SEEK(PAGE_SIZE);
} else {
if (page == ZERO_PAGE(addr)) {
-   DUMP_SEEK(PAGE_SIZE);
+   if (!dump_seek(file, PAGE_SIZE)) {
+   page_cache_release(page);
+   goto end_coredump;
+   }
} else {
void *kaddr;
flush_cache_page(vma, addr,
--- a/fs/binfmt_elf_fdpic.c
+++ b/fs/binfmt_elf_fdpic.c
@@ -1473,8 +1473,8 @@ static int elf_fdpic_dump_segments(struc
DUMP_SEEK(file->f_pos + PAGE_SIZE);
}
else if (page == ZERO_PAGE(addr)) {
-   DUMP_SEEK(file->f_pos + PAGE_SIZE);
page_cache_release(page);
+   DUMP_SEEK(file->f_pos + PAGE_SIZE);
}
else {
void *kaddr;



Re: [PATCH] make MADV_FREE lazily free memory

2007-04-11 Thread Rik van Riel

Eric Dumazet wrote:

Rik van Riel wrote:

Make it possible for applications to have the kernel free memory
lazily.  This reduces a repeated free/malloc cycle from freeing
pages and allocating them, to just marking them freeable.  If the
application wants to reuse them before the kernel needs the memory,
not even a page fault will happen.


I don't understand this last sentence. If not even a page fault happens, 
how does the kernel know that the page was eventually reused by the 
application, and should not be freed in case of memory pressure?


Before maybe freeing the page, the kernel checks the referenced
and dirty bits of the page table entries mapping that page.


ptr = mmap(some space);
madvise(ptr, length, MADV_FREE);
/* kernel may free the pages */


All this call does is:
- clear the accessed and dirty bits
- move the page to the far end of the inactive list,
  where it will be the first to be reclaimed


sleep(10);

/* what must the application do now before reusing the space? */
memset(ptr, data, 1);
/* kernel should not free ptr[0..1] now */


Two things can happen here.

If this program used the pages before the kernel needed
them, the program will be reusing its old pages.

If the kernel got there first, you will get page faults
and the kernel will fill in the memory with new pages.

Both of these alternatives are transparent to userspace.
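
For reference, a userspace sketch of the reuse pattern discussed above.  It
assumes a Linux system whose headers define MADV_FREE (mainline gained the
flag long after this thread) and treats the call as a pure hint, so it
behaves the same if the flag is missing or the kernel rejects it.  The
function name is made up.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Returns 0 if the region was reusable after the hint, -1 otherwise. */
int toy_reuse_after_lazy_free(void)
{
    const size_t len = 1 << 16;
    unsigned char *p;
    int ok;

    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    memset(p, 0xaa, len);       /* first use of the buffer */

#ifdef MADV_FREE
    /* Only a hint: ignore failure, e.g. EINVAL on kernels without it. */
    (void)madvise(p, len, MADV_FREE);
#endif

    /*
     * Reuse.  The program writes before it reads, so it does not matter
     * whether the kernel reclaimed the pages in the meantime.
     */
    memset(p, 0x55, len);
    ok = (p[0] == 0x55 && p[len - 1] == 0x55);

    munmap(p, len);
    return ok ? 0 : -1;
}
```

The one thing the program should not rely on is the old contents: as far as
I read the madvise(2) man page for the mainline flag, a read that is not
preceded by a write may see either the previous data or zero-filled pages.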

--
Politics is the struggle between those who want to make their country
the best in the world, and those who believe it already is.  Each group
calls the other unpatriotic.


Re: mconf not removed by make mrproper

2007-04-11 Thread Nigel Cunningham
Hi.

On Sun, 2007-04-01 at 23:17 +0200, Sam Ravnborg wrote:
> On Thu, Feb 01, 2007 at 02:05:49PM +1100, Nigel Cunningham wrote:
> > Hi.
> > 
> > The scripts/kconfig/mconf target isn't removed by the make mrproper
> > target. I can see a couple of possibilities, but wasn't sure which you'd
> > prefer, so thought I'd just raise the issue.
> > 
> > It's only an issue for me because my patch generation script relies on
> > make mrproper making a properly clean tree.
> 
> Fixed - thanks.
> 
>   Sam

Works fine here; thanks!

Acked-by: Nigel Cunningham <[EMAIL PROTECTED]>




Re: [PATCH] markers-linker-generic

2007-04-11 Thread Frank Ch. Eigler

Andrew Morton <[EMAIL PROTECTED]> writes:

> [...]  I am told that the systemtap developers plan to (or are)
> using this infrastructure.

Indeed.

> If correct: what is their reason for preferring it over kprobes?
> [...]

It's not a preference - it's more of a supplement.  It's helpful when
some combination of such factors exists:

- kprobe int3-fault dispatching overhead orders of magnitude too high
- fault dispatching not permissible in some areas
- local context variables not easily retrievable via dwarf information
- dwarf information not available at all
- costs of permanently placed but passive marker acceptable

From systemtap's point of view, instrumentation hooked to markers,
kprobes, and other facilities like timers, coexist just fine.  A
greater number of probe-able event sources makes for a richer tool.


- FChE


[patch 17/31] Fix TCP receiver side SWS handling.

2007-04-11 Thread Greg KH
-stable review patch.  If anyone has any objections, please let us know.

--
From: John Heffner <[EMAIL PROTECTED]>

[TCP]: Do receiver-side SWS avoidance for rcvbuf < MSS.

Signed-off-by: John Heffner <[EMAIL PROTECTED]>
Signed-off-by: David S. Miller <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 net/ipv4/tcp_output.c |3 +++
 1 file changed, 3 insertions(+)

--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1607,6 +1607,9 @@ u32 __tcp_select_window(struct sock *sk)
 */
if (window <= free_space - mss || window > free_space)
window = (free_space/mss)*mss;
+   else if (mss == full_space &&
+free_space > window + full_space/2)
+   window = free_space;
}
 
return window;


