date:20071216

[PATCH] [NET]: Remove FASTCALL/fastcall macros

2007-12-16 Thread Harvey Harrison

X86_32 was the last user of the FASTCALL/fastcall macros, now that it
uses regparm(3) by default, these macros expand to nothing.

Signed-off-by: Harvey Harrison [EMAIL PROTECTED]
---
Dave, this is a wrap-up of my patch in your net-2.6.25.git with
the build breakage fix from Andrew Morton included.  This also
chops the fastcall macro which is also now empty.
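
To illustrate why this removal does not change behaviour, here is a minimal sketch. It assumes the two macros are defined empty, as the description above says they effectively are once regparm(3) is the default; the definitions and the function names old_style/new_style are written out for illustration and are not copied from the kernel headers.

```c
/* Assumed empty definitions, for illustration only. */
#define FASTCALL(x) x
#define fastcall

/* Old style: forward declaration via FASTCALL() plus a fastcall definition. */
static int FASTCALL(old_style(int v));
static int fastcall old_style(int v)
{
	return v + 1;
}

/* New style, as after the patch: a plain definition. */
static int new_style(int v)
{
	return v + 1;
}
```

With empty macros the preprocessor output for both styles is an ordinary static function, so dropping the annotations should only remove noise.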

Original: 172f5efbe9150a82e8d5b0562bbe128492ac820e
Buildfix: 451ff1232b1ff3d32635ea4844ef0d1376460c21

 drivers/net/ns83820.c          |   12 ++++--------
 include/net/bluetooth/rfcomm.h |    4 ++--
 include/net/sock.h             |    4 ++--
 net/bluetooth/rfcomm/core.c    |    4 ++--
 net/core/dev.c                 |    2 +-
 net/core/sock.c                |    4 ++--
 6 files changed, 13 insertions(+), 17 deletions(-)

diff --git a/drivers/net/ns83820.c b/drivers/net/ns83820.c
index ea71f6d..b42c05f 100644
--- a/drivers/net/ns83820.c
+++ b/drivers/net/ns83820.c
@@ -611,8 +611,7 @@ static inline int rx_refill(struct net_device *ndev, gfp_t gfp)
return i ? 0 : -ENOMEM;
 }
 
-static void FASTCALL(rx_refill_atomic(struct net_device *ndev));
-static void fastcall rx_refill_atomic(struct net_device *ndev)
+static void rx_refill_atomic(struct net_device *ndev)
 {
rx_refill(ndev, GFP_ATOMIC);
 }
@@ -633,8 +632,7 @@ static inline void clear_rx_desc(struct ns83820 *dev, unsigned i)
build_rx_desc(dev, dev->rx_info.descs + (DESC_SIZE * i), 0, 0, CMDSTS_OWN, 0);
 }
 
-static void FASTCALL(phy_intr(struct net_device *ndev));
-static void fastcall phy_intr(struct net_device *ndev)
+static void phy_intr(struct net_device *ndev)
 {
struct ns83820 *dev = PRIV(ndev);
static const char *speeds[] = { "10", "100", "1000", "1000(?)", "1000F" };
@@ -832,8 +830,7 @@ static void ns83820_cleanup_rx(struct ns83820 *dev)
}
 }
 
-static void FASTCALL(ns83820_rx_kick(struct net_device *ndev));
-static void fastcall ns83820_rx_kick(struct net_device *ndev)
+static void ns83820_rx_kick(struct net_device *ndev)
 {
struct ns83820 *dev = PRIV(ndev);
/*if (nr_rx_empty(dev) >= NR_RX_DESC/4)*/ {
@@ -854,8 +851,7 @@ static void fastcall ns83820_rx_kick(struct net_device *ndev)
 /* rx_irq
  *
  */
-static void FASTCALL(rx_irq(struct net_device *ndev));
-static void fastcall rx_irq(struct net_device *ndev)
+static void rx_irq(struct net_device *ndev)
 {
struct ns83820 *dev = PRIV(ndev);
struct rx_info *info = &dev->rx_info;
diff --git a/include/net/bluetooth/rfcomm.h b/include/net/bluetooth/rfcomm.h
index 25aa575..98ec7a3 100644
--- a/include/net/bluetooth/rfcomm.h
+++ b/include/net/bluetooth/rfcomm.h
@@ -252,8 +252,8 @@ static inline void rfcomm_dlc_put(struct rfcomm_dlc *d)
rfcomm_dlc_free(d);
 }
 
-extern void FASTCALL(__rfcomm_dlc_throttle(struct rfcomm_dlc *d));
-extern void FASTCALL(__rfcomm_dlc_unthrottle(struct rfcomm_dlc *d));
+extern void __rfcomm_dlc_throttle(struct rfcomm_dlc *d);
+extern void __rfcomm_dlc_unthrottle(struct rfcomm_dlc *d);
 
 static inline void rfcomm_dlc_throttle(struct rfcomm_dlc *d)
 {
diff --git a/include/net/sock.h b/include/net/sock.h
index f415992..803d8f2 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -774,14 +774,14 @@ do { \
lockdep_init_map(&(sk)->sk_lock.dep_map, (name), (key), 0); \
 } while (0)
 
-extern void FASTCALL(lock_sock_nested(struct sock *sk, int subclass));
+extern void lock_sock_nested(struct sock *sk, int subclass);
 
 static inline void lock_sock(struct sock *sk)
 {
lock_sock_nested(sk, 0);
 }
 
-extern void FASTCALL(release_sock(struct sock *sk));
+extern void release_sock(struct sock *sk);
 
 /* BH context may only use the following locking interface. */
 #define bh_lock_sock(__sk) spin_lock(&((__sk)->sk_lock.slock))
diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c
index d3e4e18..0c2c937 100644
--- a/net/bluetooth/rfcomm/core.c
+++ b/net/bluetooth/rfcomm/core.c
@@ -465,7 +465,7 @@ int rfcomm_dlc_send(struct rfcomm_dlc *d, struct sk_buff *skb)
return len;
 }
 
-void fastcall __rfcomm_dlc_throttle(struct rfcomm_dlc *d)
+void __rfcomm_dlc_throttle(struct rfcomm_dlc *d)
 {
BT_DBG("dlc %p state %ld", d, d->state);
 
@@ -476,7 +476,7 @@ void fastcall __rfcomm_dlc_throttle(struct rfcomm_dlc *d)
rfcomm_schedule(RFCOMM_SCHED_TX);
 }
 
-void fastcall __rfcomm_dlc_unthrottle(struct rfcomm_dlc *d)
+void __rfcomm_dlc_unthrottle(struct rfcomm_dlc *d)
 {
BT_DBG("dlc %p state %ld", d, d->state);
 
diff --git a/net/core/dev.c b/net/core/dev.c
index 06615df..d48c9cf 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2143,7 +2143,7 @@ static int process_backlog(struct napi_struct *napi, int quota)
  *
  * The entry's receive function will be scheduled to run
  */
-void fastcall __napi_schedule(struct napi_struct *n)
+void __napi_schedule(struct napi_struct *n)
 {
unsigned long flags;
 
diff --git

Re: 2.6.24-rc5-mm1: problems with cat /proc/kpageflags

2007-12-16 Thread David Miller

From: Matt Mackall [EMAIL PROTECTED]
Date: Sun, 16 Dec 2007 20:11:49 -0600

> But as the function doesn't actually show up in your stack trace,
> something else is probably wrong. So I'd also try commenting out
> pieces of that function until it started working.

Some piece of state is being indirectly corrupted and this
is showing up later in some unrelated operation.

Can someone send me this kpageflags patch under separate
cover?  I'll try to figure out why it farts on sparc64.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH net-2.6.25 8/8] Remove unused IPV4TYPE macros

2007-12-16 Thread David Miller

From: Joe Perches [EMAIL PROTECTED]
Date: Sun, 16 Dec 2007 20:01:06 -0800

> On Sun, 2007-12-16 at 13:48 -0800, David Miller wrote:
> > From: Joe Perches [EMAIL PROTECTED]
> > Date: Thu, 13 Dec 2007 15:39:01 -0800
> > > Signed-off-by: Joe Perches [EMAIL PROTECTED]
> > Applied, thanks for doing this work Joe.
>
> I broke the parisc build.  Bad Joe...
>
> Here's the patch:
>
> Signed-off-by: Joe Perches [EMAIL PROTECTED]

Applied, thanks.

Re: [PATCH] [NET]: Remove FASTCALL/fastcall macros

2007-12-16 Thread David Miller

From: Harvey Harrison [EMAIL PROTECTED]
Date: Sun, 16 Dec 2007 20:16:25 -0800

> X86_32 was the last user of the FASTCALL/fastcall macros, now that it
> uses regparm(3) by default, these macros expand to nothing.
>
> Signed-off-by: Harvey Harrison [EMAIL PROTECTED]
> ---
> Dave, this is a wrap-up of my patch in your net-2.6.25.git with
> the build breakage fix from Andrew Morton included.  This also
> chops the fastcall macro which is also now empty.

If only it applied:

[EMAIL PROTECTED]:~/src/GIT/net-2.6.25$ pcheck diff
+ git apply --check --whitespace=error-all diff
error: patch failed: drivers/net/ns83820.c:611
error: drivers/net/ns83820.c: patch does not apply
error: patch failed: include/net/bluetooth/rfcomm.h:252
error: include/net/bluetooth/rfcomm.h: patch does not apply
error: patch failed: include/net/sock.h:774
error: include/net/sock.h: patch does not apply

kconfig: Obey KCONFIG_ALLCONFIG choices with randconfig.

2007-12-16 Thread Paul Mundt

Currently when using KCONFIG_ALLCONFIG with randconfig the choice options
are clobbered. As recommended by Roman, this adds an is_new test to see
whether to select a new option or obey the existing one.
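
The intent of the is_new test can be sketched in isolation as follows. This is a simplified stand-in, not the kconfig code: pick_choice() and its parameters are hypothetical names, and rand() is used in place of random() to keep the sketch portable.

```c
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical helper: return the 1-based index of the choice to use.
 * A value that was already set (for example by KCONFIG_ALLCONFIG) is kept;
 * only a new, unset choice is randomized.
 */
static int pick_choice(bool is_new, int current_def, int cnt)
{
	if (is_new)
		return (rand() % cnt) + 1;
	return current_def;
}
```

Called with is_new false, the helper returns the preset value untouched, which is the behaviour the patch is after.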

This is a resend of the earlier patch a couple of weeks ago, since there
was no reply. Original thread is at http://lkml.org/lkml/2007/11/28/94

It would be nice to have this for 2.6.24.

Signed-off-by: Paul Mundt [EMAIL PROTECTED]

---

 scripts/kconfig/conf.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/scripts/kconfig/conf.c b/scripts/kconfig/conf.c
index a38787a..8d6f174 100644
--- a/scripts/kconfig/conf.c
+++ b/scripts/kconfig/conf.c
@@ -374,7 +374,8 @@ static int conf_choice(struct menu *menu)
continue;
break;
case set_random:
-   def = (random() % cnt) + 1;
+   if (is_new)
+   def = (random() % cnt) + 1;
case set_default:
case set_yes:
case set_mod:

Re: 1.0.0.0 DNS replies for many domain names (network)

2007-12-16 Thread Vaidyanathan Srinivasan

* Amogh Hooshdar [EMAIL PROTECTED] [2007-12-14 17:20:17]:

> I am having a strange problem with Debian Etch 4.0 (both 64-bit and
> 32-bit) using 2.6.18 kernel. Most websites do not open with browser,
> Pidgin and most other GUI applications. but I am able to ping them
> fine. I am also able to do nslookup properly. When I tried to
> investigate it with Wireshark net sniffer, I observed the following.
>
> PROBLEM WITH 2.6.18
> Say, I try to open www.google.com, browser sends DNS query for
> www.google.com to my DNS server which is correctly configured in
> resolv.conf. It replies with the correct IP address. www.google.com
> redirects the browser to www.google.co.in. browser sends a DNS query
> again for www.google.co.in and the DNS server replies with 1.0.0.0
> which obviously is the wrong address.

I had this problem on Debian 4.0 and it was due to a bug in the DSL
router.  I had the DNS server set to 192.168.1.1, which is my DSL router;
it holds the real DNS IP and forwards the DNS lookup requests.

Once in a while the DNS proxy server will give out 1.0.0.0.  The
solution I used was to find the real DNS server and fill it in
resolv.conf.  This avoids the DNS proxy on the router, and then the problem
went away.

https://bugs.launchpad.net/ubuntu/+bug/81057

--Vaidy
 

Re: div64: Rework 64-bit type safety checks in do_div().

2007-12-16 Thread Al Viro

On Mon, Dec 17, 2007 at 12:20:19PM +0900, Paul Mundt wrote:
> (Adding Ingo to CC regarding kernel/lockdep_proc.c..)
>
> That seems to be an accurate assessment, yes. If do_div(s64, ...) is buggy
> behaviour, then the current check is fine, and the callsites should be
> corrected. Though if there's code in-tree that relies on s64 do_div, that
> seems to be a more problematic issue.

It is a bug and the only existing callers that manage to work are those that
make sure that signed value is positive.  Still asking for trouble...
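
A userspace sketch of why a negative value is trouble for an unsigned-only divider. This is not the kernel's do_div(); both helper names are hypothetical, and plain C division stands in for the arch-specific implementation.

```c
#include <stdint.h>

/* What an unsigned-only divider effectively computes for a signed input. */
static uint64_t divide_as_unsigned(int64_t n, uint32_t base)
{
	return (uint64_t)n / base;
}

/*
 * Divide the magnitude, then restore the sign (truncates toward zero).
 * INT64_MIN is not handled specially in this sketch.
 */
static int64_t divide_signed(int64_t n, uint32_t base)
{
	uint64_t mag = n < 0 ? (uint64_t)0 - (uint64_t)n : (uint64_t)n;
	uint64_t q = mag / base;

	return n < 0 ? -(int64_t)q : (int64_t)q;
}
```

For a positive input both helpers agree, which matches the remark that the existing callers only work because they keep the signed value positive; for -10 / 3 the unsigned view yields a huge quotient instead of -3.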

[PATCH] NET: ns83820.c remove fastcall macro

2007-12-16 Thread Harvey Harrison

Signed-off-by: Harvey Harrison [EMAIL PROTECTED]
---
Dave, this is the remainder of the FASTCALL/fastcall removal
patch that is not already in your tree.
Generated against net-2.6.25.git

 drivers/net/ns83820.c |    9 ++++-----
 1 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ns83820.c b/drivers/net/ns83820.c
index 3652c6c..b42c05f 100644
--- a/drivers/net/ns83820.c
+++ b/drivers/net/ns83820.c
@@ -611,8 +611,7 @@ static inline int rx_refill(struct net_device *ndev, gfp_t gfp)
return i ? 0 : -ENOMEM;
 }
 
-static void rx_refill_atomic(struct net_device *ndev);
-static void fastcall rx_refill_atomic(struct net_device *ndev)
+static void rx_refill_atomic(struct net_device *ndev)
 {
rx_refill(ndev, GFP_ATOMIC);
 }
@@ -633,7 +632,7 @@ static inline void clear_rx_desc(struct ns83820 *dev, unsigned i)
build_rx_desc(dev, dev->rx_info.descs + (DESC_SIZE * i), 0, 0, CMDSTS_OWN, 0);
 }
 
-static void fastcall phy_intr(struct net_device *ndev)
+static void phy_intr(struct net_device *ndev)
 {
struct ns83820 *dev = PRIV(ndev);
static const char *speeds[] = { "10", "100", "1000", "1000(?)", "1000F" };
@@ -831,7 +830,7 @@ static void ns83820_cleanup_rx(struct ns83820 *dev)
}
 }
 
-static void fastcall ns83820_rx_kick(struct net_device *ndev)
+static void ns83820_rx_kick(struct net_device *ndev)
 {
struct ns83820 *dev = PRIV(ndev);
/*if (nr_rx_empty(dev) >= NR_RX_DESC/4)*/ {
@@ -852,7 +851,7 @@ static void fastcall ns83820_rx_kick(struct net_device *ndev)
 /* rx_irq
  *
  */
-static void fastcall rx_irq(struct net_device *ndev)
+static void rx_irq(struct net_device *ndev)
 {
struct ns83820 *dev = PRIV(ndev);
struct rx_info *info = &dev->rx_info;
-- 
1.5.4.rc0.1083.gf568




Re: [PATCH] lib: proportion: fix underflow in prop_norm_percpu()

2007-12-16 Thread Jiang zhe

On Fri, 2007-12-14 at 17:01 +0100, Peter Zijlstra wrote:
> Subject: lib: proportion: fix underflow in prop_norm_percpu()
>
> Zhe Jiang noticed that it's possible to underflow pl->events in
> prop_norm_percpu() when the value returned by percpu_counter_read() is less
> than the error on that read and the period delay > 1. In that case half might
> not trigger the batch increment and the value will be identical on the next
> iteration, causing the same half to be subtracted again and again.
>
> Fix this by rewriting the division as a single subtraction instead of a
> subtraction loop and using percpu_counter_sum() when the value returned
> by percpu_counter_read() is smaller than the error.
>
> The latter is still needed if we want pl->events to shrink properly in the
> error region.
>
> Jiang, can I get a Reviewed-by from you? - if you agree that is :-)
>
> Signed-off-by: Peter Zijlstra [EMAIL PROTECTED]
> Cc: zhejiang [EMAIL PROTECTED]
> ---
>  lib/proportions.c |   36 +++++++++++++++---------------------
>  1 file changed, 15 insertions(+), 21 deletions(-)
>
> Index: linux-2.6/lib/proportions.c
> ===================================================================
> --- linux-2.6.orig/lib/proportions.c
> +++ linux-2.6/lib/proportions.c
> @@ -190,6 +190,8 @@ prop_adjust_shift(int *pl_shift, unsigne
>   * PERCPU
>   */
>  
> +#define PROP_BATCH (8*(1+ilog2(nr_cpu_ids)))
> +
>  int prop_local_init_percpu(struct prop_local_percpu *pl)
>  {
>  	spin_lock_init(&pl->lock);
> @@ -230,31 +232,23 @@ void prop_norm_percpu(struct prop_global
>  
>  	spin_lock_irqsave(&pl->lock, flags);
>  	prop_adjust_shift(&pl->shift, &pl->period, pg->shift);
> +
>  	/*
>  	 * For each missed period, we half the local counter.
>  	 * basically:
>  	 *   pl->events >> (global_period - pl->period);
> -	 *
> -	 * but since the distributed nature of percpu counters make division
> -	 * rather hard, use a regular subtraction loop. This is safe, because
> -	 * the events will only every be incremented, hence the subtraction
> -	 * can never result in a negative number.
>  	 */
> -	while (pl->period != global_period) {
> -		unsigned long val = percpu_counter_read(&pl->events);
> -		unsigned long half = (val + 1) >> 1;
> -
> -		/*
> -		 * Half of zero won't be much less, break out.
> -		 * This limits the loop to shift iterations, even
> -		 * if we missed a million.
> -		 */
> -		if (!val)
> -			break;
> -
> -		percpu_counter_add(&pl->events, -half);
> -		pl->period += period;
> -	}
> +	period = (global_period - pl->period) >> (pg->shift - 1);
> +	if (period < BITS_PER_LONG) {
> +		s64 val = percpu_counter_read(&pl->events);
> +
> +		if (val < (nr_cpu_ids * PROP_BATCH))
> +			val = percpu_counter_sum(&pl->events);
> +
> +		__percpu_counter_add(&pl->events, -val + (val >> period), PROP_BATCH);
> +	} else
> +		percpu_counter_set(&pl->events, 0);
> +
>  	pl->period = global_period;
>  	spin_unlock_irqrestore(&pl->lock, flags);
>  }
> @@ -267,7 +261,7 @@ void __prop_inc_percpu(struct prop_descr
>  	struct prop_global *pg = prop_get_global(pd);
>  
>  	prop_norm_percpu(pg, pl);
> -	percpu_counter_add(&pl->events, 1);
> +	__percpu_counter_add(&pl->events, 1, PROP_BATCH);
>  	percpu_counter_add(&pg->events, 1);
>  	prop_put_global(pd, pg);
>  }
 

Reviewed-by: Jiang Zhe [EMAIL PROTECTED]

Thanks!
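
The arithmetic behind the rewrite quoted above can be checked with a small userspace sketch. It models only an exact, non-negative counter: the approximation error of percpu_counter_read(), which is what caused the underflow, is deliberately left out. Both function names and LONG_BITS are hypothetical.

```c
#include <limits.h>

#define LONG_BITS (sizeof(long) * CHAR_BIT)

/* Old approach: subtract half of the counter once per missed period. */
static long decay_by_loop(long events, unsigned long missed)
{
	while (missed-- && events) {
		long half = (events + 1) >> 1;

		events -= half;
	}
	return events;
}

/* New approach: one adjustment of -val + (val >> missed). */
static long decay_by_shift(long events, unsigned long missed)
{
	if (missed >= LONG_BITS)
		return 0;
	return events + (-events + (events >> missed));
}
```

On an exact counter the two agree, since val - ((val + 1) >> 1) equals val >> 1. The single adjustment simply avoids re-reading a stale approximate value on every iteration.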




/sys/block [was: [PATCH 007 of 7] md: Get name for block device in sysfs]

2007-12-16 Thread Michael Tokarev

Kay Sievers wrote:
> On Mon, 2007-12-17 at 09:43 +1100, Neil Brown wrote:
> > On Saturday December 15, [EMAIL PROTECTED] wrote:
> > > On Dec 14, 2007 7:26 AM, NeilBrown [EMAIL PROTECTED] wrote:
> > > > Given an fd on a block device, returns a string like
> > > >
> > > >     /block/sda/sda1
> > > >
> > > > which can be used to find related information in /sys.
> > >
> > > As pointed out to when you came up with the idea, we can't do this. A
> > > devpath is a path to the device and will not necessarily start with
> > > /block for block devices. It may start with /devices and can be much
> > > longer than BDEVNAME_SIZE*2  + 10.
> > When you say "will not necessarily" can I take that to mean that it
> > currently does, but it might (will) change??
>
> It's in -mm. The devpath for all block devices, like for all other
> devices, will start with /devices/* if !SYSFS_DEPRECATED.

This is the second time I come across this (planned?) change, and for
the second time I can't understand it.

How to distinguish char devices from block devices in sysfs?
Is the only way to read a symlink `subsystem' in the device
directory?

For now, I have some shell code (used heavily in numerous places),
which looks like this:

  function makedev() {
    ...
    case $DEVPATH in
      /block/*) TYPE=b ;;
      *) TYPE=c ;;
    esac
    ...
    mknod /dev/$DEV $TYPE $MAJOR $MINOR
  }

The only external process invocation in there is mknod; all
the rest is done using pure shell constructs.  Is it really
necessary to spawn another process just to read a symlink
now?  It will be almost 2 times slower.

(Sure thing this may be rewritten in C, but using shell it's
MUCH easier to customize if necessary.)
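
For comparison, the C route mentioned above could look roughly like the sketch below. It is only an illustration: both function names are hypothetical, and it assumes that the last component of the `subsystem' link target is "block" for block devices. The exact link target depends on the sysfs layout, so treat that as an assumption.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Classify by the last path component of the subsystem link target. */
static char dev_type_from_subsystem(const char *link_target)
{
	const char *base = strrchr(link_target, '/');

	base = base ? base + 1 : link_target;
	return strcmp(base, "block") == 0 ? 'b' : 'c';
}

/* Read <sysfs_dir>/subsystem with readlink(2); returns '?' on failure. */
static char dev_type_for(const char *sysfs_dir)
{
	char link[4096], target[4096];
	ssize_t n;

	snprintf(link, sizeof(link), "%s/subsystem", sysfs_dir);
	n = readlink(link, target, sizeof(target) - 1);
	if (n < 0)
		return '?';
	target[n] = '\0';	/* readlink() does not NUL-terminate */
	return dev_type_from_subsystem(target);
}
```

In C the symlink is read with a single system call and no extra process, which is the cost the shell version would have to pay.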

Also, /sys/block/ directory is very easy to use currently, --
unlike other /sys/ stuff which is way too deep and often
placed in unknown/unexpected places (and /sys/class/ and
/sys/bus/ directories are changing all the time).

What's the benefit of moving things from /sys/block/ to
/sys/devices/ ?

Thanks.

/mjt

[PATCH] ecryptfs: fix fsx data corruption problems

2007-12-16 Thread Eric Sandeen

ecryptfs in 2.6.24-rc3 wasn't surviving fsx for me at all,
dying after 4 ops.  Generally, I was encountering problems with stale
data and improperly zeroed pages.  An extending truncate + write,
for example, would expose stale data.

With the changes below I got to a million ops and beyond with all
mmap ops disabled - mmap still needs work.  (A version of this
patch on a RHEL5 kernel ran for over 110 million fsx ops)

I added a few comments as well, to the best of my understanding
as I read through the code.
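
The invariant the patch enforces (a write past the current end of file must zero the gap it creates) can be shown with a flat in-memory buffer. This is a userspace sketch, not ecryptfs code: extending_write() is a hypothetical name and there are no pages or encryption involved.

```c
#include <stddef.h>
#include <string.h>

/*
 * Write len bytes at offset into buf, whose valid contents end at *size.
 * Any hole between the old end and offset is zero-filled first, so stale
 * bytes are never exposed. The caller guarantees offset + len fits in buf.
 */
static void extending_write(char *buf, size_t *size, size_t offset,
			    const char *data, size_t len)
{
	if (offset > *size)
		memset(buf + *size, 0, offset - *size);	/* zero the hole */
	memcpy(buf + offset, data, len);
	if (offset + len > *size)
		*size = offset + len;
}
```

If the memset() is skipped, whatever was left in the buffer beyond the old size becomes readable file data, which is the stale-data symptom fsx reported.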

Signed-off-by: Eric Sandeen [EMAIL PROTECTED]

---


Index: linux-2.6.24-rc3/fs/ecryptfs/mmap.c
===================================================================
--- linux-2.6.24-rc3.orig/fs/ecryptfs/mmap.c
+++ linux-2.6.24-rc3/fs/ecryptfs/mmap.c
@@ -263,14 +263,13 @@ out:
 	return 0;
 }
 
+/* This function must zero any hole we create */
 static int ecryptfs_prepare_write(struct file *file, struct page *page,
 				  unsigned from, unsigned to)
 {
 	int rc = 0;
+	loff_t prev_page_end_size;
 
-	if (from == 0 && to == PAGE_CACHE_SIZE)
-		goto out;	/* If we are writing a full page, it will be
-				   up to date. */
 	if (!PageUptodate(page)) {
 		rc = ecryptfs_read_lower_page_segment(page, page->index, 0,
 						      PAGE_CACHE_SIZE,
@@ -283,22 +282,32 @@ static int ecryptfs_prepare_write(struct
 		} else
 			SetPageUptodate(page);
 	}
-	if (page->index != 0) {
-		loff_t end_of_prev_pg_pos =
-			(((loff_t)page->index << PAGE_CACHE_SHIFT) - 1);
 
-		if (end_of_prev_pg_pos > i_size_read(page->mapping->host)) {
+	prev_page_end_size = ((loff_t)page->index << PAGE_CACHE_SHIFT);
+
+	/*
+	 * If creating a page or more of holes, zero them out via truncate.
+	 * Note, this will increase i_size.
+	 */
+	if (page->index != 0) {
+		if (prev_page_end_size > i_size_read(page->mapping->host)) {
 			rc = ecryptfs_truncate(file->f_path.dentry,
-					       end_of_prev_pg_pos);
+					       prev_page_end_size);
 			if (rc) {
 				printk(KERN_ERR "Error on attempt to "
 				       "truncate to (higher) offset [%lld]; "
-				       "rc = [%d]\n", end_of_prev_pg_pos, rc);
+				       "rc = [%d]\n", prev_page_end_size, rc);
 				goto out;
 			}
 		}
-		if (end_of_prev_pg_pos + 1 > i_size_read(page->mapping->host))
-			zero_user_page(page, 0, PAGE_CACHE_SIZE, KM_USER0);
+	}
+	/*
+	 * Writing to a new page, and creating a small hole from start of page?
+	 * Zero it out.
+	 */
+	if ((i_size_read(page->mapping->host) == prev_page_end_size) &&
+	    (from != 0)) {
+		zero_user_page(page, 0, PAGE_CACHE_SIZE, KM_USER0);
 	}
 out:
 	return rc;
Index: linux-2.6.24-rc3/fs/ecryptfs/read_write.c
===================================================================
--- linux-2.6.24-rc3.orig/fs/ecryptfs/read_write.c
+++ linux-2.6.24-rc3/fs/ecryptfs/read_write.c
@@ -124,6 +124,10 @@ int ecryptfs_write(struct file *ecryptfs
 	loff_t pos;
 	int rc = 0;
 
+	/*
+	 * if we are writing beyond current size, then start pos
+	 * at the current size - we'll fill in zeros from there.
+	 */
 	if (offset > ecryptfs_file_size)
 		pos = ecryptfs_file_size;
 	else
@@ -137,6 +141,7 @@ int ecryptfs_write(struct file *ecryptfs
 		if (num_bytes > total_remaining_bytes)
 			num_bytes = total_remaining_bytes;
 		if (pos < offset) {
+			/* remaining zeros to write, up to destination offset */
 			size_t total_remaining_zeros = (offset - pos);
 
 			if (num_bytes > total_remaining_zeros)
@@ -167,17 +172,27 @@ int ecryptfs_write(struct file *ecryptfs
 			}
 		}
 		ecryptfs_page_virt = kmap_atomic(ecryptfs_page, KM_USER0);
+
+		/*
+		 * pos: where we're now writing, offset: where the request was
+		 * If current pos is before request, we are filling zeros
+		 * If we are at or beyond request, we are writing the *data*
+		 * If we're in a fresh page beyond eof, zero it in either case
+		 */
+		if (pos < offset || !start_offset_in_page) {
+			/* We are extending past the previous end of the file.
+			 * Fill in zero values to the end of the page */
+			memset(((char *)ecryptfs_page_virt
+

Re: 1.0.0.0 DNS replies for many domain names (network)

2007-12-16 Thread Amogh Hooshdar

I fixed this by installing bind9 which has named server. After
installing bind9, I used the default configuration, which understands
DNS  type queries and uses the root name servers and other servers
for resolution.

On Dec 17, 2007 10:21 AM, Vaidyanathan Srinivasan
[EMAIL PROTECTED] wrote:
> * Amogh Hooshdar [EMAIL PROTECTED] [2007-12-14 17:20:17]:
>
> > I am having a strange problem with Debian Etch 4.0 (both 64-bit and
> > 32-bit) using 2.6.18 kernel. Most websites do not open with browser,
> > Pidgin and most other GUI applications. but I am able to ping them
> > fine. I am also able to do nslookup properly. When I tried to
> > investigate it with Wireshark net sniffer, I observed the following.
> >
> > PROBLEM WITH 2.6.18
> > Say, I try to open www.google.com, browser sends DNS query for
> > www.google.com to my DNS server which is correctly configured in
> > resolv.conf. It replies with the correct IP address. www.google.com
> > redirects the browser to www.google.co.in. browser sends a DNS query
> > again for www.google.co.in and the DNS server replies with 1.0.0.0
> > which obviously is the wrong address.
>
> I had this problem on Debian 4.0 and it was due to bug in the DSL
> router.  I had DNS server set to 192.168.1.1 that is my DSL router
> that holds the real DNS IP and forwards the DNS lookup request.
>
> Once in a while the DNS proxy server will give out 1.0.0.0.  The
> solution I used was to find the real DNS server and fill it in
> resolv.conf  This avoids the DNS proxy on the router and then problem
> went away.
>
> https://bugs.launchpad.net/ubuntu/+bug/81057
>
> --Vaidy



RE: [PATCH 2/3] arch/ : Platform changes for UCC TDM driver for MPC8323ERDB.Also includes related QE changes.

2007-12-16 Thread Aggrwal Poonam

Thanks Stephen for your comments.
I have gone through them.
I shall incorporate them and repost the patch.

Sorry for the late reply; I was on leave last week.


With Regards
Poonam 
 
 

-----Original Message-----
From: Stephen Rothwell [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, December 11, 2007 5:49 AM
To: Aggrwal Poonam
Cc: [EMAIL PROTECTED]; linux-kernel@vger.kernel.org;
[EMAIL PROTECTED]; Gala Kumar; [EMAIL PROTECTED]; Barkowski
Michael; Cutler Richard; Kalra Ashish
Subject: Re: [PATCH 2/3] arch/ : Platform changes for UCC TDM driver for
MPC8323ERDB.Also includes related QE changes.

On Mon, 10 Dec 2007 17:39:22 +0530 (IST) Poonam_Aggrwal-b10812
[EMAIL PROTECTED] wrote:

> +++ b/arch/powerpc/sysdev/qe_lib/qe.c
> @@ -149,22 +149,116 @@ EXPORT_SYMBOL(qe_issue_cmd);
>   */
>  static unsigned int brg_clk = 0;
>  
> -unsigned int get_brg_clk(void)
> +u32 get_brg_clk(enum qe_clock brgclk, enum qe_clock *brg_source)
>  {
> -	struct device_node *qe;
> -	if (brg_clk)
> -		return brg_clk;
> +	struct device_node *qe, *brg, *clocks;
> +	enum qe_clock brg_src;
> +	u32 brg_input_freq = 0;
> +	u32 brg_num;
> +	const unsigned int *prop;
>  
> -	qe = of_find_node_by_type(NULL, "qe");
> -	if (qe) {
> +	*brg_source = 0;
> +
> +	brg_num = brgclk - QE_BRG1;
> +	brg = of_find_compatible_node(NULL, NULL, "fsl,cpm-brg");
> +	if (brg) {
>  		unsigned int size;
> -		const u32 *prop = of_get_property(qe, "brg-frequency", &size);
> -		brg_clk = *prop;
> -		of_node_put(qe);
> -	};
> +		prop = of_get_property(brg,
> +				"fsl,brg-sources", &size);
> +
> +		brg_src = *(prop + brg_num);

You should probably sanity check that prop is not NULL and points to
something large enough.

You don't use brg after here, so the of_node_put(brg) could go here to
save putting it in multiple places later.  Also, currently there are
paths through the following code that do not do the of_node_put(brg).

> +		if (brg_src == 0) {
> +			*brg_source = 0;
> +			if (brg_clk > 0) {
> +				of_node_put(brg);
> +				return brg_clk;
> +			}
> +			qe = of_find_node_by_type(NULL, "qe");
> +			if (qe) {
> +				unsigned int size;
> +				prop = of_get_property
> +					(qe, "brg-frequency", &size);
> +				of_node_put(qe);
> +				of_node_put(brg);
> +				return *prop;

NULL check here (yes, I know that the old code didn't check).

> +			}
> +		} else {
> +			*brg_source = brg_src + QE_CLK1 - 1;
> +			clocks = of_find_compatible_node(NULL, NULL,
> +					"fsl,cpm-clocks");
> +			prop = of_get_property(clocks,
> +					"#clock-cells", &size);
> +			/*
> +			 * clock-cells = 1 only supported right now.
> +			 */
> +			if (*prop != 1)

Again check for NULL (and possibly size).

> +				return 0;
> +			prop = of_get_property(clocks,
> +					"clock-frequency", &size);
> +
> +			brg_input_freq = *(prop+(brg_src - 1));

And again.

> +			of_node_put(clocks);
> +			of_node_put(brg);
> +			return brg_input_freq;
> +		}
> +	}
>  	return brg_clk;
>  }
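
The kind of check asked for in the review comments above could be factored into one small helper, sketched here in plain C. prop_read_cell() is a hypothetical name, not an existing OF API; it assumes the property pointer and byte size have already been obtained from of_get_property(), and it ignores endianness.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: copy cell number idx of a property into *out.
 * Returns 0 on success, -1 if the property is missing or too short.
 */
static int prop_read_cell(const uint32_t *prop, size_t size_bytes,
			  unsigned int idx, uint32_t *out)
{
	if (!prop)
		return -1;
	if (((size_t)idx + 1) * sizeof(*prop) > size_bytes)
		return -1;
	*out = prop[idx];
	return 0;
}
```

Each `*(prop + n)` dereference in the quoted code would then become a call that can fail cleanly, which also gives a single place to drop the node references on the error path.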
--
Cheers,
Stephen Rothwell[EMAIL PROTECTED]
http://www.canb.auug.org.au/~sfr/

Re: [patch 1/2] [RFC] Simple tamper-proof device filesystem.

2007-12-16 Thread David Newall


Tetsuo Handa wrote:
> If Bob is malicious and creates /dev/sda1 with block-8-2 attribute [...]

Bob can't do that.  Only root can.

Re: [PATCH 1/5] power: RFC: introduce a new power API

2007-12-16 Thread Anton Vorontsov

Hello Andres, David,

Firstly, Andres, thank you for the efforts.

I pretty much foresaw exactly what you had in mind when we were
discussing the idea. With patches it's indeed easier to show
the flaws of this approach.


On Sun, Dec 16, 2007 at 09:36:24PM -0500, David Woodhouse wrote:
> On Sun, 2007-12-16 at 21:24 -0500, Andres Salomon wrote:
> > This API has the power_supply drivers device their own device_attribute
> > list; I find this to be a lot more flexible and cleaner.

I don't see how this is more flexible and cleaner. See below.

> > For example,
> > rather than having a function with a huge switch statement (as olpc_battery
> > currently has), we have separate callback functions.

Is this an improvement? Look into ds2760_battery.c. I'm scared to
imagine what it will look like after conversion.

As for olpc's huge switch statement, it could be split into
functions _without_ drastic changes to the PSY class. As a bonus,
you will get _inlining_ of these functions by gcc, because
there will be just a single user of each function. With
functions exported via pointers you can't do that.

You have tons of similar functions with similar functionality that
differ only by the data source. That scheme was in the early PSY
class I posted here this summer. And I turned it down, fortunately.


I bet I can convert the huge switch statement into a nice-looking switch
statement. It will be as nice as ds2760's.

The problem isn't in the PSY class.

> > We're not limited
> > to drivers only being able to pass 'int' and 'char*'s in sysfs,

You're not limited to int and char *. Anything more than that
is unnecessary, so far.

> > we're
> > not forced to keep a global string around in memory (as is again the
> > case for olpc_battery's serial number code),

If the battery chip can report strings, then you must keep them in
memory anyway. The question is when to allocate memory and when to free
it. A side question is how long to keep it.

Given that the string is small enough (a dozen bytes), it doesn't
matter how long we keep it allocated. So, in most cases it's easy
to answer: allocate at probe, free at remove, so keep it for the whole
battery lifetime. (In contrast, adding tons of functions will waste
_much more_ space than these dozen bytes!)


IIRC this is the main difficulty you're facing with the current properties
approach. You've converted the whole class to something different,
but you didn't show a single user of that change. Sorry, olpc is still
using a hard-coded manufacturer string:

+static ssize_t olpc_bat_manufacturer(struct device *dev,
+		struct device_attribute *attr, char *buf)
+{
+	int ret;
+	uint8_t ec_byte;
+
+	ret = olpc_bat_get_status(&ec_byte);
+	if (ret)
+		return ret;
+
+	ec_byte = BAT_ADDR_MFR_TYPE;
+	ret = olpc_ec_cmd(EC_BAT_EEPROM, &ec_byte, 1, &ec_byte, 1);
+	if (ret)
+		return ret;
+
+	switch (ec_byte >> 4) {
+	case 1:
+		strcpy(buf, "Gold Peak");
+		break;
+	case 2:
+		strcpy(buf, "BYD");
+		break;
+	default:
+		strcpy(buf, "Unknown");
+		break;
+	}
+
+	return ret;
+}

In other words: all these strings can and should be static. Why
spend CPU cycles strcpy'ing things that need not be copied?

I don't see an S/N function. I'm sure it could be implemented easily
using today's properties approach.
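
To make the "static strings" point concrete, here is a minimal sketch. mfr_name() is a hypothetical helper, not part of olpc_battery or the PSY class; the byte layout (manufacturer in the high nibble) and the names are taken from the snippet quoted above.

```c
#include <stdint.h>

/* Map the EC manufacturer byte to a constant string; nothing is copied. */
static const char *mfr_name(uint8_t ec_byte)
{
	switch (ec_byte >> 4) {
	case 1:
		return "Gold Peak";
	case 2:
		return "BYD";
	default:
		return "Unknown";
	}
}
```

The caller gets a pointer into read-only data, so there is no buffer to allocate, fill or free.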

> > we don't have ordering
> > restrictions w/ the return value being interpreted based upon where it's
> > located in the array... etc.

What exact restrictions are you talking about? There are no
restrictions per se.

> > The other API seems to encourage driver
> > authors to get their custom sysfs knobs into the list of sysfs knobs, and
> > this one doesn't.

Yes, the API encourages adding knobs, but not just any knobs. Only
ones that make sense as a property of a PSY (as opposed to some kind
of property of the PSY driver itself). The count of such properties is
limited, physically.

I recall the question about raw data. No, the PSY class isn't for raw
data you're getting from the firmware. Implement a driver-specific
binary attribute that will contain device-specific raw data.
Ideally, you should not export raw data at all (though a good idea
is to export it via debugfs).

> > If there is interest in this API, I'll convert the rest of the power_supply
> > drivers over to it and resubmit patches.
>
> Looks sane enough to me.

Heh..

> If Anton has no objections, I'll merge it.

Sorry, lots of objections.

-- 
Anton Vorontsov
email: [EMAIL PROTECTED]
backup email: [EMAIL PROTECTED]
irc://irc.freenode.net/bd2
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 2.6.24-rc5-mm 1/3] gpiolib: basic support for 16-bit PCA9539 GPIO expander

2007-12-16 Thread eric miao

[ Yup, it's an issue, patch updated as below:]

From 8de0246423cbbd0c6bb03a20baf61d360930c350 Mon Sep 17 00:00:00 2001
From: eric miao [EMAIL PROTECTED]
Date: Mon, 10 Dec 2007 17:19:12 +0800
Subject: [PATCH] gpiolib: basic support for 16-bit PCA9539 GPIO expander

1. use 16-bit register access to simplify the logic, cache OUTPUT
   and DIRECTION registers for fast access

2. platform code is required to setup
   a) the numbering of GPIO for PCA9539 (base and number)
   c) pass pca9539_platform_data within i2c_board_info

Derived from drivers/i2c/chips/pca9539.c (which has no current known
users).

Signed-off-by: eric miao [EMAIL PROTECTED]
---
 drivers/gpio/Kconfig|   10 ++
 drivers/gpio/Makefile   |1 +
 drivers/gpio/pca9539.c  |  254 +++
 include/linux/i2c/pca9539.h |   18 +++
 4 files changed, 283 insertions(+), 0 deletions(-)
 create mode 100644 drivers/gpio/pca9539.c
 create mode 100644 include/linux/i2c/pca9539.h

diff --git a/drivers/gpio/Kconfig b/drivers/gpio/Kconfig
index dd9e697..4b54f60 100644
--- a/drivers/gpio/Kconfig
+++ b/drivers/gpio/Kconfig
@@ -9,6 +9,16 @@ menu "GPIO Expanders"

 comment "I2C GPIO expanders:"

+config GPIO_PCA9539
+   tristate "PCA9539 16-bit I/O port"
+   depends on I2C
+   help
+ Say yes here to support the PCA9539 16-bit I/O port. These
+ parts are made by NXP and TI.
+
+ This driver can also be built as a module.  If so, the module
+ will be called pca9539.
+
 config GPIO_PCF857X
tristate "PCF857x, PCA857x, and PCA967x I2C GPIO expanders"
depends on I2C
diff --git a/drivers/gpio/Makefile b/drivers/gpio/Makefile
index 575bb57..d14fc1e 100644
--- a/drivers/gpio/Makefile
+++ b/drivers/gpio/Makefile
@@ -1,4 +1,5 @@
 # gpio support: dedicated expander chips, etc

 obj-$(CONFIG_GPIO_MCP23S08)+= mcp23s08.o
+obj-$(CONFIG_GPIO_PCA9539) += pca9539.o
 obj-$(CONFIG_GPIO_PCF857X) += pcf857x.o
diff --git a/drivers/gpio/pca9539.c b/drivers/gpio/pca9539.c
new file mode 100644
index 0000000..050a378
--- /dev/null
+++ b/drivers/gpio/pca9539.c
@@ -0,0 +1,254 @@
+/*
+ *  pca9539.c - 16-bit I/O port with interrupt and reset
+ *
+ *  Copyright (C) 2005 Ben Gardner [EMAIL PROTECTED]
+ *  Copyright (C) 2007 Marvell International Ltd.
+ *
+ *  Derived from drivers/i2c/chips/pca9539.c (which has no current known
+ *  users).
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; version 2 of the License.
+ */
+
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/i2c.h>
+#include <linux/i2c/pca9539.h>
+
+#include <asm/gpio.h>
+
+#define NR_PCA9539_GPIOS   16
+
+#define PCA9539_INPUT  0
+#define PCA9539_OUTPUT 2
+#define PCA9539_INVERT 4
+#define PCA9539_DIRECTION  6
+
+struct pca9539_chip {
+   unsigned gpio_start;
+   uint16_t reg_output;
+   uint16_t reg_direction;
+
+   struct i2c_client *client;
+   struct gpio_chip gpio_chip;
+};
+
+static int pca9539_write_reg(struct pca9539_chip *chip, int reg, uint16_t val)
+{
+   return i2c_smbus_write_word_data(chip->client, reg, val);
+}
+
+static int pca9539_read_reg(struct pca9539_chip *chip, int reg, uint16_t *val)
+{
+   int ret;
+
+   ret = i2c_smbus_read_word_data(chip->client, reg);
+   if (ret < 0) {
+   dev_err(&chip->client->dev, "failed reading register\n");
+   return ret;
+   }
+
+   *val = (uint16_t)ret;
+   return 0;
+}
+
+static int pca9539_gpio_direction_input(struct gpio_chip *gc, unsigned off)
+{
+   struct pca9539_chip *chip;
+   uint16_t reg_val;
+   int ret;
+
+   chip = container_of(gc, struct pca9539_chip, gpio_chip);
+
+   reg_val = chip->reg_direction | (1u << off);
+   ret = pca9539_write_reg(chip, PCA9539_DIRECTION, reg_val);
+   if (ret)
+   return -EIO;
+
+   chip->reg_direction = reg_val;
+   return 0;
+}
+
+static int pca9539_gpio_direction_output(struct gpio_chip *gc,
+   unsigned off, int val)
+{
+   struct pca9539_chip *chip;
+   uint16_t reg_val;
+   int ret;
+
+   chip = container_of(gc, struct pca9539_chip, gpio_chip);
+
+   /* set output level */
+   if (val)
+   reg_val = chip->reg_output | (1u << off);
+   else
+   reg_val = chip->reg_output & ~(1u << off);
+
+   ret = pca9539_write_reg(chip, PCA9539_OUTPUT, reg_val);
+   if (ret)
+   return -EIO;
+
+   chip->reg_output = reg_val;
+
+   /* then direction */
+   reg_val = chip->reg_direction & ~(1u << off);
+   ret = pca9539_write_reg(chip, PCA9539_DIRECTION, reg_val);
+   if (ret)
+   return -EIO;
+
+   chip->reg_direction = reg_val;
+   return 0;
+}
+
+static int pca9539_gpio_get_value(struct gpio_chip *gc, unsigned off)
+{
+   struct pca9539_chip

Re: [PATCH 2.6.24-rc5-mm 1/3] gpiolib: basic support for 16-bit PCA9539 GPIO expander

2007-12-16 Thread eric miao

[ forget about the previous patch, sorry for my carelessness not to
free the chip structure, below is the correct one ]

From c4be69e8dad28dc75e80b393f9c60f740cca7047 Mon Sep 17 00:00:00 2001
From: eric miao [EMAIL PROTECTED]
Date: Mon, 10 Dec 2007 17:19:12 +0800
Subject: [PATCH] gpiolib: basic support for 16-bit PCA9539 GPIO expander

1. use 16-bit register access to simplify the logic, cache OUTPUT
   and DIRECTION registers for fast access

2. platform code is required to setup
   a) the numbering of GPIO for PCA9539 (base and number)
   c) pass pca9539_platform_data within i2c_board_info

Derived from drivers/i2c/chips/pca9539.c (which has no current known
users).

Signed-off-by: eric miao [EMAIL PROTECTED]
---
 drivers/gpio/Kconfig|   10 ++
 drivers/gpio/Makefile   |1 +
 drivers/gpio/pca9539.c  |  260 +++
 include/linux/i2c/pca9539.h |   18 +++
 4 files changed, 289 insertions(+), 0 deletions(-)
 create mode 100644 drivers/gpio/pca9539.c
 create mode 100644 include/linux/i2c/pca9539.h

diff --git a/drivers/gpio/Kconfig b/drivers/gpio/Kconfig
index dd9e697..4b54f60 100644
--- a/drivers/gpio/Kconfig
+++ b/drivers/gpio/Kconfig
@@ -9,6 +9,16 @@ menu "GPIO Expanders"

 comment "I2C GPIO expanders:"

+config GPIO_PCA9539
+   tristate "PCA9539 16-bit I/O port"
+   depends on I2C
+   help
+ Say yes here to support the PCA9539 16-bit I/O port. These
+ parts are made by NXP and TI.
+
+ This driver can also be built as a module.  If so, the module
+ will be called pca9539.
+
 config GPIO_PCF857X
tristate "PCF857x, PCA857x, and PCA967x I2C GPIO expanders"
depends on I2C
diff --git a/drivers/gpio/Makefile b/drivers/gpio/Makefile
index 575bb57..d14fc1e 100644
--- a/drivers/gpio/Makefile
+++ b/drivers/gpio/Makefile
@@ -1,4 +1,5 @@
 # gpio support: dedicated expander chips, etc

 obj-$(CONFIG_GPIO_MCP23S08)+= mcp23s08.o
+obj-$(CONFIG_GPIO_PCA9539) += pca9539.o
 obj-$(CONFIG_GPIO_PCF857X) += pcf857x.o
diff --git a/drivers/gpio/pca9539.c b/drivers/gpio/pca9539.c
new file mode 100644
index 0000000..fc8bee4
--- /dev/null
+++ b/drivers/gpio/pca9539.c
@@ -0,0 +1,260 @@
+/*
+ *  pca9539.c - 16-bit I/O port with interrupt and reset
+ *
+ *  Copyright (C) 2005 Ben Gardner [EMAIL PROTECTED]
+ *  Copyright (C) 2007 Marvell International Ltd.
+ *
+ *  Derived from drivers/i2c/chips/pca9539.c (which has no current known
+ *  users).
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; version 2 of the License.
+ */
+
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/i2c.h>
+#include <linux/i2c/pca9539.h>
+
+#include <asm/gpio.h>
+
+#define NR_PCA9539_GPIOS   16
+
+#define PCA9539_INPUT  0
+#define PCA9539_OUTPUT 2
+#define PCA9539_INVERT 4
+#define PCA9539_DIRECTION  6
+
+struct pca9539_chip {
+   unsigned gpio_start;
+   uint16_t reg_output;
+   uint16_t reg_direction;
+
+   struct i2c_client *client;
+   struct gpio_chip gpio_chip;
+};
+
+static int pca9539_write_reg(struct pca9539_chip *chip, int reg, uint16_t val)
+{
+   return i2c_smbus_write_word_data(chip->client, reg, val);
+}
+
+static int pca9539_read_reg(struct pca9539_chip *chip, int reg, uint16_t *val)
+{
+   int ret;
+
+   ret = i2c_smbus_read_word_data(chip->client, reg);
+   if (ret < 0) {
+   dev_err(&chip->client->dev, "failed reading register\n");
+   return ret;
+   }
+
+   *val = (uint16_t)ret;
+   return 0;
+}
+
+static int pca9539_gpio_direction_input(struct gpio_chip *gc, unsigned off)
+{
+   struct pca9539_chip *chip;
+   uint16_t reg_val;
+   int ret;
+
+   chip = container_of(gc, struct pca9539_chip, gpio_chip);
+
+   reg_val = chip->reg_direction | (1u << off);
+   ret = pca9539_write_reg(chip, PCA9539_DIRECTION, reg_val);
+   if (ret)
+   return -EIO;
+
+   chip->reg_direction = reg_val;
+   return 0;
+}
+
+static int pca9539_gpio_direction_output(struct gpio_chip *gc,
+   unsigned off, int val)
+{
+   struct pca9539_chip *chip;
+   uint16_t reg_val;
+   int ret;
+
+   chip = container_of(gc, struct pca9539_chip, gpio_chip);
+
+   /* set output level */
+   if (val)
+   reg_val = chip->reg_output | (1u << off);
+   else
+   reg_val = chip->reg_output & ~(1u << off);
+
+   ret = pca9539_write_reg(chip, PCA9539_OUTPUT, reg_val);
+   if (ret)
+   return -EIO;
+
+   chip->reg_output = reg_val;
+
+   /* then direction */
+   reg_val = chip->reg_direction & ~(1u << off);
+   ret = pca9539_write_reg(chip, PCA9539_DIRECTION, reg_val);
+   if (ret)
+   return -EIO;
+
+   chip->reg_direction = reg_val;
+   return 0;
+}
+
+static int

Re: [PATCH 2.6.24-rc5-mm 2/3] gpiolib: add Generic IRQ support for 16-bit PCA9539 GPIO expander

2007-12-16 Thread eric miao

[updated according to David's suggestion to handle the error
of I2C transfer]

From c9b78718488dadc702f40789bd532d1f1765d76e Mon Sep 17 00:00:00 2001
From: eric miao [EMAIL PROTECTED]
Date: Mon, 10 Dec 2007 17:24:36 +0800
Subject: [PATCH] gpiolib: add Generic IRQ support for 16-bit PCA9539
GPIO expander

This patch adds the generic IRQ support for the PCA9539 on-chip GPIOs.

Note: due to the inaccessibility of the generic IRQ code within modules,
this support is only available if the driver is built-in.

Signed-off-by: eric miao [EMAIL PROTECTED]
---
 drivers/gpio/Kconfig   |   11 +++-
 drivers/gpio/pca9539.c |  185 
 2 files changed, 195 insertions(+), 1 deletions(-)

diff --git a/drivers/gpio/Kconfig b/drivers/gpio/Kconfig
index 4b54f60..a4f89a6 100644
--- a/drivers/gpio/Kconfig
+++ b/drivers/gpio/Kconfig
@@ -17,7 +17,16 @@ config GPIO_PCA9539
  parts are made by NXP and TI.

  This driver can also be built as a module.  If so, the module
- will be called pca9539.
+ will be called pca9539.  Note: the Generic IRQ support for the
+ chip will only be available if the driver is built-in
+
+config GPIO_PCA9539_GENERIC_IRQ
+   bool "Generic IRQ support for PCA9539"
+   depends on GPIO_PCA9539=y && GENERIC_HARDIRQS
+   help
+ Say yes here to support the Generic IRQ for the PCA9539 on-chip
+ GPIO lines. Only pin-changed IRQs (IRQ_TYPE_EDGE_BOTH) are
+ supported in hardware.

 config GPIO_PCF857X
tristate "PCF857x, PCA857x, and PCA967x I2C GPIO expanders"
diff --git a/drivers/gpio/pca9539.c b/drivers/gpio/pca9539.c
index fc8bee4..10f9549 100644
--- a/drivers/gpio/pca9539.c
+++ b/drivers/gpio/pca9539.c
@@ -14,6 +14,9 @@

 #include <linux/module.h>
 #include <linux/init.h>
+#include <linux/irq.h>
+#include <linux/interrupt.h>
+#include <linux/workqueue.h>
 #include <linux/i2c.h>
 #include <linux/i2c/pca9539.h>

@@ -33,6 +36,22 @@ struct pca9539_chip {

struct i2c_client *client;
struct gpio_chip gpio_chip;
+#ifdef CONFIG_GPIO_PCA9539_GENERIC_IRQ
+   /*
+* Note: Generic IRQ is not accessible within module code, the IRQ
+* support will thus _only_ be available if the driver is built-in
+*/
+   int irq;/* IRQ for the chip itself */
+   int irq_start;  /* starting IRQ for the on-chip GPIO lines */
+
+   uint16_t irq_mask;
+   uint16_t irq_falling_edge;
+   uint16_t irq_rising_edge;
+   uint16_t last_input;
+
+   struct irq_chip irq_chip;
+   struct work_struct irq_work;
+#endif
 };

 static int pca9539_write_reg(struct pca9539_chip *chip, int reg, uint16_t val)
@@ -155,6 +174,158 @@ static int pca9539_init_gpio(struct pca9539_chip *chip)
return gpiochip_add(gc);
 }

+#ifdef CONFIG_GPIO_PCA9539_GENERIC_IRQ
+/* FIXME: change to schedule_delayed_work() here if reading out of
+ * registers does not reflect the actual pin levels
+ */
+
+static void pca9539_irq_work(struct work_struct *work)
+{
+   struct pca9539_chip *chip;
+   uint16_t input, mask, rising, falling;
+   int ret, i;
+
+   chip = container_of(work, struct pca9539_chip, irq_work);
+
+   ret = pca9539_read_reg(chip, PCA9539_INPUT, &input);
+   if (ret < 0)
+   return;
+
+   mask = (input ^ chip->last_input) & chip->irq_mask;
+   rising = (input & mask) & chip->irq_rising_edge;
+   falling = (~input & mask) & chip->irq_falling_edge;
+
+   irq_enter();
+
+   for (i = 0; i < NR_PCA9539_GPIOS; i++) {
+   if ((rising | falling) & (1u << i)) {
+   int irq = chip->irq_start + i;
+   struct irq_desc *desc;
+
+   desc = irq_desc + irq;
+   desc_handle_irq(irq, desc);
+   }
+   }
+
+   irq_exit();
+
+   chip->last_input = input;
+}
+
+static void fastcall
+pca9539_irq_demux(unsigned int irq, struct irq_desc *desc)
+{
+   struct pca9539_chip *chip = desc->handler_data;
+
+   desc->chip->mask(chip->irq);
+   desc->chip->ack(chip->irq);
+   schedule_work(&chip->irq_work);
+   desc->chip->unmask(chip->irq);
+}
+
+static void pca9539_irq_mask(unsigned int irq)
+{
+   struct irq_desc *desc = irq_desc + irq;
+   struct pca9539_chip *chip = desc->chip_data;
+
+   chip->irq_mask &= ~(1u << (irq - chip->irq_start));
+}
+
+static void pca9539_irq_unmask(unsigned int irq)
+{
+   struct irq_desc *desc = irq_desc + irq;
+   struct pca9539_chip *chip = desc->chip_data;
+
+   chip->irq_mask |= 1u << (irq - chip->irq_start);
+}
+
+static void pca9539_irq_ack(unsigned int irq)
+{
+   /* unfortunately, we have to provide an empty irq_chip.ack even
+* if we do nothing here, Generic IRQ will complain otherwise
+*/
+}
+
+static int pca9539_irq_set_type(unsigned int irq, unsigned int type)
+{
+   struct irq_desc *desc = irq_desc + irq;
+   struct pca9539_chip *chip = desc->chip_data;
+

Re: [PATCH 2.6.24-rc5-mm 3/3] gpiolib: obsolete drivers/i2c/chips/pca9539.c

2007-12-16 Thread eric miao

[ Updated according to Jean's suggestion, thanks ]

From 5b4d907da17d57ec168643ebd847278e8d7267f9 Mon Sep 17 00:00:00 2001
From: eric miao [EMAIL PROTECTED]
Date: Sat, 15 Dec 2007 12:07:26 +0800
Subject: [PATCH] gpiolib: obsolete drivers/i2c/chips/pca9539.c and related files

for the following reasons:

1. there is currently no known users of this driver

2. the functionality of this driver is well supported with the recent
   proposed drivers/gpio/pca9539.c, using GPIO_LIB

Signed-off-by: eric miao [EMAIL PROTECTED]
Acked-by: Ben Gardner [EMAIL PROTECTED]
---
 Documentation/i2c/chips/pca9539 |   47 -
 drivers/i2c/chips/Kconfig   |   10 --
 drivers/i2c/chips/Makefile  |1 -
 drivers/i2c/chips/pca9539.c |  196 ---
 4 files changed, 0 insertions(+), 254 deletions(-)
 delete mode 100644 Documentation/i2c/chips/pca9539
 delete mode 100644 drivers/i2c/chips/pca9539.c

diff --git a/Documentation/i2c/chips/pca9539 b/Documentation/i2c/chips/pca9539
deleted file mode 100644
index c4fce6a..0000000
--- a/Documentation/i2c/chips/pca9539
+++ /dev/null
@@ -1,47 +0,0 @@
-Kernel driver pca9539
-=====================
-
-Supported chips:
-  * Philips PCA9539
-Prefix: 'pca9539'
-Addresses scanned: 0x74 - 0x77
-Datasheet:
-http://www.semiconductors.philips.com/acrobat/datasheets/PCA9539_2.pdf
-
-Author: Ben Gardner [EMAIL PROTECTED]
-
-
-Description
------------
-
-The Philips PCA9539 is a 16 bit low power I/O device.
-All 16 lines can be individually configured as an input or output.
-The input sense can also be inverted.
-The 16 lines are split between two bytes.
-
-
-Sysfs entries
--------------
-
-Each is a byte that maps to the 8 I/O bits.
-A '0' suffix is for bits 0-7, while '1' is for bits 8-15.
-
-input[01] - read the current value
-output[01]- sets the output value
-direction[01] - direction of each bit: 1=input, 0=output
-invert[01]- toggle the input bit sense
-
-input reads the actual state of the line and is always available.
-The direction defaults to input for all channels.
-
-
-General Remarks
----------------
-
-Note that each output, direction, and invert entry controls 8 lines.
-You should use the read, modify, write sequence.
-For example. to set output bit 0 of 1.
-  val=$(cat output0)
-  val=$(( $val | 1 ))
-  echo $val > output0
-
diff --git a/drivers/i2c/chips/Kconfig b/drivers/i2c/chips/Kconfig
index 2e1c24f..a676f57 100644
--- a/drivers/i2c/chips/Kconfig
+++ b/drivers/i2c/chips/Kconfig
@@ -65,16 +65,6 @@ config SENSORS_PCF8574
  These devices are hard to detect and rarely found on mainstream
  hardware.  If unsure, say N.

-config SENSORS_PCA9539
-   tristate "Philips PCA9539 16-bit I/O port"
-   depends on EXPERIMENTAL
-   help
- If you say yes here you get support for the Philips PCA9539
- 16-bit I/O port.
-
- This driver can also be built as a module.  If so, the module
- will be called pca9539.
-
 config SENSORS_PCF8591
tristate "Philips PCF8591"
depends on EXPERIMENTAL
diff --git a/drivers/i2c/chips/Makefile b/drivers/i2c/chips/Makefile
index ca924e1..bc9e9ca 100644
--- a/drivers/i2c/chips/Makefile
+++ b/drivers/i2c/chips/Makefile
@@ -8,7 +8,6 @@ obj-$(CONFIG_DS1682)+= ds1682.o
 obj-$(CONFIG_SENSORS_EEPROM)   += eeprom.o
 obj-$(CONFIG_SENSORS_MAX6875)  += max6875.o
 obj-$(CONFIG_SENSORS_M41T00)   += m41t00.o
-obj-$(CONFIG_SENSORS_PCA9539)  += pca9539.o
 obj-$(CONFIG_SENSORS_PCF8574)  += pcf8574.o
 obj-$(CONFIG_SENSORS_PCF8591)  += pcf8591.o
 obj-$(CONFIG_ISP1301_OMAP) += isp1301_omap.o
diff --git a/drivers/i2c/chips/pca9539.c b/drivers/i2c/chips/pca9539.c
deleted file mode 100644
index f43c4e7..0000000
--- a/drivers/i2c/chips/pca9539.c
+++ /dev/null
@@ -1,196 +0,0 @@
-/*
-pca9539.c - 16-bit I/O port with interrupt and reset
-
-Copyright (C) 2005 Ben Gardner [EMAIL PROTECTED]
-
-This program is free software; you can redistribute it and/or modify
-it under the terms of the GNU General Public License as published by
-the Free Software Foundation; version 2 of the License.
-*/
-
-#include <linux/module.h>
-#include <linux/init.h>
-#include <linux/slab.h>
-#include <linux/i2c.h>
-#include <linux/hwmon-sysfs.h>
-
-/* Addresses to scan */
-static unsigned short normal_i2c[] = {0x74, 0x75, 0x76, 0x77, I2C_CLIENT_END};
-
-/* Insmod parameters */
-I2C_CLIENT_INSMOD_1(pca9539);
-
-enum pca9539_cmd
-{
-   PCA9539_INPUT_0 = 0,
-   PCA9539_INPUT_1 = 1,
-   PCA9539_OUTPUT_0= 2,
-   PCA9539_OUTPUT_1= 3,
-   PCA9539_INVERT_0= 4,
-   PCA9539_INVERT_1= 5,
-   PCA9539_DIRECTION_0 = 6,
-   PCA9539_DIRECTION_1 = 7,
-};
-
-static int pca9539_attach_adapter(struct i2c_adapter *adapter);
-static int pca9539_detect(struct i2c_adapter *adapter, int address, int kind);
-static int pca9539_detach_client(struct i2c_client *client);
-
-/* This is the

Re: 2.6.24-rc5-mm1: problems with cat /proc/kpageflags

2007-12-16 Thread Andrew Morton

On Sun, 16 Dec 2007 20:26:11 -0800 (PST) David Miller [EMAIL PROTECTED] wrote:

> From: Matt Mackall [EMAIL PROTECTED]
> Date: Sun, 16 Dec 2007 20:11:49 -0600
>
> > But as the function doesn't actually show up in your stack trace,
> > something else is probably wrong. So I'd also try commenting out
> > pieces of that function until it started working.
>
> Some piece of state is being indirectly corrupted and this
> is showing up later in some unrelated operation.
>
> Can someone send me this kpageflags patch under separate
> cover?  I'll try figure out why it farts on sparc64.

hm, non trivial.  It's the third-from-last patch in:

maps4-add-proportional-set-size-accounting-in-smaps.patch
maps4-rework-task_size-macros.patch
maps4-rework-task_size-macros-mips-fix.patch
maps4-move-is_swap_pte.patch
maps4-introduce-a-generic-page-walker.patch
maps4-use-pagewalker-in-clear_refs-and-smaps.patch
maps4-simplify-interdependence-of-maps-and-smaps.patch
maps4-move-clear_refs-code-to-task_mmuc.patch
maps4-regroup-task_mmu-by-interface.patch
maps4-add-proc-pid-pagemap-interface.patch
maps4-add-proc-kpagecount-interface.patch
maps4-add-proc-kpageflags-interface.patch
maps4-make-page-monitoring-proc-file-optional.patch
maps4-make-page-monitoring-proc-file-optional-fix.patch

from
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc5/2.6.24-rc5-mm1/broken-out

That patch series does apply OK to mainline though.

Re: [patch 1/2] [RFC] Simple tamper-proof device filesystem.

2007-12-16 Thread penguin-kernel

Hello.

David Wagner wrote:
> If the attacker gets full administrator-level access on your machine,
> there are a gazillion ways the attacker can prevent other admins from
> logging on. This patch can't prevent that.  It sounds like this patch
> is trying to solve a fundamentally unsolveable problem.

Please be aware that I'm talking about using this filesystem together
with MAC.

Without MAC, an attacker who has gained root privileges can do whatever
he/she wants to do. But with MAC, an attacker who has gained root
privileges can't do whatever he/she wants: only actions permitted by the
MAC policy are allowed.

I'm not saying that this filesystem can prevent attackers from mounting
another filesystem over it, nor that it can prevent attackers from
executing /sbin/iptables or /usr/bin/passwd. Those are MAC's business.
What this filesystem can do is guarantee filenames and their attributes.

If the MAC (such as SELinux or TOMOYO Linux) allows attackers to mount
another filesystem over this filesystem, this filesystem is no longer
tamper-proof. But as long as the MAC prevents attackers from mounting
another filesystem over it, this filesystem can remain tamper-proof.

Regards.

Re: [PATCH 3/3] net: wireless: bcm43xx: big_buffer_sem semaphore to mutex

2007-12-16 Thread Larry Finger

[EMAIL PROTECTED] wrote:
> On Dec 15, 2007 7:38 AM,  [EMAIL PROTECTED] wrote:
> > I'll build latest wireless git without ipv6 late tonight.
>
> Ok, built and tested, and it's actually faster! Although still not as
> fast as bcm43xx or softmac or whatever the problem is, I was getting a
> steady 200 kB/s (as opposed to 500 kB/s for bcm43xx with the same
> file/server). I'm not sure if it was the absence of ipv6 or the
> commits included in updating my git repository though. Either way, I'm
> fairly happy that I'm out of dial-up speed territory.
>
> It'd be nice to be able to fully shake loose whatever is causing the
> speed drain - and I call it a drain since sometimes the connection
> starts out much faster, but slowly throttles down to whatever speed
> it'll stick at (used to be 40 kB/s, but now is 200 kB/s). It does seem
> to be like a cap or limit, as in if I download two files, each one
> will download at 100 kB/s.
>
> If anyone can help I'd really appreciate it. I know that bcm43xx will
> someday be dropped, and when that day comes, it'd be nice if people
> with this hardware have at least similar performance with b43 (myself
> especially).

One major difference between bcm43xx-SoftMAC and b43-mac80211 is that the
former always used a fixed rate, whereas mac80211 tries to adjust the bit
rate according to the transmission conditions. Perhaps it isn't working
quite right in your case because of some peculiarity of your AP. IIRC, you
have an 802.11b AP. If so, you will get the same bit speed behavior for
mac80211 as for bcm43xx by issuing a 'sudo iwconfig eth1 rate 11M' command.

Larry

[PATCH] scripts/checkpatch.pl: add a check for the patch level (patch -pnum)

2007-12-16 Thread Borislav Petkov

Having been bitten by this several times myself, here's a quick hack for
checking the patch level of the diffs in a patch file. It works only when
checkpatch.pl is called from within the kernel tree.
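
The idea of the check can be sketched outside of Perl as well. The
following is a rough C equivalent, not part of the patch: it drops the
first path component of the file name taken from the "+++" line (the
"a/" or "b/" prefix of a -p1 diff) and tests whether the remainder
exists relative to the current directory. Both function names are made
up for illustration.

```c
#include <string.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: skip everything up to and including the first
 * '/' in the path.  A path without any '/' is returned unchanged.
 */
static const char *strip_first_component(const char *path)
{
	const char *slash = strchr(path, '/');

	return slash ? slash + 1 : path;
}

/*
 * Returns nonzero if the stripped path exists relative to the current
 * working directory, i.e. the diff looks like a -p1 patch for this tree.
 */
static int patchlevel_looks_ok(const char *diff_path)
{
	struct stat st;

	return stat(strip_first_component(diff_path), &st) == 0;
}
```

Like the Perl version, this only gives a meaningful answer when run from
the top of the tree the patch is meant for, and a patch that adds a new
file would trip the check as well.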

---
Signed-off-by: Borislav Petkov [EMAIL PROTECTED]

--
 scripts/checkpatch.pl |   18 ++
 1 files changed, 18 insertions(+), 0 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 579f50f..b1329fc 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -653,6 +653,18 @@ sub CHK {
}
 }
 
+sub check_patchlevel {
+
+   my ($path) = @_;
+   $path =~ s![^/]*/!!;
+
+   if ($tree) {
+   if (!stat($path)) {
+   WARN("Check the patchlevel (hint: patch option -p)");
+   }
+   }
+}
+
 sub process {
my $filename = shift;
my @lines = @_;
@@ -713,10 +725,16 @@ sub process {
 #extract the filename as it passes
if ($line=~/^\+\+\+\s+(\S+)/) {
$realfile=$1;
+
+   if ($realfile) {
+   check_patchlevel($realfile);
+   }
+
$realfile =~ s@^[^/]*/@@;
$in_comment = 0;
next;
}
+
 #extract the line range in the file after the patch is applied
if ($line=~/^\@\@ -\d+(?:,\d+)? \+(\d+)(,(\d+))? \@\@/) {
$is_patch = 1;

-- 
Regards/Gruß,
Boris.

Re: [PATCH 3/3] net: wireless: bcm43xx: big_buffer_sem semaphore to mutex

2007-12-16 Thread mvtodevnull

On Dec 17, 2007 1:52 AM, Larry Finger [EMAIL PROTECTED] wrote:

> One major difference between bcm43xx-SoftMAC and b43-mac80211 is that the
> former always used a fixed rate, whereas mac80211 tries to adjust the bit
> rate according to the transmission conditions. Perhaps it isn't working
> quite right in your case because of some peculiarity of your AP. IIRC, you
> have an 802.11b AP. If so, you will get the same bit speed behavior for
> mac80211 as for bcm43xx by issuing a 'sudo iwconfig eth1 rate 11M' command.

I don't know what happened before, but after a reboot, I can't repeat
the 200 kB/s speed. It's back down to 40 kB/s, just like originally. I
didn't move the laptop, or the ap, the only thing I can think of that
might have changed is the noise level. FWIW, link quality is
consistently the same or better with b43.

Anyway, I'd noticed before that the bit rate starts at 1 Mb/s and
quickly scales to 11 Mb/s, but I tried setting it manually anyway and
didn't see any change. In fact, I set the rate to 5.5 Mb/s as well as
1 Mb/s and the download speed was the same with all three (around
30-40 kB/s).

Re: [PATCH 1/5] power: RFC: introduce a new power API

2007-12-16 Thread Andres Salomon

On Mon, 17 Dec 2007 08:51:23 +0300
Anton Vorontsov [EMAIL PROTECTED] wrote:

> Hello Andres, David,
>
> Firstly, Andres, thank you for the efforts.
>
> I quite foreseen what exactly you had in mind when we were
> discussing the idea. With patches it's indeed easier to show
> flaws of this approach.
>
>
> On Sun, Dec 16, 2007 at 09:36:24PM -0500, David Woodhouse wrote:
> > On Sun, 2007-12-16 at 21:24 -0500, Andres Salomon wrote:
> > > This API has the power_supply drivers device their own device_attribute
> > > list; I find this to be a lot more flexible and cleaner.
>
> I don't see how this is more flexible and cleaner. See below.
>
> > > For example,
> > > rather than having a function with a huge switch statement (as olpc_battery
> > > currently has), we have separate callback functions.
>
> Is this an improvement? Look into ds2760_battery.c. I scared to
> imagine what it will look like after conversion.

Why?  It would not look bad after conversion.  Basically:

static ssize_t ds2760_battery_get_status(struct device *dev,
		struct device_attribute *attr, char *buf)
{
	struct ds2760_device_info *di = to_ds2760_device_info(psy);
	return power_supply_status_str(di->charge_status, buf);
}
static ssize_t ds2760_battery_get_voltage_now(struct device *dev,
		struct device_attribute *attr, char *buf)
{
	struct ds2760_device_info *di = to_ds2760_device_info(psy);
	ds2760_battery_read_status(di);
	return sprintf(buf, "%d\n", di->voltage_uV);
}

...and so on.

If I wanted to get really clever, I could do:

#define DS2760_CALLBACK(name, fmt, var)					\
static ssize_t ds2760_battery_get_##name(struct device *dev,		\
		struct device_attribute *attr, char *buf)		\
{									\
	struct ds2760_device_info *di = to_ds2760_device_info(psy);	\
	ds2760_battery_read_status(di);					\
	return sprintf(buf, fmt, var);					\
}

DS2760_CALLBACK(voltage_now, "%d\n", di->voltage_uV)
DS2760_CALLBACK(current_now, "%d\n", di->current_uA)

etc.. but, I'm not trying to compress lines of code, I'm trying
to ensure things are readable.
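
For anyone who wants to try the macro idea outside the kernel, here is a
self-contained userspace sketch. It is only an approximation under stated
assumptions: struct ds2760_info and the ds2760_show_*() names are
stand-ins invented for this example, the state is passed in explicitly
instead of being looked up from a struct device, and snprintf() with an
explicit length replaces the sysfs buffer convention.

```c
#include <stddef.h>
#include <stdio.h>

/* Stand-in for the driver's private state; field names follow the mail. */
struct ds2760_info {
	int voltage_uV;
	int current_uA;
};

/*
 * Generate one "show" function per attribute.  Each generated function
 * formats a single field into buf and returns the number of characters
 * written (excluding the terminating NUL), as snprintf() does.
 */
#define DS2760_SHOW(name, fmt, field)					\
static int ds2760_show_##name(const struct ds2760_info *di,		\
			      char *buf, size_t len)			\
{									\
	return snprintf(buf, len, fmt, di->field);			\
}

DS2760_SHOW(voltage_now, "%d\n", voltage_uV)
DS2760_SHOW(current_now, "%d\n", current_uA)
```

Calling ds2760_show_voltage_now(&di, buf, sizeof(buf)) fills buf with the
decimal value followed by a newline, one function per attribute.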

 
> As for olpc's huge switch statement, it could be split into
> functions _without_ drastic changes to PSY class. As the bonus,
> you will get _inlining_ of these functions by gcc, because
> there will be just single user of these functions. With
> exported-via-pointers functions you can't do that.
>
> You have tons of similar functions with similar functionality, that
> only differs by the data source. That scheme was in the early PSY
> class I posted here this summer. And I turned it down, fortunately.
>
>
> On a bet, I can convert huge switch statement to nicely look switch
> statement. It will as nice as ds2760's.
>
> The problem isn't in the PSY class.
>

We're still going to be stuck with a huge switch statement.  Yes, it
wouldn't be *as* big, but ds2760_battery.c has a decently sized switch
statement, and olpc_battery.c has even more properties.

The huge switch statement is the least of my worries, though.  Getting
rid of it is just a bonus.
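
For comparison, the "keep one switch, but split the work into helpers"
layout suggested in the quoted text could look roughly like the sketch
below. This is plain userspace C with invented names (bat_state, bat_prop,
bat_get_*), not the power_supply class API. Each helper is static and has
a single caller, which is the situation where a compiler is free to
inline it.

```c
#include <errno.h>

/* Hypothetical property identifiers, for illustration only. */
enum bat_prop {
	BAT_PROP_VOLTAGE_NOW,
	BAT_PROP_CURRENT_NOW,
	BAT_PROP_CAPACITY,
};

struct bat_state {
	int voltage_uV;
	int current_uA;
	int capacity;	/* percent */
};

static int bat_get_voltage(const struct bat_state *b, int *val)
{
	*val = b->voltage_uV;
	return 0;
}

static int bat_get_current(const struct bat_state *b, int *val)
{
	*val = b->current_uA;
	return 0;
}

static int bat_get_capacity(const struct bat_state *b, int *val)
{
	*val = b->capacity;
	return 0;
}

/* Single entry point: the switch only dispatches, the helpers do the work. */
static int bat_get_property(const struct bat_state *b, enum bat_prop prop,
			    int *val)
{
	switch (prop) {
	case BAT_PROP_VOLTAGE_NOW:
		return bat_get_voltage(b, val);
	case BAT_PROP_CURRENT_NOW:
		return bat_get_current(b, val);
	case BAT_PROP_CAPACITY:
		return bat_get_capacity(b, val);
	default:
		return -EINVAL;
	}
}
```

The switch still grows by one case per property, which is the point made
above; what changes is that each case body shrinks to a single call.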


> > > We're not limited
> > > to drivers only being able to pass 'int' and 'char*'s in sysfs,
>
> You're not limited to int and char *. Anything more than that
> is unnecessary, so far.
>

See below about the eeprom dump.  Originally, it was desired for this
to be a hex string; after that, binary.  Of course, once I actually
started adding device_attributes to olpc_battery.c, I started wondering;
why not just make *all* the properties device_attributes?  And, what if
I want to show something larger than a signed int?  What if I have a
value that I want to pad with 0's?
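
To make the last two questions concrete, here is a small userspace sketch
of a formatting callback that handles a 64-bit value and zero-pads it.
The function name show_serial() and the choice of 16 hex digits are
assumptions made for this example only; they are not taken from
olpc_battery.c.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical show-style helper: print a 64-bit value as exactly
 * 16 zero-padded lowercase hex digits followed by a newline.
 * Returns the number of characters written, excluding the NUL.
 */
static int show_serial(uint64_t serial, char *buf, size_t len)
{
	return snprintf(buf, len, "%016" PRIx64 "\n", serial);
}
```

With a per-attribute callback the driver picks the width, base and
padding itself; a value of 0x1234abcd comes out as "000000001234abcd".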


   we're
   not forced to keep a global string around in memory (as is again the
   case for olpc_battery's serial number code),
 
 If the battery chip can report strings, then you must keep them in
 memory anyway. The question is when to allocate the memory and when to
 free it. A side question is how long to keep it.
 
 Given that the string is small enough (a dozen bytes), it doesn't
 matter how long we keep it allocated. So, in most cases it's easy
 to answer: allocate at probe, free at remove, so keep it for the whole
 battery lifetime. (In contrast, adding tons of functions will waste
 _much more_ space than these dozen bytes!)
 
 
 IIRC this is the main difficulty you're facing with the current properties
 approach. You've converted the whole class to something different,
 but you didn't show a single user of that change. Sorry, olpc is still
 using a hard-coded manufacturer string:
 

Well, no, I was talking about the serial number string.  It's not
upstream yet, it's just in OLPC's repo.

http://dev.laptop.org/git?p=olpc-2.6;a=commitdiff;h=f9b4313060ab9047942707da6d3084f7792e714c

Note bat_serial, 17 bytes sitting around.  That's actually not that
bad, merely awkward.  Worse (and what caused me to start reworking
the API) was a dump of the

Re: Fw: [PACKET]: Fix /proc/net/packet crash due to bogus private pointer

2007-12-16 Thread Herbert Xu

On Sat, Dec 15, 2007 at 11:56:04PM -0800, Andrew Morton wrote:
 On Sun, 16 Dec 2007 01:37:01 -0500 Miles Lane [EMAIL PROTECTED] wrote:
 
   On Sun, Dec 16, 2007 at 11:07:07AM +0800, Herbert Xu wrote:

So I posted this patch after 19:00 PST on 15 Dec.

  Dec 15 13:44:39 syntropy kernel:  #0:  (&p->lock){--..}, at:
  [crypto_algapi:seq_read+0x25/0x191c1] seq_read+0x25/0x26f
 
 So your kernel is still feeding garbage into lockdep.
 
 Are you really really sure that kernel had Herbert's patch applied?

The above log message is stamped as 13:44 PST.  I gotta say
this doesn't look good :)

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: dst cache overflow

2007-12-16 Thread Tobias Diedrich

Herbert Xu wrote:
 On Sat, Dec 15, 2007 at 11:08:58AM +0100, Tobias Diedrich wrote:
 
  Hmm, how do I look for that, if netstat doesn't look suspicious?
 
 Thanks.  What does /proc/net/sockstat show?

[EMAIL PROTECTED]:~$ cat /proc/net/sockstat
sockets: used 143
TCP: inuse 16 orphan 0 tw 4 alloc 21 mem 1
UDP: inuse 8
UDPLITE: inuse 0
RAW: inuse 0
FRAG: inuse 0 memory 0

HTH,

-- 
Tobias  PGP: http://9ac7e0bc.uguu.de
This email is made from 100% recycled bits.

Re: [PATCH 3/3] [UDP6]: Counter increment on BH mode

2007-12-16 Thread Ingo Molnar


* Herbert Xu [EMAIL PROTECTED] wrote:

 On Sat, Dec 15, 2007 at 07:43:28PM +0100, Ingo Molnar wrote:
 
  we could perhaps introduce stat_smp_processor_id(), which signals that 
  the CPU id is used for statistical purposes and does not have to be 
  exact? In any case, your patch looks good too.
 
 Unfortunately that doesn't work because we can then have two CPUs 
 trying to update the same counter which may corrupt it.

ah, indeed. I missed that correctness aspect of your patch. Good catch!

Ingo
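
For readers following the counter discussion, here is a minimal user-space
sketch (not kernel code, and not taken from the patch under review) of why
the CPU id has to be exact. The names NR_SLOTS, stat_inc and stat_sum are
hypothetical. Each slot is bumped with a plain, non-atomic increment, which
is only safe when exactly one CPU can ever touch a given slot; an
approximate CPU id would let two CPUs race on the same slot.

```c
#include <stddef.h>

#define NR_SLOTS 4	/* hypothetical number of CPUs */

/* One slot per CPU; each slot is updated with a plain, non-atomic ++. */
struct stat_counter {
	unsigned long slot[NR_SLOTS];
};

/*
 * Safe only if 'cpu' really is the CPU running this code and no other
 * updater can touch the same slot while the increment is in progress.
 */
static void stat_inc(struct stat_counter *c, unsigned int cpu)
{
	c->slot[cpu % NR_SLOTS]++;	/* load, add, store: not atomic */
}

/* Readers fold all slots together to get the total. */
static unsigned long stat_sum(const struct stat_counter *c)
{
	unsigned long sum = 0;
	size_t i;

	for (i = 0; i < NR_SLOTS; i++)
		sum += c->slot[i];
	return sum;
}
```

If two CPUs were both handed the same slot index, their load/add/store
sequences could interleave and one increment would be lost, which is the
corruption described above.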

Re: 2.6.24-rc5-git1: Reported regressions from 2.6.23

2007-12-16 Thread Ingo Molnar


* Rafael J. Wysocki [EMAIL PROTECTED] wrote:

 Subject   : 2.6.24-rc3-git2 softlockup detected
 Submitter : Kamalesh Babulal [EMAIL PROTECTED]
 Date  : 2007-11-28 15:46
 References: http://lkml.org/lkml/2007/11/28/16
 http://bugzilla.kernel.org/show_bug.cgi?id=9472
 Handled-By: Andrew Morton [EMAIL PROTECTED]
 Ingo Molnar [EMAIL PROTECTED]
 Patch : 

resolved by:

  http://lkml.org/lkml/2007/12/15/41

tested by Kamalesh Babulal:

  http://lkml.org/lkml/2007/12/15/97
  http://lkml.org/lkml/2007/12/16/1

Ingo

Re: 2.6.24-rc5-mm1

2007-12-16 Thread Andrew Morton

On Sat, 15 Dec 2007 22:20:24 +0300 Alexey Dobriyan [EMAIL PROTECTED] wrote:

 FWIW, I got the following at reboot after some tests were finished:
 
 get_unused_fd: slot 3 not NULL!
 get_unused_fd: slot 4 not NULL!
 general protection fault:  [1] PREEMPT SMP
 last sysfs file /sys/class/scsi_host/host1/link_power_management_policy
 
 and that's all.

Beats me - nobody has been fiddling with that code for a while.  Worrisome.


Re: [PATCH 3/3] net: wireless: bcm43xx: big_buffer_sem semaphore to mutex

2007-12-16 Thread Ingo Molnar


* John W. Linville [EMAIL PROTECTED] wrote:

  It's not that simple.  For example, regression testing will be a 
  major PITA if one needs to switch back and forth from the new driver 
  to the old one in the process.
 
 Not really true -- a single system can easily have firmware installed 
 for b43, b43legacy, and bcm43xx at the same time and switch back and 
 forth between them.

as long as the version 4 firmware blob is present in the system, will 
testers have a fully fluid test- and work-flow when migrating across 
from bcm43xx to b43, without any other changes to an existing Linux 
installation? (i.e. no udev tweaks, no forced upgrades of components, 
etc.)

Will it Just Work in bisection as well, when a tester's kernel 
flip/flops between bcm43xx and b43 - like it does for the other 3000+ 
drivers in the kernel?

Note that we are _NOT_ interested in "might" or "can" scenarios. We are 
interested in preserving the _existing_ bcm43xx installed base and how 
seamless a migration the b43 transition will be. _THAT_ is what 
the "no regressions" upstream rule is about, not the ideal distro 
scenario you outline above. It is YOUR total obligation as a kernel 
maintainer to ensure that you don't break old installations. How hard is 
that to understand? This is not rocket science.

Ingo

Re: problem with ending requests asynchronously in my block device driver

2007-12-16 Thread Jon Masters

On Sat, 2007-12-15 at 12:52 -0800, a_kumar wrote:

 I've a block device driver which does the following, 

Why not send the actual code?

 This code works fine with most of the kernel versions, but fails on some
 like , Linux 2.6.18-8.el5-xen 

You've provided no information. What we need:

*). A well formed report, complete with oops, panic, other output.
*). Description of how it fails.

Note also that there is no upstream Linux 2.6.18-8.el5-xen kernel.

There is a Red Hat Enterprise Linux kernel release with that revision
(this is the one that shipped in the GA RHEL5.0 kernel). You should
contact your vendor for support with their kernel if you are unable to
provide a well-formed bug report against an upstream kernel release.

Jon.



Re: [PATCH] x86_64: fix problems due to use of outb to port 80 on some AMD64x2 laptops, etc.

2007-12-16 Thread Ingo Molnar


* H. Peter Anvin [EMAIL PROTECTED] wrote:

 Paul Rolland wrote:
 Just an idea : from what I've read, the problem (port 80 hanging) only occurs
 on 'modern' machines...

 It happens on *one single* modern machine...

 Let's keep that in perspective.

two or three i think (and an unknown number of others where random, 
unexplained freezes were thought to be hw borkage), but yeah, it's 
still a very low proportion.

Ingo

Re: [Bug 9182] Critical memory leak (dirty pages)

2007-12-16 Thread Krzysztof Oledzki




On Sat, 15 Dec 2007, Andrew Morton wrote:


On Sun, 16 Dec 2007 00:08:52 +0100 (CET) Krzysztof Oledzki [EMAIL PROTECTED] 
wrote:




On Sat, 15 Dec 2007, [EMAIL PROTECTED] wrote:


http://bugzilla.kernel.org/show_bug.cgi?id=9182


--- Comment #33 from [EMAIL PROTECTED]  2007-12-15 14:19 ---
Krzysztof, I'd hate to point you to a hard path (at least a time-consuming one), but
you've done a lot of digging by now anyway. How about git bisecting between
2.6.20-rc2 and rc1? Here is great info on bisecting:
http://www.kernel.org/doc/local/git-quick.html


As I'm smarter than git-bisect I can tell that 2.6.20-rc1-git8 is as bad
as 2.6.20-rc2 but 2.6.20-rc1-git8 with one patch reverted seems to be OK.
So it took me only 2 reboots. ;)

The guilty patch is the one I proposed just an hour ago:
  
http://git.kernel.org/?p=linux%2Fkernel%2Fgit%2Fstable%2Flinux-2.6.20.y.git;a=commitdiff_plain;h=fba2591bf4e418b6c3f9f8794c9dd8fe40ae7bd9

So:
  - 2.6.20-rc1: OK
  - 2.6.20-rc1-git8 with fba2591bf4e418b6c3f9f8794c9dd8fe40ae7bd9 reverted: OK
  - 2.6.20-rc1-git8: very BAD
  - 2.6.20-rc2: very BAD
  - 2.6.20-rc4: very BAD
  - >= 2.6.20: BAD (but not *very* BAD!)



well..  We have code which has been used by *everyone* for a year and it's
misbehaving for you alone.


No, not for me alone. Probably only Thomas Osterried and I have systems 
where it is so easy to reproduce. Please note that the problem exists on 
all my systems, but only on one is it critical. It is enough to run
"sync; sleep 1; sync; sleep 1; sync; grep Dirty /proc/meminfo" to be sure. 
With >=2.6.20-rc1-git8 it *never* falls to 0 on *all* my hosts but only 
on one it goes to ~200MB in about 2 weeks and then everything dies:

http://bugzilla.kernel.org/attachment.cgi?id=13824
http://bugzilla.kernel.org/attachment.cgi?id=13825
http://bugzilla.kernel.org/attachment.cgi?id=13826
http://bugzilla.kernel.org/attachment.cgi?id=13827


 I wonder what you're doing that is different/special.

Me too. :|


Which filesystem, which mount options


 - ext3 on RAID1 (MD): / - rootflags=data=journal
 - ext3 on LVM on RAID5 (MD)
 - nfs

/dev/md0 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec)
devpts on /dev/pts type devpts (rw,nosuid,noexec)
/dev/mapper/VolGrp0-usr on /usr type ext3 (rw,nodev,data=journal)
/dev/mapper/VolGrp0-var on /var type ext3 (rw,nodev,data=journal)
/dev/mapper/VolGrp0-squid_spool on /var/cache/squid/cd0 type ext3 
(rw,nosuid,nodev,noatime,data=writeback)
/dev/mapper/VolGrp0-squid_spool2 on /var/cache/squid/cd1 type ext3 
(rw,nosuid,nodev,noatime,data=writeback)
/dev/mapper/VolGrp0-news_spool on /var/spool/news type ext3 
(rw,nosuid,nodev,noatime)
shm on /dev/shm type tmpfs (rw,noexec,nosuid,nodev)
usbfs on /proc/bus/usb type usbfs (rw,noexec,nosuid,devmode=0664,devgid=85)
owl:/usr/gentoo-nfs on /usr/gentoo-nfs type nfs 
(ro,nosuid,nodev,noatime,bg,intr,tcp,addr=192.168.129.26)



what sort of workload?
Different, depending on the host: mail (postfix + amavisd + spamassassin + 
clamav + sqlgray), squid, mysql, apache, nfs, rsync,  But it seems 
that the biggest problem is on the host running mentioned mail service.


Thanks.

Best regards,

Krzysztof Olędzki

Re: [PATCH] x86_64: fix problems due to use of outb to port 80 on some AMD64x2 laptops, etc.

2007-12-16 Thread Ingo Molnar


* H. Peter Anvin [EMAIL PROTECTED] wrote:

 Pavel Machek wrote:

 this is also something for v2.6.24 merging.

 As much as I like this patch, I do not think it is suitable for
 .24. Too risky, I'd say.


 No kidding!  We're talking about removing a hack that has been 
 successful on thousands of pieces of hardware over 15 years because it 
 ^[*]
 breaks ONE machine.

[*] - none of which needs it anymore -

there, fixed it for you ;-)

So let's keep this in perspective: this is a hack that only helps on a 
very low number of systems. (the PIT of one PII era chipset is known to 
be affected)

unfortunately this hack's side-effects are mis-used by an unknown number 
of drivers to mask PCI posting bugs. We want to figure out those bugs 
(safely and carefully) and we want to remove this hack from modern 
machines that don't need it. Doing anything else would be superstition.

anyway, we likely won't be doing anything about this in .24.

Ingo

Re: [PATCH 2/3] Add GD-Rom support to the SEGA Dreamcast

2007-12-16 Thread Paul Mundt

Ok, I don't know anything about the CD-ROM layer, so I've just commented
on the general and SH-specific stuff. Hopefully someone with a clue
in this area (ie, not me) can offer input on the rest of the bits.

On Sun, Dec 16, 2007 at 12:21:21AM +, Adrian McMenamin wrote:
 +/* GD Rom registers */
 +#define GDROM_BASE_REG 0xA05F7000
 +#define GDROM_ALTSTATUS_REG GDROM_BASE_REG + 0x18
 +#define GDROM_DATA_REG GDROM_BASE_REG + 0x80
 +#define GDROM_ERROR_REG GDROM_BASE_REG + 0x84
 +#define GDROM_INTSEC_REG GDROM_BASE_REG + 0x88
 +#define GDROM_SECNUM_REG GDROM_BASE_REG + 0x8C
 +#define GDROM_BCL_REG  GDROM_BASE_REG + 0x90
 +#define GDROM_BCH_REG GDROM_BASE_REG + 0x94
 +#define GDROM_DSEL_REG GDROM_BASE_REG + 0x98
 +#define GDROM_STATUSCOMMAND_REG GDROM_BASE_REG + 0x9C
 +#define GDROM_RESET_REG GDROM_BASE_REG + 0x4E4
 +
 +#define GDROM_DATA_REG_P0 0x005F7080
 +
 +#define GDROM_DMA_STARTADDR_REG GDROM_BASE_REG + 0x404
 +#define GDROM_DMA_LENGTH_REG GDROM_BASE_REG + 0x408
 +#define GDROM_DMA_DIRECTION_REG GDROM_BASE_REG + 0x40C
 +#define GDROM_DMA_ENABLE_REG GDROM_BASE_REG + 0x414
 +#define GDROM_DMA_STATUS_REG GDROM_BASE_REG + 0x418
 +#define GDROM_DMA_WAIT_REG GDROM_BASE_REG + 0x4A0
 +#define GDROM_DMA_ACCESS_CTRL_REG GDROM_BASE_REG + 0x4B8
 +
These should all be encapsulated by brackets.
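
A short illustration of why the parentheses matter; this is a sketch with a
made-up base address (BASE_REG and the two *_STATUS_REG names are
hypothetical), not part of the driver. Because the preprocessor substitutes
text, an unparenthesised "BASE + offset" silently changes meaning as soon as
the macro is used next to an operator that binds tighter than "+".

```c
/* Hypothetical register base, chosen only for illustration. */
#define BASE_REG	0x1000
#define BAD_STATUS_REG	BASE_REG + 0x18		/* expands textually */
#define GOOD_STATUS_REG	(BASE_REG + 0x18)	/* always a single value */

/* Expands to 2 * 0x1000 + 0x18: the multiplication binds first. */
static unsigned long bad_scaled(void)
{
	return 2 * BAD_STATUS_REG;
}

/* Expands to 2 * (0x1000 + 0x18): what the author almost certainly meant. */
static unsigned long good_scaled(void)
{
	return 2 * GOOD_STATUS_REG;
}
```

With the values above, bad_scaled() yields 0x2018 while good_scaled() yields
0x2030.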

 +static void wait_clrbusy(void)
 +{
 + while (ctrl_inb(GDROM_ALTSTATUS_REG) & 0x80)
 + schedule();
 +}
 + 
 +static void gdrom_wait_busy_sleeps(void)
 +{
 + /* Wait to get busy first */
 + while ((ctrl_inb(GDROM_ALTSTATUS_REG) & 0x80) == 0)
 + schedule();
 + /* Now wait for busy to clear */
 + wait_clrbusy();
 +}
 +
Are you sure you can tolerate a schedule() in here, as opposed to a
cpu_relax()? If you're going to schedule away whilst polling a bit, you
may as well just use a wait queue and wait_on_bit() or so. This seems
like a lot of extra latency for this though.
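
As a rough user-space sketch of the busy-poll variant suggested here:
relax_hint() is a hypothetical stand-in for the kernel's cpu_relax(), the
status register is modelled as a plain byte, and the bounded spin count is
an addition of this example rather than something the patch or the review
asks for.

```c
#include <stdint.h>

#define BUSY_BIT 0x80

/* Hypothetical stand-in for cpu_relax(); here just a compiler barrier. */
static inline void relax_hint(void)
{
	__asm__ __volatile__("" ::: "memory");
}

/*
 * Poll *status until BUSY_BIT clears, giving up after max_spins polls
 * that still saw the bit set. Returns 0 on success, -1 on timeout.
 */
static int wait_clear_busy(const volatile uint8_t *status,
			   unsigned long max_spins)
{
	while (*status & BUSY_BIT) {
		if (max_spins-- == 0)
			return -1;
		relax_hint();
	}
	return 0;
}
```

The point of the hint is that the loop stays on the CPU for what is expected
to be a very short wait, instead of paying for a full trip through the
scheduler on every poll.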

 +static void gdrom_spicommand(void *spi_string, int buflen)
 +{
 + short *cmd = spi_string;
[snip]

 + wait_clrbusy();
 + ctrl_outb(GDROM_COM_PACKET, GDROM_STATUSCOMMAND_REG);
 + while ((ctrl_inb(GDROM_ALTSTATUS_REG) & 0x88) != 8)
 + ; /* wait for DRQ to be set to 1 */
cpu_relax()

 +static char gdrom_execute_diagnostic(void)
 +{
 + int count;
 + /* Restart the GDROM */
 + ctrl_outl(0x1f, GDROM_RESET_REG);
 + for (count = 0xa000; count < 0xa020; count += 4)
 + ctrl_inl(count);

Er? This ranged dummy reading of the P2 space needs some explanation. The
GD-ROM isn't even mapped in to this space, so this seems like a hack to
either work around a timing issue or a write ordering problem.

 +static int gdrom_get_last_session(struct cdrom_device_info *cd_info,
 struct cdrom_multisession *ms_info)
 +{
[snip]

 + }   
 + }
 + else

Questionable placement of else.

 + printk(KERN_DEBUG "Disk is GDROM\n");
 + if (gd.toc)
 + kfree(gd.toc);

Useless if.
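
The check is redundant because kfree(), like free() in user space, accepts a
NULL pointer and does nothing. A user-space sketch of the same
replace-the-buffer pattern, with struct toc and refresh_toc() as
hypothetical stand-ins for the driver's types:

```c
#include <stdlib.h>

/* Hypothetical stand-in for struct gdromtoc. */
struct toc {
	int entries[4];
};

/* Replace *slot with a fresh zeroed allocation; 0 on success, -1 on failure. */
static int refresh_toc(struct toc **slot)
{
	free(*slot);		/* no NULL check needed: free(NULL) is a no-op */
	*slot = calloc(1, sizeof(**slot));
	return *slot ? 0 : -1;
}
```

The first call with a NULL slot and a later call with a live allocation both
go through the same unconditional free().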

 + gd.toc = kzalloc(sizeof(struct gdromtoc), GFP_KERNEL);
 + if (!gd.toc) {
 + err = -ENOMEM;
 + goto clean_tocB;
 + }
 + if (tocuse)

Broken indentation.

 +static int gdrom_open(struct cdrom_device_info *cd_info, int purpose)
 +{
 + int err;
 + /* spin up the disk */
 + err = gdrom_preparedisk_cmd();
 + if (err)
 + return -EIO;
 + 
 + return 0;

Perhaps gdrom_preparedisk_cmd() should just hand back -EIO in the error
case and you can just pass that on directly. If you have no other users
of it, then just move the work in to the gdrom_open() directly.

 +static void gdrom_release(struct cdrom_device_info *cd_info)
 +{
 +}
 +
Do you really need this? If this is some sort of driver model damage, a
comment to that effect would be helpful, otherwise just kill this off.

 +static int gdrom_mediachanged(struct cdrom_device_info *cd_info, int ignore)
 +{
 + /* check the sense key */
 + char sense = ctrl_inb(GDROM_ERROR_REG);
 + if ((sense & 0xF0) == 0x60)
 + return 1;
 + return 0;
 +}
 +
Just return (ctrl_inb(GDROM_ERROR_REG) & 0xf0) == 0x60 ?
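
Spelled out as a stand-alone function, and assuming the operator lost from
the quoted lines is a bitwise AND, the suggestion looks roughly like this.
media_changed() is a hypothetical name, and the sense byte is passed in as a
parameter instead of being read from the hardware register.

```c
#include <stdint.h>

/* Returns 1 if the upper nibble of the sense byte is 0x6, otherwise 0. */
static int media_changed(uint8_t sense)
{
	return (sense & 0xF0) == 0x60;
}
```

In C a comparison already evaluates to 0 or 1, so the if/return 1/return 0
ladder adds nothing.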

 +static int gdrom_hardreset(struct cdrom_device_info * cd_info)
 +{
 + int count;
 + ctrl_outl(0x1f, GDROM_RESET_REG);
 + for (count = 0xa000; count < 0xa020; count += 4)
 + ctrl_inl(count);
 + return 0;
 +}
 +
More strange P2 abuse. If this is the officially recommended reset
method in the GD-ROM errat^H^H^H^Hspecification, it paints a pretty good
picture of its commercial success.

 + if (bufstring){
 + memcpy(bufstring, &sense[4], 2); /* return additional sense 
 data */
 + }
 +
Useless braces.

 + if (sense_key < 2)
 + return 0;
 + return -EIO;
 +}
 + 
 +static struct cdrom_device_ops gdrom_ops = {
 + .open   = gdrom_open,
 + .release= gdrom_release,
 + .drive_status   =

Re: [Bug 9182] Critical memory leak (dirty pages)

2007-12-16 Thread Andrew Morton

On Sun, 16 Dec 2007 10:33:20 +0100 (CET) Krzysztof Oledzki [EMAIL PROTECTED] 
wrote:

 
 
 On Sat, 15 Dec 2007, Andrew Morton wrote:
 
  On Sun, 16 Dec 2007 00:08:52 +0100 (CET) Krzysztof Oledzki [EMAIL 
  PROTECTED] wrote:
 
 
 
  On Sat, 15 Dec 2007, [EMAIL PROTECTED] wrote:
 
  http://bugzilla.kernel.org/show_bug.cgi?id=9182
 
 
  --- Comment #33 from [EMAIL PROTECTED]  2007-12-15 14:19 ---
  Krzysztof, I'd hate to point you to a hard path (at least time consuming), 
  but
  you've done a lot of digging by now anyway. How about git bisecting 
  between
  2.6.20-rc2 and rc1? Here is great info on bisecting:
  http://www.kernel.org/doc/local/git-quick.html
 
  As I'm smarter than git-bisect I can tell that 2.6.20-rc1-git8 is as bad
  as 2.6.20-rc2 but 2.6.20-rc1-git8 with one patch reverted seems to be OK.
  So it took me only 2 reboots. ;)
 
  The guilty patch is the one I proposed just an hour ago:

  http://git.kernel.org/?p=linux%2Fkernel%2Fgit%2Fstable%2Flinux-2.6.20.y.git;a=commitdiff_plain;h=fba2591bf4e418b6c3f9f8794c9dd8fe40ae7bd9
 
  So:
- 2.6.20-rc1: OK
- 2.6.20-rc1-git8 with fba2591bf4e418b6c3f9f8794c9dd8fe40ae7bd9 
  reverted: OK
- 2.6.20-rc1-git8: very BAD
- 2.6.20-rc2: very BAD
- 2.6.20-rc4: very BAD
- >= 2.6.20: BAD (but not *very* BAD!)
 
 
  well..  We have code which has been used by *everyone* for a year and it's
  misbehaving for you alone.
 
 No, not for me alone. Probably only Thomas Osterried and I have systems 
 where it is so easy to reproduce. Please note that the problem exists on 
 all my systems, but only on one is it critical. It is enough to run
 "sync; sleep 1; sync; sleep 1; sync; grep Dirty /proc/meminfo" to be sure. 
 With >=2.6.20-rc1-git8 it *never* falls to 0 on *all* my hosts but only 
 on one it goes to ~200MB in about 2 weeks and then everything dies:
 http://bugzilla.kernel.org/attachment.cgi?id=13824
 http://bugzilla.kernel.org/attachment.cgi?id=13825
 http://bugzilla.kernel.org/attachment.cgi?id=13826
 http://bugzilla.kernel.org/attachment.cgi?id=13827
 
   I wonder what you're doing that is different/special.
 Me too. :|
 
  Which filesystem, which mount options
 
   - ext3 on RAID1 (MD): / - rootflags=data=journal

It wouldn't surprise me if this is specific to data=journal: that
journalling mode is pretty complex wrt dirty-data handling and isn't well
tested.

Does switching that to data=writeback change things?

Thomas, do you have ext3 data=journal on any filesystems?

   - ext3 on LVM on RAID5 (MD)
   - nfs
 
 /dev/md0 on / type ext3 (rw)
 proc on /proc type proc (rw)
 sysfs on /sys type sysfs (rw,nosuid,nodev,noexec)
 devpts on /dev/pts type devpts (rw,nosuid,noexec)
 /dev/mapper/VolGrp0-usr on /usr type ext3 (rw,nodev,data=journal)
 /dev/mapper/VolGrp0-var on /var type ext3 (rw,nodev,data=journal)
 /dev/mapper/VolGrp0-squid_spool on /var/cache/squid/cd0 type ext3 
 (rw,nosuid,nodev,noatime,data=writeback)
 /dev/mapper/VolGrp0-squid_spool2 on /var/cache/squid/cd1 type ext3 
 (rw,nosuid,nodev,noatime,data=writeback)
 /dev/mapper/VolGrp0-news_spool on /var/spool/news type ext3 
 (rw,nosuid,nodev,noatime)
 shm on /dev/shm type tmpfs (rw,noexec,nosuid,nodev)
 usbfs on /proc/bus/usb type usbfs (rw,noexec,nosuid,devmode=0664,devgid=85)
 owl:/usr/gentoo-nfs on /usr/gentoo-nfs type nfs 
 (ro,nosuid,nodev,noatime,bg,intr,tcp,addr=192.168.129.26)
 
 
  what sort of workload?
 Different, depending on the host: mail (postfix + amavisd + spamassassin + 
 clamav + sqlgray), squid, mysql, apache, nfs, rsync,  But it seems 
 that the biggest problem is on the host running mentioned mail service.
 


Re: 2.6.24-rc5-mm1: problems with cat /proc/kpageflags

2007-12-16 Thread Mariusz Kozlowski

Hello,

cat /proc/kpageflags on sparc64 causes the box to lock.
I can not write on any terminal - but I can issue sysrqs and switch
between consoles.

cat process hangs in read(3, ...

sysrq-w shows:

syslogd   D 0069240c 0  2470  1
Call Trace:
 [00692224] 
 [00692224] 
 [00692224] 
 [00692224] 
 [00692224] 
 [00692224] 

Re: [PATCH 2/3] Add GD-Rom support to the SEGA Dreamcast

2007-12-16 Thread Christoph Hellwig

On Sun, Dec 16, 2007 at 06:50:19PM +0900, Paul Mundt wrote:
  +static irqreturn_t gdrom_command_interrupt(int irq, void *dev_id)
  +{
  +   if (dev_id != gd)
  +   return IRQ_NONE;
 
 You aren't setting this up as a shared IRQ, so this shouldn't be
 necessary.

It's not necessary for shared irqs either.  The irq code will never
pass you back a different cookie than the one you passed in.
Anything else would be a nightmare, and these cargo cult workarounds
wouldn't help either.


Re: 2.6.24-rc5-mm1: problems with cat /proc/kpageflags

2007-12-16 Thread Mariusz Kozlowski

   cat /proc/kpageflags on sparc64 causes the box to lock.
 I can not write on any terminal - but I can issue sysrqs and switch
 between consoles.
 
 cat process hangs in read(3, ...
 
 sysrq-w shows:
 
 syslogd   D 0069240c 0  2470  1
 Call Trace:
  [00692224] 
  [00692224] 
  [00692224] 
  [00692224] 
  [00692224] 
  [00692224] 

aggrh ... please ignore.

Sent by mistake when retyping info from sparc (no camera right now :/)

Will reply soon with correct data.


Re: [Bug 9182] Critical memory leak (dirty pages)

2007-12-16 Thread Krzysztof Oledzki




On Sun, 16 Dec 2007, [EMAIL PROTECTED] wrote:


http://bugzilla.kernel.org/show_bug.cgi?id=9182





--- Comment #39 from [EMAIL PROTECTED]  2007-12-16 01:58 ---


So:
  - 2.6.20-rc1: OK
  - 2.6.20-rc1-git8 with fba2591bf4e418b6c3f9f8794c9dd8fe40ae7bd9 reverted: OK
  - 2.6.20-rc1-git8: very BAD
  - 2.6.20-rc2: very BAD
  - 2.6.20-rc4: very BAD
  - = 2.6.20: BAD (but not *very* BAD!)


based on the great info you already acquired, you should be able to
bisect this rather effectively, via:

2.6.20-rc1-git8 == 921320210bd2ec4f17053d283355b73048ac0e56

$ git-bisect start
$ git-bisect bad 921320210bd2ec4f17053d283355b73048ac0e56
$ git-bisect good v2.6.20-rc1
Bisecting: 133 revisions left to test after this

so about 7-8 bootups would pinpoint the breakage.


Except that I have very limited time where I can do my tests on this host. 
Please also note that it takes about ~2h after a reboot, to be 100% sure. 
So, 7-8 bootups = 14-16h. :|
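
The arithmetic behind these estimates, as a small sketch: each bisection
step discards about half of the remaining candidates, so the number of test
boots grows with the base-2 logarithm of the range. bisect_steps() and
bisect_hours() are hypothetical helpers for illustration; git's own count
can differ by a step depending on the shape of the history.

```c
/* Worst-case number of test boots needed to isolate one commit among n. */
static unsigned int bisect_steps(unsigned int n)
{
	unsigned int steps = 0;

	while (n > 1) {
		n = (n + 1) / 2;	/* each test rules out about half the range */
		steps++;
	}
	return steps;
}

/* Total wall-clock cost if every test needs hours_per_test hours to be sure. */
static unsigned int bisect_hours(unsigned int n, unsigned int hours_per_test)
{
	return bisect_steps(n) * hours_per_test;
}
```

For the 133 revisions reported above this gives 8 steps in the worst case,
and at roughly 2 hours per boot that is 16 hours, in line with the 14-16h
figure.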



It would likely pinpoint fba2591b, so it would perhaps be best to first
attempt a revert of fba2591b on a recent kernel.


I wish I could: :(

[EMAIL PROTECTED]:/usr/src/linux-2.6.23.9$ cat ..p1 |patch -p1 --dry-run -R
patching file fs/hugetlbfs/inode.c
Hunk #1 succeeded at 203 (offset 27 lines).
patching file include/linux/page-flags.h
Hunk #1 succeeded at 262 (offset 9 lines).
patching file mm/page-writeback.c
Hunk #1 succeeded at 903 (offset 58 lines).
patching file mm/truncate.c
Unreversed patch detected!  Ignore -R? [n] y
Hunk #1 succeeded at 52 with fuzz 2 (offset 1 line).
Hunk #2 FAILED at 85.
Hunk #3 FAILED at 365.
Hunk #4 FAILED at 400.
3 out of 4 hunks FAILED -- saving rejects to file mm/truncate.c.rej

Best regards,

Krzysztof Olędzki

Re: [PATCH] Tosa keyboard support

2007-12-16 Thread Russell King - ARM Linux

On Tue, Dec 11, 2007 at 06:38:51PM +0300, Dmitry Baryshkov wrote:
 Sorry, posted wrong version of patch. Here is correct version:
 
 Support keyboard on tosa (Sharp Zaurus SL-6000x).
 Largely based on patches by Dirk Opfer.

Looks fine to me, but Dmitry Torokhov needs to look at it; he maintains
the input subsystem.  Note that the current input subsystem mailing list
is not the one you have in the CC line - please always check MAINTAINERS
for the correct addresses.


Re: 2.6.24-rc5-mm1: problems with cat /proc/kpageflags

2007-12-16 Thread Mariusz Kozlowski

Hello

 Will reply soon with correct data.

Ok here it goes:

cat /proc/kpageflags on sparc64 causes the box to lock.
I can not write on any terminal - but I can issue sysrqs and switch
between consoles.

cat process hangs in read(3, ...

sysrq-w shows:

syslogd   D 0069240c 0  2470  1
Call Trace:
 [00692224] __down+0x8c/0x100
 [0069240c] __down_interruptible+0x174/0x1a0 
 [006935d4] mutex_trylock+0xfc/0x1e0 
 [00695c7c] lock_kernel+0x24/0x40 
 [005b0cc0] tty_write+0x168/0x200 
 [004d0b08] do_loop_readv_writev+0x30/0x60
 [00507540] compat_do_readv_writev+0x268/0x280
 [005075b0] compat_sys_writev+0x58/0x80
 [004062d4] linux_sparc_syscall32+0x3c/0x40
 [f7e3f408] 0xf7e3f410

then when I try to ssh to the sparc machine I fail but at sparc you
can see this:

BUG: soft lockup - CPU#0 stuck for 11s! [sshd:3227]
TSTATE: 009911009607 TPC: 00430c2c TNPC:00430c30 Y: 
Not tainted
TPC: __delay+0x34/0x60
g0:  g1: 0042875103e3 g2: 00430800 g3: 
0001869c
g4: f800bf086100 g5: f8007f832000 g6: f800be4a g7: 
0004
o0: 0042875103e3 o1:  o2: 00430c78 o3: 

o4: 7fff o5:  sp: f800be4a2e81 ret_pc: 
00430c24
RPC: __delay+0x2c/0x60
l0: 0042875100df l1: 007a4000 l2:  l3: 
007d9000
l4:  l5: 0001 l6:  l7: 

i0: 0382 i1: f800be4a0400 i2: 00445d3c i3: 

i4: 0002 i5: 0045388c i6: f800be4a2f41 i7: 
00430c6c
I7: udelay+0x14/0x20

When this happens box seems to react only to sysrq-b or manual reset.
Anything else is useless.

Regards,

Mariusz
#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.24-rc5-mm1
# Fri Dec 14 19:47:15 2007
#
CONFIG_SPARC=y
CONFIG_SPARC64=y
CONFIG_GENERIC_TIME=y
CONFIG_GENERIC_CMOS_UPDATE=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_64BIT=y
CONFIG_MMU=y
CONFIG_QUICKLIST=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
# CONFIG_ARCH_HAS_ILOG2_U32 is not set
# CONFIG_ARCH_HAS_ILOG2_U64 is not set
CONFIG_AUDIT_ARCH=y
CONFIG_ARCH_NO_VIRT_TO_BUS=y
CONFIG_OF=y
CONFIG_GENERIC_HARDIRQS_NO__DO_IRQ=y
CONFIG_ARCH_SUPPORTS_AOUT=y
CONFIG_SPARC64_PAGE_SIZE_8KB=y
# CONFIG_SPARC64_PAGE_SIZE_64KB is not set
# CONFIG_SPARC64_PAGE_SIZE_512KB is not set
# CONFIG_SPARC64_PAGE_SIZE_4MB is not set
CONFIG_SECCOMP=y
# CONFIG_HZ_100 is not set
CONFIG_HZ_250=y
# CONFIG_HZ_300 is not set
# CONFIG_HZ_1000 is not set
CONFIG_HZ=250
# CONFIG_HOTPLUG_CPU is not set
CONFIG_DEFCONFIG_LIST=/lib/modules/$UNAME_RELEASE/.config

#
# General setup
#
CONFIG_EXPERIMENTAL=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_LOCALVERSION=
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
# CONFIG_BSD_PROCESS_ACCT is not set
# CONFIG_TASKSTATS is not set
# CONFIG_AUDIT is not set
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
CONFIG_LOG_BUF_SHIFT=18
# CONFIG_CGROUPS is not set
# CONFIG_FAIR_GROUP_SCHED is not set
CONFIG_SYSFS_DEPRECATED=y
CONFIG_RELAY=y
CONFIG_NAMESPACES=y
# CONFIG_UTS_NS is not set
# CONFIG_IPC_NS is not set
# CONFIG_USER_NS is not set
# CONFIG_PID_NS is not set
# CONFIG_BLK_DEV_INITRD is not set
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
CONFIG_SYSCTL=y
# CONFIG_EMBEDDED is not set
CONFIG_UID16=y
CONFIG_SYSCTL_SYSCALL=y
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_ALL=y
# CONFIG_KALLSYMS_EXTRA_PASS is not set
CONFIG_HOTPLUG=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_ANON_INODES=y
CONFIG_EPOLL=y
CONFIG_SIGNALFD=y
CONFIG_TIMERFD=y
CONFIG_EVENTFD=y
CONFIG_SHMEM=y
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_SLUB_DEBUG=y
# CONFIG_SLAB is not set
CONFIG_SLUB=y
# CONFIG_SLOB is not set
CONFIG_PROC_PAGE_MONITOR=y
# CONFIG_PROFILING is not set
# CONFIG_MARKERS is not set
CONFIG_HAVE_OPROFILE=y
# CONFIG_KPROBES is not set
CONFIG_HAVE_KPROBES=y
CONFIG_RT_MUTEXES=y
# CONFIG_TINY_SHMEM is not set
CONFIG_BASE_SMALL=0
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_MODULE_FORCE_UNLOAD=y
# CONFIG_MODVERSIONS is not set
# CONFIG_MODULE_SRCVERSION_ALL is not set
CONFIG_KMOD=y
CONFIG_STOP_MACHINE=y
CONFIG_BLOCK=y
CONFIG_BLK_DEV_IO_TRACE=y
# CONFIG_BLK_DEV_BSG is not set
CONFIG_BLOCK_COMPAT=y

#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
CONFIG_DEFAULT_AS=y
# CONFIG_DEFAULT_DEADLINE is not set
# CONFIG_DEFAULT_CFQ is not set
# CONFIG_DEFAULT_NOOP is not set
CONFIG_DEFAULT_IOSCHED=anticipatory
CONFIG_SYSVIPC_COMPAT=y
CONFIG_GENERIC_HARDIRQS=y

#
# General machine setup
#
CONFIG_TICK_ONESHOT=y
CONFIG_NO_HZ=y
CONFIG_HIGH_RES_TIMERS=y
CONFIG_GENERIC_CLOCKEVENTS_BUILD=y
CONFIG_SMP=y
CONFIG_NR_CPUS=4
#

[patch 0/2] [RFC] Simple tamper-proof device filesystem.

2007-12-16 Thread Tetsuo Handa

Hello.

I proposed this filesystem a few years ago.
Once again, I'm proposing it for inclusion into mainline.
I'll update it for the -mm tree if this filesystem is likely to be acceptable.

Regards.

(This is a resend of [00/02] since it seems to have been dropped.)

Re: [PACKET]: Fix /proc/net/packet crash due to bogus private pointer

2007-12-16 Thread Mariusz Kozlowski

Hello,

> Surprise surprise.  The namespace seq patch missed two spots in
> AF_PACKET.
>
> [PACKET]: Fix /proc/net/packet crash due to bogus private pointer
>
> The seq_open_net patch changed the meaning of seq->private.
> Unfortunately it missed two spots in AF_PACKET, which still
> used the old way of dereferencing seq->private, thus causing
> weird and wonderful crashes when reading /proc/net/packet.
>
> This patch fixes them.

True :) It fixes both my x86 and sparc64. Thanks.

Mariusz

[PATCH] initrd: Fix virtual/physical mix-up in overwrite test

2007-12-16 Thread Geert Uytterhoeven

On recent kernels, I get the following error when using an initrd:

| initrd overwritten (0x00b78000 < 0x07668000) - disabling it.

My Amiga 4000 has 12 MiB of RAM at physical address 0x07400000 (virtual
0x00000000).
The initrd is located at the end of RAM: 0x00b78000 - 0x00c00000 (virtual).
The overwrite test compares the (virtual) initrd location to the (physical)
first available memory location, which fails.

This patch converts initrd_start to a page frame number, so it can be safely
compared with min_low_pfn.

Before the introduction of discontiguous memory support on m68k
(12d810c1b8c2b913d48e629e2b5c01d105029839), min_low_pfn was just left
untouched by the m68k-specific code (zero, I guess), and everything worked
fine.

Signed-off-by: Geert Uytterhoeven [EMAIL PROTECTED]
---
On several platforms, initrd_below_start_ok is set to 1:

| arch/mips/kernel/setup.c:   initrd_below_start_ok = 1;
| arch/parisc/mm/init.c:  initrd_below_start_ok = 1;
| arch/powerpc/kernel/prom.c: initrd_below_start_ok = 1;
| arch/ppc/platforms/hdpu.c:  initrd_below_start_ok = 1;
| arch/xtensa/kernel/setup.c: initrd_below_start_ok = 1;

Some of these may be workarounds for this bug. Please check.
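
The mix-up described above can be replayed in user space. The sketch below is
not the kernel code: the 4 KiB page size, the PHYS_BASE constant and all helper
names are assumptions chosen to match the numbers in this report (RAM assumed
to start at physical 0x07400000 and to be mapped linearly at virtual 0).

```c
#include <stdbool.h>

#define PAGE_SHIFT 12UL            /* assumed 4 KiB pages */
#define PHYS_BASE  0x07400000UL    /* assumed physical start of RAM */

/* Hypothetical stand-in for virt_to_pfn() on a machine whose RAM is
 * mapped linearly at virtual address 0. */
static unsigned long sketch_virt_to_pfn(unsigned long vaddr)
{
	return (vaddr + PHYS_BASE) >> PAGE_SHIFT;
}

/* Old test: a virtual address is compared with a physical address. */
static bool overwritten_old(unsigned long initrd_start,
			    unsigned long min_low_pfn)
{
	return initrd_start < (min_low_pfn << PAGE_SHIFT);
}

/* New test: both sides are page frame numbers. */
static bool overwritten_new(unsigned long initrd_start,
			    unsigned long min_low_pfn)
{
	return sketch_virt_to_pfn(initrd_start) < min_low_pfn;
}
```

With initrd_start = 0x00b78000 and min_low_pfn = 0x7668 (that is, 0x07668000
shifted right by 12), the old test reports a false overwrite while the new one
does not.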

 init/main.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/init/main.c
+++ b/init/main.c
@@ -598,9 +598,9 @@ asmlinkage void __init start_kernel(void
 
 #ifdef CONFIG_BLK_DEV_INITRD
 	if (initrd_start && !initrd_below_start_ok &&
-			initrd_start < min_low_pfn << PAGE_SHIFT) {
+	    virt_to_pfn(initrd_start) < min_low_pfn) {
 		printk(KERN_CRIT "initrd overwritten (0x%08lx < 0x%08lx) - "
-		    "disabling it.\n",initrd_start,min_low_pfn << PAGE_SHIFT);
+		    "disabling it.\n", virt_to_pfn(initrd_start), min_low_pfn);
 		initrd_start = 0;
 	}
 #endif

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [EMAIL PROTECTED]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say programmer or something like that.
-- Linus Torvalds

Re: [PATCH] Tosa keyboard support

2007-12-16 Thread Dmitry

Hi,

2007/12/16, Russell King - ARM Linux [EMAIL PROTECTED]:
> On Tue, Dec 11, 2007 at 06:38:51PM +0300, Dmitry Baryshkov wrote:
> > Sorry, posted wrong version of patch. Here is correct version:
> >
> > Support keyboard on tosa (Sharp Zaurus SL-6000x).
> > Largely based on patches by Dirk Opfer.
>
> Looks fine to me, but Dmitry Torokhov needs to look at it; he maintains
> the input subsystem.  Note that the current input subsystem mailing list
> is not the one you have in the CC line - please always check MAINTAINERS
> for the correct addresses.


Ok, thanks,
resent to linux-input.


-- 
With best wishes
Dmitry

[patch 1/2] [RFC] Simple tamper-proof device filesystem.

2007-12-16 Thread Tetsuo Handa

A brief description about SYAORAN:

 SYAORAN stands for Simple Yet All-important Object Realizing Abiding
 Nexus. SYAORAN is a filesystem for /dev with Mandatory Access Control.

 /dev needs to be writable, but this means that files on /dev might be
 tampered with. SYAORAN can restrict combinations of (pathname, attribute)
 that the system can create. The attribute is one of directory, regular
 file, FIFO, UNIX domain socket, symbolic link, character or block device
 file with major/minor device numbers.

 SYAORAN can ensure /dev/null is a character device file with major=1 minor=3.

 Policy specifications for this filesystem are at
 http://tomoyo.sourceforge.jp/en/1.5.x/policy-syaoran.html

Why not use FUSE?

 Because /dev has to be available through the lifetime of the kernel.
 It is not acceptable if /dev stops working due to SIGKILL or OOM-killer.

Why not use SELinux?

 Because SELinux doesn't guarantee filename and its attribute.
 The purpose of this filesystem is to ensure filename and its attribute
 (e.g. /dev/null is guaranteed to be a character device file
 with major=1 and minor=3).

Signed-off-by:  Tetsuo Handa [EMAIL PROTECTED]
---
 fs/syaoran/syaoran.c |  338 +
 fs/syaoran/syaoran.h |  964 +++
 2 files changed, 1302 insertions(+)

--- /dev/null
+++ linux-2.6.24-rc5/fs/syaoran/syaoran.c
@@ -0,0 +1,338 @@
+/*
+ * fs/syaoran/syaoran.c
+ *
+ * Implementation of the Tamper-Proof Device Filesystem.
+ *
+ * Portions Copyright (C) 2005-2007  NTT DATA CORPORATION
+ *
+ * Version: 1.5.3-pre   2007/12/16
+ *
+ * This filesystem is developed using the ramfs implementation.
+ *
+ */
+/*
+ * Resizable simple ram filesystem for Linux.
+ *
+ * Copyright (C) 2000 Linus Torvalds.
+ *   2000 Transmeta Corp.
+ *
+ * Usage limits added by David Gibson, Linuxcare Australia.
+ * This file is released under the GPL.
+ */
+
+/*
+ * NOTE! This filesystem is probably most useful
+ * not as a real filesystem, but as an example of
+ * how virtual filesystems can be written.
+ *
+ * It doesn't get much simpler than this. Consider
+ * that this file implements the full semantics of
+ * a POSIX-compliant read-write filesystem.
+ *
+ * Note in particular how the filesystem does not
+ * need to implement any data structures of its own
+ * to keep track of the virtual data: using the VFS
+ * caches is sufficient.
+ */
+
+#include <linux/module.h>
+#include <linux/fs.h>
+#include <linux/pagemap.h>
+#include <linux/highmem.h>
+#include <linux/time.h>
+#include <linux/init.h>
+#include <linux/string.h>
+#include <linux/backing-dev.h>
+#include <linux/sched.h>
+#include <linux/uaccess.h>
+
+static struct super_operations syaoran_ops;
+static struct address_space_operations syaoran_aops;
+static struct inode_operations syaoran_file_inode_operations;
+static struct inode_operations syaoran_dir_inode_operations;
+static struct inode_operations syaoran_symlink_inode_operations;
+static struct file_operations syaoran_file_operations;
+
+static struct backing_dev_info syaoran_backing_dev_info = {
+   .ra_pages = 0,/* No readahead */
+   .capabilities = BDI_CAP_NO_ACCT_DIRTY | BDI_CAP_NO_WRITEBACK |
+   BDI_CAP_MAP_DIRECT | BDI_CAP_MAP_COPY |
+   BDI_CAP_READ_MAP | BDI_CAP_WRITE_MAP | BDI_CAP_EXEC_MAP,
+};
+
+#include "syaoran.h"
+
+static struct inode *syaoran_get_inode(struct super_block *sb, int mode,
+				       dev_t dev)
+{
+	struct inode *inode = new_inode(sb);
+
+	if (inode) {
+		struct timespec now = CURRENT_TIME;
+		inode->i_mode = mode;
+		inode->i_uid = current->fsuid;
+		inode->i_gid = current->fsgid;
+		inode->i_blocks = 0;
+		inode->i_mapping->a_ops = &syaoran_aops;
+		inode->i_mapping->backing_dev_info = &syaoran_backing_dev_info;
+		inode->i_atime = now;
+		inode->i_mtime = now;
+		inode->i_ctime = now;
+		switch (mode & S_IFMT) {
+		default:
+			init_special_inode(inode, mode, dev);
+			if (S_ISBLK(mode))
+				inode->i_fop = &wrapped_def_blk_fops;
+			else if (S_ISCHR(mode))
+				inode->i_fop = &wrapped_def_chr_fops;
+			inode->i_op = &syaoran_file_inode_operations;
+			break;
+		case S_IFREG:
+			inode->i_op = &syaoran_file_inode_operations;
+			inode->i_fop = &syaoran_file_operations;
+			break;
+		case S_IFDIR:
+			inode->i_op = &syaoran_dir_inode_operations;
+			inode->i_fop = &simple_dir_operations;
+			/*
+			 * directory inodes start off with i_nlink == 2
+			 *  (for "." entry)
+			 */
+

[patch 2/2] [RFC] Simple tamper-proof device filesystem.

2007-12-16 Thread Tetsuo Handa


Signed-off-by: Tetsuo Handa [EMAIL PROTECTED]
---
 fs/Kconfig  |   21 +
 fs/Makefile |1 +
 2 files changed, 22 insertions(+)

--- linux-2.6.24-rc5.orig/fs/Kconfig
+++ linux-2.6.24-rc5/fs/Kconfig
@@ -1555,6 +1555,27 @@ config UFS_DEBUG
  Y here.  This will result in _many_ additional debugging messages to be
  written to the system log.
 
+config SYAORAN_FS
+   tristate "SYAORAN (Tamper-Proof Device Filesystem) support"
+   help
+ Say Y or M here to support the Tamper-Proof Device Filesystem.
+
+ SYAORAN stands for
+ Simple Yet All-important Object Realizing Abiding Nexus.
+ SYAORAN is a filesystem for /dev with Mandatory Access Control.
+
+ The system can't work if /dev is read-only.
+ Therefore you need to mount a writable filesystem (such as tmpfs)
+ for /dev if root fs is read-only.
+
+ But the writable /dev means that files on /dev might be tampered.
+ For example, if /dev/null is deleted and re-created as a symbolic
+ link to /dev/hda by an attacker, the contents of the IDE HDD
+ will be destroyed at a blow.
+
+ SYAORAN can ensure /dev/null is a character device file
+ with major=1 minor=3.
+
 endmenu
 
 menuconfig NETWORK_FILESYSTEMS
--- linux-2.6.24-rc5.orig/fs/Makefile
+++ linux-2.6.24-rc5/fs/Makefile
@@ -118,3 +118,4 @@ obj-$(CONFIG_HPPFS) += hppfs/
 obj-$(CONFIG_DEBUG_FS) += debugfs/
 obj-$(CONFIG_OCFS2_FS) += ocfs2/
 obj-$(CONFIG_GFS2_FS)   += gfs2/
+obj-$(CONFIG_SYAORAN_FS)+= syaoran/syaoran.o

Re: oops with 2.6.23.1, marvel, software raid, reiserfs and samba

2007-12-16 Thread Andrew Morton

On Fri, 07 Dec 2007 19:49:52 -0800 jeffunit [EMAIL PROTECTED] wrote:

 I am running linux kernel 2.6.23.1, which I compiled.
 The base system was mandriva 2008.
 
 I have a dual processor pentium III 933 system.
 It has 3gb of ram, an intel stl-2 motherboard.
 It also has a promise 100 tx2 pata controller,
 a supermicro marvell based 8 port pcix sata controller,
 and a nvidia pci based video card.
 
 I have the os on a pata drive, and have made a software raid array
 consisting of 4 sata drives attached to the pcix sata controller.
 I created the array, and formatted with reiserfs 3.6
 I have run bonnie++ (filesystem benchmark) on the array without incident.
 When I use samba-3.0.25b-4.3 and copy files from a windows machine to 
 the fileserver,
 every so often, the fileserver crashes or hangs. It seems to happen
 more often under heavy samba traffic.
 Enclosed is the oops from syslog.
 I also have a 'kernel bug' from syslog if that would be helpful.
 
 jeff
 
 
 Dec  7 17:20:52 sata_fileserver kernel: BUG: unable to handle kernel 
 NULL pointer dereference at virtual address 000d
 Dec  7 17:20:52 sata_fileserver kernel:  printing eip:
 Dec  7 17:20:52 sata_fileserver kernel: c02cc820
 Dec  7 17:20:52 sata_fileserver kernel: *pde = 
 Dec  7 17:20:52 sata_fileserver kernel: Oops:  [#1]
 Dec  7 17:20:52 sata_fileserver kernel: SMP
 Dec  7 17:20:52 sata_fileserver kernel: Modules linked in: raid456 
 async_xor async_memcpy async_tx xor iptable_raw xt_comment xt_policy 
 xt_multiport ipt_ULOG ipt_TTL ipt_ttl ipt_TOS ipt_tos ipt_SAME 
 ipt_REJECT ipt_REDIRECT ipt_recent ipt_owner ipt_NETMAP 
 ipt_MASQUERADE ipt_LOG ipt_iprange ipt_ECN ipt_ecn ipt_CLUSTERIP 
 ipt_ah ipt_addrtype nf_nat_tftp nf_nat_snmp_basic nf_nat_sip 
 nf_nat_pptp nf_nat_proto_gre nf_nat_irc nf_nat_h323 nf_nat_ftp 
 nf_nat_amanda ts_kmp nf_conntrack_amanda nf_conntrack_tftp 
 nf_conntrack_sip nf_conntrack_proto_sctp nf_conntrack_pptp 
 nf_conntrack_proto_gre nf_conntrack_netlink nf_conntrack_netbios_ns 
 nf_conntrack_irc nf_conntrack_h323 nf_conntrack_ftp xt_tcpmss 
 xt_pkttype xt_physdev xt_NFQUEUE xt_NFLOG xt_MARK xt_mark xt_mac 
 xt_limit xt_length xt_helper xt_hashlimit ip6_tables xt_dccp 
 xt_conntrack xt_CONNMARK xt_connmark xt_CLASSIFY nfsd xt_tcpudp 
 exportfs auth_rpcgss xt_state iptable_nat nf_nat nf_conntrack_ipv4 
 nf_conntrack nfs iptable_mangle lockd nfs_acl sunrpc nfnetlink 
 iptable_filter ip_table
 Dec  7 17:20:52 sata_fileserver kernel:  x_tables af_packet ipv6 
 snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_pcm_oss 
 snd_mixer_oss ipmi_si ipmi_msghandler binfmt_misc loop nls_utf8 ntfs 
 dm_mod usb_storage sg sd_mod sata_mv libata scsi_mod video output 
 thermal sbs processor fan container button dock battery ac floppy 
 snd_emu10k1 snd_rawmidi snd_ac97_codec ac97_bus snd_pcm 
 snd_seq_device snd_timer snd_page_alloc snd_util_mem snd_hwdep 
 ehci_hcd snd ohci_hcd i2c_piix4 uhci_hcd soundcore e1000 sworks_agp 
 i2c_core ide_cd usbcore agpgart emu10k1_gp gameport tsdev evdev 
 reiserfs ide_disk serverworks pdc202xx_new ide_core
 Dec  7 17:20:52 sata_fileserver kernel: CPU:1
 Dec  7 17:20:52 sata_fileserver kernel: 
 EIP:0060:[c02cc820]Not tainted VLI
 Dec  7 17:20:52 sata_fileserver kernel: EFLAGS: 00210202   (2.6.23.1 #1)
 Dec  7 17:20:52 sata_fileserver kernel: EIP is at tcp_recvmsg+0x150/0xbf0
 Dec  7 17:20:52 sata_fileserver kernel: eax:    ebx: 
 f55c4b60   ecx: 784e2c7c   edx: f63f63d8
 Dec  7 17:20:52 sata_fileserver kernel: esi: 784e2c7a   edi: 
 f63f614c   ebp: e21fde24   esp: e21fddc4
 Dec  7 17:20:52 sata_fileserver kernel: ds: 007b   es: 007b   fs: 
 00d8  gs: 0033  ss: 0068
 Dec  7 17:20:52 sata_fileserver kernel: Process smbd (pid: 9524, 
 ti=e21fc000 task=f5109000 task.ti=e21fc000)
 Dec  7 17:20:52 sata_fileserver kernel: Stack:   
  c13e5740 f557b000 c03fa300  e21fde90
 Dec  7 17:20:52 sata_fileserver kernel:f63f60e0  
 0b64 f63f63d8 05b4 0001  
 Dec  7 17:20:52 sata_fileserver kernel: 05b4 
 e21fde4c 7fff e21fde28  c03a4de0 e21fde90
 Dec  7 17:20:52 sata_fileserver kernel: Call Trace:
 Dec  7 17:20:53 sata_fileserver kernel:  [c010542a] 
 show_trace_log_lvl+0x1a/0x30
 Dec  7 17:20:53 sata_fileserver kernel:  [c01054eb] 
 show_stack_log_lvl+0xab/0xd0
 Dec  7 17:20:53 sata_fileserver kernel:  [c01056e1] 
 show_registers+0x1d1/0x2d0
 Dec  7 17:20:53 sata_fileserver kernel:  [c01058f6] die+0x116/0x250
 Dec  7 17:20:53 sata_fileserver kernel:  [c011f52b] 
 do_page_fault+0x28b/0x6a0
 Dec  7 17:20:53 sata_fileserver kernel:  [c030938a] error_code+0x72/0x78
 Dec  7 17:20:53 sata_fileserver kernel:  [c0295423] 
 sock_common_recvmsg+0x43/0x60
 Dec  7 17:20:53 sata_fileserver kernel:  [c029301c] 
 sock_aio_read+0x11c/0x130
 Dec  7 17:20:53 sata_fileserver kernel:  [c017db30] do_sync_read+0xd0/0x110
 Dec  7 17:20:53 sata_fileserver kernel:  [c017e47d]

Re: [PATCH 3/3] net: wireless: bcm43xx: big_buffer_sem semaphore to mutex

2007-12-16 Thread Michael Buesch

On Sunday 16 December 2007 03:30:16 Larry Finger wrote:
> Michael Buesch wrote:
> > On Sunday 16 December 2007 00:18:43 Rafael J. Wysocki wrote:
> >> Well, the only problem with that is I suspect there are some newer cards
> >> that work better with v3 firmware, although they are supposed to support
> >> both.
> >
> > And I suspect that you are wrong until you show me one. :)
>
> The BCM4311/1 card used to work better with bcm43xx than it did with b43;
> however, since the power control problem was solved in b43, there is very
> little difference. When I built my special system to use the BCM4311 with
> b43legacy, there was no difference.
>
> I don't know of any cards that work better with bcm43xx than with b43. Of
> course, that is comparing SoftMAC with mac80211. There is, of course, no
> comparison.

This was about version 3 firmware vs version 4 firmware.
I doubt the firmware makes any difference at all.

-- 
Greetings Michael.

Re: [patch 1/2] [RFC] Simple tamper-proof device filesystem.

2007-12-16 Thread David Newall


Tetsuo Handa wrote:
> /dev needs to be writable, but this means that files on /dev might be
> tampered with.

I infer that you mean /dev needs to be writable by anyone, not by just 
its owner or owner and group (conventionally root/root.)  This goes 
against conventional wisdom, which is that /dev must be writable only by 
the administrator.  Why do you say otherwise?


Re: [PATCH 3/3] net: wireless: bcm43xx: big_buffer_sem semaphore to mutex

2007-12-16 Thread Michael Buesch

On Sunday 16 December 2007 10:22:43 Ingo Molnar wrote:
 
> * John W. Linville [EMAIL PROTECTED] wrote:
>
> > > It's not that simple.  For example, regression testing will be a
> > > major PITA if one needs to switch back and forth from the new driver
> > > to the old one in the process.
> >
> > Not really true -- a single system can easily have firmware installed
> > for b43, b43legacy, and bcm43xx at the same time and switch back and
> > forth between them.
>
> as long as the version 4 firmware blob is present in the system, will
> testers have a fully fluid test- and work-flow when migrating across
> from bcm43xx to b43, without any other changes to an existing Linux
> installation? (i.e. no udev tweaks, no forced upgrades of components,
> etc.)
>
> Will it Just Work in bisection as well, when a tester's kernel
> flip/flops between bcm43xx and b43 - like it does for the other 3000+
> drivers in the kernel?
>
> Note that we are _NOT_ interested in might or can scenarios. We are
> interested in preserving the _existing_ bcm43xx installed base and how
> much of a seamless migration the b43 transition will be. _THAT_ is what
> the no regressions upstream rule is about, not the ideal distro
> scenario you outline above. It is YOUR total obligation as a kernel
> maintainer to ensure that you don't break old installations. How hard is
> that to understand? This is not rocket science.

I see no reason for b43 to break, if the firmware is properly installed.
In fact, almost all installation related bugreports we receive are
caused by missing or incorrectly installed firmware.
I would really _like_ to make installing firmware easier or make the
whole need for it vanish, but I simply can not at this point.
But anyway, installing it is not rocket science, either. The only thing
you have to know is where your distribution stores the firmware image files.
If you know that, it's a matter of invoking one b43-fwcutter command
to install it. This process can be automated in the distribution's rpm
or deb package scripts.

b43legacy/ssb is completely featured with module autoload support.
So if you have firmware installed it will automatically load all required
modules and create the network device(s) for it without any user interaction.

If that doesn't work, then stupid distributions are shipping braindamaged
udev scripts that pin a mac address to a specific driver name (see another
mail in this thread). I can _not_ fix this from within the kernel and
I will absolutely shift all responsibility and blame for this to the
maintainers of the distribution's udev scripts.
That's not a b43 specific problem then. Other drivers do break with these
scripts, too.

-- 
Greetings Michael.

Re: [patch 1/2] [RFC] Simple tamper-proof device filesystem.

2007-12-16 Thread Tetsuo Handa

Hello.

David Newall wrote:
> Tetsuo Handa wrote:
> > /dev needs to be writable, but this means that files on /dev might be
> > tampered with.
>
> I infer that you mean /dev needs to be writable by anyone, not by just
> its owner or owner and group (conventionally root/root.)  This goes
> against conventional wisdom, which is that /dev must be writable only by
> the administrator.  Why do you say otherwise?
I didn't mean that /dev is writable by everybody.
I meant that /dev must be mounted for read-write mode
(even if one wants to mount / for read-only mode).

Regards.

Re: [patch 1/2] [RFC] Simple tamper-proof device filesystem.

2007-12-16 Thread David Newall


Tetsuo Handa wrote:
> David Newall wrote:
> > Tetsuo Handa wrote:
> > > /dev needs to be writable, but this means that files on /dev might be
> > > tampered with.
> >
> > I infer that you mean /dev needs to be writable by anyone, not by just
> > its owner or owner and group (conventionally root/root.)  This goes
> > against conventional wisdom, which is that /dev must be writable only by
> > the administrator.  Why do you say otherwise?
>
> I didn't mean that /dev is writable by everybody.

Glad to hear it! :)

> I meant that /dev must be mounted for read-write mode

Again, why?

Re: [patch 1/2] [RFC] Simple tamper-proof device filesystem.

2007-12-16 Thread Tetsuo Handa

Hello.

> > I meant that /dev must be mounted for read-write mode
>
> Again, why?

You can mount the / partition read-only if you wish to do so.
But you cannot make the /dev directory read-only.
You won't be able to log in to the system because /sbin/mingetty
fails to chown/chmod /dev/tty* if /dev is mounted read-only.
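
A minimal sketch of the kind of call that fails in that situation. The helper
name claim_tty() is hypothetical and this is not mingetty's actual code; it
only shows that chown(2) and chmod(2) on a node under a read-only mount return
an error (EROFS) that the caller has to treat as fatal.

```c
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper mirroring what a getty does to a terminal node
 * before login: hand it to the user and tighten its mode.
 * Returns 0 on success or a negative errno value. */
static int claim_tty(const char *path, uid_t uid, gid_t gid, mode_t mode)
{
	if (chown(path, uid, gid) != 0)
		return -errno;	/* -EROFS if the filesystem is read-only */
	if (chmod(path, mode) != 0)
		return -errno;
	return 0;
}
```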

Re: 2.6.24-rc5-mm1: problems with cat /proc/kpageflags

2007-12-16 Thread Mariusz Kozlowski

> cat /proc/kpageflags on sparc64 causes the box to lock.
> I can not write on any terminal - but I can issue sysrqs and switch
> between consoles.
>
> cat process hangs in read(3, ...

cat /proc/kpagecount produces similar symptoms. The box is locked; sysrq-w
sshd trace:

__down
__down_interruptible
kobject_get
lock_kernel
chrdev_open
__dentry_open
nameidata_to_filp
open_pathname
do_sys_open
sparc32_open
linux_sparc_syscall32

then again:

BUG: soft lockup - CPU#0 stuck for 11s! [sshd:3242]
...
TPC: spitfire_xcall_helper+0xa0/0x100
...
RPC: spitfire_xcall_helper+0xac/0x100
...
I7: flush_dcache_page_all+0x1a4/0x1e0

or:

BUG: soft lockup - CPU#0 stuck for 11s! [sshd:3242]
...
TPC: tick_get_tick+0xc/0x20
...
RPC: __handle_softirq_continue+0x20/0x24
...
I7: __delay+0x2c/0x60

Box is unusable. Easy to reproduce - every time.

Regards,

Mariusz

[RANDOM] Move two variables to read_mostly section to save memory

2007-12-16 Thread Eric Dumazet


While examining vmlinux namelist on i686, I noticed :

c0581300 D random_table
c0581480 d input_pool
c0581580 d random_read_wakeup_thresh
c0581584 d random_write_wakeup_thresh
c0581600 d blocking_pool

That means that the two integers random_read_wakeup_thresh and
random_write_wakeup_thresh use a full cache line (128 bytes).

Moving them to the read_mostly section shrinks vmlinux by 120 bytes.

# size vmlinux*
   text    data     bss     dec     hex filename
4835553  450210  610304 5896067  59f783 vmlinux.after_patch
4835553  450330  610304 5896187  59f7fb vmlinux.before_patch
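
The 120 bytes can be cross-checked from the symbol addresses above. The sketch
below is illustrative only: wasted_bytes() and sketch_read_mostly are
hypothetical names, and the section name is an assumption about what
__read_mostly expands to with GCC on an ELF target.

```c
/* Bytes of padding left in an aligned slot that holds only `used` bytes.
 * `slot_start` and `next_object` are symbol addresses as printed by nm. */
static unsigned long wasted_bytes(unsigned long slot_start,
				  unsigned long next_object,
				  unsigned long used)
{
	return (next_object - slot_start) - used;
}

/* Assumed equivalent of __read_mostly: put the variable in a dedicated
 * section so it is packed next to other rarely written data. */
#if defined(__GNUC__) && defined(__ELF__)
#define sketch_read_mostly __attribute__((section(".data.read_mostly")))
#else
#define sketch_read_mostly
#endif

static int example_thresh sketch_read_mostly = 64;

/* Accessor so the example variable is actually referenced. */
static int example_thresh_get(void)
{
	return example_thresh;
}
```

The slot runs from c0581580 to c0581600 (128 bytes) and holds two 4-byte
integers, so 120 bytes were padding, matching the data difference of
450330 - 450210 in the size output.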

Signed-off-by: Eric Dumazet [EMAIL PROTECTED]
diff --git a/drivers/char/random.c b/drivers/char/random.c
index 5fee056..af48e86 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -256,14 +256,14 @@
  * The minimum number of bits of entropy before we wake up a read on
  * /dev/random.  Should be enough to do a significant reseed.
  */
-static int random_read_wakeup_thresh = 64;
+static int random_read_wakeup_thresh __read_mostly = 64;
 
 /*
  * If the entropy count falls under this number of bits, then we
  * should wake up processes which are selecting or polling on write
  * access to /dev/random.
  */
-static int random_write_wakeup_thresh = 128;
+static int random_write_wakeup_thresh __read_mostly = 128;
 
 /*
  * When the input pool goes over trickle_thresh, start dropping most

Re: oops with 2.6.23.1, marvel, software raid, reiserfs and samba

2007-12-16 Thread Herbert Xu

Andrew Morton [EMAIL PROTECTED] wrote:

> Dec  7 17:20:53 sata_fileserver kernel: Code: 6c 39 df 74 59 8d b6 00
> 00 00 00 85 db 74 4f 8b 55 cc 8d 43 20 8b 0a 3b 48 18 0f 88 f4 05 00
> 00 89 ce 2b 70 18 8b 83 90 00 00 00 0f b6 50 0d 89 d0 83 e0 02 3c
> 01 8b 43 50 83 d6 ff 39 c6 0f 82

This means that skb->network_header == NULL so this line crashes:

	if (tcp_hdr(skb)->syn)
		offset--;
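
As a hedged illustration of why a NULL header pointer faults at such a small
address: in the standard TCP header layout the flags byte sits at offset 13
(0x0d) and SYN is bit 0x02. That matches the "0f b6 50 0d ... 83 e0 02" bytes
in the Code line as far as they can be decoded here (a byte load at
displacement 0x0d followed by a mask with 2). The helper below is a user-space
sketch with made-up names, not the kernel's tcp_hdr().

```c
#include <stdbool.h>

#define SKETCH_TCP_FLAGS_OFFSET 13   /* byte holding FIN/SYN/RST/... */
#define SKETCH_TCP_SYN_BIT      0x02

/* Return true if the SYN flag is set in a raw TCP header.
 * With tcp_header == NULL this would read address 0xd and fault. */
static bool raw_tcp_syn(const unsigned char *tcp_header)
{
	return (tcp_header[SKETCH_TCP_FLAGS_OFFSET] & SKETCH_TCP_SYN_BIT) != 0;
}
```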

> That's a networking crash.  Do the oops traces which you're getting all look
> like this one?

What's spooky is that I just did a google and we've had reports
since 1998 all crashing on exactly the same line in tcp_recvmsg.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Re: [patch 1/2] [RFC] Simple tamper-proof device filesystem.

2007-12-16 Thread David Newall


Tetsuo Handa wrote:
> > > I meant that /dev must be mounted for read-write mode
> >
> > Again, why?
>
> You won't be able to login to the system because /sbin/mingetty
> fails to chown/chmod /dev/tty* if /dev is mounted for read-only mode.

Good point.  So, if only root can modify files in /dev, what's the 
problem you're fixing?  (I'm sure you tried to explain this in your 
original post, but your reasons weren't clear to me.)


[PATCH 1/4] x86: clean up local_{32|64}.h

2007-12-16 Thread Harvey Harrison

Common prefix from both files moved to local.h

Change __inline__ to inline

Signed-off-by: Harvey Harrison [EMAIL PROTECTED]
---
 include/asm-x86/local.h|   19 +--
 include/asm-x86/local_32.h |   34 ++
 include/asm-x86/local_64.h |   25 ++---
 3 files changed, 33 insertions(+), 45 deletions(-)

diff --git a/include/asm-x86/local.h b/include/asm-x86/local.h
index f416893..1e6b5af 100644
--- a/include/asm-x86/local.h
+++ b/include/asm-x86/local.h
@@ -1,10 +1,25 @@
+#ifndef _ARCH_LOCAL_H
+#define _ARCH_LOCAL_H
+
+#include <linux/percpu.h>
+#include <asm/system.h>
+#include <asm/atomic.h>
+
+typedef struct
+{
+   atomic_long_t a;
+} local_t;
+
+#define LOCAL_INIT(i)  { ATOMIC_LONG_INIT(i) }
+
+#define local_read(l)  atomic_long_read(&(l)->a)
+#define local_set(l,i) atomic_long_set(&(l)->a, (i))
+
 #ifdef CONFIG_X86_32
 # include "local_32.h"
 #else
 # include "local_64.h"
 #endif
-#ifndef _ARCH_LOCAL_H
-#define _ARCH_LOCAL_H
 
 #define local_inc_return(l)  (local_add_return(1,l))
 #define local_dec_return(l)  (local_sub_return(1,l))
diff --git a/include/asm-x86/local_32.h b/include/asm-x86/local_32.h
index 10ec0cf..f3bc4d9 100644
--- a/include/asm-x86/local_32.h
+++ b/include/asm-x86/local_32.h
@@ -1,35 +1,21 @@
 #ifndef _ARCH_I386_LOCAL_H
 #define _ARCH_I386_LOCAL_H
 
-#include <linux/percpu.h>
-#include <asm/system.h>
-#include <asm/atomic.h>
-
-typedef struct
-{
-   atomic_long_t a;
-} local_t;
-
-#define LOCAL_INIT(i)  { ATOMIC_LONG_INIT(i) }
-
-#define local_read(l)  atomic_long_read(&(l)->a)
-#define local_set(l,i) atomic_long_set(&(l)->a, (i))
-
-static __inline__ void local_inc(local_t *l)
+static inline void local_inc(local_t *l)
 {
 	__asm__ __volatile__(
 		"incl %0"
 		:"+m" (l->a.counter));
 }
 
-static __inline__ void local_dec(local_t *l)
+static inline void local_dec(local_t *l)
 {
 	__asm__ __volatile__(
 		"decl %0"
 		:"+m" (l->a.counter));
 }
 
-static __inline__ void local_add(long i, local_t *l)
+static inline void local_add(long i, local_t *l)
 {
 	__asm__ __volatile__(
 		"addl %1,%0"
@@ -37,7 +23,7 @@ static __inline__ void local_add(long i, local_t *l)
 		:"ir" (i));
 }
 
-static __inline__ void local_sub(long i, local_t *l)
+static inline void local_sub(long i, local_t *l)
 {
 	__asm__ __volatile__(
 		"subl %1,%0"
@@ -54,7 +40,7 @@ static __inline__ void local_sub(long i, local_t *l)
  * true if the result is zero, or false for all
  * other cases.
  */
-static __inline__ int local_sub_and_test(long i, local_t *l)
+static inline int local_sub_and_test(long i, local_t *l)
 {
unsigned char c;
 
@@ -73,7 +59,7 @@ static __inline__ int local_sub_and_test(long i, local_t *l)
  * returns true if the result is 0, or false for all other
  * cases.
  */
-static __inline__ int local_dec_and_test(local_t *l)
+static inline int local_dec_and_test(local_t *l)
 {
unsigned char c;
 
@@ -92,7 +78,7 @@ static __inline__ int local_dec_and_test(local_t *l)
  * and returns true if the result is zero, or false for all
  * other cases.
  */
-static __inline__ int local_inc_and_test(local_t *l)
+static inline int local_inc_and_test(local_t *l)
 {
unsigned char c;
 
@@ -112,7 +98,7 @@ static __inline__ int local_inc_and_test(local_t *l)
  * if the result is negative, or false when
  * result is greater than or equal to zero.
  */
-static __inline__ int local_add_negative(long i, local_t *l)
+static inline int local_add_negative(long i, local_t *l)
 {
unsigned char c;
 
@@ -130,7 +116,7 @@ static __inline__ int local_add_negative(long i, local_t *l)
  *
  * Atomically adds @i to @l and returns @i + @l
  */
-static __inline__ long local_add_return(long i, local_t *l)
+static inline long local_add_return(long i, local_t *l)
 {
long __i;
 #ifdef CONFIG_M386
@@ -156,7 +142,7 @@ no_xadd: /* Legacy 386 processor */
 #endif
 }
 
-static __inline__ long local_sub_return(long i, local_t *l)
+static inline long local_sub_return(long i, local_t *l)
 {
return local_add_return(-i,l);
 }
diff --git a/include/asm-x86/local_64.h b/include/asm-x86/local_64.h
index ae9a573..da61076 100644
--- a/include/asm-x86/local_64.h
+++ b/include/asm-x86/local_64.h
@@ -1,19 +1,6 @@
 #ifndef _ARCH_X8664_LOCAL_H
 #define _ARCH_X8664_LOCAL_H
 
-#include <linux/percpu.h>
-#include <asm/atomic.h>
-
-typedef struct
-{
-   atomic_long_t a;
-} local_t;
-
-#define LOCAL_INIT(i)  { ATOMIC_LONG_INIT(i) }
-
-#define local_read(l)  atomic_long_read(&(l)->a)
-#define local_set(l,i) atomic_long_set(&(l)->a, (i))
-
 static inline void local_inc(local_t *l)
 {
__asm__ __volatile__(
@@ -55,7 +42,7 @@ static inline void local_sub(long i, local_t *l)
  * true if the result is zero, or false for all
  * other cases.
  */
-static __inline__ int local_sub_and_test(long i, local_t *l)
+static inline int local_sub_and_test(long i, local_t *l)
 {

[PATCH 2/4] x86: fix asm memory constraints in local_64.h

2007-12-16 Thread Harvey Harrison

Use the shorter "+m" form rather than "=m" and "m".
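
For readers unfamiliar with the constraint, here is a small user-space sketch
(the function name is hypothetical, and the non-x86-64 fallback exists only so
the example builds everywhere). A single "+m" operand tells GCC that the memory
location is both read and written, which is what the separate "=m" output and
"m" input pair expressed before.

```c
/* Increment *counter in place. On x86-64 this is one read-modify-write
 * instruction with a single "+m" operand; elsewhere it falls back to
 * plain C so the sketch still compiles. */
static inline void sketch_local_inc(long *counter)
{
#if defined(__x86_64__) && defined(__GNUC__)
	__asm__ __volatile__("incq %0" : "+m" (*counter));
#else
	(*counter)++;
#endif
}
```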

Signed-off-by: Harvey Harrison [EMAIL PROTECTED]
---
 include/asm-x86/local_64.h |   30 ++
 1 files changed, 14 insertions(+), 16 deletions(-)

diff --git a/include/asm-x86/local_64.h b/include/asm-x86/local_64.h
index da61076..7808f53 100644
--- a/include/asm-x86/local_64.h
+++ b/include/asm-x86/local_64.h
@@ -5,32 +5,30 @@ static inline void local_inc(local_t *l)
 {
__asm__ __volatile__(
"incq %0"
-   :"=m" (l->a.counter)
-   :"m" (l->a.counter));
+   :"+m" (l->a.counter));
 }
 
 static inline void local_dec(local_t *l)
 {
__asm__ __volatile__(
"decq %0"
-   :"=m" (l->a.counter)
-   :"m" (l->a.counter));
+   :"+m" (l->a.counter));
 }
 
 static inline void local_add(long i, local_t *l)
 {
__asm__ __volatile__(
"addq %1,%0"
-   :"=m" (l->a.counter)
-   :"ir" (i), "m" (l->a.counter));
+   :"+m" (l->a.counter)
+   :"ir" (i));
 }
 
 static inline void local_sub(long i, local_t *l)
 {
__asm__ __volatile__(
"subq %1,%0"
-   :"=m" (l->a.counter)
-   :"ir" (i), "m" (l->a.counter));
+   :"+m" (l->a.counter)
+   :"ir" (i));
 }
 
 /**
@@ -48,8 +46,8 @@ static inline int local_sub_and_test(long i, local_t *l)
 
__asm__ __volatile__(
"subq %2,%0; sete %1"
-   :"=m" (l->a.counter), "=qm" (c)
-   :"ir" (i), "m" (l->a.counter) : "memory");
+   :"+m" (l->a.counter), "=qm" (c)
+   :"ir" (i) : "memory");
return c;
 }
 
@@ -67,8 +65,8 @@ static inline int local_dec_and_test(local_t *l)
 
__asm__ __volatile__(
"decq %0; sete %1"
-   :"=m" (l->a.counter), "=qm" (c)
-   :"m" (l->a.counter) : "memory");
+   :"+m" (l->a.counter), "=qm" (c)
+   : : "memory");
return c != 0;
 }
 
@@ -86,8 +84,8 @@ static inline int local_inc_and_test(local_t *l)
 
__asm__ __volatile__(
"incq %0; sete %1"
-   :"=m" (l->a.counter), "=qm" (c)
-   :"m" (l->a.counter) : "memory");
+   :"+m" (l->a.counter), "=qm" (c)
+   : : "memory");
return c != 0;
 }
 
@@ -106,8 +104,8 @@ static inline int local_add_negative(long i, local_t *l)
 
__asm__ __volatile__(
"addq %2,%0; sets %1"
-   :"=m" (l->a.counter), "=qm" (c)
-   :"ir" (i), "m" (l->a.counter) : "memory");
+   :"+m" (l->a.counter), "=qm" (c)
+   :"ir" (i) : "memory");
return c;
 }
 
-- 
1.5.4.rc0.1083.gf568


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
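
Editorial sketch for [PATCH 2/4] above: a minimal user-space illustration of the constraint change, not kernel code. In GCC extended asm a "+m" operand marks one memory location as both read and written, which is what the paired "=m" output and "m" input expressed before. The names demo_local_t, demo_inc_old and demo_inc_new are made up for this example, and the fallback branch exists only so the sketch builds on targets other than LP64 x86-64.

```c
/* Hypothetical stand-in for the kernel's local_t; not the real definition. */
typedef struct {
	long counter;
} demo_local_t;

#if defined(__GNUC__) && defined(__x86_64__) && defined(__LP64__)
/* Old form: the same memory operand is listed twice, as output and as input. */
static inline void demo_inc_old(demo_local_t *l)
{
	__asm__ __volatile__(
		"incq %0"
		: "=m" (l->counter)
		: "m" (l->counter));
}

/* New form: a single "+m" operand is both read and written. */
static inline void demo_inc_new(demo_local_t *l)
{
	__asm__ __volatile__(
		"incq %0"
		: "+m" (l->counter));
}
#else
/* Plain C fallback (assumption: only needed to keep the sketch portable). */
static inline void demo_inc_old(demo_local_t *l) { l->counter++; }
static inline void demo_inc_new(demo_local_t *l) { l->counter++; }
#endif
```

Both helpers should leave the counter one higher; the second simply states the read-modify-write once instead of twice.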

Re: [patch 1/2] [RFC] Simple tamper-proof device filesystem.

2007-12-16 Thread Tetsuo Handa

Hello.

David Newall wrote:
> > You won't be able to login to the system because /sbin/mingetty
> > fails to chown/chmod /dev/tty* if /dev is mounted for read-only mode.
>
> Good point.  So, if only root can modify files in /dev, what's the
> problem you're fixing?  (I'm sure you tried to explain this in your
> original post, but your reasons weren't clear to me.)

In 2003, I was trying to make the / partition read-only to prevent tampering
with system files.
Policy-based mandatory access control (such as SELinux) is
one way to prevent tampering, but managing the policy was a daunting task.
So I tried storing the / partition on a read-only medium so that
system files could not be tampered with.

When I took part in Security Stadium 2003 on the defense side,
I was using devfs for the /dev directory. Attackers deleted the files
in /dev and the administrator was unable to log in.
So I developed this filesystem so that attackers who gain root privileges
can't tamper with files in the /dev directory.
Not many systems mount the / partition read-only,
so there may be little need for a read-only / partition.

But this filesystem is still useful when it is combined with
policy-based mandatory access control (such as SELinux or TOMOYO Linux),
because it guarantees what policy-based mandatory access control
can't guarantee (i.e. a file's name and its attributes).

[PATCH 3/4] x86: Unify local_{32|64}.h

2007-12-16 Thread Harvey Harrison

Introduce macros to deal with X86_32 using longs and X86_64
using quads.  Small comment fixes to make files match.

Signed-off-by: Harvey Harrison [EMAIL PROTECTED]
---
 include/asm-x86/local.h|   17 +
 include/asm-x86/local_32.h |   28 ++--
 include/asm-x86/local_64.h |   18 +-
 3 files changed, 40 insertions(+), 23 deletions(-)

diff --git a/include/asm-x86/local.h b/include/asm-x86/local.h
index 1e6b5af..8839c36 100644
--- a/include/asm-x86/local.h
+++ b/include/asm-x86/local.h
@@ -14,6 +14,23 @@ typedef struct
 
 #define local_read(l)  atomic_long_read(&(l)->a)
 #define local_set(l,i) atomic_long_set(&(l)->a, (i))
+/*
+ * X86_32 uses longs
+ * X86_64 uses quads
+ */
+#ifdef CONFIG_X86_32
+# define ASM_INC incl
+# define ASM_DEC decl
+# define ASM_ADD addl
+# define ASM_SUB subl
+# define ASM_XADD xaddl
+#else
+# define ASM_INC incq
+# define ASM_DEC decq
+# define ASM_ADD addq
+# define ASM_SUB subq
+# define ASM_XADD xaddq
+#endif
 
 #ifdef CONFIG_X86_32
 # include "local_32.h"
diff --git a/include/asm-x86/local_32.h b/include/asm-x86/local_32.h
index f3bc4d9..ff6d1d2 100644
--- a/include/asm-x86/local_32.h
+++ b/include/asm-x86/local_32.h
@@ -4,21 +4,21 @@
 static inline void local_inc(local_t *l)
 {
__asm__ __volatile__(
-   incl %0
+   ASM_INC %0
:+m (l-a.counter));
 }
 
 static inline void local_dec(local_t *l)
 {
__asm__ __volatile__(
-   decl %0
+   ASM_DEC %0
:+m (l-a.counter));
 }
 
 static inline void local_add(long i, local_t *l)
 {
__asm__ __volatile__(
-   addl %1,%0
+   ASM_ADD %1,%0
:+m (l-a.counter)
:ir (i));
 }
@@ -26,7 +26,7 @@ static inline void local_add(long i, local_t *l)
 static inline void local_sub(long i, local_t *l)
 {
__asm__ __volatile__(
-   subl %1,%0
+   ASM_SUB %1,%0
:+m (l-a.counter)
:ir (i));
 }
@@ -34,7 +34,7 @@ static inline void local_sub(long i, local_t *l)
 /**
  * local_sub_and_test - subtract value from variable and test result
  * @i: integer value to subtract
- * @l: pointer of type local_t
+ * @l: pointer to type local_t
  *
  * Atomically subtracts @i from @l and returns
  * true if the result is zero, or false for all
@@ -45,7 +45,7 @@ static inline int local_sub_and_test(long i, local_t *l)
unsigned char c;
 
__asm__ __volatile__(
-   subl %2,%0; sete %1
+   ASM_SUB %2,%0; sete %1
:+m (l-a.counter), =qm (c)
:ir (i) : memory);
return c;
@@ -53,7 +53,7 @@ static inline int local_sub_and_test(long i, local_t *l)
 
 /**
  * local_dec_and_test - decrement and test
- * @l: pointer of type local_t
+ * @l: pointer to type local_t
  *
  * Atomically decrements @l by 1 and
  * returns true if the result is 0, or false for all other
@@ -64,7 +64,7 @@ static inline int local_dec_and_test(local_t *l)
unsigned char c;
 
__asm__ __volatile__(
-   decl %0; sete %1
+   ASM_DEC %0; sete %1
:+m (l-a.counter), =qm (c)
: : memory);
return c != 0;
@@ -72,7 +72,7 @@ static inline int local_dec_and_test(local_t *l)
 
 /**
  * local_inc_and_test - increment and test
- * @l: pointer of type local_t
+ * @l: pointer to type local_t
  *
  * Atomically increments @l by 1
  * and returns true if the result is zero, or false for all
@@ -83,7 +83,7 @@ static inline int local_inc_and_test(local_t *l)
unsigned char c;
 
__asm__ __volatile__(
-   incl %0; sete %1
+   ASM_INC %0; sete %1
:+m (l-a.counter), =qm (c)
: : memory);
return c != 0;
@@ -91,8 +91,8 @@ static inline int local_inc_and_test(local_t *l)
 
 /**
  * local_add_negative - add and test if negative
- * @l: pointer of type local_t
  * @i: integer value to add
+ * @l: pointer to type local_t
  *
  * Atomically adds @i to @l and returns true
  * if the result is negative, or false when
@@ -103,7 +103,7 @@ static inline int local_add_negative(long i, local_t *l)
unsigned char c;
 
__asm__ __volatile__(
-   addl %2,%0; sets %1
+   ASM_ADD %2,%0; sets %1
:+m (l-a.counter), =qm (c)
:ir (i) : memory);
return c;
@@ -111,8 +111,8 @@ static inline int local_add_negative(long i, local_t *l)
 
 /**
  * local_add_return - add and return
- * @l: pointer of type local_t
  * @i: integer value to add
+ * @l: pointer to type local_t
  *
  * Atomically adds @i to @l and returns @i + @l
  */
@@ -127,7 +127,7 @@ static inline long local_add_return(long i, local_t *l)
/* Modern 486+ processor */
__i = i;
__asm__ __volatile__(
-   xaddl %0, %1;
+   ASM_XADD %0, %1;
:+r (i), +m (l-a.counter)

[PATCH 3/9] readahead: auto detection of sequential mmap reads

2007-12-16 Thread Fengguang Wu

Auto-detect sequential mmap reads and do sync/async readahead for them.

The sequential mmap readahead will be triggered when
- sync readahead: it's a major fault and (prev_offset==offset-1);
- async readahead: minor fault on PG_readahead page with valid readahead state.

It's a bit conservative to require valid readahead state for async readahead,
which means we don't do readahead for interleaved reads for now, but let's make
it safe for this initial try.

Signed-off-by: Fengguang Wu [EMAIL PROTECTED]
---

---
 mm/filemap.c |6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

--- linux-2.6.24-rc4-mm1.orig/mm/filemap.c
+++ linux-2.6.24-rc4-mm1/mm/filemap.c
@@ -1318,7 +1318,8 @@ static void do_sync_mmap_readahead(struc
if (VM_RandomReadHint(vma))
return;
 
-   if (VM_SequentialReadHint(vma)) {
+   if (VM_SequentialReadHint(vma) ||
+   offset - 1 == (ra->prev_pos >> PAGE_CACHE_SHIFT)) {
page_cache_sync_readahead(mapping, ra, file, offset, 1);
return;
}
@@ -1360,7 +1361,8 @@ static void do_async_mmap_readahead(stru
return;
if (ra->mmap_miss > 0)
ra->mmap_miss--;
-   if (PageReadahead(page))
+   if (PageReadahead(page) &&
+   offset == ra->start + ra->size - ra->async_size)
page_cache_async_readahead(mapping, ra, file, page, offset, 1);
 }
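
Editorial sketch for [PATCH 3/9] above: the two trigger conditions from the commit message, pulled out as stand-alone predicates. This is a simplified model, not the kernel code. struct demo_ra_state is a trimmed, hypothetical stand-in for struct file_ra_state, and DEMO_PAGE_SHIFT assumes 4 KiB pages.

```c
#include <stdbool.h>

#define DEMO_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical, trimmed-down stand-in for struct file_ra_state. */
struct demo_ra_state {
	unsigned long start;		/* first page of the readahead window */
	unsigned long size;		/* window size in pages */
	unsigned long async_size;	/* pages remaining when async readahead starts */
	unsigned long long prev_pos;	/* byte position of the previous access */
};

/* Major fault: sequential if the previous access was in the preceding page. */
static bool demo_sync_is_sequential(const struct demo_ra_state *ra,
				    unsigned long offset)
{
	return offset - 1 == (unsigned long)(ra->prev_pos >> DEMO_PAGE_SHIFT);
}

/*
 * Minor fault on a PG_readahead page: act only when the page sits exactly
 * where the current window expects its marker, i.e. the state is valid.
 */
static bool demo_async_should_fire(const struct demo_ra_state *ra,
				   bool page_readahead, unsigned long offset)
{
	return page_readahead &&
	       offset == ra->start + ra->size - ra->async_size;
}
```

With a window of start 8, size 8 and async_size 4, the marker is expected at page 12; a fault there on a PG_readahead page qualifies, a fault at page 13 does not, which matches the conservative "valid readahead state" requirement described above.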
 


[PATCH 0/9] mmap read-around and readahead

2007-12-16 Thread Fengguang Wu

Andrew,

Here are the mmap read-around related patches initiated by Linus.
They are for linux-2.6.24-rc4-mm1.  The one major new feature -
auto detection and early readahead for mmap sequential reads - runs
as expected on my desktop :-)


[PATCH 1/9] readahead: simplify readahead call scheme
[PATCH 2/9] readahead: clean up and simplify the code for filemap page fault readahead
[PATCH 3/9] readahead: auto detection of sequential mmap reads
[PATCH 4/9] readahead: quick startup on sequential mmap readahead
[PATCH 5/9] readahead: make ra_submit() non-static
[PATCH 6/9] readahead: save mmap read-around states in file_ra_state
[PATCH 7/9] readahead: remove unused do_page_cache_readahead()
[PATCH 8/9] readahead: move max_sane_readahead() calls into force_page_cache_readahead()
[PATCH 9/9] readahead: call max_sane_readahead() in ondemand_readahead()

Thank you,
Fengguang

[PATCH 9/9] readahead: call max_sane_readahead() in ondemand_readahead()

2007-12-16 Thread Fengguang Wu

Apply the max_sane_readahead() limit in ondemand_readahead().
Just in case someone aggressively set a huge readahead size.

Signed-off-by: Fengguang Wu [EMAIL PROTECTED]
---
 mm/readahead.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- linux-2.6.24-rc4-mm1.orig/mm/readahead.c
+++ linux-2.6.24-rc4-mm1/mm/readahead.c
@@ -324,9 +324,9 @@ ondemand_readahead(struct address_space 
   bool hit_readahead_marker, pgoff_t offset,
   unsigned long req_size)
 {
-   int max = ra->ra_pages; /* max readahead pages */
pgoff_t prev_offset;
-   int sequential;
+   int sequential;
+   int max = max_sane_readahead(ra->ra_pages);  /* max readahead pages */
 
/*
 * It's the expected callback offset, assume sequential access.


[PATCH 5/9] readahead: make ra_submit() non-static

2007-12-16 Thread Fengguang Wu

Make ra_submit() non-static and callable from other files.

Signed-off-by: Fengguang Wu [EMAIL PROTECTED]
---
---
 include/linux/mm.h |3 +++
 mm/readahead.c |2 +-
 2 files changed, 4 insertions(+), 1 deletion(-)

--- linux-2.6.24-rc4-mm1.orig/include/linux/mm.h
+++ linux-2.6.24-rc4-mm1/include/linux/mm.h
@@ -1103,6 +1103,9 @@ void page_cache_async_readahead(struct a
unsigned long size);
 
 unsigned long max_sane_readahead(unsigned long nr);
+unsigned long ra_submit(struct file_ra_state *ra,
+   struct address_space *mapping,
+   struct file *filp);
 
 /* Do stack extension */
 extern int expand_stack(struct vm_area_struct *vma, unsigned long address);
--- linux-2.6.24-rc4-mm1.orig/mm/readahead.c
+++ linux-2.6.24-rc4-mm1/mm/readahead.c
@@ -242,7 +242,7 @@ subsys_initcall(readahead_init);
 /*
  * Submit IO for the read-ahead request in file_ra_state.
  */
-static unsigned long ra_submit(struct file_ra_state *ra,
+unsigned long ra_submit(struct file_ra_state *ra,
   struct address_space *mapping, struct file *filp)
 {
int actual;


[PATCH 7/9] readahead: remove unused do_page_cache_readahead()

2007-12-16 Thread Fengguang Wu

Remove do_page_cache_readahead().
Its last user, mmap read-around, has been changed to call ra_submit().

Also, the no-readahead-if-congested logic is not appropriate here. 
Raw 1-page reads can only make things painfully slower, and
users are pretty sensitive about the slow loading of executables.

Signed-off-by: Fengguang Wu [EMAIL PROTECTED]
---
 include/linux/mm.h |2 --
 mm/readahead.c |   16 
 2 files changed, 18 deletions(-)

--- linux-2.6.24-rc4-mm1.orig/include/linux/mm.h
+++ linux-2.6.24-rc4-mm1/include/linux/mm.h
@@ -1084,8 +1084,6 @@ int write_one_page(struct page *page, in
 #define VM_MAX_READAHEAD   128 /* kbytes */
 #define VM_MIN_READAHEAD   16  /* kbytes (includes current page) */
 
-int do_page_cache_readahead(struct address_space *mapping, struct file *filp,
-   pgoff_t offset, unsigned long nr_to_read);
 int force_page_cache_readahead(struct address_space *mapping, struct file *filp,
pgoff_t offset, unsigned long nr_to_read);
 
--- linux-2.6.24-rc4-mm1.orig/mm/readahead.c
+++ linux-2.6.24-rc4-mm1/mm/readahead.c
@@ -208,22 +208,6 @@ int force_page_cache_readahead(struct ad
 }
 
 /*
- * This version skips the IO if the queue is read-congested, and will tell the
- * block layer to abandon the readahead if request allocation would block.
- *
- * force_page_cache_readahead() will ignore queue congestion and will block on
- * request queues.
- */
-int do_page_cache_readahead(struct address_space *mapping, struct file *filp,
-   pgoff_t offset, unsigned long nr_to_read)
-{
-   if (bdi_read_congested(mapping->backing_dev_info))
-   return -1;
-
-   return __do_page_cache_readahead(mapping, filp, offset, nr_to_read, 0);
-}
-
-/*
  * Given a desired number of PAGE_CACHE_SIZE readahead pages, return a
  * sensible upper limit.
  */


[PATCH 4/9] readahead: quick startup on sequential mmap readahead

2007-12-16 Thread Fengguang Wu

When the user explicitly sets MADV_SEQUENTIAL, we should really avoid the slow
readahead size ramp-up phase and start full-size readahead immediately.

This patch won't change behavior for the auto-detected sequential mmap reads.
Its previous read-around size is ra_pages/2, so it will be doubled to the full
readahead size anyway.

Signed-off-by: Fengguang Wu [EMAIL PROTECTED]
---
 mm/filemap.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- linux-2.6.24-rc4-mm1.orig/mm/filemap.c
+++ linux-2.6.24-rc4-mm1/mm/filemap.c
@@ -1320,7 +1320,7 @@ static void do_sync_mmap_readahead(struc
 
if (VM_SequentialReadHint(vma) ||
offset - 1 == (ra->prev_pos >> PAGE_CACHE_SHIFT)) {
-   page_cache_sync_readahead(mapping, ra, file, offset, 1);
+   page_cache_sync_readahead(mapping, ra, file, offset, ra->ra_pages);
return;
}
 


[PATCH 1/9] readahead: simplify readahead call scheme

2007-12-16 Thread Fengguang Wu

It is insane and error-prone to require call sites to check for async
readahead after doing any sync one. I.e. whenever someone does a sync readahead:

	if (!page)
		page_cache_sync_readahead(...);

He must try async readahead, too:

	page = find_get_page(...);
	if (PageReadahead(page))
		page_cache_async_readahead(...);

The tricky point is that PG_readahead could be set by a sync readahead for the
_current_ newly faulted in page, and the readahead code simply expects one more
callback to handle it. If the caller fails to do so, it will miss the
PG_readahead bits and never be able to start an async readahead.

Avoid it by piggy-backing the async part _inside_ the readahead code.

Now if an async readahead should be started immediately after a sync one,
the readahead logic itself will do it. So the following code becomes valid:

	if (!page)
		page_cache_sync_readahead(...);
	else if (PageReadahead(page))
		page_cache_async_readahead(...);

Signed-off-by: Fengguang Wu [EMAIL PROTECTED]
---
 mm/readahead.c |8 
 1 file changed, 8 insertions(+)

--- linux-2.6.24-rc4-mm1.orig/mm/readahead.c
+++ linux-2.6.24-rc4-mm1/mm/readahead.c
@@ -402,6 +402,14 @@ ondemand_readahead(struct address_space 
ra->async_size = ra->size > req_size ? ra->size - req_size : ra->size;
 
 readit:
+   /*
+* An async readahead should be triggered immediately.
+* Instead of demanding all call sites to check for async readahead
+* immediate after a sync one, start the async part now and here.
+*/
+   if (!hit_readahead_marker && ra->size == ra->async_size)
+   ra->size *= 2;
+
return ra_submit(ra, mapping, filp);
 }
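
Editorial sketch for [PATCH 1/9] above: the added check in isolation. When size equals async_size, the readahead marker would land on the first page of the window, which is the page being faulted in right now, so the async pass would be due immediately. Doubling size folds that pass into the current submission. struct demo_window and demo_fold_async are made-up names for this illustration; the real code operates on struct file_ra_state.

```c
#include <stdbool.h>

/* Hypothetical two-field model of the readahead window. */
struct demo_window {
	unsigned long size;		/* pages to read in this submission */
	unsigned long async_size;	/* pages left when the marker is hit */
};

/*
 * Models the check added at the "readit:" label: if this call was not
 * itself triggered by a readahead marker and the marker would sit on the
 * very first page, read twice as much now instead of waiting for a second
 * callback from the caller.
 */
static void demo_fold_async(struct demo_window *w, bool hit_readahead_marker)
{
	if (!hit_readahead_marker && w->size == w->async_size)
		w->size *= 2;
}
```

A window of {4, 4} becomes {8, 4} after the call, while {8, 4} is left alone because its marker already sits four pages into the window.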
 


[PATCH 6/9] readahead: save mmap read-around states in file_ra_state

2007-12-16 Thread Fengguang Wu

Change mmap read-around to share the same code style and data structure
with readahead code.

Signed-off-by: Fengguang Wu [EMAIL PROTECTED]
---
 mm/filemap.c |   14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

--- linux-2.6.24-rc4-mm1.orig/mm/filemap.c
+++ linux-2.6.24-rc4-mm1/mm/filemap.c
@@ -1333,13 +1333,15 @@ static void do_sync_mmap_readahead(struc
if (ra->mmap_miss > MMAP_LOTSAMISS)
return;
 
-   ra_pages = max_sane_readahead(file->f_ra.ra_pages);
+   /*
+* mmap read-around
+*/
+   ra_pages = max_sane_readahead(ra->ra_pages);
if (ra_pages) {
-   pgoff_t start = 0;
-
-   if (offset > ra_pages / 2)
-   start = offset - ra_pages / 2;
-   do_page_cache_readahead(mapping, file, start, ra_pages);
+   ra->start = max_t(long, 0, offset - ra_pages / 2);
+   ra->size = ra_pages;
+   ra->async_size = 0;
+   ra_submit(ra, mapping, file);
}
 }
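
Editorial sketch for [PATCH 6/9] above: how the read-around window is centred on the faulting page. It follows the explicit comparison of the removed lines rather than the max_t() form of the added ones; the two should agree for offsets that fit in a signed long. struct demo_range and demo_read_around are made-up names for this illustration.

```c
/* Hypothetical result type: first page and length of the read-around. */
struct demo_range {
	unsigned long start;
	unsigned long size;
};

/*
 * Centre a window of ra_pages pages on the faulting page offset, clamping
 * the start at page 0 so the unsigned subtraction cannot wrap around.
 */
static struct demo_range demo_read_around(unsigned long offset,
					  unsigned long ra_pages)
{
	struct demo_range r;
	unsigned long half = ra_pages / 2;

	r.start = offset > half ? offset - half : 0;
	r.size = ra_pages;
	return r;
}
```

A fault at page 100 with a 32-page window reads pages 84 to 115; a fault at page 5 starts at page 0 instead of wrapping.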
 


[PATCH 8/9] readahead: move max_sane_readahead() calls into force_page_cache_readahead()

2007-12-16 Thread Fengguang Wu

Simplify code by moving max_sane_readahead() calls into
force_page_cache_readahead().

Signed-off-by: Fengguang Wu [EMAIL PROTECTED]
---
 mm/fadvise.c   |2 +-
 mm/filemap.c   |3 +--
 mm/madvise.c   |3 +--
 mm/readahead.c |1 +
 4 files changed, 4 insertions(+), 5 deletions(-)

--- linux-2.6.24-rc4-mm1.orig/mm/fadvise.c
+++ linux-2.6.24-rc4-mm1/mm/fadvise.c
@@ -89,7 +89,7 @@ asmlinkage long sys_fadvise64_64(int fd,

ret = force_page_cache_readahead(mapping, file,
start_index,
-   max_sane_readahead(nrpages));
+   nrpages);
if (ret > 0)
ret = 0;
break;
--- linux-2.6.24-rc4-mm1.orig/mm/filemap.c
+++ linux-2.6.24-rc4-mm1/mm/filemap.c
@@ -1242,8 +1242,7 @@ do_readahead(struct address_space *mappi
if (!mapping || !mapping->a_ops || !mapping->a_ops->readpage)
return -EINVAL;
 
-   force_page_cache_readahead(mapping, filp, index,
-   max_sane_readahead(nr));
+   force_page_cache_readahead(mapping, filp, index, nr);
return 0;
 }
 
--- linux-2.6.24-rc4-mm1.orig/mm/madvise.c
+++ linux-2.6.24-rc4-mm1/mm/madvise.c
@@ -123,8 +123,7 @@ static long madvise_willneed(struct vm_a
end = vma->vm_end;
end = ((end - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
 
-   force_page_cache_readahead(file->f_mapping,
-   file, start, max_sane_readahead(end - start));
+   force_page_cache_readahead(file->f_mapping, file, start, end - start);
return 0;
 }
 
--- linux-2.6.24-rc4-mm1.orig/mm/readahead.c
+++ linux-2.6.24-rc4-mm1/mm/readahead.c
@@ -187,6 +187,7 @@ int force_page_cache_readahead(struct ad
if (unlikely(!mapping->a_ops->readpage && !mapping->a_ops->readpages))
return -EINVAL;
 
+   nr_to_read = max_sane_readahead(nr_to_read);
while (nr_to_read) {
int err;
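
Editorial sketch for [PATCH 8/9] above: the refactoring pattern, moving a clamp from every caller into the callee. All names here are hypothetical. DEMO_CAP is a fixed stand-in for the limit, whereas the kernel's max_sane_readahead() computes its limit at run time, and the 32-page chunk size is likewise an assumption made only for the example.

```c
#define DEMO_CAP   256UL	/* assumption: fixed cap for the example */
#define DEMO_CHUNK 32UL		/* assumption: pages submitted per pass */

/* Stand-in for max_sane_readahead(): never return more than the cap. */
static unsigned long demo_max_sane(unsigned long nr)
{
	return nr < DEMO_CAP ? nr : DEMO_CAP;
}

/*
 * Stand-in for force_page_cache_readahead(): the clamp is applied once
 * here, so callers can pass their raw request size.
 * Returns the number of pages that would have been submitted.
 */
static unsigned long demo_force_readahead(unsigned long nr_to_read)
{
	unsigned long submitted = 0;

	nr_to_read = demo_max_sane(nr_to_read);
	while (nr_to_read) {
		unsigned long chunk =
			nr_to_read < DEMO_CHUNK ? nr_to_read : DEMO_CHUNK;

		submitted += chunk;
		nr_to_read -= chunk;
	}
	return submitted;
}
```

With the clamp inside the callee, a caller that forgets it can no longer submit an unbounded request, which is the simplification the patch is after.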
 


[PATCH 2/9] readahead: clean up and simplify the code for filemap page fault readahead

2007-12-16 Thread Fengguang Wu

From: Linus Torvalds [EMAIL PROTECTED]

This shouldn't really change behavior all that much, but the single
rather complex function with read-ahead inside a loop etc is broken up
into more manageable pieces.

The behaviour is also less subtle, with the read-ahead being done up-front 
rather than inside some subtle loop and thus avoiding the now unnecessary 
extra state variables (ie did_readaround is gone).

Cc: Nick Piggin [EMAIL PROTECTED]
Cc: Andrew Morton [EMAIL PROTECTED]
Cc: Fengguang Wu [EMAIL PROTECTED]
Signed-off-by: Linus Torvalds [EMAIL PROTECTED]
---

Ok, so this is something I did in Mexico when I wasn't scuba-diving, and 
was watching the kids at the pool. It was brought on by looking at git 
mmap file behaviour under cold-cache behaviour: git does ok, but my laptop 
disk is really slow, and I tried to verify that the kernel did a 
reasonable job of read-ahead when taking page faults.

I think it did, but quite frankly, the filemap_fault() code was totally 
unreadable. So this separates out the read-ahead cases, and adds more 
comments, and also changes it so that we do asynchronous read-ahead 
*before* we actually wait for the page we are waiting for to become 
unlocked.

Not that it seems to make any real difference on my laptop, but I really 
hated how it was doing a

page = get_lock_page(..)

and then doing read-ahead after that: which just guarantees that we have 
to wait for any out-standing IO on page to complete before we can even 
submit any new read-ahead! That just seems totally broken!

So it replaces the get_lock_page() at the top with a broken-out page 
cache lookup, which allows us to look at the page state flags and make 
appropriate decisions on what we should do without waiting for the locked 
bit to clear.

It does add many more lines than it removes:

 mm/filemap.c |  192 
+++---
 1 files changed, 130 insertions(+), 62 deletions(-)

but that's largely due to (a) the new function headers etc due to the 
split-up and (b) new or extended comments especially about the helper 
functions. The code, in many ways, is actually simpler, apart from the 
fairly trivial expansion of the equivalent of get_lock_page() into the 
function.

Comments? I tried to avoid changing the read-ahead logic itself, although 
the old code did some strange things like doing *both* async readahead and 
then looking up the page and doing sync readahead (which I think was just 
due to the code being so damn messily organized, not on purpose).

Linus

---
 mm/filemap.c |  190 +
 1 file changed, 128 insertions(+), 62 deletions(-)

--- linux-2.6.24-rc4-mm1.orig/mm/filemap.c
+++ linux-2.6.24-rc4-mm1/mm/filemap.c
@@ -1302,6 +1302,86 @@ static int fastcall page_cache_read(stru
 
 #define MMAP_LOTSAMISS  (100)
 
+/*
+ * Synchronous readahead happens when we don't even find
+ * a page in the page cache at all.
+ */
+static void do_sync_mmap_readahead(struct vm_area_struct *vma,
+  struct file_ra_state *ra,
+  struct file *file,
+  pgoff_t offset)
+{
+   unsigned long ra_pages;
+   struct address_space *mapping = file->f_mapping;
+
+   /* If we don't want any read-ahead, don't bother */
+   if (VM_RandomReadHint(vma))
+   return;
+
+   if (VM_SequentialReadHint(vma)) {
+   page_cache_sync_readahead(mapping, ra, file, offset, 1);
+   return;
+   }
+
+   ra->mmap_miss++;
+
+   /*
+* Do we miss much more than hit in this file? If so,
+* stop bothering with read-ahead. It will only hurt.
+*/
+   if (ra->mmap_miss > MMAP_LOTSAMISS)
+   return;
+
+   ra_pages = max_sane_readahead(file->f_ra.ra_pages);
+   if (ra_pages) {
+   pgoff_t start = 0;
+
+   if (offset > ra_pages / 2)
+   start = offset - ra_pages / 2;
+   do_page_cache_readahead(mapping, file, start, ra_pages);
+   }
+}
+
+/*
+ * Asynchronous readahead happens when we find the page,
+ * but it is busy being read, so we want to possibly
+ * extend the readahead further..
+ */
+static void do_async_mmap_readahead(struct vm_area_struct *vma,
+   struct file_ra_state *ra,
+   struct file *file,
+   struct page *page,
+   pgoff_t offset)
+{
+   struct address_space *mapping = file->f_mapping;
+
+   /* If we don't want any read-ahead, don't bother */
+   if (VM_RandomReadHint(vma))
+   return;
+   if (ra->mmap_miss > 0)
+   ra->mmap_miss--;
+   if (PageReadahead(page))
+   page_cache_async_readahead(mapping, ra, file, page, offset, 1);
+}
+
+/*
+ * A successful mmap hit is when we

[PATCH 4/4] x86: Final unification of local_{32|64}.h

2007-12-16 Thread Harvey Harrison

No differences except for the definition of local_add_return on
X86_64. The X86_32 version is just fine as it is protected with
ifdef CONFIG_M386 so use it directly.

Signed-off-by: Harvey Harrison [EMAIL PROTECTED]
---
 include/asm-x86/local.h|  149 ++-
 include/asm-x86/local_32.h |  150 
 include/asm-x86/local_64.h |  134 ---
 3 files changed, 145 insertions(+), 288 deletions(-)

diff --git a/include/asm-x86/local.h b/include/asm-x86/local.h
index 8839c36..3939859 100644
--- a/include/asm-x86/local.h
+++ b/include/asm-x86/local.h
@@ -14,6 +14,7 @@ typedef struct
 
 #define local_read(l)  atomic_long_read(&(l)->a)
 #define local_set(l,i) atomic_long_set(&(l)->a, (i))
+
 /*
  * X86_32 uses longs
  * X86_64 uses quads
@@ -32,11 +33,151 @@ typedef struct
 # define ASM_XADD xaddq
 #endif
 
-#ifdef CONFIG_X86_32
-# include "local_32.h"
-#else
-# include "local_64.h"
+static inline void local_inc(local_t *l)
+{
+   __asm__ __volatile__(
+   ASM_INC %0
+   :+m (l-a.counter));
+}
+
+static inline void local_dec(local_t *l)
+{
+   __asm__ __volatile__(
+   ASM_DEC %0
+   :+m (l-a.counter));
+}
+
+static inline void local_add(long i, local_t *l)
+{
+   __asm__ __volatile__(
+   ASM_ADD %1,%0
+   :+m (l-a.counter)
+   :ir (i));
+}
+
+static inline void local_sub(long i, local_t *l)
+{
+   __asm__ __volatile__(
+   ASM_SUB %1,%0
+   :+m (l-a.counter)
+   :ir (i));
+}
+
+/**
+ * local_sub_and_test - subtract value from variable and test result
+ * @i: integer value to subtract
+ * @l: pointer to type local_t
+ *
+ * Atomically subtracts @i from @l and returns
+ * true if the result is zero, or false for all
+ * other cases.
+ */
+static inline int local_sub_and_test(long i, local_t *l)
+{
+   unsigned char c;
+
+   __asm__ __volatile__(
+   ASM_SUB %2,%0; sete %1
+   :+m (l-a.counter), =qm (c)
+   :ir (i) : memory);
+   return c;
+}
+
+/**
+ * local_dec_and_test - decrement and test
+ * @l: pointer to type local_t
+ *
+ * Atomically decrements @l by 1 and
+ * returns true if the result is 0, or false for all other
+ * cases.
+ */
+static inline int local_dec_and_test(local_t *l)
+{
+   unsigned char c;
+
+   __asm__ __volatile__(
+   ASM_DEC %0; sete %1
+   :+m (l-a.counter), =qm (c)
+   : : memory);
+   return c != 0;
+}
+
+/**
+ * local_inc_and_test - increment and test
+ * @l: pointer to type local_t
+ *
+ * Atomically increments @l by 1
+ * and returns true if the result is zero, or false for all
+ * other cases.
+ */
+static inline int local_inc_and_test(local_t *l)
+{
+   unsigned char c;
+
+   __asm__ __volatile__(
+   ASM_INC %0; sete %1
+   :+m (l-a.counter), =qm (c)
+   : : memory);
+   return c != 0;
+}
+
+/**
+ * local_add_negative - add and test if negative
+ * @i: integer value to add
+ * @l: pointer to type local_t
+ *
+ * Atomically adds @i to @l and returns true
+ * if the result is negative, or false when
+ * result is greater than or equal to zero.
+ */
+static inline int local_add_negative(long i, local_t *l)
+{
+   unsigned char c;
+
+   __asm__ __volatile__(
+   ASM_ADD %2,%0; sets %1
+   :+m (l-a.counter), =qm (c)
+   :ir (i) : memory);
+   return c;
+}
+
+/**
+ * local_add_return - add and return
+ * @i: integer value to add
+ * @l: pointer to type local_t
+ *
+ * Atomically adds @i to @l and returns @i + @l
+ */
+static inline long local_add_return(long i, local_t *l)
+{
+   long __i;
+#ifdef CONFIG_M386
+   unsigned long flags;
+   if(unlikely(boot_cpu_data.x86 <= 3))
+   goto no_xadd;
+#endif
+   /* Modern 486+ processor including X86_64*/
+   __i = i;
+   __asm__ __volatile__(
+   ASM_XADD %0, %1;
+   :+r (i), +m (l-a.counter)
+   : : memory);
+   return i + __i;
+
+#ifdef CONFIG_M386
+no_xadd: /* Legacy 386 processor */
+   local_irq_save(flags);
+   __i = local_read(l);
+   local_set(l, i + __i);
+   local_irq_restore(flags);
+   return i + __i;
 #endif
+}
+
+static inline long local_sub_return(long i, local_t *l)
+{
+   return local_add_return(-i,l);
+}
 
 #define local_inc_return(l)  (local_add_return(1,l))
 #define local_dec_return(l)  (local_sub_return(1,l))
diff --git a/include/asm-x86/local_32.h b/include/asm-x86/local_32.h
deleted file mode 100644
index ff6d1d2..000
--- a/include/asm-x86/local_32.h
+++ /dev/null
@@ -1,150 +0,0 @@
-#ifndef _ARCH_I386_LOCAL_H
-#define _ARCH_I386_LOCAL_H
-
-static inline void local_inc(local_t *l)
-{
-   __asm__ __volatile__(
-   ASM_INC %0
-   :+m (l-a.counter));
-}
-
-static inline

Re: [patch 1/2] [RFC] Simple tamper-proof device filesystem.

2007-12-16 Thread Tetsuo Handa


> But use of this filesystem is still valid when this filesystem is used with
> policy based mandatory access control (such as SELinux, TOMOYO Linux)
> because this filesystem guarantees where policy based mandatory access control
> can't guarantee (i.e. filename and its attribute).

Policy-based mandatory access control can guarantee that
"only Bob can create a block device file named sda1 in the /dev directory".
But it can't guarantee that /dev/sda1 will have the block-8-1 attribute.
If Bob is malicious and creates /dev/sda1 with the block-8-2 attribute,
other applications that depend on the attributes of /dev/sda1 will go wrong.
So, this filesystem guarantees that /dev/sda1 has the block-8-1 attribute.
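
Editorial sketch of the guarantee described above, under loud assumptions: the message does not say how the filesystem stores or checks its rules, so the fixed table, the demo_ names and the lookup below are purely illustrative. The only thing taken from the message is the idea that a name such as sda1 is tied to one device type and one major/minor pair.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum demo_dev_type { DEMO_CHAR, DEMO_BLOCK };

/* Hypothetical rule: a file name bound to one type and device number. */
struct demo_dev_rule {
	const char *name;
	enum demo_dev_type type;
	unsigned int maj;
	unsigned int mnr;
};

/* Example rules only; "block-8-1" in the message maps to { BLOCK, 8, 1 }. */
static const struct demo_dev_rule demo_rules[] = {
	{ "sda1", DEMO_BLOCK, 8, 1 },
	{ "sda2", DEMO_BLOCK, 8, 2 },
};

/* Allow creation only if the name is known and every attribute matches. */
static bool demo_create_allowed(const char *name, enum demo_dev_type type,
				unsigned int maj, unsigned int mnr)
{
	size_t i;

	for (i = 0; i < sizeof(demo_rules) / sizeof(demo_rules[0]); i++) {
		const struct demo_dev_rule *r = &demo_rules[i];

		if (strcmp(r->name, name) == 0)
			return r->type == type && r->maj == maj && r->mnr == mnr;
	}
	return false;	/* unknown names are refused in this sketch */
}
```

In this model Bob may create sda1 as block 8:1, but an attempt to create sda1 as block 8:2 is rejected even though his access-control policy permits creating the name.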

Re: oops with 2.6.23.1, marvel, software raid, reiserfs and samba

2007-12-16 Thread Herbert Xu

On Sun, Dec 16, 2007 at 07:56:56PM +0800, Herbert Xu wrote:

 What's spooky is that I just did a google and we've had reports
 since 1998 all crashing on exactly the same line in tcp_recvmsg.

However, there have been no reports at all since 2000 apart from this
one so the earlier ones are probably not related.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Re: [HP6XX/FIX/PATCH] - Fix bad default keymap in HP Jornada 6xx keyboard driver

2007-12-16 Thread Paul Mundt

On Wed, Dec 12, 2007 at 07:54:52PM +0100, Kristoffer Ericson wrote:
 On Thu, 13 Dec 2007 03:45:58 +0900
 Paul Mundt [EMAIL PROTECTED] wrote:
  On Wed, Dec 12, 2007 at 07:22:07PM +0100, Kristoffer Ericson wrote:
   * This patch fixes the HP Jornada 6xx keyboard default keymap which
   had some bad keymap values. This resulted in wrong key being
   returned when pressed (example : key y returned 'r').
   
  You do realize that the default keymap was written for the Japanese units
  and the Japanese keyboards, right? From the looks of it, you are just
  trying to swap one functional set for another. I can assure you that this
  keymap worked fine on the Japanese units, so calling it a bug is a bit
  misleading.
 
 Mostly true, yes. However, a few errors crept in simply because I copied
 the keymap poorly in the initial version. So it does in fact have 'bug'
 keys that wouldn't work properly on Japanese, European or US
 Jornadas. And whatever the functional set, this patch fixes those bugs.
 
Ah, ok, so it's a problem with the new driver, rather than something
that's always been broken. No objections then, thanks for clearing that
up.

Re: [RANDOM] Move two variables to read_mostly section to save memory

2007-12-16 Thread Adrian Bunk

On Sun, Dec 16, 2007 at 12:45:01PM +0100, Eric Dumazet wrote:
 While examining vmlinux namelist on i686, I noticed :

 c0581300 D random_table
 c0581480 d input_pool
 c0581580 d random_read_wakeup_thresh
 c0581584 d random_write_wakeup_thresh
 c0581600 d blocking_pool

 That means that the two integers random_read_wakeup_thresh and 
 random_write_wakeup_thresh use a full cache entry (128 bytes).

 Moving them to the read_mostly section can shrink vmlinux by 120 bytes.

 # size vmlinux*
    text    data     bss     dec     hex filename
 4835553  450210  610304 5896067  59f783 vmlinux.after_patch
 4835553  450330  610304 5896187  59f7fb vmlinux.before_patch

 Signed-off-by: Eric Dumazet [EMAIL PROTECTED]

 diff --git a/drivers/char/random.c b/drivers/char/random.c
 index 5fee056..af48e86 100644
 --- a/drivers/char/random.c
 +++ b/drivers/char/random.c
 @@ -256,14 +256,14 @@
   * The minimum number of bits of entropy before we wake up a read on
   * /dev/random.  Should be enough to do a significant reseed.
   */
 -static int random_read_wakeup_thresh = 64;
 +static int random_read_wakeup_thresh __read_mostly = 64;
  
  /*
   * If the entropy count falls under this number of bits, then we
   * should wake up processes which are selecting or polling on write
   * access to /dev/random.
   */
 -static int random_write_wakeup_thresh = 128;
 +static int random_write_wakeup_thresh __read_mostly = 128;

Please never ever do such ugly and unmaintainable micro-optimizations in 
the code unless you can show a measurable performance improvement of the 
kernel.

cu
Adrian

-- 

   Is there not promise of rain? Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   Only a promise, Lao Er said.
   Pearl S. Buck - Dragon Seed


[PATCH] x86: provide a DMI based port 0x80 I/O delay override.

2007-12-16 Thread Rene Herman


On 16-12-07 00:51, H. Peter Anvin wrote:


Rene Herman wrote:


I hope this is considered half-way correct/sane (note by the way that 
it's not a good idea to switch a native_io_delay_port value since 
plugging in a variable port would clobber register dx for every 
outb_p, which would then have to be reloaded for the next outb again). 
Comments appreciated.


That actually wouldn't be that big of a deal.  Switching values in and 
out of registers is dirt cheap (and MUCH cheaper than an indirect 
function call)


Well, I suppose. With stuff inline, constantly reloading dx also bloats 
things up a bit but yes, out of line who cares. Do you think this version is 
better?


Note, however, that your code doesn't deal with io_delay()'s in the boot 
code (arch/x86/boot) at all, nor (obviously) io_delay()'s in boot 
loaders.  In the boot code, access to DMI data is NOT available (we 
can't even use the INT 15h mover if we want to continue to support 
Loadlin.)


In the boot code, io_delay() is used to slow down accesses to the KBC, 
interrupt controller, INT13h logic, and the NMI gate, and to provide a 
fixed delay during A20 stabilization.


Thanks for the heads up (also saw the SMBIOS update to this) but those don't 
seem to be a problem in fact. David Reed has been running with the simple 
udelay(2) version of this and reported no more hangs. He moreover reported 
no trouble after booting with acpi=off meaning that things seem to be fine 
pre-acpi which the boot code (and this io_delay_init) is. So I believe we 
get to ignore those.


David: I've plugged in your DMI values in this. Could you perhaps test this 
to confirm that it works for you?


Any ACKs, NAKs or further comments from others in this thread also welcome.

Changelog in the patch.

 arch/x86/boot/compressed/misc_32.c |    8 ++---
 arch/x86/boot/compressed/misc_64.c |    8 ++---
 arch/x86/kernel/Makefile_32        |    2 -
 arch/x86/kernel/Makefile_64        |    2 -
 arch/x86/kernel/io_delay.c         |   54 +
 arch/x86/kernel/setup_32.c         |    2 +
 arch/x86/kernel/setup_64.c         |    2 +
 include/asm-x86/io_32.h            |   17 ++-
 include/asm-x86/io_64.h            |   23 ++-
 9 files changed, 80 insertions(+), 38 deletions(-)

Rene.
commit a17ccb1964b53fd4ab00d501b7f229a9a6cf91d1
Author: Rene Herman [EMAIL PROTECTED]
Date:   Sun Dec 16 13:36:39 2007 +0100

x86: provide a DMI based port 0x80 I/O delay override.

Certain (HP) laptops experience trouble from our port 0x80 I/O delay
writes. This patch provides for a DMI based switch to the alternate
diagnostic port 0xed (as used by some BIOSes as well) for these.

David P. Reed confirmed that port 0xed works for him and provides a
proper delay. The symptoms of _not_ working are a hanging machine,
with hwclock use being a direct trigger.

Earlier versions of this attempted to simply use udelay(2), with the
2 being a value tested to be a nicely conservative upper-bound with
help from many on the linux-kernel mailinglist, but that approach has
two problems.

First, pre-loops_per_jiffy calibration (which is post PIT init while
some implementations of the PIT are actually one of the historically
problematic devices that need the delay) udelay() isn't particularly
well-defined. We could initialise loops_per_jiffy conservatively (and
based on CPU family so as to not unduly delay old machines) which
would sort of work, but still leaves problem 2.

Second, delaying isn't the only effect that a write to port 0x80 has.
It's also a PCI posting barrier which some devices may be explicitly
or implicitly relying on. Alan Cox did a survey and found evidence
that additionally some drivers may be racy on SMP without the bus
locking outb.

Switching to an inb() makes the timing too unpredictable and as such,
this DMI based switch should be the safest approach for now. Any more
invasive changes should get more rigid testing first. It's moreover
only very few machines with the problem and a DMI based hack seems
to fit that situation.

This does not change the io_delay() in the boot code which is using
the same port 0x80 I/O delay but those do not appear to be a problem
as David P. Reed reported the problem was already gone after using the
udelay(2) version of this. He moreover reported that booting with
acpi=off also fixed things and seeing as how ACPI isn't touched
until after this DMI based I/O port switch I believe it's safe to
leave the ones in the boot code be.

The DMI strings from David's HP Pavilion dv9000z are in there already
and we need to get/verify the DMI info from other machines with the
problem, notably the HP Pavilion dv6000z.

This patch is partly based on earlier patches from Pavel Machek and
David P. Reed.

Signed-off-by: Rene
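
The port selection the changelog describes could look roughly like the following userspace sketch. This is not the code from the patch: `struct io_delay_quirk`, `io_delay_port_for()` and the "Hewlett-Packard" vendor string are assumptions made for illustration, and matching on the product name is a simplification of whatever DMI fields the real table uses. Only the two port numbers and the dv9000z model name come from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define IO_DELAY_PORT_DEFAULT	0x80	/* traditional diagnostic port */
#define IO_DELAY_PORT_ALTERNATE	0xed	/* alternate port named in the changelog */

/* Hypothetical quirk entry keyed on vendor and product substrings. */
struct io_delay_quirk {
	const char *vendor;
	const char *product;
};

/* Illustrative only: the real DMI strings live in the patch. */
static const struct io_delay_quirk io_delay_quirks[] = {
	{ "Hewlett-Packard", "HP Pavilion dv9000z" },
};

/* Pick the delay port once at setup time, based on the DMI identity. */
static unsigned short io_delay_port_for(const char *vendor,
					const char *product)
{
	size_t i;

	assert(vendor != NULL && product != NULL);

	for (i = 0; i < sizeof(io_delay_quirks) / sizeof(io_delay_quirks[0]); i++) {
		const struct io_delay_quirk *q = &io_delay_quirks[i];

		if (strstr(vendor, q->vendor) && strstr(product, q->product))
			return IO_DELAY_PORT_ALTERNATE;
	}
	return IO_DELAY_PORT_DEFAULT;
}
```

Machines that match no entry keep the 0x80 write, so its delay and posting-barrier side effects stay unchanged for everyone else.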

[PATCH] x86: Unify kexec_{32|64}.h

2007-12-16 Thread Harvey Harrison

One section collecting all constant defines.  Ifdef the asm
blocks for X86_32/64.

Signed-off-by: Harvey Harrison [EMAIL PROTECTED]
---
 include/asm-x86/kexec.h|  169 +++-
 include/asm-x86/kexec_32.h |   99 --
 include/asm-x86/kexec_64.h |   94 
 3 files changed, 167 insertions(+), 195 deletions(-)

diff --git a/include/asm-x86/kexec.h b/include/asm-x86/kexec.h
index 718ddbf..c90d3c7 100644
--- a/include/asm-x86/kexec.h
+++ b/include/asm-x86/kexec.h
@@ -1,5 +1,170 @@
+#ifndef _KEXEC_H
+#define _KEXEC_H
+
 #ifdef CONFIG_X86_32
-# include "kexec_32.h"
+# define PA_CONTROL_PAGE   0
+# define VA_CONTROL_PAGE   1
+# define PA_PGD            2
+# define VA_PGD            3
+# define PA_PTE_0  4
+# define VA_PTE_0  5
+# define PA_PTE_1  6
+# define VA_PTE_1  7
+# ifdef CONFIG_X86_PAE
+#  define PA_PMD_0 8
+#  define VA_PMD_0 9
+#  define PA_PMD_1 10
+#  define VA_PMD_1 11
+#  define PAGES_NR 12
+# else
+#  define PAGES_NR 8
+# endif
 #else
-# include "kexec_64.h"
+# define PA_CONTROL_PAGE   0
+# define VA_CONTROL_PAGE   1
+# define PA_PGD            2
+# define VA_PGD            3
+# define PA_PUD_0  4
+# define VA_PUD_0  5
+# define PA_PMD_0  6
+# define VA_PMD_0  7
+# define PA_PTE_0  8
+# define VA_PTE_0  9
+# define PA_PUD_1  10
+# define VA_PUD_1  11
+# define PA_PMD_1  12
+# define VA_PMD_1  13
+# define PA_PTE_1  14
+# define VA_PTE_1  15
+# define PA_TABLE_PAGE 16
+# define PAGES_NR  17
 #endif
+
+#ifndef __ASSEMBLY__
+
+#include <linux/string.h>
+
+#include <asm/page.h>
+#include <asm/ptrace.h>
+
+/*
+ * KEXEC_SOURCE_MEMORY_LIMIT maximum page get_free_page can return.
+ * I.e. Maximum page that is mapped directly into kernel memory,
+ * and kmap is not required.
+ *
+ * So far x86_64 is limited to 40 physical address bits.
+ */
+#ifdef CONFIG_X86_32
+/* Maximum physical address we can use pages from */
+# define KEXEC_SOURCE_MEMORY_LIMIT (-1UL)
+/* Maximum address we can reach in physical address mode */
+# define KEXEC_DESTINATION_MEMORY_LIMIT (-1UL)
+/* Maximum address we can use for the control code buffer */
+# define KEXEC_CONTROL_MEMORY_LIMIT TASK_SIZE
+
+# define KEXEC_CONTROL_CODE_SIZE   4096
+
+/* The native architecture */
+# define KEXEC_ARCH KEXEC_ARCH_386
+
+/* We can also handle crash dumps from 64 bit kernel. */
+# define vmcore_elf_check_arch_cross(x) ((x)->e_machine == EM_X86_64)
+#else
+/* Maximum physical address we can use pages from */
+# define KEXEC_SOURCE_MEMORY_LIMIT  (0xFFUL)
+/* Maximum address we can reach in physical address mode */
+# define KEXEC_DESTINATION_MEMORY_LIMIT (0xFFUL)
+/* Maximum address we can use for the control pages */
+# define KEXEC_CONTROL_MEMORY_LIMIT (0xFFUL)
+
+/* Allocate one page for the pdp and the second for the code */
+# define KEXEC_CONTROL_CODE_SIZE  (4096UL + 4096UL)
+
+/* The native architecture */
+# define KEXEC_ARCH KEXEC_ARCH_X86_64
+#endif
+
+/*
+ * CPU does not save ss and sp on stack if execution is already
+ * running in kernel mode at the time of NMI occurrence. This code
+ * fixes it.
+ */
+static inline void crash_fixup_ss_esp(struct pt_regs *newregs,
+ struct pt_regs *oldregs)
+{
+#ifdef CONFIG_X86_32
+   newregs->sp = (unsigned long)&(oldregs->sp);
+   __asm__ __volatile__(
+   "xorl %%eax, %%eax\n\t"
+   "movw %%ss, %%ax\n\t"
+   :"=a"(newregs->ss));
+#endif
+}
+
+/*
+ * This function is responsible for capturing register states if coming
+ * via panic otherwise just fix up the ss and sp if coming via kernel
+ * mode exception.
+ */
+static inline void crash_setup_regs(struct pt_regs *newregs,
+   struct pt_regs *oldregs)
+{
+   if (oldregs) {
+   memcpy(newregs, oldregs, sizeof(*newregs));
+   crash_fixup_ss_esp(newregs, oldregs);
+   } else {
+#ifdef CONFIG_X86_32
+   __asm__ __volatile__("movl %%ebx,%0" : "=m"(newregs->bx));
+   __asm__ __volatile__("movl %%ecx,%0" : "=m"(newregs->cx));
+   __asm__ __volatile__("movl %%edx,%0" : "=m"(newregs->dx));
+   __asm__ __volatile__("movl %%esi,%0" : "=m"(newregs->si));
+   __asm__ __volatile__("movl %%edi,%0" : "=m"(newregs->di));
+   __asm__ __volatile__("movl %%ebp,%0" : "=m"(newregs->bp));
+   __asm__ __volatile__("movl %%eax,%0" : "=m"(newregs->ax));
+   __asm__ __volatile__("movl %%esp,%0" : "=m"(newregs->sp));
+   __asm__ __volatile__("movl %%ss, %%eax;" :"=a"(newregs->ss));
+   __asm__

[BUG] NMI Watchdog alert with Linux 2.6.23.11

2007-12-16 Thread Chris Rankin

Hi,

My dual Xeon P4 (HT enabled), 2 GB RAM box crashed last night while playing
World of Warcraft under Wine (Mesa 7.1, Radeon 9550 card). This is what
appeared on the serial console.

Cheers,
Chris

BUG: NMI Watchdog detected LOCKUP on CPU3, eip c0102aac, registers:
CPU:3
EIP:0060:[c0102aac]Not tainted VLI
EFLAGS: 0246   (2.6.23.11 #1)
EIP is at default_idle+0x2c/0x3e
eax:    ebx: c0102a80   ecx: 01cef000   edx: 002694b3
esi: 0003   edi:    ebp:    esp: f7c0bfac
ds: 007b   es: 007b   fs: 00d8  gs:   ss: 0068
Process swapper (pid: 0, ti=f7c0b000 task=f7c23540 task.ti=f7c0b000)
Stack: c010239e 0702080b       
       00d8    
        
Call Trace:
 [c010239e] cpu_idle+0x97/0xcc
 ===
Code: 3d 28 39 35 c0 00 75 32 80 3d 85 ad 30 c0 00 74 29 89 e0 25 00 f0 ff ff 83
 60 0c fd 0f ae f0 89 f6 fa 8b 40 08 a8 04 75 04 fb f4 eb 01 fb 89 e0 25 00 f0
 ff ff 83 48 0c 02 c3 f3 90 c3 55 57 56 




Re: soft lockup - CPU#1 stuck for 15s! [swapper:0]

2007-12-16 Thread Parag Warudkar

On Dec 16, 2007 12:15 AM, Parag Warudkar [EMAIL PROTECTED] wrote:
 On Sat, 15 Dec 2007, Parag Warudkar wrote:

  I will run it for a little longer just to be sure - but I don't think it
  will be a problem.

 No problems for last 10 hours - I consider this fixed.


Arghh - spoke 8 hours too soon. I left it running overnight and
morning I see a bunch of softlockups - so NO NOT FIXED.

Parag

BUG: soft lockup - CPU#1 stuck for 13s! [swapper:0]

Pid: 0, comm: swapper Not tainted (2.6.24-rc5 #21)
EIP: 0060:[c0533ca6] EFLAGS: 0206 CPU: 1
EIP is at acpi_idle_enter_simple+0x166/0x1d0
EAX: f7829f88 EBX: 0dab ECX: 0266 EDX: 
ESI:  EDI: 00c056e5 EBP: 00c0493a ESP: f7829f88
 DS: 007b ES: 007b FS: 00d8 GS:  SS: 0068
CR0: 8005003b CR2: 080f6c78 CR3: 00718000 CR4: 06d0
DR0:  DR1:  DR2:  DR3: 
DR6: 0ff0 DR7: 0400

Re: [Bug 9182] Critical memory leak (dirty pages)

2007-12-16 Thread Krzysztof Oledzki




On Sun, 16 Dec 2007, Andrew Morton wrote:


On Sun, 16 Dec 2007 10:33:20 +0100 (CET) Krzysztof Oledzki [EMAIL PROTECTED] 
wrote:




On Sat, 15 Dec 2007, Andrew Morton wrote:


On Sun, 16 Dec 2007 00:08:52 +0100 (CET) Krzysztof Oledzki [EMAIL PROTECTED] 
wrote:




On Sat, 15 Dec 2007, [EMAIL PROTECTED] wrote:


http://bugzilla.kernel.org/show_bug.cgi?id=9182


--- Comment #33 from [EMAIL PROTECTED]  2007-12-15 14:19 ---
Krzysztof, I'd hate to point you down a hard (or at least time-consuming) path, but
you've done a lot of digging by now anyway. How about git bisecting between
2.6.20-rc2 and rc1? Here is great info on bisecting:
http://www.kernel.org/doc/local/git-quick.html


As I'm smarter than git-bisect I can tell that 2.6.20-rc1-git8 is as bad
as 2.6.20-rc2 but 2.6.20-rc1-git8 with one patch reverted seems to be OK.
So it took me only 2 reboots. ;)

The guilty patch is the one I proposed just an hour ago:
  
http://git.kernel.org/?p=linux%2Fkernel%2Fgit%2Fstable%2Flinux-2.6.20.y.git;a=commitdiff_plain;h=fba2591bf4e418b6c3f9f8794c9dd8fe40ae7bd9

So:
  - 2.6.20-rc1: OK
  - 2.6.20-rc1-git8 with fba2591bf4e418b6c3f9f8794c9dd8fe40ae7bd9 reverted: OK
  - 2.6.20-rc1-git8: very BAD
  - 2.6.20-rc2: very BAD
  - 2.6.20-rc4: very BAD
  - >= 2.6.20: BAD (but not *very* BAD!)



well..  We have code which has been used by *everyone* for a year and it's
misbehaving for you alone.


No, not for me alone. Probably only Thomas Osterried and I have systems
where it is so easy to reproduce. Please note that the problem exists on
all my systems, but only on one is it critical. It is enough to run
"sync; sleep 1; sync; sleep 1; sync; grep Dirty /proc/meminfo" to be sure.
With >=2.6.20-rc1-git8 it *never* falls to 0 on *all* my hosts, but only
on one does it grow to ~200MB in about 2 weeks, and then everything dies:
http://bugzilla.kernel.org/attachment.cgi?id=13824
http://bugzilla.kernel.org/attachment.cgi?id=13825
http://bugzilla.kernel.org/attachment.cgi?id=13826
http://bugzilla.kernel.org/attachment.cgi?id=13827
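
The check described above (run sync a few times, then look at the Dirty line of /proc/meminfo) boils down to reading one number. A minimal sketch, assuming the usual "Dirty: <n> kB" line format; `meminfo_dirty_kb()` is a hypothetical helper that works on text already read from the file, so it can be tried without touching a live system.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Return the value of the "Dirty:" line in kB from /proc/meminfo-style
 * text, or -1 if no such line is found.
 */
static long meminfo_dirty_kb(const char *meminfo)
{
	const char *line = meminfo;
	long kb;

	assert(meminfo != NULL);

	while (line && *line) {
		/* The format must match from the start of the line. */
		if (sscanf(line, "Dirty: %ld kB", &kb) == 1)
			return kb;
		line = strchr(line, '\n');
		if (line)
			line++;	/* step past the newline to the next line */
	}
	return -1;
}
```

On a healthy system the value should drop to 0 after the syncs; a value that stays above 0 and keeps growing over days is the symptom reported here.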


 I wonder what you're doing that is different/special.

Me too. :|


Which filesystem, which mount options


  - ext3 on RAID1 (MD): / - rootflags=data=journal


It wouldn't surprise me if this is specific to data=journal: that
journalling mode is pretty complex wrt dirty-data handling and isn't well
tested.

Does switching that to data=writeback change things?


I'll confirm this tomorrow but it seems that even switching to 
data=ordered (AFAIK the default on ext3) is indeed enough to cure this problem.


Two questions remain then: why does the system die when dirty reaches ~200MB 
and what is wrong with ext3+data=journal with >=2.6.20-rc2?


Best regards,

Krzysztof Olędzki

Re: [PATCH 3/3] net: wireless: bcm43xx: big_buffer_sem semaphore to mutex

2007-12-16 Thread Johannes Berg


On Sun, 2007-12-16 at 00:27 +0100, Michael Buesch wrote:
 On Sunday 16 December 2007 00:18:43 Rafael J. Wysocki wrote:
  Well, the only problem with that is I suspect there are some newer cards
  that work better with v3 firmware, although they are supposed to support both.

Impossible. The firmware is only the MAC.

johannes



Re: [PATCH 3/3] net: wireless: bcm43xx: big_buffer_sem semaphore to mutex

2007-12-16 Thread Rafael J. Wysocki

On Sunday, 16 of December 2007, Johannes Berg wrote:
 
 On Sun, 2007-12-16 at 00:27 +0100, Michael Buesch wrote:
  On Sunday 16 December 2007 00:18:43 Rafael J. Wysocki wrote:
   Well, the only problem with that is I suspect there are some newer cards
   that work better with v3 firmware, although they are supposed to support both.
 
 Impossible. The firmware is only the MAC.

OK

Thanks,
Rafael

Re: [PATCH 3/3] net: wireless: bcm43xx: big_buffer_sem semaphore to mutex

2007-12-16 Thread Johannes Berg


  On Sun, 2007-12-16 at 00:27 +0100, Michael Buesch wrote:
   On Sunday 16 December 2007 00:18:43 Rafael J. Wysocki wrote:
    Well, the only problem with that is I suspect there are some newer cards
    that work better with v3 firmware, although they are supposed to support both.
  
  Impossible. The firmware is only the MAC.
 
 OK

I should probably mention though that of course it is (in theory!)
possible that the card works better with bcm43xx, it just never has
happened for all I know.

johannes



Re: [RANDOM] Move two variables to read_mostly section to save memory

2007-12-16 Thread Eric Dumazet


Adrian Bunk wrote:

On Sun, Dec 16, 2007 at 12:45:01PM +0100, Eric Dumazet wrote:

While examining vmlinux namelist on i686, I noticed :

c0581300 D random_table
c0581480 d input_pool
c0581580 d random_read_wakeup_thresh
c0581584 d random_write_wakeup_thresh
c0581600 d blocking_pool

That means that the two integers random_read_wakeup_thresh and 
random_write_wakeup_thresh use a full cache entry (128 bytes).


Moving them to the read_mostly section can shrink vmlinux by 120 bytes.

# size vmlinux*
   text    data     bss     dec     hex filename
4835553  450210  610304 5896067  59f783 vmlinux.after_patch
4835553  450330  610304 5896187  59f7fb vmlinux.before_patch

Signed-off-by: Eric Dumazet [EMAIL PROTECTED]



diff --git a/drivers/char/random.c b/drivers/char/random.c
index 5fee056..af48e86 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -256,14 +256,14 @@
  * The minimum number of bits of entropy before we wake up a read on
  * /dev/random.  Should be enough to do a significant reseed.
  */
-static int random_read_wakeup_thresh = 64;
+static int random_read_wakeup_thresh __read_mostly = 64;
 
 /*

  * If the entropy count falls under this number of bits, then we
  * should wake up processes which are selecting or polling on write
  * access to /dev/random.
  */
-static int random_write_wakeup_thresh = 128;
+static int random_write_wakeup_thresh __read_mostly = 128;


Please never ever do such ugly and unmaintainable micro-optimizations in 
the code unless you can show a measurable performance improvement of the 
kernel.


You seem to be confusing speed micro-optimizations with memory 
savings. This patch has nothing to do with speed optimization. No 
tradeoff is involved here that would call for a performance measurement study.


I copied you on this patch because of your recent proposal to remove read_mostly 
from the Linux kernel.


Only you find read_mostly ugly and unmaintainable. I find it far more useful 
than static attributes.


I find 120 bytes is a measurable gain, thank you.
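
For reference, the 120 bytes follow directly from the symbol addresses quoted above. A small sketch of the arithmetic, with `padding_bytes()` as a hypothetical helper name: the two ints start at c0581580, the next symbol (blocking_pool) starts at c0581600, and only 8 of the 128 bytes in between are actually used.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes of padding between a group of variables and the next symbol:
 * the distance between the two addresses minus the bytes actually used.
 */
static uint32_t padding_bytes(uint32_t group_start, uint32_t next_symbol,
			      uint32_t bytes_used)
{
	assert(next_symbol >= group_start);
	assert(next_symbol - group_start >= bytes_used);
	return next_symbol - group_start - bytes_used;
}
```

Calling `padding_bytes(0xc0581580, 0xc0581600, 8)` gives 120, which is the same figure as the difference in the data column of the size output (450330 - 450210).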



Re: oops with 2.6.23.1, marvel, software raid, reiserfs and samba

2007-12-16 Thread jeffunit


At 03:05 AM 12/16/2007, Andrew Morton wrote:

On Fri, 07 Dec 2007 19:49:52 -0800 jeffunit [EMAIL PROTECTED] wrote:

 I am running linux kernel 2.6.23.1, which I compiled.
 The base system was mandriva 2008.

 I have a dual processor pentium III 933 system.
 It has 3gb of ram, an intel stl-2 motherboard.
 It also has a promise 100 tx2 pata controller,
 a supermicro marvell based 8 port pcix sata controller,
 and a nvidia pci based video card.

 I have the os on a pata drive, and have made a software raid array
 consisting of 4 sata drives attached to the pcix sata controller.
 I created the array, and formatted with reiserfs 3.6
 I have run bonnie++ (filesystem benchmark) on the array without incident.
 When I use samba-3.0.25b-4.3 and copy files from a windows machine to
 the fileserver,
 every so often, the fileserver crashes or hangs. It seems to happen
 more often under heavy samba traffic.
 Enclosed is the oops from syslog.
 I also have a 'kernel bug' from syslog if that would be helpful.

 jeff


 Dec  7 17:20:52 sata_fileserver kernel: BUG: unable to handle kernel
 NULL pointer dereference at virtual address 000d
 Dec  7 17:20:52 sata_fileserver kernel:  printing eip:
 Dec  7 17:20:52 sata_fileserver kernel: c02cc820
 Dec  7 17:20:52 sata_fileserver kernel: *pde = 
 Dec  7 17:20:52 sata_fileserver kernel: Oops:  [#1]
 Dec  7 17:20:52 sata_fileserver kernel: SMP
 Dec  7 17:20:52 sata_fileserver kernel: Modules linked in: raid456
 async_xor async_memcpy async_tx xor iptable_raw xt_comment xt_policy
 xt_multiport ipt_ULOG ipt_TTL ipt_ttl ipt_TOS ipt_tos ipt_SAME
 ipt_REJECT ipt_REDIRECT ipt_recent ipt_owner ipt_NETMAP
 ipt_MASQUERADE ipt_LOG ipt_iprange ipt_ECN ipt_ecn ipt_CLUSTERIP
 ipt_ah ipt_addrtype nf_nat_tftp nf_nat_snmp_basic nf_nat_sip
 nf_nat_pptp nf_nat_proto_gre nf_nat_irc nf_nat_h323 nf_nat_ftp
 nf_nat_amanda ts_kmp nf_conntrack_amanda nf_conntrack_tftp
 nf_conntrack_sip nf_conntrack_proto_sctp nf_conntrack_pptp
 nf_conntrack_proto_gre nf_conntrack_netlink nf_conntrack_netbios_ns
 nf_conntrack_irc nf_conntrack_h323 nf_conntrack_ftp xt_tcpmss
 xt_pkttype xt_physdev xt_NFQUEUE xt_NFLOG xt_MARK xt_mark xt_mac
 xt_limit xt_length xt_helper xt_hashlimit ip6_tables xt_dccp
 xt_conntrack xt_CONNMARK xt_connmark xt_CLASSIFY nfsd xt_tcpudp
 exportfs auth_rpcgss xt_state iptable_nat nf_nat nf_conntrack_ipv4
 nf_conntrack nfs iptable_mangle lockd nfs_acl sunrpc nfnetlink
 iptable_filter ip_table
 Dec  7 17:20:52 sata_fileserver kernel:  x_tables af_packet ipv6
 snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_pcm_oss
 snd_mixer_oss ipmi_si ipmi_msghandler binfmt_misc loop nls_utf8 ntfs
 dm_mod usb_storage sg sd_mod sata_mv libata scsi_mod video output
 thermal sbs processor fan container button dock battery ac floppy
 snd_emu10k1 snd_rawmidi snd_ac97_codec ac97_bus snd_pcm
 snd_seq_device snd_timer snd_page_alloc snd_util_mem snd_hwdep
 ehci_hcd snd ohci_hcd i2c_piix4 uhci_hcd soundcore e1000 sworks_agp
 i2c_core ide_cd usbcore agpgart emu10k1_gp gameport tsdev evdev
 reiserfs ide_disk serverworks pdc202xx_new ide_core
 Dec  7 17:20:52 sata_fileserver kernel: CPU:1
 Dec  7 17:20:52 sata_fileserver kernel:
 EIP:0060:[c02cc820]Not tainted VLI
 Dec  7 17:20:52 sata_fileserver kernel: EFLAGS: 00210202   (2.6.23.1 #1)
 Dec  7 17:20:52 sata_fileserver kernel: EIP is at tcp_recvmsg+0x150/0xbf0
 Dec  7 17:20:52 sata_fileserver kernel: eax:    ebx:
 f55c4b60   ecx: 784e2c7c   edx: f63f63d8
 Dec  7 17:20:52 sata_fileserver kernel: esi: 784e2c7a   edi:
 f63f614c   ebp: e21fde24   esp: e21fddc4
 Dec  7 17:20:52 sata_fileserver kernel: ds: 007b   es: 007b   fs:
 00d8  gs: 0033  ss: 0068
 Dec  7 17:20:52 sata_fileserver kernel: Process smbd (pid: 9524,
 ti=e21fc000 task=f5109000 task.ti=e21fc000)
 Dec  7 17:20:52 sata_fileserver kernel: Stack:  
  c13e5740 f557b000 c03fa300  e21fde90
 Dec  7 17:20:52 sata_fileserver kernel:f63f60e0 
 0b64 f63f63d8 05b4 0001  
 Dec  7 17:20:52 sata_fileserver kernel: 05b4
 e21fde4c 7fff e21fde28  c03a4de0 e21fde90
 Dec  7 17:20:52 sata_fileserver kernel: Call Trace:
 Dec  7 17:20:53 sata_fileserver kernel:  [c010542a]
 show_trace_log_lvl+0x1a/0x30
 Dec  7 17:20:53 sata_fileserver kernel:  [c01054eb]
 show_stack_log_lvl+0xab/0xd0
 Dec  7 17:20:53 sata_fileserver kernel:  [c01056e1]
 show_registers+0x1d1/0x2d0
 Dec  7 17:20:53 sata_fileserver kernel:  [c01058f6] die+0x116/0x250
 Dec  7 17:20:53 sata_fileserver kernel:  [c011f52b] 
do_page_fault+0x28b/0x6a0

 Dec  7 17:20:53 sata_fileserver kernel:  [c030938a] error_code+0x72/0x78
 Dec  7 17:20:53 sata_fileserver kernel:  [c0295423]
 sock_common_recvmsg+0x43/0x60
 Dec  7 17:20:53 sata_fileserver kernel:  [c029301c] 
sock_aio_read+0x11c/0x130
 Dec  7 17:20:53 sata_fileserver kernel:  [c017db30] 
do_sync_read+0xd0/0x110

 Dec  7 17:20:53 sata_fileserver kernel:  [c017e47d]

Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override.

2007-12-16 Thread Ingo Molnar


* Rene Herman [EMAIL PROTECTED] wrote:

 Any ACKs, NAKs or further comments from others in this thread also 
 welcome.

looks good to me. Could you please also provide three more controls that 
i suggested earlier:

 - a boot option enabling/disabling the udelay based code
 - a .config method of enabling/disabling the udelay based code
 - a sysctl to toggle it

if we want to clean this all up we'll need as many controls as possible.

Ingo

Re: [PATCH 4/4] x86: Final unification of local_{32|64}.h

2007-12-16 Thread Ingo Molnar


* Harvey Harrison [EMAIL PROTECTED] wrote:

 No differences except for the definition of local_add_return on X86_64. 
 The X86_32 version is just fine as it is protected with ifdef 
 CONFIG_M386 so use it directly.

thanks, i've applied your 4 patches to x86.git.

btw., now that we have a single unified file, it might make sense to fix 
these checkpatch complaints:

  total: 10 errors, 1 warnings, 257 lines checked

in case you are interested ;-)

Ingo

[PATCH] [RFC] be more verbose when probing EDD

2007-12-16 Thread devzero

Hello!

i've been a sysadmin for quite some time, and over that time i have come across 
the occasional system which refused to boot a linux kernel. the typical symptom i 
have seen is a blinking cursor in the upper left just after kernel/initrd were 
loaded.

i never spent much time on that and either chose another system for linux or 
chose the failsafe option, if the installer of my favourite distro offered 
one.

since a colleague of mine was hit by that problem some weeks ago and i also 
came across it again recently, i started to investigate deeper and found 
that the EDD Bios probe is the problem here.

i found more than a handful of references on the net where people report 
similar problems. many discussion threads contained the typical half-informed 
guessing, and only seldom did somebody give the essential hint "try edd=off", 
which i'm sure would have helped many times.

that's why i started to think about how to make this 
easier/better for the user.

so

- it seems there are buggy Bios implementations out there which have problems 
with EDD
- your favourite distro may have set CONFIG_EDD=y|m, so the EDD probe is on by 
default quite often nowadays.
- setting edd=off when you get that hang on boot is _not_ obvious.
- addressing this issue may be a little bit late, since i have mostly seen that 
problem on older machines, but not on recently bought ones
- i have at least two different systems with different types of chipsets to 
demonstrate this

on one of those, i added some printf's to edd.c, and this routine seems to be 
problematic and never returns:

/* Extended Get Device Parameters */

ei->params.length = sizeof(ei->params);
ax = 0x4800;
dx = devno;
asm("pushfl; int $0x13; popfl"
    : "+a" (ax), "+d" (dx), "=m" (ei->params)
    : "S" (&ei->params)
    : "ebx", "ecx", "edi");

i had a short conversation with matt domsch and hpa, who both think that 
additional printf's would be an easy solution and not too bad to add.

here is a quick and dirty initial patch from me:

--- linux-2.6.23/arch/x86/boot/main.c.orig  2007-12-09 11:40:24.315346712 +0100
+++ linux-2.6.23/arch/x86/boot/main.c   2007-12-09 16:11:43.644512504 +0100
@@ -152,7 +152,10 @@
 
    /* Query EDD information */
 #if defined(CONFIG_EDD) || defined(CONFIG_EDD_MODULE)
-   query_edd();
+printf("Probing EDD (query Bios for boot-device information)\n");
+printf("If boot hangs here, you may have a buggy Bios. Try edd=skipmbr or edd=off");
+query_edd();
+printf("\rOK   \n");
 #endif
    /* Do the last things and invoke protected mode */
    go_to_protected_mode();


ok, for sure it's better to do that stuff in query_edd() itself, but before 
making a better version, i'd like to discuss whether such a patch would get 
accepted at all, and whether it's a valid approach to let the kernel print a 
line which gets overwritten (\rOK + lots of whitespace) milliseconds later on 
successful function return.

regards
roland


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Linux 2.6.23.10

2007-12-16 Thread Christian Borntraeger

On Friday, 14 December 2007, Greg Kroah-Hartman wrote:
> Christian Borntraeger (1):
>       Future of Linux 2.6.22.y series

This should be: rd: fix data corruption on memory pressure.
Same for 2.6.22

Christian

Re: [PATCH] arch_ptrace_stop

2007-12-16 Thread Oleg Nesterov

Roland, I am sorry for the delay,

On 12/13, Roland McGrath wrote:

> > Currently ptrace_stop() schedules in TASK_TRACED state even if we have a
> > pending SIGKILL. With this patch this is still possible, but unless
> > arch_ptrace_stop_needed() is true and thus we will check sigkill_pending().
>
> Currently the siglock is always held throughout.  The case this change
> addresses is when no SIGKILL was already pending before we took the lock.
> Currently, a new SIGKILL cannot come in until we've released the lock,
> which is after we've set TASK_TRACED.  The signal's sender will hold the
> lock while checking each thread's state, waking up any in TASK_TRACED.
>
> When arch_ptrace_stop_needed() is true, we release the siglock for an
> unknown period (might block, etc).  If a SIGKILL is sent there, it becomes
> pending while we are in TASK_RUNNING or a normal blocked state.  Next we
> finish arch_ptrace_stop() and reacquire the siglock.

Yes, yes, I see.

> Now entering
> TASK_TRACED would leave us unkillable because SIGKILL is already pending
> and nothing else (except PTRACE_CONT et al) will try to wake us up.

But this doesn't differ from the case when SIGKILL was already pending when
we enter ptrace_stop, and arch_ptrace_stop_needed() == false, that was my
point.

Yes, arch_ptrace_stop() can take a long time, might block, etc. But what
about TIF_SYSCALL_TRACE? The task can receive SIGKILL while executing the
syscall, which can also block and so on, but do_syscall_trace(entryexit == 1)
doesn't check the pending signal.

I should clarify my question. What I can't understand is the subtle dependency
on the result of arch_ptrace_stop_needed(). This means that it is hard to
predict the behaviour.

IOW, can't we

- ignore the pending SIGKILL (current behaviour)
-- OR --
- always check it unconditionally, before setting TASK_TRACED

? This looks a bit more consistent to me.

Please also note the "before setting TASK_TRACED" above. With this patch we set
TASK_TRACED under ->siglock, and then change the ->state to TASK_RUNNING if
killed == 1. Minor, but this doesn't look correct: we can fool the tracer
which does ptrace_check_attach().

Oleg.


[PATCH] misc driver: eliminate 256 minor limit deprecated call register_chrdev

2007-12-16 Thread Renzo Davoli

I already posted this patch on September 9th, but nobody cared.

Is anybody interested in knowing that misc device minors are still limited
to 256, that we are running out of minor numbers, and that this code uses a
deprecated call?

drivers/char/misc.c: the deprecated call is register_chrdev, and it limits
the number of minors to 256.

I propose this patch, which eliminates both problems. With this patch,
misc allocates the entire major 10.

This patch was designed for a previous version of the kernel code
(2.6.22?). I tested it today, and it applies to 2.6.24-rc5 with a -12
line offset.

renzo

Signed-off-by: Renzo Davoli [EMAIL PROTECTED]

--- a/drivers/char/misc.c   2007-08-05 16:56:59.0 +0200
+++ b/drivers/char/misc.c   2007-09-06 11:07:51.0 +0200
@@ -56,6 +56,8 @@
 static LIST_HEAD(misc_list);
 static DEFINE_MUTEX(misc_mtx);
 
+static struct cdev misc_cdev;
+
 /*
  * Assigned numbers, used for dynamic minors
  */
@@ -273,6 +275,31 @@
 EXPORT_SYMBOL(misc_register);
 EXPORT_SYMBOL(misc_deregister);
 
+static int misc_register_chrdev(void)
+{
+	dev_t from=MKDEV(MISC_MAJOR,0);
+	int rv;
+	int err = -ENOMEM;
+	char *s;
+
+	if ((rv=register_chrdev_region(from,MINORMASK,"misc")) != 0)
+		return rv;
+
+	cdev_init(&misc_cdev, &misc_fops);
+	misc_cdev.owner=misc_fops.owner;
+	kobject_set_name(&misc_cdev.kobj, "%s", "misc");
+	for (s = strchr(kobject_name(&misc_cdev.kobj),'/'); s; s = strchr(s, '/'))
+		*s = '!';
+	err = cdev_add(&misc_cdev, from, MINORMASK);
+	if (err)
+		goto out;
+	return 0;
+out:
+	kobject_put(&misc_cdev.kobj);
+	unregister_chrdev_region(from,MINORMASK);
+	return err;
+}
+
 static int __init misc_init(void)
 {
 #ifdef CONFIG_PROC_FS
@@ -286,7 +313,7 @@
 	if (IS_ERR(misc_class))
 		return PTR_ERR(misc_class);
 
-	if (register_chrdev(MISC_MAJOR,"misc",&misc_fops)) {
+	if (misc_register_chrdev()) {
 		printk("unable to get major %d for misc devices\n",
 		       MISC_MAJOR);
 		class_destroy(misc_class);

[PATCH] debugfs: Revamp debugfs_create_{u,x,s}{8,16,32,64} to support signed integers

2007-12-16 Thread Mattias Nissler

This makes debugfs use its own file_operations for the value accessor files
created by debugfs_create_XXX. Having that, we can also have proper versions
for signed integers.

Signed-off-by: Mattias Nissler [EMAIL PROTECTED]
---

When writing some debugfs code for mac80211 I wanted to have access to
some s32 variables. I thought it's better to make debugfs support signed
integers instead of adding the functionality locally. I assume generic
versions will be useful for other people as well.

Please CC me for replies.


 fs/debugfs/file.c   |  291 +++
 include/linux/debugfs.h |   45 +++
 2 files changed, 287 insertions(+), 49 deletions(-)

diff --git a/fs/debugfs/file.c b/fs/debugfs/file.c
index fa6b7f7..8808bc0 100644
--- a/fs/debugfs/file.c
+++ b/fs/debugfs/file.c
@@ -56,18 +56,121 @@ const struct inode_operations debugfs_link_operations = {
 	.follow_link	= debugfs_follow_link,
 };
 
-static void debugfs_u8_set(void *data, u64 val)
+/* Simple numeric attributes */
+struct debugfs_num_attr
 {
-	*(u8 *)data = val;
+	void (*get)(void *, char *);
+	void (*set)(void *, char *);
+	char get_buf[24];
+	char set_buf[24];
+	void *pnum;
+	struct mutex mutex;
+};
+
+static int debugfs_num_attr_open(struct inode *inode, struct file *file,
+				 void (*get)(void *, char*),
+				 void (*set)(void *, char*))
+{
+	struct debugfs_num_attr *attr;
+
+	attr = kmalloc(sizeof(*attr), GFP_KERNEL);
+	if (!attr)
+		return -ENOMEM;
+
+	attr->get = get;
+	attr->set = set;
+	attr->pnum = inode->i_private;
+	mutex_init(&attr->mutex);
+
+	file->private_data = attr;
+
+	return nonseekable_open(inode, file);
+}
+
+static int debugfs_num_attr_close(struct inode *inode, struct file *file)
+{
+	kfree(file->private_data);
+
+	return 0;
+}
+
+static ssize_t debugfs_num_attr_read(struct file *file, char __user *buf,
+				     size_t len, loff_t *ppos)
+{
+	struct debugfs_num_attr *attr;
+	size_t size;
+	ssize_t ret;
+
+	attr = file->private_data;
+
+	mutex_lock(&attr->mutex);
+	if (!*ppos) {
+		/* first read */
+		attr->get(attr->pnum, attr->get_buf);
+		attr->get_buf[sizeof(attr->get_buf) - 1] = '\0';
+	}
+
+	size = strlen(attr->get_buf);
+	ret = simple_read_from_buffer(buf, len, ppos, attr->get_buf, size);
+	mutex_unlock(&attr->mutex);
+
+	return ret;
 }
-static u64 debugfs_u8_get(void *data)
+
+static ssize_t debugfs_num_attr_write(struct file *file, const char __user *buf,
+				      size_t len, loff_t *ppos)
 {
-	return *(u8 *)data;
+	struct debugfs_num_attr *attr;
+	size_t size;
+	ssize_t ret;
+
+	attr = file->private_data;
+
+	mutex_lock(&attr->mutex);
+	ret = -EFAULT;
+	size = min(sizeof(attr->set_buf) - 1, len);
+	if (copy_from_user(attr->set_buf, buf, size))
+		goto out;
+
+	ret = len; /* claim we got the whole input */
+	attr->set_buf[size] = '\0';
+	attr->set(attr->pnum, attr->set_buf);
+out:
+	mutex_unlock(&attr->mutex);
+	return ret;
 }
-DEFINE_SIMPLE_ATTRIBUTE(fops_u8, debugfs_u8_get, debugfs_u8_set, "%llu\n");
+
+#define DEFINE_NUM_ATTR(__fops, __type, __format)			\
+static void __fops ## _get(void *data, char *buf)			\
+{									\
+	scnprintf(buf, 24, __format, *(__type *) data);			\
+}									\
+static void __fops ## _set(void *data, char *buf)			\
+{									\
+	sscanf(buf, __format, (__type *) data);				\
+}									\
+static int __fops ## _open(struct inode *inode, struct file *file)	\
+{									\
+	return debugfs_num_attr_open(inode, file, __fops ## _get,	\
+				     __fops ## _set);			\
+}									\
+static struct file_operations __fops = {				\
+	.owner   = THIS_MODULE,						\
+	.open    = __fops ## _open,					\
+	.release = debugfs_num_attr_close,				\
+	.read    = debugfs_num_attr_read,				\
+	.write   = debugfs_num_attr_write,				\
+};
+
+DEFINE_NUM_ATTR(fops_u8, u8, "%hhu\n")
+DEFINE_NUM_ATTR(fops_u16, u16, "%hu\n")
+DEFINE_NUM_ATTR(fops_u32, u32,

[PATCH] Implement getgeo for Xen virtual block device.

2007-12-16 Thread Ian Campbell

Hi Jeremy,

The below implements the getgeo hook for Xen block devices. Extracted
from the xen-unstable tree where it has been used for ages.

It is useful to have because it allows things like grub2 (used by the
Debian installer images) to work in a guest domain without having to
sprinkle Xen specific hacks around the place.

Signed-off-by: Ian Campbell [EMAIL PROTECTED]

diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index 2bdebcb..b0a2e69 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -37,6 +37,7 @@
 
 #include <linux/interrupt.h>
 #include <linux/blkdev.h>
+#include <linux/hdreg.h>
 #include <linux/module.h>
 
 #include <xen/xenbus.h>
@@ -135,6 +136,22 @@ static void blkif_restart_queue_callback(void *arg)
 	schedule_work(&info->work);
 }
 
+int blkif_getgeo(struct block_device *bd, struct hd_geometry *hg)
+{
+	/* We don't have real geometry info, but let's at least return
+	   values consistent with the size of the device */
+	sector_t nsect = get_capacity(bd->bd_disk);
+	sector_t cylinders = nsect;
+
+	hg->heads = 0xff;
+	hg->sectors = 0x3f;
+	sector_div(cylinders, hg->heads * hg->sectors);
+	hg->cylinders = cylinders;
+	if ((sector_t)(hg->cylinders + 1) * hg->heads * hg->sectors < nsect)
+		hg->cylinders = 0xffff;
+	return 0;
+}
+
 /*
  * blkif_queue_request
  *
@@ -939,6 +956,7 @@ static struct block_device_operations xlvbd_block_fops =
 	.owner = THIS_MODULE,
 	.open = blkif_open,
 	.release = blkif_release,
+	.getgeo = blkif_getgeo,
 };
 



-- 
Ian Campbell

'Martyrdom' is the only way a person can become famous without ability.
-- George Bernard Shaw



Re: [RANDOM] Move two variables to read_mostly section to save memory

2007-12-16 Thread Adrian Bunk

On Sun, Dec 16, 2007 at 03:44:37PM +0100, Eric Dumazet wrote:
> Adrian Bunk wrote:
>> On Sun, Dec 16, 2007 at 12:45:01PM +0100, Eric Dumazet wrote:
>>> While examining vmlinux namelist on i686, I noticed :
>>>
>>> c0581300 D random_table
>>> c0581480 d input_pool
>>> c0581580 d random_read_wakeup_thresh
>>> c0581584 d random_write_wakeup_thresh
>>> c0581600 d blocking_pool
>>>
>>> That means that the two integers random_read_wakeup_thresh and 
>>> random_write_wakeup_thresh use a full cache entry (128 bytes).
>>>
>>> Moving them to the read_mostly section shrinks vmlinux by 120 bytes.
>>>
>>> # size vmlinux*
>>>    text    data     bss     dec     hex filename
>>> 4835553  450210  610304 5896067  59f783 vmlinux.after_patch
>>> 4835553  450330  610304 5896187  59f7fb vmlinux.before_patch
>>>
>>> Signed-off-by: Eric Dumazet [EMAIL PROTECTED]
>>>
>>> diff --git a/drivers/char/random.c b/drivers/char/random.c
>>> index 5fee056..af48e86 100644
>>> --- a/drivers/char/random.c
>>> +++ b/drivers/char/random.c
>>> @@ -256,14 +256,14 @@
>>>   * The minimum number of bits of entropy before we wake up a read on
>>>   * /dev/random.  Should be enough to do a significant reseed.
>>>   */
>>> -static int random_read_wakeup_thresh = 64;
>>> +static int random_read_wakeup_thresh __read_mostly = 64;
>>>  /*
>>>   * If the entropy count falls under this number of bits, then we
>>>   * should wake up processes which are selecting or polling on write
>>>   * access to /dev/random.
>>>   */
>>> -static int random_write_wakeup_thresh = 128;
>>> +static int random_write_wakeup_thresh __read_mostly = 128;
>>
>> Please never ever do such ugly and unmaintainable micro-optimizations in 
>> the code unless you can show a measurable performance improvement of the 
>> kernel.
>
> You seem to be confusing speed micro-optimizations with memory 
> savings. This patch has nothing to do with speed optimization. Here, no 
> tradeoff justifies a measurable performance improvement study.
>
> I copied this patch to you because of your recent proposal to remove 
> read_mostly from the Linux kernel.
>
> Only you find read_mostly ugly and unmaintainable. I find it way more 
> useful than static attributes.
>
> I find 120 bytes is a measurable gain, thank you.


I am well aware that your patch is about space saving and not speed
improvement.

But trying to save space this way is simply not maintainable.

And it's trivial to see that your patch actually makes the code _bigger_ 
for all people who try hard to get their kernel small and use 
CONFIG_SYSCTL=n - funnily your patch has exactly the problem I described 
as drawback of __read_mostly in the thread you are referring to...


And even more funny, with gcc 4.2 and CONFIG_CC_OPTIMIZE_FOR_SIZE=y your 
patch doesn't seem to make any space difference - are you using an older 
compiler or even worse CONFIG_CC_OPTIMIZE_FOR_SIZE=n for being able to 
see any space difference?

In both cases your code uglification would be even more pointless...


cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed


Re: [argyllcms] Re: [PATCH] usb/hid: Blacklist the Gretag-Macbeth Huey display colorimeter

2007-12-16 Thread Jiri Kosina

On Fri, 14 Dec 2007, Nicolas Mailhot wrote:

> > Actually (to put it frankly), I'm amazed that this exceptions list is 
> > compiled into the driver. I would have thought that such a list should 
> > at least be in a configuration file that an installed application can 
> > add or delete, if not something more sophisticated.
> Something more sophisticated will happen someday, this is what we have
> now.

There is also the possibility (for quite a few releases already) to change 
the quirk list at runtime. See the 'quirk' parameter to the usbhid module.

> > What will happen if the HID driver is fixed to allow arbitrary 
> > messages, and I want to switch back to using it rather than libusb ?

What exactly is the problem here? I didn't seem to catch the beginning of 
the thread (or it happened off the list I am subscribed to).

I will apply the patch to my tree, thanks.

-- 
Jiri Kosina
SUSE Labs

Re: [PATCH] debugfs: Revamp debugfs_create_{u,x,s}{8,16,32,64} to support signed integers

2007-12-16 Thread Greg KH

On Sun, Dec 16, 2007 at 05:37:59PM +0100, Mattias Nissler wrote:
> This makes debugfs use its own file_operations for the value accessor files
> created by debugfs_create_XXX. Having that, we can also have proper versions
> for signed integers.

Why not tweak the SIMPLE_ATTRIBUTE code to support this instead?  That
way debugfs and all other filesystems could also use these attributes?

thanks,

greg k-h
