date:20070629

Re: how to determine if the noexec stack is defined by an application

2007-06-29 Thread Arjan van de Ven

On Fri, 2007-06-29 at 18:21 -0700, Florin Andrei wrote:
> Arjan van de Ven wrote:
> >> But it's running a Web service which is a combination of C code and 
> >> Tomcat/Java. I have no clue how to determine which portions specify a 
> >> noexec stack and which don't.
> > 
> > like this:
> > 
> > $ eu-readelf -l /bin/true  | grep STACK
> >   GNU_STACK  0x00 0x 0x 0x00 0x00 RW 0x4
> 
> Is Sun Java 1.5 a known exception - as an application that doesn't set a 
> noexec stack and reverts to default?
> 
> # eu-readelf -l ./java | grep STACK | wc -l
> 0
> 
> But then, this bug report seems to indicate otherwise, if I'm reading it 
> correctly:
> 
> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=5051381


that's not a mainline kernel; and I don't rule out that early RHEL3
versions had a 64/32 bug in this area
> 
-- 
if you want to mail me at work (you don't), use arjan (at) linux.intel.com
Test the interaction between Linux and your BIOS via 
http://www.linuxfirmwarekit.org

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.22-rc6-mm1

2007-06-29 Thread Andrew Morton

On Sat, 30 Jun 2007 00:17:46 -0400 [EMAIL PROTECTED] wrote:

> On Fri, 29 Jun 2007 14:01:30 PDT, Andrew Morton said:
> > On Fri, 29 Jun 2007 10:50:30 -0400
> > [EMAIL PROTECTED] wrote:
> 
> > > Odd - just for grins, I checked what 'make oldconfig' did when handed a 
> > > .config
> > > from 22-rc4-mm2, and it behaved just fine, much to my surprise.
> > 
> > That's probably because your old config file was relatively recent, and
> > had things like CONFIG_BLK_DEV=y in it.
> 
> Ahh...  Yeah, it gets a 'make oldconfig' for pretty
> much every single -mm, I suck at any regression testing other than "since
> the last -mm".
> 

All my .configs have mouldered since I lost the ability to have .config be
a symlink to a revision-controlled file (used to carry a custom patch for
this, but it died).

I continue to believe that kbuild's lets-trash-your-symlink behaviour is
obnoxious, but I was unable to persuade anyone else of this.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.22-rc6-mm1

2007-06-29 Thread Valdis . Kletnieks

On Fri, 29 Jun 2007 14:01:30 PDT, Andrew Morton said:
> On Fri, 29 Jun 2007 10:50:30 -0400
> [EMAIL PROTECTED] wrote:

> > Odd - just for grins, I checked what 'make oldconfig' did when handed a 
> > .config
> > from 22-rc4-mm2, and it behaved just fine, much to my surprise.
> 
> That's probably because your old config file was relatively recent, and
> had things like CONFIG_BLK_DEV=y in it.

Ahh...  Yeah, it gets a 'make oldconfig' for pretty
much every single -mm, I suck at any regression testing other than "since
the last -mm".


pgpwUuUiftAZq.pgp
Description: PGP signature

Re: 2.6.22-rcX Transmeta/APM regression

2007-06-29 Thread H. Peter Anvin


[EMAIL PROTECTED] wrote:


Anyway, the patch which introduces the problem is the aptly named 3ebad:
3ebad59056: [PATCH] x86: Save and restore the fixed-range MTRRs of the BSP when 
suspending

2.6.22-rc6 plus that one commit reverted successfully does APM suspend
(and resume) for me.


Okay, I would guess that that patch probably touches MTRRs without 
actually verify that the CPU *has* MTRRs -- the Transmeta Crusoe CPU 
doesn't have MTRRs.


-hpa
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

RE: how about mutual compatibility between Linux's GPLv2 and GPLv3?

2007-06-29 Thread David Schwartz


> > Treating ordinary use as a copyright privilege leads to
> > nonsensical results
> > no matter what you do. For example, you get that I can drop copies of my
> > poem from an airplane and then sue anyone who reads it.

> Who was talking about reading?

They are both ordinary use. It is crazy to treat the ordinary use of a work
as a copyright privilege. If you do this, you get insane results. For
example, coloring in the pages of a coloring book is, arguably, creating a
derivative work. But you don't need a license to do this, because it's the
ordinary use.

My poem from airplanes example is just an example. You get analogously crazy
results if you treat ordinary use of other works as a right under copyright.

> You can read programs as much as you
> can read poems.  But since you (normally) can't run poems, copyright
> law doesn't talk about this, just like it doesn't distinguish source
> from object code of a poem.

You get lucicrous results from copyright laws if lawful physical possession
(of a copy made with consent of the copyright holder) does not grant the
right to ordinary use. You get the same ludicrous results with patents
(imagine if I can buy a product from IBM whose ordinary use always violates
an IBM patent and then IBM can sue me for it or if they use this to prevent
me from selling the product or giving it away).

> But software is different.  So different
> that it's governed by a separate law in Brazil, which could be
> qualified as a subclass of copyright law.  And this law states that
> running programs requires permission from the copyright holder.

So do I have to buy a program and then negotiate the right to run it
separately? That seems very crazy.

> If you find that odd, you may have an idea of how ludicrous patents on
> software, business methods et al are.  At least copyright regulation
> of execution saves us from a few abusive EULAs, created with the
> purpose of, let's see, regulating execution.

Quite the reverse. If execution is a copyright right, then I might need to
agree to a license or conract to get it. If execution is not a copyright
right, then I am safe from such craziness.

> And then, since it's
> already there, why not use it for other restrictions beneficial to the
> vendor that a copyright license couldn't establish?

Jurisdictions that treat ordinary use as a copyright right are simply
insane. I am probably one of the stronger supporters of intellectual
property rights (copyright and patent, not necessarily UCC and EULA issues)
that you will find on this list, and I think that treating ordinary use as a
right is simply insane.

DS


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] automatic CC generation for patch submission

2007-06-29 Thread Adrian Bunk

On Sat, Jun 30, 2007 at 05:34:51AM +0300, Dan Aloni wrote:
>...
> Basically, instead of manually figuring out who to add to CC
> when sending a patch to LKML by looking at MAINTAINERS, a 
> script can look at '.maintainers' files spread across the
> source tree and automatically generate a proper list of CCs
> for a patch.
>...
> To illustrate: If a patch affects a file under 
> drivers/net/e1000, the CC script will look at these files
> 
>   drivers/net/e1000/.maintainers
>   drivers/net/.maintainers
>   drivers/.maintainers
>   .maintainers
> 
> ... to gather up the mailing list addresses or an individual 
> maintainer inbox address.
>...
> Any comments?

As Auke said, maintaining the information in MAINTAINERS would be 
better.

And another important use case that shouldn't require much extra work 
would be to do the same for bug reports.

Generally, you should keep in mind that it must fit into the workflow of 
the people who should use it. E.g. I could imagine that it might make 
more sense if you write a small tool that takes a patch or a path and 
outputs email addresses instead of a huge tool that tries to solve too 
many problems at once and doesn't fit into the workflow of most people.

> Dan Aloni

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Please release a stable kernel Linux 3.0

2007-06-29 Thread Daniel Hazelton

On Friday 29 June 2007 17:27:34 Rene Herman wrote:
> On 06/29/2007 11:05 PM, Bodo Eggert wrote:
> > Alan Cox <[EMAIL PROTECTED]> wrote:
> >> Indeed if its public domain you may have almost no rights at all
> >> depending what you were given. Once you get the source code you can do
> >> stuff but I don't have to give you that. If its public domain I can find
> >> security holes in it, and refuse to provide the fixed module in source
> >> form even.
> >
> > The GPL forces nobody to not release his module under PD, therefore it
> > can't protect you from that. Even minor changes - like adjusting the
> > module to use to the current API - won't change that, at least in Germany
> > they'd have to qualify as a work of their own in order to create a
> > GPL-only derived work, because anything not qualifying for that could
> > also be integrated into the PD version, and both would remain identical.
>
> What I focussed on when asking were only my wishes as an author but Alan
> (if I understood him right ofcourse) pointed out that _the kernel_ does not
> want integrated code to be in the public domain regardless of my wishes.
>
> Arguably (no doubt, sigh...) someone could distribute the kernel in binary
> form but refuse to provide source for the bits marked as being in the
> public domain alongside it -- yes, can of worms when compared to GPL
> demands, but I believe I can see why one shouldn't even go near there.

Actually, they couldn't. Second PD code became included in the kernel it would 
be covered by the GPL. If it can be shown that the kernel binary was the 
product of merging PD code in, then there is no way top refuse access to the 
PD code.

DRH

> Rene.
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [EMAIL PROTECTED]
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/



-- 
Dialup is like pissing through a pipette. Slow and excruciatingly painful.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: How to enable dev_dbg messaging

2007-06-29 Thread Akinobu Mita

On Fri, Jun 29, 2007 at 10:29:33PM -0500, Jay Cliburn wrote:
> How do I turn on dev_dbg messaging in the kernel?  I can get
> printk(KERN_DEBUG ...) to work just fine, but I don't know how to
> enable dev_dbg.

Defining DEBUG before including  enables dev_dbg to work.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

How to enable dev_dbg messaging

2007-06-29 Thread Jay Cliburn

How do I turn on dev_dbg messaging in the kernel?  I can get
printk(KERN_DEBUG ...) to work just fine, but I don't know how to
enable dev_dbg.

Thanks,
Jay
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 4/4] void unregister_blkdev - make void

2007-06-29 Thread Akinobu Mita

Put WARN_ON and fixed all callers of unregister_blkdev().
Now we can make unregister_blkdev return void.

Cc: Jens Axboe <[EMAIL PROTECTED]>
Signed-off-by: Akinobu Mita <[EMAIL PROTECTED]>

---
 block/genhd.c  |7 +--
 include/linux/fs.h |2 +-
 2 files changed, 2 insertions(+), 7 deletions(-)

Index: 2.6-mm/block/genhd.c
===
--- 2.6-mm.orig/block/genhd.c
+++ 2.6-mm/block/genhd.c
@@ -108,13 +108,11 @@ out:
 
 EXPORT_SYMBOL(register_blkdev);
 
-/* todo: make void - error printk here */
-int unregister_blkdev(unsigned int major, const char *name)
+void unregister_blkdev(unsigned int major, const char *name)
 {
struct blk_major_name **n;
struct blk_major_name *p = NULL;
int index = major_to_index(major);
-   int ret = 0;
 
mutex_lock(_subsys_lock);
for (n = _names[index]; *n; n = &(*n)->next)
@@ -122,15 +120,12 @@ int unregister_blkdev(unsigned int major
break;
if (!*n || strcmp((*n)->name, name)) {
WARN_ON(1);
-   ret = -EINVAL;
} else {
p = *n;
*n = p->next;
}
mutex_unlock(_subsys_lock);
kfree(p);
-
-   return ret;
 }
 
 EXPORT_SYMBOL(unregister_blkdev);
Index: 2.6-mm/include/linux/fs.h
===
--- 2.6-mm.orig/include/linux/fs.h
+++ 2.6-mm/include/linux/fs.h
@@ -1553,7 +1553,7 @@ extern void putname(const char *name);
 
 #ifdef CONFIG_BLOCK
 extern int register_blkdev(unsigned int, const char *);
-extern int unregister_blkdev(unsigned int, const char *);
+extern void unregister_blkdev(unsigned int, const char *);
 extern struct block_device *bdget(dev_t);
 extern void bd_set_size(struct block_device *, loff_t size);
 extern void bd_forget(struct inode *inode);
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] automatic CC generation for patch submission

2007-06-29 Thread Satyam Sharma

On 6/30/07, Satyam Sharma <[EMAIL PROTECTED]> wrote:

On 6/30/07, Kok, Auke <[EMAIL PROTECTED]> wrote:
> [...]
> an easier way to implement this is to add an extra field in the MAINTAINERS
> file, something like below. All the contact info would stay the same, closely
> where applicable and it would allow you to also specify specific files as 
well.

Your idea is quite different though, of course.

Hmm, associating MAINTAINERS and source files. I remember surfing LKML
archives one day and coming across this huge and ugly flamefest when some
guy called Eric S. Raymond (hehe :-) suggested something like this and got
burned.

And before _I_ get burned for being impolite / provoking something, let me
clarify that the comment above was purely in spirit of humour (bad effort,
of course).
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[RFC PATCH 2/3] Generic Trace Setup and Control (GTSC) code

2007-06-29 Thread Tom Zanussi

The Generic Tracing and Control Interface (GTSC) code.

Signed-off-by: Tom Zanussi <[EMAIL PROTECTED]>
Signed-off-by: David Wilder <[EMAIL PROTECTED]>
---
 include/linux/gtsc.h |  104 +
 lib/Kconfig  |   10 
 lib/Makefile |2 
 lib/gtsc.c   |  558 +++
 4 files changed, 674 insertions(+)

diff --git a/include/linux/gtsc.h b/include/linux/gtsc.h
new file mode 100644
index 000..cbb2601
--- /dev/null
+++ b/include/linux/gtsc.h
@@ -0,0 +1,104 @@
+/*
+ * GTSC defines and function prototypes
+ *
+ * Copyright (C) 2006 IBM Inc.
+ *
+ * Tom Zanussi <[EMAIL PROTECTED]>
+ * Martin Hunt <[EMAIL PROTECTED]>
+ * David Wilder <[EMAIL PROTECTED]>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+ *
+ */
+#ifndef _LINUX_GTSC_H
+#define _LINUX_GTSC_H
+
+#include 
+
+/*
+ * GTSC channel flags
+ */
+#define TRACE_GLOBAL_CHANNEL   0x01
+#define TRACE_FLIGHT_CHANNEL   0x02
+#define TRACE_DISABLE_STATE0x04
+
+enum trace_state {
+   TRACE_SETUP,
+   TRACE_RUNNING,
+   TRACE_STOPPED,
+};
+
+#define TRACE_ROOT_NAME_SIZE   64  /* Max root dir identifier */
+#define TRACE_NAME_SIZE64  /* Max trace identifier */
+
+/*
+ * Global root user information
+ */
+struct trace_root {
+   struct list_head list;
+   char name[TRACE_ROOT_NAME_SIZE];
+   struct dentry *root;
+   unsigned int users;
+};
+
+/*
+ * Client information
+ */
+struct trace_info {
+   enum trace_state state;
+   struct dentry *state_file;
+   struct rchan *rchan;
+   struct dentry *dir;
+   struct dentry *dropped_file;
+   struct dentry *reset_consumed_file;
+   struct dentry *nr_sub_file;
+   struct dentry *sub_size_file;
+   atomic_t dropped;
+   struct trace_root *root;
+   void *private_data;
+   unsigned int flags;
+   unsigned int buf_size;
+   unsigned int buf_nr;
+};
+
+#ifdef CONFIG_GTSC
+static inline int trace_running(struct trace_info *trace)
+{
+   return trace->state == TRACE_RUNNING;
+}
+struct trace_info *trace_setup(const char *root, const char *name,
+  u32 buf_size, u32 buf_nr, u32 flags);
+int trace_start(struct trace_info *trace);
+int trace_stop(struct trace_info *trace);
+void trace_cleanup_channel(struct trace_info *gt);
+void trace_cleanup(struct trace_info *gt);
+unsigned long long trace_timestamp(void);
+#else
+static inline struct trace_info *trace_setup(const char *root,
+const char *name,
+u32 buf_size,
+u32 buf_nr,
+u32 flags)
+{
+   return NULL;
+}
+static inline int trace_running(struct trace_info *trace) { return 0; }
+static inline int trace_start(struct trace_info *trace) { return -EINVAL; }
+static inline int trace_stop(struct trace_info *trace) {}
+static inline void trace_cleanup_channel(struct trace_info *trace) {}
+static inline void trace_cleanup(struct trace_info *trace) {}
+static inline unsigned long long trace_timestamp(void) { return 0; }
+#endif
+
+#endif
diff --git a/lib/Kconfig b/lib/Kconfig
index 2e7ae6b..b3931f3 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -124,4 +124,14 @@ config HAS_DMA
depends on !NO_DMA
default y
 
+config GTSC
+   bool "Generic Trace Setup and Control"
+   select RELAY
+   select DEBUG_FS
+   help
+ This option provides support for the setup, teardown and control
+ of tracing channels from kernel code.  It also provides trace
+ information and control to userspace via a set of debugfs control
+ files.  If unsure, say N.
+
 endmenu
diff --git a/lib/Makefile b/lib/Makefile
index c8c8e20..d9e68fa 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -62,6 +62,8 @@ obj-$(CONFIG_FAULT_INJECTION) += fault-inject.o
 
 lib-$(CONFIG_GENERIC_BUG) += bug.o
 
+obj-$(CONFIG_GTSC) += gtsc.o
+
 hostprogs-y:= gen_crc32table
 clean-files:= crc32table.h
 
diff --git a/lib/gtsc.c b/lib/gtsc.c
new file mode 100644
index 000..ecd0ddf
--- /dev/null
+++ b/lib/gtsc.c
@@ -0,0 +1,558 @@
+/*
+ * Based on blktrace code, Copyright (C) 2006 Jens Axboe <[EMAIL PROTECTED]>
+ * Moved to utt.c by Tom Zanussi <[EMAIL PROTECTED]>, 2006
+ *

[RFC PATCH 3/3] blktrace conversion to GTSC

2007-06-29 Thread Tom Zanussi

This patch converts blktrace to use the Generic Trace Setup and
Control (GTSC) interface.  Also attached is a small patch to the
blktrace user code, needed for the ioctl change.

Signed-off-by: Tom Zanussi <[EMAIL PROTECTED]>
Signed-off-by: David Wilder <[EMAIL PROTECTED]>
---
 block/Kconfig|2 
 block/blktrace.c |  188 +--
 include/linux/blktrace_api.h |   30 +++---
 3 files changed, 40 insertions(+), 180 deletions(-)

diff --git a/block/Kconfig b/block/Kconfig
index a50f481..9ae9a8c 100644
--- a/block/Kconfig
+++ b/block/Kconfig
@@ -30,7 +30,7 @@ config LBD
 config BLK_DEV_IO_TRACE
bool "Support for tracing block io actions"
depends on SYSFS
-   select RELAY
+   select GTSC
select DEBUG_FS
help
  Say Y here, if you want to be able to trace the block layer actions
diff --git a/block/blktrace.c b/block/blktrace.c
index 3f0e7c3..c5d5821 100644
--- a/block/blktrace.c
+++ b/block/blktrace.c
@@ -36,7 +36,7 @@ static void trace_note(struct blk_trace *bt, pid_t pid, int 
action,
 {
struct blk_io_trace *t;
 
-   t = relay_reserve(bt->rchan, sizeof(*t) + len);
+   t = relay_reserve(bt->trace->rchan, sizeof(*t) + len);
if (t) {
const int cpu = smp_processor_id();
 
@@ -126,7 +126,7 @@ void __blk_add_trace(struct blk_trace *bt, sector_t sector, 
int bytes,
pid_t pid;
int cpu;
 
-   if (unlikely(bt->trace_state != Blktrace_running))
+   if (unlikely(!trace_running(bt->trace)))
return;
 
what |= ddir_act[rw & WRITE];
@@ -152,7 +152,7 @@ void __blk_add_trace(struct blk_trace *bt, sector_t sector, 
int bytes,
if (unlikely(tsk->btrace_seq != blktrace_seq))
trace_note_tsk(bt, tsk);
 
-   t = relay_reserve(bt->rchan, sizeof(*t) + pdu_len);
+   t = relay_reserve(bt->trace->rchan, sizeof(*t) + pdu_len);
if (t) {
cpu = smp_processor_id();
sequence = per_cpu_ptr(bt->sequence, cpu);
@@ -178,55 +178,8 @@ void __blk_add_trace(struct blk_trace *bt, sector_t 
sector, int bytes,
 
 EXPORT_SYMBOL_GPL(__blk_add_trace);
 
-static struct dentry *blk_tree_root;
-static struct mutex blk_tree_mutex;
-static unsigned int root_users;
-
-static inline void blk_remove_root(void)
-{
-   if (blk_tree_root) {
-   debugfs_remove(blk_tree_root);
-   blk_tree_root = NULL;
-   }
-}
-
-static void blk_remove_tree(struct dentry *dir)
-{
-   mutex_lock(_tree_mutex);
-   debugfs_remove(dir);
-   if (--root_users == 0)
-   blk_remove_root();
-   mutex_unlock(_tree_mutex);
-}
-
-static struct dentry *blk_create_tree(const char *blk_name)
-{
-   struct dentry *dir = NULL;
-
-   mutex_lock(_tree_mutex);
-
-   if (!blk_tree_root) {
-   blk_tree_root = debugfs_create_dir("block", NULL);
-   if (!blk_tree_root)
-   goto err;
-   }
-
-   dir = debugfs_create_dir(blk_name, blk_tree_root);
-   if (dir)
-   root_users++;
-   else
-   blk_remove_root();
-
-err:
-   mutex_unlock(_tree_mutex);
-   return dir;
-}
-
 static void blk_trace_cleanup(struct blk_trace *bt)
 {
-   relay_close(bt->rchan);
-   debugfs_remove(bt->dropped_file);
-   blk_remove_tree(bt->dir);
free_percpu(bt->sequence);
kfree(bt);
 }
@@ -239,76 +192,14 @@ static int blk_trace_remove(request_queue_t *q)
if (!bt)
return -EINVAL;
 
-   if (bt->trace_state == Blktrace_setup ||
-   bt->trace_state == Blktrace_stopped)
+   if (!trace_running(bt->trace)) {
+   trace_cleanup(bt->trace);
blk_trace_cleanup(bt);
+   }
 
return 0;
 }
 
-static int blk_dropped_open(struct inode *inode, struct file *filp)
-{
-   filp->private_data = inode->i_private;
-
-   return 0;
-}
-
-static ssize_t blk_dropped_read(struct file *filp, char __user *buffer,
-   size_t count, loff_t *ppos)
-{
-   struct blk_trace *bt = filp->private_data;
-   char buf[16];
-
-   snprintf(buf, sizeof(buf), "%u\n", atomic_read(>dropped));
-
-   return simple_read_from_buffer(buffer, count, ppos, buf, strlen(buf));
-}
-
-static const struct file_operations blk_dropped_fops = {
-   .owner =THIS_MODULE,
-   .open = blk_dropped_open,
-   .read = blk_dropped_read,
-};
-
-/*
- * Keep track of how many times we encountered a full subbuffer, to aid
- * the user space app in telling how many lost events there were.
- */
-static int blk_subbuf_start_callback(struct rchan_buf *buf, void *subbuf,
-void *prev_subbuf, size_t prev_padding)
-{
-   struct blk_trace *bt;
-
-   if (!relay_buf_full(buf))
-   return 1;
-
-   bt = buf->chan->private_data;
-

[RFC PATCH 6/6] relay: add relay_reset_consumed()

2007-06-29 Thread Tom Zanussi

This patch allows relay channels to be reset i.e. unconsumed.
Basically allows a 'rewind' function for flight-recorder tracing.

Signed-off-by: Tom Zanussi <[EMAIL PROTECTED]>
Signed-off-by: David Wilder <[EMAIL PROTECTED]>
---
 Documentation/filesystems/relay.txt |   11 ++
 include/linux/relay.h   |1 
 kernel/relay.c  |   59 
 3 files changed, 66 insertions(+), 5 deletions(-)

diff --git a/Documentation/filesystems/relay.txt 
b/Documentation/filesystems/relay.txt
index 18d23f9..d31113a 100644
--- a/Documentation/filesystems/relay.txt
+++ b/Documentation/filesystems/relay.txt
@@ -161,6 +161,7 @@ TBD(curr. line MT:/API/)
 relay_close(chan)
 relay_flush(chan)
 relay_reset(chan)
+relay_reset_consumed(chan)
 
   channel management typically called on instigation of userspace:
 
@@ -452,6 +453,16 @@ state without reallocating channel buffer memory or 
destroying
 existing mappings.  It should however only be called when it's safe to
 do so, i.e. when the channel isn't currently being written to.
 
+The read(2) implementation always 'consumes' the bytes read,
+i.e. those bytes won't be available again to subsequent reads.
+Certain applications may nonetheless wish to allow the 'consumed' data
+to be re-read; relay_reset_consumed() is provided for that purpose -
+it resets the internal consumed counters for all buffers in the
+channel.  For example, if a first set of reads 'drains' the channel,
+and then relay_reset_consumed() is called, a second set of reads will
+get the exact same data (assuming no new data was written between the
+first set of reads and the second).
+
 Finally, there are a couple of utility callbacks that can be used for
 different purposes.  buf_mapped() is called whenever a channel buffer
 is mmapped from user space and buf_unmapped() is called when it's
diff --git a/include/linux/relay.h b/include/linux/relay.h
index 6cd8c44..aca45fa 100644
--- a/include/linux/relay.h
+++ b/include/linux/relay.h
@@ -175,6 +175,7 @@ extern void relay_subbufs_consumed(struct rchan *chan,
   unsigned int cpu,
   size_t consumed);
 extern void relay_reset(struct rchan *chan);
+extern void relay_reset_consumed(struct rchan *chan);
 extern int relay_buf_full(struct rchan_buf *buf);
 
 extern size_t relay_switch_subbuf(struct rchan_buf *buf,
diff --git a/kernel/relay.c b/kernel/relay.c
index 4311101..6806636 100644
--- a/kernel/relay.c
+++ b/kernel/relay.c
@@ -382,6 +382,57 @@ void relay_reset(struct rchan *chan)
 }
 EXPORT_SYMBOL_GPL(relay_reset);
 
+/**
+ * __relay_reset_consumed - reset a channel buffer's consumed count
+ * @buf: the channel buffer
+ *
+ * See relay_reset_consumed for description of effect.
+ */
+static inline void __relay_reset_consumed(struct rchan_buf *buf)
+{
+   size_t n_subbufs = buf->chan->n_subbufs;
+   size_t produced = buf->subbufs_produced;
+   size_t consumed = buf->subbufs_consumed;
+
+   if (produced < n_subbufs)
+   buf->subbufs_consumed = 0;
+   else {
+   consumed = produced - n_subbufs;
+   if (buf->offset)
+   consumed++;
+   buf->subbufs_consumed = consumed;
+   }
+   buf->bytes_consumed = 0;
+}
+
+/**
+ * relay_reset_consumed - reset the channel's consumed counts
+ * @chan: the channel
+ *
+ * This has the effect of making all data previously read (and
+ * not overwritten by subsequent writes) from a channel available
+ * for reading again.
+ *
+ * NOTE: Care should be taken that the channel isn't actually
+ * being used by anything when this call is made.
+ */
+void relay_reset_consumed(struct rchan *chan)
+{
+   unsigned int i;
+   struct rchan_buf *prev = NULL;
+
+   if (!chan)
+   return;
+
+   for (i = 0; i < NR_CPUS; i++) {
+   if (!chan->buf[i] || chan->buf[i] == prev)
+   break;
+   __relay_reset_consumed(chan->buf[i]);
+   prev = chan->buf[i];
+   }
+}
+EXPORT_SYMBOL_GPL(relay_reset_consumed);
+
 /*
  * relay_open_buf - create a new relay channel buffer
  *
@@ -840,11 +891,9 @@ static int relay_file_read_avail(struct rchan_buf *buf, 
size_t read_pos)
return 1;
}
 
-   if (unlikely(produced - consumed >= n_subbufs)) {
-   consumed = (produced / n_subbufs) * n_subbufs;
-   buf->subbufs_consumed = consumed;
-   }
-   
+   if (unlikely(produced - consumed >= n_subbufs))
+   __relay_reset_consumed(buf);
+
produced = (produced % n_subbufs) * subbuf_size + buf->offset;
consumed = (consumed % n_subbufs) * subbuf_size + buf->bytes_consumed;
 



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at

[RFC PATCH 1/3] Generic Trace Setup and Control (GTSC) Documentation

2007-06-29 Thread Tom Zanussi

This is the documentation for the Generic Trace Setup and Control
patchset, first submitted a couple of weeks ago.  See

http://marc.info/?l=linux-kernel=118214274912586=2

for a more detailed description.

I've updated this patch to incorporate the suggestions made by Alexey
Dobriyan in that thread.

Signed-off-by: Tom Zanussi <[EMAIL PROTECTED]>
Signed-off-by: David Wilder <[EMAIL PROTECTED]>
---
 gtsc.txt |  247 +++
 1 file changed, 247 insertions(+)

diff --git a/Documentation/gtsc.txt b/Documentation/gtsc.txt
new file mode 100644
index 000..470d1fc
--- /dev/null
+++ b/Documentation/gtsc.txt
@@ -0,0 +1,247 @@
+Generic Trace Setup and Control (GTSC)
+==
+In the kernel, GTSC provides a simple API for starting and managing
+data channels to user space.  GTSC builds on the relay interface. For a
+complete description of the relay interface, please see:
+Documentation/filesystems/relay.txt.
+
+GTSC provides one layer in a complete tracing application.  The idea of
+the GTSC is to provide a kernel API for the setup and control of tracing
+channels.  User of GTSC must provide a data layer responsible for formatting
+and writing data into the trace channels.  
+
+A layered approach to tracing
+=
+A complete kernel tracing application consists of a data provider and a data
+consumer.  Both provider and consumer contain three layers; each layer works
+in tandem with the corresponding layer in the opposite side.  The layers are
+represented in the following diagram.
+  
+Provider Data layer
+   Formats raw trace data and provides data-related service.
+   For example, adding timestamps used by consumer to sort data.
+
+Provider Control layer
+   Provided by GTSC. Creates trace channels and informs the data layer
+   and consumer of the current state of the trace channels.
+
+Provider Buffering layer
+   Provided by relay. This layer buffers data in the
+   kernel for consumption by the consumer's buffer
+   layer.
+
+Provider (in-kernel facility)
+-
+Consumer (user application)
+
+
+Consumer Buffer layer
+   Reads/consumes data from the provider's data buffers.
+
+Consumer Control layer
+   Communicates to the provider's control layer to control the state
+   of the trace channels. 
+
+Consumer Data layer
+   Sorts and formats data as provided by the provider's data layer.
+
+The provider is coded as a kernel facility.  The consumer is coded as
+a user application.
+ 
+
+GTSC - Features
+==
+The GTSC exploits services and features provided by relay.  These features are:
+- The creation and destruction of relay channels.
+- Buffer management.  Overwrite or non-overwrite modes can be selected
+  as well as global or per-CPU buffering.
+
+Overwrite mode can be called "flight recorder mode".  Flight recorder
+mode is selected by setting the TRACE_FLIGHT_CHANNEL flag when
+creating trace channels.  In flight mode when a tracing buffer is
+full, the oldest records in the buffer will be discarded to make room
+as new records arrive.  In the default non-overwrite mode, new records
+may be written only if the buffer has room.  In either case, to
+prevent data loss, a user space reader must keep the buffers
+drained. GTSC provides a means to detect the number of records that
+have been dropped due to a buffer-full condition (non-overwrite mode
+only).
+
+When per-CPU buffers are used, relay creates one debugfs file for each
+running CPU.  The user-space consumer of the data is responsible for
+reading the per-CPU buffers and collating the records presumably using
+a time stamp or sequence number included in the trace records.  The
+use of global buffers eliminates this extra work of sequencing
+records; however the provider's data layer must hold a lock when
+writing records.  The lock prevents writers running on different CPUs
+from overwriting each other's data.  However, buffering may be slower
+because write to the buffer are serialized. Global buffering is
+selected by setting the TRACE_GLOBAL_CHANNEL flag when creating trace
+channels.
+
+GTSC User Interface
+===
+When a GTSC channel is created and tracing has been started, the following
+directories and files are created in the root of the mounted debugfs.
+
+/debug (root of the debugfs)
+/
+/
+trace0 ... traceN  (Per-CPU trace data, one per CPU)
+   state  (Used to  start and stop tracing)
+dropped(number of records dropped due
+to a full-buffer condition, only)
+for non-TRACE_FLIGHT_CHANNELs)
+rewind ('un-consume' channel data i.e.
+

[RFC PATCH 4/6] relay: add relay_reserve_cpu()

2007-06-29 Thread Tom Zanussi

This patch adds the ability to explicitly specify the per-cpu buffer
to reserve space in.  Needed for early DTI tracing.

Signed-off-by: Tom Zanussi <[EMAIL PROTECTED]>
Signed-off-by: David Wilder <[EMAIL PROTECTED]>
---
 relay.h |   33 +
 1 file changed, 33 insertions(+)

diff --git a/include/linux/relay.h b/include/linux/relay.h
index 6caedef..37a7306 100644
--- a/include/linux/relay.h
+++ b/include/linux/relay.h
@@ -269,6 +269,39 @@ static inline void *relay_reserve(struct rchan *chan, 
size_t length)
 }
 
 /**
+ * relay_reserve_cpu - reserve slot in given cpu's channel buffer
+ * @chan: relay channel
+ * @length: number of bytes to reserve
+ * @cpu: cpu to log to
+ *
+ * Returns pointer to reserved slot, NULL if full.
+ *
+ * Reserves a slot in the given cpu's channel buffer.
+ * Does not protect the buffer at all - caller must provide
+ * appropriate synchronization.
+ *
+ * NOTE: this is almost certainly not the function you want -
+ * use relay_reserve() instead for normal logging.  This version
+ * is specialized for things like early tracing.
+ */
+static inline void *relay_reserve_cpu(struct rchan *chan, size_t length,
+ unsigned int cpu)
+{
+   void *reserved;
+   struct rchan_buf *buf = chan->buf[cpu];
+
+   if (unlikely(buf->offset + length > buf->chan->subbuf_size)) {
+   length = relay_switch_subbuf(buf, length);
+   if (!length)
+   return NULL;
+   }
+   reserved = buf->data + buf->offset;
+   buf->offset += length;
+
+   return reserved;
+}
+
+/**
  * subbuf_start_reserve - reserve bytes at the start of a sub-buffer
  * @buf: relay channel buffer
  * @length: number of bytes to reserve



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[RFC PATCH 5/6] relay: add relay_kernel_read()

2007-06-29 Thread Tom Zanussi

This patch adds a relay_kernel_read() function to relay, which allows
kernel clients to easily extract only the data (i.e. and skip over
padding) from a channel buffer.

Signed-off-by: Tom Zanussi <[EMAIL PROTECTED]>
Signed-off-by: David Wilder <[EMAIL PROTECTED]>
---
 Documentation/filesystems/relay.txt |   14 ++
 include/linux/relay.h   |5 ++
 kernel/relay.c  |   79 
 3 files changed, 90 insertions(+), 8 deletions(-)

diff --git a/Documentation/filesystems/relay.txt 
b/Documentation/filesystems/relay.txt
index d31113a..e9f10cf 100644
--- a/Documentation/filesystems/relay.txt
+++ b/Documentation/filesystems/relay.txt
@@ -185,6 +185,7 @@ TBD(curr. line MT:/API/)
 
 relay_buf_full(buf)
 subbuf_start_reserve(buf, length)
+relay_kernel_read(rbuf, buffer, count, ppos)
 
 
 Creating a channel
@@ -446,6 +447,19 @@ closed.
 Misc
 
 
+relay_kernel_read() provides the same functionality as the userspace
+read(2) implementation, but instead of copying the relay buffer data
+to a buffer in user space, the data is copied to the supplied kernel
+buffer target.  As with user space read(), the sub-buffer padding is
+automatically removed from the output and is not seen by the reader.
+In the case of relay_kernel_read(), there is no file object associated
+with the reader, so it needs to supply a pointer to a ppos variable,
+which will be used to maintain the current read position instead.
+This is useful for applications that may want to provide an
+alternative interface to the relay buffer data or who want access to
+the buffer data without needing to know anything about buffer
+internals.
+
 Some applications may want to keep a channel around and re-use it
 rather than open and close a new channel for each use.  relay_reset()
 can be used for this purpose - it resets a channel to its initial
diff --git a/include/linux/relay.h b/include/linux/relay.h
index aca45fa..6caedef 100644
--- a/include/linux/relay.h
+++ b/include/linux/relay.h
@@ -181,6 +181,11 @@ extern int relay_buf_full(struct rchan_buf *buf);
 extern size_t relay_switch_subbuf(struct rchan_buf *buf,
  size_t length);
 
+extern ssize_t relay_kernel_read(struct rchan_buf *rbuf,
+char *buffer,
+size_t count,
+loff_t *ppos);
+
 /**
  * relay_write - write data into the channel
  * @chan: relay channel
diff --git a/kernel/relay.c b/kernel/relay.c
index 6806636..ed58ee6 100644
--- a/kernel/relay.c
+++ b/kernel/relay.c
@@ -987,6 +987,24 @@ static size_t relay_file_read_end_pos(struct rchan_buf 
*buf,
return end_pos;
 }
 
+/**
+ * subbuf_kernel_read_actor - read up to one subbuf's worth of data
+ */
+static int subbuf_kernel_read_actor(size_t read_start,
+   struct rchan_buf *buf,
+   size_t avail,
+   read_descriptor_t *desc,
+   read_actor_t actor)
+{
+   void *from = buf->start + read_start;
+   memcpy(desc->arg.data, from, avail);
+   desc->arg.data += avail;
+   desc->written += avail;
+   desc->count -= avail;
+
+   return avail;
+}
+
 /*
  * subbuf_read_actor - read up to one subbuf's worth of data
  */
@@ -1058,19 +1076,17 @@ typedef int (*subbuf_actor_t) (size_t read_start,
 /*
  * relay_file_read_subbufs - read count bytes, bridging subbuf boundaries
  */
-static ssize_t relay_file_read_subbufs(struct file *filp, loff_t *ppos,
+static ssize_t relay_file_read_subbufs(struct rchan_buf *buf, loff_t *ppos,
subbuf_actor_t subbuf_actor,
read_actor_t actor,
read_descriptor_t *desc)
 {
-   struct rchan_buf *buf = filp->private_data;
size_t read_start, avail;
int ret;
 
if (!desc->count)
return 0;
 
-   mutex_lock(>f_path.dentry->d_inode->i_mutex);
do {
if (!relay_file_read_avail(buf, *ppos))
break;
@@ -1090,23 +1106,62 @@ static ssize_t relay_file_read_subbufs(struct file 
*filp, loff_t *ppos,
*ppos = relay_file_read_end_pos(buf, read_start, ret);
}
} while (desc->count && ret);
-   mutex_unlock(>f_path.dentry->d_inode->i_mutex);
 
return desc->written;
 }
 
+/**
+ * relay_kernel_read - read relay buffer into kernel target buffer
+ * @rbuf: the relay buffer struct
+ * @buffer: the target kernel buffer
+ * @count: number of bytes to read
+ * @ppos: pointer to read position variable
+ *
+ * Returns the number of bytes copied into buffer.
+ *
+ * Performs the same function as user space read, except that
+ * relay buffer contents are copied to the supplied kernel buffer
+

[PATCH 3/4] void unregister_blkdev - ignore the return value

2007-06-29 Thread Akinobu Mita

Some cdrom drivers stop destruct operations in module_exit() when
unregister_blkdev() failure happens.

But it can't help to stop unloading module. So it will not be good
error handling. Furthermore any other block drivers don't have such
handling and there is no special reason that only those cdrom drivers
have to do.

This patch removes the return value checks for unregister_blkdev().

This change will not hide the bugs. Because unregister_blkdev() prints
error message on failures by the previous patch.

Cc: Eberhard Moenkeberg <[EMAIL PROTECTED]>
Cc: Oliver Raupach <[EMAIL PROTECTED]>
Signed-off-by: Akinobu Mita <[EMAIL PROTECTED]>

---
 drivers/cdrom/aztcd.c  |5 +
 drivers/cdrom/cdu31a.c |6 +-
 drivers/cdrom/cm206.c  |5 +
 drivers/cdrom/gscd.c   |5 +
 drivers/cdrom/optcd.c  |5 +
 drivers/cdrom/sbpcd.c  |8 ++--
 6 files changed, 7 insertions(+), 27 deletions(-)

Index: 2.6-mm/drivers/cdrom/optcd.c
===
--- 2.6-mm.orig/drivers/cdrom/optcd.c
+++ 2.6-mm/drivers/cdrom/optcd.c
@@ -2089,10 +2089,7 @@ static void __exit optcd_exit(void)
 {
del_gendisk(optcd_disk);
put_disk(optcd_disk);
-   if (unregister_blkdev(MAJOR_NR, "optcd") == -EINVAL) {
-   printk(KERN_ERR "optcd: what's that: can't unregister\n");
-   return;
-   }
+   unregister_blkdev(MAJOR_NR, "optcd");
blk_cleanup_queue(opt_queue);
release_region(optcd_port, 4);
printk(KERN_INFO "optcd: module released.\n");
Index: 2.6-mm/drivers/cdrom/sbpcd.c
===
--- 2.6-mm.orig/drivers/cdrom/sbpcd.c
+++ 2.6-mm/drivers/cdrom/sbpcd.c
@@ -5885,12 +5885,8 @@ int __init sbpcd_init(void)
 static void sbpcd_exit(void)
 {
int j;
-   
-   if ((unregister_blkdev(MAJOR_NR, major_name) == -EINVAL))
-   {
-   msg(DBG_INF, "What's that: can't unregister %s.\n", major_name);
-   return;
-   }
+
+   unregister_blkdev(MAJOR_NR, major_name);
release_region(CDo_command,4);
blk_cleanup_queue(sbpcd_queue);
for (j=0;j 0)
Index: 2.6-mm/drivers/cdrom/gscd.c
===
--- 2.6-mm.orig/drivers/cdrom/gscd.c
+++ 2.6-mm/drivers/cdrom/gscd.c
@@ -882,10 +882,7 @@ static void __exit gscd_exit(void)
 
del_gendisk(gscd_disk);
put_disk(gscd_disk);
-   if ((unregister_blkdev(MAJOR_NR, "gscd") == -EINVAL)) {
-   printk("What's that: can't unregister GoldStar-module\n");
-   return;
-   }
+   unregister_blkdev(MAJOR_NR, "gscd");
blk_cleanup_queue(gscd_queue);
release_region(gscd_port, GSCD_IO_EXTENT);
printk(KERN_INFO "GoldStar-module released.\n");
Index: 2.6-mm/drivers/cdrom/aztcd.c
===
--- 2.6-mm.orig/drivers/cdrom/aztcd.c
+++ 2.6-mm/drivers/cdrom/aztcd.c
@@ -1941,10 +1941,7 @@ static void __exit aztcd_exit(void)
 {
del_gendisk(azt_disk);
put_disk(azt_disk);
-   if ((unregister_blkdev(MAJOR_NR, "aztcd") == -EINVAL)) {
-   printk("What's that: can't unregister aztcd\n");
-   return;
-   }
+   unregister_blkdev(MAJOR_NR, "aztcd");
blk_cleanup_queue(azt_queue);
if ((azt_port == 0x1f0) || (azt_port == 0x170)) {
SWITCH_IDE_MASTER;
Index: 2.6-mm/drivers/cdrom/cm206.c
===
--- 2.6-mm.orig/drivers/cdrom/cm206.c
+++ 2.6-mm/drivers/cdrom/cm206.c
@@ -1548,10 +1548,7 @@ static void __exit cm206_exit(void)
printk("Can't unregister cdrom cm206\n");
return;
}
-   if (unregister_blkdev(MAJOR_NR, "cm206")) {
-   printk("Can't unregister major cm206\n");
-   return;
-   }
+   unregister_blkdev(MAJOR_NR, "cm206");
blk_cleanup_queue(cm206_queue);
free_irq(cm206_irq, NULL);
kfree(cd);
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[RFC PATCH 3/6] Conversion of some s390 drivers to DTI

2007-06-29 Thread Tom Zanussi

This patch does a first pass at converting some of the s390 drivers
from s390dbf to DTI.  The others can be converted later if DTI is ever
merged.  Note that the location of the trace files is now under
/sys/kernel/debug/dti rather than /sys/kernel/debug/s390dbf.  There
are also some other differences e.g. there are no longer 'hex views'
in the kernel - this would be done instead via the userspace
'dtiprint' utility.  Also, dynamic resizing isn't currently supported.

Signed-off-by: Tom Zanussi <[EMAIL PROTECTED]>
Signed-off-by: David Wilder <[EMAIL PROTECTED]>
---
 block/dasd.c |   18 --
 block/dasd_int.h |   54 --
 char/tape.h  |9 +
 char/tape_34xx.c |9 -
 char/tape_3590.c |9 -
 char/tape_core.c |9 -
 cio/cio.c|   27 ---
 cio/cio_debug.h  |   19 ---
 8 files changed, 73 insertions(+), 81 deletions(-)

diff --git a/drivers/s390/block/dasd.c b/drivers/s390/block/dasd.c
index bfeca57..eafdaa7 100644
--- a/drivers/s390/block/dasd.c
+++ b/drivers/s390/block/dasd.c
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -35,7 +36,7 @@
 /*
  * SECTION: exported variables of dasd.c
  */
-debug_info_t *dasd_debug_area;
+struct dti_info *dasd_debug_area;
 struct dasd_discipline *dasd_diag_discipline_pointer;
 void dasd_int_handler(struct ccw_device *, unsigned long, struct irb *);
 
@@ -181,10 +182,8 @@ dasd_state_known_to_basic(struct dasd_device * device)
return rc;
 
/* register 'device' debug area, used for all DBF_DEV_XXX calls */
-   device->debug_area = debug_register(device->cdev->dev.bus_id, 1, 2,
-   8 * sizeof (long));
-   debug_register_view(device->debug_area, _sprintf_view);
-   debug_set_level(device->debug_area, DBF_WARNING);
+   device->debug_area = dti_register(device->cdev->dev.bus_id, 4096, 4);
+   dti_set_level(device->debug_area, DBF_WARNING);
DBF_DEV_EVENT(DBF_EMERG, device, "%s", "debug area created");
 
device->state = DASD_STATE_BASIC;
@@ -207,7 +206,7 @@ dasd_state_basic_to_known(struct dasd_device * device)
 
DBF_DEV_EVENT(DBF_EMERG, device, "%p debug area deleted", device);
if (device->debug_area != NULL) {
-   debug_unregister(device->debug_area);
+   dti_unregister(device->debug_area);
device->debug_area = NULL;
}
device->state = DASD_STATE_KNOWN;
@@ -1924,7 +1923,7 @@ dasd_exit(void)
dasd_gendisk_exit();
dasd_devmap_exit();
if (dasd_debug_area != NULL) {
-   debug_unregister(dasd_debug_area);
+   dti_unregister(dasd_debug_area);
dasd_debug_area = NULL;
}
 }
@@ -2231,13 +2230,12 @@ dasd_init(void)
init_waitqueue_head(_flush_wq);
 
/* register 'common' DASD debug area, used for all DBF_XXX calls */
-   dasd_debug_area = debug_register("dasd", 1, 2, 8 * sizeof (long));
+   dasd_debug_area = dti_register("dasd", 4096, 4);
if (dasd_debug_area == NULL) {
rc = -ENOMEM;
goto failed;
}
-   debug_register_view(dasd_debug_area, _sprintf_view);
-   debug_set_level(dasd_debug_area, DBF_WARNING);
+   dti_set_level(dasd_debug_area, DBF_WARNING);
 
DBF_EVENT(DBF_EMERG, "%s", "debug area created");
 
diff --git a/drivers/s390/block/dasd_int.h b/drivers/s390/block/dasd_int.h
index 241294c..324ed1e 100644
--- a/drivers/s390/block/dasd_int.h
+++ b/drivers/s390/block/dasd_int.h
@@ -13,6 +13,8 @@
 
 #ifdef __KERNEL__
 
+#include 
+
 /* we keep old device allocation scheme; IOW, minors are still in 0..255 */
 #define DASD_PER_MAJOR (1U << (MINORBITS - DASD_PARTN_BITS))
 #define DASD_PARTN_MASK ((1 << DASD_PARTN_BITS) - 1)
@@ -82,45 +84,45 @@ typedef enum {
  */
 #define DBF_DEV_EVENT(d_level, d_device, d_str, d_data...) \
 do { \
-   debug_sprintf_event(d_device->debug_area, \
-   d_level, \
-   d_str "\n", \
-   d_data); \
+   __dti_printk(d_device->debug_area,   \
+d_level,\
+d_str "\n", \
+d_data);\
 } while(0)
 
 #define DBF_DEV_EXC(d_level, d_device, d_str, d_data...) \
 do { \
-   debug_sprintf_exception(d_device->debug_area, \
-   d_level, \
-   d_str "\n", \
-   d_data); \
+   __dti_printk(d_device->debug_area, \
+d_level,  \
+d_str "\n",   \
+d_data);  \
 } while(0)
 
 #define DBF_EVENT(d_level, d_str, d_data...)\
 do { \
-   debug_sprintf_event(dasd_debug_area, \
-

[RFC PATCH 2/6] Driver Tracing Interface (DTI) code

2007-06-29 Thread Tom Zanussi

The Driver Tracing Interface (DTI) code.

Signed-off-by: Tom Zanussi <[EMAIL PROTECTED]>
Signed-off-by: David Wilder <[EMAIL PROTECTED]>
---
 drivers/base/Kconfig   |   11 
 drivers/base/Makefile  |1 
 drivers/base/dti.c |  836 +
 drivers/base/dti_merged_view.c |  332 
 include/linux/dti.h|  293 ++
 5 files changed, 1473 insertions(+)

diff --git a/drivers/base/Kconfig b/drivers/base/Kconfig
index 5d6312e..fbc9c0e 100644
--- a/drivers/base/Kconfig
+++ b/drivers/base/Kconfig
@@ -49,6 +49,17 @@ config DEBUG_DEVRES
 
  If you are unsure about this, Say N here.
 
+config DTI
+   bool "Driver Tracing Interface (DTI)"
+   select GTSC
+   help
+   Provides  functions to write variable length trace records
+   into a wraparound memory trace buffer. One purpose of
+   this is to inspect the debug traces after a system crash in order to
+   analyze the reason for the failure. The traces are accessable from
+   system dumps via dump analysis tools like crash or lcrash. In live
+   systems the traces can be read via a debugfs interface.
+
 config SYS_HYPERVISOR
bool
default n
diff --git a/drivers/base/Makefile b/drivers/base/Makefile
index b39ea3f..7caa5f5 100644
--- a/drivers/base/Makefile
+++ b/drivers/base/Makefile
@@ -12,6 +12,7 @@ obj-$(CONFIG_NUMA)+= node.o
 obj-$(CONFIG_MEMORY_HOTPLUG_SPARSE) += memory.o
 obj-$(CONFIG_SMP)  += topology.o
 obj-$(CONFIG_SYS_HYPERVISOR) += hypervisor.o
+obj-$(CONFIG_DTI) += dti.o dti_merged_view.o
 
 ifeq ($(CONFIG_DEBUG_DRIVER),y)
 EXTRA_CFLAGS += -DDEBUG
diff --git a/drivers/base/dti.c b/drivers/base/dti.c
new file mode 100644
index 000..2feec11
--- /dev/null
+++ b/drivers/base/dti.c
@@ -0,0 +1,836 @@
+/*
+ *Linux Driver Tracing Interface.
+ *
+ *Copyright (C) IBM Corp. 2007
+ *Author(s): Tom Zanussi <[EMAIL PROTECTED]>
+ *   Dave Wilder <[EMAIL PROTECTED]>
+ *   Michael Holzheu <[EMAIL PROTECTED]>
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+extern int dti_create_merged_views(struct dti_info *dti);
+extern void dti_remove_merged_views(struct dti_info *dti);
+struct file_operations level_fops;
+
+static inline int nr_sub(int size)
+{
+   if (size < 4)
+   return 0;
+   
+   if (size >= 8 * PAGE_SIZE)
+   return 8;
+   else
+   return 4;
+}
+
+static inline int sub_size(int size)
+{
+   if (size < 4)
+   return 0;
+   
+   if (size >= 8 * PAGE_SIZE)
+   return size / 8;
+   else
+   return size / 4;
+}
+
+/*
+ * For dti_printk, maximum size of klog formatting buffer beyond which
+ * truncation will occur
+ */
+#define DTI_PRINTF_TMPBUF_SIZE (1024)
+
+/* per-cpu dti_printf formatting temporary buffer */
+static char dti_printf_tmpbuf[NR_CPUS][DTI_PRINTF_TMPBUF_SIZE];
+
+/*
+ * Low-level registration functions
+ */
+
+static struct dti_info *__dti_register_level(const char *name, int level,
+int sub_size, int nr_sub,
+struct dti_handle *handle)
+{
+   struct dti_info *dti;
+
+   dti = kzalloc(sizeof(*dti), GFP_KERNEL);
+   if(!dti)
+   return NULL;
+
+   dti->trace = trace_setup("dti", name, sub_size, nr_sub,
+TRACE_FLIGHT_CHANNEL | TRACE_DISABLE_STATE);
+   if (!dti->trace)
+   goto setup_failed;
+   
+   dti->handle = handle;
+   dti->level = level;
+   dti->level_ctrl = debugfs_create_file("level", 0,
+ dti->trace->dir, dti,
+ _fops);
+   if (!dti->level_ctrl) {
+   printk("Couldn't create level control file\n");
+   goto level_failed;
+   }
+
+   strncpy(dti->name, name, NAME_MAX);
+
+   return dti;
+
+level_failed:
+   trace_cleanup(dti->trace);
+setup_failed:
+   kfree(dti);
+
+   return NULL;
+}
+
+/**
+ * dti_register_level: create trace dir and level ctrl file
+ *
+ * Internal - exported for setup macros.
+ */
+struct dti_info *dti_register_level(const char *name, int level,
+   struct dti_handle *handle)
+{
+   return __dti_register_level(name, level, sub_size(handle->size),
+   nr_sub(handle->size), handle);
+}
+EXPORT_SYMBOL_GPL(dti_register_level);
+
+static void dti_unregister_level(struct dti_info *dti)
+{
+   debugfs_remove(dti->level_ctrl);
+   trace_cleanup(dti->trace);
+   kfree(dti);
+}
+
+/**
+ * dti_register_channel: create channel part of new trace
+ */
+static int dti_register_channel(struct dti_info *dti)
+{
+   int rc = 0;
+   
+   rc = trace_start(dti->trace);
+   if (rc)
+   return rc;

[RFC PATCH 1/6] Driver Tracing Interface (DTI) Documentation

2007-06-29 Thread Tom Zanussi

Hi,

This patchset contains the code for a tracing and debugging facility
named 'The Driver Tracing Interface (DTI)' after the title of our OLS
paper, to be presented tomorrow. ;-)

It was originally based on ideas from a useful s390 tracing utility
called s390dbf (see Documentation/s390/s390dbf.txt), which allows
driver writers to set up and continually log to a small (or large)
circular 'flight-recording' buffer.  When something goes wrong, the
buffer can be cat'ed from userspace and the contents analyzed to help
determine the source of the problem.

The s390dbf facility supports multiple log levels, so the level of
detail being logged can be dynamically controlled (again in user space
via a control file).  which in turn provides even more help in
determining the source of the problem.  This facility has apparently
proven itself very useful in the field and has helped solve driver
problems on production systems.

The Driver Tracing Interface introduced here does essentially the same
thing, but makes the functionality available to all architectures, not
just s390.  It's built on top of the relay interface, and so makes
per-cpu logging available to users of the interface, something which
the s390dbf facility currently lacks.

Because the data is logged to per-cpu buffers, there needs to be an
easy way for users to easily view it without having to read and merge
the individual per-cpu buffers themselves.  To solve this problem, DTI
provides a merged 'view' via a debugfs file, which merges and presents
the per-cpu data when read.

DTI is also very useful as a debugging tool for kernel development.  I
personally use the klog example in the relay-apps user code all the
time for quick-and-dirty tracing, and I know that other people use
similar home-grown tracing tools (and even post them here).
Hopefully, the DTI interface provides a decent enough simplification
of the relay API that using it for these types of tracing applications
would be attractive to people put off by the bare relay API.

I created a DTI sourceforge site earlier today, but it hasn't been
approved yet.  Once it's available, I'll put a few example patches up
there along with some user tools (for binary tracing).  But in the
meantime, here's a short description of the examples:

- dti-ext2-example.patch - ext2 contains some debugging code that uses
  printk().  Redefining ext2_debug() in ext2_fs.h to use dti_printk()
  basically turns the existing debugging code into a flight-recording
  DTI channel.

- dti-systrace-example.patch - does binary tracing of all interrupts,
  system calls and schedule changes (x86).  This demonstrates using a
  single DTI channel across multiple 'subsystems' as well as custom
  formatting and presentation of the binary data (using the DTI user
  code).

- dti-early-example.patch - does early tracing i.e. before initcalls.
  Logs data starting from the beginning of start_kernel() and when the
  system comes up, boot data is available in a DTI debugfs file.

This patchset includes a few additions to relay, and also a patch that
converts some of the s390 drivers from s390dbf to DTI.

DTI uses the Generic Tracing Setup and Control (GTSC) API, which was
posted a couple of weeks ago and which I'll post an updated version of
following this patchset.

For maximum satisfaction, you should also make sure your kernel has
these two relay patches applied:

http://marc.info/?l=linux-kernel=118214275032484=2
http://marc.info/?l=linux-kernel=118232840921694=2

Thanks also to Michael Ellerman for some useful suggestions on the
interface.

Tom

Signed-off-by: Tom Zanussi <[EMAIL PROTECTED]>
Signed-off-by: David Wilder <[EMAIL PROTECTED]>
---
 dti.txt |  272 
 1 file changed, 272 insertions(+)

diff --git a/Documentation/drivers/dti.txt b/Documentation/drivers/dti.txt
new file mode 100644
index 000..e071167
--- /dev/null
+++ b/Documentation/drivers/dti.txt
@@ -0,0 +1,272 @@
+Driver Tracing Interface (DTI)
+==
+
+The Driver Tracing Interface provides easy-to-use interfaces for
+logging per-cpu 'flight recorder' data in the kernel and merging and
+presenting it to userspace.
+
+On the kernel side, data can be logged as text using a printk()-like
+logging function, or it can be logged in binary form using a couple of
+special binary event logging functions.
+
+On the user side, the combined contents of all the data present in the
+per-cpu relay files can be read from a single debugfs file that merges
+and presents the contents of the per-cpu data files.  This is mostly
+useful for text-based logging, but also works for binary logging.  For
+binary logging, there is a small userspace library and examples
+available which merge and display the binary per-cpu data in userspace
+rather than in the kernel.  See http://sourceforge.net/projects/dti.
+
+DTI also presents an interface to user space allowing it to control the
+amount or 'level' of

[PATCH 2/4 -mm] void unregister_blkdev - delete redundant message

2007-06-29 Thread Akinobu Mita

No need to warn unregister_blkdev() failure by caller.
(The previous patch makes unregister_blkdev() print error message in
error case)

Cc: Grant Likely <[EMAIL PROTECTED]>
Signed-off-by: Akinobu Mita <[EMAIL PROTECTED]>

---
 drivers/block/xsysace.c |4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

Index: 2.6-mm/drivers/block/xsysace.c
===
--- 2.6-mm.orig/drivers/block/xsysace.c
+++ 2.6-mm/drivers/block/xsysace.c
@@ -1157,9 +1157,7 @@ static void __exit ace_exit(void)
 {
pr_debug("Unregistering Xilinx SystemACE driver\n");
driver_unregister(_driver);
-   if (unregister_blkdev(ace_major, "xsysace"))
-   printk(KERN_WARNING "systemace unregister_blkdev(%i) failed\n",
-  ace_major);
+   unregister_blkdev(ace_major, "xsysace");
 }
 
 module_init(ace_init);
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 2/4] void unregister_blkdev - delete redundant messages

2007-06-29 Thread Akinobu Mita

No need to warn unregister_blkdev() failure by the callers.
(The previous patch makes unregister_blkdev() print error message in
error case)

Signed-off-by: Akinobu Mita <[EMAIL PROTECTED]>

---
 drivers/block/acsi.c |4 +---
 drivers/block/loop.c |3 +--
 drivers/block/z2ram.c|4 +---
 drivers/cdrom/cdu31a.c   |4 +---
 drivers/cdrom/mcdx.c |   11 ++-
 drivers/cdrom/sbpcd.c|5 +
 drivers/cdrom/sjcd.c |9 -
 drivers/cdrom/sonycd535.c|6 ++
 drivers/md/dm.c  |4 +---
 drivers/s390/block/dcssblk.c |7 +--
 drivers/sbus/char/jsflash.c  |3 +--
 11 files changed, 16 insertions(+), 44 deletions(-)

Index: 2.6-mm/drivers/block/acsi.c
===
--- 2.6-mm.orig/drivers/block/acsi.c
+++ 2.6-mm/drivers/block/acsi.c
@@ -1775,9 +1775,7 @@ void cleanup_module(void)
del_timer( _timer );
blk_cleanup_queue(acsi_queue);
atari_stram_free( acsi_buffer );
-
-   if (unregister_blkdev( ACSI_MAJOR, "ad" ) != 0)
-   printk( KERN_ERR "acsi: cleanup_module failed\n");
+   unregister_blkdev(ACSI_MAJOR, "ad");
 
for (i = 0; i < NDevices; i++) {
del_gendisk(acsi_gendisk[i]);
Index: 2.6-mm/drivers/block/z2ram.c
===
--- 2.6-mm.orig/drivers/block/z2ram.c
+++ 2.6-mm/drivers/block/z2ram.c
@@ -371,9 +371,7 @@ static void __exit z2_exit(void)
 {
 int i, j;
 blk_unregister_region(MKDEV(Z2RAM_MAJOR, 0), 256);
-if ( unregister_blkdev( Z2RAM_MAJOR, DEVICE_NAME ) != 0 )
-   printk( KERN_ERR DEVICE_NAME ": unregister of device failed\n");
-
+unregister_blkdev(Z2RAM_MAJOR, DEVICE_NAME);
 del_gendisk(z2ram_gendisk);
 put_disk(z2ram_gendisk);
 blk_cleanup_queue(z2_queue);
Index: 2.6-mm/drivers/s390/block/dcssblk.c
===
--- 2.6-mm.orig/drivers/s390/block/dcssblk.c
+++ 2.6-mm/drivers/s390/block/dcssblk.c
@@ -747,14 +747,9 @@ dcssblk_check_params(void)
 static void __exit
 dcssblk_exit(void)
 {
-   int rc;
-
PRINT_DEBUG("DCSSBLOCK EXIT...\n");
s390_root_dev_unregister(dcssblk_root_dev);
-   rc = unregister_blkdev(dcssblk_major, DCSSBLK_NAME);
-   if (rc) {
-   PRINT_ERR("unregister_blkdev() failed!\n");
-   }
+   unregister_blkdev(dcssblk_major, DCSSBLK_NAME);
PRINT_DEBUG("...finished!\n");
 }
 
Index: 2.6-mm/drivers/sbus/char/jsflash.c
===
--- 2.6-mm.orig/drivers/sbus/char/jsflash.c
+++ 2.6-mm/drivers/sbus/char/jsflash.c
@@ -619,8 +619,7 @@ static void __exit jsflash_cleanup_modul
jsf0.busy = 0;
 
misc_deregister(_dev);
-   if (unregister_blkdev(JSFD_MAJOR, "jsfd") != 0)
-   printk("jsfd: cleanup_module failed\n");
+   unregister_blkdev(JSFD_MAJOR, "jsfd");
blk_cleanup_queue(jsf_queue);
 }
 
Index: 2.6-mm/drivers/cdrom/mcdx.c
===
--- 2.6-mm.orig/drivers/cdrom/mcdx.c
+++ 2.6-mm/drivers/cdrom/mcdx.c
@@ -1050,13 +1050,7 @@ static void __exit mcdx_exit(void)
kfree(stuffp);
}
 
-   if (unregister_blkdev(MAJOR_NR, "mcdx") != 0) {
-   xwarn("cleanup() unregister_blkdev() failed\n");
-   }
-#if !MCDX_QUIET
-   else
-   xinfo("cleanup() succeeded\n");
-#endif
+   unregister_blkdev(MAJOR_NR, "mcdx");
blk_cleanup_queue(mcdx_queue);
 }
 
@@ -1240,8 +1234,7 @@ static int __init mcdx_init_drive(int dr
release_region(stuffp->wreg_data, MCDX_IO_SIZE);
kfree(stuffp);
put_disk(disk);
-   if (unregister_blkdev(MAJOR_NR, "mcdx") != 0)
-   xwarn("cleanup() unregister_blkdev() failed\n");
+   unregister_blkdev(MAJOR_NR, "mcdx");
blk_cleanup_queue(mcdx_queue);
return 2;
}
Index: 2.6-mm/drivers/cdrom/sjcd.c
===
--- 2.6-mm.orig/drivers/cdrom/sjcd.c
+++ 2.6-mm/drivers/cdrom/sjcd.c
@@ -1792,9 +1792,9 @@ out2:
 out1:
blk_cleanup_queue(sjcd_queue);
 out0:
-   if ((unregister_blkdev(MAJOR_NR, "sjcd") == -EINVAL))
-   printk("SJCD: cannot unregister device.\n");
-   return (-EIO);
+   unregister_blkdev(MAJOR_NR, "sjcd");
+
+   return -EIO;
 }
 
 static void __exit sjcd_exit(void)
@@ -1803,8 +1803,7 @@ static void __exit sjcd_exit(void)
put_disk(sjcd_disk);
release_region(sjcd_base, 4);
blk_cleanup_queue(sjcd_queue);
-   if ((unregister_blkdev(MAJOR_NR, "sjcd") == -EINVAL))
-   printk("SJCD: cannot unregister device.\n");
+   unregister_blkdev(MAJOR_NR, "sjcd");
printk(KERN_INFO "SJCD: module: removed.\n");
 }

[PATCH 1/4] void unregister_blkdev - do WARN_ON failure

2007-06-29 Thread Akinobu Mita

When unregister_blkdev() has failed, something wrong happened.
This patch adds WARN_ON to notify such badness.

Cc: Jens Axboe <[EMAIL PROTECTED]>
Signed-off-by: Akinobu Mita <[EMAIL PROTECTED]>

---
 block/genhd.c |5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

Index: 2.6-mm/block/genhd.c
===
--- 2.6-mm.orig/block/genhd.c
+++ 2.6-mm/block/genhd.c
@@ -120,9 +120,10 @@ int unregister_blkdev(unsigned int major
for (n = _names[index]; *n; n = &(*n)->next)
if ((*n)->major == major)
break;
-   if (!*n || strcmp((*n)->name, name))
+   if (!*n || strcmp((*n)->name, name)) {
+   WARN_ON(1);
ret = -EINVAL;
-   else {
+   } else {
p = *n;
*n = p->next;
}
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 2/2] void unregister_chrdev - make void

2007-06-29 Thread Akinobu Mita

unregister_chrdev() does not return meaningful value.
This patch makes it return void like most of unregister_* functions.

Signed-off-by: Akinobu Mita <[EMAIL PROTECTED]>

---
 fs/char_dev.c  |3 +--
 include/linux/fs.h |2 +-
 2 files changed, 2 insertions(+), 3 deletions(-)

Index: 2.6-mm/fs/char_dev.c
===
--- 2.6-mm.orig/fs/char_dev.c
+++ 2.6-mm/fs/char_dev.c
@@ -321,14 +321,13 @@ void unregister_chrdev_region(dev_t from
}
 }
 
-int unregister_chrdev(unsigned int major, const char *name)
+void unregister_chrdev(unsigned int major, const char *name)
 {
struct char_device_struct *cd;
cd = __unregister_chrdev_region(major, 0, 256);
if (cd && cd->cdev)
cdev_del(cd->cdev);
kfree(cd);
-   return 0;
 }
 
 static DEFINE_SPINLOCK(cdev_lock);
Index: 2.6-mm/include/linux/fs.h
===
--- 2.6-mm.orig/include/linux/fs.h
+++ 2.6-mm/include/linux/fs.h
@@ -1593,7 +1593,7 @@ extern int alloc_chrdev_region(dev_t *, 
 extern int register_chrdev_region(dev_t, unsigned, const char *);
 extern int register_chrdev(unsigned int, const char *,
   const struct file_operations *);
-extern int unregister_chrdev(unsigned int, const char *);
+extern void unregister_chrdev(unsigned int, const char *);
 extern void unregister_chrdev_region(dev_t, unsigned);
 extern int chrdev_open(struct inode *, struct file *);
 extern void chrdev_show(struct seq_file *,off_t);
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 1/2] void unregister_chrdev - ignore the return value

2007-06-29 Thread Akinobu Mita

register_chrdev() always returns 0.
There is no need to check the return value.

Signed-off-by: Akinobu Mita <[EMAIL PROTECTED]>

---
 arch/cris/arch-v10/drivers/pcf8563.c |4 +---
 arch/cris/arch-v32/drivers/pcf8563.c |4 +---
 arch/sparc64/solaris/socksys.c   |3 +--
 drivers/block/acsi_slm.c |3 +--
 drivers/char/ip2/ip2main.c   |4 +---
 drivers/char/mbcs.c  |7 +--
 drivers/char/stallion.c  |5 +
 drivers/char/viotape.c   |7 +--
 drivers/net/ppp_generic.c|3 +--
 sound/core/sound.c   |3 +--
 10 files changed, 10 insertions(+), 33 deletions(-)

Index: 2.6-mm/drivers/net/ppp_generic.c
===
--- 2.6-mm.orig/drivers/net/ppp_generic.c
+++ 2.6-mm/drivers/net/ppp_generic.c
@@ -2684,8 +2684,7 @@ static void __exit ppp_cleanup(void)
if (atomic_read(_unit_count) || atomic_read(_count))
printk(KERN_ERR "PPP: removing module but units remain!\n");
cardmap_destroy(_ppp_units);
-   if (unregister_chrdev(PPP_MAJOR, "ppp") != 0)
-   printk(KERN_ERR "PPP: failed to unregister PPP device\n");
+   unregister_chrdev(PPP_MAJOR, "ppp");
device_destroy(ppp_class, MKDEV(PPP_MAJOR, 0));
class_destroy(ppp_class);
 }
Index: 2.6-mm/drivers/char/ip2/ip2main.c
===
--- 2.6-mm.orig/drivers/char/ip2/ip2main.c
+++ 2.6-mm/drivers/char/ip2/ip2main.c
@@ -425,9 +425,7 @@ cleanup_module(void)
printk(KERN_ERR "IP2: failed to unregister tty driver (%d)\n", 
err);
}
put_tty_driver(ip2_tty_driver);
-   if ( ( err = unregister_chrdev ( IP2_IPL_MAJOR, pcIpl ) ) ) {
-   printk(KERN_ERR "IP2: failed to unregister IPL driver (%d)\n", 
err);
-   }
+   unregister_chrdev(IP2_IPL_MAJOR, pcIpl);
remove_proc_entry("ip2mem", _root);
 
// free memory
Index: 2.6-mm/sound/core/sound.c
===
--- 2.6-mm.orig/sound/core/sound.c
+++ 2.6-mm/sound/core/sound.c
@@ -451,8 +451,7 @@ static void __exit alsa_sound_exit(void)
 {
snd_info_minor_unregister();
snd_info_done();
-   if (unregister_chrdev(major, "alsa") != 0)
-   snd_printk(KERN_ERR "unable to unregister major device number 
%d\n", major);
+   unregister_chrdev(major, "alsa");
 }
 
 module_init(alsa_sound_init)
Index: 2.6-mm/drivers/block/acsi_slm.c
===
--- 2.6-mm.orig/drivers/block/acsi_slm.c
+++ 2.6-mm/drivers/block/acsi_slm.c
@@ -1025,8 +1025,7 @@ int init_module(void)
 
 void cleanup_module(void)
 {
-   if (unregister_chrdev( ACSI_MAJOR, "slm" ) != 0)
-   printk( KERN_ERR "acsi_slm: cleanup_module failed\n");
+   unregister_chrdev(ACSI_MAJOR, "slm");
atari_stram_free( SLMBuffer );
 }
 #endif
Index: 2.6-mm/drivers/char/stallion.c
===
--- 2.6-mm.orig/drivers/char/stallion.c
+++ 2.6-mm/drivers/char/stallion.c
@@ -4797,7 +4797,6 @@ static void __exit stallion_module_exit(
 {
struct stlbrd *brdp;
unsigned int i, j;
-   int retval;
 
pr_debug("cleanup_module()\n");
 
@@ -4820,9 +4819,7 @@ static void __exit stallion_module_exit(
 
for (i = 0; i < 4; i++)
class_device_destroy(stallion_class, MKDEV(STL_SIOMEMMAJOR, i));
-   if ((retval = unregister_chrdev(STL_SIOMEMMAJOR, "staliomem")))
-   printk("STALLION: failed to un-register serial memory device, "
-   "errno=%d\n", -retval);
+   unregister_chrdev(STL_SIOMEMMAJOR, "staliomem");
class_destroy(stallion_class);
 
pci_unregister_driver(_pcidriver);
Index: 2.6-mm/arch/cris/arch-v10/drivers/pcf8563.c
===
--- 2.6-mm.orig/arch/cris/arch-v10/drivers/pcf8563.c
+++ 2.6-mm/arch/cris/arch-v10/drivers/pcf8563.c
@@ -180,9 +180,7 @@ err:
 void __exit
 pcf8563_exit(void)
 {
-   if (unregister_chrdev(PCF8563_MAJOR, DEVICE_NAME) < 0) {
-   printk(KERN_INFO "%s: Unable to unregister device.\n", 
PCF8563_NAME);
-   }
+   unregister_chrdev(PCF8563_MAJOR, DEVICE_NAME);
 }
 
 /*
Index: 2.6-mm/arch/sparc64/solaris/socksys.c
===
--- 2.6-mm.orig/arch/sparc64/solaris/socksys.c
+++ 2.6-mm/arch/sparc64/solaris/socksys.c
@@ -199,6 +199,5 @@ int __init init_socksys(void)
 
 void __exit cleanup_socksys(void)
 {
-   if (unregister_chrdev(30, "socksys"))
-   printk ("Couldn't unregister socksys character device\n");
+   unregister_chrdev(30, "socksys");
 }
Index: 2.6-mm/arch/cris/arch-v32/drivers/pcf8563.c
===

Re: [RFC] automatic CC generation for patch submission

2007-06-29 Thread Satyam Sharma


On 6/30/07, Kok, Auke <[EMAIL PROTECTED]> wrote:

[...]
an easier way to implement this is to add an extra field in the MAINTAINERS
file, something like below. All the contact info would stay the same, closely
where applicable and it would allow you to also specify specific files as well.


Hmm, associating MAINTAINERS and source files. I remember surfing LKML
archives one day and coming across this huge and ugly flamefest when some
guy called Eric S. Raymond (hehe :-) suggested something like this and got
burned.

Cheers,
S
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] automatic CC generation for patch submission

2007-06-29 Thread Dan Aloni

On Sat, Jun 30, 2007 at 05:01:44AM +0200, Oleg Verych wrote:
> * Date: Sat, 30 Jun 2007 05:34:51 +0300
> >
> > Hello,
> >
> > I'd like to present a suggestion for automatic generation of 
> > carbon copy fields in the E-Mails of posted patches.
> >
> > Basically, instead of manually figuring out who to add to CC
> > when sending a patch to LKML by looking at MAINTAINERS, a 
> > script can look at '.maintainers' files spread across the
> > source tree and automatically generate a proper list of CCs
> > for a patch.
> 
> LKML archive near you, search phrase "BTS", subjects that have
> something about "the quality of the kernel". Good luck. 

Right, however many patches don't map to bug reports and don't 
need the heavy use of BTS. This suggestion is mainly for the 
improvement of peer review concerning code changes submitted 
by people who are not the maintainers.

-- 
Dan Aloni
XIV LTD, http://www.xivstorage.com
da-x (at) monatomic.org, dan (at) xiv.co.il
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.22-rcX Transmeta/APM regression

2007-06-29 Thread linux

Okay, after a ridiculous amount of bisecting and recompiling and
rebooting...

First I had to find out that the kernel stops booting as of
bf50467204: "i386: Use per-cpu GDT immediately on boot"
(With theis commit, it silently stops booting.  The GP fault I
posted earlier comes a little later, but I didn't bother finding it.)

and starts again as of
b0b73cb41d: "i386: msr.h: be paranoid about types and parentheses"

However, one commit before the former suspends properly, and the latter
fails to suspend (exactly the same problem at get_fixed_ranges+0x9/0x60),
so I had to bisect further between the two, backporting the msr.h changes
across the msr-index.h splitoff.

Anyway, the patch which introduces the problem is the aptly named 3ebad:
3ebad59056: [PATCH] x86: Save and restore the fixed-range MTRRs of the BSP when 
suspending

2.6.22-rc6 plus that one commit reverted successfully does APM suspend
(and resume) for me.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: how about mutual compatibility between Linux's GPLv2 and GPLv3?

2007-06-29 Thread Alexandre Oliva

On Jun 28, 2007, "David Schwartz" <[EMAIL PROTECTED]> wrote:

> Alexandre Oliva write:

>> > The GPL does sometimes use the word "may" where it's not clear
>> > whether it
>> > means you have permission or you must be able to. The general rule of
>> > construction is that "may" means permission, unless there's some clear
>> > indication to the contrary. The "may"s in sections one and two are
>> > permisssion against a claim of copyright enfrocement. The "further
>> > restriction" clause is, at it states, only on the exercise of *rights*
>> > (which I think means those rights licensed to you under copyright law,
>> > namely the right of distribution and copying).

>> ... and modification and, depending on the jurisdiction, execution.

> Modification, yes. Execution, well, if the rules of a jurisdiction are
> insane, you will get insane results from almost any contract.

> Fortunately, the GPL makes it clear that execution is unrestricted for GPL'd
> works, but does not style this as a GPL right.

Possible consequence:
http://fsfla.org/svnwiki/blogs/lxo/2007-06-29-gplv3-tivo-and-linux.en

> One would hope that jurisidictions that have such strange rules
> would interpret the GPL to effect the same result under their laws
> as was intended under the laws the GPL was written with respect to,
> to the extent possible.

Hopefully.  But we know how justice is in the US, right? :-(

> Treating ordinary use as a copyright privilege leads to nonsensical results
> no matter what you do. For example, you get that I can drop copies of my
> poem from an airplane and then sue anyone who reads it.

Who was talking about reading?  You can read programs as much as you
can read poems.  But since you (normally) can't run poems, copyright
law doesn't talk about this, just like it doesn't distinguish source
from object code of a poem.  But software is different.  So different
that it's governed by a separate law in Brazil, which could be
qualified as a subclass of copyright law.  And this law states that
running programs requires permission from the copyright holder.

If you find that odd, you may have an idea of how ludicrous patents on
software, business methods et al are.  At least copyright regulation
of execution saves us from a few abusive EULAs, created with the
purpose of, let's see, regulating execution.  And then, since it's
already there, why not use it for other restrictions beneficial to the
vendor that a copyright license couldn't establish?

-- 
Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member http://www.fsfla.org/
Red Hat Compiler Engineer   [EMAIL PROTECTED], gcc.gnu.org}
Free Software Evangelist  [EMAIL PROTECTED], gnu.org}
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] automatic CC generation for patch submission

2007-06-29 Thread Kok, Auke


Dan Aloni wrote:

Hello,

I'd like to present a suggestion for automatic generation of 
carbon copy fields in the E-Mails of posted patches.


Basically, instead of manually figuring out who to add to CC
when sending a patch to LKML by looking at MAINTAINERS, a 
script can look at '.maintainers' files spread across the

source tree and automatically generate a proper list of CCs
for a patch.

To illustrate: If a patch affects a file under 
drivers/net/e1000, the CC script will look at these files


  drivers/net/e1000/.maintainers
  drivers/net/.maintainers
  drivers/.maintainers
  .maintainers

... to gather up the mailing list addresses or an individual 
maintainer inbox address.


A posssible format for this file could be a newline-separated
list of:

  [filename wildcard]:e-mail

For example, drivers/scsi/.maintainers would contain:

  libiscsi.*:[EMAIL PROTECTED]
  scsi_*.c:[EMAIL PROTECTED]

  etc...

Or, instead (or in addition) of having a '.maintainers' file 
each directory we can modify source files by adding parsable 
'/* MAINTAINER: [EMAIL PROTECTED] */' comments. 

Some extensions to the popular E-Mail clients might be needed 
here. Also, a bot reading LKML would automatically send links 
about posted patches to the other mailing lists whenever 
someone forgets to add a CC.


Any comments?


an easier way to implement this is to add an extra field in the MAINTAINERS 
file, something like below. All the contact info would stay the same, closely 
where applicable and it would allow you to also specify specific files as well.


Auke

(horribly whitespace-mutilated copy+paste from thunderbird below)

---
diff --git a/MAINTAINERS b/MAINTAINERS
index 4c3277c..e55be49 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -69,6 +69,7 @@ L: Mailing list that is relevant to this area
 W: Web-page with status/info
 T: SCM tree type and location.  Type is one of: git, hg, quilt.
 S: Status, one of the following:
+F: Directory tree or Files belonging to this subsystem

Supported:  Someone is actually paid to look after this.
Maintained: Someone actually looks after it.
@@ -1880,6 +1881,7 @@ M:[EMAIL PROTECTED]
 L: [EMAIL PROTECTED]
 W: http://sourceforge.net/projects/e1000/
 S: Supported
+F: drivers/net/e100.c

 INTEL PRO/1000 GIGABIT ETHERNET SUPPORT
 P: Jeb Cramer
@@ -1895,6 +1897,7 @@ M:[EMAIL PROTECTED]
 L: [EMAIL PROTECTED]
 W: http://sourceforge.net/projects/e1000/
 S: Supported
+F: drivers/net/e1000/

 INTEL PRO/10GbE SUPPORT
 P: Jeff Kirsher
@@ -1910,6 +1913,7 @@ M:[EMAIL PROTECTED]
 L: [EMAIL PROTECTED]
 W: http://sourceforge.net/projects/e1000/
 S: Supported
+F: drivers/net/ixgb/

 INTEL PRO/WIRELESS 2100 NETWORK CONNECTION SUPPORT
 P: Yi Zhu
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] automatic CC generation for patch submission

2007-06-29 Thread Oleg Verych

* Date: Sat, 30 Jun 2007 05:34:51 +0300
>
> Hello,
>
> I'd like to present a suggestion for automatic generation of 
> carbon copy fields in the E-Mails of posted patches.
>
> Basically, instead of manually figuring out who to add to CC
> when sending a patch to LKML by looking at MAINTAINERS, a 
> script can look at '.maintainers' files spread across the
> source tree and automatically generate a proper list of CCs
> for a patch.

LKML archive near you, search phrase "BTS", subjects that have
something about "the quality of the kernel". Good luck. 


Description in other words. Forget about scripts in source tree, tracking
and management system (as already described) is needed.

--
-o--=O`C
 #oo'L O
<___=E M

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.22-rc: regression: no irda0 interface (2.6.21 was OK), smsc does not find chip

2007-06-29 Thread Bjorn Helgaas

[patch] PNP SMCf010 quirk: work around Toshiba Portege 4000 ACPI issues

When we enable the SMCf010 IR device, the Toshiba Portege 4000 BIOS claims
the device is working, but it really isn't configured correctly.  The BIOS
*will* configure it, but only if we call _SRS after (1) reversing the order
of the SIR and FIR I/O port regions and (2) changing the IRQ from active-high
to active-low.

This patch fixes the 2.6.22 regression:
"no irda0 interface (2.6.21 was OK), smsc does not find chip"

I tested this on a Portege 4000.  The smsc-ircc2 driver correctly detects
the device, and "irattach irda0 -s && irdadump" shows transmitted and
received packets.

Signed-off-by: Bjorn Helgaas <[EMAIL PROTECTED]>

Index: w/drivers/pnp/quirks.c
===
--- w.orig/drivers/pnp/quirks.c 2007-06-27 20:07:45.0 -0600
+++ w/drivers/pnp/quirks.c  2007-06-29 19:28:02.0 -0600
@@ -136,11 +136,10 @@
 
 static void quirk_smc_enable(struct pnp_dev *dev)
 {
-   /*
-* If the BIOS left the device disabled, or it is enabled and
-* responding correctly, we're in good shape.
-*/
-   if (!dev->active || quirk_smc_fir_enabled(dev))
+   struct resource fir, sir, irq;
+
+   pnp_activate_dev(dev);
+   if (quirk_smc_fir_enabled(dev))
return;
 
/*
@@ -152,16 +151,58 @@
 * this.  Fortunately, they do fix things up if we auto-configure
 * the device using its _PRS and _SRS methods.
 */
-   dev_err(>dev, "%s device not responding, auto-configuring "
-   "resources\n", dev->id->id);
+   dev_err(>dev, "%s not responding at SIR 0x%llx, FIR 0x%llx; "
+   "auto-configuring\n", dev->id->id,
+   pnp_port_start(dev, 0), pnp_port_start(dev, 1));
 
pnp_disable_dev(dev);
pnp_init_resource_table(>res);
pnp_auto_config_dev(dev);
pnp_activate_dev(dev);
+   if (quirk_smc_fir_enabled(dev)) {
+   dev_err(>dev, "responds at SIR 0x%llx, FIR 0x%llx\n",
+   pnp_port_start(dev, 0), pnp_port_start(dev, 1));
+   return;
+   }
+
+   /*
+* The Toshiba Portege 4000 _CRS reports the FIR region first,
+* followed by the SIR region.  The BIOS will configure the bridge,
+* but only if we call _SRS with SIR first, then FIR.  It also
+* reports the IRQ as active high, when it is really active low.
+*/
+   dev_err(>dev, "not responding at SIR 0x%llx, FIR 0x%llx; "
+   "swapping SIR/FIR and reconfiguring\n",
+   pnp_port_start(dev, 0), pnp_port_start(dev, 1));
+
+   /*
+* Clear IORESOURCE_AUTO so pnp_activate_dev() doesn't reassign
+* these resources any more.
+*/
+   fir = dev->res.port_resource[0];
+   sir = dev->res.port_resource[1];
+   fir.flags &= ~IORESOURCE_AUTO;
+   sir.flags &= ~IORESOURCE_AUTO;
+
+   irq = dev->res.irq_resource[0];
+   irq.flags &= ~IORESOURCE_AUTO;
+   irq.flags &= ~IORESOURCE_BITS;
+   irq.flags |= IORESOURCE_IRQ_LOWEDGE;
+
+   pnp_disable_dev(dev);
+   dev->res.port_resource[0] = sir;
+   dev->res.port_resource[1] = fir;
+   dev->res.irq_resource[0] = irq;
+   pnp_activate_dev(dev);
+
+   if (quirk_smc_fir_enabled(dev)) {
+   dev_err(>dev, "responds at SIR 0x%llx, FIR 0x%llx\n",
+   pnp_port_start(dev, 0), pnp_port_start(dev, 1));
+   return;
+   }
 
-   if (!quirk_smc_fir_enabled(dev))
-   dev_err(>dev, "giving up; try \"smsc-ircc2.nopnp\"\n");
+   dev_err(>dev, "giving up; try \"smsc-ircc2.nopnp\" and "
+   "email [EMAIL PROTECTED]");
 }
 
 
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[RFC] automatic CC generation for patch submission

2007-06-29 Thread Dan Aloni

Hello,

I'd like to present a suggestion for automatic generation of 
carbon copy fields in the E-Mails of posted patches.

Basically, instead of manually figuring out who to add to CC
when sending a patch to LKML by looking at MAINTAINERS, a 
script can look at '.maintainers' files spread across the
source tree and automatically generate a proper list of CCs
for a patch.

To illustrate: If a patch affects a file under 
drivers/net/e1000, the CC script will look at these files

  drivers/net/e1000/.maintainers
  drivers/net/.maintainers
  drivers/.maintainers
  .maintainers

... to gather up the mailing list addresses or an individual 
maintainer inbox address.

A posssible format for this file could be a newline-separated
list of:

  [filename wildcard]:e-mail

For example, drivers/scsi/.maintainers would contain:

  libiscsi.*:[EMAIL PROTECTED]
  scsi_*.c:[EMAIL PROTECTED]

  etc...

Or, instead (or in addition) of having a '.maintainers' file 
each directory we can modify source files by adding parsable 
'/* MAINTAINER: [EMAIL PROTECTED] */' comments. 

Some extensions to the popular E-Mail clients might be needed 
here. Also, a bot reading LKML would automatically send links 
about posted patches to the other mailing lists whenever 
someone forgets to add a CC.

Any comments?

-- 
Dan Aloni
XIV LTD, http://www.xivstorage.com
da-x (at) monatomic.org, dan (at) xiv.co.il
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [patch 3/6] sys_indirect RFC - sys_indirect core

2007-06-29 Thread Ulrich Drepper


On 6/29/07, Davide Libenzi <[EMAIL PROTECTED]> wrote:

+int indirect_set_context(struct fsa_context *ator,
+const struct indirect_ctx __user * __user *ctxs,
+unsigned int nctxs, struct indirect_op **first)
+{
+   unsigned int i;
+   int error;
+   u32 ctx;
+   const struct indirect_ctx __user *pctx;
+   struct indirect_op *new;
+
+   *first = NULL;
+   for (i = 0; i < nctxs; i++) {
+   if (get_user(pctx, [i]) || get_user(ctx, >ctx))
+   return -EFAULT;
+   if (unlikely(ctx >= ARRAY_SIZE(inprocs) || !inprocs[ctx].set))
+   return -EINVAL;
+   error = (*inprocs[ctx].set)(ator, pctx, );
+   if (unlikely(error))
+   return error;
+   new->next = *first;
+   *first = new;
+   }


If you use one single struct as explained in my last mail all this
shouldn't be necessary.  The sys_indirect syscall would simply points
current->ind_ctx to a kernel-copy of the struct.  Then call the
syscall and on return clear current->ind_ctx.

In the affected syscalls we can then test whether current->ind_ctx is
NULL and if not, enable the extra functionality.

These callbacks etc seem to be far too expensive and complicated.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [patch 0/6] sys_indirect RFC - sys_indirect introduction

2007-06-29 Thread Ulrich Drepper


On 6/29/07, Davide Libenzi <[EMAIL PROTECTED]> wrote:

[include/linux/indirect.h]
#define SYSIND_CTX_OPENFLAGS0
struct sysind_ctx_OPENFLAGS {
__u32 ctx;
__u32 flags;


I agree that this interface is more than any other in danger of
needing an interface change.  But I think your solution is a bit too
expensive and complex.  You need two reads from userlevel.

The standard way to handle this is a versioned struct.  I.e., define a
struct for the current needs, define an initial version.  To use the
syscall pass the version number and the struct pointer to the syscall.
If the kernel doesn't know the version number it fails.  Otherwise it
might have to read old versions of the struct which is trivial to do.
E.g.:

#define SYSIND_VERSION 1
#define SYSIND_CTX_OPENFLAGS 0
#define SYSIND_CTX_SIGMASK 1
struct sysind_ctx {
 int ctx;
 union {
   int flags;
   kernel_sigset_t  sset;
 };
};

long sys_indirect(unsigned int nr,
  int ctx_version,
  const struct indirect_ctx *ctx,
  const unsigned long *params);


This reduces the number of accesses to userlevel data to one and still
has all the flexibility needed.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: how to determine if the noexec stack is defined by an application

2007-06-29 Thread Florin Andrei


Arjan van de Ven wrote:
But it's running a Web service which is a combination of C code and 
Tomcat/Java. I have no clue how to determine which portions specify a 
noexec stack and which don't.


like this:

$ eu-readelf -l /bin/true  | grep STACK
  GNU_STACK  0x00 0x 0x 0x00 0x00 RW 0x4


Is Sun Java 1.5 a known exception - as an application that doesn't set a 
noexec stack and reverts to default?


# eu-readelf -l ./java | grep STACK | wc -l
0

But then, this bug report seems to indicate otherwise, if I'm reading it 
correctly:


http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=5051381

--
Florin Andrei

http://florin.myip.org/
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Kernel doesn't recognize complete memory

2007-06-29 Thread Matti Aarnio

On Fri, Jun 29, 2007 at 10:25:26PM +0200, Frank Fiene wrote:
> Lenovo Z61p, Intel Core2 Duo T7200
> 
> I have 4GB RAM installed and BIOS recognize 4GB RAM.
> Linux kernel (Ubuntu-7.04, 32bit-PAE and 64bit, openSUSE-10.2 32bit-PAE 
> and 64bit) tells me: only 3GB of RAM are installed.

With AMD's Athlon64 the answer would be "look into BIOS settings,
(re)map 3-4 GB memory above 4 GB address mark"..

The PCI-bus and all IO devices in it needs memory space within
the first 4 GB of physical address space.  Some devices support
64-bit addresses in which case they can access whole physical
memory, most probably don't, meaning that they can access only
memory in lowest 4 GB part of the space.

Original 8088 PC had 20 bit address space, and "huge" megabyte
of address space. Then came 286 AT era with 24 bit addresses
and "whopping" 16 megabyte address space..  then came 386 and
finally true 32-bit addresses -- but ISA-space VGA aperture had
to be in there still..  and to gain access to of that memory
located "under" that VGA aperture, there was mapping...

Now the VGA aperture mapping is no more, but the PCI has similar
requirements - albeit with bigger apertures.

So, what does your BIOS have hidden in "advanced" menus (most likely) ?
Something about "remap (PCI?) memory above 4 GB" ?

> Any other user with a 4GB Thinkpad? tytso?
> 
> What can i do? Please help!
> 
> Regards
> Frank

Regards, Matti Aarnio
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Kernel doesn't recognize complete memory

2007-06-29 Thread Russell Harmon


http://kerneltrap.org/node/2450/7217 is likely your problem, although
why you can see 3gb instead of only 1gb of ram, idk... maybe something
ubuntu specific...

On 6/29/07, Frank Fiene <[EMAIL PROTECTED]> wrote:

Lenovo Z61p, Intel Core2 Duo T7200

I have 4GB RAM installed and BIOS recognize 4GB RAM.
Linux kernel (Ubuntu-7.04, 32bit-PAE and 64bit, openSUSE-10.2 32bit-PAE
and 64bit) tells me: only 3GB of RAM are installed.

Any other user with a 4GB Thinkpad? tytso?

What can i do? Please help!

Regards
Frank
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] move suspend includes into right place (was Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy))

2007-06-29 Thread Adrian Bunk

On Sat, Jun 30, 2007 at 12:44:22AM +0200, Pavel Machek wrote:
> Hi!
> 
> > By the way.
> > 
> > > diff --git a/kernel/power/power.h b/kernel/power/power.h
> > > index eb461b8..dc13af5 100644
> > > --- a/kernel/power/power.h
> > > +++ b/kernel/power/power.h
> > 
> > 
> > Don't these definitions need to be exported to userspace? That
> > definitely is not a header file for userspace.
> 
> Yes, they do. Does this look like a fix?
>   Pavel
> 
> --- 
> 
> Split userinterface part of power.h into separate file.
>...

You should also add it to include/linux/Kbuild.
 
cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [OT] Vim highlighting for trailing spaces

2007-06-29 Thread Kyle Moffett


On Jun 29, 2007, at 08:49:42, Dmitry Torokhov wrote:

On 6/29/07, Michael Tokarev <[EMAIL PROTECTED]> wrote:

highlight WhitespaceEOL ctermbg=red guibg=red
match WhitespaceEOL /\s\+$/

Works without any glitches here (not "laggy").  But I don't use  
syntax coloring - never tried if it works with coloring or not.




That only highlights whitespace at the end of the lines. You might  
want to use pattern below to also highlight "tab after space" in  
the middle of the line:


:highlight RedundantSpaces ctermbg=red guibg=red
:match RedundantSpaces /\s\+$\| \+\ze\t/


You missed the nice part about my vimrc patterns: :-D

Kyle Moffett wrote:
It always displays trailing whitespace and spaces-before tabs...  
except if your cursor is at the end of the whitespace.


They intentionally *don't* display whitespace at the end of the line  
to the left of your cursor.  I tried that one (that you quoted), but  
got annoyed by the fact that immediately after you typed any space or  
tab you had a little red blob to the left of the cursor.  So some of  
that "lagginess" is intentionaly (although not all of it, due to vim  
limitations).


Cheers,
Kyle Moffett
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [patch 0/4] MAP_NOZERO v2 - VM_NOZERO/MAP_NOZERO early summer madness

2007-06-29 Thread Kyle Moffett


On Jun 29, 2007, at 16:12:58, Davide Libenzi wrote:

On Fri, 29 Jun 2007, Andy Isaacson wrote:
I still think that using uid in mm_struct is wrong, and some kind  
of abstraction is required.  I called this "free pool" in  
<[EMAIL PROTECTED]>, but I think that name is  
misleading -- I am not proposing that this should be part of the  
management of free pages, but should be a label which abstracts  
"safe to share freed pages among" groups.  Then different SELinux  
protection domains would simply have different labels.


I think I answered this one at least a couple of times, but anyawy.  
First, that can be whatever cookie we choose. At the moment UID is  
used because it makes easier a fit into _mapcount. Second, SeLinux  
will be able to  disable the feature on a per-process base, or  
globally.


Anything else?


Well I would be very interested in actually being able to use this  
feature under SELinux, I think that just the underlying "can-I-use- 
this-page" logic needs modification.  Maybe "MAP_REUSABLE"?  That  
would both imply that we can accept reused (IE: nonzeroed) memory  
*AND* that the current page may be reused (IE: remapped without  
zeroing), although you could conceivably have one flag for each  
case.  The userspace allocator should be able to (when prompted by  
MAP_REUSABLE) look in a page "pool" of sorts before falling back to a  
zeroed page.  The pool would be created for a given "key" the first  
time it unmaps MAP_REUSABLE pages, possibly using the memory freed by  
said unmap.


The real trick is how to define the "key".  The default, without  
LSMs, should be something like the UID.  SELinux, on the other hand,  
would probably want to use some kind of hash of the label as the  
"key", (and store the label in each pool, as well).  That way SELinux  
could have a simple access-vector check for process:reusepage, as  
well as an access-vector check and type transition for  
"freereusablepage".  Then a policy could allow most user processes to  
unconditionally reuse pages (which would end up in the access-vector- 
cache and therefore be fast), while security-sensitive processes like  
ssh-agent could neither reuse pages nor have their pages reused, even  
if they request it.


Cheers,
Kyle Moffett

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Kernel doesn't recognize complete memory

2007-06-29 Thread Robert Hancock


Frank Fiene wrote:

Lenovo Z61p, Intel Core2 Duo T7200

I have 4GB RAM installed and BIOS recognize 4GB RAM.
Linux kernel (Ubuntu-7.04, 32bit-PAE and 64bit, openSUSE-10.2 32bit-PAE 
and 64bit) tells me: only 3GB of RAM are installed.


Any other user with a 4GB Thinkpad? tytso?

What can i do? Please help!

Regards
Frank


Please post your bootup dmesg output. If your chipset doesn't support 
memory remapping above 4GB or the BIOS doesn't enable it, you won't be 
able to use all 4GB of memory.


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] - x86_64-add-ioapic-nmi-support-fix-3

2007-06-29 Thread Randy Dunlap


[adding Andi Kleen]

John Keller wrote:

Place all the IOACPI NMI support code under CONFIG_ACPI
to clear up build errors with certain configs.

Signed-off-by: John Keller <[EMAIL PROTECTED]>
---


Is there some architectural reason that IO APIC NMI support should
require ACPI?

Is this a new requirement?  It seems like a step backwards to me.


Documentation/nmi_watchdog.txt doesn't say anything about ACPI being
needed.  It does say:

"For x86-64, the needed APIC is always compiled in, and the NMI watchdog is
always enabled with I/O-APIC mode (nmi_watchdog=1). Currently, local APIC
mode (nmi_watchdog=2) does not work on x86-64.

Using local APIC (nmi_watchdog=2) needs the first performance register, so
you can't use it for other purposes (such as high precision performance
profiling.) However, at least oprofile and the perfctr driver disable the
local APIC NMI watchdog automatically."




 arch/x86_64/kernel/io_apic.c |   77 +
 1 file changed, 40 insertions(+), 37 deletions(-)


Index: linux-2.6.22-rc6/arch/x86_64/kernel/io_apic.c
===
--- linux-2.6.22-rc6.orig/arch/x86_64/kernel/io_apic.c  2007-06-29 
08:56:46.0 -0500
+++ linux-2.6.22-rc6/arch/x86_64/kernel/io_apic.c   2007-06-29 
10:28:08.109040333 -0500
@@ -76,6 +76,10 @@ struct irq_cfg irq_cfg[NR_IRQS] __read_m
[15] = { .domain = CPU_MASK_ALL, .vector = IRQ15_VECTOR, },
 };
 
+#ifdef CONFIG_ACPI

+static void setup_ioapic_nmi_irq(int ioapic, int pin,
+struct IO_APIC_route_entry *entry);
+#endif
 static int assign_irq_vector(int irq, cpumask_t mask);
 
 #define __apicdebuginit  __init

@@ -1168,9 +1172,6 @@ void __apicdebuginit print_PIC(void)
 
 #endif  /*  0  */
 
-static void setup_ioapic_nmi_irq(int ioapic, int pin,

-struct IO_APIC_route_entry *entry);
-
 static void __init enable_IO_APIC(void)
 {
union IO_APIC_reg_01 reg_01;
@@ -1211,8 +1212,10 @@ static void __init enable_IO_APIC(void)
continue;
}
 
+#ifdef CONFIG_ACPI

if (entry.delivery_mode == dest_NMI)
setup_ioapic_nmi_irq(apic, pin, );
+#endif
}
}
 
@@ -1586,40 +1589,6 @@ static void setup_nmi (void)

printk(" done.\n");
 }
 
-#define disable_nmi_ioapic  mask_IO_APIC_irq

-#define enable_nmi_ioapic   unmask_IO_APIC_irq
-
-static struct irq_chip nmi_ioapic_chip __read_mostly = {
-   .name   = "IO-APIC NMI",
-   .enable = enable_nmi_ioapic,
-   .disable= disable_nmi_ioapic,
-   .mask   = mask_IO_APIC_irq,
-   .unmask = unmask_IO_APIC_irq,
-};
-
-void __init setup_ioapic_nmi_irq(int apic, int pin,
-struct IO_APIC_route_entry *entry)
-{
-   int irq;
-
-   entry->dest_mode = INT_DEST_MODE;
-   entry->dest = cpu_mask_to_apicid(TARGET_CPUS);
-   ioapic_write_entry(apic, pin, *entry);
-
-   irq = mp_apic_pin_to_gsi(apic, pin);
-
-   /* Setup pin_2_irq[irq] entry */
-   add_pin_to_irq(irq, apic, pin);
-
-   irq_desc[irq].status = IRQ_NOREQUEST | IRQ_NO_BALANCING;
-   if (!entry->mask) {
-   irq_desc[irq].status &= ~IRQ_DISABLED;
-   irq_desc[irq].depth = 0;
-   }
-
-   set_irq_chip(irq, _ioapic_chip);
-}
-
 /*
  * This looks a bit hackish but it's about the only one way of sending
  * a few INTA cycles to 8259As and any associated glue logic.  ICR does
@@ -2282,6 +2251,40 @@ void __init io_apic_set_nmi_src_irq(int 
 	ioapic_write_entry(ioapic, pin, entry);

 }
 
+#define disable_nmi_ioapic  mask_IO_APIC_irq

+#define enable_nmi_ioapic   unmask_IO_APIC_irq
+
+static struct irq_chip nmi_ioapic_chip __read_mostly = {
+   .name   = "IO-APIC NMI",
+   .enable = enable_nmi_ioapic,
+   .disable= disable_nmi_ioapic,
+   .mask   = mask_IO_APIC_irq,
+   .unmask = unmask_IO_APIC_irq,
+};
+
+void __init setup_ioapic_nmi_irq(int apic, int pin,
+struct IO_APIC_route_entry *entry)
+{
+   int irq;
+
+   entry->dest_mode = INT_DEST_MODE;
+   entry->dest = cpu_mask_to_apicid(TARGET_CPUS);
+   ioapic_write_entry(apic, pin, *entry);
+
+   irq = mp_apic_pin_to_gsi(apic, pin);
+
+   /* Setup pin_2_irq[irq] entry */
+   add_pin_to_irq(irq, apic, pin);
+
+   irq_desc[irq].status = IRQ_NOREQUEST | IRQ_NO_BALANCING;
+   if (!entry->mask) {
+   irq_desc[irq].status &= ~IRQ_DISABLED;
+   irq_desc[irq].depth = 0;
+   }
+
+   set_irq_chip(irq, _ioapic_chip);
+}
+
 #endif /* CONFIG_ACPI */
 
 



--
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a

Re: [PATCH] Containment measures for slab objects on scatter gather lists

2007-06-29 Thread Alan Cox

> DMA to or from memory should be done via the DMA mapping API.  If we're
> DMAing to/from a limited range within a page, either we should be using
> dma_map_single(), or dma_map_page() with an appropriate offset and size.

If those ranges overlap a cache line then the dma mapping API will not
save your backside.

On a system with a 32 byte cache granularity what happens if you get two
dma mapping calls for x and x+16. Right now the thing that avoids this
occurring is that the allocators don't pack stuff in that hard so x+16
always belongs to the same driver and we can hope driver authors are
sensible 

> sizes, but they do happen.  We handle this on ARM by writing back
> the overlapped lines and invalidating the rest before the DMA operation
> commences, and hope that the overlapped lines aren't touched for the
> duration of the DMA.)

The combination of "hope" and "DMA" isn't a good one for stable system
design. In this situation we should be waving large red flags
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: how to determine if the noexec stack is defined by an application

2007-06-29 Thread Arjan van de Ven

On Sat, 2007-06-30 at 00:41 +0200, Andreas Schwab wrote:
> Arjan van de Ven <[EMAIL PROTECTED]> writes:
> 
> > (all others default to executable stack)
> 
> Except ia64.


for ia64 it depends on the personality actually .. just to make it more
complex.
-- 
if you want to mail me at work (you don't), use arjan (at) linux.intel.com
Test the interaction between Linux and your BIOS via 
http://www.linuxfirmwarekit.org

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH] move suspend includes into right place (was Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy))

2007-06-29 Thread Pavel Machek

Hi!

> By the way.
> 
> > diff --git a/kernel/power/power.h b/kernel/power/power.h
> > index eb461b8..dc13af5 100644
> > --- a/kernel/power/power.h
> > +++ b/kernel/power/power.h
> 
> 
> Don't these definitions need to be exported to userspace? That
> definitely is not a header file for userspace.

Yes, they do. Does this look like a fix?
Pavel

--- 

Split userinterface part of power.h into separate file.

Signed-off-by: Pavel Machek <[EMAIL PROTECTED]>


diff --git a/include/linux/power.h b/include/linux/power.h
new file mode 100644
index 000..37bf890
--- /dev/null
+++ b/include/linux/power.h
@@ -0,0 +1,31 @@
+#ifndef INCLUDE_LINUX_POWER_H
+#define INCLUDE_LINUX_POWER_H
+
+/*
+ * This structure is used to pass the values needed for the identification
+ * of the resume swap area from a user space to the kernel via the
+ * SNAPSHOT_SET_SWAP_AREA ioctl
+ */
+struct resume_swap_area {
+   u_int64_t offset;
+   u_int32_t dev;
+} __attribute__((packed));
+
+#define SNAPSHOT_IOC_MAGIC '3'
+#define SNAPSHOT_FREEZE_IO(SNAPSHOT_IOC_MAGIC, 1)
+#define SNAPSHOT_UNFREEZE  _IO(SNAPSHOT_IOC_MAGIC, 2)
+#define SNAPSHOT_ATOMIC_SNAPSHOT   _IOW(SNAPSHOT_IOC_MAGIC, 3, u32) /* 
void * */
+#define SNAPSHOT_ATOMIC_RESTORE_IO(SNAPSHOT_IOC_MAGIC, 4)
+#define SNAPSHOT_FREE  _IO(SNAPSHOT_IOC_MAGIC, 5)
+#define SNAPSHOT_SET_IMAGE_SIZE_IOW(SNAPSHOT_IOC_MAGIC, 6, 
u32) /* unsigned long */
+#define SNAPSHOT_AVAIL_SWAP_IOR(SNAPSHOT_IOC_MAGIC, 7, u32) /* 
void * */
+#define SNAPSHOT_GET_SWAP_PAGE _IOR(SNAPSHOT_IOC_MAGIC, 8, u32) /* 
void * */
+#define SNAPSHOT_FREE_SWAP_PAGES   _IO(SNAPSHOT_IOC_MAGIC, 9)
+#define SNAPSHOT_SET_SWAP_FILE _IOW(SNAPSHOT_IOC_MAGIC, 10, u32) /* 
unsigned int */
+#define SNAPSHOT_S2RAM _IO(SNAPSHOT_IOC_MAGIC, 11)
+#define SNAPSHOT_PMOPS _IOW(SNAPSHOT_IOC_MAGIC, 12, u32) /* 
unsigned int */
+#define SNAPSHOT_SET_SWAP_AREA _IOW(SNAPSHOT_IOC_MAGIC, 13, \
+   struct resume_swap_area)
+#define SNAPSHOT_IOC_MAXNR 13
+
+#endif
diff --git a/kernel/power/power.h b/kernel/power/power.h
index 41d33eb..e68352b 100644
--- a/kernel/power/power.h
+++ b/kernel/power/power.h
@@ -1,5 +1,9 @@
+#ifndef KERNEL_POWER_POWER_H
+#define KERNEL_POWER_POWER_H
+
 #include 
 #include 
+#include 
 
 struct swsusp_info {
struct new_utsname  uts;
@@ -114,33 +118,6 @@ extern int snapshot_write_next(struct sn
 extern void snapshot_write_finalize(struct snapshot_handle *handle);
 extern int snapshot_image_loaded(struct snapshot_handle *handle);
 
-/*
- * This structure is used to pass the values needed for the identification
- * of the resume swap area from a user space to the kernel via the
- * SNAPSHOT_SET_SWAP_AREA ioctl
- */
-struct resume_swap_area {
-   u_int64_t offset;
-   u_int32_t dev;
-} __attribute__((packed));
-
-#define SNAPSHOT_IOC_MAGIC '3'
-#define SNAPSHOT_FREEZE_IO(SNAPSHOT_IOC_MAGIC, 1)
-#define SNAPSHOT_UNFREEZE  _IO(SNAPSHOT_IOC_MAGIC, 2)
-#define SNAPSHOT_ATOMIC_SNAPSHOT   _IOW(SNAPSHOT_IOC_MAGIC, 3, u32) /* 
void * */
-#define SNAPSHOT_ATOMIC_RESTORE_IO(SNAPSHOT_IOC_MAGIC, 4)
-#define SNAPSHOT_FREE  _IO(SNAPSHOT_IOC_MAGIC, 5)
-#define SNAPSHOT_SET_IMAGE_SIZE_IOW(SNAPSHOT_IOC_MAGIC, 6, 
u32) /* unsigned long */
-#define SNAPSHOT_AVAIL_SWAP_IOR(SNAPSHOT_IOC_MAGIC, 7, u32) /* 
void * */
-#define SNAPSHOT_GET_SWAP_PAGE _IOR(SNAPSHOT_IOC_MAGIC, 8, u32) /* 
void * */
-#define SNAPSHOT_FREE_SWAP_PAGES   _IO(SNAPSHOT_IOC_MAGIC, 9)
-#define SNAPSHOT_SET_SWAP_FILE _IOW(SNAPSHOT_IOC_MAGIC, 10, u32) /* 
unsigned int */
-#define SNAPSHOT_S2RAM _IO(SNAPSHOT_IOC_MAGIC, 11)
-#define SNAPSHOT_PMOPS _IOW(SNAPSHOT_IOC_MAGIC, 12, u32) /* 
unsigned int */
-#define SNAPSHOT_SET_SWAP_AREA _IOW(SNAPSHOT_IOC_MAGIC, 13, \
-   struct resume_swap_area)
-#define SNAPSHOT_IOC_MAXNR 13
-
 #define PMOPS_PREPARE  1
 #define PMOPS_ENTER2
 #define PMOPS_FINISH   3
@@ -165,3 +142,5 @@ extern int suspend_enter(suspend_state_t
 struct timeval;
 extern void swsusp_show_speed(struct timeval *, struct timeval *,
unsigned int, char *);
+
+#endif

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: how to determine if the noexec stack is defined by an application

2007-06-29 Thread Andreas Schwab

Arjan van de Ven <[EMAIL PROTECTED]> writes:

> (all others default to executable stack)

Except ia64.

Andreas.

-- 
Andreas Schwab, SuSE Labs, [EMAIL PROTECTED]
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] Optional Beeping During Resume From Suspend To Ram.

2007-06-29 Thread Pavel Machek

Hi!

> > >  ALIGN
> > >   .align  4096
> > > @@ -31,6 +46,11 @@ wakeup_code:
> > >   movw%cs, %ax
> > >   movw%ax, %ds# Make ds:0 
> > > point to wakeup_start
> > >   movw%ax, %ss
> > > +
> > > + testl   $1, beep_flags - wakeup_code
> > > + jz  1f
> > > + BEEP
> > > +1:
> > 
> > Can we rename/reuse existing flag variable?
> 
> Sorry, but I can't resist the opportunity to say "Send a patch!" :)
> 
> Seriously, though, I'd prefer not to. If we rename that acpi video flags 
> variable (I assume this is what you're thinking of), we only create cause for 
> confusion. A variable should for debugging or for controlling quirks, not for 
> both at the same time.

Cause for confusion? We are currently using 2 bits of that variable,
and we want to add one more bit. I seriously doubt that can confuse
anyone.
Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Please release a stable kernel Linux 3.0

2007-06-29 Thread Pavel Machek

Hi!

> > Now, perhaps redhat should get someone to work on suspend/hibernation
> > support (kernel level)? IIRC you had Nigel at one point, but he was
> > working on something else?
> > 
> > Rafael and me am trying to look after hibernation, but I believe noone
> > is really working on suspend :-(.
> 
> I've been trying to for some time, but I still need to learn more.  Also, the
> issues in there are difficult to debug.
> 
> Right now I'm collecting information and trying to help where I know what to
> do. :-)

Thanks for your work!

I'm basically trying to do that, too, but dedicated person working at
suspend would still be nice.
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] Containment measures for slab objects on scatter gather lists

2007-06-29 Thread Alan Cox

On Fri, 29 Jun 2007 13:45:29 -0700
Andrew Morton <[EMAIL PROTECTED]> wrote:

> On Fri, 29 Jun 2007 13:16:57 +0100
> Alan Cox <[EMAIL PROTECTED]> wrote:
> 
> > > If those operations involve modifying that slab page's pageframe then what
> > > stops concurrent dma'ers from stomping on each other's changes?  As in:
> > > why aren't we already buggy?
> > 
> > Or DMA operations falling out with CPU operations in the same memory
> > area. Not all platforms have hardware consistency and some will blat the
> > entire page out of cache.
> 
> Is that just a performance problem, or can data be lost here?  It depends
> on the meaning of "blat": writeback?  invalidate?  More details, please.

Invalidate. Sorry didn't realise it they hadn't discovered that word down
under.

If you've got something packing objects in tight we are going
to have fun with cache handling simply because the CPU cache granularity
may mean that the invalidate also invalidates a few bytes on (ie a 12
byte object will invalidate 16 bytes of memory) and you've just removed
any CPU held changes in the start of the next object.

Alan
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Dynamic ticks make system jerking

2007-06-29 Thread Uwe Kleine-König

Hello,

Uwe Kleine-König wrote:
> > it's nohz=off not no_hz=off
> Currently I'm not sure, which I was really using.  I think it was the
> right one, because dmesg changed.  But anyhow, I will retest if I made
> it wrong.
OK, I was really using no_hz, and with nohz I got approx the same result
as with CONFIG_NO_HZ=n.  Uups.

> > > First I wondered why set_next_event is called twice between two timer
> > > interrupts most of the time.  I found out that the timer is programed
> > > for the next tick in any case and if nothing needs the next tick, the
> > > interval is enlarged.  I didn't spend time yet to check if it is
> > > easier/faster to only program the timer once.
> > 
> > We optimize for the non-idle path, i.e. we keep the timer running as
> > long as we are not idle. Once we go idle we reprogram it.
> OK, sounds sane.
Probably you understand that better, but your argument only convinced me
for a short time.  Is it really better to program at least once and in
the idle case a 2nd time instead of only once every time.  Moreover
if you program the timer late you can notice if the time to tick is
already over because the timer irq handling took to long (or CONFIG_HZ
is too large).

> > I looked at the patch and as far as I can understand it, there is
> > nothing obviously wrong about it.
> What a pity.
I found the problem, it had only indirectly to do with nohz.  I didn't
acknowledge the serial interrupt but as the timer and the serial need
the same acknowledgement the serial irq got his ack always when the
timer triggerd.  Up to now that delay didn't stick out as the delay was
< 10ms.

> > One remark: why did you expand the clocksource to be 64 bit? The generic
> > code handles the 32 bit wrap already, so the expansion in your read
> > routine is adding overhead. The counter will wrap every 24 seconds, so
> > there is nothing to worry about.
> OK, I thought it to be better to have 64bit + some overhead.  I will
> change that.
done.  (But I reserve the right to evaluate the 64bit case and check how
worse the overhead is :-)

Best regards
Uwe

-- 
Uwe Kleine-König

http://www.google.com/search?q=sin%28pi%2F2%29
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: how to determine if the noexec stack is defined by an application

2007-06-29 Thread Arjan van de Ven

On Sat, 2007-06-30 at 00:15 +0200, Andreas Schwab wrote:
> Arjan van de Ven <[EMAIL PROTECTED]> writes:
> 
> > like this:
> >
> > $ eu-readelf -l /bin/true  | grep STACK
> >   GNU_STACK  0x00 0x 0x 0x00 0x00 RW 0x4
> >
> >
> > (replace /bin/true with the binary or library you want to check)
> >
> > if it says "RW" like here, it'll have non-executable stack. If it says
> > "RWX" or if this line is absent entirely, the stack will be executable.
> 
> The last part is not true.  Some architectures (especially newer ones)
> default to non-exec stack.  The absense of a GNU_STACK header represents
> the default.

ok you're right; powerpc64 defaults to non-executable stack
(all others default to executable stack)

-- 
if you want to mail me at work (you don't), use arjan (at) linux.intel.com
Test the interaction between Linux and your BIOS via 
http://www.linuxfirmwarekit.org

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: how to determine if the noexec stack is defined by an application

2007-06-29 Thread Andreas Schwab

Arjan van de Ven <[EMAIL PROTECTED]> writes:

> like this:
>
> $ eu-readelf -l /bin/true  | grep STACK
>   GNU_STACK  0x00 0x 0x 0x00 0x00 RW 0x4
>
>
> (replace /bin/true with the binary or library you want to check)
>
> if it says "RW" like here, it'll have non-executable stack. If it says
> "RWX" or if this line is absent entirely, the stack will be executable.

The last part is not true.  Some architectures (especially newer ones)
default to non-exec stack.  The absense of a GNU_STACK header represents
the default.

Andreas.

-- 
Andreas Schwab, SuSE Labs, [EMAIL PROTECTED]
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Is it time for remove (crap) ALSA from kernel tree ?

2007-06-29 Thread Rene Herman

Robert Hancock  shaw.ca> writes:

> In the case of S/PDIF output on ice1724 (and probably other cards), it 
> would be nice if ALSA defaulted to routing default audio to both the 
> S/PDIF and analog ports, as this is what most users would normally 
> expect.. The Windows drivers work like that, but on Linux you have to 
> pick one or the other (at least without a bunch of mucking with the 
> config file).

I believe some cards can't do this in fact which might be argument due to
consistency. Otherwise I don't so much have an opinion on whether or not this
should be default though...

Rene




-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH] - x86_64-add-ioapic-nmi-support-fix-3

2007-06-29 Thread John Keller

Place all the IOACPI NMI support code under CONFIG_ACPI
to clear up build errors with certain configs.

Signed-off-by: John Keller <[EMAIL PROTECTED]>
---

 arch/x86_64/kernel/io_apic.c |   77 +
 1 file changed, 40 insertions(+), 37 deletions(-)


Index: linux-2.6.22-rc6/arch/x86_64/kernel/io_apic.c
===
--- linux-2.6.22-rc6.orig/arch/x86_64/kernel/io_apic.c  2007-06-29 
08:56:46.0 -0500
+++ linux-2.6.22-rc6/arch/x86_64/kernel/io_apic.c   2007-06-29 
10:28:08.109040333 -0500
@@ -76,6 +76,10 @@ struct irq_cfg irq_cfg[NR_IRQS] __read_m
[15] = { .domain = CPU_MASK_ALL, .vector = IRQ15_VECTOR, },
 };
 
+#ifdef CONFIG_ACPI
+static void setup_ioapic_nmi_irq(int ioapic, int pin,
+struct IO_APIC_route_entry *entry);
+#endif
 static int assign_irq_vector(int irq, cpumask_t mask);
 
 #define __apicdebuginit  __init
@@ -1168,9 +1172,6 @@ void __apicdebuginit print_PIC(void)
 
 #endif  /*  0  */
 
-static void setup_ioapic_nmi_irq(int ioapic, int pin,
-struct IO_APIC_route_entry *entry);
-
 static void __init enable_IO_APIC(void)
 {
union IO_APIC_reg_01 reg_01;
@@ -1211,8 +1212,10 @@ static void __init enable_IO_APIC(void)
continue;
}
 
+#ifdef CONFIG_ACPI
if (entry.delivery_mode == dest_NMI)
setup_ioapic_nmi_irq(apic, pin, );
+#endif
}
}
 
@@ -1586,40 +1589,6 @@ static void setup_nmi (void)
printk(" done.\n");
 }
 
-#define disable_nmi_ioapic  mask_IO_APIC_irq
-#define enable_nmi_ioapic   unmask_IO_APIC_irq
-
-static struct irq_chip nmi_ioapic_chip __read_mostly = {
-   .name   = "IO-APIC NMI",
-   .enable = enable_nmi_ioapic,
-   .disable= disable_nmi_ioapic,
-   .mask   = mask_IO_APIC_irq,
-   .unmask = unmask_IO_APIC_irq,
-};
-
-void __init setup_ioapic_nmi_irq(int apic, int pin,
-struct IO_APIC_route_entry *entry)
-{
-   int irq;
-
-   entry->dest_mode = INT_DEST_MODE;
-   entry->dest = cpu_mask_to_apicid(TARGET_CPUS);
-   ioapic_write_entry(apic, pin, *entry);
-
-   irq = mp_apic_pin_to_gsi(apic, pin);
-
-   /* Setup pin_2_irq[irq] entry */
-   add_pin_to_irq(irq, apic, pin);
-
-   irq_desc[irq].status = IRQ_NOREQUEST | IRQ_NO_BALANCING;
-   if (!entry->mask) {
-   irq_desc[irq].status &= ~IRQ_DISABLED;
-   irq_desc[irq].depth = 0;
-   }
-
-   set_irq_chip(irq, _ioapic_chip);
-}
-
 /*
  * This looks a bit hackish but it's about the only one way of sending
  * a few INTA cycles to 8259As and any associated glue logic.  ICR does
@@ -2282,6 +2251,40 @@ void __init io_apic_set_nmi_src_irq(int 
ioapic_write_entry(ioapic, pin, entry);
 }
 
+#define disable_nmi_ioapic  mask_IO_APIC_irq
+#define enable_nmi_ioapic   unmask_IO_APIC_irq
+
+static struct irq_chip nmi_ioapic_chip __read_mostly = {
+   .name   = "IO-APIC NMI",
+   .enable = enable_nmi_ioapic,
+   .disable= disable_nmi_ioapic,
+   .mask   = mask_IO_APIC_irq,
+   .unmask = unmask_IO_APIC_irq,
+};
+
+void __init setup_ioapic_nmi_irq(int apic, int pin,
+struct IO_APIC_route_entry *entry)
+{
+   int irq;
+
+   entry->dest_mode = INT_DEST_MODE;
+   entry->dest = cpu_mask_to_apicid(TARGET_CPUS);
+   ioapic_write_entry(apic, pin, *entry);
+
+   irq = mp_apic_pin_to_gsi(apic, pin);
+
+   /* Setup pin_2_irq[irq] entry */
+   add_pin_to_irq(irq, apic, pin);
+
+   irq_desc[irq].status = IRQ_NOREQUEST | IRQ_NO_BALANCING;
+   if (!entry->mask) {
+   irq_desc[irq].status &= ~IRQ_DISABLED;
+   irq_desc[irq].depth = 0;
+   }
+
+   set_irq_chip(irq, _ioapic_chip);
+}
+
 #endif /* CONFIG_ACPI */
 
 
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH pata-2.6 fix] hpt366: use correct enablebits for HPT36x

2007-06-29 Thread Bartlomiej Zolnierkiewicz

On Friday 29 June 2007, Sergei Shtylyov wrote:
> The HPT36x chips finally turned out to have the channel enable bits -- 
> however,
> badly implemented.  Make use of them despite it's probably only going to 
> burden
> the driver's code -- assuming both channels are always enabled by the 
> HighPoint
> BIOS anyway...
> 
> Signed-off-by: Sergei Shtylyov <[EMAIL PROTECTED]>

applied
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: how to determine if the noexec stack is defined by an application

2007-06-29 Thread Arjan van de Ven


> But it's running a Web service which is a combination of C code and 
> Tomcat/Java. I have no clue how to determine which portions specify a 
> noexec stack and which don't.
> 
> In case it turns out some portions do not specify a noexec stack, my 
> next question is how to get the application to create a noexec stack 
> (assume I can make that request to the developers).


like this:

$ eu-readelf -l /bin/true  | grep STACK
  GNU_STACK  0x00 0x 0x 0x00 0x00 RW 0x4


(replace /bin/true with the binary or library you want to check)

if it says "RW" like here, it'll have non-executable stack. If it says
"RWX" or if this line is absent entirely, the stack will be executable.



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Moving MD/LVM from PPC to x86

2007-06-29 Thread Alasdair G Kergon

On Thu, Jun 28, 2007 at 11:02:39PM +0200, Turbo Fredriksson wrote:
>   2. How do I move a VG/PV/LV from PPC to x86?

The on-disk LVM2 metadata should be accessible from both
architectures.

Alasdair
-- 
[EMAIL PROTECTED]
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[patch 6/6] sys_indirect RFC - example usage from kernel POV

2007-06-29 Thread Davide Libenzi

This is an example of how to add a context set/unset wrapper inside
the kernel, for something like open flags:

1) Alloc a new SYSIND_CTX_* sequential index inside linux/indirect.h

2) Define a new "struct sysind_ctx_*" to be used by userspace

3) Define proper set/unset functions inside kernel/sys_indirect.c



Signed-off-by: Davide Libenzi <[EMAIL PROTECTED]>


- Davide



---
 include/linux/indirect.h |7 +++
 kernel/sys_indirect.c|   32 +++-
 2 files changed, 38 insertions(+), 1 deletion(-)

Index: linux-2.6.mod/include/linux/indirect.h
===
--- linux-2.6.mod.orig/include/linux/indirect.h 2007-06-29 12:58:11.0 
-0700
+++ linux-2.6.mod/include/linux/indirect.h  2007-06-29 12:58:11.0 
-0700
@@ -8,10 +8,17 @@
 #ifndef _LINUX_INDIRECT_H
 #define _LINUX_INDIRECT_H
 
+#define SYSIND_CTX_OPENFLAGS   0
+
 struct indirect_ctx {
__u32 ctx;
 };
 
+struct sysind_ctx_OPENFLAGS {
+   __u32 ctx;
+   __u32 flags;
+};
+
 #ifdef __KERNEL__
 
 #include 
Index: linux-2.6.mod/kernel/sys_indirect.c
===
--- linux-2.6.mod.orig/kernel/sys_indirect.c2007-06-29 12:58:11.0 
-0700
+++ linux-2.6.mod/kernel/sys_indirect.c 2007-06-29 12:58:11.0 -0700
@@ -24,9 +24,39 @@
void (*unset)(struct indirect_op *);
 };
 
+struct indirect_op_OPENFLAGS {
+   struct indirect_op op;
+   unsigned long flags;
+};
+
+static int set_OPENFLAGS(struct fsa_context *ator,
+const struct indirect_ctx __user *uctx,
+struct indirect_op **pnew)
+{
+   struct indirect_op_OPENFLAGS *iop;
+   struct sysind_ctx_OPENFLAGS kctx;
+
+   if (copy_from_user(, uctx, sizeof(*uctx)))
+   return -EFAULT;
+   iop = fsa_alloc(ator, sizeof(struct indirect_op_OPENFLAGS));
+   if (unlikely(!iop))
+   return -ENOMEM;
+   iop->op.ctx = kctx.ctx;
+   /* iop->flags = current->def_open_flags; */
+   /* current->def_open_flags = kctx.flags; */
+   *pnew = (struct indirect_op *) iop;
+
+   return 0;
+}
+
+static void unset_OPENFLAGS(struct indirect_op *iop)
+{
+   /* current->def_open_flags = ((struct indirect_op_OPENFLAGS *) 
iop)->flags; */
+}
+
 static const struct indirect_procs inprocs[] =
 {
-   { NULL, NULL },
+   [SYSIND_CTX_OPENFLAGS] ={ set_OPENFLAGS, unset_OPENFLAGS },
 };
 
 /**

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[patch 5/6] sys_indirect RFC - wire x86 sys_indirect

2007-06-29 Thread Davide Libenzi

Wires sys_indirect() to the x86 family arch.



Signed-off-by: Davide Libenzi <[EMAIL PROTECTED]>


- Davide


---
 arch/i386/kernel/syscall_table.S |1 +
 arch/x86_64/ia32/ia32entry.S |1 +
 include/asm-i386/unistd.h|4 +++-
 include/asm-x86_64/unistd.h  |3 +++
 4 files changed, 8 insertions(+), 1 deletion(-)

Index: linux-2.6.mod/include/asm-i386/unistd.h
===
--- linux-2.6.mod.orig/include/asm-i386/unistd.h2007-06-29 
12:12:48.0 -0700
+++ linux-2.6.mod/include/asm-i386/unistd.h 2007-06-29 12:14:01.0 
-0700
@@ -329,10 +329,11 @@
 #define __NR_signalfd  321
 #define __NR_timerfd   322
 #define __NR_eventfd   323
+#define __NR_indirect  324
 
 #ifdef __KERNEL__
 
-#define NR_syscalls 324
+#define NR_syscalls 325
 
 #define __ARCH_WANT_IPC_PARSE_VERSION
 #define __ARCH_WANT_OLD_READDIR
@@ -379,6 +380,7 @@
case __NR_rt_sigsuspend:
case __NR_sigaltstack:
case __NR_iopl:
+   case __NR_indirect:
return 0;
}
return 1;
Index: linux-2.6.mod/arch/i386/kernel/syscall_table.S
===
--- linux-2.6.mod.orig/arch/i386/kernel/syscall_table.S 2007-06-29 
12:12:41.0 -0700
+++ linux-2.6.mod/arch/i386/kernel/syscall_table.S  2007-06-29 
12:14:01.0 -0700
@@ -323,3 +323,4 @@
.long sys_signalfd
.long sys_timerfd
.long sys_eventfd
+   .long sys_indirect
Index: linux-2.6.mod/include/asm-x86_64/unistd.h
===
--- linux-2.6.mod.orig/include/asm-x86_64/unistd.h  2007-06-29 
12:13:34.0 -0700
+++ linux-2.6.mod/include/asm-x86_64/unistd.h   2007-06-29 12:14:01.0 
-0700
@@ -630,6 +630,8 @@
 __SYSCALL(__NR_timerfd, sys_timerfd)
 #define __NR_eventfd   284
 __SYSCALL(__NR_eventfd, sys_eventfd)
+#define __NR_indirect  284
+__SYSCALL(__NR_indirect, sys_indirect)
 
 #ifndef __NO_STUBS
 #define __ARCH_WANT_OLD_READDIR
@@ -681,6 +683,7 @@
case __NR_rt_sigsuspend:
case __NR_sigaltstack:
case __NR_iopl:
+   case __NR_indirect:
return 0;
}
return 1;
Index: linux-2.6.mod/arch/x86_64/ia32/ia32entry.S
===
--- linux-2.6.mod.orig/arch/x86_64/ia32/ia32entry.S 2007-06-29 
12:13:34.0 -0700
+++ linux-2.6.mod/arch/x86_64/ia32/ia32entry.S  2007-06-29 12:14:01.0 
-0700
@@ -737,4 +737,5 @@
.quad compat_sys_signalfd
.quad compat_sys_timerfd
.quad sys_eventfd
+   .quad compat_sys_indirect
 ia32_syscall_end:

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.22-rc6-mm1 Intel DMAR crash on AMD x86_64

2007-06-29 Thread Rafael J. Wysocki

On Friday, 29 June 2007 17:28, Keshavamurthy, Anil S wrote:
> On Thu, Jun 28, 2007 at 06:14:27PM -0700, Li, Shaohua wrote:
> > 
> > 
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.22-
> > >rc6/2.6.22-rc6-mm1/
> > >>
> > >>> +intel-iommu-dmar-detection-and-parsing-logic.patch
> [..]
> > >
> > >I took a picture of it, looks like the backtrace is:
> > >
> > >NULL pointer dereference at 024
> > >EIP:dmar_table_init+0x11
> > >intel_iommu_init+0x30
> > >pci_iommu_init+0xe
> > >kernel_init+0x16e
> > >
> > >Presumably something is NULL in dmar_table_init that wasn't expected to
> > >be.. I would guess it likely crashes on any system without an Intel
> > >IOMMU in it.
> Yup, that is correct.
> 
> > How about something like below?
> > 
> > 
> > int __init dmar_table_init(void)
> > {
> > +   if (!dmar_tbl)
> > +   return -ENODEV;
> > parse_dmar_table();
> why not check for NULL in the function where it touched?
> Also when there are no DMAR devices we need the below
> printk on the console.
> 
> > if (list_empty(_drhd_units)) {
> > printk(KERN_ERR PREFIX "No DMAR devices found\n");
> > return -ENODEV;
> > }
> > return 0;
> > }
> 
> Here is the revised patch of the above.
> Andrew, please add this fix to
> +intel-iommu-dmar-detection-and-parsing-logic.patch
> 
> 
> Check for dmar_tbl pointer as this can be NULL on 
> systems with no Intel VT-d support.
> 
> Signed-off-by: Anil S Keshavamurthy <[EMAIL PROTECTED]>

For the record, this patch fixes the boot crash on my AMD64-based test box.

Thanks,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[patch 4/6] sys_indirect RFC - compat code for sys_indirect and compat_call_syscall for x86-64

2007-06-29 Thread Davide Libenzi

This is the compat code necessary for sys_indirect(). Since the data
structure passed down to sys_indirect() is compat-free, this is the
only processing needed.
A required compat_call_syscall() has been added to the x86-64 arch.



Signed-off-by: Davide Libenzi <[EMAIL PROTECTED]>


- Davide



---
 arch/x86_64/ia32/ia32entry.S |   18 +
 include/asm-x86_64/unistd.h  |1 
 kernel/compat.c  |   44 +++
 3 files changed, 63 insertions(+)

Index: linux-2.6.mod/kernel/compat.c
===
--- linux-2.6.mod.orig/kernel/compat.c  2007-06-29 12:12:41.0 -0700
+++ linux-2.6.mod/kernel/compat.c   2007-06-29 12:13:46.0 -0700
@@ -23,6 +23,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 #include 
 
@@ -1082,3 +1084,45 @@
return 0;
 }
 
+asmlinkage long compat_sys_indirect(unsigned int nr,
+   const __u32 __user *ctxs, unsigned int 
nctxs,
+   const __u32 __user *params)
+{
+   unsigned int i;
+   long res;
+   u32 tmp;
+   struct indirect_op *iops;
+   const struct indirect_ctx __user * __user *uctxs;
+   unsigned long kparams[6];
+   struct fsa_context ator;
+   char ator_cache[128];
+
+   if (!indirect_call_ok(nr) || nctxs >= INT_MAX / sizeof(__u32))
+   return -EINVAL;
+   if (!access_ok(VERIFY_READ, params, 6 * sizeof(u32)) ||
+   !access_ok(VERIFY_READ, ctxs, nctxs * sizeof(u32)))
+   return -EFAULT;
+   uctxs = compat_alloc_user_space(nctxs * sizeof(void *));
+   for (i = 0, res = 0; !res && i < nctxs; i++) {
+   res |= __get_user(tmp, [i]);
+   res |= __put_user((unsigned long) tmp, [i]);
+   }
+   if (res)
+   return -EFAULT;
+   for (i = 0, res = 0; i < 6; i++) {
+   res |= __get_user(tmp, [i]);
+   kparams[i] = tmp;
+   }
+   if (res)
+   return -EFAULT;
+   fsa_init(, ator_cache, sizeof(ator_cache));
+   res = indirect_set_context(, uctxs, nctxs, );
+   if (likely(res == 0)) {
+   res = compat_call_syscall(nr, kparams);
+   indirect_unset_context(iops);
+   }
+   fsa_cleanup();
+
+   return res;
+}
+
Index: linux-2.6.mod/include/asm-x86_64/unistd.h
===
--- linux-2.6.mod.orig/include/asm-x86_64/unistd.h  2007-06-29 
12:12:48.0 -0700
+++ linux-2.6.mod/include/asm-x86_64/unistd.h   2007-06-29 12:57:58.0 
-0700
@@ -670,6 +670,7 @@
struct sigaction __user *oact,
size_t sigsetsize);
 extern long call_syscall(unsigned int nr, const unsigned long *params);
+extern long compat_call_syscall(unsigned int nr, const unsigned long *params);
 
 static inline int indirect_call_ok(unsigned int nr)
 {
Index: linux-2.6.mod/arch/x86_64/ia32/ia32entry.S
===
--- linux-2.6.mod.orig/arch/x86_64/ia32/ia32entry.S 2007-06-29 
12:12:41.0 -0700
+++ linux-2.6.mod/arch/x86_64/ia32/ia32entry.S  2007-06-29 12:57:58.0 
-0700
@@ -392,6 +392,24 @@
CFI_ENDPROC
 END(ia32_ptregs_common)
 
+ENTRY(compat_call_syscall)
+   movq$-ENOSYS, %rax
+   cmpl$(IA32_NR_syscalls-1), %edi
+   ja  bad_sysc
+   mov %edi, %eax
+   movq%rsi, %r11
+   movq(%r11), %rdi
+   movq8(%r11), %rsi
+   movq16(%r11), %rdx
+   movq24(%r11), %rcx
+   movq32(%r11), %r8
+   movq40(%r11), %r9
+   call*ia32_sys_call_table(,%rax,8)
+bad_sysc:
+   ret
+ENDPROC(compat_call_syscall)
+
+
.section .rodata,"a"
.align 8
 ia32_sys_call_table:

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[patch 3/6] sys_indirect RFC - sys_indirect core

2007-06-29 Thread Davide Libenzi

This is the core skeleton for the new sys_indirect() system call.



Signed-off-by: Davide Libenzi <[EMAIL PROTECTED]>


- Davide



---
 include/linux/indirect.h |   32 +
 include/linux/syscalls.h |5 ++
 kernel/Makefile  |2 
 kernel/sys_indirect.c|  109 +++
 4 files changed, 147 insertions(+), 1 deletion(-)

Index: linux-2.6.mod/include/linux/syscalls.h
===
--- linux-2.6.mod.orig/include/linux/syscalls.h 2007-06-29 12:12:41.0 
-0700
+++ linux-2.6.mod/include/linux/syscalls.h  2007-06-29 12:12:51.0 
-0700
@@ -65,6 +65,7 @@
 #include 
 #include 
 #include 
+#include 
 
 asmlinkage long sys_time(time_t __user *tloc);
 asmlinkage long sys_stime(time_t __user *tptr);
@@ -608,6 +609,10 @@
 asmlinkage long sys_timerfd(int ufd, int clockid, int flags,
const struct itimerspec __user *utmr);
 asmlinkage long sys_eventfd(unsigned int count);
+asmlinkage long sys_indirect(unsigned int nr,
+const struct indirect_ctx __user * __user *ctxs,
+unsigned int nctxs,
+const unsigned long __user *params);
 
 int kernel_execve(const char *filename, char *const argv[], char *const 
envp[]);
 
Index: linux-2.6.mod/kernel/Makefile
===
--- linux-2.6.mod.orig/kernel/Makefile  2007-06-29 12:12:41.0 -0700
+++ linux-2.6.mod/kernel/Makefile   2007-06-29 12:12:51.0 -0700
@@ -5,7 +5,7 @@
 obj-y = sched.o fork.o exec_domain.o panic.o printk.o profile.o \
exit.o itimer.o time.o softirq.o resource.o \
sysctl.o capability.o ptrace.o timer.o user.o \
-   signal.o sys.o kmod.o workqueue.o pid.o \
+   signal.o sys.o sys_indirect.o kmod.o workqueue.o pid.o \
rcupdate.o extable.o params.o posix-timers.o \
kthread.o wait.o kfifo.o sys_ni.o posix-cpu-timers.o mutex.o \
hrtimer.o rwsem.o latency.o nsproxy.o srcu.o die_notifier.o
Index: linux-2.6.mod/kernel/sys_indirect.c
===
--- /dev/null   1970-01-01 00:00:00.0 +
+++ linux-2.6.mod/kernel/sys_indirect.c 2007-06-29 12:57:56.0 -0700
@@ -0,0 +1,109 @@
+/*
+ *  kernel/sys_indirect.c
+ *
+ *  Copyright (C) 2007  Davide Libenzi <[EMAIL PROTECTED]>
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+
+struct indirect_procs {
+   int (*set)(struct fsa_context *, const struct indirect_ctx __user *,
+  struct indirect_op **);
+   void (*unset)(struct indirect_op *);
+};
+
+static const struct indirect_procs inprocs[] =
+{
+   { NULL, NULL },
+};
+
+/**
+ * indirect_set_context - Walks through the user-specified context-set 
operations
+ *and sets the new task context according to it
+ *
+ * @ator:   [in]  Pointer the the allocator to be used to allocate context
+ *operation nodes
+ * @ctxs:   [in]  Pointer to context data to be set before the syscall
+ * @nctxs:  [in]  Number of valid contexts in @ictxs and @ctxs
+ * @first:  [out] Pointer to the head of the operation chain
+ *
+ * Returns zero in case of success, or a negative error code in case of error.
+ */
+int indirect_set_context(struct fsa_context *ator,
+const struct indirect_ctx __user * __user *ctxs,
+unsigned int nctxs, struct indirect_op **first)
+{
+   unsigned int i;
+   int error;
+   u32 ctx;
+   const struct indirect_ctx __user *pctx;
+   struct indirect_op *new;
+
+   *first = NULL;
+   for (i = 0; i < nctxs; i++) {
+   if (get_user(pctx, [i]) || get_user(ctx, >ctx))
+   return -EFAULT;
+   if (unlikely(ctx >= ARRAY_SIZE(inprocs) || !inprocs[ctx].set))
+   return -EINVAL;
+   error = (*inprocs[ctx].set)(ator, pctx, );
+   if (unlikely(error))
+   return error;
+   new->next = *first;
+   *first = new;
+   }
+
+   return 0;
+}
+
+/**
+ * indirect_unset_context - Undo the chain of task context set operations
+ *  done by a previous call to indirect_set_context()
+ *
+ * @curr:  [in] Pointer to the head of the operations chain
+ *
+ */
+void indirect_unset_context(struct indirect_op *curr)
+{
+   for (; curr; curr = curr->next)
+   if (likely(inprocs[curr->ctx].unset))
+   (*inprocs[curr->ctx].unset)(curr);
+}
+
+asmlinkage long sys_indirect(unsigned int nr,
+const struct indirect_ctx __user * __user *ctxs,
+unsigned int nctxs,
+

[patch 2/6] sys_indirect RFC - add call_syscall helper to the x86 archs

2007-06-29 Thread Davide Libenzi

This patch introduces a new call_syscall() helper for the x86 family
archs, to be used by sys_indirect().


Signed-off-by: Davide Libenzi <[EMAIL PROTECTED]>


- Davide


---
 arch/i386/kernel/entry.S|   18 ++
 arch/x86_64/kernel/entry.S  |   18 ++
 include/asm-i386/unistd.h   |   16 
 include/asm-x86_64/unistd.h |   15 +++
 4 files changed, 67 insertions(+)

Index: linux-2.6.mod/include/asm-x86_64/unistd.h
===
--- linux-2.6.mod.orig/include/asm-x86_64/unistd.h  2007-06-29 
12:12:42.0 -0700
+++ linux-2.6.mod/include/asm-x86_64/unistd.h   2007-06-29 12:57:59.0 
-0700
@@ -669,6 +669,21 @@
const struct sigaction __user *act,
struct sigaction __user *oact,
size_t sigsetsize);
+extern long call_syscall(unsigned int nr, const unsigned long *params);
+
+static inline int indirect_call_ok(unsigned int nr)
+{
+   switch (nr) {
+   case __NR_clone:
+   case __NR_fork:
+   case __NR_vfork:
+   case __NR_rt_sigsuspend:
+   case __NR_sigaltstack:
+   case __NR_iopl:
+   return 0;
+   }
+   return 1;
+}
 
 #endif  /* __ASSEMBLY__ */
 #endif /* __KERNEL__ */
Index: linux-2.6.mod/arch/i386/kernel/entry.S
===
--- linux-2.6.mod.orig/arch/i386/kernel/entry.S 2007-06-29 12:12:42.0 
-0700
+++ linux-2.6.mod/arch/i386/kernel/entry.S  2007-06-29 12:12:48.0 
-0700
@@ -1023,6 +1023,24 @@
CFI_ENDPROC
 ENDPROC(kernel_thread_helper)
 
+ENTRY(call_syscall)
+   movl$-ENOSYS, %eax
+   movl4(%esp), %edx
+   cmpl$(nr_syscalls), %edx
+   jae bad_sysc
+   movl8(%esp), %eax
+   pushl   20(%eax)
+   pushl   16(%eax)
+   pushl   12(%eax)
+   pushl   8(%eax)
+   pushl   4(%eax)
+   pushl   (%eax)
+   call*sys_call_table(,%edx,4)
+   addl$24, %esp
+bad_sysc:
+   ret
+ENDPROC(call_syscall)
+
 .section .rodata,"a"
 #include "syscall_table.S"
 
Index: linux-2.6.mod/include/asm-i386/unistd.h
===
--- linux-2.6.mod.orig/include/asm-i386/unistd.h2007-06-29 
12:12:42.0 -0700
+++ linux-2.6.mod/include/asm-i386/unistd.h 2007-06-29 12:57:58.0 
-0700
@@ -368,5 +368,21 @@
 #define cond_syscall(x) asm(".weak\t" #x "\n\t.set\t" #x ",sys_ni_syscall")
 #endif
 
+extern long call_syscall(unsigned int nr, const unsigned long *params);
+
+static inline int indirect_call_ok(unsigned int nr)
+{
+   switch (nr) {
+   case __NR_clone:
+   case __NR_fork:
+   case __NR_vfork:
+   case __NR_rt_sigsuspend:
+   case __NR_sigaltstack:
+   case __NR_iopl:
+   return 0;
+   }
+   return 1;
+}
+
 #endif /* __KERNEL__ */
 #endif /* _ASM_I386_UNISTD_H_ */
Index: linux-2.6.mod/arch/x86_64/kernel/entry.S
===
--- linux-2.6.mod.orig/arch/x86_64/kernel/entry.S   2007-06-29 
12:12:42.0 -0700
+++ linux-2.6.mod/arch/x86_64/kernel/entry.S2007-06-29 12:12:48.0 
-0700
@@ -1170,3 +1170,21 @@
sysret
CFI_ENDPROC
 ENDPROC(ignore_sysret)
+
+ENTRY(call_syscall)
+   movq$-ENOSYS, %rax
+   cmpl$__NR_syscall_max, %edi
+   ja  bad_sysc
+   mov %edi, %eax
+   movq%rsi, %r11
+   movq(%r11), %rdi
+   movq8(%r11), %rsi
+   movq16(%r11), %rdx
+   movq24(%r11), %rcx
+   movq32(%r11), %r8
+   movq40(%r11), %r9
+   call*sys_call_table(,%rax,8)
+bad_sysc:
+   ret
+ENDPROC(call_syscall)
+

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[patch 0/6] sys_indirect RFC - sys_indirect introduction

2007-06-29 Thread Davide Libenzi

This patch-set implements the skeleton for a new sys_indirect() system call.
The reason for such system call would be to avoid the proliferation of new
system calls, introduced to only wrap old ones by setting/unsetting a given
context before/after the real system call.
The internal operation of sys_indirect() would be (pseudo-code):

for_each_ctx(wrap_ctx, _ctxs) {
save_prev_ctx(_chain, tsk->xxx);
tsk->xxx = wrap_ctx;
}
err = call_system_call(nr, params);
for_each_saved_ctx(ctx, _chain) {
restore_ctx(tsk->xxx, ctx);
}
return err;

To made sys_indirect() decently flexible, and to avoid a sys_indirect2()
anytime soon, a simple "flags" parameter is clearly not sufficent to be
passed down with it.
To be flexible we need to allow 1) more than a simple "flags" to be passed
down 2) more than one context possibly to be set before the real system
call.  Also, the data structures passed down as new-context-to-be-set should
be compat-free, in order to simplify the compat code.
The prototype for sys_indirect() is:

struct indirect_ctx {
__u32 ctx;
};

long sys_indirect(unsigned int nr,
const struct indirect_ctx **ctxs,
unsigned int nctxs,
const unsigned long *params);

The "struct indirect_ctx" is a stub structure that is supposed to be the
base for all the context set/unset operations.
An example usage for userspace POV (referring to the example in the next
patches) is:

[include/linux/indirect.h]
#define SYSIND_CTX_OPENFLAGS0
struct sysind_ctx_OPENFLAGS {
__u32 ctx;
__u32 flags;
};

[userspace]
struct sysind_ctx_OPENFLAGS octx;
struct indirect_ctx *ctxs[1];
unsigned long params[6];

octx.ctx = SYSIND_CTX_OPENFLAGS;
octx.flags = O_CLOEXEC;
ctxs[0] = (struct indirect_ctx *) 
params[0] = domain;
params[1] = type;
params[2] = proto;
res = indirect(__NR_socket, ctxs, 1, params);


The sys_indirect() system call requires a small arch-dependent asm function
call_syscall() in order to call a system call by passing call number and
parameters (compat_call_syscall() is also needed for archs having 32 bit 
compat).
I currently implemented them for i386 and x86-64.
The following patches builds but are totally untested at the moment.
Comments?




- Davide


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[patch 1/6] sys_indirect RFC - fast sequential allocator

2007-06-29 Thread Davide Libenzi

This file implements a fast, sequential allocator. Chunks of memory
are allocated in blocks of pages, and inside these blocks memory is
allocated is a sequential fashion. All the allocated memory is released
in one single sweep by freeing the backing pages. Indeed, there is not
an fsa_free() function to deallocate a single block. The user can provide
an initial allocation backing store, if needed, to avoid the alocator to
call page allocation functions for the first "in cache" allocations.
The FSA allocator provides no locking, so it is up to the caller serialize
fsa_alloc() calls if ever needed.



Signed-off-by: Davide Libenzi <[EMAIL PROTECTED]>


- Davide


---
 include/linux/fsalloc.h |   34 
 lib/Makefile|2 
 lib/fsalloc.c   |  102 
 3 files changed, 137 insertions(+), 1 deletion(-)

Index: linux-2.6.mod/lib/Makefile
===
--- linux-2.6.mod.orig/lib/Makefile 2007-06-29 12:12:42.0 -0700
+++ linux-2.6.mod/lib/Makefile  2007-06-29 12:12:45.0 -0700
@@ -5,7 +5,7 @@
 lib-y := ctype.o string.o vsprintf.o cmdline.o \
 rbtree.o radix-tree.o dump_stack.o \
 idr.o int_sqrt.o bitmap.o extable.o prio_tree.o \
-sha1.o irq_regs.o reciprocal_div.o
+sha1.o irq_regs.o reciprocal_div.o fsalloc.o
 
 lib-$(CONFIG_MMU) += ioremap.o
 lib-$(CONFIG_SMP) += cpumask.o
Index: linux-2.6.mod/include/linux/fsalloc.h
===
--- /dev/null   1970-01-01 00:00:00.0 +
+++ linux-2.6.mod/include/linux/fsalloc.h   2007-06-29 12:12:45.0 
-0700
@@ -0,0 +1,34 @@
+/*
+ *  include/linux/fsalloc.h
+ *
+ *  Copyright (C) 2007  Davide Libenzi <[EMAIL PROTECTED]>
+ *
+ */
+
+#ifndef _LINUX_FSALLOC_H
+#define _LINUX_FSALLOC_H
+
+#ifdef __KERNEL__
+
+struct fsa_page {
+   struct fsa_page *next;
+   int order;
+   unsigned long avail;
+   void *data;
+};
+
+struct fsa_context {
+   struct fsa_page *first, *curr;
+   struct fsa_page cached;
+   int order;
+};
+
+void fsa_init(struct fsa_context *ator, void *buffer,
+ unsigned long csize);
+void fsa_cleanup(struct fsa_context *ator);
+void *fsa_alloc(struct fsa_context *ator, unsigned long size);
+
+#endif /* __KERNEL__ */
+
+#endif /* _LINUX_FSALLOC_H */
+
Index: linux-2.6.mod/lib/fsalloc.c
===
--- /dev/null   1970-01-01 00:00:00.0 +
+++ linux-2.6.mod/lib/fsalloc.c 2007-06-29 12:12:45.0 -0700
@@ -0,0 +1,102 @@
+/*
+ *  lib/fsalloc.c
+ *
+ *  Copyright (C) 2007  Davide Libenzi <[EMAIL PROTECTED]>
+ *
+ *  This file implements a fast, sequential allocator. Chunks of memory
+ *  are allocated in blocks of pages, and inside these blocks memory is
+ *  allocated is a sequential fashion. All the allocated memory is released
+ *  in one single sweep by freeing the backing pages. Indeed, there is not
+ *  an fsa_free() function to deallocate a single block. The user can provide
+ *  an initial allocation backing store, if needed, to avoid the alocator to
+ *  call page allocation functions for the first "in cache" allocations.
+ *  The FSA allocator provides no locking, so it is up to the caller serialize
+ *  fsa_alloc() calls if ever needed.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define FSA_MEM_ALIGN  sizeof(unsigned long)
+#define FSA_FIT_FACTOR 8
+#define FSA_MIN_ORDER  1
+#define FSA_MAX_ORDER  8
+
+/**
+ * fsa_init - Initializes a grow-only allocator
+ *
+ * @ator:[in] Pointer to the allocator structure
+ * @buffer:  [in] Pointer to a buffer to be used as initial cached buffer
+ *for the allocator
+ * @csize:   [in] Size of @buffer or 0 in case @buffer is NULL
+ *
+ */
+void fsa_init(struct fsa_context *ator, void *buffer,
+ unsigned long csize)
+{
+   ator->order = FSA_MIN_ORDER;
+   ator->cached.next = NULL;
+   ator->cached.order = -1;
+   ator->cached.avail = csize;
+   ator->cached.data = buffer;
+   ator->first = ator->curr = >cached;
+}
+
+/**
+ * fsa_cleanup - Cleanups all the memory allocated by the grow-only allocator
+ *
+ * @ator:[in] Pointer to the allocator structure
+ *
+ */
+void fsa_cleanup(struct fsa_context *ator)
+{
+   struct fsa_page *curr, *next;
+
+   for (next = ator->first; (curr = next) != NULL;) {
+   next = curr->next;
+   if (curr->order >= 0)
+   free_pages((unsigned long) curr, curr->order);
+   }
+}
+
+/**
+ * fsa_alloc - Allocated a buffer inside the grow-only allocator
+ *
+ * @ator:[in] Pointer to the allocator structure
+ * @size:[in] Size of the requested buffer
+ *
+ * Returns a pointer to the allocated buffer, or NULL in case of error.
+ */
+void *fsa_alloc(struct fsa_context *ator, unsigned long size)
+{
+

Re: 2.6.22-rc6 spurious hangs

2007-06-29 Thread Markus Rechberger

On 6/29/07, Mauro Carvalho Chehab <[EMAIL PROTECTED]> wrote:

> Still we can't do this under cinergyt2->sem, because cinergyt2_query()
> takes it too. This all looks very wrong to me, I hope maintaners can
> explain.

AFAIK, the driver authors are not working anymore with CinergyT2. The
last patch we have on development tree from Holger is dated as Dec, 3
2004. Since them, only internal kernel API changes and a few sparse
fixes from other contributors were applied.

Also, none of the current DVB maintainers seem to have any hardware for
testing.

I received a Mail a while ago that this driver is open to the
community, it duplicates some code because the developers wanted to
use this driver for testing another DVB API which never took off.
Best would be to remove the duplicated code in that driver and make it
look like all other DVB drivers.

Markus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Please release a stable kernel Linux 3.0

2007-06-29 Thread Rene Herman


On 06/29/2007 11:05 PM, Bodo Eggert wrote:


Alan Cox <[EMAIL PROTECTED]> wrote:



Indeed if its public domain you may have almost no rights at all
depending what you were given. Once you get the source code you can do
stuff but I don't have to give you that. If its public domain I can find
security holes in it, and refuse to provide the fixed module in source
form even.


The GPL forces nobody to not release his module under PD, therefore it can't
protect you from that. Even minor changes - like adjusting the module to use
to the current API - won't change that, at least in Germany they'd have to
qualify as a work of their own in order to create a GPL-only derived work,
because anything not qualifying for that could also be integrated into the
PD version, and both would remain identical.


What I focussed on when asking were only my wishes as an author but Alan (if 
I understood him right ofcourse) pointed out that _the kernel_ does not want 
integrated code to be in the public domain regardless of my wishes.


Arguably (no doubt, sigh...) someone could distribute the kernel in binary 
form but refuse to provide source for the bits marked as being in the public 
domain alongside it -- yes, can of worms when compared to GPL demands, but I 
believe I can see why one shouldn't even go near there.


Rene.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.22-rc6 spurious hangs

2007-06-29 Thread Mauro Carvalho Chehab

> Still we can't do this under cinergyt2->sem, because cinergyt2_query()
> takes it too. This all looks very wrong to me, I hope maintaners can
> explain.

AFAIK, the driver authors are not working anymore with CinergyT2. The
last patch we have on development tree from Holger is dated as Dec, 3
2004. Since them, only internal kernel API changes and a few sparse
fixes from other contributors were applied.

Also, none of the current DVB maintainers seem to have any hardware for
testing.


Cheers,
Mauro

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.22-rc6-mm1 Intel DMAR crash on AMD x86_64

2007-06-29 Thread Muli Ben-Yehuda

On Fri, Jun 29, 2007 at 12:23:53PM -0700, Keshavamurthy, Anil S wrote:

> Since this is IOMMU is built into the kernel and it is good idea to
> report that the device is not present.

Yes - as a debug message.

> The above is printed only once and is consistent with other IOMMU
> implementation. Atleast it is useful when people report bugs we can
> makeout whether IOMMU is being detected or not.

If it was printed that it was detected it was - otherwise, it wasn't.

> Here is what I see on my box.
> [..]
> "PCI-GART: No AMD northbridge found."

You're right, that should be a debug message as well.

> [..]
> Calgary: detecting Calgary via BIOS EBDA area
> Calgary: Unable to locate Rio Grande table in EBDA - bailing!

These are KERN_DEBUG messages.

Cheers,
Muli
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [patch 1/5] avoid tlb gather restarts.

2007-06-29 Thread Martin Schwidefsky

On Fri, 2007-06-29 at 19:56 +0100, Hugh Dickins wrote:
> I don't dare comment on your page_mkclean_one patch (5/5),
> that dirty page business has grown too subtle for me.

Oh yes, the dirty handling is tricky. I had to fix a really nasty bug
with it lately. As for page_mkclean_one the difference is that it
doesn't claim a page is dirty if only the write protect bit has not been
set. If we manage to lose dirty bits from ptes and have to rely on the
write protect bit to take over the job, then we have a different problem
altogether, no ?

> Your cleanups 2-4 look good, especially the mm_types.h one (how
> confident are you that everything builds?), and I'm glad we can
> now lay ptep_establish to rest.  Though I think you may have 
> missed removing a __HAVE_ARCH_PTEP... from frv at least?

Ok, thanks for the review. I take a look at frv to see if I missed
something.

> But this one...
> 
> On Fri, 29 Jun 2007, Martin Schwidefsky wrote:
> 
> > If need_resched() is false it is unnecessary to call tlb_finish_mmu()
> > and tlb_gather_mmu() for each vma in unmap_vmas(). Moving the tlb gather
> > restart under the if that contains the cond_resched() will avoid
> > unnecessary tlb flush operations that are triggered by tlb_finish_mmu() 
> > and tlb_gather_mmu().
> > 
> > Signed-off-by: Martin Schwidefsky <[EMAIL PROTECTED]>
> 
> Sorry, no.  It looks reasonable, but unmap_vmas is treading a delicate
> and uncomfortable line between hi-performance and lo-latency: you've
> chosen to improve performance at the expense of latency.

That it true, my only concern had been performance. You likely have a
point here.

> You think you're just moving the finish/gather to where they're
> actually necessary; but the thing is, that per-cpu struct mmu_gather
> is liable to accumulate a lot of unpreemptible work for the future
> tlb_finish_mmu, particularly when anon pages are associated with swap.

Hmm, ok, so you are saying that we should do a flush at the end of each
vma.

> So although there may be no need to resched right now, if we keep on
> gathering more and more without flushing, we'll be very unresponsive
> when a resched is needed later on.  Hence Ingo's ZAP_BLOCK_SIZE to
> split it up, small when CONFIG_PREEMPT, more reasonable but still
> limited when not.

Would it be acceptable to call tlb_flush_mmu instead of the
tlb_finish_mmu / tlb_gather_mmu pair if the condition around
cond_resched evaluates to false?
The background for this change is that I'm working on another patch that
will change the tlb flushing for s390 quite a bit. We won't have
anything to flush with tlb_finish_mmu because we will either flush all
tlbs with tlb_gather_mmu or each pte seperatly. The pages will always be
freed immediatly. If we are forced to restart the tlb gather then we'll
do multiple flush_tlb_mm because the information that we already flushed
everything is lost with tlb_finish_mmu.

-- 
blue skies,
  Martin.

"Reality continues to ruin my life." - Calvin.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] Containment measures for slab objects on scatter gather lists

2007-06-29 Thread Russell King

On Fri, Jun 29, 2007 at 01:45:29PM -0700, Andrew Morton wrote:
> On Fri, 29 Jun 2007 13:16:57 +0100
> Alan Cox <[EMAIL PROTECTED]> wrote:
> 
> > > If those operations involve modifying that slab page's pageframe then what
> > > stops concurrent dma'ers from stomping on each other's changes?  As in:
> > > why aren't we already buggy?
> > 
> > Or DMA operations falling out with CPU operations in the same memory
> > area. Not all platforms have hardware consistency and some will blat the
> > entire page out of cache.
> 
> Is that just a performance problem, or can data be lost here?  It depends
> on the meaning of "blat": writeback?  invalidate?  More details, please.
> 
> 
> I'm dyin here and nobody will talk to me.  If the kernel is already doing
> these things, why aren't we already buggy?  Is it because we don't actually
> modify the pageframes of these dma-to-from-kmalloced pages?  But we were
> thinking of doing so in the future?

I think people are getting too het up about this.

DMA to or from memory should be done via the DMA mapping API.  If we're
DMAing to/from a limited range within a page, either we should be using
dma_map_single(), or dma_map_page() with an appropriate offset and size.

Other cache flushing functions should not be called for DMA operations;
any cache handling required by non-coherent architectures should be done
by the DMA API only.

However, with non-coherent aliasing architectures (such as those with
aliasing VIPT or VIVT caches) there is an additional requirement on PIO
to page cache.  If the page we're writing data has some cache lines
allocated to it, we potentially hit those cache lines and the data
doesn't hit the underlying page.  Later on, when we come to map the
page into userspace, the data may still be sitting in the cache lines
corresponding with the kernel's mapping.  Therefore, there is a
requirement to ensure that the cache state WRT the kernel's mapping is
the same irrespective of the method by which data ends up in the page.

That means that for these caches, the data PIO'd into the page must be
written back to the underlying page before the page is handed to
userspace.

The two are completely separate; it seems to me from the above discussion
that people are confusing the two scenarios, and mixing DMA with the PIO
cache handling.  Please don't, you'll only get more and more confused.

(Note: with the dma_map_* API, architectures have to be sensible when
they're passed offests and sizes which aren't cacheline aligned.
Technically, it's buggy to ask for non-L1 line aligned offsets and
sizes, but they do happen.  We handle this on ARM by writing back
the overlapped lines and invalidating the rest before the DMA operation
commences, and hope that the overlapped lines aren't touched for the
duration of the DMA.)

-- 
Russell King
 Linux kernel2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Please release a stable kernel Linux 3.0

2007-06-29 Thread Bodo Eggert

Alan Cox <[EMAIL PROTECTED]> wrote:
> On Fri, 29 Jun 2007 00:00:27 +0200
> Rene Herman <[EMAIL PROTECTED]> wrote:
>> On 06/28/2007 06:30 PM, Alan Cox wrote:

>> > Public domain is GPL compatible.
>> 
>> Would you happen to have an opinion on the attached? I don't so much need it
> 
> The answer is "NO"
> 
> Public domain also means "I don't have to give you the source".
> If its merged with the kernel the resulting work is GPL anyway
> 
>> Stating that code which one intends to be in the public domain has "GPL and
>> additional rights" is a bit of a travesty though.
> 
> Indeed if its public domain you may have almost no rights at all
> depending what you were given. Once you get the source code you can do
> stuff but I don't have to give you that. If its public domain I can find
> security holes in it, and refuse to provide the fixed module in source
> form even.

The GPL forces nobody to not release his module under PD, therefore it can't
protect you from that. Even minor changes - like adjusting the module to use
to the current API - won't change that, at least in Germany they'd have to
qualify as a work of their own in order to create a GPL-only derived work,
because anything not qualifying for that could also be integrated into the
PD version, and both would remain identical.
-- 
"Cluster bombing from B-52s is very, very accurate. The bombs are
guaranteed to always hit the ground."
-U.S.A.F. Ammo Troop
Friß, Spammer: [EMAIL PROTECTED] [EMAIL PROTECTED]
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [1/2] 2.6.22-rc6: known regressions v2

2007-06-29 Thread Jeff Garzik


Alan Cox wrote:

On Fri, 29 Jun 2007 14:10:49 -0400
Jeff Garzik <[EMAIL PROTECTED]> wrote:


Alan Cox wrote:

I'm not even sure this report is IT8212 related rather than just an IRQ
storm

Why does the driver report "irq 0"?

ata7: PATA max UDMA/133 cmd  ctl  bmdma  irq 0  <

Above that, the ACPI layer says it assigned IRQ 20

Because the libata core code in 2.6.22rc6 reports all the ports and IRQ
values wrongly ?
AFAIK that was fixed, for IRQ.  Please point out examples where it 
remains broken...


2.6.22-rc6 it is broken, for all the systems I've looked at, as are the
port numbers. Tejun posted fixes for the IRQ but they do not seem to have
been applied, or if they were it was post -rc6 to a git tree.



I've seen the patch that eventually became 
22888423b3b1b96573250671afb5b72ea4364902 from Olof Johansson but nothing 
from Tejun on the subject.


Jeff


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.22-rc6-mm1

2007-06-29 Thread Andrew Morton

On Fri, 29 Jun 2007 10:50:30 -0400
[EMAIL PROTECTED] wrote:

> On Thu, 28 Jun 2007 03:43:21 PDT, Andrew Morton said:
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.22-rc6/2.6.22-rc6-mm1/
> 
> Configures, builds, boots on first try.  Dell Latitude D820 laptop, T7200 CPU,
> x86_64 kernel.  Doesn't break any of the out-of-tree stuff I use.
> 
> >   `make oldconfig', your kernel probably won't work.  I lost useful things
> >   like CONFIG_BLK_DEV and the whole SCSI system, because they were added 
> > after
> >   I generated my .config.
> 
> Odd - just for grins, I checked what 'make oldconfig' did when handed a 
> .config
> from 22-rc4-mm2, and it behaved just fine, much to my surprise.

That's probably because your old config file was relatively recent, and
had things like CONFIG_BLK_DEV=y in it.

But those people who are still dragging around old config files which
predate the introduction of CONFIG_BLK_DEV will find that 2.6.22-rc6-mm1
oldconfig will give them CONFIG_BLK_DEV=n instead of current mainline's
deafult of CONFIG_BLK_DEV=y.

I think making BLK_DEV deafult to n was a bit dumb, so I dropped the
offending patches.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.22-rc6-mm1

2007-06-29 Thread Andrew Morton

On Fri, 29 Jun 2007 16:17:38 +0200 (CEST)
Roman Zippel <[EMAIL PROTECTED]> wrote:

> On Thu, 28 Jun 2007, Andrew Morton wrote:
> 
> >   So save yourself some hassle and check your .config carefully before
> >   building this kernel.  Make sure that everything you need is still 
> > enabled.
> > 
> >   I found that manually adding "CONFIG_BLK_DEV=y" to the .config before
> >   running oldconfig saved a large number of config items from getting lost.
> 
> This patch should help for this, so that this isn't done when Kconfig or 
> .config has been changed and they are not in sync.
> 
> bye, Roman
> 
> 
> Reset generates values only if Kconfig and .config agree.

unclear.  Could you please explain further what this change does?

Thanks.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 0/6][TAKE5] fallocate system call

2007-06-29 Thread Andrew Morton

On Fri, 29 Jun 2007 11:50:04 -0400
Mingming Caoc <[EMAIL PROTECTED]> wrote:

> I think the ext4 patch queue is in good shape now.

Which ext4 patches are you intending to merge into 2.6.23?

Please send all those out to lkml for review?
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] frv: fix fallout from "remove sched.h from mm.h" patch

2007-06-29 Thread Andrew Morton

On Fri, 29 Jun 2007 09:10:52 -0400 (EDT)
"Robert P. J. Day" <[EMAIL PROTECTED]> wrote:

> > Please provide changelogs.
> >
> > I assume this patch fixes some build error or something.

I am still awaiting a description of what this patch does.

>   i actually asked about this on either the janitors or newbies list
> the other day -- one of the early examples from the LDD3 device
> drivers book which built fine all this time suddenly stopped building,
> until i explicitly included  to be able to dereference
> a pointer to "task_struct":
> 
> /home/rpjday/AMD/k/topics/0_hi/hi1.c:15: error: dereferencing pointer to 
> incomplete type
> /home/rpjday/AMD/k/topics/0_hi/hi1.c:16: error: dereferencing pointer to 
> incomplete type

I'll proceed, assuming that the patch fixes the above build errors.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

how to determine if the noexec stack is defined by an application

2007-06-29 Thread Florin Andrei


I'm reading Ingo's NX quick start document:

http://people.redhat.com/mingo/nx-patches/QuickStart-NX.txt

Quote:
"If an application defines a noexec stack then the kernel will enforce 
this executability, and all attempts to execute on the stack will be 
prevented by the hardware."


My question is related to the conditional "if an application". So it 
looks like it depends on the app.
Now, the OS/hardware combination that I'm using (RHEL4 WS 32 bit on 
AMD64 CPU - long story, don't ask) definitely enables NX:


# grep -i nx /var/log/dmesg
NX (Execute Disable) protection: active

But it's running a Web service which is a combination of C code and 
Tomcat/Java. I have no clue how to determine which portions specify a 
noexec stack and which don't.


In case it turns out some portions do not specify a noexec stack, my 
next question is how to get the application to create a noexec stack 
(assume I can make that request to the developers).



(please do NOT Cc me, I'm subscribed to the list)

--
Florin Andrei

http://florin.myip.org/
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Need help making sense of IRQ API

2007-06-29 Thread Michal Schmidt

LOL ER wrote:
> Hello,
>   I've been trying to make sense of how the kernel (on an i386) calls
> __do_IRQ() from do_IRQ() for the past few days to no avail. [...]

Since i386 was switched to the generic-IRQ architecture (see "Linux
generic IRQ handling" in Documentation/Docbook) it does not use __do_IRQ().

common_interrupt (in assembler) calls do_IRQ(), which calls
desc->handle_irq() that is usually one of:
 handle_fasteoi_irq()
 handle_level_irq()
 handle_edge_irq()

Michal
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.22-rc6-mm1

2007-06-29 Thread Andrew Morton

On Fri, 29 Jun 2007 14:32:09 +0200
Mariusz Kozlowski <[EMAIL PROTECTED]> wrote:

> Hello,
> 
>   allmodconfig on powerpc (iMac g3) fails due to
> git-kgdb.patch. allmodconfig defaults should be changed?
> 
>   CC  arch/powerpc/kernel/kgdb.o
> arch/powerpc/kernel/kgdb.c:485:2: error: #error Both XMON and KGDB selected 
> in .config. Unselect one of them.
> make[1]: *** [arch/powerpc/kernel/kgdb.o] Blad 1
> make: *** [arch/powerpc/kernel] Blad 2

Jason cc'ed

> anyway after unselecting XMON we can see:
> 
>   CC [M]  fs/xfs/linux-2.6/xfs_ioctl32.o
> fs/xfs/linux-2.6/xfs_ioctl32.c: In function 'xfs_ioc_bulkstat_compat':
> fs/xfs/linux-2.6/xfs_ioctl32.c:334: error: 'xfs_inumbers_fmt_compat' 
> undeclared (first use in this function)
> fs/xfs/linux-2.6/xfs_ioctl32.c:334: error: (Each undeclared identifier is 
> reported only once
> fs/xfs/linux-2.6/xfs_ioctl32.c:334: error: for each function it appears in.)
> make[2]: *** [fs/xfs/linux-2.6/xfs_ioctl32.o] Blad 1
> make[1]: *** [fs/xfs] Blad 2
> 
> This is just allmodconfig - not a .config that's used daily by users but I'm 
> used to compiling the kernel using it anyway 8)
> 

Michal cc'ed.  I think this is the one which was already reported but
I haven't seen a fix yet?
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Kernel doesn't recognize complete memory

2007-06-29 Thread Frank Fiene

Lenovo Z61p, Intel Core2 Duo T7200

I have 4GB RAM installed and BIOS recognize 4GB RAM.
Linux kernel (Ubuntu-7.04, 32bit-PAE and 64bit, openSUSE-10.2 32bit-PAE 
and 64bit) tells me: only 3GB of RAM are installed.

Any other user with a 4GB Thinkpad? tytso?

What can i do? Please help!

Regards
Frank
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] Containment measures for slab objects on scatter gather lists

2007-06-29 Thread Andrew Morton

On Fri, 29 Jun 2007 13:16:57 +0100
Alan Cox <[EMAIL PROTECTED]> wrote:

> > If those operations involve modifying that slab page's pageframe then what
> > stops concurrent dma'ers from stomping on each other's changes?  As in:
> > why aren't we already buggy?
> 
> Or DMA operations falling out with CPU operations in the same memory
> area. Not all platforms have hardware consistency and some will blat the
> entire page out of cache.

Is that just a performance problem, or can data be lost here?  It depends
on the meaning of "blat": writeback?  invalidate?  More details, please.

I'm dyin here and nobody will talk to me.  If the kernel is already doing
these things, why aren't we already buggy?  Is it because we don't actually
modify the pageframes of these dma-to-from-kmalloced pages?  But we were
thinking of doing so in the future?
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 1/2] x86_64: get mp_bus_to_node as early

2007-06-29 Thread Yinghai Lu

[PATCH 1/2] x86_64: get mp_bus_to_node as early

In struct device, we already have numa_node member. and we can use dev_to_node()
/set_dev_node() to get and set numa_node in the device.
set_dev_node is called in pci_device_add() with pcibus_to_node(bus). and
pci_bus_to_node use bus->sysdata for nodeid.
the problem is when pci_add_device is called, bus->sysdata is not assigned
correct nodeid yet. the result will be numa_node always is 0.
pcibios_scan_root and pci_scan_root could take sysdata. So we need to get
mp_bus_to_node mapping before these two are called. and get_mp_bus_to_node
could get correct node for sysdata in root bus.
in scanning of root bus, all child bus will take parent bus sysdata. So all
pci_device->dev.numa_node will be assigned correctly automatically.
later we could use dev_to_node(_dev->dev) to numa_node, and we could also
could make other bus specific device get the correct numa_node too.
and in different driver we could use kmalloc_node instead of kmalloc for
skbuff/net or urb/usb etc. That could help improve performance with usb or net 
or sata for AMD K8 two sockets beyond system.

For example:
two way opteron system and only one HT chain on node 0. USB controller on SB 
will be on node0. some dma accessing is used with kmalloc/dma_map_single. and 
these address will be on node1 instead of node0. and even worse, when node1 ram
 is above 4G, we may need to iommu mapping for usb operation.
two way system with one HT chain on different node, we will need to 
kmalloc/dma_map_single to use ram on corresonding node too.  esp for nvidia 
mcp55/io55 system. the second io55 could have nic/sata/pcie devices.

Signed-off-by: Yinghai Lu <[EMAIL PROTECTED]>

diff --git a/arch/i386/pci/Makefile b/arch/i386/pci/Makefile
index 44650e0..600d0e7 100644
--- a/arch/i386/pci/Makefile
+++ b/arch/i386/pci/Makefile
@@ -10,5 +10,6 @@ pci-y += legacy.o irq.o
 
 pci-$(CONFIG_X86_VISWS):= visws.o fixup.o
 pci-$(CONFIG_X86_NUMAQ):= numa.o irq.o
+pci-$(CONFIG_NUMA) += mp_bus_to_node.o
 
 obj-y  += $(pci-y) common.o early.o
diff --git a/arch/i386/pci/acpi.c b/arch/i386/pci/acpi.c
index b33aea8..5f8859f 100644
--- a/arch/i386/pci/acpi.c
+++ b/arch/i386/pci/acpi.c
@@ -8,24 +8,27 @@
 struct pci_bus * __devinit pci_acpi_scan_root(struct acpi_device *device, int 
domain, int busnum)
 {
struct pci_bus *bus;
+#ifdef CONFIG_ACPI_NUMA
+   int pxm;
+   int node;
+#endif
 
if (domain != 0) {
printk(KERN_WARNING "PCI: Multiple domains not supported\n");
return NULL;
}
 
-   bus = pcibios_scan_root(busnum);
 #ifdef CONFIG_ACPI_NUMA
-   if (bus != NULL) {
-   int pxm = acpi_get_pxm(device->handle);
-   if (pxm >= 0) {
-   bus->sysdata = (void *)(unsigned long)pxm_to_node(pxm);
-   printk("bus %d -> pxm %d -> node %ld\n",
-   busnum, pxm, (long)(bus->sysdata));
-   }
+   pxm = acpi_get_pxm(device->handle);
+   if (pxm >= 0) {
+   node  = pxm_to_node(pxm);
+   printk("bus %02x -> pxm %d -> node %02x\n", busnum, pxm, node);
+   set_mp_bus_to_node(busnum, node);
}
 #endif
-   
+
+   bus = pcibios_scan_root(busnum);
+
return bus;
 }
 
diff --git a/arch/i386/pci/common.c b/arch/i386/pci/common.c
index 3f78d4d..d47f0a0 100644
--- a/arch/i386/pci/common.c
+++ b/arch/i386/pci/common.c
@@ -293,6 +293,7 @@ static struct dmi_system_id __devinitdata 
pciprobe_dmi_table[] = {
 struct pci_bus * __devinit pcibios_scan_root(int busnum)
 {
struct pci_bus *bus = NULL;
+   long node;
 
dmi_check_system(pciprobe_dmi_table);
 
@@ -303,9 +304,15 @@ struct pci_bus * __devinit pcibios_scan_root(int busnum)
}
}
 
+   node = get_mp_bus_to_node(busnum);
+
+#ifdef CONFIG_NUMA
+   printk(KERN_DEBUG "PCI: Probing PCI hardware (bus %02x) with (node 
%02lx)\n", busnum, node);
+#else
printk(KERN_DEBUG "PCI: Probing PCI hardware (bus %02x)\n", busnum);
+#endif
 
-   return pci_scan_bus_parented(NULL, busnum, _root_ops, NULL);
+   return pci_scan_bus_parented(NULL, busnum, _root_ops, (void *)node);
 }
 
 extern u8 pci_cache_line_size;
diff --git a/arch/i386/pci/irq.c b/arch/i386/pci/irq.c
index f2cb942..50df769 100644
--- a/arch/i386/pci/irq.c
+++ b/arch/i386/pci/irq.c
@@ -136,9 +136,11 @@ static void __init pirq_peer_trick(void)
busmap[e->bus] = 1;
}
for(i = 1; i < 256; i++) {
+   long node;
if (!busmap[i] || pci_find_bus(0, i))
continue;
-   if (pci_scan_bus(i, _root_ops, NULL))
+   node = get_mp_bus_to_node(i);
+   if (pci_scan_bus(i, _root_ops, (void *)node))
printk(KERN_INFO "PCI: Discovered primary peer bus %02x

[PATCH 2/2] net: make net and forcedeth to use kmalloc_node

2007-06-29 Thread Yinghai Lu

[PATCH 2/2] net: make net and forcedeth to use kmalloc_node

Signed-off-by: Yinghai Lu <[EMAIL PROTECTED]>
diff --git a/drivers/net/forcedeth.c b/drivers/net/forcedeth.c
index 42ba1c0..6d53b52 100644
--- a/drivers/net/forcedeth.c
+++ b/drivers/net/forcedeth.c
@@ -1383,7 +1383,7 @@ static int nv_alloc_rx(struct net_device *dev)
less_rx = np->last_rx.orig;
 
while (np->put_rx.orig != less_rx) {
-   struct sk_buff *skb = dev_alloc_skb(np->rx_buf_sz + 
NV_RX_ALLOC_PAD);
+   struct sk_buff *skb = dev_alloc_skb_node(np->rx_buf_sz + 
NV_RX_ALLOC_PAD, dev_to_node(>dev));
if (skb) {
np->put_rx_ctx->skb = skb;
np->put_rx_ctx->dma = pci_map_single(np->pci_dev,
@@ -1415,7 +1415,7 @@ static int nv_alloc_rx_optimized(struct net_device *dev)
less_rx = np->last_rx.ex;
 
while (np->put_rx.ex != less_rx) {
-   struct sk_buff *skb = dev_alloc_skb(np->rx_buf_sz + 
NV_RX_ALLOC_PAD);
+   struct sk_buff *skb = dev_alloc_skb_node(np->rx_buf_sz + 
NV_RX_ALLOC_PAD, dev_to_node(>dev));
if (skb) {
np->put_rx_ctx->skb = skb;
np->put_rx_ctx->dma = pci_map_single(np->pci_dev,
@@ -3976,8 +3976,8 @@ static int nv_set_ringparam(struct net_device *dev, 
struct ethtool_ringparam* ri
sizeof(struct ring_desc_ex) * 
(ring->rx_pending + ring->tx_pending),
_addr);
}
-   rx_skbuff = kmalloc(sizeof(struct nv_skb_map) * ring->rx_pending, 
GFP_KERNEL);
-   tx_skbuff = kmalloc(sizeof(struct nv_skb_map) * ring->tx_pending, 
GFP_KERNEL);
+   rx_skbuff = kmalloc_node(sizeof(struct nv_skb_map) * ring->rx_pending, 
GFP_KERNEL, dev_to_node(>dev));
+   tx_skbuff = kmalloc_node(sizeof(struct nv_skb_map) * ring->tx_pending, 
GFP_KERNEL, dev_to_node(>dev));
if (!rxtx_ring || !rx_skbuff || !tx_skbuff) {
/* fall back to old rings */
if (np->desc_ver == DESC_VER_1 || np->desc_ver == DESC_VER_2) {
@@ -4372,7 +4372,7 @@ static int nv_loopback_test(struct net_device *dev)
 
/* setup packet for tx */
pkt_len = ETH_DATA_LEN;
-   tx_skb = dev_alloc_skb(pkt_len);
+   tx_skb = dev_alloc_skb_node(pkt_len, dev_to_node(>dev));
if (!tx_skb) {
printk(KERN_ERR "dev_alloc_skb() failed during loopback test"
 " of %s\n", dev->name);
@@ -4976,6 +4976,7 @@ static int __devinit nv_probe(struct pci_dev *pci_dev, 
const struct pci_device_i
dev->base_addr = (unsigned long)np->base;
 
dev->irq = pci_dev->irq;
+   printk(KERN_INFO "nv_probe: numa_node : %02d\n", 
dev_to_node(_dev->dev));
 
np->rx_ring_size = RX_RING_DEFAULT;
np->tx_ring_size = TX_RING_DEFAULT;
@@ -4995,8 +4996,9 @@ static int __devinit nv_probe(struct pci_dev *pci_dev, 
const struct pci_device_i
goto out_unmap;
np->tx_ring.ex = >rx_ring.ex[np->rx_ring_size];
}
-   np->rx_skb = kmalloc(sizeof(struct nv_skb_map) * np->rx_ring_size, 
GFP_KERNEL);
-   np->tx_skb = kmalloc(sizeof(struct nv_skb_map) * np->tx_ring_size, 
GFP_KERNEL);
+   np->rx_skb = kmalloc_node(sizeof(struct nv_skb_map) * np->rx_ring_size, 
GFP_KERNEL, dev_to_node(_dev->dev));
+   np->tx_skb = kmalloc_node(sizeof(struct nv_skb_map) * np->tx_ring_size, 
GFP_KERNEL, dev_to_node(_dev->dev));
+
if (!np->rx_skb || !np->tx_skb)
goto out_freering;
memset(np->rx_skb, 0, sizeof(struct nv_skb_map) * np->rx_ring_size);
@@ -5204,6 +5206,13 @@ static int __devinit nv_probe(struct pci_dev *pci_dev, 
const struct pci_device_i
np->autoneg = 1;
 
err = register_netdev(dev);
+
+   /*
+* store numa_node in dev->dev, so we don't need to use
+* netdev_priv(dev)->pci_dev->dev
+*/
+   set_dev_node(>dev, dev_to_node(_dev->dev));
+
if (err) {
printk(KERN_INFO "forcedeth: unable to register netdev: %d\n", 
err);
goto out_error;
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 6f0b2f7..747588c 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -333,11 +333,22 @@ static inline struct sk_buff *alloc_skb(unsigned int size,
return __alloc_skb(size, priority, 0, -1);
 }
 
+static inline struct sk_buff *alloc_skb_node(unsigned int size,
+   gfp_t priority, int node)
+{
+   return __alloc_skb(size, priority, 0, node);
+}
+
 static inline struct sk_buff *alloc_skb_fclone(unsigned int size,
   gfp_t priority)
 {
return __alloc_skb(size, priority, 1, -1);
 }
+static inline struct sk_buff *alloc_skb_fclone_node(unsigned int size,
+  gfp_t priority, int node)
+{
+

Re: 2.6.21-rt9 problem : xruns

2007-06-29 Thread Guennadi Liakhovetski

On Fri, 29 Jun 2007 [EMAIL PROTECTED] wrote:

> Hi,
> 
> I've recently compiled a "vanilla" 2.6.21 kernel, patched with Ingo Molnar's
> rt-8 patch, as I was unable to compile with rt-7.
> 
> I needed it because I'm using audio applications (tests were made with
> FrugalWare, but I don't think it's a distro issue).
> 
> Everything was allright until I changed my motherboard for an  Asrock
> 4coreDual-Vsta (I formerly used a Nforce4 one), with VIA PT880 Ultra chipset.
> 
> I've justed switched the hardware, as Linux is neat enough to boot without
> having to reinstall the whole OS.
> 
> Since, I get tons of xruns when using RT applications, and the only solution
> I've found to "fix" it was to disable ACPI at boot time.

Interestingly, this looks very similar to a problem I had... Could you, 
please, verify if acpid is running (with ACPI configured on), and if yes - 
stop it and retest for xruns?

Thanks
Guennadi

> Moreover, I get this message at boot time :
> PCI: BIOS bug: MCFG [EMAIL PROTECTED] is not E820 reserved
> PCI: not using MMCONFIG
> 
> I've also tested this hardware setup with a UbuntoStudio kernel (2.6.19),
> and everything works flawlessy (no XRUNS, no kernel messages) !
> 
> So here is my question (I'd like to understand what's happening) :
>- is it a kernel (or patch) issue ?
>- is it a bad chipset support in the latest kernel versions ?
>- could this be related to BIOS issues only (I've tried all available
> versions without any change) ?
>- should I use special settings in the ".config" in order to avoid these
> problems ?
> 
> I wasn't able to detect what causes these XRUNS, so if anyone has clues ...
> 
> Regards
> skb
> 
> 
> 
> 
> 
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [EMAIL PROTECTED]
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 

---
Guennadi Liakhovetski
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: call request_irq before or after hardware initialization?

2007-06-29 Thread Arjan van de Ven

On Fri, 2007-06-29 at 12:35 -0700, Li Juen Hwang wrote:
> 
> Hi,
> 
> Most 1394 drivers on Linux, ohci1394 for instance, calls 
> request_irq() before
> initializing/enabling hardware chip. I'd like to reverse the order 
> so that driver
> can exit the kernel without calling free_irq() if hardware failed. 
> Is that ok?
> will it cause side effect? Thanks.

well you have to be able to handle interrupts the moment the hardware
will generate them... so the request_irq has to happen before that
point.


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [patch 0/4] MAP_NOZERO v2 - VM_NOZERO/MAP_NOZERO early summer madness

2007-06-29 Thread Davide Libenzi

On Fri, 29 Jun 2007, Andy Isaacson wrote:

> On Thu, Jun 28, 2007 at 10:57:00PM -0400, Kyle Moffett wrote:
> > On Jun 28, 2007, at 14:49:24, Davide Libenzi wrote:
> > >So I implemented a rather quick hack that introduces a new mmap()  
> > >flag MAP_NOZERO (only valid for anonymous mappings) and the  vma   
> > >counter-part VM_NOZERO. Also, a new sys_brk2() has been introduced  
> > >to accept a new flags  parameter. A brief description of the  
> > >patches follows in the next emails.
> > 
> > Hmm, sounds like this would also need a "MAP_NOREUSE" flag of some  
> > kind for security sensitive applications.  Basically, I wouldn't want  
> > my ssh-agent pages holding private SSH keys to be reused by my web  
> > browser which then gets exploited :-D.
> 
> PGP at least (and I think GPG still) did overwrite keys before calling
> free(), and attempted to use mlock().  Looks like ssh-agent doesn't use
> mlock -- at least it hasn't in this case:
> % grep Lck /proc/`pidof ssh-agent`/status
> VmLck: 0 kB
> % ulimit -a | grep lock
> file size (blocks) unlimited
> core file size (blocks)0
> locked-in-memory size (kb) 32
> file locks unlimited
> 
> Requiring security-sensitive apps to use a new flag to get safe behavior
> is dangerous.  Better to be safe by default and turn on the
> less-safe-but-faster behavior for the cases that benefit from it.

Can you better explain what MAP_NOZERO would alter in such case?



> I still think that using uid in mm_struct is wrong, and some kind of
> abstraction is required.  I called this "free pool" in
> <[EMAIL PROTECTED]>, but I think that name is
> misleading -- I am not proposing that this should be part of the
> management of free pages, but should be a label which abstracts "safe to
> share freed pages among" groups.  Then different SELinux protection
> domains would simply have different labels.

I think I answered this one at least a couple of times, but anyawy. First, 
that can be whatever cookie we choose. At the moment UID is used because 
it makes easier a fit into _mapcount. Second, SeLinux will be able to 
disable the feature on a per-process base, or globally.
Anything else?



- Davide


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

call request_irq before or after hardware initialization?

2007-06-29 Thread Li Juen Hwang




Hi,

   Most 1394 drivers on Linux, ohci1394 for instance, calls 
request_irq() before
   initializing/enabling hardware chip. I'd like to reverse the order 
so that driver
   can exit the kernel without calling free_irq() if hardware failed. 
Is that ok?

   will it cause side effect? Thanks.


   Li-Juen
  
-

To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH pata-2.6 fix] hpt366: use correct enablebits for HPT36x

2007-06-29 Thread Sergei Shtylyov

The HPT36x chips finally turned out to have the channel enable bits -- however,
badly implemented.  Make use of them despite it's probably only going to burden
the driver's code -- assuming both channels are always enabled by the HighPoint
BIOS anyway...

Signed-off-by: Sergei Shtylyov <[EMAIL PROTECTED]>

---
Michal, Linas, please verify the patch... :-)

 drivers/ide/pci/hpt366.c |   20 +++-
 1 files changed, 15 insertions(+), 5 deletions(-)

Index: linux-2.6/drivers/ide/pci/hpt366.c
===
--- linux-2.6.orig/drivers/ide/pci/hpt366.c
+++ linux-2.6/drivers/ide/pci/hpt366.c
@@ -1,5 +1,5 @@
 /*
- * linux/drivers/ide/pci/hpt366.c  Version 1.05Jun 26, 2007
+ * linux/drivers/ide/pci/hpt366.c  Version 1.06Jun 27, 2007
  *
  * Copyright (C) 1999-2003 Andre Hedrick <[EMAIL PROTECTED]>
  * Portions Copyright (C) 2001 Sun Microsystems, Inc.
@@ -1514,18 +1514,28 @@ static int __devinit init_setup_hpt366(s
goto init_single;
 
/*
-* HPT36x chips are single channel and
-* do not seem to have the channel enable bit...
+* HPT36x chips have one channel per function and have
+* both channel enable bits located differently and visible
+* to both functions -- really stupid design decision... :-(
+* Bit 4 is for the primary channel, bit 5 for the secondary.
 */
d->channels = 1;
-   d->enablebits[0].reg = 0;
+   d->enablebits[0].mask = d->enablebits[0].val = 0x10;
 
if ((dev2 = pci_get_slot(dev->bus, dev->devfn + 1)) != NULL) {
-   u8  pin1 = 0, pin2 = 0;
+   u8  mcr1 = 0, pin1 = 0, pin2 = 0;
int ret;
 
pci_set_drvdata(dev2, info[rev]);
 
+   /*
+* Now we'll have to force both channels enabled if
+* at least one of them has been enabled by BIOS...
+*/
+   pci_read_config_byte(dev, 0x50, );
+   if (mcr1 & 0x30)
+   pci_write_config_byte(dev, 0x50, mcr1 | 0x30);
+
pci_read_config_byte(dev,  PCI_INTERRUPT_PIN, );
pci_read_config_byte(dev2, PCI_INTERRUPT_PIN, );
if (pin1 != pin2 && dev->irq == dev2->irq) {

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Need help making sense of IRQ API

2007-06-29 Thread LOL ER


Hello,
 I've been trying to make sense of how the kernel (on an i386) calls
__do_IRQ() from do_IRQ() for the past few days to no avail. After
doing some research I found out that do_IRQ() calls the corresponding
"highlevel irq-events handler", this lead me to believe that the
kernel calls __do_IRQ through "desc->handle_irq(irq, desc);" (snippet
from do_IRQ in arch/i386). Looking at the kernel I found a comment
which reads, " * @handle_irq: highlevel irq-events handler [if NULL,
__do_IRQ()]", instinctively I then grepped through the kernel to find
any instances of where handle_IRQ is checked for being NULL. I then
found an inlined function called generic_handle_irq that does just
that. If indeed generic_handle_irq() is the default value of
handle_irq(), why is its datatype not irq_flow_handler_t, and where in
the kernel is it set as the default "highlevel irq-events handler?"
Just to clarify what I mean, the handle_irq member in the irq_desc
struct has its datatype as irq_flow_handler_t, which is a function
pointer that takes 2 arguments,  "unsigned int irq, struct irq_desc
*desc", however unsigned generic_handle_irq() takes one argument,
which is just an int representing the IRQ number.
Thank You,
  Robert G.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Concerning a post that you made about expandable anonymous shared mappings

2007-06-29 Thread William Tambe

I read a post that you made about not being able to expand anonymous 
shared mapping with mremap(). And I am actually having that issue now.


You made the post in 2004 and we are now in 2007. I would like to know 
if that feature was added because the code below always fail with bus 
error on my machine. I use glibc 2.5


Thank you for helping.

#define _GNU_SOURCE
#include 
#include 

#include 

main() {
void *ptr;
if ((ptr=mmap(0, 4096, PROT_READ|PROT_WRITE,
MAP_ANONYMOUS|MAP_SHARED|MAP_GROWSDOWN, 0, 0)) == -1) {
printf("failed to mmap\n");
return;
}

if ((ptr=mremap(ptr, 4096, 8192, MREMAP_MAYMOVE)) == -1) {
printf("failed to mremap\n");
return;
}

//why does this failed. I am well in the interval [4096, 8192]
*(unsigned int *)(ptr + 4096 + 8)= 10;
}
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

drivers/net/wireless/libertas/rx.c: use-after-free

2007-06-29 Thread Adrian Bunk

The Coverity checker spotted the following use-after-free of "skb" in 
drivers/net/wireless/libertas/rx.c introduced by
commit 9012b28a407511fb355f6d2176a12d4653489672 (WTF did this commit
with the title "libertas: make debug configurable" add the 
"skb->protocol = __constant_htons(0x0019);" line?):

<--  snip  -->

...
static int process_rxed_802_11_packet(wlan_private * priv, struct sk_buff *skb)
{
...
libertas_upload_rx_packet(priv, skb);

ret = 0;

done:
skb->protocol = __constant_htons(0x0019);   /* ETH_P_80211_RAW */
...

<--  snip  -->


cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH] xen: fix x86 config dependencies

2007-06-29 Thread Jeremy Fitzhardinge


Make sure we set dependencies on CPU features.

Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]>
Cc: Adrian Bunk <[EMAIL PROTECTED]>
---
arch/i386/xen/Kconfig |2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

===
--- a/arch/i386/xen/Kconfig
+++ b/arch/i386/xen/Kconfig
@@ -4,7 +4,7 @@

config XEN
bool "Enable support for Xen hypervisor"
-   depends on PARAVIRT
+   depends on PARAVIRT && X86_CMPXCHG && X86_TSC
help
  This is the Linux Xen port.  Enabling this will allow the
  kernel to boot in a paravirtualized environment under the


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [patch 0/4] MAP_NOZERO v2 - VM_NOZERO/MAP_NOZERO early summer madness

2007-06-29 Thread Andy Isaacson

On Thu, Jun 28, 2007 at 10:57:00PM -0400, Kyle Moffett wrote:
> On Jun 28, 2007, at 14:49:24, Davide Libenzi wrote:
> >So I implemented a rather quick hack that introduces a new mmap()  
> >flag MAP_NOZERO (only valid for anonymous mappings) and the  vma   
> >counter-part VM_NOZERO. Also, a new sys_brk2() has been introduced  
> >to accept a new flags  parameter. A brief description of the  
> >patches follows in the next emails.
> 
> Hmm, sounds like this would also need a "MAP_NOREUSE" flag of some  
> kind for security sensitive applications.  Basically, I wouldn't want  
> my ssh-agent pages holding private SSH keys to be reused by my web  
> browser which then gets exploited :-D.

PGP at least (and I think GPG still) did overwrite keys before calling
free(), and attempted to use mlock().  Looks like ssh-agent doesn't use
mlock -- at least it hasn't in this case:
% grep Lck /proc/`pidof ssh-agent`/status
VmLck: 0 kB
% ulimit -a | grep lock
file size (blocks) unlimited
core file size (blocks)0
locked-in-memory size (kb) 32
file locks unlimited

Requiring security-sensitive apps to use a new flag to get safe behavior
is dangerous.  Better to be safe by default and turn on the
less-safe-but-faster behavior for the cases that benefit from it.

> It would also be a massive  
> information leak under SELinux.  To fix it properly according to the  
> SELinux model you would need to tag each page with a label  
> immediately after it's freed and then do an access-vector-check  
> against the old page and the new process before allowing reuse.  On  
> the other hand, that would probably be at least as expensive as  
> zeroing the page.

I still think that using uid in mm_struct is wrong, and some kind of
abstraction is required.  I called this "free pool" in
<[EMAIL PROTECTED]>, but I think that name is
misleading -- I am not proposing that this should be part of the
management of free pages, but should be a label which abstracts "safe to
share freed pages among" groups.  Then different SELinux protection
domains would simply have different labels.

-andy
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.22-rc6-mm1 Intel DMAR crash on AMD x86_64

2007-06-29 Thread Keshavamurthy, Anil S

On Fri, Jun 29, 2007 at 12:23:43PM -0400, Muli Ben-Yehuda wrote:
> On Fri, Jun 29, 2007 at 08:28:58AM -0700, Keshavamurthy, Anil S wrote:
> 
> > +++ linux-2.6.22-rc4-mm2/drivers/pci/dmar.c 2007-06-29 07:46:25.0 
> > -0700
> > @@ -260,6 +260,8 @@
> > int ret = 0;
> >  
> > dmar = (struct acpi_table_dmar *)dmar_tbl;
> > +   if (!dmar)
> > +   return -ENODEV;
> >  
> > if (!dmar->width) {
> > printk (KERN_WARNING PREFIX "Zero: Invalid DMAR haw\n");
> > @@ -301,7 +303,7 @@
> >  
> > parse_dmar_table();
> > if (list_empty(_drhd_units)) {
> > -   printk(KERN_ERR PREFIX "No DMAR devices found\n");
> > +   printk(KERN_INFO PREFIX "No DMAR devices found\n");
> > return -ENODEV;
> > }
> > return 0;
> 
> The convention is to print a KERN_DEBUG message if hardware is not
> found when probing it, otherwise the boot messages become cluttered
> with lots of "$FOO not found".

Since this is IOMMU is built into the kernel and it is
good idea to report that the device is not present. The 
above is printed only once and is consistent with other
IOMMU implementation. Atleast it is useful when people 
report bugs we can makeout whether IOMMU is being detected
or not.

Here is what I see on my box.
[..]
"PCI-GART: No AMD northbridge found."
[..]
Calgary: detecting Calgary via BIOS EBDA area
Calgary: Unable to locate Rio Grande table in EBDA - bailing!
[..]

As you can see I don;t have either GART or Calgary on my box.

-Thanks,
Anil
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH] UDF: fix function name from udf_crc16 to udf_crc

2007-06-29 Thread Cyrill Gorcunov

We have to change udf_crc16() name to udf_crc()
to be able to play with CRC test.

Signed-off-by: Cyrill Gorcunov <[EMAIL PROTECTED]>
---

 fs/udf/crc.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)


diff --git a/fs/udf/crc.c b/fs/udf/crc.c
index 490aebe..85aaee5 100644
--- a/fs/udf/crc.c
+++ b/fs/udf/crc.c
@@ -105,8 +105,8 @@ int main(void)
 {
unsigned short x;
 
-   x = udf_crc16(bytes, sizeof bytes);
-   printf("udf_crc16: calculated = %4.4x, correct = %4.4x\n", x, 0x3299U);
+   x = udf_crc(bytes, sizeof bytes);
+   printf("udf_crc: calculated = %4.4x, correct = %4.4x\n", x, 0x3299U);
 
return 0;
 }

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC PATCH 0/6] Convert all tasklets to workqueues

2007-06-29 Thread Alexey Kuznetsov

Hello!

> Also, create_workqueue() is very costly. The last 2 lines should be
> reverted.

Indeed.

The result improves from 3988 nanoseconds to 3975. :-)
Actually, the difference is within statistical variance,
which is about 20 ns.

Alexey
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [patch 1/5] avoid tlb gather restarts.

2007-06-29 Thread Hugh Dickins

I don't dare comment on your page_mkclean_one patch (5/5),
that dirty page business has grown too subtle for me.

Your cleanups 2-4 look good, especially the mm_types.h one (how
confident are you that everything builds?), and I'm glad we can
now lay ptep_establish to rest.  Though I think you may have 
missed removing a __HAVE_ARCH_PTEP... from frv at least?

But this one...

On Fri, 29 Jun 2007, Martin Schwidefsky wrote:

> If need_resched() is false it is unnecessary to call tlb_finish_mmu()
> and tlb_gather_mmu() for each vma in unmap_vmas(). Moving the tlb gather
> restart under the if that contains the cond_resched() will avoid
> unnecessary tlb flush operations that are triggered by tlb_finish_mmu() 
> and tlb_gather_mmu().
> 
> Signed-off-by: Martin Schwidefsky <[EMAIL PROTECTED]>

Sorry, no.  It looks reasonable, but unmap_vmas is treading a delicate
and uncomfortable line between hi-performance and lo-latency: you've
chosen to improve performance at the expense of latency.

You think you're just moving the finish/gather to where they're
actually necessary; but the thing is, that per-cpu struct mmu_gather
is liable to accumulate a lot of unpreemptible work for the future
tlb_finish_mmu, particularly when anon pages are associated with swap.

So although there may be no need to resched right now, if we keep on
gathering more and more without flushing, we'll be very unresponsive
when a resched is needed later on.  Hence Ingo's ZAP_BLOCK_SIZE to
split it up, small when CONFIG_PREEMPT, more reasonable but still
limited when not.

I expect there is some tinkering which could be done to improve it a
little; but my ambition has always been to eliminate ZAP_BLOCK_SIZE,
get away from the per-cpu'ness of the mmu_gather, and make unmap_vmas
preemptible.  But the i_mmap_lock case, and the per-arch variations
in TLB flushing, have forever stalled me.

Hugh

> ---
> 
>  mm/memory.c |7 +++
>  1 files changed, 3 insertions(+), 4 deletions(-)
> 
> diff -urpN linux-2.6/mm/memory.c linux-2.6-patched/mm/memory.c
> --- linux-2.6/mm/memory.c 2007-06-29 15:44:08.0 +0200
> +++ linux-2.6-patched/mm/memory.c 2007-06-29 15:44:08.0 +0200
> @@ -851,19 +851,18 @@ unsigned long unmap_vmas(struct mmu_gath
>   break;
>   }
>  
> - tlb_finish_mmu(*tlbp, tlb_start, start);
> -
>   if (need_resched() ||
>   (i_mmap_lock && need_lockbreak(i_mmap_lock))) {
> + tlb_finish_mmu(*tlbp, tlb_start, start);
>   if (i_mmap_lock) {
>   *tlbp = NULL;
>   goto out;
>   }
>   cond_resched();
> + *tlbp = tlb_gather_mmu(vma->vm_mm, fullmm);
> + tlb_start_valid = 0;
>   }
>  
> - *tlbp = tlb_gather_mmu(vma->vm_mm, fullmm);
> - tlb_start_valid = 0;
>   zap_work = ZAP_BLOCK_SIZE;
>   }
>   }
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

1 2 3 4 5 6 7 >

1 - 100 of 614 matches

Mail list logo