date:20070814

On Tue, 2007-08-14 at 18:31 -0700, Junio C Hamano wrote:
>On the other hand, git-send-email _is_ all about sending it
>out, and it needs to know who your patch should reach.  I
>think it makes sense to have one script that, given a set of
>paths that are affected, gives a list of potentially
>interested people (that is "Finding" part -- and I see there
>are 600+ patches to implement this on the list), and a new
>option to git-send-email to (1) inspect the patch to see what
>paths are affected, and (2) call that "Find" script to figure
>out whom to send it to, and probably asking for confirmation.

Yes please.

The LK MAINTAINERS file is ugly.

Might there be a git portable way to "find"?

Rene Herman had an idea about using some git
metadata that might be useful.  The completely
external data approach suggested by Al Viro 
might be OK too in that it wouldn't tie listeners
to git requiring more content in git metadata.

Perhaps both via something like:

--external-find "cmd @filelist"

Thanks,  Joe
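
A minimal sketch of how such an option might expand its argument. The message does not define what "@filelist" means, so the placeholder semantics (one argv entry per affected path) and the helper name expand_find_command are assumptions made only for illustration:

```python
import shlex

def expand_find_command(template, paths):
    """Turn a template such as 'cmd @filelist' into an argv list.

    Assumption: '@filelist' is a literal placeholder token that is
    replaced by the affected paths, one argv entry per path.
    """
    argv = []
    for token in shlex.split(template):
        if token == "@filelist":
            argv.extend(paths)   # splice the paths in place of the placeholder
        else:
            argv.append(token)
    return argv
```

The resulting list could be handed to subprocess.run() without going through a shell, which avoids quoting problems with unusual file names.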


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: SLUB doesn't work with kdump kernel on Cell

Can you try this patch?

From 74863f472810cb58dc56dde050616581d38f7673 Mon Sep 17 00:00:00 2001
From: Christoph Lameter <[EMAIL PROTECTED]>
Date: Tue, 14 Aug 2007 19:09:00 -0700
Subject: [PATCH] SLUB: Do not fail on broken memory configurations

Print a big fat warning and do what is necessary to continue if a node
is marked as up but allocations from the node do not succeed.

Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>
---
 mm/slub.c |   11 +++++++++--
 1 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 1488e71..fc82751 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1877,9 +1877,16 @@ static struct kmem_cache_node * __init early_kmem_cache_node_alloc(gfp_t gfpflag
 
 	BUG_ON(kmalloc_caches->size < sizeof(struct kmem_cache_node));
 
-	page = new_slab(kmalloc_caches, gfpflags | GFP_THISNODE, node);
-
+	page = new_slab(kmalloc_caches, gfpflags, node);
 	BUG_ON(!page);
+
+	if (page_to_nid(page) != node) {
+		printk(KERN_ERR "SLUB: Unable to allocate memory from "
+				"node %d\n", node);
+		printk(KERN_ERR "SLUB: Allocating a useless per node structure"
+				" in order to be able to continue\n");
+	}
+
 	n = page->freelist;
 	BUG_ON(!n);
 	page->freelist = get_freepointer(kmalloc_caches, n);
-- 
1.5.2.4



Re: SLUB doesn't work with kdump kernel on Cell

On Wed, 15 Aug 2007, Michael Ellerman wrote:

> > Yes SLUB will fall back but not during bootstrap. Bootstrap needs to
> > carefully place structures on the right nodes. We fail during bootstrap
> > because there is *no* memory available on it.
> 
> Sure, you want to have the structures on the right node if possible.
> But seeing as there's no memory available, what is wrong with just
> falling back?

Then you have a useless kmem_cache_node structure that will never be used.

What I could do is an alloc without GFP_THISNODE, check the location of 
the allocated memory and then print out a big fat warning that the memory 
setup is screwed up?


Re: [PATCH] [443/2many] MAINTAINERS - HIBERNATION (aka Software Suspend, aka swsusp):

On Tue, 2007-08-14 at 11:15 -0700, Linus Torvalds wrote:
> In other words, it would be much better to just have per-file markers, 
> along with some per-subdirectory stuff or similar.

So that there would be no single hot file, I cut the
MAINTAINERS file into single-file segments in maintainers/*

00_descriptions
3c359_network_driver
3c505_network_driver
3c59x_network_driver
3cr990_network_driver
...
zd1211rw_wireless_driver
zf_machz_watchdog
zr36067_video_for_linux_driver
zs_decstation_z85c30_serial_driver
zz_the_rest

611 files.

How could "make" make a single MAINTAINERS?

"cat [0-9a-z]* > ../MAINTAINERS"?

Would it need to?
Anyone have suggestions for Makefile/Kconfig support?
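
As a rough sketch of what such a build step would have to do, the following concatenates the segment files in sorted name order. The function name build_maintainers is hypothetical, and the byte-wise sort is an assumption: a shell glob such as [0-9a-z]* sorts according to the locale, which may differ.

```python
import os

def build_maintainers(segment_dir, output_path):
    """Concatenate every segment file, in sorted name order, into one file.

    Files are copied as raw bytes, so no encoding is assumed.
    Returns the list of segment names in the order they were written.
    """
    names = sorted(
        name for name in os.listdir(segment_dir)
        if os.path.isfile(os.path.join(segment_dir, name))
    )
    with open(output_path, "wb") as out:
        for name in names:
            with open(os.path.join(segment_dir, name), "rb") as segment:
                out.write(segment.read())
    return names
```

With the naming scheme listed above, 00_descriptions sorts first and zz_the_rest last, so the generated file keeps the header and the catch-all entry where a reader would expect them.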


Re: [PATCH] [1/2many] - FInd the maintainer(s) for a patch - scripts/get_maintainer.pl

2007-08-14 Thread Rene Herman

On 08/14/2007 09:33 PM, Al Viro wrote:

> FWIW, I suspect that we are looking at that from the wrong POV. If
> that's about "who ought to be Cc'd on the issues dealing with ", why
> does it have to be tied to "who is maintainer for "?
>
> I'm not suggesting something like [EMAIL PROTECTED] with something
> like majordomo allowing to add yourself to those, but something less
> extreme in that direction might be worth thinking about... Hell,
> even simple
> $ finger fs/minix/[EMAIL PROTECTED]
> with majordomo-like interface for adding yourself to such lists
> might solve most of those problems...

It mostly is just about that it seems. However, this would not also allow
the other information currently in the MAINTAINERS file to be queried in
similar ways.

Git could grow a generic file meta data implementation through the use of
tags, sort of like tags on multimedia files although while with multimedia
files the tags are in fact stored as a file header, here you'd keep them
just in git. Any project using git would be free to define its own set of
info tags and you'd supply them to git simply as a list of

pairs:

$ git info --add drivers/ide/ide-cd.c "
"Sean Hefty <[EMAIL PROTECTED]>"
"Hal Rosenstock <[EMAIL PROTECTED]>"
[EMAIL PROTECTED]

$ git info --website drivers/infiniband/
http://www.openib.org/

$ git info --tree drivers/infiniband/
git kernel.org:/pub/scm/linux/kernel/git/roland/infiniband.git

Extra: when you have such an implementation, you can use it for other
purposes as well such as the summary Documentation/ files want for the
00-INDEX files:

$ git info --summary Documentation/BUG-HUNTING
brute force method of doing binary search of patches to find bug.

And importantly -- when queried for a file that itself doesn't have the
requested info tag:

$ git info --cc drivers/infiniband/core/addr.c

git looks for the tag on the drivers/infiniband/core/ directory next, and
then on drivers/infiniband/, where it finds it. linux-kernel@vger.kernel.org
would be the final fallback, being set on the project root.
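
The fallback described here can be sketched as a walk up the directory tree. No "git info" command exists, so this only models the proposed lookup; the in-memory table, the function name lookup_tag and any tag values used with it are hypothetical:

```python
import posixpath

def lookup_tag(tags, path, tag):
    """Return the value of `tag` for `path`, falling back to parent directories.

    `tags` maps a repository-relative path to a dict of tag values;
    the empty string "" stands for the project root.
    Returns None if no level defines the tag.
    """
    current = path.strip("/")
    while True:
        value = tags.get(current, {}).get(tag)
        if value is not None:
            return value
        if current == "":
            return None          # reached the root without finding the tag
        current = posixpath.dirname(current)
```

A query for drivers/infiniband/core/addr.c would check the file, then drivers/infiniband/core, then drivers/infiniband, then drivers, and finally the root entry.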

I'd really like something like this. As long as projects are both free to
use and not use them and free to define their own set of tags I believe this
would work very nicely.

Once you have these tags, you can basically use them for anything.

Rene.


Re: [PATCH] [121/2many] MAINTAINERS - CFG80211 and NL80211

On Mon, 2007-08-13 at 12:19 +0200, Johannes Berg wrote:
> On Sun, 2007-08-12 at 23:25 -0700, [EMAIL PROTECTED] wrote:
> > Add file pattern to MAINTAINER entry
> 
> > +F: include/linux/nl80211.h
> > +F: include/net/cfg80211.h
> > +F: net/wireless/core.*
> > +F: net/wireless/sysfs.*
> > +F: net/wireless/radiotap.c
> 
> I must've missed the original discussion surrounding this,

Sorry, posted 1/way2many on LK.
http://lkml.org/lkml/2007/8/13/17

> are these supposed to be regular match patterns?

More or less.  

> Is there a tool reading it? With
> this and wireless extensions it'd probably be best to mark cfg80211 as
> "everything in net/wireless/ but net/wireless/wext*" if possible.

CFG80211 and NL80211
P:  Johannes Berg
M:  [EMAIL PROTECTED]
L:  [EMAIL PROTECTED]
S:  Maintained
F:  include/linux/nl80211.h
F:  include/net/cfg80211.h
F:  net/wireless/*
X:  net/wireless/wext*
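
A sketch of how a tool might evaluate such an entry. It assumes the F: semantics quoted elsewhere in this thread (a trailing slash covers everything in and below a directory, and "*" does not descend into subdirectories) and assumes that X: lines exclude paths that F: lines would otherwise match. The function names are hypothetical and this is not the logic of scripts/get_maintainer.pl:

```python
from fnmatch import fnmatchcase

def pattern_matches(pattern, path):
    """Match one F:/X: pattern against a path, one path component at a time."""
    parts = path.split("/")
    if pattern.endswith("/"):
        # Trailing slash: everything in and below the directory.
        pat_parts = pattern.rstrip("/").split("/")
        if len(parts) <= len(pat_parts):
            return False
        parts = parts[:len(pat_parts)]
    else:
        # Otherwise the depth must be identical, so "*" never crosses a "/".
        pat_parts = pattern.split("/")
        if len(parts) != len(pat_parts):
            return False
    return all(fnmatchcase(part, pat) for part, pat in zip(parts, pat_parts))

def is_covered(path, include, exclude=()):
    """True if `path` matches some F: pattern and no X: pattern."""
    if any(pattern_matches(pat, path) for pat in exclude):
        return False
    return any(pattern_matches(pat, path) for pat in include)
```

Checking exclusions first keeps the rule simple: an X: match always wins, regardless of the order of the lines in the entry.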



Re: [PATCH] [1/2many] - FInd the maintainer(s) for a patch - scripts/get_maintainer.pl

2007-08-14 Thread Richard Knutsson


Linus Torvalds wrote:
> On Tue, 14 Aug 2007, Joe Perches wrote:
>> On Tue, 2007-08-14 at 20:03 +0200, Rene Herman wrote:
>>> "git info --maintainer drivers/ide/ide-cd.c" or some such would say
>>> "Alan Cox <[EMAIL PROTECTED]>".
>>
>> Perhaps maintainer(s), approver(s), listener(s)?
>>
>> I think something like this should be a git-goal.
>> What do the git-wranglers think?
>
> The thing is, if you have git, you can basically already do this.
>
> Do a script like this:
>
>     #!/bin/sh
>     git log --since=6.months.ago -- "$@" |
>         grep -i '^[-a-z]*by:.*@' |

        sed -r "s/^.*by: \"?([^\"]+)\"?/\1/" |

>         sort | uniq -c |
>         sort -r -n | head
>
> and it gives you a rather good picture of who is involved with a
> particular subdirectory or file.


  
Like the script! Especially since it revealed --since=6.month.ago and 
uniq to me.
Just wondering, why order them by acked, signed and tested? Other 
than removing those, the added 'sed' also fixes the  vs 
""-"problem". + adding '-i' to uniq should help the result too, right?


Now a simple "diffstat -p1 -l  | xargs " 
makes the day. Too bad, as Joe pointed out, it does not include relevant ML.


cheers
Richard Knutsson
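
The pipeline discussed in this message (grep for *-by: lines, strip the tag and quotes, count case-insensitively) can be sketched in one function. This is an illustrative rewrite rather than anyone's posted script. It assumes the log text has already been obtained, for example from "git log --since=6.months.ago -- <paths>", and that *-by: lines may be indented:

```python
import re
from collections import Counter

# A tag made of letters and dashes ending in "by:", followed by something with an "@".
BY_LINE = re.compile(r"^\s*[-a-z]*by:\s*(.*@.*)$", re.IGNORECASE)

def count_people(log_text, top=10):
    """Count the people named on *-by: lines, ignoring case and double quotes."""
    counts = Counter()
    display = {}                      # first spelling seen for each person
    for line in log_text.splitlines():
        match = BY_LINE.match(line)
        if not match:
            continue
        person = match.group(1).strip().replace('"', "")
        key = person.lower()
        display.setdefault(key, person)
        counts[key] += 1
    return [(display[key], n) for key, n in counts.most_common(top)]
```

Dropping the tag before counting merges Signed-off-by:, Acked-by: and Tested-by: entries for the same person, which is the effect the added sed step is after.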


Re: [PATCH] [1/2many] - FInd the maintainer(s) for a patch - scripts/get_maintainer.pl

2007-08-14 Thread Junio C Hamano

Joe Perches <[EMAIL PROTECTED]> writes:

> On Tue, 2007-08-14 at 17:53 +0200, Rene Herman wrote:
>> It isn't about MODULE_FOO() tags, it is about tagging /source/ files 
>> to help with putting CCs on patch submissals.
>> If we want to link source file foo.c and the 
>> MAINTAINERS information, we have 3 options:
>> 1. MAINTAINERS --> foo.c
>> 2. foo.c --> MAINTAINERS
>> 3. foo.c <--> some 3rd file <--> MAINTAINERS
>
> I added [EMAIL PROTECTED] and Junio Hamano
>
> Another possibility is improving git to allow
> some sort of "declaration of interest" in bits
> of projects.
>
> That would allow options like:
>
> o  git-format-patch to include CCs
> o  git-commit and git-branch to notify or
>  take some other action
>
> etc...

There are things git can help, and other things git does not
have any business with.

1. Finding out who the potentially interested parties are.

   Linus already gave a script to grep *-by: lines from commit
   messages.  I find this is probably the best option, as it
   follows "yesterday's weather".  People who had dealt with the
   area are the ones who are likely to be interested.

   git records who did the work (author) and who did the
   integration to git-based patch flow (committer).  It does not
   structurally track intermediate people who touched the patch
   on e-mail, but Signed-off-by: and Acked-by: (and sometimes I
   see Cc: as well in the commit messages) are accepted social
   convention in the kernel community, and taking advantage of
   that is a good idea.

2. Making it easier to send your patches to these people.

   There are three possible places to add Signed-off-by: and
   friends in the commit messages you would mail out:

   - When you create your own commit, or commit a patch that
     came to you via e-mail.  The commit object in your tree
     will carry them --- you can send format-patch output as-is
     to Linus or Andrew and you are done.

   - When you run format-patch; your commit will not have extra
     Cc: or "interested parties" information, you will use the
     result of 1. and insert it near your own Signed-off-by: to
     the format-patch output.

   - When you send format-patch output, via git-send-email
     perhaps.

   To make the result useful for "yesterday's weather" approach,
   I think it would be the best to do the first.  After all,
   your commit may propagate via "git pull" not over e-mail, and
   no postprocessing approach would work in such a case.

   The second one is my least favorite.  format-patch output is
   designed to record author/committer (i.e. origin) and not to
   record recipient at all.  "Who's interested in this" simply
   does not belong there.

   On the other hand, git-send-email _is_ all about sending it
   out, and it needs to know who your patch should reach.  I
   think it makes sense to have one script that, given a set of
   paths that are affected, gives a list of potentially
   interested people (that is "Finding" part -- and I see there
   are 600+ patches to implement this on the list), and a new
   option to git-send-email to (1) inspect the patch to see what
   paths are affected, and (2) call that "Find" script to figure
   out whom to send it to, and probably asking for confirmation.
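
Step (1), inspecting the patch to see which paths are affected, could look roughly like this. It assumes git-style "diff --git a/... b/..." headers and paths without whitespace; the name affected_paths is hypothetical and is not an existing git-send-email feature:

```python
import re

# Header line that git emits once per file in a patch.
DIFF_HEADER = re.compile(r"^diff --git a/(\S+) b/(\S+)$")

def affected_paths(patch_text):
    """Return the paths touched by a git-format patch, in first-seen order."""
    seen = []
    for line in patch_text.splitlines():
        match = DIFF_HEADER.match(line)
        if match and match.group(2) not in seen:
            seen.append(match.group(2))   # use the post-image ("b/") path
    return seen
```

The returned list is what a separate "Find" script would take as input, with the user asked to confirm the resulting recipients before anything is sent.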


Re: SLUB doesn't work with kdump kernel on Cell

2007-08-14 Thread Michael Ellerman

On 8/15/07, Christoph Lameter <[EMAIL PROTECTED]> wrote:
> On Tue, 14 Aug 2007, Lucio Correia wrote:
>
> > > SLAB boots because it falls back to node 0 for the control structures. So
> > > it creates useless control structures for node 1. These are then never
> > > used since any allocation  attempt to node 1 falls back to node 0.
> >
> > Hi Christoph,
> >
> > Shouldn't SLUB falls back to other node also for the case it can't
> > allocate memory?
>
> Yes SLUB will fall back but not during bootstrap. Bootstrap needs to
> carefully place structures on the right nodes. We fail during bootstrap
> because there is *no* memory available on it.

Sure, you want to have the structures on the right node if possible.
But seeing as there's no memory available, what is wrong with just
falling back?

Something like:

diff --git a/mm/slub.c b/mm/slub.c
index 9b2d617..0fc29d2 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1876,6 +1876,8 @@ static struct kmem_cache_node * __init early_kmem_cache_no
 	BUG_ON(kmalloc_caches->size < sizeof(struct kmem_cache_node));
 
 	page = new_slab(kmalloc_caches, gfpflags | GFP_THISNODE, node);
+	if (!page)
+		page = new_slab(kmalloc_caches, gfpflags, node);
 
 	BUG_ON(!page);
 	n = page->freelist;

[GIT PULL] Corrected - x86 setup fixes (now 3)

2007-08-14 Thread H. Peter Anvin

Hi Linus,

Please pull:

  git://git.kernel.org/pub/scm/linux/kernel/git/hpa/linux-2.6-x86setup.git for-linus

H. Peter Anvin (3):
  [x86 setup] The current display page is returned in %bh, not %bl
  [x86 setup] Don't use EDD to get the MBR signature
  [x86 setup] edd.c: make sure MBR signatures actually get reported

 arch/i386/boot/edd.c   |   54 ++-
 arch/i386/boot/video.c |2 +-
 2 files changed, 13 insertions(+), 43 deletions(-)

[Log messages and full diffs follow]

commit 9a5f35d4ede43fee791a486e0850e9e3afdde0a7
Author: H. Peter Anvin <[EMAIL PROTECTED]>
Date:   Tue Aug 14 17:36:00 2007 -0700

[x86 setup] edd.c: make sure MBR signatures actually get reported

When filling in the MBR signature array, the setup code failed to advance
boot_params.edd_mbr_sig_buf_entries, which resulted in the valid data
being ignored.

Signed-off-by: H. Peter Anvin <[EMAIL PROTECTED]>

diff --git a/arch/i386/boot/edd.c b/arch/i386/boot/edd.c
index d65dd21..82b5c84 100644
--- a/arch/i386/boot/edd.c
+++ b/arch/i386/boot/edd.c
@@ -37,11 +37,10 @@ static int read_mbr(u8 devno, void *buf)
return -(u8)ax; /* 0 or -1 */
 }
 
-static u32 read_mbr_sig(u8 devno, struct edd_info *ei)
+static u32 read_mbr_sig(u8 devno, struct edd_info *ei, u32 *mbrsig)
 {
int sector_size;
char *mbrbuf_ptr, *mbrbuf_end;
-   u32 mbrsig;
u32 buf_base, mbr_base;
extern char _end[];
 
@@ -57,15 +56,15 @@ static u32 read_mbr_sig(u8 devno, struct edd_info *ei)
 
/* Make sure we actually have space on the heap... */
if (!(boot_params.hdr.loadflags & CAN_USE_HEAP))
-   return 0;
+   return -1;
if (mbrbuf_end > (char *)(size_t)boot_params.hdr.heap_end_ptr)
-   return 0;
+   return -1;
 
if (read_mbr(devno, mbrbuf_ptr))
-   return 0;
+   return -1;
 
-   mbrsig = *(u32 *)&mbrbuf_ptr[EDD_MBR_SIG_OFFSET];
-   return mbrsig;
+   *mbrsig = *(u32 *)&mbrbuf_ptr[EDD_MBR_SIG_OFFSET];
+   return 0;
 }
 
 static int get_edd_info(u8 devno, struct edd_info *ei)
@@ -132,6 +131,7 @@ void query_edd(void)
int do_edd = 1;
int devno;
struct edd_info ei, *edp;
+   u32 *mbrptr;
 
if (cmdline_find_option("edd", eddarg, sizeof eddarg) > 0) {
if (!strcmp(eddarg, "skipmbr") || !strcmp(eddarg, "skip"))
@@ -140,7 +140,8 @@ void query_edd(void)
do_edd = 0;
}
 
-   edp = (struct edd_info *)boot_params.eddbuf;
+   edp    = boot_params.eddbuf;
+   mbrptr = boot_params.edd_mbr_sig_buffer;
 
if (!do_edd)
return;
@@ -158,11 +159,8 @@ void query_edd(void)
boot_params.eddbuf_entries++;
}
 
-   if (do_mbr) {
-   u32 mbr_sig;
-   mbr_sig = read_mbr_sig(devno, &ei);
-   boot_params.edd_mbr_sig_buffer[devno-0x80] = mbr_sig;
-   }
+   if (do_mbr && !read_mbr_sig(devno, &ei, mbrptr++))
+   boot_params.edd_mbr_sig_buf_entries = devno-0x80+1;
}
 }
 

commit c1a6e2b082a7cefe58315af7a461bbf2f33221a3
Author: H. Peter Anvin <[EMAIL PROTECTED]>
Date:   Mon Aug 13 16:27:42 2007 -0700

[x86 setup] Don't use EDD to get the MBR signature

At least one machine has been identified in the field which advertises
EDD for all drives but locks up if one attempts an extended read from
a non-primary drive.

The MBR is always at CHS 0-0-1, so there is no reason to use an
extended read, other than the possibility that the BIOS cannot handle
it.

Although this might break as many machines as it fixes (a small number
either way), the current state is a regression but the reverse is not.
Therefore revert to the previous state of not using extended read.

Quite probably the Right Thing to do is to read using plain (CHS) read
and extended read on failure, but that change would definitely have to
go through -mm first.

Signed-off-by: H. Peter Anvin <[EMAIL PROTECTED]>

diff --git a/arch/i386/boot/edd.c b/arch/i386/boot/edd.c
index 658834d..d65dd21 100644
--- a/arch/i386/boot/edd.c
+++ b/arch/i386/boot/edd.c
@@ -19,40 +19,12 @@
 
 #if defined(CONFIG_EDD) || defined(CONFIG_EDD_MODULE)
 
-struct edd_dapa {
-   u8  pkt_size;
-   u8  rsvd;
-   u16 sector_cnt;
-   u16 buf_off, buf_seg;
-   u64 lba;
-   u64 buf_lin_addr;
-};
-
 /*
  * Read the MBR (first sector) from a specific device.
  */
 static int read_mbr(u8 devno, void *buf)
 {
-   struct edd_dapa dapa;
-   u16 ax, bx, cx, dx, si;
-
-   memset(&dapa, 0, sizeof dapa);
-   dapa.pkt_size = sizeof(dapa);
-   dapa.sector_cnt = 1;
-   dapa.buf_off = (size_t)buf;
-   dapa.buf_seg = ds();
-   /* dapa.lba = 0; */
-
-

[PATCH] [4/4] x86_64: Check for .cfi_rel_offset in CFI probe


Very old 64bit binutils have .cfi_startproc/endproc, but
no .cfi_rel_offset. Check for .cfi_rel_offset too.

Cc: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]

Index: linux/arch/x86_64/Makefile
===
--- linux.orig/arch/x86_64/Makefile
+++ linux/arch/x86_64/Makefile
@@ -57,8 +57,8 @@ cflags-y += $(call cc-option,-mno-sse -m
 cflags-y += -maccumulate-outgoing-args
 
 # do binutils support CFI?
-cflags-y += $(call as-instr,.cfi_startproc\n.cfi_endproc,-DCONFIG_AS_CFI=1,)
-AFLAGS += $(call as-instr,.cfi_startproc\n.cfi_endproc,-DCONFIG_AS_CFI=1,)
+cflags-y += $(call as-instr,.cfi_startproc\n.cfi_rel_offset rsp${comma}0\n.cfi_endproc,-DCONFIG_AS_CFI=1,)
+AFLAGS += $(call as-instr,.cfi_startproc\n.cfi_rel_offset rsp${comma}0\n.cfi_endproc,-DCONFIG_AS_CFI=1,)
 
 # is .cfi_signal_frame supported too?
 cflags-y += $(call as-instr,.cfi_startproc\n.cfi_signal_frame\n.cfi_endproc,-DCONFIG_AS_CFI_SIGNAL_FRAME=1,)
Index: linux/arch/i386/Makefile
===
--- linux.orig/arch/i386/Makefile
+++ linux/arch/i386/Makefile
@@ -51,8 +51,8 @@ cflags-y += -maccumulate-outgoing-args
 CFLAGS += $(shell if [ $(call cc-version) -lt 0400 ] ; then echo $(call cc-option,-fno-unit-at-a-time); fi ;)
 
 # do binutils support CFI?
-cflags-y += $(call as-instr,.cfi_startproc\n.cfi_endproc,-DCONFIG_AS_CFI=1,)
-AFLAGS += $(call as-instr,.cfi_startproc\n.cfi_endproc,-DCONFIG_AS_CFI=1,)
+cflags-y += $(call as-instr,.cfi_startproc\n.cfi_rel_offset esp${comma}0\n.cfi_endproc,-DCONFIG_AS_CFI=1,)
+AFLAGS += $(call as-instr,.cfi_startproc\n.cfi_rel_offset esp${comma}0\n.cfi_endproc,-DCONFIG_AS_CFI=1,)
 
 # is .cfi_signal_frame supported too?
 cflags-y += $(call as-instr,.cfi_startproc\n.cfi_signal_frame\n.cfi_endproc,-DCONFIG_AS_CFI_SIGNAL_FRAME=1,)

[PATCH] [3/4] x86_64: Change PMDS invocation to single macro


Very old binutils (2.12.90...) seem to have trouble with newlines
in assembler macro invocations. They put them into the resulting
argument expansion. In this case this led to a parse error because
a .rept expression ended up spread over multiple lines. Change the PMDS()
invocation to a single line.

Signed-off-by: Andi Kleen <[EMAIL PROTECTED]>

Index: linux/arch/x86_64/kernel/head.S
===
--- linux.orig/arch/x86_64/kernel/head.S
+++ linux/arch/x86_64/kernel/head.S
@@ -345,8 +345,7 @@ NEXT_PAGE(level2_kernel_pgt)
/* 40MB kernel mapping. The kernel code cannot be bigger than that.
   When you change this change KERNEL_TEXT_SIZE in page.h too. */
/* (2^48-(2*1024*1024*1024)-((2^39)*511)-((2^30)*510)) = 0 */
-   PMDS(0x, __PAGE_KERNEL_LARGE_EXEC|_PAGE_GLOBAL,
-   KERNEL_TEXT_SIZE/PMD_SIZE)
+   PMDS(0x, __PAGE_KERNEL_LARGE_EXEC|_PAGE_GLOBAL, KERNEL_TEXT_SIZE/PMD_SIZE)
/* Module mapping starts here */
.fill   (PTRS_PER_PMD - (KERNEL_TEXT_SIZE/PMD_SIZE)),8,0
 

[PATCH] [2/4] x86_64: Fix to keep watchdog disabled by default for i386/x86_64


From: Daniel Gollub <[EMAIL PROTECTED]>
Fixed a wrong expression which enabled watchdogs even if the nmi_watchdog
kernel parameter wasn't set. This regression was introduced with commit
b7471c6da94d30d3deadc55986cc38d1ff57f9ca.

Introduced NMI_DISABLED (-1), which allows the value of NMI_DEFAULT to be
switched without breaking the APIC NMI watchdog code (again).

Fixes:
https://bugzilla.novell.com/show_bug.cgi?id=298084
http://bugzilla.kernel.org/show_bug.cgi?id=7839
And likely some more nmi_watchdog=0 related issues.

Signed-off-by: Daniel Gollub <[EMAIL PROTECTED]>
Signed-off-by: Andi Kleen <[EMAIL PROTECTED]>

---
Index: linux/arch/i386/kernel/apic.c
===
--- linux.orig/arch/i386/kernel/apic.c
+++ linux/arch/i386/kernel/apic.c
@@ -1085,7 +1085,7 @@ static int __init detect_init_APIC (void
if (l & MSR_IA32_APICBASE_ENABLE)
mp_lapic_addr = l & MSR_IA32_APICBASE_BASE;
 
-   if (nmi_watchdog != NMI_NONE)
+   if (nmi_watchdog != NMI_NONE && nmi_watchdog != NMI_DISABLED)
nmi_watchdog = NMI_LOCAL_APIC;
 
printk(KERN_INFO "Found and enabled local APIC!\n");
Index: linux/arch/i386/kernel/nmi.c
===
--- linux.orig/arch/i386/kernel/nmi.c
+++ linux/arch/i386/kernel/nmi.c
@@ -77,7 +77,7 @@ static int __init check_nmi_watchdog(voi
unsigned int *prev_nmi_count;
int cpu;
 
-   if ((nmi_watchdog == NMI_NONE) || (nmi_watchdog == NMI_DEFAULT))
+   if ((nmi_watchdog == NMI_NONE) || (nmi_watchdog == NMI_DISABLED))
return 0;
 
if (!atomic_read(&nmi_active))
@@ -424,7 +424,7 @@ int proc_nmi_enabled(struct ctl_table *t
if (!!old_state == !!nmi_watchdog_enabled)
return 0;
 
-   if (atomic_read(&nmi_active) < 0) {
+   if (atomic_read(&nmi_active) < 0 || nmi_watchdog == NMI_DISABLED) {
printk( KERN_WARNING "NMI watchdog is permanently disabled\n");
return -EIO;
}
Index: linux/arch/x86_64/kernel/nmi.c
===
--- linux.orig/arch/x86_64/kernel/nmi.c
+++ linux/arch/x86_64/kernel/nmi.c
@@ -85,7 +85,7 @@ int __init check_nmi_watchdog (void)
int *counts;
int cpu;
 
-   if ((nmi_watchdog == NMI_NONE) || (nmi_watchdog == NMI_DEFAULT))
+   if ((nmi_watchdog == NMI_NONE) || (nmi_watchdog == NMI_DISABLED)) 
return 0;
 
if (!atomic_read(&nmi_active))
@@ -442,7 +442,7 @@ int proc_nmi_enabled(struct ctl_table *t
if (!!old_state == !!nmi_watchdog_enabled)
return 0;
 
-   if (atomic_read(&nmi_active) < 0) {
+   if (atomic_read(&nmi_active) < 0 || nmi_watchdog == NMI_DISABLED) {
printk( KERN_WARNING "NMI watchdog is permanently disabled\n");
return -EIO;
}
Index: linux/include/asm-i386/nmi.h
===
--- linux.orig/include/asm-i386/nmi.h
+++ linux/include/asm-i386/nmi.h
@@ -33,11 +33,12 @@ extern int nmi_watchdog_tick (struct pt_
 
 extern atomic_t nmi_active;
 extern unsigned int nmi_watchdog;
-#define NMI_DEFAULT     -1
+#define NMI_DISABLED    -1
 #define NMI_NONE        0
 #define NMI_IO_APIC     1
 #define NMI_LOCAL_APIC  2
 #define NMI_INVALID     3
+#define NMI_DEFAULT     NMI_DISABLED
 
 struct ctl_table;
 struct file;
Index: linux/include/asm-x86_64/nmi.h
===
--- linux.orig/include/asm-x86_64/nmi.h
+++ linux/include/asm-x86_64/nmi.h
@@ -64,11 +64,12 @@ extern int setup_nmi_watchdog(char *);
 
 extern atomic_t nmi_active;
 extern unsigned int nmi_watchdog;
-#define NMI_DEFAULT     -1
+#define NMI_DISABLED    -1
 #define NMI_NONE        0
 #define NMI_IO_APIC     1
 #define NMI_LOCAL_APIC  2
 #define NMI_INVALID     3
+#define NMI_DEFAULT     NMI_DISABLED
 
 struct ctl_table;
 struct file;
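
To make the effect of the detect_init_APIC() change easier to see, here is a small model of the check before and after the patch. The constants mirror the header hunks above. The two function names are hypothetical, and plain Python integers stand in for the kernel's unsigned nmi_watchdog variable:

```python
NMI_DISABLED = -1
NMI_NONE = 0
NMI_IO_APIC = 1
NMI_LOCAL_APIC = 2
NMI_DEFAULT = NMI_DISABLED   # the default when no nmi_watchdog= parameter is given

def after_apic_detect_old(nmi_watchdog):
    """Pre-patch check: anything other than NMI_NONE is switched to the local APIC."""
    if nmi_watchdog != NMI_NONE:
        return NMI_LOCAL_APIC
    return nmi_watchdog

def after_apic_detect_new(nmi_watchdog):
    """Patched check: NMI_DISABLED is left untouched as well."""
    if nmi_watchdog != NMI_NONE and nmi_watchdog != NMI_DISABLED:
        return NMI_LOCAL_APIC
    return nmi_watchdog
```

With the old check the untouched default of -1 is not equal to NMI_NONE, so the watchdog gets enabled. With the new check the default stays disabled while an explicit setting is still honoured.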

Re: [v4l-dvb-maintainer] [PATCH] [534/2many] MAINTAINERS - VIDEO FOR LINUX

2007-08-14 Thread hermann pitton

On Tuesday, 14.08.2007 at 18:02 +0400, Manu Abraham wrote:
> On 8/14/07, Mauro Carvalho Chehab <[EMAIL PROTECTED]> wrote:
> >
> > >
> > > > F: drivers/media/*
> > >
> > >
> > > This is NOT OK !
> >
> > This IS ok. You just need to read the definition of the 'F' tag:
> >
> > F: Files and directories with wildcard patterns.
> >    A trailing slash includes all files and subdirectory files.
> >    F:  drivers/net/    all files in and below drivers/net
> >    F:  drivers/net/*   all files in drivers/net, but not below
> >    F:  */net/*         all files in "any top level directory"/net
> >    One pattern per line.  Multiple F: lines acceptable.
> >
> > This tag just includes Kconfig and Makefile.
> 
> 
> Just as much as you state that, the reverse is as well true for
> Kconfig and Makefile.
> 

Please explain what you still want and why.

After two almost wasted years, you should be able to do so.



Re: [PATCH] PSS(proportional set size) accounting in smaps

2007-08-14 Thread Fengguang Wu

On Tue, Aug 14, 2007 at 01:19:40AM -0500, Matt Mackall wrote:
> On Tue, Aug 14, 2007 at 09:33:50AM +0800, Fengguang Wu wrote:
> > The "proportional set size" (PSS) of a process is the count of pages it has in
> > memory, where each page is divided by the number of processes sharing it. So if
> > a process has 1000 pages all to itself, and 1000 shared with one other process,
> > its PSS will be 1500.
> >    - lwn.net: "ELC: How much memory are applications really using?"
> > 
> > The PSS proposed by Matt Mackall is a very nice metric for measuring a process's
> > memory footprint. So collect and export it via /proc/<pid>/smaps.
> > 
> > Matt Mackall's pagemap/kpagemap and John Berthels's exmap can also do the job,
> > providing a great deal of detail.  But for PSS, let's do it in a simple way.
> 
> Yes, if people actually want to use this particular metric a lot (and
> I obviously personally think it makes a lot of sense), then it should
> be done in kernel like this.

Thank you for the acknowledgement, Matt.

> > Cc: Matt Mackall <[EMAIL PROTECTED]>
> > Cc: John Berthels <[EMAIL PROTECTED]>
> > Signed-off-by: Fengguang Wu <[EMAIL PROTECTED]>
> > ---
> >  fs/proc/task_mmu.c |   13 ++---
> >  1 file changed, 10 insertions(+), 3 deletions(-)
> > 
> > --- linux-2.6.23-rc2-mm2.orig/fs/proc/task_mmu.c
> > +++ linux-2.6.23-rc2-mm2/fs/proc/task_mmu.c
> > @@ -319,6 +319,7 @@ const struct file_operations proc_maps_o
> >  struct mem_size_stats
> >  {
> > unsigned long resident;
> > +   u64   pss;  /* proportional set size: my share of rss */
> 
> 64 bits?

Yes, to accommodate the extra 12 bits for error shifting.

> > unsigned long shared_clean;
> > unsigned long shared_dirty;
> > unsigned long private_clean;
> > @@ -341,6 +342,7 @@ static int smaps_pte_range(pmd_t *pmd, u
> > pte_t *pte, ptent;
> > spinlock_t *ptl;
> > struct page *page;
> > +   int mapcount;
> >  
> > pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
> > for (; addr != end; pte++, addr += PAGE_SIZE) {
> > @@ -357,16 +359,19 @@ static int smaps_pte_range(pmd_t *pmd, u
> > /* Accumulate the size in pages that have been accessed. */
> > if (pte_young(ptent) || PageReferenced(page))
> > mss->referenced += PAGE_SIZE;
> > -   if (page_mapcount(page) >= 2) {
> > +   mapcount = page_mapcount(page);
> > +   if (mapcount >= 2) {
> > if (pte_dirty(ptent))
> > mss->shared_dirty += PAGE_SIZE;
> > else
> > mss->shared_clean += PAGE_SIZE;
> > +   mss->pss += (PAGE_SIZE << 12) / mapcount;
> 
> Hmm, what's that shift for? Oh, you're doing fixed-point math.
> 
> 64-bit divisions are quite expensive on some platforms. The compiler
> might be able to do something smarter with common constants like:
> 
>if (mapcount == 1)
>   mss->pss += PAGE_SIZE;
>else if (mapcount == 2)
>   mss->pss += PAGE_SIZE / 2;
>else if (mapcount == 3)
>   mss->pss += PAGE_SIZE / 3;
>else if (mapcount == 4)
>   mss->pss += PAGE_SIZE / 4;
>else
>   mss->pss += PAGE_SIZE / mapcount;
> 
> ..but I don't know. I suspect we'll at least want to special-case
> mapcount == 1 though.

Don't worry, the PAGE_SIZE being divided is unsigned long. So there's
no 64bit division on 32bit CPU :) And we do avoid the division for
the common case of mapcount == 1.

> > +  sarg.mss.resident  >> 10,
> > +  (unsigned long)(mss->pss >> 22),
> 
> And then you're throwing away 22 bits of precision. 10 bits wasn't
> enough? Hmmm.. Looks like the worst case is sharing a 4k page 2049
> ways, where we'll be off by .999 bytes per 4k page for nearly 50%
> error. Your extra 12 bits drops this to .2% error, so I suppose it's
> worth it.
> 
> But it probably needs a comment.

OK, I introduced PSS_ERROR_BITS=12, and some comments for it.
Note that the output unit of 1KB could be the most significant source
of errors :)

> > -  sarg.mss.referenced >> 10);
> > +  sarg.mss.referenced>> 10);
> 
> Unrelated change.

Ok, removed it.

Thank you,
Fengguang
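
The fixed-point accounting discussed in this exchange can be modelled in a few lines. This sketch assumes 4 KiB pages and uses the PSS_ERROR_BITS name mentioned in the reply. The function names are hypothetical, and Python integers stand in for the kernel's u64 accumulator:

```python
PAGE_SIZE = 4096       # assumption: 4 KiB pages
PSS_ERROR_BITS = 12    # extra fractional bits kept while accumulating

def pss_kb(mapcounts):
    """Accumulate PSS in fixed point and report it in KiB.

    `mapcounts` holds one entry per resident page: the number of
    processes that map that page.
    """
    pss = 0
    for mapcount in mapcounts:
        if mapcount >= 2:
            pss += (PAGE_SIZE << PSS_ERROR_BITS) // mapcount
        else:
            pss += PAGE_SIZE << PSS_ERROR_BITS   # private page, no division needed
    return pss >> (10 + PSS_ERROR_BITS)

def naive_pss_kb(mapcounts):
    """The same accounting without the extra bits, for comparison."""
    return sum(PAGE_SIZE // mapcount for mapcount in mapcounts) >> 10
```

For 1000 private pages plus 1000 pages shared with one other process, pss_kb() returns 6000, which is 1500 pages of 4 KiB. For 4096 pages that are each shared 2049 ways, the true value is just under 8 KiB; pss_kb() reports 7 while naive_pss_kb() reports 4, which is the rounding loss the extra 12 bits are meant to avoid.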
===

PSS(proportional set size) accounting in smaps

The "proportional set size" (PSS) of a process is the count of pages it has in
memory, where each page is divided by the number of processes sharing it. So if
a process has 1000 pages all to itself, and 1000 shared with one other process,
its PSS will be 1500.
   - lwn.net: "ELC: How much memory are applications really using?"

The PSS proposed by Matt Mackall is a very nice metric for measuring a process's
memory footprint. So collect and export it via /proc/<pid>/smaps.

Matt Mackall's pagemap/kpagemap and John Berthels's exmap can also do the job.
They are comprehensive tools. But for PSS, let's do it in the simple way. 


Cc: Matt Mackall <[EMAIL PROTECTED]>
Cc:

Re: kfree(0) - ok?

On Wed, 15 Aug 2007, Satyam Sharma wrote:

> [PATCH] {slub, slob}: use unlikely() for kfree(ZERO_OR_NULL_PTR) check

Good, that actually has a code impact.


Re: [PATCH 9/13] cxgb3 - Update internal memory management


Jeff Garzik wrote:


Divy Le Ray wrote:
> From: Divy Le Ray <[EMAIL PROTECTED]>
>
> Set PM1 internal memory to round robin mode
>
> Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>

why?


For multiport adapters, it balances access to this internal memory.

Divy

[GIT PULL] Two more x86 setup fixes

2007-08-14 Thread H. Peter Anvin

Hi Linus,

Please pull:


  git://git.kernel.org/pub/scm/linux/kernel/git/hpa/linux-2.6-x86setup.git for-linus

H. Peter Anvin (2):
  [x86 setup] The current display page is returned in %bh, not %bl
  [x86 setup] Don't use EDD to get the MBR signature

 arch/i386/boot/edd.c   |   28 
 arch/i386/boot/video.c |2 +-
 2 files changed, 1 insertions(+), 29 deletions(-)

[Log messages and full diffs follow]

commit c031c43503412b1f60ff92fdf4527787a6c98afd
Author: H. Peter Anvin <[EMAIL PROTECTED]>
Date:   Mon Aug 13 16:27:42 2007 -0700

[x86 setup] Don't use EDD to get the MBR signature

At least one machine has been identified in the field which advertises
EDD for all drives but locks up if one attempts an extended read from
a non-primary drive.

The MBR is always at CHS 0-0-1, so there is no reason to use an
extended read, other than the possibility that the BIOS cannot handle
it.

Although this might break as many machines as it fixes (a small number
either way), the current state is a regression but the reverse is not.
Therefore revert to the previous state of not using extended read.

Quite probably the Right Thing to do is to read using plain (CHS) read
and extended read on failure, but that change would definitely have to
go through -mm first.

Signed-off-by: H. Peter Anvin <[EMAIL PROTECTED]>

diff --git a/arch/i386/boot/edd.c b/arch/i386/boot/edd.c
index 658834d..ba1b37b 100644
--- a/arch/i386/boot/edd.c
+++ b/arch/i386/boot/edd.c
@@ -19,41 +19,13 @@
 
 #if defined(CONFIG_EDD) || defined(CONFIG_EDD_MODULE)
 
-struct edd_dapa {
-   u8  pkt_size;
-   u8  rsvd;
-   u16 sector_cnt;
-   u16 buf_off, buf_seg;
-   u64 lba;
-   u64 buf_lin_addr;
-};
-
 /*
  * Read the MBR (first sector) from a specific device.
  */
 static int read_mbr(u8 devno, void *buf)
 {
-   struct edd_dapa dapa;
u16 ax, bx, cx, dx, si;
 
-   memset(&dapa, 0, sizeof dapa);
-   dapa.pkt_size = sizeof(dapa);
-   dapa.sector_cnt = 1;
-   dapa.buf_off = (size_t)buf;
-   dapa.buf_seg = ds();
-   /* dapa.lba = 0; */
-
-   ax = 0x4200;/* Extended Read */
-   si = (size_t)&dapa;
-   dx = devno;
-   asm("pushfl; stc; int $0x13; setc %%al; popfl"
-   : "+a" (ax), "+S" (si), "+d" (dx)
-   : "m" (dapa)
-   : "ebx", "ecx", "edi", "memory");
-
-   if (!(u8)ax)
-   return 0;   /* OK */
-
ax = 0x0201;/* Legacy Read, one sector */
cx = 0x0001;/* Sector 0-0-1 */
dx = devno;

commit 362cea339a34e04caae6cad67ea9bde5c100d12b
Author: H. Peter Anvin <[EMAIL PROTECTED]>
Date:   Fri Aug 10 14:20:26 2007 -0700

[x86 setup] The current display page is returned in %bh, not %bl

The current display page is an 8-bit number, even though struct
screen_info gives it a 16-bit number.  The number is returned in %bh,
so it needs to be >> 8 before storing.
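
As a small illustration of the fix described above, the sketch below pulls the page number out of the upper byte of BX. The helper name video_page_from_bx() is invented for this example; the real code simply shifts the value in place.

```c
#include <stdint.h>

/* BIOS INT 10h, AH=0Fh reports the active display page in BH,
 * i.e. the upper byte of the 16-bit BX register. */
static uint8_t video_page_from_bx(uint16_t bx)
{
	return (uint8_t)(bx >> 8);
}
```

With BX = 0x0300 this returns page 3, whereas storing BX unshifted would have recorded 0x0300 as the page.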

Special thanks to Jeff Chua for detailed bug reporting.

Signed-off-by: H. Peter Anvin <[EMAIL PROTECTED]>

diff --git a/arch/i386/boot/video.c b/arch/i386/boot/video.c
index 958130e..693f20d 100644
--- a/arch/i386/boot/video.c
+++ b/arch/i386/boot/video.c
@@ -61,7 +61,7 @@ static void store_video_mode(void)
 
/* Not all BIOSes are clean with respect to the top bit */
boot_params.screen_info.orig_video_mode = ax & 0x7f;
-   boot_params.screen_info.orig_video_page = page;
+   boot_params.screen_info.orig_video_page = page >> 8;
 }
 
 /*

Re: [PATCH] [496/2many] MAINTAINERS - USB HID/HIDBP DRIVERS (USB KEYBOARDS, MICE, REMOTE CONTROLS, ...)

2007-08-14 Thread Jiri Kosina

On Sun, 12 Aug 2007, [EMAIL PROTECTED] wrote:

> Add file pattern to MAINTAINER entry
> 
> Signed-off-by: Joe Perches <[EMAIL PROTECTED]>
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index ae24def..270952c 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -4696,6 +4696,8 @@ M:  [EMAIL PROTECTED]
>  L:   [EMAIL PROTECTED]
>  T:   git kernel.org:/pub/scm/linux/kernel/git/jikos/hid.git
>  S:   Maintained
> +F:   Documentation/usb/hiddev.txt
> +F:   drivers/hid/usbhid/
>  
>  USB HUB DRIVER
>  P:   Johannes Erdfelt

Same here -- if this goes upstream, you can add

Signed-off-by: Jiri Kosina <[EMAIL PROTECTED]>

Thanks,

-- 
Jiri Kosina
SUSE Labs

Re: [PATCH] [217/2many] MAINTAINERS - HID CORE LAYER

2007-08-14 Thread Jiri Kosina

On Sun, 12 Aug 2007, [EMAIL PROTECTED] wrote:

> Add file pattern to MAINTAINER entry
> 
> Signed-off-by: Joe Perches <[EMAIL PROTECTED]>
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index bdbf999..4a8770c 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -2071,6 +2071,8 @@ M:  [EMAIL PROTECTED]
>  L:   [EMAIL PROTECTED]
>  T:   git kernel.org:/pub/scm/linux/kernel/git/jikos/hid.git
>  S:   Maintained
> +F:   drivers/hid/
> +F:   include/linux/hid*
>  
>  HIGH-RESOLUTION TIMERS, CLOCKEVENTS, DYNTICKS
>  P:   Thomas Gleixner
> 

If this is ever going to be merged, you can add

Signed-off-by: Jiri Kosina <[EMAIL PROTECTED]>

-- 
Jiri Kosina
SUSE Labs
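
To make the quoted "F:" entries concrete, here is a rough sketch of how a "find" script might test a path against one pattern. The matching rules are an assumption (the thread does not define them): a pattern ending in '/' is taken to cover the whole directory tree, anything else is treated as a shell glob. maintainer_pattern_matches() is a made-up name; fnmatch() is the POSIX call.

```c
#include <fnmatch.h>
#include <string.h>

/*
 * Return 1 if `path` is covered by an "F:" pattern, else 0.
 * Assumed semantics: a trailing '/' means "everything below this
 * directory"; otherwise the pattern is a glob matched per path component.
 */
static int maintainer_pattern_matches(const char *pattern, const char *path)
{
	size_t len = strlen(pattern);

	if (len > 0 && pattern[len - 1] == '/')
		return strncmp(pattern, path, len) == 0;

	return fnmatch(pattern, path, FNM_PATHNAME) == 0;
}
```

Under these assumptions "drivers/hid/" covers drivers/hid/usbhid/hid-core.c, and "include/linux/hid*" covers include/linux/hiddev.h but not include/linux/usb.h.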

Re: [-mm patch] unexport cap_inode_killpriv

2007-08-14 Thread Adrian Bunk

On Tue, Aug 14, 2007 at 04:35:02PM -0500, Serge E. Hallyn wrote:
> Quoting Adrian Bunk ([EMAIL PROTECTED]):
> > On Thu, Aug 09, 2007 at 10:42:54PM -0700, Andrew Morton wrote:
> > >...
> > > Changes since 2.6.23-rc1-mm2:
> > >...
> > > +file-capabilities-clear-fcaps-on-inode-change.patch
> > > 
> > >  file caps update
> > >...
> > 
> > This patch removes the unused EXPORT_SYMBOL(cap_inode_killpriv).
> > 
> > Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>
> 
> Acked-by: Serge Hallyn <[EMAIL PROTECTED]>
> 
> > ---
> > 68ca3bcc4918d0b84a97318f60fb74c4600d9f6b 
> > diff --git a/security/commoncap.c b/security/commoncap.c
> > index 7816cdc..9ec5890 100644
> > --- a/security/commoncap.c
> > +++ b/security/commoncap.c
> > @@ -543,7 +543,6 @@ EXPORT_SYMBOL(cap_bprm_apply_creds);
> >  EXPORT_SYMBOL(cap_bprm_secureexec);
> >  EXPORT_SYMBOL(cap_inode_setxattr);
> >  EXPORT_SYMBOL(cap_inode_removexattr);
> > -EXPORT_SYMBOL(cap_inode_killpriv);
> >  EXPORT_SYMBOL(cap_task_post_setuid);
> >  EXPORT_SYMBOL(cap_task_kill);
> >  EXPORT_SYMBOL(cap_task_setscheduler);
> 
> Ah yes, bc LSMs can't be modules any more.  But then, why still export
> cap_task_setscheduler, for instance?

Look at my next patch.  :-)

> -serge

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed


[PATCH RFC] CPU hotplug support for preemptible RCU

Hello!

Work in progress, not for inclusion.

The attached patch passes multiple hours of rcutorture while hotplugging
CPUs every ten seconds on 64-bit PPC and x86_64.  It fails miserably on
32-bit i386 after a few hotplugs, but then again, so does stock 2.6.22
even without running rcutorture simultaneously.

Is there some extra patch or hardware dependency for CPU hotplug on
32-bit i386?

Signed-off-by: Paul E. McKenney <[EMAIL PROTECTED]>
---

 include/linux/rcuclassic.h |2 
 include/linux/rcupreempt.h |2 
 kernel/rcuclassic.c|8 +++
 kernel/rcupreempt.c|   93 +++--
 4 files changed, 100 insertions(+), 5 deletions(-)

diff -urpNa -X dontdiff linux-2.6.22-d-schedclassic/include/linux/rcuclassic.h linux-2.6.22-e-hotplugcpu/include/linux/rcuclassic.h
--- linux-2.6.22-d-schedclassic/include/linux/rcuclassic.h  2007-08-06 10:16:16.0 -0700
+++ linux-2.6.22-e-hotplugcpu/include/linux/rcuclassic.h  2007-08-07 18:15:08.0 -0700
@@ -83,6 +83,8 @@ static inline void rcu_bh_qsctr_inc(int 
 #define rcu_check_callbacks_rt(cpu, user)
 #define rcu_init_rt()
 #define rcu_needs_cpu_rt(cpu) 0
+#define rcu_offline_cpu_rt(cpu)
+#define rcu_online_cpu_rt(cpu)
 #define rcu_pending_rt(cpu) 0
 #define rcu_process_callbacks_rt(unused)
 
diff -urpNa -X dontdiff linux-2.6.22-d-schedclassic/include/linux/rcupreempt.h linux-2.6.22-e-hotplugcpu/include/linux/rcupreempt.h
--- linux-2.6.22-d-schedclassic/include/linux/rcupreempt.h  2007-08-06 14:56:00.0 -0700
+++ linux-2.6.22-e-hotplugcpu/include/linux/rcupreempt.h  2007-08-07 18:15:10.0 -0700
@@ -59,6 +59,8 @@ extern void rcu_advance_callbacks_rt(int
 extern void rcu_check_callbacks_rt(int cpu, int user);
 extern void rcu_init_rt(void);
 extern int  rcu_needs_cpu_rt(int cpu);
+extern void rcu_offline_cpu_rt(int cpu);
+extern void rcu_online_cpu_rt(int cpu);
 extern int  rcu_pending_rt(int cpu);
 struct softirq_action;
 extern void rcu_process_callbacks_rt(struct softirq_action *unused);
diff -urpNa -X dontdiff linux-2.6.22-d-schedclassic/kernel/rcuclassic.c linux-2.6.22-e-hotplugcpu/kernel/rcuclassic.c
--- linux-2.6.22-d-schedclassic/kernel/rcuclassic.c  2007-08-06 14:07:26.0 -0700
+++ linux-2.6.22-e-hotplugcpu/kernel/rcuclassic.c  2007-08-11 08:25:55.0 -0700
@@ -404,14 +404,19 @@ static void __rcu_offline_cpu(struct rcu
 static void rcu_offline_cpu(int cpu)
 {
struct rcu_data *this_rdp = &get_cpu_var(rcu_data);
+#ifdef CONFIG_CLASSIC_RCU
struct rcu_data *this_bh_rdp = &get_cpu_var(rcu_bh_data);
+#endif /* #ifdef CONFIG_CLASSIC_RCU */
 
__rcu_offline_cpu(this_rdp, &rcu_ctrlblk,
&per_cpu(rcu_data, cpu));
+#ifdef CONFIG_CLASSIC_RCU
__rcu_offline_cpu(this_bh_rdp, &rcu_bh_ctrlblk,
&per_cpu(rcu_bh_data, cpu));
-   put_cpu_var(rcu_data);
put_cpu_var(rcu_bh_data);
+#endif /* #ifdef CONFIG_CLASSIC_RCU */
+   put_cpu_var(rcu_data);
+   rcu_offline_cpu_rt(cpu);
 }
 
 #else
@@ -561,6 +566,7 @@ static void __devinit rcu_online_cpu(int
rdp->passed_quiesc = &per_cpu(rcu_data_passed_quiesc, cpu);
rcu_init_percpu_data(cpu, &rcu_bh_ctrlblk, bh_rdp);
bh_rdp->passed_quiesc = &per_cpu(rcu_data_bh_passed_quiesc, cpu);
+   rcu_online_cpu_rt(cpu);
 }
 
 static int __cpuinit rcu_cpu_notify(struct notifier_block *self,
diff -urpNa -X dontdiff linux-2.6.22-d-schedclassic/kernel/rcupreempt.c linux-2.6.22-e-hotplugcpu/kernel/rcupreempt.c
--- linux-2.6.22-d-schedclassic/kernel/rcupreempt.c  2007-08-06 14:58:07.0 -0700
+++ linux-2.6.22-e-hotplugcpu/kernel/rcupreempt.c  2007-08-11 04:02:10.0 -0700
@@ -125,6 +125,8 @@ enum rcu_mb_flag_values {
 };
 static DEFINE_PER_CPU(enum rcu_mb_flag_values, rcu_mb_flag) = rcu_mb_done;
 
+static cpumask_t rcu_cpu_online_map = CPU_MASK_NONE;
+
 /*
  * Macro that prevents the compiler from reordering accesses, but does
  * absolutely -nothing- to prevent CPUs from reordering.  This is used
@@ -400,7 +402,7 @@ rcu_try_flip_idle(void)
 
/* Now ask each CPU for acknowledgement of the flip. */
 
-   for_each_possible_cpu(cpu)
+   for_each_cpu_mask(cpu, rcu_cpu_online_map)
per_cpu(rcu_flip_flag, cpu) = rcu_flipped;
 
return 1;
@@ -416,7 +418,7 @@ rcu_try_flip_waitack(void)
int cpu;
 
RCU_TRACE_ME(rcupreempt_trace_try_flip_a1);
-   for_each_possible_cpu(cpu)
+   for_each_cpu_mask(cpu, rcu_cpu_online_map)
if (per_cpu(rcu_flip_flag, cpu) != rcu_flip_seen) {
RCU_TRACE_ME(rcupreempt_trace_try_flip_ae1);
return 0;
@@ -460,7 +462,7 @@ rcu_try_flip_waitzero(void)
 
/* Call for a memory barrier from each CPU. */
 
-   for_each_possible_cpu(cpu)
+   for_each_cpu_mask(cpu, rcu_cpu_online_map)
per_cpu(rcu_mb_flag, cpu) = rcu_mb_needed;

Re: kfree(0) - ok?

2007-08-14 Thread Satyam Sharma



On Tue, 14 Aug 2007, Arjan van de Ven wrote:

> 
> On Tue, 2007-08-14 at 15:59 -0700, Tim Bird wrote:
> > Hi all,
> > 
> > I have a quick question.
> > 
> > I'm trying to resurrect a patch from the Linux-tiny patch suite,
> > to do accounting of kmalloc memory allocations.  In testing it
> > with Linux 2.6.22, I've found a large number of kfrees of
> > NULL pointers.
> > 
> > Is this considered OK?  Or should I examine the offenders
> > to see if something is coded badly?
> 
> kfree(NULL) is explicitly ok and it saves code size to not check
> anywhere

But that doesn't come free of cost, does it, seeing we've now pushed
the conditional inside kfree() itself. kfree() isn't inlined so we do
save on space but lose out on the extra time overhead for the common
case. Speaking of which ...

[PATCH] {slub, slob}: use unlikely() for kfree(ZERO_OR_NULL_PTR) check

kfree(NULL) would normally occur only in error paths, and
kfree(ZERO_SIZE_PTR) is uncommon as well, so use unlikely() for the
condition check in SLUB's and SLOB's kfree() to optimize for the
common case. SLAB has this already.
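
For anyone unfamiliar with the annotation, the following userspace sketch shows the shape of the check. The macro definitions are stand-ins written for this example and are assumed to mirror the usual kernel ones; kfree_would_free() is an invented helper, and __builtin_expect() requires GCC or a compatible compiler.

```c
#include <stddef.h>

/* Stand-ins for the kernel macros (assumptions for this sketch). */
#define unlikely(x)          __builtin_expect(!!(x), 0)
#define ZERO_SIZE_PTR        ((void *)16)
#define ZERO_OR_NULL_PTR(x)  ((unsigned long)(x) <= (unsigned long)ZERO_SIZE_PTR)

/* Returns 1 if the pointer would actually be freed, 0 if kfree()
 * would return early. */
static int kfree_would_free(const void *block)
{
	if (unlikely(ZERO_OR_NULL_PTR(block)))
		return 0;   /* cold path: NULL or a zero-size allocation */

	return 1;       /* hot path: a real object */
}
```

The hint does not change behavior; it only tells the compiler which branch to lay out as the fall-through path.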

Signed-off-by: Satyam Sharma <[EMAIL PROTECTED]>

---

 mm/slob.c |2 +-
 mm/slub.c |2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/slob.c b/mm/slob.c
index ec33fcd..37a8b9a 100644
--- a/mm/slob.c
+++ b/mm/slob.c
@@ -466,7 +466,7 @@ void kfree(const void *block)
 {
struct slob_page *sp;
 
-   if (ZERO_OR_NULL_PTR(block))
+   if (unlikely(ZERO_OR_NULL_PTR(block)))
return;
 
sp = (struct slob_page *)virt_to_page(block);
diff --git a/mm/slub.c b/mm/slub.c
index 69d02e3..3788537 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2467,7 +2467,7 @@ void kfree(const void *x)
 * this comparison would be true for all "negative" pointers
 * (which would cover the whole upper half of the address space).
 */
-   if (ZERO_OR_NULL_PTR(x))
+   if (unlikely(ZERO_OR_NULL_PTR(x)))
return;
 
page = virt_to_head_page(x);

Re: [PATCH 1/23] document preferred use of volatile with atomic_t

On Tue, Aug 14, 2007 at 03:56:51PM -0700, Christoph Lameter wrote:
> On Tue, 14 Aug 2007, Chris Snook wrote:
> 
> > > volatile means that there is some vague notion of "read it now". But that
> > > really does not exist. Instead we control visibility via barriers
> > > (smp_wmb, smp_rmb). Would it not be best to not have volatile at all in
> > > atomic operations and let the barriers do the work?
> > 
> > From my reply in the other thread...
> > 
> > But barriers force a flush of *everything* in scope, which we generally
> > don't want.  On the other hand, we pretty much always want to flush atomic_*
> > operations.  One way or another, we should be restricting the volatile
> > behavior to the thing that needs it.  On most architectures, this patch set
> > just moves that from the declaration, where it is considered harmful, to the
> > use, where it is considered an occasional necessary evil.
> > 
> > If you really, *really* distrust the compiler that much, you shouldn't be
> > using barrier, since that uses volatile under the hood too.  You should just
> > go ahead and implement the atomic operations in assembler, like Segher
> > Boessenkool did for powerpc in response to my previous patchset.
> 
> From my reply on the other thread:
> 
> Maybe we need two read functions? One volatile, one not?
> 
> The atomic_read()s that I have in slub really do not care about when the 
> variables are read. And if volatile creates overhead then I rather not have 
> it.


The overhead due to volatile access is -way- small.  Not like barrier(),
which can flush out a fair fraction of the machine registers.
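
For reference, a compiler barrier in GCC-style C looks roughly like the sketch below. The barrier() definition follows the common idiom; wait_for_flag() is an invented example, not code from any of the patches under discussion.

```c
/* Compiler barrier: emits no instructions, but the "memory" clobber makes
 * the compiler assume all memory may have changed, so values it was
 * holding in registers must be reloaded afterwards. */
#define barrier() __asm__ __volatile__("" : : : "memory")

/* Poll a plain int flag for up to max_spins iterations.
 * Returns 1 if the flag was seen set, 0 otherwise. */
static int wait_for_flag(int *flag, int max_spins)
{
	while (max_spins-- > 0) {
		if (*flag)
			return 1;
		barrier();  /* forces *flag (and everything else) to be re-read */
	}
	return 0;
}
```

The cost being compared is that the barrier invalidates every cached value in scope, while a volatile access only constrains the one object being read.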

Thanx, Paul

Re: [PATCH 5/13] cxgb3 - Expose HW memory page info


Jeff Garzik wrote:

Divy Le Ray wrote:

From: Divy Le Ray <[EMAIL PROTECTED]>

Let the RDMA driver get HW page info to work around HW issues.
Assign explicit enum values.

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>


"HW issues" -- you need to go into far more detail, when adding a new 
interface.  what hw issues?  why was this the best/only solution?





A HW issue requires limiting the receive window size to 23 pages of
internal memory. These pages can be configured to different sizes, so the
RDMA driver needs to know the page size to enforce the upper limit.

Divy

Re: [PATCH] [463/2many] MAINTAINERS - STRADIS MPEG-2 DECODER DRIVER

2007-08-14 Thread Nathan Laredo

Looks good to me.

Signed-off-by: Nathan Laredo <[EMAIL PROTECTED]>

- Nathan


On 8/14/07, Joe Perches <[EMAIL PROTECTED]> wrote:
> On Tue, 2007-08-14 at 14:51 -0700, Nathan Laredo wrote:
> > Just the ones that show my name at the top of the source file.
> > cs8240.h, ibmmpeg2.h, saa7121.h, saa7146*.h, stradis.c
>
> STRADIS MPEG-2 DECODER DRIVER
> P:  Nathan Laredo
> M:  [EMAIL PROTECTED]
> W:  http://www.stradis.com/
> S:  Maintained
> F:  drivers/media/video/cs8240.h
> F:  drivers/media/video/ibmmpeg2.h
> F:  drivers/media/video/saa7121.h
> F:  drivers/media/video/saa7146*.h
> F:  drivers/media/video/stradis.c
>
>
>

Re: [patch 1/2] i386: use asm() like the other atomic operations already do.

> My config with march=pentium-m and gcc (GCC) 4.1.2 (Gentoo 4.1.2):
>    text    data     bss     dec     hex filename
> 3434150  249176  176128 3859454  3ae3fe atomic_normal/vmlinux
> 3435308  249176  176128 3860612  3ae884 atomic_inlineasm/vmlinux

What is the difference between atomic_normal and atomic_inlineasm? 

>  /**
>   * atomic_read - read atomic variable
>   * @v: pointer of type atomic_t
> - * 
> + *

Please don't change white space in patches

>   * Atomically reads the value of @v.
> - */ 
> -#define atomic_read(v)   ((v)->counter)
> + */
> +static __inline__ int atomic_read(const atomic_t *v)
> +{
> + int t;
> +
> + __asm__ __volatile__(

And don't use __*__ in new code

-Andi

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

On Wed, Aug 15, 2007 at 04:38:54AM +0530, Satyam Sharma wrote:
> 
> 
> On Tue, 14 Aug 2007, Christoph Lameter wrote:
> 
> > On Thu, 9 Aug 2007, Chris Snook wrote:
> > 
> > > > This patchset makes the behavior of atomic_read uniform by removing the
> > > > volatile keyword from all atomic_t and atomic64_t definitions that
> > > > currently have it, and instead explicitly casts the variable as volatile
> > > > in atomic_read().  This leaves little room for creative optimization by
> > > > the compiler, and is in keeping with the principles behind "volatile
> > > > considered harmful".
> > 
> > volatile is generally harmful even in atomic_read(). Barriers control
> > visibility and AFAICT things are fine.
> 
> Frankly, I don't see the need for this series myself either. Personal
> opinion (others may differ), but I consider "volatile" to be a sad /
> unfortunate wart in C (numerous threads on this list and on the gcc
> lists/bugzilla over the years stand testimony to this) and if we _can_
> steer clear of it, then why not -- why use this ill-defined primitive
> whose implementation has often differed over compiler versions and
> platforms? Granted, barrier() _is_ heavy-handed in that it makes the
> optimizer forget _everything_, but then somebody did post a forget()
> macro on this thread itself ...
> 
> [ BTW, why do we want the compiler to not optimize atomic_read()'s in
>   the first place? Atomic ops guarantee atomicity, which has nothing
>   to do with "volatility" -- users that expect "volatility" from
>   atomic ops are the ones who must be fixed instead, IMHO. ]

Interactions between mainline code and interrupt/NMI handlers on the same
CPU (for example, when both are using per-CPU variables).  See examples
previously posted in this thread, or look at the rcu_read_lock() and
rcu_read_unlock() implementations in http://lkml.org/lkml/2007/8/7/280.

Thanx, Paul

Re: [PATCH 1/23] document preferred use of volatile with atomic_t


Christoph Lameter wrote:

On Tue, 14 Aug 2007, Chris Snook wrote:


volatile means that there is some vague notion of "read it now". But that
really does not exist. Instead we control visibility via barriers (smp_wmb,
smp_rmb). Would it not be best to not have volatile at all in atomic
operations and let the barriers do the work?

From my reply in the other thread...

But barriers force a flush of *everything* in scope, which we generally don't
want.  On the other hand, we pretty much always want to flush atomic_*
operations.  One way or another, we should be restricting the volatile
behavior to the thing that needs it.  On most architectures, this patch set
just moves that from the declaration, where it is considered harmful, to the
use, where it is considered an occasional necessary evil.

If you really, *really* distrust the compiler that much, you shouldn't be
using barrier, since that uses volatile under the hood too.  You should just
go ahead and implement the atomic operations in assembler, like Segher
Boessenkool did for powerpc in response to my previous patchset.


From my reply on the other thread:

Maybe we need two read functions? One volatile, one not?


If we're going to do this, and I don't think we need to, I'd prefer that 
atomic_read() be volatile, and something like atomic_read_opt() be non-volatile, 
to discourage premature optimization.


The atomic_read()s that I have in slub really do not care about when the 
variables are read. And if volatile creates overhead then I rather not have it.


A single volatile access is no more expensive than a non-volatile access.  It's 
when you have dependencies that you start to see overhead.  If you're doing a 
bunch of atomic operations on the same atomic_t in quick succession, then you 
will see some overhead.  Of course, if you're doing that, I think you have a 
design problem.


On modern, register-rich CPUs with cache latencies of a couple clock cycles, 
volatile generally isn't as much of a performance hit as it used to be.  I think 
that going out of your way to avoid it would be premature optimization on modern 
hardware.
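
A minimal sketch of the two accessors being discussed, assuming a simplified atomic_t. atomic_read_opt() is the hypothetical name floated above and does not exist in the tree; the volatile cast is the approach the patchset describes, not a copy of any architecture's actual header.

```c
typedef struct { int counter; } atomic_t;

/* Volatile at the point of use: the compiler must emit a real load each
 * time this is called. No memory barrier is implied. */
static inline int atomic_read(const atomic_t *v)
{
	return *(const volatile int *)&v->counter;
}

/* Hypothetical non-volatile variant: the compiler is free to cache the
 * value in a register or merge repeated reads. */
static inline int atomic_read_opt(const atomic_t *v)
{
	return v->counter;
}
```

Both return the same value in straight-line code; they differ only in how much freedom the optimizer has when the read sits in a loop or is repeated.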


-- Chris

Re: [PATCH 4/13] cxgb3 - use immediate data for offload Tx

Jeff Garzik wrote:

Divy Le Ray wrote:
> From: Divy Le Ray <[EMAIL PROTECTED]>
>
> Send small TX_DATA work requests as immediate data even when
> there are fragments.
>
> Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
> ---
>
>  drivers/net/cxgb3/sge.c |   17 +++--
>  1 files changed, 11 insertions(+), 6 deletions(-)

needs additional explanation.  don't just describe the new post-change
behavior, describe why this change is needed.

It's an optimization that avoids multiple DMAs for small fragmented
packets. The driver already implements this optimization for small
contiguous packets.

Divy

Re: [PATCH] [443/2many] MAINTAINERS - HIBERNATION (aka Software Suspend, aka swsusp):

2007-08-14 Thread Dave Jones

On Tue, Aug 14, 2007 at 11:22:37AM -0700, Andrew Morton wrote:
 > On Tue, 14 Aug 2007 11:15:41 -0700 (PDT)
 > Linus Torvalds <[EMAIL PROTECTED]> wrote:
 > 
 > > In other words, it would be much better to just have per-file markers, 
 > > along with some per-subdirectory stuff or similar.
 > 
 > And a `make maintainers' target to pull it all together..
 > 
 > (perhaps we could add a
 > 
 >  maintainer 
 > 
 > record to Kconfig, then `make maintainers' goes and looks up 
 > somewhere and does something with it)

Not everything that's in MAINTAINERS has a Kconfig entry though,
so it really needs to live in the .c/.h files.

Dave


-- 
http://www.codemonkey.org.uk

Re: [PATCH 2/13] cxgb3 - Update rx coalescing length


Jeff Garzik wrote:


Divy Le Ray wrote:
> From: Divy Le Ray <[EMAIL PROTECTED]>
>
> Set max Rx coalescing length to 12288
>
> Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
> ---
>
>  drivers/net/cxgb3/common.h |2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/net/cxgb3/common.h b/drivers/net/cxgb3/common.h
> index c46c249..55922ed 100644
> --- a/drivers/net/cxgb3/common.h
> +++ b/drivers/net/cxgb3/common.h
> @@ -104,7 +104,7 @@ enum {
>   PROTO_SRAM_LINES = 128, /* size of TP sram */
>  };
> 
> -#define MAX_RX_COALESCING_LEN 16224U

> +#define MAX_RX_COALESCING_LEN 12288U

neither the patch nor description explains -why-



We're seeing back pressure from PCIe with large bursts; this patch
cuts down on the burst size.


Divy


Re: [PATCH] [RESEND] PIE executable randomization

2007-08-14 Thread Jiri Kosina

On Tue, 14 Aug 2007, Jiri Kosina wrote:

> It turned out recently that PIE-compiled binaries on x86_64, that 
> perform larger amount of brk-allocations (for example bash) will not 
> work (but they will work on ?86). This is because currently on ?86 the 
> memory layout is as follows:

(Andi added to CC)

The following patch fixes the brk-allocation problems on x86_64 with code 
randomization patch on PIE-compiled binaries. Is anyone aware of any 
potential disaster it might cause somewhere please?

If not -- Andrew, could you apply it on top of 
pie-executable-randomization.patch please? Thanks.



From: Jiri Kosina <[EMAIL PROTECTED]>

X86_64: add flexmmap support

This patch adds flexible-mmap support for x86_64 and brings the address 
space layout closer to the "new" i?86 address space layout. Using the 
legacy layout is still possible by

- ADDR_COMPAT_LAYOUT personality
- having unlimited resource limit for stack
- legacy_va_layout sysctl setting

This corresponds to the ?86 behavior.

Flexible-mmap support is necessary for establishing proper mapping when 
performing executable code randomization for PIE-compiled binaries, 
otherwise non-randomized brk, which is immediately following the code, 
might not have enough free space.
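
The gap clamping done by the mmap_base() helper in the diff further down can be tried out in userspace with the sketch below. The TASK_SIZE and PAGE_SHIFT values are assumptions for a 64-bit build with 4 KiB pages, and the stack limit is passed in as a parameter instead of being read from current->signal.

```c
#define PAGE_SHIFT  12
#define PAGE_MASK   (~((1UL << PAGE_SHIFT) - 1))

/* Assumption: 47-bit user address space minus one guard page;
 * needs a 64-bit unsigned long. */
#define TASK_SIZE   (0x800000000000UL - 4096)

#define MIN_GAP     (128UL * 1024 * 1024)
#define MAX_GAP     (TASK_SIZE / 6 * 5)

/* stack_rlimit stands in for rlim[RLIMIT_STACK].rlim_cur. */
static unsigned long mmap_base(unsigned long stack_rlimit)
{
	unsigned long gap = stack_rlimit;

	if (gap < MIN_GAP)
		gap = MIN_GAP;          /* always leave at least 128 MB for the stack */
	else if (gap > MAX_GAP)
		gap = MAX_GAP;          /* never give the stack more than 5/6 of the space */

	return TASK_SIZE - (gap & PAGE_MASK);
}
```

With the common 8 MB stack limit the gap is raised to MIN_GAP, so the mmap area starts 128 MB below the top of the address space and grows downward from there.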

Signed-off-by: Jiri Kosina <[EMAIL PROTECTED]>

 arch/x86_64/mm/mmap.c |  107 ++---
 1 files changed, 92 insertions(+), 15 deletions(-)

diff --git a/arch/x86_64/mm/mmap.c b/arch/x86_64/mm/mmap.c
index 80bba0d..a5e658c 100644
--- a/arch/x86_64/mm/mmap.c
+++ b/arch/x86_64/mm/mmap.c
@@ -1,29 +1,106 @@
-/* Copyright 2005 Andi Kleen, SuSE Labs.
- * Licensed under GPL, v.2
+/*
+ *  linux/arch/x86-64/mm/mmap.c
+ *
+ *  flexible mmap layout support
+ *
+ * Based on code by Ingo Molnar and Andi Kleen, copyrighted
+ * as follows:
+ *
+ * Copyright 2003-2004 Red Hat Inc., Durham, North Carolina.
+ * All Rights Reserved.
+ * Copyright 2005 Andi Kleen, SuSE Labs.
+ * Copyright 2007 Jiri Kosina, SuSE Labs.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ *
  */
+
+#include 
 #include 
-#include 
 #include 
+#include 
+#include 
 #include 
 
-/* Notebook: move the mmap code from sys_x86_64.c over here. */
+/*
+ * Top of mmap area (just below the process stack).
+ *
+ * Leave an at least ~128 MB hole.
+ */
+#define MIN_GAP (128*1024*1024)
+#define MAX_GAP (TASK_SIZE/6*5)
 
-void arch_pick_mmap_layout(struct mm_struct *mm)
+static inline unsigned long mmap_base(void)
+{
+	unsigned long gap = current->signal->rlim[RLIMIT_STACK].rlim_cur;
+
+	if (gap < MIN_GAP)
+		gap = MIN_GAP;
+	else if (gap > MAX_GAP)
+		gap = MAX_GAP;
+
+	return TASK_SIZE - (gap & PAGE_MASK);
+}
+
+static inline int mmap_is_legacy(void)
 {
 #ifdef CONFIG_IA32_EMULATION
-	if (current_thread_info()->flags & _TIF_IA32)
-		return ia32_pick_mmap_layout(mm);
+	if (test_thread_flag(TIF_IA32))
+		return 1;
 #endif
-	mm->mmap_base = TASK_UNMAPPED_BASE;
+
+	if (current->personality & ADDR_COMPAT_LAYOUT)
+		return 1;
+
+	if (current->signal->rlim[RLIMIT_STACK].rlim_cur == RLIM_INFINITY)
+		return 1;
+
+	return sysctl_legacy_va_layout;
+}
+
+/*
+ * This function, called very early during the creation of a new
+ * process VM image, sets up which VM layout function to use:
+ */
+void arch_pick_mmap_layout(struct mm_struct *mm)
+{
+	int rnd = 0;
 	if (current->flags & PF_RANDOMIZE) {
 		/* Add 28bit randomness which is about 40bits of address space
 		   because mmap base has to be page aligned.
-		   or ~1/128 of the total user VM
-		   (total user address space is 47bits) */
-		unsigned rnd = get_random_int() & 0xfff;
-		mm->mmap_base += ((unsigned long)rnd) << PAGE_SHIFT;
+		   or ~1/128 of the total user VM
+		   (total user address space is 47bits) */
+		rnd = get_random_int() & 0xfff;
 	}
-   mm->get_unmapped_area = arch_get_unmapped_area;
-   mm->unmap_area = arch_unmap_area;
-}
 
+   /*
+* Fall back to the standard layout if the personality
+* bit is set, or if the expected stack growth is unlimited:
+
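
The gap clamping done by mmap_base() in the hunk above can be tried in
user space.  This is only a sketch: the *_SKETCH constants (4 KiB pages,
a 47-bit TASK_SIZE) and the name mmap_base_sketch() are assumptions made
for illustration, and the stack rlimit is passed in as a parameter
instead of being read from current->signal.

```c
/* Assumed values for illustration; the kernel takes these from its headers. */
#define PAGE_SIZE_SKETCH  4096ULL
#define PAGE_MASK_SKETCH  (~(PAGE_SIZE_SKETCH - 1))
#define TASK_SIZE_SKETCH  0x800000000000ULL	/* 47-bit user address space */

#define MIN_GAP_SKETCH    (128ULL * 1024 * 1024)
#define MAX_GAP_SKETCH    (TASK_SIZE_SKETCH / 6 * 5)

/*
 * Clamp the stack rlimit to [MIN_GAP, MAX_GAP], round it down to a page
 * boundary, and place the mmap base that far below the top of the
 * (assumed) user address space.
 */
unsigned long long mmap_base_sketch(unsigned long long stack_rlimit)
{
	unsigned long long gap = stack_rlimit;

	if (gap < MIN_GAP_SKETCH)
		gap = MIN_GAP_SKETCH;
	else if (gap > MAX_GAP_SKETCH)
		gap = MAX_GAP_SKETCH;

	return TASK_SIZE_SKETCH - (gap & PAGE_MASK_SKETCH);
}
```

With an 8 MB stack limit the gap is raised to 128 MB, so the base lands
128 MB below the assumed TASK_SIZE.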

Re: kfree(0) - ok?

2007-08-14 Thread Jason Uhlenkott

On Tue, Aug 14, 2007 at 15:55:48 -0700, Arjan van de Ven wrote:
> NULL is not 0 though.

It is.  Its representation isn't guaranteed to be all-bits-zero, but
the constant value 0 when used in pointer context is always a null
pointer (and in fact the standard requires that NULL be #defined as 0
or a cast thereof).
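
A tiny user-space check of that claim, as a sketch only (the function
name zero_is_null() is made up for illustration): a pointer initialised
from the constant 0 compares equal to one initialised from NULL, and
tests false in a boolean context.

```c
#include <stddef.h>

/* Returns 1 if the constant 0 in pointer context yields a null pointer. */
int zero_is_null(void)
{
	char *p = 0;	/* integer constant 0 converted to a null pointer */
	char *q = NULL;	/* the standard macro */

	return p == q && !p;
}
```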

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

On Tue, 14 Aug 2007, Chris Snook wrote:

> Because atomic operations are generally used for synchronization, which
> requires volatile behavior.  Most such codepaths currently use an inefficient
> barrier().  Some forget to and we get bugs, because people assume that
> atomic_read() actually reads something, and atomic_write() actually writes
> something.  Worse, these are architecture-specific, even compiler
> version-specific bugs that are often difficult to track down.

Looks like we need to have lock and unlock semantics?

atomic_read()

which has no barrier or volatile implications.

atomic_read_for_lock

Acquire semantics?


atomic_read_for_unlock

Release semantics?
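
One way to picture the three flavours asked about above is with the C11
<stdatomic.h> memory orders.  This is a user-space sketch, not kernel
code: the sketch_* names are made up for illustration, and because C11
has no release-ordered load, the "unlock" side is shown as a release
store instead of a read.

```c
#include <stdatomic.h>

/* Plain read: atomic, but no ordering against other memory accesses. */
static inline int sketch_read(atomic_int *v)
{
	return atomic_load_explicit(v, memory_order_relaxed);
}

/* Lock-side read: later accesses cannot be moved before this load. */
static inline int sketch_read_for_lock(atomic_int *v)
{
	return atomic_load_explicit(v, memory_order_acquire);
}

/* Unlock-side write: earlier accesses cannot be moved after this store. */
static inline void sketch_set_for_unlock(atomic_int *v, int i)
{
	atomic_store_explicit(v, i, memory_order_release);
}
```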


Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures


Satyam Sharma wrote:
> On Tue, 14 Aug 2007, Christoph Lameter wrote:
> 
> > On Thu, 9 Aug 2007, Chris Snook wrote:
> > 
> > > This patchset makes the behavior of atomic_read uniform by removing the
> > > volatile keyword from all atomic_t and atomic64_t definitions that currently
> > > have it, and instead explicitly casts the variable as volatile in
> > > atomic_read().  This leaves little room for creative optimization by the
> > > compiler, and is in keeping with the principles behind "volatile considered
> > > harmful".
> > 
> > volatile is generally harmful even in atomic_read(). Barriers control
> > visibility and AFAICT things are fine.
> 
> Frankly, I don't see the need for this series myself either. Personal
> opinion (others may differ), but I consider "volatile" to be a sad /
> unfortunate wart in C (numerous threads on this list and on the gcc
> lists/bugzilla over the years stand testimony to this) and if we _can_
> steer clear of it, then why not -- why use this ill-defined primitive
> whose implementation has often differed over compiler versions and
> platforms? Granted, barrier() _is_ heavy-handed in that it makes the
> optimizer forget _everything_, but then somebody did post a forget()
> macro on this thread itself ...
> 
> [ BTW, why do we want the compiler to not optimize atomic_read()'s in
>   the first place? Atomic ops guarantee atomicity, which has nothing
>   to do with "volatility" -- users that expect "volatility" from
>   atomic ops are the ones who must be fixed instead, IMHO. ]

Because atomic operations are generally used for synchronization, which
requires volatile behavior.  Most such codepaths currently use an inefficient
barrier().  Some forget to and we get bugs, because people assume that
atomic_read() actually reads something, and atomic_write() actually writes
something.  Worse, these are architecture-specific, even compiler
version-specific bugs that are often difficult to track down.


-- Chris

Re: kfree(0) - ok?

2007-08-14 Thread Robert Hancock


Tim Bird wrote:
> Hi all,
> 
> I have a quick question.
> 
> I'm trying to resurrect a patch from the Linux-tiny patch suite,
> to do accounting of kmalloc memory allocations.  In testing it
> with Linux 2.6.22, I've found a large number of kfrees of
> NULL pointers.
> 
> Is this considered OK?  Or should I examine the offenders
> to see if something is coded badly?

It's perfectly correct to do it - though, if it's done very frequently
in certain cases, it might be more efficient to check for null before
the kfree, to avoid the function call overhead into kfree.
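
The same convention exists in user space, where free(NULL) is defined to
do nothing.  The sketch below uses free() as a stand-in for kfree(); the
struct and function names are made up for illustration.  It shows a
teardown path that needs no NULL checks because of that rule.

```c
#include <stdlib.h>

/* Hypothetical object with two optional heap-allocated fields. */
struct sketch_item {
	char *name;
	char *desc;
};

/* Either field may be NULL; free(NULL) is a no-op, so no checks are needed. */
void sketch_item_release(struct sketch_item *it)
{
	free(it->name);
	free(it->desc);
	it->name = NULL;
	it->desc = NULL;
}
```

Calling sketch_item_release() twice on the same object is harmless,
since the second call only frees NULL pointers.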


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/


Re: kfree(0) - ok?

2007-08-14 Thread Arjan van de Ven


On Tue, 2007-08-14 at 15:59 -0700, Tim Bird wrote:
> Hi all,
> 
> I have a quick question.
> 
> I'm trying to resurrect a patch from the Linux-tiny patch suite,
> to do accounting of kmalloc memory allocations.  In testing it
> with Linux 2.6.22, I've found a large number of kfrees of
> NULL pointers.
> 
> Is this considered OK?  Or should I examine the offenders
> to see if something is coded badly?

kfree(NULL) is explicitly ok and it saves code size to not check
anywhere
(the idea is that kfree(kmalloc(...)); is a guaranteed safe nop)

NULL is not 0 though.



Re: kfree(0) - ok?

2007-08-14 Thread Satyam Sharma

On Tue, 14 Aug 2007, Tim Bird wrote:

> Hi all,
> 
> I have a quick question.
> 
> I'm trying to resurrect a patch from the Linux-tiny patch suite,
> to do accounting of kmalloc memory allocations.  In testing it
> with Linux 2.6.22, I've found a large number of kfrees of
> NULL pointers.
> 
> Is this considered OK?

kfree(NULL) is allowed -- for programmers' convenience.

Re: [PATCH 1/23] document preferred use of volatile with atomic_t

On Tue, 14 Aug 2007, Chris Snook wrote:

> > volatile means that there is some vague notion of "read it now". But that
> > really does not exist. Instead we control visibility via barriers (smp_wmb,
> > smp_rmb). Would it not be best to not have volatile at all in atomic
> > operations and let the barriers do the work?
> 
> From my reply in the other thread...
> 
> But barriers force a flush of *everything* in scope, which we generally don't
> want.  On the other hand, we pretty much always want to flush atomic_*
> operations.  One way or another, we should be restricting the volatile
> behavior to the thing that needs it.  On most architectures, this patch set
> just moves that from the declaration, where it is considered harmful, to the
> use, where it is considered an occasional necessary evil.
> 
> If you really, *really* distrust the compiler that much, you shouldn't be
> using barrier, since that uses volatile under the hood too.  You should just
> go ahead and implement the atomic operations in assembler, like Segher
> Boessenkool did for powerpc in response to my previous patchset.

From my reply on the other thread:

Maybe we need two read functions? One volatile, one not?

The atomic_read()s that I have in slub really do not care about when the 
variables are read. And if volatile creates overhead then I rather not have it.


Re: [RFC 0/3] dm: add path uevents

2007-08-14 Thread Kay Sievers

On Tue, 2007-08-14 at 13:05 -0700, Mike Anderson wrote:
> This patch series enables device mapper (dm) to send kobject uevents for
> dm events. Currently only two new events are sent related to path state
> changes.

Sounds fine.

> DM_ACTION=PATH_FAILED
> DM_SEQNUM=1
> DM_PATH=8:48
> DM_PATHS=1

Oh, I have a patch pending for dmsetup to export a bunch of variables in
the environment format with a single call, so we can use it in udev
context.
Here is the list, just that we don't clash with names or use the same
names where appropriate:
  DM_NAME=
  DM_UUID=
  DM_STATE=
  DM_TABLE_STATE=
  DM_OPENCOUNT=
  DM_MAJOR=
  DM_MINOR=
  DM_TARGET_COUNT=
  DM_TARGET_TYPES=
  DM_LAST_EVENT_NR=

Let me know if something should be renamed.

Thanks,
Kay


Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-14 Thread Satyam Sharma

On Tue, 14 Aug 2007, Christoph Lameter wrote:

> On Thu, 9 Aug 2007, Chris Snook wrote:
> 
> > This patchset makes the behavior of atomic_read uniform by removing the
> > volatile keyword from all atomic_t and atomic64_t definitions that currently
> > have it, and instead explicitly casts the variable as volatile in
> > atomic_read().  This leaves little room for creative optimization by the
> > compiler, and is in keeping with the principles behind "volatile considered
> > harmful".
> 
> volatile is generally harmful even in atomic_read(). Barriers control
> visibility and AFAICT things are fine.

Frankly, I don't see the need for this series myself either. Personal
opinion (others may differ), but I consider "volatile" to be a sad /
unfortunate wart in C (numerous threads on this list and on the gcc
lists/bugzilla over the years stand testimony to this) and if we _can_
steer clear of it, then why not -- why use this ill-defined primitive
whose implementation has often differed over compiler versions and
platforms? Granted, barrier() _is_ heavy-handed in that it makes the
optimizer forget _everything_, but then somebody did post a forget()
macro on this thread itself ...

[ BTW, why do we want the compiler to not optimize atomic_read()'s in
  the first place? Atomic ops guarantee atomicity, which has nothing
  to do with "volatility" -- users that expect "volatility" from
  atomic ops are the ones who must be fixed instead, IMHO. ]

[patch 1/2] i386: use asm() like the other atomic operations already do.

2007-08-14 Thread Sebastian Siewior

As Segher pointed out, inline asm is better than the volatile casting all over
the place. From the PowerPC patch description:
 Also use inline functions instead of macros; this actually
 improves code generation (some code becomes a little smaller,
 probably because of improved alias information -- just a few
 hundred bytes total on a default kernel build, nothing shocking).

My config with march=pentium-m and gcc (GCC) 4.1.2 (Gentoo 4.1.2):
   text    data     bss     dec     hex filename
3434150  249176  176128 3859454  3ae3fe atomic_normal/vmlinux
3435308  249176  176128 3860612  3ae884 atomic_inlineasm/vmlinux
3436201  249176  176128 3861505  3aec01 atomic_inline_volatile/vmlinux
3436203  249176  176128 3861507  3aec03 atomic_volatile/vmlinux

Signed-off-by: Sebastian Siewior <[EMAIL PROTECTED]>
--- a/include/asm-i386/atomic.h
+++ b/include/asm-i386/atomic.h
@@ -22,19 +22,34 @@ typedef struct { int counter; } atomic_t
 /**
  * atomic_read - read atomic variable
  * @v: pointer of type atomic_t
- * 
+ *
  * Atomically reads the value of @v.
- */ 
-#define atomic_read(v) ((v)->counter)
+ */
+static __inline__ int atomic_read(const atomic_t *v)
+{
+	int t;
+
+	__asm__ __volatile__(
+		"movl %1,%0"
+		: "=r"(t)
+		: "m"(v->counter));
+	return t;
+}
 
 /**
  * atomic_set - set atomic variable
  * @v: pointer of type atomic_t
  * @i: required value
- * 
+ *
  * Atomically sets the value of @v to @i.
- */ 
-#define atomic_set(v,i)		(((v)->counter) = (i))
+ */
+static __inline__ void atomic_set(atomic_t *v, int i)
+{
+	__asm__ __volatile__(
+		"movl %1,%0"
+		: "=m"(v->counter)
+		: "ir"(i));
+}
 
 /**
  * atomic_add - add integer to atomic variable

-- 


[patch 0/2] use asm() for atomic_{read|set}

2007-08-14 Thread Sebastian Siewior

I converted i386+x86-64. Compiled, booted and played for a while. The
description of both patches contains the file size of four kernel builds:
- "normal" is 28e8351ac22de25034e048c680014ad824323c65 as is
- "inline asm" is with this patch
- "inline volatile" is *(volatile int *)&(v)->counter as a static inline
  function
- "volatile" is *(volatile int *)&(v)->counter as a #define macro

I hope I don't encourage anyone to use macros over inline functions.

Sebastian
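
For readers who want to try the "inline asm" variant outside the kernel,
here is a user-space sketch.  The sketch_* names are made up for
illustration, GCC-style inline asm is assumed, and on anything other
than x86 the code falls back to the "inline volatile" form so that it
still compiles.

```c
/* Stand-in for the kernel's atomic_t; not the real definition. */
typedef struct { int counter; } sketch_atomic_t;

static inline int sketch_atomic_read(const sketch_atomic_t *v)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
	int t;

	/* One explicit load; the compiler cannot elide or merge it. */
	__asm__ __volatile__("movl %1,%0" : "=r"(t) : "m"(v->counter));
	return t;
#else
	return *(const volatile int *)&v->counter;
#endif
}

static inline void sketch_atomic_set(sketch_atomic_t *v, int i)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
	/* One explicit store of an immediate or register into memory. */
	__asm__ __volatile__("movl %1,%0" : "=m"(v->counter) : "ir"(i));
#else
	*(volatile int *)&v->counter = i;
#endif
}
```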
-- 


[patch 2/2] x86_64: use asm() like the other atomic operations already do.

2007-08-14 Thread Sebastian Siewior

As Segher pointed out, inline asm is better than the volatile casting all over
the place. From the PowerPC patch description:
 Also use inline functions instead of macros; this actually
 improves code generation (some code becomes a little smaller,
 probably because of improved alias information -- just a few
 hundred bytes total on a default kernel build, nothing shocking).

My config with march=k8 and gcc (GCC) 4.1.2 (Gentoo 4.1.2):
     text    data     bss     dec     hex filename
  4002473  385936  474440 4862849  4a3381 atomic_normal/vmlinux
  4002587  385936  474440 4862963  4a33f3 atomic_inlineasm/vmlinux
  4003911  385936  474440 4864287  4a391f atomic_volatile/vmlinux
  4003959  385936  474440 4864335  4a394f atomic_volatile_inline/vmlinux

Signed-off-by: Sebastian Siewior <[EMAIL PROTECTED]>
--- a/include/asm-x86_64/atomic.h
+++ b/include/asm-x86_64/atomic.h
@@ -29,19 +29,34 @@ typedef struct { int counter; } atomic_t
 /**
  * atomic_read - read atomic variable
  * @v: pointer of type atomic_t
- * 
+ *
  * Atomically reads the value of @v.
- */ 
-#define atomic_read(v) ((v)->counter)
+ */
+static __inline__ int atomic_read(const atomic_t *v)
+{
+	int t;
+
+	__asm__ __volatile__(
+		"movl %1, %0"
+		: "=r"(t)
+		: "m"(v->counter));
+	return t;
+}
 
 /**
  * atomic_set - set atomic variable
  * @v: pointer of type atomic_t
  * @i: required value
- * 
+ *
  * Atomically sets the value of @v to @i.
- */ 
-#define atomic_set(v,i)		(((v)->counter) = (i))
+ */
+static __inline__ void atomic_set(atomic_t *v, int i)
+{
+	__asm__ __volatile__(
+		"movl %1, %0"
+		: "=m"(v->counter)
+		: "ir"(i));
+}
 
 /**
  * atomic_add - add integer to atomic variable
@@ -206,7 +221,7 @@ static __inline__ int atomic_sub_return(
 
 /* An 64bit atomic type */
 
-typedef struct { volatile long counter; } atomic64_t;
+typedef struct { long counter; } atomic64_t;
 
 #define ATOMIC64_INIT(i)   { (i) }
 
@@ -217,7 +232,16 @@ typedef struct { volatile long counter; 
  * Atomically reads the value of @v.
  * Doesn't imply a read memory barrier.
  */
-#define atomic64_read(v)   ((v)->counter)
+static __inline__ long atomic64_read(const atomic64_t *v)
+{
+	long t;
+
+	__asm__ __volatile__(
+		"movq %1, %0"
+		: "=r"(t)
+		: "m"(v->counter));
+	return t;
+}
 
 /**
  * atomic64_set - set atomic64 variable
@@ -226,7 +250,13 @@ typedef struct { volatile long counter; 
  *
  * Atomically sets the value of @v to @i.
  */
-#define atomic64_set(v,i)  (((v)->counter) = (i))
+static __inline__ void atomic64_set(atomic64_t *v, long i)
+{
+	__asm__ __volatile__(
+		"movq %1, %0"
+		: "=m"(v->counter)
+		: "ir"(i));
+}
 
 /**
  * atomic64_add - add integer to atomic64 variable

-- 


Re: [patch 1/6] Convert from class_device to device in drivers/char/drm

2007-08-14 Thread Greg KH

On Tue, Aug 07, 2007 at 10:28:43PM -0700, [EMAIL PROTECTED] wrote:
> Convert from class_device to device in drivers/char/drm.
> 
> Signed-off-by: Tony Jones <[EMAIL PROTECTED]>
> Signed-off-by: Kay Sievers <[EMAIL PROTECTED]>
> 
> ---
>  drivers/char/drm/drmP.h  |8 ++---
>  drivers/char/drm/drm_stub.c  |9 +++---
>  drivers/char/drm/drm_sysfs.c |   58 ++-
>  3 files changed, 39 insertions(+), 36 deletions(-)

This should go to the drm maintainer, not me.  But they look good, feel
free to add my:
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

to it.

thanks,

greg k-h

Re: [patch 3/6] Convert from class_device to device for hwmon

2007-08-14 Thread Greg KH

On Tue, Aug 07, 2007 at 10:28:45PM -0700, [EMAIL PROTECTED] wrote:
> Convert from class_device to device for hwmon_device_register/unregister
> 
> Signed-off-by: Tony Jones <[EMAIL PROTECTED]>
> Signed-off-by: Kay Sievers <[EMAIL PROTECTED]>

Patches 3-5 here should go through the hwmon maintainer, not me.  Feel
free to add a:
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

to them, they look good.

thanks,

greg k-h

kfree(0) - ok?

2007-08-14 Thread Tim Bird

Hi all,

I have a quick question.

I'm trying to resurrect a patch from the Linux-tiny patch suite,
to do accounting of kmalloc memory allocations.  In testing it
with Linux 2.6.22, I've found a large number of kfrees of
NULL pointers.

Is this considered OK?  Or should I examine the offenders
to see if something is coded badly?

Thanks,
 -- Tim

=
Tim Bird
Architecture Group Chair, CE Linux Forum
Senior Staff Engineer, Sony Corporation of America
=


Re: [PATCH 1/23] document preferred use of volatile with atomic_t


Christoph Lameter wrote:
> On Mon, 13 Aug 2007, Chris Snook wrote:
> 
> > @@ -38,7 +45,7 @@
> >  
> >  Next, we have:
> >  
> > -	#define atomic_read(v)	((v)->counter)
> > +	#define atomic_read(v)	(*(volatile int *)&(v)->counter)
> >  
> >  which simply reads the current value of the counter.
> 
> volatile means that there is some vague notion of "read it now". But that
> really does not exist. Instead we control visibility via barriers
> (smp_wmb, smp_rmb). Would it not be best to not have volatile at all in
> atomic operations and let the barriers do the work?

From my reply in the other thread...

But barriers force a flush of *everything* in scope, which we generally don't
want.  On the other hand, we pretty much always want to flush atomic_*
operations.  One way or another, we should be restricting the volatile
behavior to the thing that needs it.  On most architectures, this patch set
just moves that from the declaration, where it is considered harmful, to the
use, where it is considered an occasional necessary evil.

If you really, *really* distrust the compiler that much, you shouldn't be
using barrier, since that uses volatile under the hood too.  You should just
go ahead and implement the atomic operations in assembler, like Segher
Boessenkool did for powerpc in response to my previous patchset.


-- Chris
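
The "volatile at the point of use" idea argued for above can be sketched
in user space as follows.  The sketch_* names are made up for
illustration; the struct deliberately has no volatile qualifier, and
only the accessors add it through a cast.

```c
/* Stand-in for atomic_t: note that the field itself is not volatile. */
typedef struct { int counter; } sketch_counter_t;

/* The cast forces a real load on every call, for this object only. */
static inline int sketch_counter_read(const sketch_counter_t *v)
{
	return *(const volatile int *)&v->counter;
}

/* Likewise, the cast forces a real store for this object only. */
static inline void sketch_counter_set(sketch_counter_t *v, int i)
{
	*(volatile int *)&v->counter = i;
}
```

Code that touches v->counter directly is still free to be optimized; the
volatile behavior is confined to the two accessors.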

Re: Thinking outside the box on file systems

2007-08-14 Thread alan


On Tue, 14 Aug 2007, Marc Perkel wrote:

> For example. If you list a directory you only see the
> files that you have some rights to and files where you
> have no rights are invisible to you. If a file is read
> only to you then you can't delete it either. Having
> write access to a directory really means that you have
> file create rights. You can also delete files that you
> have write access to. You would also allocate
> permissions to manage file rights like being able to
> set the rights of inferior users.

Imagine the fun you will have trying to write a file name and being told
you cannot write it for some unknown reason.  Unbeknownst to you, there is
a file there, but it is not owned by you, thus invisible.

Making a file system more user oriented would avoid little gotchas like
this.  The reason it is "programmer oriented" is that those are the people
who have worked out why it works and why certain things are bad ideas.
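
The gotcha can be reproduced with a toy model.  Everything below is made
up for illustration (a two-entry in-memory "directory" and ownership as
the only right); it is not how any real file system is implemented.  A
listing filtered by rights hides a name that still blocks creation.

```c
#include <string.h>

struct sketch_entry {
	const char *name;
	int owner_uid;
};

/* Toy directory: two names, owned by two different users. */
static const struct sketch_entry sketch_dir[] = {
	{ "report.txt", 1 },
	{ "notes.txt",  2 },
};
#define SKETCH_DIR_LEN (sizeof(sketch_dir) / sizeof(sketch_dir[0]))

/* Number of entries a listing would show to uid under the proposed rule. */
int sketch_visible_count(int uid)
{
	int n = 0;

	for (size_t i = 0; i < SKETCH_DIR_LEN; i++)
		if (sketch_dir[i].owner_uid == uid)
			n++;
	return n;
}

/* Returns 0 if name is free, -1 if it is taken, visible to uid or not. */
int sketch_try_create(int uid, const char *name)
{
	(void)uid;	/* the name space is shared, so identity does not matter */

	for (size_t i = 0; i < SKETCH_DIR_LEN; i++)
		if (strcmp(sketch_dir[i].name, name) == 0)
			return -1;
	return 0;
}
```

User 2 sees a single entry, yet creating "report.txt" fails because of
an entry that the listing never showed.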


--
Refrigerator Rule #1: If you don't remember when you bought it, Don't eat it.

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

On Tue, 14 Aug 2007, Chris Snook wrote:

> But barriers force a flush of *everything* in scope, which we generally don't
> want.  On the other hand, we pretty much always want to flush atomic_*
> operations.  One way or another, we should be restricting the volatile
> behavior to the thing that needs it.  On most architectures, this patch set
> just moves that from the declaration, where it is considered harmful, to the
> use, where it is considered an occasional necessary evil.

Then we would need

atomic_read()

and

atomic_read_volatile()

atomic_read_volatile() would imply an object sized memory barrier before 
and after?

Re: PROBLEM: 2.6.23-rc "NETDEV WATCHDOG: eth0: transmit timed out"

2007-08-14 Thread Francois Romieu

Karl Meyer <[EMAIL PROTECTED]> :
> I did some additional testing, the results are:
> [0e4851502f846b13b29b7f88f1250c980d57e944] r8169: merge with version
> 8.001.00 of Realtek's r8168 driver
> does not work; after some traffic the transmit timeout occurs.
> [6dccd16b7c2703e8bbf8bca62b5cf248332afbe2] r8169: merge with version
> 6.001.00 of Realtek's r8169 driver
> Seems to be the last version to work. I did some stress testing (much
> more than the level that was enough to make
> [0e4851502f846b13b29b7f88f1250c980d57e944]  break) and am currently
> using this version and no problems so far.

Thanks for the quick feedback.

Can you try the patch below on top of 2.6.23-rc3 ?

If it does not work I'll dissect 0e4851502f846b13b29b7f88f1250c980d57e944
tomorrow.

diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
index b85ab4a..cdb8a08 100644
--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -2749,6 +2749,7 @@ static irqreturn_t rtl8169_interrupt(int irq, void *dev_instance)
 		if (!(status & tp->intr_event))
 			break;
 
+#if 0
 		/* Work around for rx fifo overflow */
 		if (unlikely(status & RxFIFOOver) &&
 		    (tp->mac_version == RTL_GIGA_MAC_VER_11)) {
@@ -2756,6 +2757,7 @@ static irqreturn_t rtl8169_interrupt(int irq, void *dev_instance)
 			rtl8169_tx_timeout(dev);
 			break;
 		}
+#endif
 
 		if (unlikely(status & SYSErr)) {
 			rtl8169_pcierr_interrupt(dev);
-- 
Ueimor

Re: System call interposition/unprotecting the table

> Then you fix the specific case and the game continues.

If they intercept netdev->hard_start_xmit there is nothing
to fix. Or inode->i_ops or any other virtual method pointer
that is called often..

Putting i_ops into const memory doesn't help either -- they
can just copy them and use their own and replace the pointer
in any accessed inode.

It's also not that doing this is rocket science. Anybody
barely skilled in computer architecture should be able
to figure this out.

Ok the only thing that could help is IA64/PPC64 style smart pointer
checking that could prevent foreign code from being 
executed, but you won't get that on x86 or most other
architectures any time soon.

And that would also only work if you disable module loading
or implement a likely impractical/incompatible
with free software code signing scheme
(and Vista has just shown that these don't work anyways) 

> > In general the .data protection is only considered a debugging
> > feature. I don't know why Fedora enables it in their production
> > kernels.
> 
> That would be because we think you are wrong 8)

Well, it might at best buy you a few weeks/months in
terms of the exploit arms race, but thrash your user's TLBs
forever.

-Andi


Re: [PATCH] [463/2many] MAINTAINERS - STRADIS MPEG-2 DECODER DRIVER

On Tue, 2007-08-14 at 14:51 -0700, Nathan Laredo wrote:
> Just the ones that show my name at the top of the source file.
> cs8240.h, ibmmpeg2.h, saa7121.h, saa7146*.h, stradis.c

STRADIS MPEG-2 DECODER DRIVER
P:  Nathan Laredo
M:  [EMAIL PROTECTED]
W:  http://www.stradis.com/
S:  Maintained
F:  drivers/media/video/cs8240.h
F:  drivers/media/video/ibmmpeg2.h
F:  drivers/media/video/saa7121.h
F:  drivers/media/video/saa7146*.h
F:  drivers/media/video/stradis.c



Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures


Christoph Lameter wrote:
> On Thu, 9 Aug 2007, Chris Snook wrote:
> 
> > This patchset makes the behavior of atomic_read uniform by removing the
> > volatile keyword from all atomic_t and atomic64_t definitions that currently
> > have it, and instead explicitly casts the variable as volatile in
> > atomic_read().  This leaves little room for creative optimization by the
> > compiler, and is in keeping with the principles behind "volatile considered
> > harmful".
> 
> volatile is generally harmful even in atomic_read(). Barriers control
> visibility and AFAICT things are fine.

But barriers force a flush of *everything* in scope, which we generally don't
want.  On the other hand, we pretty much always want to flush atomic_*
operations.  One way or another, we should be restricting the volatile
behavior to the thing that needs it.  On most architectures, this patch set
just moves that from the declaration, where it is considered harmful, to the
use, where it is considered an occasional necessary evil.

See the resubmitted patchset, which also puts a cast in the atomic[64]_set
operations.


-- Chris

Re: [PATCH 1/23] document preferred use of volatile with atomic_t

On Mon, 13 Aug 2007, Chris Snook wrote:

> @@ -38,7 +45,7 @@
>  
>  Next, we have:
>  
> - #define atomic_read(v)  ((v)->counter)
> + #define atomic_read(v)  (*(volatile int *)&(v)->counter)
>  
>  which simply reads the current value of the counter.

volatile means that there is some vague notion of "read it now". But that 
really does not exist. Instead we control visibility via barriers 
(smp_wmb, smp_rmb). Would it not be best to not have volatile at all in 
atomic operations and let the barriers do the work?


Re: [PATCH 6/24] make atomic_read() behave consistently on frv

On Wed, Aug 15, 2007 at 12:01:54AM +0200, Arnd Bergmann wrote:
> On Tuesday 14 August 2007, Paul E. McKenney wrote:
> > > #define order(x) asm volatile("" : "+m" (x))
> > 
> > There was something very similar discussed earlier in this thread,
> > with quite a bit of debate as to exactly what the "m" flag should
> > look like.  I suggested something similar named ACCESS_ONCE in the
> > context of RCU (http://lkml.org/lkml/2007/7/11/664):
> > 
> > #define ACCESS_ONCE(x) (*(volatile typeof(x) *)&(x))
> > 
> > The nice thing about this is that it works for both loads and stores.
> > Not clear that order() above does this -- I get compiler errors when
> > I try something like "b = order(a)" or "order(a) = 1" using gcc 4.1.2.
> 
> Well, it serves a different purpose: While your ACCESS_ONCE() macro is
> an lvalue, the order() macro is a statement that can be used in place
> of the barrier() macro. order() is the most lightweight barrier as it
> only enforces ordering on a single variable in the compiler, but does
> not have any side-effects visible to other threads, like the cache
> line access in ACCESS_ONCE has.

ACCESS_ONCE() is indeed intended to be used when actually loading or
storing the variable.  That said, I must admit that it is not clear to me
why you would want to add an extra order() rather than ACCESS_ONCE()ing
one or both of the adjacent accesses to that same variable.

So, what am I missing?

Thanx, Paul
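
The difference between the two macros quoted above can be seen in a
small user-space sketch.  GCC or clang is assumed (for __typeof__ and
the inline asm), and the variable and function names are made up for
illustration.  ACCESS_ONCE() yields an lvalue usable for both loads and
stores, whereas order() is a statement placed between two ordinary
accesses.

```c
/* Both macros as quoted in the thread, spelled with GNU C keywords. */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))
#define order(x)       __asm__ __volatile__("" : "+m" (x))

static int sketch_flag;

/* ACCESS_ONCE() on the left-hand side: exactly one store. */
void sketch_publish(int v)
{
	ACCESS_ONCE(sketch_flag) = v;
}

/* ACCESS_ONCE() on the right-hand side: exactly one load. */
int sketch_observe(void)
{
	return ACCESS_ONCE(sketch_flag);
}

/* order() between two plain reads: the second read must be a fresh load. */
int sketch_sum_two_reads(void)
{
	int a = sketch_flag;

	order(sketch_flag);
	return a + sketch_flag;
}
```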

Thinking outside the box on file systems

2007-08-14 Thread Marc Perkel

I want to throw out some concepts about a new way of
thinking about file systems. But the first thing you
have to do is to forget what you know about file
systems now. This is a discussion about a new view of
looking at file storage that is radically different and
it's more easily understood if you forget a lot of what
you know. The idea is to create what seems natural to
the user rather than what seems natural to the
programmer.

For example, if a user has no read or write access to
a file then why should they be able to delete the file
- or even list the file in the directory? To grasp this,
the idea of directory permissions as you now know them
needs to go away.

Imagine that the file system is a database that
contains file data, name data, and permission data.
Lose the idea that files have an owner and a group or
the attributes that we are familiar with. Think
instead that users, groups, managers, applications,
and such are objects and there is a complex rights
system that gives access to names that point to file
data.

For example, if you list a directory you only see the
files that you have some rights to, and files where you
have no rights are invisible to you. If a file is read
only to you then you can't delete it either. Having
write access to a directory really means that you have
file create rights. You can also delete files that you
have write access to. You would also allocate
permissions to manage file rights like being able to
set the rights of inferior users.
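
To make the rules in the paragraph above concrete, here is a minimal
sketch in C. Everything in it is hypothetical: the RIGHT_* bits, struct
entry and the helper names are invented for illustration and do not
correspond to any existing Linux interface. It only shows the two rules
stated above: a name is listed only when the caller holds some right on
it, and deleting requires write access to the file itself.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-user rights on one name; not an existing Linux API. */
enum {
	RIGHT_READ   = 1 << 0,
	RIGHT_WRITE  = 1 << 1,
	RIGHT_MANAGE = 1 << 2,	/* may set the rights of other users */
};

struct entry {
	const char *name;
	unsigned rights;	/* rights of the calling user on this name */
};

/* A name is listed only if the caller holds at least one right on it. */
static bool can_list(const struct entry *e)
{
	return e->rights != 0;
}

/* Deleting requires write access to the file itself. */
static bool can_delete(const struct entry *e)
{
	return (e->rights & RIGHT_WRITE) != 0;
}

/* Count the entries of a directory that are visible to the caller. */
static size_t count_visible(const struct entry *dir, size_t n)
{
	size_t i, visible = 0;

	for (i = 0; i < n; i++)
		if (can_list(&dir[i]))
			visible++;
	return visible;
}
```

With a directory holding one read-only file, one file with no rights and
one read-write file, count_visible() reports two entries and only the
read-write file can be deleted.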

The ACLs that were added to Linux were a step in the
right direction but very incomplete. What is needed is
a complex permission system that would allow fine
grained permissions and inheritance masks to control
what permissions are granted when someone moves new
files into a directory. Instead of just root and users
there would be mid level roles where users and objects
had management authority over parts of the system and
the roles can be defined in a very flexible way. For
example, rights might change during "business hours".

I want to throw these concepts out there to inspire a
new way of thinking and let Linux evolve into a more
natural kind of file system rather than staying true
to its ancient roots. Of course there would be an
emulation layer to keep existing apps happy but I
think that Linux will never be truly what it could be
unless it breaks away from the limitations of the
past.

Anyhow, I'm going to stop at this just to let these
ideas settle in. In my mind there's a lot more detail
but let's see where this goes.

Marc Perkel
Junk Email Filter dot com
http://www.junkemailfilter.com


  



Re: [RFC 4/9] Atomic reclaim: Save irq flags in vmscan.c

On Wed, 15 Aug 2007, Andi Kleen wrote:

> > Ok I have a vague idea on how this could but its likely that the 
> > changes make things worse rather than better. Additional reference to a 
> > new cacheline (per cpu but still), preempt disable. Lots of code at all
> > call sites. Interrupt enable/disable is quite efficient in recent 
> > processors.
> 
> The goal of this was not to be faster than interrupt disable,
> but to avoid the interrupt latency impact. This might be a problem
> when spending a lot of time inside the locks.

Both. They need to be fast too and not complicate the kernel too much. I 
have not seen a serious holdoff case. The biggest issue is still the 
zone->lru lock but interrupts are always disabled for that one already.



Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

On Thu, 9 Aug 2007, Chris Snook wrote:

> This patchset makes the behavior of atomic_read uniform by removing the
> volatile keyword from all atomic_t and atomic64_t definitions that currently
> have it, and instead explicitly casts the variable as volatile in
> atomic_read().  This leaves little room for creative optimization by the
> compiler, and is in keeping with the principles behind "volatile considered
> harmful".

volatile is generally harmful even in atomic_read(). Barriers control
visibility and AFAICT things are fine.

Re: System call interposition/unprotecting the table

2007-08-14 Thread Alan Cox

> So even with Alan's hypervisor support the whole thing would be still
> quite holey.  The argument of raising the bar also doesn't seem very

Its materially harder, especially with the hypervisor.

> convincing to me, because attackers reuse code too and it's enough
> when someone publishes such code once, then they can cut'n'paste
> it into any exploits forever.

Then you fix the specific case and the game continues.

> In general the .data protection is only considered a debugging
> feature. I don't know why Fedora enables it in their production
> kernels.

That would be because we think you are wrong 8)

Alan

Re: [2.6 patch] remove Documentation/networking/net-modules.txt

2007-08-14 Thread Adrian Bunk

On Tue, Aug 14, 2007 at 06:04:01PM -0400, Jeff Garzik wrote:
> Adrian Bunk wrote:
>> According to git, the only one who touched this file during the last
>> 5 years was me when removing drivers...
>> modinfo offers less ancient information.
>> Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>
>> ---
>> This patch has been sent on:
>> - 23 Jul 2007
>>  Documentation/networking/00-INDEX|2  
>> Documentation/networking/net-modules.txt |  315 ---
>>  2 files changed, 317 deletions(-)
>
> NAK, IMO it's still use for ancient drivers

Is there any that lacks a MODULE_PARM_DESC()?
If yes, shouldn't we fix such drivers instead?

Even for ancient drivers net-modules.txt is outdated: it sometimes lists
parameters that no longer exist and fails to document more recent ones.

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed


Re: [PATCH 10/23] make atomic_read() and atomic_set() behavior consistent on ia64


Christoph Lameter wrote:
> On Tue, 14 Aug 2007, Luck, Tony wrote:
>
>> I re-tried the macros ... the three warnings from mm/slub.c all result in
>> broken code ... and quite rightly too, they all come from code that does:
>>
>> 	atomic_read(&n->nr_slabs)
>>
>> But the nr_slabs field is an atomic_long_t, so we shouldn't be using
>> atomic_read().  I didn't spot these last time around because I was using
>> slab, not slub for the previous build.
>
> H...  Strange that this did not cause failures before on any other
> platforms?


Prior to the patch in question, atomic_read was a macro.  I didn't use slub in 
my cursory testing.  Tony had ia64 under a microscope because of the tricky 
memory access ordering semantics of that architecture.


-- Chris

Re: [2.6 patch] w1_remove_master_device(): fix check-after-use

2007-08-14 Thread Evgeniy Polyakov

Hi Adrian.

On Tue, Aug 14, 2007 at 11:22:48PM +0200, Adrian Bunk ([EMAIL PROTECTED]) wrote:
> The Coverity checker spotted that we'd have already oops'ed if "dev"
> was NULL.

This is wrong.
Although dev cannot be NULL there, there is no way it will crash.
The right paranoid solution is to set up a new pointer, make it equal
to the found device, and check whether it is NULL outside of the loop.
I will cook up a patch tomorrow, thanks for pointing out this issue.

-- 
Evgeniy Polyakov
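
The "separate pointer" approach described above can be sketched as
follows. This is only an illustration under stated assumptions: struct
dev_node and find_dev() are hypothetical stand-ins, not the real w1
types or the patch that was eventually sent. The point is that the
search result lives in its own pointer, which stays NULL unless a match
is found, so the check after the loop is meaningful regardless of where
the loop cursor ends up.

```c
#include <stddef.h>

/* Hypothetical stand-in for the driver's device list; not the real w1 types. */
struct dev_node {
	int id;
	struct dev_node *next;
};

static struct dev_node *find_dev(struct dev_node *head, int id)
{
	struct dev_node *dev, *found = NULL;

	for (dev = head; dev; dev = dev->next) {
		if (dev->id == id) {
			found = dev;	/* remember the match separately */
			break;
		}
	}

	/* The caller tests the result once, after the loop. */
	return found;
}
```

With the kernel's list_for_each_entry() the cursor is never NULL after
the loop, which is presumably why a NULL test on the cursor itself is
not useful and a second pointer is suggested.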

Re: [RFC 4/9] Atomic reclaim: Save irq flags in vmscan.c

> Ok I have a vague idea on how this could but its likely that the 
> changes make things worse rather than better. Additional reference to a 
> new cacheline (per cpu but still), preempt disable. Lots of code at all
> call sites. Interrupt enable/disable is quite efficient in recent 
> processors.

The goal of this was not to be faster than interrupt disable,
but to avoid the interrupt latency impact. This might be a problem
when spending a lot of time inside the locks.

-Andi

Re: [RFC 4/9] Atomic reclaim: Save irq flags in vmscan.c

On Wed, 15 Aug 2007, Andi Kleen wrote:

> > > The interrupt handler shouldn't touch zone_flag. If it wants
> > > to it would need to be converted to a local_t and incremented/decremented
> > > (should be about the same cost at least on architectures with sane
> > > local_t implementation) 
> > 
> > That would mean we need to fork the code for reclaim?
> 
> Not with the local_t increment.

Ok, I have a vague idea of how this could work, but it's likely that the
changes make things worse rather than better. Additional reference to a
new cacheline (per cpu but still), preempt disable. Lots of code at all
call sites. Interrupt enable/disable is quite efficient in recent
processors.


Re: [3/4] 2.6.23-rc3: known regressions

2007-08-14 Thread Francois Romieu

Michal Piotrowski <[EMAIL PROTECTED]> :
[...]
> Networking
> 
> Subject : NETDEV WATCHDOG: eth0: transmit timed out
> References  : http://lkml.org/lkml/2007/8/13/737
> Last known good : ?
> Submitter   : Karl Meyer <[EMAIL PROTECTED]>
> Caused-By   : ?
> Handled-By  : ?

  Handled-By  : [EMAIL PROTECTED]

> Status  : unknown

> Subject : Weird network problems with 2.6.23-rc2
> References  : http://lkml.org/lkml/2007/8/11/40
> Last known good : ?
> Submitter   : Shish <[EMAIL PROTECTED]>
> Caused-By   : ?
> Handled-By  : ?
> Status  : unknown

The PR does not give any driver nor hardware detail. :o/

> Subject : BUG: when using 'brctl stp'
> References  : http://lkml.org/lkml/2007/8/10/441
> Last known good : 2.6.23-rc1
> Submitter   : Daniel K. <[EMAIL PROTECTED]>
> Caused-By   : ?
> Handled-By  : ?

  Handled-By  : [EMAIL PROTECTED]

> Status  : unknown

  Status  : fix applied by David Miller

-- 
Ueimor

Re: [RFC 4/9] Atomic reclaim: Save irq flags in vmscan.c

On Tue, Aug 14, 2007 at 03:07:10PM -0700, Christoph Lameter wrote:
> There are more spinlocks needed. So we would just check the whole bunch 
> and fail if any of them are used?

Yes zone_flag would apply to all of them.

> 
> > do things with zone locks 
> > }
> > 
> > The interrupt handler shouldn't touch zone_flag. If it wants
> > to it would need to be converted to a local_t and incremented/decremented
> > (should be about the same cost at least on architectures with sane
> > local_t implementation) 
> 
> That would mean we need to fork the code for reclaim?

Not with the local_t increment.

-Andi

Re: [RFC 4/9] Atomic reclaim: Save irq flags in vmscan.c

On Tue, 14 Aug 2007, Andi Kleen wrote:

> > Could you be a bit more specific? Where do you want to place the data?
> 
> DEFINE_PER_CPU(int, zone_flag);
> 
>   get_cpu();  // likely already true and then not needed
>   __get_cpu(zone_flag) = 1;
>   /* wmb is implied in spin_lock I think */

No, it's not. Only on x64, which has implicit write ordering.

>   spin_lock(&zone->lru_lock);
>   ...
>   spin_unlock(&zone->lru_lock);
>   __get_cpu(zone_flag) = 0;
>   put_cpu();
> 
> Interrupt handler
> 
>   if (!__get_cpu(zone_flag)) {

There are more spinlocks needed. So we would just check the whole bunch 
and fail if any of them are used?

>   do things with zone locks 
>   }
> 
> The interrupt handler shouldn't touch zone_flag. If it wants
> to it would need to be converted to a local_t and incremented/decremented
> (should be about the same cost at least on architectures with sane
> local_t implementation) 

That would mean we need to fork the code for reclaim?

RE: [PATCH 10/23] make atomic_read() and atomic_set() behavior consistent on ia64

On Tue, 14 Aug 2007, Luck, Tony wrote:

> I re-tried the macros ... the three warnings from mm/slub.c all result in
> broken code ... and quite rightly too, they all come from code that does:
> 
>   atomic_read(&n->nr_slabs)
> 
> But the nr_slabs field is an atomic_long_t, so we shouldn't be using
> atomic_read().  I didn't spot these last time around because I was using
> slab, not slub for the previous build.

H...  Strange that this did not cause failures before on any other 
platforms?


Fix atomic_read's in slub

Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>

diff --git a/mm/slub.c b/mm/slub.c
index 69d02e3..0c106d7 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3112,7 +3112,7 @@ static int list_locations(struct kmem_cache *s, char *buf,
 		unsigned long flags;
 		struct page *page;
 
-		if (!atomic_read(&n->nr_slabs))
+		if (!atomic_long_read(&n->nr_slabs))
 			continue;
 
 		spin_lock_irqsave(&n->list_lock, flags);
@@ -3247,7 +3247,7 @@ static unsigned long slab_objects(struct kmem_cache *s,
 		}
 
 		if (flags & SO_FULL) {
-			int full_slabs = atomic_read(&n->nr_slabs)
+			int full_slabs = atomic_long_read(&n->nr_slabs)
 					- per_cpu[node]
 					- n->nr_partial;
 
@@ -3283,7 +3283,7 @@ static int any_slab_objects(struct kmem_cache *s)
 	for_each_node(node) {
 		struct kmem_cache_node *n = get_node(s, node);
 
-		if (n->nr_partial || atomic_read(&n->nr_slabs))
+		if (n->nr_partial || atomic_long_read(&n->nr_slabs))
 			return 1;
 	}
 	return 0;

Re: [2.6 patch] remove Documentation/networking/net-modules.txt

2007-08-14 Thread Jeff Garzik


Adrian Bunk wrote:
> According to git, the only one who touched this file during the last
> 5 years was me when removing drivers...
>
> modinfo offers less ancient information.
>
> Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>
>
> ---
>
> This patch has been sent on:
> - 23 Jul 2007
>
>  Documentation/networking/00-INDEX|2
>  Documentation/networking/net-modules.txt |  315 ---
>  2 files changed, 317 deletions(-)

NAK, IMO it's still useful for ancient drivers



RE: [PATCH 10/23] make atomic_read() and atomic_set() behavior consistent on ia64

2007-08-14 Thread Luck, Tony

>> include/linux/skbuff.h:521: warning: passing arg 1 of `atomic_read' discards qualifiers from pointer target type
>> include/net/sock.h:1244: warning: passing arg 1 of `atomic_read' discards qualifiers from pointer target type
>> include/net/tcp.h:958: warning: passing arg 1 of `atomic_read' discards qualifiers from pointer target type
>> mm/slub.c:3115: warning: passing arg 1 of `atomic_read' from incompatible pointer type
>> mm/slub.c:3250: warning: passing arg 1 of `atomic_read' from incompatible pointer type
>> mm/slub.c:3286: warning: passing arg 1 of `atomic_read' from incompatible pointer type

> Do you get any warnings other than those two?

That looks like six, not two.  But that's the whole list.

> IIRC, when you applied a version which used macros instead, there was no
> change.  It would seem that inlining changed the optimization behavior of
> the compiler.  If you turn down the optimization level, do the macro and
> inline versions look the same, or at least more similar?

I re-tried the macros ... the three warnings from mm/slub.c all result in
broken code ... and quite rightly too, they all come from code that does:

	atomic_read(&n->nr_slabs)

But the nr_slabs field is an atomic_long_t, so we shouldn't be using
atomic_read().  I didn't spot these last time around because I was using
slab, not slub for the previous build.
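
For readers following along, the mismatch can be reproduced outside the
kernel with a small sketch. The type and function definitions below are
simplified stand-ins, assumed for illustration only; the real ones are
per-architecture. struct node_counts and slabs_on_node() are
hypothetical names. With typed inline accessors, passing an
atomic_long_t to atomic_read() draws an "incompatible pointer type"
warning, whereas a macro that casts its argument would accept it
silently.

```c
/* Simplified stand-ins for the kernel types; the real ones are per-architecture. */
typedef struct { int counter; } atomic_t;
typedef struct { long counter; } atomic_long_t;

/* The volatile cast lives in the accessor, not in the type definition. */
static inline int atomic_read(const atomic_t *v)
{
	return *(const volatile int *)&v->counter;
}

static inline long atomic_long_read(const atomic_long_t *l)
{
	return *(const volatile long *)&l->counter;
}

/* Hypothetical container, loosely modelled on the nr_slabs field. */
struct node_counts {
	atomic_long_t nr_slabs;
};

static long slabs_on_node(const struct node_counts *n)
{
	/* atomic_read(&n->nr_slabs) here would be flagged by the compiler. */
	return atomic_long_read(&n->nr_slabs);
}
```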

I think that I'll run into other build issues if I turn down the
optimization level (there are lots of places where the kernel relies
on optimizing away impossible cases in switch statements).

> The binary does boot ... but I haven't run any tests to see whether
> there are any problems.

-Tony

Re: [-mm patch] make fs/nfsd/nfs4callback.c:do_probe_callback() static

2007-08-14 Thread J. Bruce Fields

On Tue, Aug 14, 2007 at 11:22:58PM +0200, Adrian Bunk wrote:
> On Thu, Aug 09, 2007 at 01:51:06AM -0700, Andrew Morton wrote:
> >...
> > Changes since 2.6.23-rc2-mm1:
> >...
> >  git-nfsd.patch
> >...
> >  git trees
> >...
> 
> 
> do_probe_callback() can become static.

Oops, thanks; applied.--b.

> 
> Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>
> 
> ---
> 8177c6f652deb91fcb43c8ca86f7703a61468ba9 
> diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> index afdf66b..c17a520 100644
> --- a/fs/nfsd/nfs4callback.c
> +++ b/fs/nfsd/nfs4callback.c
> @@ -369,7 +369,7 @@ nfsd4_lookupcred(struct nfs4_client *clp, int taskflags)
>  /* Reference counting, callback cleanup, etc., all look racy as heck.
>   * And why is cb_set an atomic? */
>  
> -int do_probe_callback(void *data)
> +static int do_probe_callback(void *data)
>  {
>   struct nfs4_client *clp = data;
>   struct nfs4_callback *cb = &clp->cl_callback;
> 

Re: [PATCH 6/24] make atomic_read() behave consistently on frv

2007-08-14 Thread Arnd Bergmann

On Tuesday 14 August 2007, Paul E. McKenney wrote:
> > #define order(x) asm volatile("" : "+m" (x))
> 
> There was something very similar discussed earlier in this thread,
> with quite a bit of debate as to exactly what the "m" flag should
> look like.  I suggested something similar named ACCESS_ONCE in the
> context of RCU (http://lkml.org/lkml/2007/7/11/664):
> 
> #define ACCESS_ONCE(x) (*(volatile typeof(x) *)&(x))
> 
> The nice thing about this is that it works for both loads and stores.
> Not clear that order() above does this -- I get compiler errors when
> I try something like "b = order(a)" or "order(a) = 1" using gcc 4.1.2.

Well, it serves a different purpose: While your ACCESS_ONCE() macro is
an lvalue, the order() macro is a statement that can be used in place
of the barrier() macro. order() is the most lightweight barrier as it
only enforces ordering on a single variable in the compiler, but does
not have any side-effects visible to other threads, like the cache
line access in ACCESS_ONCE has.

Arnd <><
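
The difference between the two macros can be tried in user space with
the sketch below. Both definitions are taken from the thread; the only
change is the spelling __typeof__ / __asm__ so that it also builds in
strict ISO mode, and both still rely on GCC extensions. The variable
flag and the function access_once_demo() are made up for the example.
ACCESS_ONCE() yields an lvalue usable for a load or a store, while
order() is a standalone statement that only constrains the compiler
with respect to one variable.

```c
/* Both macros rely on GCC extensions (typeof, extended asm). */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))
#define order(x) __asm__ __volatile__("" : "+m" (x))

static int flag;

static int access_once_demo(void)
{
	int seen;

	ACCESS_ONCE(flag) = 1;		/* used as an lvalue: a store */
	seen = ACCESS_ONCE(flag);	/* used as an rvalue: a load */

	flag = seen + 1;
	order(flag);	/* statement: the compiler must treat flag as read and written here */
	flag = flag + 1;

	return flag;
}
```

Writing "b = order(a)" or "order(a) = 1" fails to compile because the
asm statement has no value, which matches the errors reported above.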

RE: why use memcpy when memmove is there?

2007-08-14 Thread David Schwartz


> Hi, All
>
> We were looking at  "[kernel]/lib/string.c"
> (http://lxr.linux.no/source/lib/string.c#L500)
>
> memcpy copies a part of memory to some other location
> but It will not work for all cases of overlapping
> blocks.(if the start of destination block falls
> between the source block)
>
> while memove copes with overlapping areas.
>
> then why is memcpy present in the sources can't we
> simply do
>
> "#define memcpy memmove" in include/linux/string.h
>
> or am I missing something?

Suppose you have two vehicles, an economy car and a semi truck. The truck
can go everyplace the car can go, and the truck can carry a bigger load. So
why would you ever use the car? Answer: The car uses less gas and you don't
always need a truck.

Think about what it takes to be able to copy one block of memory to another
location when those locations might overlap. You don't always need that.

DS
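
To see what the extra capability costs, here is a minimal sketch of the
two strategies. These are not the kernel's lib/string.c routines;
copy_forward() and copy_overlap_safe() are illustrative names. The
forward loop is all a memcpy-style copy needs. The overlap-safe version
has to compare the addresses first and, when the destination starts
inside the source block, copy backward.

```c
#include <stddef.h>
#include <stdint.h>

/* Forward byte copy: correct only if dst does not start inside [src, src + n). */
static void *copy_forward(void *dst, const void *src, size_t n)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	while (n--)
		*d++ = *s++;
	return dst;
}

/* Overlap-safe copy: one extra comparison, and a backward loop when needed. */
static void *copy_overlap_safe(void *dst, const void *src, size_t n)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	if ((uintptr_t)d <= (uintptr_t)s || (uintptr_t)d >= (uintptr_t)s + n)
		return copy_forward(dst, src, n);

	while (n--)
		d[n] = s[n];	/* copy from the end so the source is not clobbered */
	return dst;
}
```

Copying four bytes of "abcdef" to offset 2 gives "ababcd" with the
overlap-safe version, while the forward loop overwrites its own source
and produces "ababab".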



Re: [RFC 4/9] Atomic reclaim: Save irq flags in vmscan.c

On Tue, Aug 14, 2007 at 02:48:31PM -0700, Christoph Lameter wrote:
> On Tue, 14 Aug 2007, Andi Kleen wrote:
> 
> > > But that still creates lots of overhead each time we take the lru lock!
> > 
> > A lot of overhead in what way? Setting a flag in a cache hot
> > per CPU data variable shouldn't be more than a few cycles.
> 
> Could you be a bit more specific? Where do you want to place the data?

DEFINE_PER_CPU(int, zone_flag);

	get_cpu();  // likely already true and then not needed
	__get_cpu(zone_flag) = 1;
	/* wmb is implied in spin_lock I think */
	spin_lock(&zone->lru_lock);
	...
	spin_unlock(&zone->lru_lock);
	__get_cpu(zone_flag) = 0;
	put_cpu();

Interrupt handler

	if (!__get_cpu(zone_flag)) {
		do things with zone locks
	}

The interrupt handler shouldn't touch zone_flag. If it wants
to it would need to be converted to a local_t and incremented/decremented
(should be about the same cost at least on architectures with sane
local_t implementation) 

-Andi
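
The flag protocol above can be mimicked in user space, with a signal
handler standing in for the interrupt handler. This is only an analogy
under stated assumptions: there is no per-CPU data and no real lock
here, and flag_demo() and the counters are invented names. It shows the
control flow only: the "interrupt" checks the flag and backs off when
it arrives inside the critical section.

```c
#include <signal.h>

static volatile sig_atomic_t zone_flag;	/* set while the "lock" is held */
static volatile sig_atomic_t handled;	/* handler did its work */
static volatile sig_atomic_t skipped;	/* handler backed off */

static void handler(int sig)
{
	(void)sig;
	if (!zone_flag)
		handled++;	/* would be safe to take the lock here */
	else
		skipped++;	/* interrupted the critical section: stay away */
}

static int flag_demo(void)
{
	handled = 0;
	skipped = 0;

	signal(SIGINT, handler);
	zone_flag = 1;
	raise(SIGINT);		/* arrives inside the critical section */
	zone_flag = 0;

	signal(SIGINT, handler);	/* re-arm in case the handler was reset */
	raise(SIGINT);		/* arrives outside it */

	signal(SIGINT, SIG_DFL);
	return handled == 1 && skipped == 1;
}
```

Note that the handler only reads the flag, in line with the remark
above that the interrupt handler should not modify zone_flag.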


RE: O_NONBLOCK is broken

2007-08-14 Thread David Schwartz


> The problem is, O_NONBLOCK flag is not attached to file *descriptor*,
> but to a "file description" mentioned in fcntl manpage:
[snip]
> We don't know whether our stdout descriptor #1 is shared with
> anyone or not,
> and if we were started from shell, it typically is. That's why we try to
> restore flags ASAP.

> But "ASAP" isn't soon enough. Between setting and clearing O_NONBLOCK,
> other process which share fd #1 with us may well be affected
> by file suddenly becoming O_NONBLOCK under its feet.
>
> Worse, other process can do the same
> fcntl(1, F_SETFL, fl | O_NONBLOCK);
> ...
> fcntl(1, F_SETFL, fl);
> sequence, and first fcntl can return flags with O_NONBLOCK set
> (because of
> us), and then second fcntl will set O_NONBLOCK permanently, which is not
> what was intended!
[snip]
> P.S. Hmm, it seems fcntl GETFL/SETFL interface seems to be racy:
>
> int fl = fcntl(fd, F_GETFL, 0);
> /* other process can muck with file flags here */
> fcntl(fd, F_SETFL, fl | SOME_BITS);
>
> How can I *atomically* add or remove bits from file flags?

Simply put, you cannot change file flags on a shared descriptor. It is a bug
to do so, a bug that is sadly present in many common programs.

I like the idea of being able to specify blocking or non-blocking behavior
in the operation. It is not too uncommon to want to perform blocking
operations sometimes and non-blocking operations other times for the same
object and having to keep changing modes, even if it wasn't racy, is a pain.

However, there's a much more fundamental problem here. Processes need a good
way to get exclusive use of their stdin, stdout, and stderr streams and
there is no good way. Perhaps an "exclusive lock" that blocked all other
process' attempts to use the terminal until it was released would be a good
thing.

DS
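
Both points can be checked with a short sketch. The function names are
made up for the example. The first function shows that O_NONBLOCK set
through a dup()ed descriptor is visible through the original, because
both refer to the same open file description. The second shows one
existing way to ask for non-blocking behavior per operation: the
MSG_DONTWAIT flag to recv(), which is available on Linux but applies to
sockets only, so it does not help with a terminal on fd 1.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if O_NONBLOCK set via a dup()ed descriptor shows up on the original. */
static int flags_are_shared(void)
{
	int p[2], copy, shared;

	if (pipe(p) < 0)
		return -1;
	copy = dup(p[0]);
	if (copy < 0) {
		close(p[0]);
		close(p[1]);
		return -1;
	}

	/* Racy read-modify-write, as discussed above; fine in a single process. */
	fcntl(copy, F_SETFL, fcntl(copy, F_GETFL) | O_NONBLOCK);
	shared = (fcntl(p[0], F_GETFL) & O_NONBLOCK) != 0;

	close(copy);
	close(p[0]);
	close(p[1]);
	return shared;
}

/* Returns 1 if one recv() avoided blocking without changing the file flags. */
static int per_call_nonblock(void)
{
	int sv[2], ok;
	char c;
	ssize_t r;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;

	r = recv(sv[0], &c, 1, MSG_DONTWAIT);	/* nothing queued: must not block */
	ok = r < 0 && (errno == EAGAIN || errno == EWOULDBLOCK) &&
	     !(fcntl(sv[0], F_GETFL) & O_NONBLOCK);

	close(sv[0]);
	close(sv[1]);
	return ok;
}
```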



Re: Regression in 2.6.23-rc2-mm2, mounting cpusets causes a hang

On Tue, 14 Aug 2007, Lee Schermerhorn wrote:

> > Ok then you did not have a NUMA system configured. So its okay for the 
> > dummies to ignore the stuff. CONFIG_NODES_SHIFT is a constant and does not 
> > change. The first bit is always set.
> 
> The first bit [node 0] is only set for the N_ONLINE [and N_POSSIBLE]
> mask.  We could add the static init for the other masks, but since
> non-numa platforms are going through the __build_all_zonelists, they
> might as well set the MEMORY bits explicitly.  Or, maybe you'll
> disagree ;-).

The bitmaps can be completely ignored if !NUMA.

In the non NUMA case we define

static inline int node_state(int node, enum node_states state)
{
	return node == 0;
}

So its always true for node 0. The "bit" is set.

We are trying to get cpusets to work with !NUMA?





Re: [PATCH] [463/2many] MAINTAINERS - STRADIS MPEG-2 DECODER DRIVER

2007-08-14 Thread Nathan Laredo

No, not all of them.  Just the ones that show my name at the top of
the source file.

cs8420.h, ibmmpeg2.h, saa7121.h, saa7146*.h, stradis.c

stradis.c includes all of the above .h files, so they weren't listed separately.

Additionally, it should probably be noted that I still use a 2.2
kernel with this device and haven't tested the 3rd party ports to 2.6
yet but I've never actually gotten a bug report yet from someone
claiming it isn't working (but I'm pretty confident it will crash).

Thanks,
- Nathan Laredo
[EMAIL PROTECTED]

On 8/13/07, Joe Perches <[EMAIL PROTECTED]> wrote:
> On Mon, 2007-08-13 at 21:04 -0700, Nathan Laredo wrote:
> > Well, technically speaking, there are multiple files to the stradis
> > driver, not just stradis.c.
>
> These files seem to be shared between drivers.
>
> the Kconfig file shows:
>
> obj-$(CONFIG_VIDEO_STRADIS) += stradis.o
>
> as the only STRADIS file.
>
> Are you the maintainer for other entries?
> Should I add these entries?
>
> drivers/media/video/cs8420.h
> drivers/media/video/ibmmpeg2.h
> drivers/media/video/saa5249.c
> drivers/media/video/saa7110.c
> drivers/media/video/saa7111.c
> drivers/media/video/saa7121.h
> drivers/media/video/saa7146*.h
> drivers/media/video/saa7185.h
> drivers/media/video/saa7196.h
> drivers/media/video/videodev.c
>
>
>
>

Re: Regression in 2.6.23-rc2-mm2, mounting cpusets causes a hang

2007-08-14 Thread Lee Schermerhorn

On Tue, 2007-08-14 at 14:28 -0700, Christoph Lameter wrote:
> On Tue, 14 Aug 2007, Serge E. Hallyn wrote:
> 
> > CONFIG_NODES_SHIFT was unset, so MAX_NUMNODES=1, so
> > node_state() and node_set_state() are dummies.
> 
> Ok then you did not have a NUMA system configured. So its okay for the 
> dummies to ignore the stuff. CONFIG_NODES_SHIFT is a constant and does not 
> change. The first bit is always set.

The first bit [node 0] is only set for the N_ONLINE [and N_POSSIBLE]
mask.  We could add the static init for the other masks, but since
non-numa platforms are going through the __build_all_zonelists, they
might as well set the MEMORY bits explicitly.  Or, maybe you'll
disagree ;-).

Lee


Re: [RFC 4/9] Atomic reclaim: Save irq flags in vmscan.c

On Tue, 14 Aug 2007, Andi Kleen wrote:

> > But that still creates lots of overhead each time we take the lru lock!
> 
> A lot of overhead in what way? Setting a flag in a cache hot
> per CPU data variable shouldn't be more than a few cycles.

Could you be a bit more specific? Where do you want to place the data?

What we are talking about is

atomic_inc(&zone->reclaim_cpu[smp_processor_id()]);
smp_wmb();
spin_lock(&zone->lru_lock);



spin_unlock(&zone->lru_lock);
smp_wmb();
atomic_dec(&zone->reclaim_cpu[smp_processor_id()]);

That is not light weight.


Re: [RFC 4/9] Atomic reclaim: Save irq flags in vmscan.c

On Tue, Aug 14, 2007 at 02:37:27PM -0700, Christoph Lameter wrote:
> On Tue, 14 Aug 2007, Andi Kleen wrote:
> 
> > > We already have such a flag in the zone structure
> > 
> > Zone structure is not strictly CPU local so it's broader
> > than needed. But it might work.
> 
> We could convert this into a per cpu array?

Perhaps. That would make it more expensive to read for
its current users though. 

> But that still creates lots of overhead each time we take the lru lock!

A lot of overhead in what way? Setting a flag in a cache hot
per CPU data variable shouldn't be more than a few cycles.

-Andi

[PATCH 3/14] UML - Style fixes pass 1

Formatting changes in the files which have been changed in the
tt-removal patchset so far.  These include:
	copyright updates
	header file trimming
	style fixes
	adding severity to printks
	indenting Kconfig help according to the predominant kernel style

These changes should be entirely non-functional.

Signed-off-by: Jeff Dike <[EMAIL PROTECTED]>
--
 arch/um/Kconfig  |  199 +++
 arch/um/Kconfig.char |  139 ++---
 arch/um/Kconfig.debug|   24 +--
 arch/um/Makefile |8 -
 arch/um/include/sysdep-i386/sigcontext.h |   12 -
 arch/um/kernel/Makefile  |2 
 arch/um/kernel/init_task.c   |8 -
 arch/um/kernel/smp.c |   31 ++--
 arch/um/kernel/trap.c|  111 -
 arch/um/os-Linux/Makefile|2 
 arch/um/os-Linux/sys-i386/Makefile   |2 
 arch/um/os-Linux/time.c  |   38 ++---
 arch/um/os-Linux/tls.c   |4 
 arch/um/sys-i386/Makefile|4 
 arch/um/sys-i386/ptrace_user.c   |   13 --
 15 files changed, 283 insertions(+), 314 deletions(-)

Index: linux-2.6.22/arch/um/kernel/trap.c
===
--- linux-2.6.22.orig/arch/um/kernel/trap.c 2007-08-10 15:43:45.0 -0400
+++ linux-2.6.22/arch/um/kernel/trap.c  2007-08-13 11:58:48.0 -0400
@@ -1,38 +1,24 @@
 /*
- * Copyright (C) 2000, 2001 Jeff Dike ([EMAIL PROTECTED])
+ * Copyright (C) 2000 - 2007 Jeff Dike ([EMAIL PROTECTED],linux.intel}.com)
  * Licensed under the GPL
  */
 
-#include "linux/kernel.h"
-#include "asm/errno.h"
-#include "linux/sched.h"
-#include "linux/mm.h"
-#include "linux/spinlock.h"
-#include "linux/init.h"
-#include "linux/ptrace.h"
-#include "asm/semaphore.h"
-#include "asm/pgtable.h"
-#include "asm/pgalloc.h"
-#include "asm/tlbflush.h"
-#include "asm/a.out.h"
-#include "asm/current.h"
-#include "asm/irq.h"
-#include "sysdep/sigcontext.h"
-#include "kern_util.h"
-#include "as-layout.h"
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
 #include "arch.h"
-#include "kern.h"
-#include "chan_kern.h"
-#include "mconsole_kern.h"
-#include "mem.h"
-#include "mem_kern.h"
-#include "sysdep/sigcontext.h"
-#include "sysdep/ptrace.h"
-#include "os.h"
-#include "skas.h"
+#include "as-layout.h"
+#include "kern_util.h"
 #include "os.h"
+#include "sysdep/sigcontext.h"
 
-/* Note this is constrained to return 0, -EFAULT, -EACCESS, -ENOMEM by segv(). */
+/*
+ * Note this is constrained to return 0, -EFAULT, -EACCESS, -ENOMEM by
+ * segv().
+ */
 int handle_page_fault(unsigned long address, unsigned long ip,
  int is_write, int is_user, int *code_out)
 {
@@ -46,31 +32,33 @@ int handle_page_fault(unsigned long addr
 
*code_out = SEGV_MAPERR;
 
-   /* If the fault was during atomic operation, don't take the fault, just
-* fail. */
+   /*
+* If the fault was during atomic operation, don't take the fault, just
+* fail.
+*/
if (in_atomic())
goto out_nosemaphore;
 
down_read(&mm->mmap_sem);
vma = find_vma(mm, address);
-   if(!vma)
+   if (!vma)
goto out;
-   else if(vma->vm_start <= address)
+   else if (vma->vm_start <= address)
goto good_area;
-   else if(!(vma->vm_flags & VM_GROWSDOWN))
+   else if (!(vma->vm_flags & VM_GROWSDOWN))
goto out;
-   else if(is_user && !ARCH_IS_STACKGROW(address))
+   else if (is_user && !ARCH_IS_STACKGROW(address))
goto out;
-   else if(expand_stack(vma, address))
+   else if (expand_stack(vma, address))
goto out;
 
 good_area:
*code_out = SEGV_ACCERR;
-   if(is_write && !(vma->vm_flags & VM_WRITE))
+   if (is_write && !(vma->vm_flags & VM_WRITE))
goto out;
 
/* Don't require VM_READ|VM_EXEC for write faults! */
-   if(!is_write && !(vma->vm_flags & (VM_READ | VM_EXEC)))
+   if (!is_write && !(vma->vm_flags & (VM_READ | VM_EXEC)))
goto out;
 
do {
@@ -96,9 +84,10 @@ survive:
pud = pud_offset(pgd, address);
pmd = pmd_offset(pud, address);
pte = pte_offset_kernel(pmd, address);
-   } while(!pte_present(*pte));
+   } while (!pte_present(*pte));
err = 0;
-   /* The below warning was added in place of
+   /*
+* The below warning was added in place of
 *  pte_mkyoung(); if (is_write) pte_mkdirty();
 * If it's triggered, we'd see normally a hang here (a clean pte is
 * marked read-only to emulate the dirty bit).
@@ -112,7 +101,7 @@ survive:
 out:
up_read(&mm->mmap_sem);
 out_nosemaphore:
-   return(err);
+   return

[PATCH 11/14] UML - Rename pt_regs general-purpose register file

Before the removal of tt mode, access to a register on the skas-mode
side of a pt_regs struct looked like pt_regs.regs.skas.regs.regs[FOO].
This was bad enough, but it became pt_regs.regs.regs.regs[FOO] with
the removal of the union from the middle.  To get rid of the run of
three "regs", the last field is renamed to "gp".
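
The effect of the rename can be sketched in a few lines of userspace C.
This is only an illustration, not the UML headers: the struct names
ending in _sketch, the value of MAX_REG_NR and the index FOO are
hypothetical stand-ins, and only the innermost two levels of the
access path are modelled.

```c
#include <assert.h>

#define MAX_REG_NR 17	/* hypothetical size, for illustration only */
#define FOO 3		/* hypothetical register index */

/* After the patch the general-purpose register file is called "gp". */
struct uml_pt_regs_sketch {
	unsigned long gp[MAX_REG_NR];	/* was "regs" before the rename */
};

/* Hypothetical wrapper standing in for the enclosing pt_regs struct. */
struct pt_regs_sketch {
	struct uml_pt_regs_sketch regs;
};

/* Accessor in the style of the UPT_* macros, which now read (r)->gp. */
#define UPT_FOO(r) ((r)->gp[FOO])

/* Store a value through the full path and read it back via the macro. */
unsigned long set_and_get_foo(unsigned long value)
{
	struct pt_regs_sketch pt = { { { 0 } } };

	assert(FOO < MAX_REG_NR);	/* index must lie inside the register file */
	pt.regs.gp[FOO] = value;	/* formerly pt.regs.regs[FOO] */
	return UPT_FOO(&pt.regs);
}
```

With the field called "gp", the access path ends in a name that says
what the array holds instead of repeating "regs" once more.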

Signed-off-by: Jeff Dike <[EMAIL PROTECTED]>
--
 arch/um/include/sysdep-i386/ptrace.h   |   36 +-
 arch/um/include/sysdep-x86_64/ptrace.h |   60 +++---
 arch/um/kernel/process.c   |4 +-
 arch/um/kernel/skas/syscall.c  |2 -
 arch/um/os-Linux/registers.c   |6 +--
 arch/um/os-Linux/skas/process.c|4 +-
 arch/um/sys-i386/signal.c  |   64 -
 arch/um/sys-x86_64/signal.c|6 +--
 arch/um/sys-x86_64/tls.c   |2 -
 9 files changed, 92 insertions(+), 92 deletions(-)

Index: linux-2.6.22/arch/um/include/sysdep-i386/ptrace.h
===
--- linux-2.6.22.orig/arch/um/include/sysdep-i386/ptrace.h  2007-08-13 15:16:57.0 -0400
+++ linux-2.6.22/arch/um/include/sysdep-i386/ptrace.h   2007-08-13 16:58:20.0 -0400
@@ -53,7 +53,7 @@ extern int sysemu_supported;
 #endif
 
 struct uml_pt_regs {
-   unsigned long regs[MAX_REG_NR];
+   unsigned long gp[MAX_REG_NR];
unsigned long fp[HOST_FP_SIZE];
unsigned long xfp[HOST_XFP_SIZE];
struct faultinfo faultinfo;
@@ -63,23 +63,23 @@ struct uml_pt_regs {
 
 #define EMPTY_UML_PT_REGS { }
 
-#define UPT_IP(r) REGS_IP((r)->regs)
-#define UPT_SP(r) REGS_SP((r)->regs)
-#define UPT_EFLAGS(r) REGS_EFLAGS((r)->regs)
-#define UPT_EAX(r) REGS_EAX((r)->regs)
-#define UPT_EBX(r) REGS_EBX((r)->regs)
-#define UPT_ECX(r) REGS_ECX((r)->regs)
-#define UPT_EDX(r) REGS_EDX((r)->regs)
-#define UPT_ESI(r) REGS_ESI((r)->regs)
-#define UPT_EDI(r) REGS_EDI((r)->regs)
-#define UPT_EBP(r) REGS_EBP((r)->regs)
+#define UPT_IP(r) REGS_IP((r)->gp)
+#define UPT_SP(r) REGS_SP((r)->gp)
+#define UPT_EFLAGS(r) REGS_EFLAGS((r)->gp)
+#define UPT_EAX(r) REGS_EAX((r)->gp)
+#define UPT_EBX(r) REGS_EBX((r)->gp)
+#define UPT_ECX(r) REGS_ECX((r)->gp)
+#define UPT_EDX(r) REGS_EDX((r)->gp)
+#define UPT_ESI(r) REGS_ESI((r)->gp)
+#define UPT_EDI(r) REGS_EDI((r)->gp)
+#define UPT_EBP(r) REGS_EBP((r)->gp)
 #define UPT_ORIG_EAX(r) ((r)->syscall)
-#define UPT_CS(r) REGS_CS((r)->regs)
-#define UPT_SS(r) REGS_SS((r)->regs)
-#define UPT_DS(r) REGS_DS((r)->regs)
-#define UPT_ES(r) REGS_ES((r)->regs)
-#define UPT_FS(r) REGS_FS((r)->regs)
-#define UPT_GS(r) REGS_GS((r)->regs)
+#define UPT_CS(r) REGS_CS((r)->gp)
+#define UPT_SS(r) REGS_SS((r)->gp)
+#define UPT_DS(r) REGS_DS((r)->gp)
+#define UPT_ES(r) REGS_ES((r)->gp)
+#define UPT_FS(r) REGS_FS((r)->gp)
+#define UPT_GS(r) REGS_GS((r)->gp)
 
 #define UPT_SYSCALL_ARG1(r) UPT_EBX(r)
 #define UPT_SYSCALL_ARG2(r) UPT_ECX(r)
@@ -161,7 +161,7 @@ struct syscall_args {
 #define UPT_SET_SYSCALL_RETURN(r, res) \
REGS_SET_SYSCALL_RETURN((r)->regs, (res))
 
-#define UPT_RESTART_SYSCALL(r) REGS_RESTART_SYSCALL((r)->regs)
+#define UPT_RESTART_SYSCALL(r) REGS_RESTART_SYSCALL((r)->gp)
 
 #define UPT_ORIG_SYSCALL(r) UPT_EAX(r)
 #define UPT_SYSCALL_NR(r) UPT_ORIG_EAX(r)
Index: linux-2.6.22/arch/um/include/sysdep-x86_64/ptrace.h
===
--- linux-2.6.22.orig/arch/um/include/sysdep-x86_64/ptrace.h 2007-08-13 15:16:57.0 -0400
+++ linux-2.6.22/arch/um/include/sysdep-x86_64/ptrace.h 2007-08-13 16:58:20.0 -0400
@@ -85,7 +85,7 @@
 #define REGS_ERR(r) ((r)->fault_type)
 
 struct uml_pt_regs {
-   unsigned long regs[MAX_REG_NR];
+   unsigned long gp[MAX_REG_NR];
unsigned long fp[HOST_FP_SIZE];
struct faultinfo faultinfo;
long syscall;
@@ -94,36 +94,36 @@ struct uml_pt_regs {
 
 #define EMPTY_UML_PT_REGS { }
 
-#define UPT_RBX(r) REGS_RBX((r)->regs)
-#define UPT_RCX(r) REGS_RCX((r)->regs)
-#define UPT_RDX(r) REGS_RDX((r)->regs)
-#define UPT_RSI(r) REGS_RSI((r)->regs)
-#define UPT_RDI(r) REGS_RDI((r)->regs)
-#define UPT_RBP(r) REGS_RBP((r)->regs)
-#define UPT_RAX(r) REGS_RAX((r)->regs)
-#define UPT_R8(r) REGS_R8((r)->regs)
-#define UPT_R9(r) REGS_R9((r)->regs)
-#define UPT_R10(r) REGS_R10((r)->regs)
-#define UPT_R11(r) REGS_R11((r)->regs)
-#define UPT_R12(r) REGS_R12((r)->regs)
-#define UPT_R13(r) REGS_R13((r)->regs)
-#define UPT_R14(r) REGS_R14((r)->regs)
-#define UPT_R15(r) REGS_R15((r)->regs)
-#define UPT_CS(r) REGS_CS((r)->regs)
-#define UPT_FS_BASE(r) REGS_FS_BASE((r)->regs)
-#define UPT_FS(r) REGS_FS((r)->regs)
-#define UPT_GS_BASE(r) REGS_GS_BASE((r)->regs)
-#define UPT_GS(r) REGS_GS((r)->regs)
-#define UPT_DS(r) REGS_DS((r)->regs)
-#define UPT_ES(r) REGS_ES((r)->regs)
-#define UPT_CS(r) REGS_CS((r)->regs)
-#define UPT_SS(r) REGS_SS((r)->regs)
-#define UPT_ORIG_RAX(r) REGS_ORIG_RAX((r)->regs)
+#define

[PATCH 10/14] UML - Fold mmu_context_skas into mm_context

This patch folds mmu_context_skas into struct mm_context, changing all
users of these structures as needed.

Signed-off-by: Jeff Dike <[EMAIL PROTECTED]>
--
 arch/um/include/skas/mmu-skas.h |   23 -
 arch/um/include/tlb.h   |2 -
 arch/um/include/um_mmu.h|   18 +---
 arch/um/kernel/exec.c   |4 +--
 arch/um/kernel/reboot.c |2 -
 arch/um/kernel/skas/mmu.c   |   12 +--
 arch/um/kernel/skas/process.c   |2 -
 arch/um/kernel/tlb.c|   43 +++-
 arch/um/sys-i386/ldt.c  |   17 +++
 arch/um/sys-x86_64/syscalls.c   |2 -
 include/asm-um/ldt.h|4 ---
 include/asm-um/mmu_context.h|4 +--
 12 files changed, 58 insertions(+), 75 deletions(-)

Index: linux-2.6.22/arch/um/include/tlb.h
===
--- linux-2.6.22.orig/arch/um/include/tlb.h 2007-08-13 15:16:57.0 -0400
+++ linux-2.6.22/arch/um/include/tlb.h  2007-08-13 16:57:03.0 -0400
@@ -33,7 +33,7 @@ struct host_vm_op {
 extern void force_flush_all(void);
 extern void fix_range_common(struct mm_struct *mm, unsigned long start_addr,
  unsigned long end_addr, int force,
-int (*do_ops)(union mm_context *,
+int (*do_ops)(struct mm_context *,
   struct host_vm_op *, int, int,
   void **));
 extern int flush_tlb_kernel_range_common(unsigned long start,
Index: linux-2.6.22/arch/um/include/um_mmu.h
===
--- linux-2.6.22.orig/arch/um/include/um_mmu.h  2007-08-13 15:16:57.0 -0400
+++ linux-2.6.22/arch/um/include/um_mmu.h   2007-08-13 16:57:03.0 -0400
@@ -7,10 +7,22 @@
 #define __ARCH_UM_MMU_H
 
 #include "uml-config.h"
-#include "mmu-skas.h"
+#include "mm_id.h"
+#include "asm/ldt.h"
 
-typedef union mm_context {
-   struct mmu_context_skas skas;
+typedef struct mm_context {
+   struct mm_id id;
+   unsigned long last_page_table;
+#ifdef CONFIG_3_LEVEL_PGTABLES
+   unsigned long last_pmd;
+#endif
+   struct uml_ldt ldt;
 } mm_context_t;
 
+extern void __switch_mm(struct mm_id * mm_idp);
+
+/* Avoid tangled inclusion with asm/ldt.h */
+extern long init_new_ldt(struct mm_context *to_mm, struct mm_context *from_mm);
+extern void free_ldt(struct mm_context *mm);
+
 #endif
Index: linux-2.6.22/arch/um/kernel/tlb.c
===
--- linux-2.6.22.orig/arch/um/kernel/tlb.c  2007-08-13 15:16:57.0 -0400
+++ linux-2.6.22/arch/um/kernel/tlb.c   2007-08-13 16:57:03.0 -0400
@@ -14,8 +14,8 @@
 
 static int add_mmap(unsigned long virt, unsigned long phys, unsigned long len,
unsigned int prot, struct host_vm_op *ops, int *index,
-   int last_filled, union mm_context *mmu, void **flush,
-   int (*do_ops)(union mm_context *, struct host_vm_op *,
+   int last_filled, struct mm_context *mmu, void **flush,
+   int (*do_ops)(struct mm_context *, struct host_vm_op *,
  int, int, void **))
 {
__u64 offset;
@@ -52,8 +52,8 @@ static int add_mmap(unsigned long virt, 
 
 static int add_munmap(unsigned long addr, unsigned long len,
  struct host_vm_op *ops, int *index, int last_filled,
- union mm_context *mmu, void **flush,
- int (*do_ops)(union mm_context *, struct host_vm_op *,
+ struct mm_context *mmu, void **flush,
+ int (*do_ops)(struct mm_context *, struct host_vm_op *,
int, int, void **))
 {
struct host_vm_op *last;
@@ -82,8 +82,8 @@ static int add_munmap(unsigned long addr
 
 static int add_mprotect(unsigned long addr, unsigned long len,
unsigned int prot, struct host_vm_op *ops, int *index,
-   int last_filled, union mm_context *mmu, void **flush,
-   int (*do_ops)(union mm_context *, struct host_vm_op *,
+   int last_filled, struct mm_context *mmu, void **flush,
+   int (*do_ops)(struct mm_context *, struct host_vm_op *,
  int, int, void **))
 {
struct host_vm_op *last;
@@ -117,8 +117,8 @@ static int add_mprotect(unsigned long ad
 static inline int update_pte_range(pmd_t *pmd, unsigned long addr,
   unsigned long end, struct host_vm_op *ops,
   int last_op, int *op_index, int force,
-  union mm_context *mmu, void **flush,
-  int (*do_ops)(union mm_context *,
+

[PATCH 4/14] UML - Throw out CHOOSE_MODE

The next stage after removing code which depends on CONFIG_MODE_TT is
removing the CHOOSE_MODE abstraction, which provided both compile-time
and run-time branching to either tt-mode or skas-mode code.

This patch removes choose-mode.h and all inclusions of it, and
replaces all CHOOSE_MODE invocations with the skas branch.  This
leaves a number of trivial functions which will be dealt with in a
later patch.
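
As a rough illustration of why the replacement is mechanical, the
macros from the removed choose-mode.h can be exercised in isolation.
This is a sketch under stated assumptions: tt_answer() and
skas_answer() are hypothetical stand-ins for a tt-mode and a skas-mode
implementation, and the variadic macro is spelled with C99
__VA_ARGS__ rather than the GNU "args..." form the header used.

```c
#include <assert.h>

/* With tt mode gone, the header always selected the skas argument. */
#define CHOOSE_MODE(tt, skas) (skas)

/* C99 spelling of the variadic helper from choose-mode.h. */
#define CHOOSE_MODE_PROC(tt, skas, ...) \
	CHOOSE_MODE(tt(__VA_ARGS__), skas(__VA_ARGS__))

/* Hypothetical stand-ins for the two mode-specific implementations. */
int tt_answer(int x)   { return x + 1; }
int skas_answer(int x) { return x * 2; }

/* Call site as it looked before the patch. */
int via_choose_mode(int x)
{
	return CHOOSE_MODE_PROC(tt_answer, skas_answer, x);
}

/* Call site as it looks after the patch: the skas branch, directly. */
int direct_skas(int x)
{
	return skas_answer(x);
}

/* Sanity check: both call sites agree for a given argument. */
void check_same(int x)
{
	assert(via_choose_mode(x) == direct_skas(x));
}
```

Because the macro discards its first argument during preprocessing,
the tt-mode call never reaches the compiler, so replacing each
invocation with its skas branch does not change behaviour.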

There are some changes in the uaccess and tls support which go
somewhat beyond this and eliminate some of the now-redundant functions.

Signed-off-by: Jeff Dike <[EMAIL PROTECTED]>
--
 arch/um/drivers/chan_user.c|3 -
 arch/um/drivers/harddog_user.c |3 -
 arch/um/drivers/mconsole_kern.c|1 
 arch/um/include/choose-mode.h  |   20 
 arch/um/include/skas/uaccess-skas.h|   21 
 arch/um/include/sysdep-i386/ptrace.h   |   67 ---
 arch/um/include/sysdep-x86_64/ptrace.h |   81 ++---
 arch/um/include/um_mmu.h   |1 
 arch/um/include/um_uaccess.h   |   39 ++-
 arch/um/kernel/exec.c  |5 --
 arch/um/kernel/ksyms.c |   10 ++--
 arch/um/kernel/physmem.c   |2 
 arch/um/kernel/process.c   |   19 ++-
 arch/um/kernel/reboot.c|7 +-
 arch/um/kernel/skas/uaccess.c  |   12 ++--
 arch/um/kernel/syscall.c   |1 
 arch/um/kernel/time.c  |4 -
 arch/um/kernel/tlb.c   |   17 ++
 arch/um/kernel/um_arch.c   |   13 ++---
 arch/um/os-Linux/aio.c |5 --
 arch/um/os-Linux/main.c|   10 +---
 arch/um/os-Linux/signal.c  |6 --
 arch/um/os-Linux/start_up.c|1 
 arch/um/os-Linux/trap.c|1 
 arch/um/sys-i386/ldt.c |4 -
 arch/um/sys-i386/ptrace.c  |   17 ++
 arch/um/sys-i386/signal.c  |   12 
 arch/um/sys-i386/tls.c |7 +-
 arch/um/sys-x86_64/signal.c|9 ---
 arch/um/sys-x86_64/syscalls.c  |4 -
 include/asm-um/a.out.h |7 --
 include/asm-um/mmu_context.h   |   12 +---
 include/asm-um/processor-generic.h |1 
 include/asm-um/ptrace-i386.h   |   12 
 include/asm-um/tlbflush.h  |6 --
 include/asm-um/uaccess.h   |2 
 36 files changed, 131 insertions(+), 311 deletions(-)

Index: linux-2.6.22/arch/um/drivers/chan_user.c
===
--- linux-2.6.22.orig/arch/um/drivers/chan_user.c   2007-08-14 12:37:58.0 -0400
+++ linux-2.6.22/arch/um/drivers/chan_user.c 2007-08-14 13:25:58.0 -0400
@@ -264,8 +264,7 @@ void register_winch(int fd, struct tty_s
return;
 
pid = tcgetpgrp(fd);
-   if (!CHOOSE_MODE_PROC(is_tracer_winch, is_skas_winch, pid, fd, tty) &&
-   (pid == -1)) {
+   if (!is_skas_winch(pid, fd, tty) && (pid == -1)) {
thread = winch_tramp(fd, tty, _fd, );
if (thread < 0)
return;
Index: linux-2.6.22/arch/um/drivers/harddog_user.c
===
--- linux-2.6.22.orig/arch/um/drivers/harddog_user.c 2007-08-14 12:37:58.0 -0400
+++ linux-2.6.22/arch/um/drivers/harddog_user.c 2007-08-14 13:28:18.0 -0400
@@ -9,7 +9,6 @@
 #include "user.h"
 #include "mconsole.h"
 #include "os.h"
-#include "choose-mode.h"
 #include "mode.h"
 
 struct dog_data {
@@ -64,7 +63,7 @@ int start_watchdog(int *in_fd_ret, int *
}
else {
/* XXX The os_getpid() is not SMP correct */
-   sprintf(pid_buf, "%d", CHOOSE_MODE(tracing_pid, os_getpid()));
+   sprintf(pid_buf, "%d", os_getpid());
args = pid_args;
}
 
Index: linux-2.6.22/arch/um/include/choose-mode.h
===
--- linux-2.6.22.orig/arch/um/include/choose-mode.h 2007-08-14 12:37:58.0 -0400
+++ /dev/null   1970-01-01 00:00:00.0 +
@@ -1,20 +0,0 @@
-/* 
- * Copyright (C) 2002 Jeff Dike ([EMAIL PROTECTED])
- * Licensed under the GPL
- */
-
-#ifndef __CHOOSE_MODE_H__
-#define __CHOOSE_MODE_H__
-
-#include "uml-config.h"
-
-#define CHOOSE_MODE(tt, skas) (skas)
-
-#define CHOOSE_MODE_PROC(tt, skas, args...) \
-   CHOOSE_MODE(tt(args), skas(args))
-
-#ifndef __CHOOSE_MODE
-#define __CHOOSE_MODE(tt, skas) CHOOSE_MODE(tt, skas)
-#endif
-
-#endif
Index: linux-2.6.22/arch/um/include/sysdep-i386/ptrace.h
===
--- linux-2.6.22.orig/arch/um/include/sysdep-i386/ptrace.h  2007-08-14 12:37:58.0 -0400
+++ linux-2.6.22/arch/um/include/sysdep-i386/ptrace.h   2007-08-14 13:28:18.0 -0400
@@

[PATCH 14/14] UML - Replace clone with fork

Convert the boot-time host ptrace testing from clone to fork.  They were
essentially doing fork anyway.  This cleans up the code a bit, and makes
valgrind a bit happier about grinding it.
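
The fork-based shape of the check can be sketched as a standalone
userspace program fragment.  This is a simplified illustration, not
the code in start_up.c: the function names are hypothetical, the child
only stops itself with SIGSTOP, and the ptrace() calls that the real
test makes are left out.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Retry waitpid() when a signal interrupts it, in the spirit of CATCH_EINTR. */
static pid_t waitpid_retry(pid_t pid, int *status, int options)
{
	pid_t n;

	do {
		n = waitpid(pid, status, options);
	} while (n < 0 && errno == EINTR);
	return n;
}

/*
 * Fork a child that stops itself, and wait until the parent sees the
 * stop.  No stack has to be mmap'ed for the child, unlike with clone().
 * Returns the child pid, or -1 on failure.
 */
int start_stopped_child(void)
{
	int status;
	pid_t pid = fork();

	if (pid == 0) {
		kill(getpid(), SIGSTOP);	/* child: stop until told otherwise */
		_exit(0);
	} else if (pid < 0) {
		return -1;
	}

	/* WUNTRACED makes waitpid() report stopped children as well. */
	if (waitpid_retry(pid, &status, WUNTRACED) < 0 ||
	    !WIFSTOPPED(status) || WSTOPSIG(status) != SIGSTOP) {
		kill(pid, SIGKILL);
		waitpid_retry(pid, &status, 0);
		return -1;
	}
	return (int) pid;
}

/* Kill and reap the stopped child; nothing needs to be munmap'ed. */
int stop_child(int pid)
{
	int status;

	assert(pid > 0);
	if (kill(pid, SIGKILL) < 0)
		return -1;
	if (waitpid_retry(pid, &status, 0) < 0)
		return -1;
	return (WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL) ? 0 : -1;
}
```

The point of the sketch is the bookkeeping that disappears: with
fork() the child gets a copy of the parent's stack, so the stack
pointer no longer has to be threaded through the start and stop
helpers.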

Signed-off-by: Jeff Dike <[EMAIL PROTECTED]>
--
 arch/um/os-Linux/start_up.c |   55 
 1 file changed, 20 insertions(+), 35 deletions(-)

Index: linux-2.6.22/arch/um/os-Linux/start_up.c
===
--- linux-2.6.22.orig/arch/um/os-Linux/start_up.c   2007-08-10 15:59:56.0 -0400
+++ linux-2.6.22/arch/um/os-Linux/start_up.c 2007-08-10 16:16:32.0 -0400
@@ -25,7 +25,7 @@
 #include "registers.h"
 #include "skas_ptrace.h"
 
-static int ptrace_child(void *arg)
+static int ptrace_child(void)
 {
int ret;
int pid = os_getpid(), ppid = getppid();
@@ -90,31 +90,23 @@ static void non_fatal(char *fmt, ...)
fflush(stdout);
 }
 
-static int start_ptraced_child(void **stack_out)
+static int start_ptraced_child(void)
 {
-   void *stack;
-   unsigned long sp;
int pid, n, status;
 
-   stack = mmap(NULL, UM_KERN_PAGE_SIZE,
-PROT_READ | PROT_WRITE | PROT_EXEC,
-MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
-   if (stack == MAP_FAILED)
-   fatal_perror("check_ptrace : mmap failed");
-
-   sp = (unsigned long) stack + UM_KERN_PAGE_SIZE - sizeof(void *);
-   pid = clone(ptrace_child, (void *) sp, SIGCHLD, NULL);
-   if (pid < 0)
-   fatal_perror("start_ptraced_child : clone failed");
+   pid = fork();
+   if (pid == 0)
+   ptrace_child();
+   else if (pid < 0)
+   fatal_perror("start_ptraced_child : fork failed");
 
CATCH_EINTR(n = waitpid(pid, &status, WUNTRACED));
if (n < 0)
-   fatal_perror("check_ptrace : clone failed");
+   fatal_perror("check_ptrace : waitpid failed");
if (!WIFSTOPPED(status) || (WSTOPSIG(status) != SIGSTOP))
fatal("check_ptrace : expected SIGSTOP, got status = %d",
  status);
 
-   *stack_out = stack;
return pid;
 }
 
@@ -124,8 +116,7 @@ static int start_ptraced_child(void **st
  * So only for SYSEMU features we test mustpanic, while normal host features
  * must work anyway!
  */
-static int stop_ptraced_child(int pid, void *stack, int exitcode,
- int mustexit)
+static int stop_ptraced_child(int pid, int exitcode, int mustexit)
 {
int status, n, ret = 0;
 
@@ -145,8 +136,6 @@ static int stop_ptraced_child(int pid, v
ret = -1;
}
 
-   if (munmap(stack, UM_KERN_PAGE_SIZE) < 0)
-   fatal_perror("check_ptrace : munmap failed");
return ret;
 }
 
@@ -198,13 +187,12 @@ __uml_setup("nosysemu", nosysemu_cmd_par
 
 static void __init check_sysemu(void)
 {
-   void *stack;
unsigned long regs[MAX_REG_NR];
int pid, n, status, count=0;
 
non_fatal("Checking syscall emulation patch for ptrace...");
sysemu_supported = 0;
-   pid = start_ptraced_child(&stack);
+   pid = start_ptraced_child();
 
if (ptrace(PTRACE_SYSEMU, pid, 0, 0) < 0)
goto fail;
@@ -231,7 +219,7 @@ static void __init check_sysemu(void)
goto fail;
}
 
-   if (stop_ptraced_child(pid, stack, 0, 0) < 0)
+   if (stop_ptraced_child(pid, 0, 0) < 0)
goto fail_stopped;
 
sysemu_supported = 1;
@@ -239,7 +227,7 @@ static void __init check_sysemu(void)
set_using_sysemu(!force_sysemu_disabled);
 
non_fatal("Checking advanced syscall emulation patch for ptrace...");
-   pid = start_ptraced_child(&stack);
+   pid = start_ptraced_child();
 
if ((ptrace(PTRACE_OLDSETOPTIONS, pid, 0,
   (void *) PTRACE_O_TRACESYSGOOD) < 0))
@@ -271,7 +259,7 @@ static void __init check_sysemu(void)
fatal("check_ptrace : expected SIGTRAP or "
  "(SIGTRAP | 0x80), got status = %d", status);
}
-   if (stop_ptraced_child(pid, stack, 0, 0) < 0)
+   if (stop_ptraced_child(pid, 0, 0) < 0)
goto fail_stopped;
 
sysemu_supported = 2;
@@ -282,18 +270,17 @@ static void __init check_sysemu(void)
return;
 
 fail:
-   stop_ptraced_child(pid, stack, 1, 0);
+   stop_ptraced_child(pid, 1, 0);
 fail_stopped:
non_fatal("missing\n");
 }
 
 static void __init check_ptrace(void)
 {
-   void *stack;
int pid, syscall, n, status;
 
non_fatal("Checking that ptrace can change system call numbers...");
-   pid = start_ptraced_child(&stack);
+   pid = start_ptraced_child();
 
if ((ptrace(PTRACE_OLDSETOPTIONS, pid, 0,
   (void *) PTRACE_O_TRACESYSGOOD) < 0))
@@ -323,7 +310,7 @@ static void __init check_ptrace(void)
break;
}
}
-

[PATCH 5/14] UML - Style fixes pass 2