date:20070904

[PATCH] Remove unneeded pointer intf from speedtch_upload_firmware() in drivers/usb/atm/speedtch.c

2007-09-04 Thread Micah Gruber

This trivial patch removes the unneeded pointer intf returned from 
usb_ifnum_to_if(), which is never used. The check for NULL can be simply done 
by if (!usb_ifnum_to_if(usb_dev, 2)).

Signed-off-by: Micah Gruber <[EMAIL PROTECTED]>
---

--- a/drivers/usb/atm/speedtch.c   2007-09-04 23:18:17.0 +0800
+++ b/drivers/usb/atm/speedtch.c   2007-09-05 00:51:19.0 +0800
@@ -251,7 +251,6 @@
 {
unsigned char *buffer;
struct usbatm_data *usbatm = instance->usbatm;
-   struct usb_interface *intf;
struct usb_device *usb_dev = usbatm->usb_dev;
int actual_length;
int ret = 0;
@@ -265,7 +264,7 @@
goto out;
}
 
-   if (!(intf = usb_ifnum_to_if(usb_dev, 2))) {
+   if (!usb_ifnum_to_if(usb_dev, 2)) {
ret = -ENODEV;
usb_dbg(usbatm, "%s: interface not found!\n", __func__);
goto out_free;

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH] Remove unneeded pointer newdp from dccp_v4_request_recv_sock() in net/dccp/ipv4.c

2007-09-04 Thread Micah Gruber

This trivial patch removes the unneeded pointer newdp, which is never used.

Signed-off-by: Micah Gruber <[EMAIL PROTECTED]>
---

--- a/net/dccp/ipv4.c   2007-09-04 23:18:42.0 +0800
+++ b/net/dccp/ipv4.c   2007-09-05 00:49:54.0 +0800
@@ -381,7 +381,6 @@
 {
struct inet_request_sock *ireq;
struct inet_sock *newinet;
-   struct dccp_sock *newdp;
struct sock *newsk;
 
if (sk_acceptq_is_full(sk))
@@ -396,7 +395,6 @@
 
sk_setup_caps(newsk, dst);
 
-   newdp  = dccp_sk(newsk);
newinet= inet_sk(newsk);
ireq   = inet_rsk(req);
newinet->daddr = ireq->rmt_addr;


[PATCH] Remove unneeded pointer iph from ipcomp6_input() in net/ipv6/ipcomp6.c

2007-09-04 Thread Micah Gruber

This trivial patch removes the unneeded pointer iph, which is never used.

Signed-off-by: Micah Gruber < [EMAIL PROTECTED]>
---

--- a/net/ipv6/ipcomp6.c   2007-09-04 23:18:43.0 +0800
+++ b/net/ipv6/ipcomp6.c   2007-09-05 00:48:05.0 +0800
@@ -65,7 +65,6 @@
 static int ipcomp6_input(struct xfrm_state *x, struct sk_buff *skb)
 {
int err = -ENOMEM;
-   struct ipv6hdr *iph;
struct ipv6_comp_hdr *ipch;
int plen, dlen;
struct ipcomp_data *ipcd = x->data;
@@ -79,7 +78,6 @@
skb->ip_summed = CHECKSUM_NONE;
 
/* Remove ipcomp header and decompress original payload */
-   iph = ipv6_hdr(skb);
ipch = (void *)skb->data;
skb->transport_header = skb->network_header + sizeof(*ipch);
__skb_pull(skb, sizeof(*ipch));



Re: [ANNOUNCE] Lguest64 - fatter puppies!

2007-09-04 Thread H. Peter Anvin


Steven Rostedt wrote:
> This is a formal announcement of Lguest64.
> 
> Most are aware of the little puppies (lguest32, or simply lguest, or in
> some circles "rustyvisor").  But this time the puppies ate a bit too
> much.  No more lean and mean puppies, now we got big fat lazy ones.
> Running on the hardware that's too lazy to do full virtualization. Yes,
> lguest now runs on x86_64!


Totally phat!

-hpa

Re: tbench regression - Why process scheduler has impact on tbench and why small per-cpu slab (SLUB) cache creates the scenario?

2007-09-04 Thread Zhang, Yanmin

On Tue, 2007-09-04 at 20:59 -0700, Christoph Lameter wrote:
> On Wed, 5 Sep 2007, Zhang, Yanmin wrote:
> 
> > 8) kmalloc-4096 order is 1 which means one slab consists of 2 objects. So a
> 
> You can change that by booting with slub_max_order=0. Then we can also use 
> the per cpu queues to get these order 0 objects which may speed up the 
> allocations because we do not have to take zone locks on slab allocation.
> 
> Note also that Andrew's tree has a page allocator pass through for SLUB 
> for 4k kmallocs bypassing slab completely. That may also address the 
> issue.
> 
> If you want SLUB to handle more objects in the 4k kmalloc cache 
> without going to the page allocator then you can boot f.e. with
> 
> slub_max_order=3 slub_min_objects=8
I tried this approach. The testing result showed 2.6.23-rc4 is about
2.5% better than 2.6.22. It really resolves the issue.

However, the approach applies the same policy to all slabs. Could we
implement a per-slab specific approach like direction b)?

> 
> which will result in a kmalloc-4096 that caches 8 objects.
> 
> > b) Change SLUB per-cpu slab cache, to cache more slabs instead of only 
> > one
> > slab. This way could use page->lru to creates a list linked in 
> > kmem_cache->cpu_slab[]
> > whose members need to be changed to as list_head. As for how many slabs 
> > could be in
> > a per-cpu slab cache, it might be implemented as a sysfs parameter under 
> > /sys/slab/XXX/.
> > Default could be 1 to satisfy big machines.
Above direction b) looks more flexible.

In addition, could the process scheduler be enhanced to schedule woken
processes first, or otherwise favor them? From a cache-hot point of view,
this might help performance, because the woken process and its waker
mostly share some data.

> Try the ways to address the issue that I mentioned above.
I really appreciate your kind comments!

-yanmin

2.6.23.rc5: Problem with procfs -- schedstat

2007-09-04 Thread Ph. Marek

Hello everybody,

I found a problem in /proc/self/schedstat: a simple "cat" can give "wrong"
results.

/proc# cat self/schedstat
91117 26027 2
/proc# cat self/schedstat
90691 27872 2
/proc# cat self/schedstat
995483 15675 3

/proc# cat self/schedstat
478050 124422 3
/proc# cat self/schedstat
87912 21539 2
/proc# cat self/schedstat
81382 19722 2
/proc# cat self/schedstat
87999 119699 2
/proc# cat self/schedstat
87192 25990 2
/proc# cat self/schedstat
80114 15113 2
3
/proc# cat self/schedstat
93064 28817 2
/proc# cat self/schedstat
90834 22816 2
/proc# cat self/schedstat
87806 37581 2
/proc# cat self/schedstat
80187 20283 2
3
/proc#

Please note the extra newline and possible other digits.

A strace reveals that cat does
open("self/schedstat", O_RDONLY|O_LARGEFILE) = 3
fstat64(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
read(3, "946533 98256 64\n", 4096)      = 16
write(1, "946533 98256 64\n", 16)       = 16
read(3, "8\n", 4096)                    = 2
write(1, "8\n", 2)                      = 2
read(3, "", 4096)                       = 0
close(3)                                = 0
close(1)                                = 0
exit_group(0)                           = ?

The simple fix would be to change the format in proc_pid_schedstat(), so
that it always returns the same number of bytes.

Or proc_info_read() gets changed - if fewer bytes than wanted (by userspace)
were returned, mark that file for returning EOF next time. But I fear that
that might break other entries that rely on getting repeatedly called.
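
The first suggested fix can be sketched in userspace C. This is only an illustration and not the kernel's proc_pid_schedstat(): the helper name format_schedstat_fixed and the field width of 20 are assumptions made for the example. Padding every field to a fixed width keeps the record length independent of the counter values, so a second read() at the old offset cannot pick up leftover digits from a record that grew in the meantime.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: formats the three schedstat counters with fixed
 * field widths so the record length never depends on the values.
 * Returns the number of characters written (as snprintf does). */
static int format_schedstat_fixed(char *buf, size_t size,
                                  unsigned long long run_time,
                                  unsigned long long wait_time,
                                  unsigned long timeslices)
{
    assert(buf != NULL);
    return snprintf(buf, size, "%20llu %20llu %20lu\n",
                    run_time, wait_time, timeslices);
}
```

With this layout, formatting "946533 98256 64" and a later, larger set of counters yields strings of identical length; the cost is extra padding that existing parsers would have to tolerate.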


Regards,

Phil


-- 
Versioning your /etc, /home or even your whole installation?
 Try fsvs (fsvs.tigris.org)!

Re: [PATCH] Add support for keyboard on SEGA Dreamcast

2007-09-04 Thread Dmitry Torokhov

Hi Mike,

On Wednesday 05 September 2007 00:34, Mike Frysinger wrote:
> > + kbd->dev = input_allocate_device();
> > ...
> > + retval = input_register_device(kbd->dev);
> > + if (unlikely(retval))
> > + goto cleanup;
> > ...
> > +      cleanup:
> > + kfree(kbd);
> > + return -EINVAL;
> 
> i'm not familiar with the input layer, but do you need to deallocate that 
> input device if the register fails ?  if so, i guess dc_kbd_disconnect() 
> would need tweaking too ...

No, dc_kbd_disconnect() is fine - the structure is refcounted and so
input core will free it when the last reference drops. But you are
right, input_free_device() is still needed in error path.
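
A userspace sketch of the error path discussed here, under loud assumptions: every fake_* name below is a stand-in invented for the example and is not the kernel input API. It only shows the shape of the cleanup, where a device that was allocated but failed to register is released before the wrapper structure is freed.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Stand-ins for the input core; these are NOT the kernel API. */
struct fake_input_dev { int registered; };
struct fake_kbd { struct fake_input_dev *dev; };

static int live_input_devs;   /* devices allocated but not yet freed */
static int fail_register;     /* set to 1 to simulate a registration failure */

static struct fake_input_dev *fake_input_allocate_device(void)
{
    struct fake_input_dev *dev = calloc(1, sizeof(*dev));
    if (dev)
        live_input_devs++;
    return dev;
}

static void fake_input_free_device(struct fake_input_dev *dev)
{
    if (dev) {
        live_input_devs--;
        free(dev);
    }
}

static int fake_input_register_device(struct fake_input_dev *dev)
{
    if (fail_register)
        return -ENODEV;
    dev->registered = 1;
    return 0;
}

/* Mirrors the shape of dc_kbd_connect(): a failure after the input
 * device has been allocated releases it before freeing the wrapper. */
static int fake_kbd_connect(struct fake_kbd **out)
{
    struct fake_kbd *kbd;
    int retval;

    assert(out != NULL);
    kbd = calloc(1, sizeof(*kbd));
    if (!kbd)
        return -ENOMEM;

    kbd->dev = fake_input_allocate_device();
    if (!kbd->dev) {
        retval = -ENOMEM;
        goto cleanup;
    }

    retval = fake_input_register_device(kbd->dev);
    if (retval)
        goto cleanup_dev;

    *out = kbd;
    return 0;

cleanup_dev:
    fake_input_free_device(kbd->dev);   /* the call missing from the patch */
cleanup:
    free(kbd);
    return retval;
}
```

Using two labels keeps each failure point paired with exactly the resources acquired so far, and returning retval preserves the underlying error code.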

-- 
Dmitry

Re: [PATCH] Add support for keyboard on SEGA Dreamcast

2007-09-04 Thread Mike Frysinger

On Tuesday 04 September 2007, Adrian McMenamin wrote:
> --- a/drivers/input/keyboard/Kconfig
> +++ b/drivers/input/keyboard/Kconfig
> +   Say Y here if you have a DreamCast console running Linux and have

funny caps in Dreamcast

> --- /dev/null
> +++ b/drivers/input/keyboard/maple_keyb.c
> +static void dc_scan_kbd(struct dc_kbd *kbd)

still some funny wrappings in this func ...

> + printk
> + ("Unknown key (scancode %#x) released.",
> +  kbd->old[i]);
> ...
> + printk
> + ("Unknown key (scancode %#x) pressed.",
> +  kbd->new[i]);

missing KERN log levels in those printk's

> +static int dc_kbd_connect(struct maple_device *dev)
> +{
> ...
> + struct dc_kbd *kbd;
> ...
> + kbd = kzalloc(sizeof(struct dc_kbd), GFP_KERNEL);

i find this more readable/manageable myself:
kbd = kzalloc(sizeof(*kbd), GFP_KERNEL);
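
The sizeof(*ptr) idiom can be tried in userspace with calloc() standing in for kzalloc(). The struct below is a hypothetical stand-in for struct dc_kbd, and its fields are assumptions made for the example.

```c
#include <assert.h>
#include <stdlib.h>

struct dc_kbd_example {          /* hypothetical stand-in for struct dc_kbd */
    unsigned char new_keys[8];
    unsigned char old_keys[8];
};

/* calloc() plays the role of kzalloc() here: zeroed memory, sized from
 * the pointer's own type so the call stays correct if the type of kbd
 * is ever changed. */
static struct dc_kbd_example *alloc_kbd_example(void)
{
    struct dc_kbd_example *kbd = calloc(1, sizeof(*kbd));

    /* Both spellings name the same size; only one repeats the type. */
    assert(sizeof(*kbd) == sizeof(struct dc_kbd_example));
    return kbd;
}
```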

> + kbd->dev = input_allocate_device();
> ...
> + retval = input_register_device(kbd->dev);
> + if (unlikely(retval))
> + goto cleanup;
> ...
> +  cleanup:
> + kfree(kbd);
> + return -EINVAL;

i'm not familiar with the input layer, but do you need to deallocate that 
input device if the register fails ?  if so, i guess dc_kbd_disconnect() 
would need tweaking too ...
-mike



Re: Fwd: [PATCH] IdealTEK URTC1000 support for usbtouchscreen

2007-09-04 Thread Dmitry Torokhov

On Monday 27 August 2007 18:07, Daniel Ritz wrote:
> > OK, so here's the new patch, inline this time:
> 
> thanks. looks fine now. forwarding to Dmitry for mainline inclusion...
>

Applied, thank you.

-- 
Dmitry

Re: [PATCH] 10-dots braille keyboards

2007-09-04 Thread Dmitry Torokhov

On Monday 20 August 2007 20:38, Samuel Thibault wrote:
> Hi,
> 
> Some braille keyboards have 10 dots, so extend the Input braille keys
> definitions.
> 
> Signed-off-by: Samuel Thibault <[EMAIL PROTECTED]>
> 

Applied, thank you Samuel.

-- 
Dmitry

Re: [PATCH] Input: i8042 - add HP Pavilion DV4270ca to the MUX blacklist

2007-09-04 Thread Dmitry Torokhov

On Monday 03 September 2007 17:47, Elvis Pranskevichus wrote:
> This fixes "atkbd.c: Spurious NAK on isa0060/serio0" errors for
> HP Pavilion DV4270ca. Same reasons as for 
> 9d9d50bb2efb50594abfc3941a5504b62c514ebd
> and 6e782584e0713ea89da151333e7fe754c8f40324.
> 
> Signed-off-by: Elvis Pranskevichus <[EMAIL PROTECTED]>
>

Applied, thank you Elvis.
 
-- 
Dmitry

Re: ALPS touchpad with new Dell not recognised

2007-09-04 Thread Dmitry Torokhov

Hi,

On Saturday 04 August 2007 18:45, William Pettersson wrote:
> Hi,
> This patch adds support for the Alps touchpad on my Dell Vostro 1400 to
> the linux kernel.
> 
> Signed-off-by: William Pettersson <[EMAIL PROTECTED]>

Applied, thank you William.

-- 
Dmitry

[PATCH] docproc: style & typo cleanups

2007-09-04 Thread Randy Dunlap

From: Randy Dunlap <[EMAIL PROTECTED]>

- fix typos/spellos in docproc.c and Makefile
- add a little whitespace {while, switch} (coding style)
- use NULL instead of 0 for pointer testing

Signed-off-by: Randy Dunlap <[EMAIL PROTECTED]>
---
 scripts/basic/Makefile  |8 
 scripts/basic/docproc.c |   34 ++
 2 files changed, 22 insertions(+), 20 deletions(-)

--- linux-2.6.23-rc5.orig/scripts/basic/docproc.c
+++ linux-2.6.23-rc5/scripts/basic/docproc.c
@@ -10,8 +10,10 @@
  * documentation-frontend
  * Scans the template file and call kernel-doc for
  * all occurrences of ![EIF]file
- * Beforehand each referenced file are scanned for
- * any exported sympols "EXPORT_SYMBOL()" statements.
+ * Beforehand each referenced file is scanned for
+ * any symbols that are exported via these macros:
+ * EXPORT_SYMBOL(), EXPORT_SYMBOL_GPL(), &
+ * EXPORT_SYMBOL_GPL_FUTURE()
  * This is used to create proper -function and
  * -nofunction arguments in calls to kernel-doc.
  * Usage: docproc doc file.tmpl
@@ -73,7 +75,7 @@ void usage (void)
 }
 
 /*
- * Execute kernel-doc with parameters givin in svec
+ * Execute kernel-doc with parameters given in svec
  */
 void exec_kernel_doc(char **svec)
 {
@@ -82,7 +84,7 @@ void exec_kernel_doc(char **svec)
char real_filename[PATH_MAX + 1];
/* Make sure output generated so far are flushed */
fflush(stdout);
-   switch(pid=fork()) {
+   switch (pid=fork()) {
case -1:
perror("fork");
exit(1);
@@ -133,6 +135,7 @@ struct symfile * add_new_file(char * fil
symfilelist[symfilecnt++].filename = strdup(filename);
return &symfilelist[symfilecnt - 1];
 }
+
 /* Check if file already are present in the list */
 struct symfile * filename_exist(char * filename)
 {
@@ -156,8 +159,8 @@ void noaction2(char * file, char * line)
 void printline(char * line)   { printf("%s", line); }
 
 /*
- * Find all symbols exported with EXPORT_SYMBOL and EXPORT_SYMBOL_GPL
- * in filename.
+ * Find all symbols in filename that are exported with EXPORT_SYMBOL &
+ * EXPORT_SYMBOL_GPL (& EXPORT_SYMBOL_GPL_FUTURE implicitly).
  * All symbols located are stored in symfilelist.
  */
 void find_export_symbols(char * filename)
@@ -179,15 +182,15 @@ void find_export_symbols(char * filename
perror(real_filename);
exit(1);
}
-   while(fgets(line, MAXLINESZ, fp)) {
+   while (fgets(line, MAXLINESZ, fp)) {
char *p;
char *e;
-   if (((p = strstr(line, "EXPORT_SYMBOL_GPL")) != 0) ||
-((p = strstr(line, "EXPORT_SYMBOL")) != 0)) {
+   if (((p = strstr(line, "EXPORT_SYMBOL_GPL")) != NULL) ||
+((p = strstr(line, "EXPORT_SYMBOL")) != NULL)) {
/* Skip EXPORT_SYMBOL{_GPL} */
while (isalnum(*p) || *p == '_')
p++;
-   /* Remove paranteses and additional ws */
+   /* Remove parentheses & additional whitespace */
while (isspace(*p))
p++;
if (*p != '(')
@@ -211,7 +214,7 @@ void find_export_symbols(char * filename
  * Document all external or internal functions in a file.
  * Call kernel-doc with following parameters:
  * kernel-doc -docbook -nofunction function_name1 filename
- * function names are obtained from all the src files
+ * Function names are obtained from all the src files
  * by find_export_symbols.
  * intfunc uses -nofunction
  * extfunc uses -function
@@ -262,7 +265,7 @@ void singfunc(char * filename, char * li
vec[idx++] = KERNELDOC;
vec[idx++] = DOCBOOK;
 
-/* Split line up in individual parameters preceeded by FUNCTION */
+/* Split line up in individual parameters preceded by FUNCTION */
 for (i=0; line[i]; i++) {
 if (isspace(line[i])) {
 line[i] = '\0';
@@ -292,7 +295,7 @@ void parse_file(FILE *infile)
 {
char line[MAXLINESZ];
char * s;
-   while(fgets(line, MAXLINESZ, infile)) {
+   while (fgets(line, MAXLINESZ, infile)) {
if (line[0] == '!') {
s = line + 2;
switch (line[1]) {
@@ -351,9 +354,9 @@ int main(int argc, char *argv[])
{
/* Need to do this in two passes.
 * First pass is used to collect all symbols exported
-* in the various files.
+* in the various files;
 * Second pass

Re: tbench regression - Why process scheduler has impact on tbench and why small per-cpu slab (SLUB) cache creates the scenario?

2007-09-04 Thread Christoph Lameter

On Wed, 5 Sep 2007, Zhang, Yanmin wrote:

> 8) kmalloc-4096 order is 1 which means one slab consists of 2 objects. So a

You can change that by booting with slub_max_order=0. Then we can also use 
the per cpu queues to get these order 0 objects which may speed up the 
allocations because we do not have to take zone locks on slab allocation.

Note also that Andrew's tree has a page allocator pass through for SLUB 
for 4k kmallocs bypassing slab completely. That may also address the 
issue.

If you want SLUB to handle more objects in the 4k kmalloc cache 
without going to the page allocator then you can boot f.e. with

slub_max_order=3 slub_min_objects=8

which will result in a kmalloc-4096 that caches 8 objects.
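
The object counts quoted in this thread follow from simple arithmetic, sketched below. This is a simplified model and not SLUB's actual sizing code: the 4 KiB page size and the function name objects_per_slab are assumptions, and alignment and per-object metadata are ignored.

```c
#include <assert.h>

#define EXAMPLE_PAGE_SIZE 4096UL   /* assumption: 4 KiB pages */

/* Simplified model: a slab of the given order spans
 * (PAGE_SIZE << order) bytes and holds as many whole objects as fit.
 * Real SLUB also accounts for alignment and metadata. */
static unsigned long objects_per_slab(unsigned int order,
                                      unsigned long object_size)
{
    assert(object_size > 0);
    return (EXAMPLE_PAGE_SIZE << order) / object_size;
}
```

Under these assumptions, a 4096-byte object gives 1 object per slab at order 0, 2 at order 1 (the case in point 8), and 8 at order 3 (the slub_max_order=3 slub_min_objects=8 case).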

>   b) Change SLUB per-cpu slab cache, to cache more slabs instead of only 
> one
> slab. This way could use page->lru to creates a list linked in 
> kmem_cache->cpu_slab[]
> whose members need to be changed to as list_head. As for how many slabs could 
> be in
> a per-cpu slab cache, it might be implemented as a sysfs parameter under 
> /sys/slab/XXX/.
> Default could be 1 to satisfy big machines.

Try the ways to address the issue that I mentioned above.


Re: [PATCH 3/6] x86: Convert cpu_sibling_map to be a per cpu variable (v2) (fwd)

2007-09-04 Thread Christoph Lameter

On Tue, 4 Sep 2007, Andrew Morton wrote:

> > My question though, would include/linux/smp.h be the appropriate place for
> > the above define?  (That is, if the above approach is the correct one... ;-)
> 
> It'd be better to convert the unconverted architectures?

That is certainly the cleanest solution. Maybe we can only convert the 
variables used in the scheduler that way?


Re: [PATCH] Add support for keyboard on SEGA Dreamcast

2007-09-04 Thread Dmitry Torokhov

Hi Adrian,

On Tuesday 04 September 2007 19:34, Adrian McMenamin wrote:
> This patch will add support for the Dreamcast keyboard when used
> alongside the maple bus patch (http://lkml.org/lkml/2007/9/4/165) and
> the pvr2 patch.
> 
> Signed off by: Adrian McMenamin <[EMAIL PROTECTED]>
>

Thank you very much for your patch. A couple of comments:
 
> +
> +static unsigned char dc_kbd_keycode[256] = {

Const?

> +
> +static int dc_kbd_connect(struct maple_device *dev)
> +{
> + int i, retval;
> + unsigned long data = be32_to_cpu(dev->devinfo.function_data[0]);
> + struct dc_kbd *kbd;
> + if (dev->function != MAPLE_FUNC_KEYBOARD)
> + return -EINVAL;
> +
> + kbd = kzalloc(sizeof(struct dc_kbd), GFP_KERNEL);
> + if (unlikely(!kbd))
> + return -ENOMEM;
> +
> + dev->private_data = kbd;
> + kbd->dev = input_allocate_device();
> + if (unlikely(!kbd->dev))
> + goto cleanup;
> +
> + kbd->dev->evbit[0] = BIT(EV_KEY) | BIT(EV_REP);
> +
> + for (i = 0; i < 255; i++)
> + set_bit(dc_kbd_keycode[i], kbd->dev->keybit);
> +
> + clear_bit(0, kbd->dev->keybit);
> +
> + kbd->dev->private = kbd;
> +
> + kbd->dev->name = dev->product_name;
> + kbd->dev->id.bustype = BUS_MAPLE;

Do we really need a new bus type? Would not BUS_HOST suffice?

> +
> + retval = input_register_device(kbd->dev);
> + if (unlikely(retval))
> + goto cleanup;
> +
> + maple_getcond_callback(dev, dc_kbd_callback, (25 * HZ) / 1000,
> +MAPLE_FUNC_KEYBOARD);
> +
> + printk(KERN_INFO "input: keyboard(0x%lx): %s\n", data,
> +kbd->dev->name);

Input core already prints a line when a new input device is registered,
do we need another one?

> +
> + return 0;
> +
> +  cleanup:
> + kfree(kbd);
> + return -EINVAL;

It looks like a call to input_free_device() is missing here. You would leak
memory if input_register_device() failed.

I would also get rid of all likelys/unlikelys here - driver registration is
not a hot path...

> +}
> +
> +static void dc_kbd_disconnect(struct maple_device *dev)
> +{
> + struct dc_kbd *kbd = dev->private_data;
> +
> + input_unregister_device(kbd->dev);
> + kfree(kbd);

Are we guaranteed that the dc_kbd_callback is not running in a separate
thread?

Please also consider implementing support for changing the keymap. Since
the keymap is pretty full I think the best way is to copy the vanilla
keymap into a per-device memory and set up keymap, keycodesize and
keycodemax in input device structure.

Thank you.

-- 
Dmitry

[ANNOUNCE] Lguest64 - fatter puppies!

2007-09-04 Thread Steven Rostedt

This is a formal announcement of Lguest64.

Most are aware of the little puppies (lguest32, or simply lguest, or in
some circles "rustyvisor").  But this time the puppies ate a bit too
much.  No more lean and mean puppies, now we got big fat lazy ones.
Running on the hardware that's too lazy to do full virtualization. Yes,
lguest now runs on x86_64!

As you know, puppies are young, and so is lguest64. And like most new
born puppies, lguest64 might crap on your floor. But if it's in a good
mood, it will make it to the door and give you a login prompt (or maybe
even ssh into it and run firefox!
http://rostedt.homelinux.com/pics/firefox-on-lguest64.png ).

lguest64 is still going through a bit of growth pains, but it's getting
better. It's to a point that we are not that afraid to bring it to the
dog show.

So for those that love puppies, and want even bigger ones, you can
download the code at:

  git://git.et.redhat.com/kernel-lguest-64.git

TODO:
 Many things! but here's what's on the near future list.

 - SMP first for host then for guest. We had it working a while
   ago, but we decided to update to catchup to lguest32, and we
   broke it.

 - get closer to lguest32. Rusty just likes to annoy us ;-)

 - optimization, optimization, optimization (did I say it's still slow?)

 - oh, and hopefully to get it merged!

well, there's lots more to do, but we can think about it later.

Want to help? test it out, and let us know how big a mess it made on
your floor.

-- Steve



Re: [GIT PULL] x86 setup: work around bug in Xen HVM

2007-09-04 Thread H. Peter Anvin


Christoph Hellwig wrote:
> On Tue, Sep 04, 2007 at 09:55:45AM -0700, H. Peter Anvin wrote:
> > Apparently XEN does not keep the contents of the 48-bit gdt_48 data
> > structure that is passed to lgdt in the XEN machine state. Instead it
> > appears to save the _address_ of the 48-bit descriptor
> > somewhere. Unfortunately this data happens to reside on the stack and
> > is probably no longer available at the time of the actual protected
> > mode jump.
> > 
> > This is a Xen bug but given that there is a one-line patch to work
> > around this problem, the linux kernel should probably do this.  My fix
> > is to make the gdt_48 description in setup_gdt static (in setup_idt
> > this is already the case). This allows the kernel to boot under
> > Xen HVM again.
> > 
> > -   struct gdt_ptr gdt;
> > +   static struct gdt_ptr gdt;
> 
> It might make sense to add your above commit message to the code as a comment.


Good point; I have amended the commit with a brief comment:

diff --git a/arch/i386/boot/pm.c b/arch/i386/boot/pm.c
index 6be9ca8..09fb342 100644
--- a/arch/i386/boot/pm.c
+++ b/arch/i386/boot/pm.c
@@ -122,7 +122,11 @@ static void setup_gdt(void)
/* DS: data, read/write, 4 GB, base 0 */
[GDT_ENTRY_BOOT_DS] = GDT_ENTRY(0xc093, 0, 0xf),
};
-   struct gdt_ptr gdt;
+   /* Xen HVM incorrectly stores a pointer to the gdt_ptr, instead
+  of the gdt_ptr contents.  Thus, make it static so it will
+  stay in memory, at least long enough that we switch to the
+  proper kernel GDT. */
+   static struct gdt_ptr gdt;

gdt.len = sizeof(boot_gdt)-1;
gdt.ptr = (u32)&boot_gdt + (ds() << 4);
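
The effect of the static keyword can be shown with a small userspace sketch. The struct and function names are hypothetical stand-ins (not the boot code's struct gdt_ptr or setup_gdt()), and the field values are arbitrary.

```c
#include <assert.h>
#include <stdint.h>

struct gdt_ptr_example {        /* hypothetical stand-in for struct gdt_ptr */
    uint16_t len;
    uint32_t ptr;
};

/* Returns the address of a static descriptor. Because the object has
 * static storage duration, the pointer stays valid after the function
 * returns; had it been an automatic (stack) variable, using the pointer
 * afterwards would be undefined behaviour. */
static const struct gdt_ptr_example *setup_gdt_example(uint16_t len,
                                                       uint32_t base)
{
    static struct gdt_ptr_example gdt;

    assert(len != 0);
    gdt.len = len;
    gdt.ptr = base;
    return &gdt;
}
```

Anything that remembers the address rather than the contents, as Xen HVM reportedly does, still finds valid data at that address after the function has returned.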

-hpa

Re: kernel 2.6.22: what IS the VM doing?

2007-09-04 Thread Rik van Riel


Sami Farin wrote:
> Using SMP kernel 2.6.22.6pre-CFS-v20.5 on Pentium D (IA-32).
> I think this bug (or whatever you want to call it) got triggered
> when you first allocate several megabytes of memory in a kernel module
> and then free them, and then run e.g. X and when memory gets tight,
> you end up with this situation...
> 
> Top 2 /proc/vmstat Biggest Winners:
> 
> pgrefill_normal:49900/second
> pgrefill_high:20810/second


That means the pageout code was scanning about 70000 pages
per second on your system during peak stress.  You may have
run into a scalability problem in the Linux kernel, where it
wants to clear the referenced bit off all the anonymous pages
before swapping something out.

To make matters worse, that unlucky page gets chosen because
it was the page where kswapd started scanning.  It has little
to do with being the least recently used page, because every
anonymous page tends to have its referenced bit set by the time
we start scanning.

On truly enormous systems, say with 256GB of memory, kswapd
sometimes needs to scan hundreds of thousands or even millions
of pages before finding something to swap out.  Not fun.


> Did I forget to include some info???
> Oh, and I need to reboot in order to get usable system
> when this bug happens.


Is the system trying to evict pages like crazy when your
system becomes unusable?

If so, I wonder if kswapd is simply doing the wrong thing
and trying to evict data from all zones, simply because the
highmem zone is low on free pages...

--
Politics is the struggle between those who want to make their country
the best in the world, and those who believe it already is.  Each group
calls the other unpatriotic.

Cache not being reclaimed?

2007-09-04 Thread Ian Kumlien

Hi, 

I have just had a quite unexpected 'low memory situation'...

This is a AMD64 machine with 2 gig memory, running 64 bit userland.

Kernel: 2.6.23-rc3-git10, updating to -rc5-* as soon as i can.
I'm using SLUB:s


To me, this looks odd... I thought that any cached memory would be
reclaimed but it was always full.

Ideas?

One example from dmesg:
swapper: page allocation failure. order:1, mode:0x4020

Call Trace:
   [] __alloc_pages+0x30f/0x330
 [] __slab_alloc+0x141/0x590
 [] __netdev_alloc_skb+0x17/0x40
 [] __netdev_alloc_skb+0x17/0x40
 [] __kmalloc_track_caller+0xa0/0xc0
 [] __alloc_skb+0x6f/0x150
 [] __netdev_alloc_skb+0x17/0x40
 [] :sky2:sky2_rx_alloc+0x25/0xf0
 [] :sky2:sky2_poll+0x6dc/0xcf0
 [] tcp_delack_timer+0x0/0x210
 [] net_rx_action+0x8a/0x140
 [] __do_softirq+0x69/0xe0
 [] call_softirq+0x1c/0x30
 [] do_softirq+0x35/0x90
 [] do_IRQ+0x80/0x100
 [] default_idle+0x0/0x40
 [] ret_from_intr+0x0/0xa
   [] default_idle+0x29/0x40
 [] cpu_idle+0xa1/0xf0

Mem-info:
DMA per-cpu:
CPU0: Hot: hi:0, btch:   1 usd:   0   Cold: hi:0, btch:   1
usd:   0
CPU1: Hot: hi:0, btch:   1 usd:   0   Cold: hi:0, btch:   1
usd:   0
DMA32 per-cpu:
CPU0: Hot: hi:  186, btch:  31 usd: 163   Cold: hi:   62, btch:  15
usd:  56
CPU1: Hot: hi:  186, btch:  31 usd:  33   Cold: hi:   62, btch:  15
usd:  60
Active:348343 inactive:122950 dirty:13504 writeback:0 unstable:0
 free:2665 slab:21427 mapped:243884 pagetables:4816 bounce:0
DMA free:8020kB min:20kB low:24kB high:28kB active:16kB inactive:0kB
present:7636kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 2003 2003 2003
DMA32 free:2640kB min:5716kB low:7144kB high:8572kB active:1393356kB
inactive:491800kB present:2052008kB pages_scanned:22 all_unreclaimable?
no
lowmem_reserve[]: 0 0 0 0
DMA: 1*4kB 0*8kB 1*16kB 2*32kB 4*64kB 2*128kB 3*256kB 1*512kB 0*1024kB
1*2048kB 1*4096kB = 8020kB
DMA32: 400*4kB 0*8kB 1*16kB 0*32kB 2*64kB 1*128kB 1*256kB 1*512kB
0*1024kB 0*2048kB 0*4096kB = 2640kB
Swap cache: add 985117, delete 960396, find 102684/214435, race 0+193
Free swap  = 2136272kB
Total swap = 2530180kB
Free swap:   2136272kB
524208 pages of RAM
10098 reserved pages
588916 pages shared
24719 pages swap cached

vmstat
procs ---memory-- ---swap-- -io -system--
cpu
 r  b   swpd   free   buff  cache   si   sobibo   in   cs us sy
id wa
 0  1 393904  16108  13788 158337221   2296798  4  2
92  2


-- 
Ian Kumlien  -- http://pomac.netswarm.net



[PATCH -mm] ufs: Fix mount check in ufs_fill_super()

2007-09-04 Thread Satyam Sharma

Hi Evgeniy,


On Sun, 19 Aug 2007, Evgeniy Dushistov wrote:
> 
> Different types of ufs hold state in different places,
> to hide complexity of this, there is ufs_get_fs_state,
> it returns state according to "UFS_SB(sb)->s_flags",
> but during mount ufs_get_fs_state is called,
> before setting s_flags, this cause message for
> ufs types like sun ufs: "fs need fsck",
> and remount in readonly state.

I noticed another strange thing with that same compound condition in
ufs_fill_super(). In the present code, if (flags & UFS_ST_MASK) ==
UFS_ST_44BSD, then the whole ufs_get_fs_state() == UFS_FSOK check gets
completely skipped (!)

Is this intentional? If so, I'd recommend plonking a comment in there.
But doesn't look that way to me, because ufs_get_fs_state() does handle
the UFS_ST_44BSD case also ... Does the patch below look correct to you?
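
The short-circuit behaviour described here can be reproduced with a reduced model. The enum values and function names are hypothetical (the real UFS_ST_* constants and ufs_get_fs_state() are not used); state_ok stands for the result of the clean-unmount check.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flavour values; the real UFS_ST_* constants differ. */
enum { ST_44BSD = 1, ST_OLD, ST_SUN, ST_SUNOS, ST_SUNX86 };

/* Shape of the existing test: 44BSD and OLD short-circuit the whole
 * expression, so state_ok is only consulted for the Sun flavours. */
static bool check_before(int type, bool state_ok)
{
    return type == ST_44BSD || type == ST_OLD ||
           ((type == ST_SUN || type == ST_SUNOS || type == ST_SUNX86) &&
            state_ok);
}

/* Shape of the proposed test: the state check applies to every flavour. */
static bool check_after(int type, bool state_ok)
{
    assert(type >= ST_44BSD && type <= ST_SUNX86);
    return (type == ST_44BSD || type == ST_OLD || type == ST_SUN ||
            type == ST_SUNOS || type == ST_SUNX86) && state_ok;
}
```

For a 44BSD filesystem whose state check fails, check_before() still returns true while check_after() returns false; for the Sun flavours the two agree.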

Satyam


[PATCH -mm] ufs: Fix mount check in ufs_fill_super()

The current code skips the check to verify whether the filesystem was
previously cleanly unmounted, if (flags & UFS_ST_MASK) == UFS_ST_44BSD
or UFS_ST_OLD. This looks like an inadvertent bug that slipped in due
to parentheses in the compound conditional to me, especially given that
ufs_get_fs_state() handles the UFS_ST_44BSD case perfectly well. So,
let's fix the compound condition appropriately.

Signed-off-by: Satyam Sharma <[EMAIL PROTECTED]>

---

 fs/ufs/super.c |   15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)

--- linux-2.6.23-rc4-mm1/fs/ufs/super.c~fix 2007-09-05 06:35:12.0 +0530
+++ linux-2.6.23-rc4-mm1/fs/ufs/super.c 2007-09-05 06:40:05.0 +0530
@@ -934,19 +934,20 @@ magic_found:
flags |=  UFS_ST_SUN;
}
 
-   sbi->s_flags = flags;/*after that line some functions use s_flags*/
+   /* Set sbi->s_flags here, used by ufs_get_fs_state() below */
+   sbi->s_flags = flags;
ufs_print_super_stuff(sb, usb1, usb2, usb3);
 
/*
 * Check, if file system was correctly unmounted.
 * If not, make it read only.
 */
-   if (((flags & UFS_ST_MASK) == UFS_ST_44BSD) ||
- ((flags & UFS_ST_MASK) == UFS_ST_OLD) ||
- (((flags & UFS_ST_MASK) == UFS_ST_SUN ||
-   (flags & UFS_ST_MASK) == UFS_ST_SUNOS ||
- (flags & UFS_ST_MASK) == UFS_ST_SUNx86) &&
- (ufs_get_fs_state(sb, usb1, usb3) == (UFS_FSOK - fs32_to_cpu(sb, usb1->fs_time))))) {
+   if ((((flags & UFS_ST_MASK) == UFS_ST_44BSD)    ||
+        ((flags & UFS_ST_MASK) == UFS_ST_OLD)      ||
+        ((flags & UFS_ST_MASK) == UFS_ST_SUN)      ||
+        ((flags & UFS_ST_MASK) == UFS_ST_SUNOS)    ||
+        ((flags & UFS_ST_MASK) == UFS_ST_SUNx86))  &&
+       (ufs_get_fs_state(sb, usb1, usb3) == (UFS_FSOK - fs32_to_cpu(sb, usb1->fs_time)))) {
switch(usb1->fs_clean) {
case UFS_FSCLEAN:
UFSD("fs is clean\n");
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
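
The effect of the parenthesization discussed in the UFS patch above can be checked in isolation. The following is a minimal sketch, not the kernel code: the enum values are hypothetical stand-ins for the UFS_ST_* constants, and state_ok stands for the result of the ufs_get_fs_state() comparison.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the UFS_ST_* state types; the real
 * constants live in the kernel's UFS headers and are not reproduced. */
enum st_type { ST_44BSD, ST_OLD, ST_SUN, ST_SUNOS, ST_SUNX86 };

/* Shape of the condition before the patch: 44BSD and OLD satisfy the
 * test on their own, so state_ok is never consulted for them. */
bool old_check(enum st_type t, bool state_ok)
{
    return t == ST_44BSD || t == ST_OLD ||
           ((t == ST_SUN || t == ST_SUNOS || t == ST_SUNX86) && state_ok);
}

/* Shape of the condition after the patch: every listed type must
 * also pass the state check. */
bool new_check(enum st_type t, bool state_ok)
{
    return (t == ST_44BSD || t == ST_OLD || t == ST_SUN ||
            t == ST_SUNOS || t == ST_SUNX86) && state_ok;
}
```

With state_ok false, old_check(ST_44BSD, false) is still true while new_check(ST_44BSD, false) is false; for the Sun types the two forms agree.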

Re: TUN/TAP driver - MAINTAINERS - bad mailing list entry?

2007-09-04 Thread Max Krasnyansky


Joe Perches wrote:

MAINTAINERS currently has:

TUN/TAP driver
P:  Maxim Krasnyansky
M:  [EMAIL PROTECTED]
L:  [EMAIL PROTECTED]

[EMAIL PROTECTED] doesn't seem to be a valid email address.

Should it be removed or modified?



Sorry for late response. Just noticed this.
Yes it's an ancient mailing list and should be removed.
I totally forgot about it.

Max


tbench regression - Why process scheduler has impact on tbench and why small per-cpu slab (SLUB) cache creates the scenario?

2007-09-04 Thread Zhang, Yanmin

1) Tbench shows about a 30% regression in kernel 2.6.23-rc4 compared with 2.6.22.
2.6.23-rc1 shows about a 10% regression. I investigated 2.6.22 and 2.6.23-rc4.
2) Testing environment: x86_64, quad-core, 2 physical processors, 8 cores in
total, 8GB memory. The kernel is built with CONFIG_SLUB=y and CONFIG_SLUB_DEBUG=y.
3) In my environment, I started CPU_NUMBER*2 tbench client processes and the
same number of server processes, so 16 tbench and 16 tbench_srv processes run
with a 1:1 mapping. Each tbench client communicates with its tbench_srv
interactively over a TCP socket.
4) Oprofile data shows __slab_alloc at about 15% in 2.6.23-rc4 and 3.8% in
2.6.22.
5) Slabinfo shows that kmalloc-4096 and skbuff_head_cache are the active
slabs. Other slabs are mostly quiet.
6) I collected data about slab_alloc, consisting of:
a) the number of calls to slab_alloc;
b) the number of objects taken from the per-cpu slab cache;
c) the number of objects taken from a new slab or a partial slab;
d) the number of objects freed to a slab that is not the per-cpu slab.
These data show that skbuff_head_cache allocations mostly succeed in the
per-cpu cache, so they don't cause many __slab_alloc calls. kmalloc-4096
is the slab that causes most of the __slab_alloc calls.

On 2.6.22, about 58% of kmalloc-4096 allocations succeed in the per-cpu slab cache.
On 2.6.23-rc4, about 12.5% of kmalloc-4096 allocations succeed in the per-cpu slab cache.

7) By instrumenting the kernel, I found that kmalloc-4096 objects are always
allocated at tcp_sendmsg=>sk_stream_alloc_pskb and freed at
tcp_ack=>tcp_clean_rtx_queue=>sk_stream_free_skb. When a tbench client
communicates with tbench_srv, the sender allocates a kmalloc-4096 object and
the receiver frees it.
8) The kmalloc-4096 order is 1, which means one slab holds 2 objects, so a
partial slab always has exactly one free object. On the other hand, SLUB
keeps only 1 slab per cpu. If a tbench client process gets a kmalloc-4096
object from a partial slab on a cpu and that slab becomes the per-cpu slab,
the slab has no free objects left. When another tbench process later asks
for a kmalloc-4096 object on the same cpu, it can't get a free object from
the per-cpu slab. It gets an object, mostly from another partial slab, which
then replaces the per-cpu slab, although the new slab has no free objects
left either.
9) I collected more per-cpu data to see whether the cpu on which the kernel
allocates an object is also the cpu on which it frees that object. The result
shows that on both 2.6.22 and 2.6.23-rc3 an object is almost always allocated
and freed on the same cpu. That means a tbench client and the tbench_srv
process it communicates with mostly run on the same cpu.

10) I ran both kernels with the boot parameter maxcpus=1 and found that the
regression becomes about 10%.
11) On my machine, on average, each cpu has 2 tbench client processes and 2
tbench_srv processes. So there are a couple of process-scheduling scenarios:
a) Client 1 allocates a 4096 object and replaces the per-cpu slab with
the new slab, which now has no free object. tbench_srv 1 consumes the data and
frees the 4096 object on the same cpu, so the per-cpu slab has a free object
again. Then either tbench_srv 1 replies to client 1 by allocating a new 4096
object, or client 2 allocates a 4096 object from the per-cpu slab to
communicate with tbench_srv 2. This scenario is ideal.
b) Client 1 allocates a 4096 object and replaces the per-cpu slab with
the new slab, which now has no free object, then sleeps waiting for
tbench_srv 1 to reply. But client 2 allocates a 4096 object, finds that the
per-cpu slab has no free object, allocates the object from a partial slab,
and replaces the per-cpu slab with that slab, which again has no free object.
Then tbench_srv 1 is scheduled in and frees the kmalloc-4096 object to a
partial slab, because the slab it came from is no longer the per-cpu slab.
Then tbench_srv 1 tries to allocate a new kmalloc-4096 object to reply to
client 1. Because the per-cpu slab has no free object, it also needs to get
a free object from a partial slab and replace the per-cpu slab. This
scenario is very bad.
In both scenarios, I think the scheduler wakes up sleeping processes on
the same cpu. In scenario a), the woken process is scheduled to run on the
cpu quickly (immediately?). In scenario b), the woken process is scheduled
later.

   I think kernel 2.6.22 creates scenario a) and 2.6.23-rc4 creates scenario b).

12) How to resolve the issue? There are 2 directions.
a) Change the process scheduler to run woken processes first.
b) Change the SLUB per-cpu slab cache to hold more slabs instead of only
one. This could use page->lru to build a list linked from
kmem_cache->cpu_slab[], whose members would need to be changed to list_head.
The number of slabs allowed in a per-cpu slab cache could be exposed as a
sysfs parameter under /sys/slab/XXX/. The default could be 1 to satisfy big
machines.

--yanmin
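
The two scheduling orders described in item 11 can be replayed in a toy userspace model. This is only a sketch under stated assumptions, not SLUB code: every name below is hypothetical, each slab holds two objects as in the report, all slabs start partial with one free object, and a free never changes which slab is the per-cpu slab.

```c
#include <assert.h>

#define NSLABS        8   /* arbitrary size for the toy model */
#define OBJS_PER_SLAB 2   /* from the report: order-1 slab, two 4096-byte objects */

static int free_objs[NSLABS]; /* free objects left in each slab */
static int cpu_slab;          /* index of the current per-cpu slab, -1 if none */
static int fast_hits;         /* allocations served from the per-cpu slab */

static void reset_model(void)
{
    for (int i = 0; i < NSLABS; i++)
        free_objs[i] = 1;     /* every slab starts partial: one free object */
    cpu_slab = -1;
    fast_hits = 0;
}

/* Fast path: take from the per-cpu slab. Slow path: take the first slab
 * that has a free object and make it the new per-cpu slab. */
static int alloc_obj(void)
{
    if (cpu_slab >= 0 && free_objs[cpu_slab] > 0) {
        free_objs[cpu_slab]--;
        fast_hits++;
        return cpu_slab;
    }
    for (int i = 0; i < NSLABS; i++) {
        if (free_objs[i] > 0) {
            free_objs[i]--;
            cpu_slab = i;
            return i;
        }
    }
    return -1;                /* model exhausted */
}

static void free_obj(int slab)
{
    assert(slab >= 0 && slab < NSLABS && free_objs[slab] < OBJS_PER_SLAB);
    free_objs[slab]++;
}

/* Scenario a): each object is freed before the next allocation. */
int scenario_a(int rounds)
{
    reset_model();
    for (int i = 0; i < rounds; i++)
        free_obj(alloc_obj());
    return fast_hits;
}

/* Scenario b): a second allocation always happens before the free. */
int scenario_b(int rounds)
{
    reset_model();
    int oldest = alloc_obj();
    int newest = alloc_obj();
    for (int i = 0; i < rounds; i++) {
        free_obj(oldest);
        oldest = newest;
        newest = alloc_obj();
    }
    return fast_hits;
}
```

In this model scenario_a(100) takes the fast path 99 times, while scenario_b(100) never does, which matches the direction of the hit-rate drop reported above, though not its exact figures.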

Re: 2.6.22.6 + rt9: suspend/hibernate not working

2007-09-04 Thread Daniel Walker

On Tue, 2007-09-04 at 17:12 -0700, Fernando Lopez-Lezcano wrote:
> Hi Ingo... I'm getting reports from some of my Planet CCRMA users (which
> I confirmed) that the latest rt kernel I released has broken suspend
> (tested on fc6 & fc7, stock Fedora kernel works fine - the rt
> configuration files are virtual clones as far as possible of the
> standard Fedora kernel config files). 
> 
> I don't know where to start debugging this. When suspend is initiated it
> freezes with a "Stopping tasks ... " message in the text console - a
> hard power cycle is the only way to get the machine back to normal. 
> 
> kernel/power/process.c seems to contain that string in the
> freeze_processes function so it looks like the freezer is not freezing
> tasks as no "done" message is ever printed. 
> 
> What could we do to help?

If you have high resolution timers enabled, you could try disabling them
and see if the problem persists.

Daniel


Re: [PATCH] Fix tsk->exit_state usage (resend)

2007-09-04 Thread Satyam Sharma

Hi Eugene,

This already got merged into -mm, but ...

On Sun, 19 Aug 2007, Eugene Teo wrote:
> 
> tsk->exit_state can only be 0, EXIT_ZOMBIE, or EXIT_DEAD. A non-zero test
> is the same as tsk->exit_state & (EXIT_ZOMBIE | EXIT_DEAD), so just testing
> tsk->exit_state is sufficient.

... IMHO this change harms the readability of the code.

> +++ b/fs/proc/array.c
> @@ -145,8 +145,7 @@ static inline const char *get_task_state(struct task_struct *tsk)
>   TASK_UNINTERRUPTIBLE |
>   TASK_STOPPED |
>   TASK_TRACED)) |
> - (tsk->exit_state & (EXIT_ZOMBIE |
> - EXIT_DEAD));
> +tsk->exit_state;

Here, for example, the code is /purposefully/ enumerating all the task
states, probably it makes sense to explicitly enumerate the exit states
as well?

> +++ b/kernel/fork.c
> @@ -115,7 +115,7 @@ EXPORT_SYMBOL(free_task);
>  
>  void __put_task_struct(struct task_struct *tsk)
>  {
> - WARN_ON(!(tsk->exit_state & (EXIT_DEAD | EXIT_ZOMBIE)));
> + WARN_ON(!tsk->exit_state);

> +++ b/kernel/sched.c
> @@ -5190,7 +5190,7 @@ static void migrate_dead(unsigned int dead_cpu, struct task_struct *p)
>   struct rq *rq = cpu_rq(dead_cpu);
>  
>   /* Must be exiting, otherwise would be on tasklist. */
> - BUG_ON(p->exit_state != EXIT_ZOMBIE && p->exit_state != EXIT_DEAD);
> + BUG_ON(!p->exit_state);

Regarding above two changes -- agreed, we want to catch /any/ exiting task
state, so (!p->exit_state) is /correct/, but still, enumerating those
explicitly helps readability. And although it's unlikely, in the future,
we may have an exit_state value for which we may _not_ want to complain
(WARN or BUG) in this code. So I'd still vote to keep the code explicit
like it was ...

Satyam
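
Both points in the review above can be checked with a small userspace sketch. The EXIT_ZOMBIE and EXIT_DEAD values below are assumed to match the kernel headers of that era; EXIT_HYPOTHETICAL is invented purely to illustrate the "future exit_state value" concern and does not exist in the kernel.

```c
#include <stdbool.h>

/* Assumed to match the kernel's definitions at the time. */
#define EXIT_ZOMBIE       16
#define EXIT_DEAD         32
/* Hypothetical future state, for illustration only. */
#define EXIT_HYPOTHETICAL 64

/* The explicit form that the patch removed. */
bool exiting_explicit(int exit_state)
{
    return (exit_state & (EXIT_ZOMBIE | EXIT_DEAD)) != 0;
}

/* The shortened form that the patch introduced. */
bool exiting_implicit(int exit_state)
{
    return exit_state != 0;
}
```

For 0, EXIT_ZOMBIE and EXIT_DEAD the two forms agree, so the patch is correct as long as those are the only values; they diverge as soon as a state like EXIT_HYPOTHETICAL is introduced.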

2.6.22.6 + rt9: suspend/hibernate not working

2007-09-04 Thread Fernando Lopez-Lezcano

Hi Ingo... I'm getting reports from some of my Planet CCRMA users (which
I confirmed) that the latest rt kernel I released has broken suspend
(tested on fc6 & fc7, stock Fedora kernel works fine - the rt
configuration files are virtual clones as far as possible of the
standard Fedora kernel config files). 

I don't know where to start debugging this. When suspend is initiated it
freezes with a "Stopping tasks ... " message in the text console - a
hard power cycle is the only way to get the machine back to normal. 

kernel/power/process.c seems to contain that string in the
freeze_processes function so it looks like the freezer is not freezing
tasks as no "done" message is ever printed. 

What could we do to help?
-- Fernando



Re: [PATCH] Revised timerfd() interface

2007-09-04 Thread Michael Kerrisk

Davide,

> > As I think about this more, I see more problems with
> > your argument.  timerfd needs the ability to get and 
> > get-while-setting just as much as the earlier APIs.
> > Consider a library that creates a timerfd file descriptor that
> > is handed off to an application: that library may want
> > to modify the timer settings without having to create a
> > new file descriptor (the app may not be able to be told about
> > the new fd).  Your argument just doesn't hold, AFAICS.
> 
> Such a hypothetical library, in case it really wanted to offer such
> functionality, could simply return a handle instead of the raw fd, and
> take care of all that stuff in userspace.

Did I miss something?  Is it not the case that as soon as the
library returns a handle, rather than an fd, then the whole
advantage of timerfd() (being able to select/poll/epoll on 
the timer as well as other fds) is lost?  

> Again, mimicking POSIX APIs doesn't always take you in the right place.

POSIX may goof in places, but in general it is the result of
many smart people thinking about how to design/standardize APIs.
So the onus is on us to explain why they got this point wrong.
And it is not merely POSIX that did things in the
way I've described: so did the earlier setitimer()/getitimer().

Cheers,

Michael
-- 
Michael Kerrisk
maintainer of Linux man pages Sections 2, 3, 4, 5, and 7 

Want to help with man page maintenance?  
Grab the latest tarball at
http://www.kernel.org/pub/linux/docs/manpages , 
read the HOWTOHELP file and grep the source 
files for 'FIXME'.


Re: [PATCH 3/6] x86: Convert cpu_sibling_map to be a per cpu variable (v2) (fwd)

2007-09-04 Thread Andrew Morton

> On Tue, 04 Sep 2007 16:11:31 -0700 Mike Travis <[EMAIL PROTECTED]> wrote:
> > 
> > It'd be better to convert the unconverted architectures?
> 
> I can easily do the changes for ia64 and test them.  I don't have the
> capability of testing on the powerpc.
> 
> And are you asking for just the changes to fix the build problem, or the whole
> set of the changes that were made for x86_64 and i386 in regards to converting
> NR_CPU arrays to per cpu data?

Well...  it'd be better to have all architectures doing the same thing.  If
that's impractical then we should at least implement suitable accessor
functions into the arch so that core code doesn't need to handle some
architectures one way and others the other way.

Re: [PATCH] Send quota messages via netlink

2007-09-04 Thread Serge E. Hallyn

Quoting Jan Kara ([EMAIL PROTECTED]):
> On Tue 04-09-07 16:32:10, Serge E. Hallyn wrote:
> > Quoting Jan Kara ([EMAIL PROTECTED]):
> > > On Thu 30-08-07 17:14:47, Serge E. Hallyn wrote:
> > > > Quoting Jan Kara ([EMAIL PROTECTED]):
> > > > >   I imagine it so that you have a machine and on it several virtual
> > > > > machines which are sharing a filesystem (or it could be a cluster).
> > > > > Now you want UIDs to be independent between these virtual machines.
> > > > > That's it, right?
> > > > >   Now to continue the example: Alice has UID 100 on machineA, Bob has
> > > > > UID 100 on machineB. These translate to UIDs 1000 and 1001 on the
> > > > > common filesystem. Process of Alice writes to a file and Bob becomes
> > > > > to be over quota. In this situation, there would be probably two
> > > > > processes (from machineA and machineB) listening on the netlink
> > > > > socket. We want to send a message so that on Alice's desktop we can
> > > > > show a message: "You caused Bob to exceed his quotas" and on Bob's
> > > > > desktop: "Alice has caused that you are over quota.".
> > > > 
> > > > Since this is over NFS, you handle it the way you would any other time
> > > > that user Alice on some other machine managed to do this.
> > >   I meant this would actually happen over a local filesystem (imagine
> > > something like "hostfs" from UML).
> > 
> > Ok, then that is where I was previously suggesting that we use an api to
> > report a uid meaningful in bob's context, where we currently (in the
> > absence of meaningful mount uids and uid equivalence) tell Bob that root
> > was the one who brought him over quota.  From a user pov 'nobody' would
> > make more sense, but I don't think we want the kernel to know about user
> > nobody, right?
>   But what is the problem with using the filesystem ids? All virtual
> machines in my example should have a notion of those...

I don't know what you mean by filesystem ids.  Do you mean the uid
stored on the fs?  I imagine a network fs could get fancy and store
something more detailed than the unix uid, based on the user's keys.

Do you mean the inode->i_uid?  Nothing wrong with that.  Then we just
assume that either you are in the superblock or mount's user namespace
(depending on how we implement it, probably superblock), or can figure
out what that is.

> > So if the msg weren't broadcast, or netlink sockets were tied to one
> > user namespace, we could call a
> > int uid_in_user_ns(struct user *, struct user_ns *)
> > sending in Alice's user struct and Bob's userns, and use the result in
> > the netlink message.  Otherwise I'm not sure what is the right answer.
> > We just might need the equivalent of 'struct pid' to struct user, or
> > persistent global user namespace ids (persistent after user namespace
> > destruction, not across reboot) so we can safely send the user_ns * in a
> > netlink msg.
>   Yes, that could also be a solution.
> 
> > > > >   Because there may not be a notion of Bob on machineA or of Alice
> > > > > on machineB, we are in trouble, right? What I like the most is to
> > > > > use the filesystem identities (as you suggested in some other
> > > > > email). I.e. because both Alice and Bob share a filesystem,
> > > > > identities of both have to make sense to it (for example for
> > > > > purposes of permission checking). So we can probably
> > > > 
> > > > Right, so long as we're talking about local filesystems that's the way
> > > > to go.  If a file write was allowed which brought bob over quota,
> > > > clearly the person responsible had some uid valid on the filesystem to
> > > > allow him to do so.
> > >   Fine. So I'll keep UID in the quota netlink protocol with the meaning
> > > "the identity of the user for filesystem operations".
> > 
> > I think that's ok.
> > 
> > Hopefully when that changes to accommodate user namespaces, we can use
> > netlink field versioning to make that transition pretty seamless?
>   Yes, we'd just assign the attribute a different number and teach
> userspace about the new attribute format...

Ok.

> > If not, then we probably should in fact make some decision now so as not
> > to change the api.
> > 
> > > > > send via netlink these (in our example ids 1000 and 1001) and hope
> > > > > that inside machineA and machineB there will be a way to translate
> > > > > these identities to names "Alice" and "Bob". So that user can
> > > > > understand what is happening. Does this sound plausible?
> > > > >   If we go this route, then we only need a kernel function, that will
> > > > > for a pair ($filesystem, $task) return identity of that $task used
> > > > > for operations on $filesystem...
> > > > 
> > > > Ok, now I see.  This is again unrelated to user namespaces, it's an
> > > > issue regardless.
> > > > 
> > > > Is there no way to just report Alice as the guilty party to Bob on his
> > > >

Re: [patch] sched: fix broken smt/mc optimizations with CFS

2007-09-04 Thread Siddha, Suresh B

On Tue, Sep 04, 2007 at 07:35:21PM -0400, Chuck Ebbert wrote:
> On 08/28/2007 06:27 PM, Siddha, Suresh B wrote:
> > Try to fix MC/HT scheduler optimization breakage again, with out breaking
> > the FUZZ logic.
> > 
> > First fix the check
> > if (*imbalance + SCHED_LOAD_SCALE_FUZZ < busiest_load_per_task)
> > with this
> > if (*imbalance < busiest_load_per_task)
> > 
> > As the current check is always false for nice 0 tasks (as
> > SCHED_LOAD_SCALE_FUZZ is same as busiest_load_per_task for nice 0 tasks).
> > 
> > With the above change, imbalance was getting reset to 0 in the corner case
> > condition, making the FUZZ logic fail. Fix it by not corrupting the
> > imbalance and change the imbalance, only when it finds that the
> > HT/MC optimization is needed.
> > 
> > Signed-off-by: Suresh Siddha <[EMAIL PROTECTED]>
> > ---
> > 
> > diff --git a/kernel/sched.c b/kernel/sched.c
> > index 9fe473a..03e5e8d 100644
> > --- a/kernel/sched.c
> > +++ b/kernel/sched.c
> > @@ -2511,7 +2511,7 @@ group_next:
> >  * a think about bumping its value to force at least one task to be
> >  * moved
> >  */
> > -   if (*imbalance + SCHED_LOAD_SCALE_FUZZ < busiest_load_per_task) {
> > +   if (*imbalance < busiest_load_per_task) {
> > unsigned long tmp, pwr_now, pwr_move;
> > unsigned int imbn;
> >  
> > @@ -2563,10 +2563,8 @@ small_imbalance:
> > pwr_move /= SCHED_LOAD_SCALE;
> >  
> > /* Move if we gain throughput */
> > -   if (pwr_move <= pwr_now)
> > -   goto out_balanced;
> > -
> > -   *imbalance = busiest_load_per_task;
> > +   if (pwr_move > pwr_now)
> > +   *imbalance = busiest_load_per_task;
> > }
> >  
> > return busiest;
> 
> Seems this didn't get merged? Latest git as of today still has the code
> as it was before this patch.

This is a must-fix for .23, and Ingo previously mentioned that he would push
it for .23.

Ingo?
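
The "always false for nice 0 tasks" claim in the patch description can be checked with plain arithmetic. The sketch below assumes SCHED_LOAD_SCALE is 1024 and that SCHED_LOAD_SCALE_FUZZ equals it, which is how the thread describes the nice 0 case; the function names are hypothetical.

```c
#include <stdbool.h>

/* Assumed values: a nice 0 task contributes SCHED_LOAD_SCALE of load,
 * and the fuzz is taken to be equal to it, as stated in the thread. */
#define SCHED_LOAD_SCALE      1024UL
#define SCHED_LOAD_SCALE_FUZZ SCHED_LOAD_SCALE

/* The check before the patch. */
bool old_small_imbalance(unsigned long imbalance,
                         unsigned long busiest_load_per_task)
{
    return imbalance + SCHED_LOAD_SCALE_FUZZ < busiest_load_per_task;
}

/* The check after the patch. */
bool new_small_imbalance(unsigned long imbalance,
                         unsigned long busiest_load_per_task)
{
    return imbalance < busiest_load_per_task;
}
```

With busiest_load_per_task equal to SCHED_LOAD_SCALE, the old form can never be true because even an imbalance of 0 gives 1024 < 1024, whereas the new form is true for any imbalance below 1024.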

Re: [patch] sched: fix broken smt/mc optimizations with CFS

2007-09-04 Thread Chuck Ebbert

On 08/28/2007 06:27 PM, Siddha, Suresh B wrote:
> On Mon, Aug 27, 2007 at 12:31:03PM -0700, Siddha, Suresh B wrote:
>> Essentially I observed that nice 0 tasks still endup on two cores of same
>> package, with out getting spread out to two different packages. This behavior
>> is same with out this fix and this fix doesn't help in any way.
> 
> Ingo, Appended patch seems to fix the issue and as far as I can test, seems ok
> to me.
> 
> This is a quick fix for .23. Peter Williams and myself plan to look at
> code cleanups in this area (HT/MC optimizations) post .23
> 
> BTW, with this fix, do you want to retain the current FUZZ value?
> 
> thanks,
> suresh
> --
> 
> Try to fix MC/HT scheduler optimization breakage again, with out breaking
> the FUZZ logic.
> 
> First fix the check
>   if (*imbalance + SCHED_LOAD_SCALE_FUZZ < busiest_load_per_task)
> with this
>   if (*imbalance < busiest_load_per_task)
> 
> As the current check is always false for nice 0 tasks (as
> SCHED_LOAD_SCALE_FUZZ is same as busiest_load_per_task for nice 0 tasks).
> 
> With the above change, imbalance was getting reset to 0 in the corner case
> condition, making the FUZZ logic fail. Fix it by not corrupting the
> imbalance and change the imbalance, only when it finds that the
> HT/MC optimization is needed.
> 
> Signed-off-by: Suresh Siddha <[EMAIL PROTECTED]>
> ---
> 
> diff --git a/kernel/sched.c b/kernel/sched.c
> index 9fe473a..03e5e8d 100644
> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -2511,7 +2511,7 @@ group_next:
>* a think about bumping its value to force at least one task to be
>* moved
>*/
> - if (*imbalance + SCHED_LOAD_SCALE_FUZZ < busiest_load_per_task) {
> + if (*imbalance < busiest_load_per_task) {
>   unsigned long tmp, pwr_now, pwr_move;
>   unsigned int imbn;
>  
> @@ -2563,10 +2563,8 @@ small_imbalance:
>   pwr_move /= SCHED_LOAD_SCALE;
>  
>   /* Move if we gain throughput */
> - if (pwr_move <= pwr_now)
> - goto out_balanced;
> -
> - *imbalance = busiest_load_per_task;
> + if (pwr_move > pwr_now)
> + *imbalance = busiest_load_per_task;
>   }
>  
>   return busiest;

Seems this didn't get merged? Latest git as of today still has the code
as it was before this patch.

Re: 2.6.22.5 forcedeth timeout hang

2007-09-04 Thread Steve Reinhardt


We're seeing this identical timeout starting with 2.6.21, any time we try and
push a significant amount of traffic through the nforce ethernet.  We've rolled
back to 2.6.20.18 and don't see any problems.  It seems that this bug got
introduced along with all the forcedeth fixes and optimizations in 2.6.21.

Steve




[PATCH] Add support for keyboard on SEGA Dreamcast

2007-09-04 Thread Adrian McMenamin

This patch will add support for the Dreamcast keyboard when used
alongside the maple bus patch (http://lkml.org/lkml/2007/9/4/165) and
the pvr2 patch.

Signed-off-by: Adrian McMenamin <[EMAIL PROTECTED]>

diff --git a/drivers/input/keyboard/Kconfig b/drivers/input/keyboard/Kconfig
index c97d5eb..1689f73 100644
--- a/drivers/input/keyboard/Kconfig
+++ b/drivers/input/keyboard/Kconfig
@@ -253,4 +253,14 @@ config KEYBOARD_GPIO
  To compile this driver as a module, choose M here: the
  module will be called gpio-keys.

+
+config KEYBOARD_MAPLE
+   tristate "Maple bus keyboard"
+   depends on SH_DREAMCAST && MAPLE
+   help
+ Say Y here if you have a DreamCast console running Linux and have
+ a keyboard attached to its Maple bus.
+
+ To compile this driver as a module, choose M here: the
+ module will be called maple_keyb. 
 endif
diff --git a/drivers/input/keyboard/Makefile b/drivers/input/keyboard/Makefile
index 28d211b..3f775ed 100644
--- a/drivers/input/keyboard/Makefile
+++ b/drivers/input/keyboard/Makefile
@@ -21,4 +21,5 @@ obj-$(CONFIG_KEYBOARD_OMAP)   += omap-keypad.o
 obj-$(CONFIG_KEYBOARD_PXA27x)  += pxa27x_keyboard.o
 obj-$(CONFIG_KEYBOARD_AAED2000)+= aaed2000_kbd.o
 obj-$(CONFIG_KEYBOARD_GPIO)+= gpio_keys.o
+obj-$(CONFIG_KEYBOARD_MAPLE)   += maple_keyb.o

diff --git a/drivers/input/keyboard/maple_keyb.c b/drivers/input/keyboard/maple_keyb.c
new file mode 100644
index 0000000..12e4692
--- /dev/null
+++ b/drivers/input/keyboard/maple_keyb.c
@@ -0,0 +1,222 @@
+/*
+ * SEGA Dreamcast keyboard driver
+ * Based on drivers/usb/usbkbd.c
+ * Copyright YAEGASHI Takeshi, 2001
+ * Porting to 2.6 Copyright Adrian McMenamin, 2007
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see the file COPYING, or write
+ * to the Free Software Foundation, Inc.,
+ * 51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+*/
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+MODULE_AUTHOR("YAEGASHI Takeshi, Adrian McMenamin");
+MODULE_DESCRIPTION("SEGA Dreamcast keyboard driver");
+MODULE_LICENSE("GPL");
+
+static unsigned char dc_kbd_keycode[256] = {
+   0, 0, 0, 0, 30, 48, 46, 32,
+   18, 33, 34, 35, 23, 36, 37, 38,
+   50, 49, 24, 25, 16, 19, 31, 20,
+   22, 47, 17, 45, 21, 44, 2, 3,
+   4, 5, 6, 7, 8, 9, 10, 11, 28,
+   1, 14, 15, 57, 12, 13, 26,
+   27, 43, 43, 39, 40, 41, 51,
+   52, 53, 58, 59, 60, 61, 62, 63, 64,
+   65, 66, 67, 68, 87, 88, 99,
+   70, 119, 110, 102, 104, 111,
+   107, 109, 106, 105, 108, 103,
+   69, 98, 55, 74, 78, 96, 79, 80,
+   81, 75, 76, 77, 71, 72, 73, 82, 83,
+   86, 127, 116, 117, 183, 184, 185,
+   186, 187, 188, 189, 190,
+   191, 192, 193, 194, 134, 138, 130, 132,
+   128, 129, 131, 137, 133, 135, 136, 113,
+   115, 114, 0, 0, 0, 121, 0, 89, 93, 124,
+   92, 94, 95, 0, 0, 0,
+   122, 123, 90, 91, 85, 0, 0, 0,
+   0, 0, 0, 0, 0, 0, 0, 0,
+   0, 0, 0, 0, 0, 0, 0, 0,
+   0, 0, 0, 0, 0, 0, 0, 0,
+   0, 0, 0, 0, 0, 0, 0, 0,
+   0, 0, 0, 0, 0, 0, 0, 0,
+   0, 0, 0, 0, 0, 0, 0, 0,
+   0, 0, 0, 0, 0, 0, 0, 0,
+   0, 0, 0, 0, 0, 0, 0, 0,
+   0, 0, 0, 0, 0, 0, 0, 0,
+   29, 42, 56, 125, 97, 54, 100, 126,
+   164, 166, 165, 163, 161, 115, 114, 113,
+   150, 158, 159, 128, 136, 177, 178, 176, 142,
+   152, 173, 140
+};
+
+struct dc_kbd {
+   struct input_dev *dev;
+   unsigned char new[8];
+   unsigned char old[8];
+};
+
+static void dc_scan_kbd(struct dc_kbd *kbd)
+{
+   int i;
+   struct input_dev *dev = kbd->dev;
+   for (i = 0; i < 8; i++)
+   input_report_key(dev,
+dc_kbd_keycode[i + 224],
+(kbd->new[0] >> i) & 1);
+
+   for (i = 2; i < 8; i++) {
+   if (kbd->old[i] > 3
+   && memchr(kbd->new + 2, kbd->old[i], 6) == NULL) {
+   if (dc_kbd_keycode[kbd->old[i]])
+   input_report_key(dev,
+   dc_kbd_keycode[kbd->old[i]], 0);
+   else
+   printk
+   ("Unknown key (scancode %#x) released.",
+kbd->old[i]);
+   }
+
+
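
The release detection in dc_scan_kbd() above compares two 8-byte reports with memchr(). The same comparison can be exercised in userspace. This is a sketch, not driver code: missing_keys() is a hypothetical helper, and using it in the opposite direction to find newly pressed keys is an assumption, since that half of the function is cut off in the posting.

```c
#include <stddef.h>
#include <string.h>

#define REPORT_LEN 8   /* matches the new[8]/old[8] arrays in struct dc_kbd */

/* Collect scancodes that appear in slots 2..7 of `from` but not in
 * slots 2..7 of `to`. Codes 0..3 are skipped, mirroring the `> 3` test
 * in the driver. Returns how many codes were written to `out`. */
size_t missing_keys(const unsigned char *from, const unsigned char *to,
                    unsigned char *out)
{
    size_t n = 0;

    for (int i = 2; i < REPORT_LEN; i++) {
        if (from[i] > 3 &&
            memchr(to + 2, from[i], REPORT_LEN - 2) == NULL)
            out[n++] = from[i];
    }
    return n;
}
```

Calling missing_keys(old, new, out) yields the released keys, as in the posted loop; calling missing_keys(new, old, out) would yield the pressed keys under the assumption stated above.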

Re: Suspend and hibernation status report

2007-09-04 Thread Len Brown

On Friday 27 July 2007 04:57, Rafael J. Wysocki wrote:

Thanks for writing this, Rafael.

> * system hibernation state - state, in which the system's processors are off
>   and its main memory is not powered, but the information necessary for continuing
>   the computations carried out when the system was last in a working state is
>   preserved in a storage space, such as a disk
> * ACPI S4 state - system hibernation state, in which some information is
>   preserved by the ACPI platform, in accordance with the ACPI specification

"some information is preserved by the ACPI platform" is sort of mis-leading.
What ACPI adds to the hibernate flow is some platform hooks to handle
wakeup devices, and a platform hook for the actual sleep request.
I'm not aware of any information saved by ACPI during S4 that is
not saved were the hibernate to be done with "acpi=off".

thanks,
-Len

[PATCH - RESUBMIT] Minor patch to pvr2 driver required for maple bus support on SEGA Dreamcast

2007-09-04 Thread Adrian McMenamin

The maple bus driver (http://lkml.org/lkml/2007/9/4/165) uses hardware
synchronisation between the maple bus and the VBLANK to poll the maple
bus. This patch makes the interrupt shareable.

By definition the interrupt is for both devices.

Signed-off-by: Adrian McMenamin <[EMAIL PROTECTED]>

diff --git a/drivers/video/pvr2fb.c b/drivers/video/pvr2fb.c
index 7d6c298..13de07f 100644
--- a/drivers/video/pvr2fb.c
+++ b/drivers/video/pvr2fb.c
@@ -890,7 +890,7 @@ static int __init pvr2fb_dc_init(void)
pvr2_fix.mmio_start = 0xa05f8000;   /* registers start here */
pvr2_fix.mmio_len   = 0x2000;

-   if (request_irq(HW_EVENT_VSYNC, pvr2fb_interrupt, 0,
+   if (request_irq(HW_EVENT_VSYNC, pvr2fb_interrupt, IRQF_SHARED,
"pvr2 VBL handler", fb_info)) {
return -EBUSY;
}

[PATCH] Add maple bus support for the SEGA Dreamcast

2007-09-04 Thread Adrian McMenamin

This patch adds support for SEGA's proprietary Maple bus. Maple is a
serial communications bus and support is required to operate Dreamcast
peripherals. A keyboard driver is also available and will be posted
separately.

Signed-off-by: Adrian McMenamin <[EMAIL PROTECTED]>

diff --git a/arch/sh/Kconfig b/arch/sh/Kconfig
index 54878f0..077438f 100644
--- a/arch/sh/Kconfig
+++ b/arch/sh/Kconfig
@@ -702,6 +702,17 @@ config CF_BASE_ADDR
default "0xb8000000" if CF_AREA6
default "0xb4000000" if CF_AREA5

+config MAPLE
+   bool "Maple Bus Support"
+   depends on SH_DREAMCAST
+   help
+ The Maple Bus is SEGA's serial communication bus for peripherals
+ on the Dreamcast. Without this bus support you won't be able to
+ get your Dreamcast keyboard etc to work, so most users
+ probably want to say 'Y' here, unless you are only using the
+ Dreamcast with a serial line terminal or a remote network
+ connection.
+
 source "arch/sh/drivers/pci/Kconfig"

 source "drivers/pci/Kconfig"
diff --git a/drivers/sh/Makefile b/drivers/sh/Makefile
index 8a14389..f0a1f4f 100644
--- a/drivers/sh/Makefile
+++ b/drivers/sh/Makefile
@@ -3,4 +3,5 @@
 #

 obj-$(CONFIG_SUPERHYWAY) += superhyway/
+obj-$(CONFIG_MAPLE) += maple/

diff --git a/drivers/sh/maple/Makefile b/drivers/sh/maple/Makefile
new file mode 100644
index 000..f8c39f2
--- /dev/null
+++ b/drivers/sh/maple/Makefile
@@ -0,0 +1,3 @@
+#Makefile for Maple Bus
+
+obj-y  := maplebus.o
diff --git a/drivers/sh/maple/maplebus.c b/drivers/sh/maple/maplebus.c
new file mode 100644
index 000..e6fd696
--- /dev/null
+++ b/drivers/sh/maple/maplebus.c
@@ -0,0 +1,738 @@
+/* maplebus.c
+ * Core maple bus functionality
+ * Original 2.4 code used here copyright
+ * YAEGASHI Takeshi, Paul Mundt, M. R. Brown and others
+ * Porting to 2.6 Copyright Adrian McMenamin, 2007
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see the file COPYING, or write
+ * to the Free Software Foundation, Inc.,
+ * 51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+MODULE_AUTHOR("Yaegashi Takeshi, Paul Mundt, MR Brown, Adrian McMenamin");
+MODULE_DESCRIPTION("Maple bus driver for Dreamcast");
+MODULE_LICENSE("GPL");
+MODULE_SUPPORTED_DEVICE("{{SEGA, Dreamcast/Maple}}");
+
+static void maple_dma_handler(struct work_struct *work);
+static void maple_vblank_handler(struct work_struct *work);
+
+static DECLARE_WORK(maple_dma_process, maple_dma_handler);
+static DECLARE_WORK(maple_vblank_process, maple_vblank_handler);
+
+static LIST_HEAD(maple_waitq);
+static LIST_HEAD(maple_sentq);
+
+
+static struct maple_driver maple_null_driver;
+static struct device maple_bus;
+static int subdevice_map[MAPLE_PORTS];
+static unsigned long *maple_sendbuf, *maple_sendptr, *maple_lastptr;
+static unsigned long maple_pnp_time;
+static int started, scanning, liststatus;
+static struct kmem_cache *maple_cache;
+
+int maple_driver_register(struct device_driver *drv)
+{
+   if (unlikely(!drv))
+   return -EINVAL;
+   drv->bus = &maple_bus_type;
+   return driver_register(drv);
+}
+EXPORT_SYMBOL_GPL(maple_driver_register);
+
+void maplebus_init_hardware(void)
+{
+   ctrl_outl(MAPLE_MAGIC, MAPLE_RESET);
+   /* set trig type to 0 for software trigger, 1 for hardware (VBLANK) */
+   ctrl_outl(1, MAPLE_TRIGTYPE);
+   ctrl_outl(MAPLE_2MBPS | MAPLE_TIMEOUT(5), MAPLE_SPEED);
+   ctrl_outl(PHYSADDR(maple_sendbuf), MAPLE_DMAADDR);
+   ctrl_outl(1, MAPLE_ENABLE);
+}
+EXPORT_SYMBOL_GPL(maplebus_init_hardware);
+
+void maple_getcond_callback(struct maple_device *dev,
+   void (*callback) (struct mapleq * mq),
+   unsigned long interval, unsigned long function)
+{
+   dev->callback = callback;
+   dev->interval = interval;
+   dev->function = cpu_to_be32(function);
+   dev->when = 0;
+}
+EXPORT_SYMBOL_GPL(maple_getcond_callback);
+
+int maple_dma_done(void)
+{
+   return (ctrl_inl(MAPLE_STATE) & 1) == 0;
+}
+EXPORT_SYMBOL_GPL(maple_dma_done);
+
+static void maple_release_device(struct device *dev)
+{
+   if (likely(dev->type)) {
+   if (likely(dev->type->name))
+   kfree(dev->type->name);
+   kfree(dev->type);
+   }

Re: [PATCH 3/6] x86: Convert cpu_sibling_map to be a per cpu variable (v2) (fwd)

2007-09-04 Thread Mike Travis



Andrew Morton wrote:
>> On Tue, 04 Sep 2007 13:29:11 -0700 Mike Travis <[EMAIL PROTECTED]> wrote:

>>> -- Forwarded message --
>>> Date: Fri, 31 Aug 2007 19:49:03 -0700
>>> From: Andrew Morton <[EMAIL PROTECTED]>
>>> To: [EMAIL PROTECTED]
>>> Cc: Andi Kleen <[EMAIL PROTECTED]>, [EMAIL PROTECTED], linux-kernel@vger.kernel.org,
>>> Christoph Lameter <[EMAIL PROTECTED]>
>>> Subject: Re: [PATCH 3/6] x86: Convert cpu_sibling_map to be a per cpu variable (v2)
>>>
>>> On Fri, 24 Aug 2007 15:26:57 -0700 [EMAIL PROTECTED] wrote:
>>>
>>> > Convert cpu_sibling_map from a static array sized by NR_CPUS to a
>>> > per_cpu variable.  This saves sizeof(cpumask_t) * NR unused cpus.
>>> > Access is mostly from startup and CPU HOTPLUG functions.
>>> ia64 allmodconfig:
>>>
>>> kernel/sched.c: In function `cpu_to_phys_group':
>>> kernel/sched.c:5937: error: `per_cpu__cpu_sibling_map' undeclared (first use in this function)
>>> kernel/sched.c:5937: error: (Each undeclared identifier is reported only once
>>> kernel/sched.c:5937: error: for each function it appears in.)
>>> kernel/sched.c:5937: warning: type defaults to `int' in declaration of `type name'
>>> kernel/sched.c:5937: error: invalid type argument of `unary *'
>>> kernel/sched.c: In function `build_sched_domains':
>>> kernel/sched.c:6172: error: `per_cpu__cpu_sibling_map' undeclared (first use in this function)
>>> kernel/sched.c:6172: warning: type defaults to `int' in declaration of `type name'
>>> kernel/sched.c:6172: error: invalid type argument of `unary *'
>>> kernel/sched.c:6183: warning: type defaults to `int' in declaration of `type name'
>>> kernel/sched.c:6183: error: invalid type argument of `unary *'
>>> 
>> I'm thinking that the best approach would be to define a cpu_sibling_map()
>> macro to handle the cases where cpu_sibling_map is not a per_cpu variable?
>> Perhaps something like:
>>
>> #ifdef CONFIG_SCHED_SMT
>> #ifndef cpu_sibling_map
>> #define cpu_sibling_map(cpu)    cpu_sibling_map[cpu]
>> #endif
>> #endif
>>
>> My question though, would include/linux/smp.h be the appropriate place for
>> the above define?  (That is, if the above approach is the correct one... ;-)
> 
> It'd be better to convert the unconverted architectures?

I can easily do the changes for ia64 and test them.  I don't have the capability
of testing on the powerpc.  

And are you asking for just the changes to fix the build problem, or the whole
set of the changes that were made for x86_64 and i386 in regards to converting
NR_CPU arrays to per cpu data?

Thanks,
Mike
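
To make the macro proposal quoted above concrete, here is a small user-space
sketch, assuming a simplified one-word cpumask_t and NR_CPUS of 4. The
helpers set_siblings() and cpu_has_sibling() are hypothetical and exist only
to show that generic code can use the accessor form cpu_sibling_map(cpu)
while an unconverted architecture still keeps a plain array. It illustrates
the fallback macro only, not the alternative of converting every
architecture.

```c
#include <assert.h>

#define NR_CPUS 4
typedef unsigned long cpumask_t;	/* simplified: one bit per CPU */

/* An "unconverted" architecture: still a plain NR_CPUS-sized array. */
cpumask_t cpu_sibling_map[NR_CPUS];

/* Fallback accessor in the spirit of the proposal. A converted
 * architecture would define cpu_sibling_map(cpu) itself, in terms of its
 * per-cpu variable, and this definition would then be skipped. */
#ifndef cpu_sibling_map
#define cpu_sibling_map(cpu) cpu_sibling_map[cpu]
#endif

/* Hypothetical helper: mark CPUs a and b as siblings of each other. */
void set_siblings(int a, int b)
{
	assert(a >= 0 && a < NR_CPUS && b >= 0 && b < NR_CPUS);
	cpu_sibling_map(a) |= (1UL << a) | (1UL << b);
	cpu_sibling_map(b) |= (1UL << a) | (1UL << b);
}

/* Generic code only ever uses the accessor form. */
int cpu_has_sibling(int cpu, int other)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return (int)((cpu_sibling_map(cpu) >> other) & 1UL);
}
```

Because the macro is function-like, the bare array name in the declaration
is left alone and only the cpu_sibling_map(cpu) call form is rewritten.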

Re: [PATCH 6/7] blk_end_request: remove/unexport end_that_request_*

2007-09-04 Thread Kiyoshi Ueda

Hi,

On Tue, 4 Sep 2007 17:25:14 -0400, "Halevy, Benny" <[EMAIL PROTECTED]> wrote:
> We suspect we'll still need the extern entry points for handling the bidi 
> request in the scsi_io_completion() path as we only want to call
> end_that_request_chunk on req->next_rq and never
> end_that_request_last.
>  
> (see 
> http://www.bhalevy.com/open-osd/download/linux-2.6.23-rc2_and_iscsi-iscsi-2007_08_09/0005-SCSI-bidi-support.patch)

If this patch-set is merged, there may be another way to do that.

For tricky drivers, a special interface, blk_end_request_callback(),
is added in patch 5/7.
(http://marc.info/?l=linux-kernel&m=118860027714753&w=2)
Currently, the only user of the interface is ide-cd (cdrom_newpc_intr()).
It also needs to call only end_that_request_first().

With patch 7/7, you can set your own handler in rq->end_io()
to complete the request in your own way.

Thanks,
Kiyoshi Ueda

Re: Fwd: That whole "Linux stealing our code" thing

2007-09-04 Thread Daniel Hazelton

On Tuesday 04 September 2007 15:44:31 Michael Poole wrote:
> Chris Friesen writes:
> > Daniel Hazelton wrote:
> >> On Tuesday 04 September 2007 09:27:02 Krzysztof Halasa wrote:
> >>>Daniel Hazelton <[EMAIL PROTECTED]> writes:
> >>>> US Copyright law. A copyright holder, regardless of what license he/she
> >>>> may have released the work under, can still revoke the license for a
> >>>> specific person or group of people. (There are some exceptions, but
> >>>> they do not apply to the situation that is being discussed)
> >
> > The OpenBSD policy page doesn't agree with you:
> >
> > "...That means that having granted a permission, the copyright holder
> > can not retroactively say that an individual or class of individuals
> > are no longer granted those permissions. Likewise should the copyright
> > holder decide to "go commercial" he can not revoke permissions already
> > granted for the use of the work as distributed, though he may impose
> > more restrictive permissions in his future distributions of that work."
> >
> > http://www.openbsd.org/policy.html
>
> By my reading, this is supported by 17 USC 203(a)(3):
>
>   (3) Termination of the grant may be effected at any time during a
>   period of five years beginning at the end of thirty-five years
>   from the date of execution of the grant; or, if the grant covers
>   the right of publication of the work, the period begins at the
>   end of thirty-five years from the date of publication of the
>   work under the grant or at the end of forty years from the date
>   of execution of the grant, whichever term ends earlier.
>
> (from
> http://www.law.cornell.edu/uscode/html/uscode17/usc_sec_17_0203000-.html )

Ah, I am both right and wrong, it seems. Apparently you have to wait anywhere
from 35 to 40 years, and then you only have a five year window. Seems damned
strange to me, but oh well.

(I'd totally forgotten that part of the law - or my mind decided to play 
tricks on me.)

DRH

PS: See, I will admit it when I'm shown evidence that I'm wrong :)

-- 
Dialup is like pissing through a pipette. Slow and excruciatingly painful.

Re: [mtd] allow modular mtdsuper

2007-09-04 Thread Satyam Sharma

Hi Jason,


On Tue, 4 Sep 2007, Jason Lunz wrote:
> 
> Declare mtdsuper to be gpl-licensed so it can access get_mtd_device and
> put_mtd_device when loaded as a module.

The actual issue was a bit different -- refer to commit bec494775600b1cd in
latest -git (patch included below).

David, it looks like .22 had this problem as well. If we care enough, you
could forward this on to -stable (cc'ed, just in case).

Satyam

[MTD] Makefile fix for mtdsuper

We want drivers/mtd/{mtdcore, mtdsuper, mtdpart}.c to be built and linked
into the same mtd.ko module. Fix the Makefile to ensure this, and remove
duplicate MODULE_ declarations in mtdpart.c, as mtdcore.c already has them.

Signed-off-by: Satyam Sharma <[EMAIL PROTECTED]>
Signed-off-by: David Woodhouse <[EMAIL PROTECTED]>

---

 drivers/mtd/Makefile  |2 +-
 drivers/mtd/mtdpart.c |4 
 2 files changed, 1 insertions(+), 5 deletions(-)

diff --git a/drivers/mtd/Makefile b/drivers/mtd/Makefile
index 451adcc..6d958a4 100644
--- a/drivers/mtd/Makefile
+++ b/drivers/mtd/Makefile
@@ -3,9 +3,9 @@
 #
 
 # Core functionality.
+obj-$(CONFIG_MTD)  += mtd.o
 mtd-y  := mtdcore.o mtdsuper.o
 mtd-$(CONFIG_MTD_PARTITIONS)   += mtdpart.o
-obj-$(CONFIG_MTD)  += $(mtd-y)
 
 obj-$(CONFIG_MTD_CONCAT)   += mtdconcat.o
 obj-$(CONFIG_MTD_REDBOOT_PARTS) += redboot.o
diff --git a/drivers/mtd/mtdpart.c b/drivers/mtd/mtdpart.c
index 9c62368..6174a97 100644
--- a/drivers/mtd/mtdpart.c
+++ b/drivers/mtd/mtdpart.c
@@ -560,7 +560,3 @@ int parse_mtd_partitions(struct mtd_info *master, const char **types,
 EXPORT_SYMBOL_GPL(parse_mtd_partitions);
 EXPORT_SYMBOL_GPL(register_mtd_parser);
 EXPORT_SYMBOL_GPL(deregister_mtd_parser);
-
-MODULE_LICENSE("GPL");
-MODULE_AUTHOR("Nicolas Pitre <[EMAIL PROTECTED]>");
-MODULE_DESCRIPTION("Generic support for partitioning of MTD devices");

Re: ramdisk

2007-09-04 Thread linux-os \(Dick Johnson\)

On Tue, 4 Sep 2007, Xu Yang wrote:

> Hi Dick,
> Thanks for the reply.
>
> then how to create these device nodes in /dev? from the information i
> got from the console (unknown block(1,0)), it seems that I didn't
> create the device? I thought the kernel should do this work, right? if
> not, how to create it?
>
> thanks,
>
> regards,
>

mknod /dev/ram0 b 1 0
mknod /dev/ram1 b 1 1
mknod /dev/ram2 b 1 2

Do this in the file-system you create for the RAM Disk.

>
> 2007/9/4, linux-os (Dick Johnson) <[EMAIL PROTECTED]>:
>>
>> On Mon, 3 Sep 2007, Xu Yang wrote:
>>
>>> Hi everyone,
>>>
>>> I want to use ramdisk to boot my filesystem, as I can't use NFS and harddisk.
>>>
>>> I have loaded the ramdisk into the ram memory (start address :0x400)
>>>
>>> and in the boot options I specified : root =dev/ram0 initrd=0x400
>>
>> Since you don't know what the default directory is, perhaps
>> root should be /dev/ram0. Also, make sure you actually create
>> those device nodes in /dev
>>
>> [Snipped...]
>>
>>> but the kernel said it can not find any file system on it.
>>>
>>> regards,
>>>
>>> Yang
>>>
>>
>> Cheers,
>> Dick Johnson
>> Penguin : Linux version 2.6.22.1 on an i686 machine (5588.30 BogoMips).
>> My book : http://www.AbominableFirebug.com/
>> _
>>
>>
>> 
>> The information transmitted in this message is confidential and may be 
>> privileged.  Any review, retransmission, dissemination, or other use of this 
>> information by persons or entities other than the intended recipient is 
>> prohibited.  If you are not the intended recipient, please notify Analogic 
>> Corporation immediately - by replying to this message or by sending an email 
>> to [EMAIL PROTECTED] - and destroy all copies of this information, including 
>> any attachments, without reading or disclosing them.
>>
>> Thank you.
>>
>

Cheers,
Dick Johnson
Penguin : Linux version 2.6.22.1 on an i686 machine (5588.30 BogoMips).
My book : http://www.AbominableFirebug.com/
_

The information transmitted in this message is confidential and may be 
privileged.  Any review, retransmission, dissemination, or other use of this 
information by persons or entities other than the intended recipient is 
prohibited.  If you are not the intended recipient, please notify Analogic 
Corporation immediately - by replying to this message or by sending an email to 
[EMAIL PROTECTED] - and destroy all copies of this information, including any 
attachments, without reading or disclosing them.

Thank you.

Re: [PATCH] Revised timerfd() interface

2007-09-04 Thread Davide Libenzi

On Tue, 4 Sep 2007, Michael Kerrisk wrote:

> > Useless like it'd be a motorcycle w/out a cup-holder :)
> > Seriously, the ability to get the previous values from "something" could 
> > have a meaning if this something is a shared global resource (like 
> > signals
> > for example). In the timerfd case this makes little sense, since you can 
> > create as many timerfd as you like and you do not need to share a single 
> > one by changing/restoring the original context.
> 
> Davide,
> 
> As I think about this more, I see more problems with
> your argument.  timerfd needs the ability to get and 
> get-while-setting just as much as the earlier APIs.
> Consider a library that creates a timerfd file descriptor that
> is handed off to an application: that library may want
> to modify the timer settings without having to create a
> new file descriptor (the app may not be able to be told about
> the new fd).  Your argument just doesn't hold, AFAICS.

Such a hypothetical library, in case it really wanted to offer such
functionality, could simply return a handle instead of the raw fd, and
take care of all that stuff in userspace.
Again, mimicking POSIX APIs doesn't always take you to the right place.



- Davide



Re: [PATCH] Revised timerfd() interface

2007-09-04 Thread Davide Libenzi

On Tue, 4 Sep 2007, Michael Kerrisk wrote:

> Hi Davide,
> 
> > > 
> > > 
> > > I'd have thought that the existing stuff would be near-useless without
> > > the capabilities which you describe?
> > 
> > Useless like it'd be a motorcycle w/out a cup-holder :)
> > Seriously, the ability to get the previous values from "something" could 
> > have a meaning if this something is a shared global resource (like 
> > signals
> > for example). In the timerfd case this makes little sense, since you can 
> > create as many timerfd as you like and you do not need to share a single 
> > one by changing/restoring the original context.
> 
> However, one can have multiple POSIX timers, just as you can 
> have multiple timerfd timers; nevertheless POSIX timers provide
> the get and get-while-setting functionality.

The fact that POSIX defined a certain API in a given way, does not 
automatically mean that every other API has to look exactly like that.
POSIX has the tendency to bloat things up at times ;)



> > and in terms of kernel code footprint.
> 
> Not sure what your concern is here.  The total amount of 
> new code for all of these options is pretty small.

From your patch:

fs/compat.c  |   34 --
fs/timerfd.c |  147 +++
include/linux/compat.h   |3 
include/linux/syscalls.h |3 
4 files changed, 153 insertions(+), 34 deletions(-)

And the API definition becomes pretty messy. The other way is to add new 
system calls. 120+ lines of code more of new system calls wouldn't even be 
a problem in itself, if the added value was there.
IMO, as I already said, the added value does not justify them.



- Davide



Re: [GIT PULL] x86 setup: work around bug in Xen HVM

2007-09-04 Thread Christoph Hellwig

On Tue, Sep 04, 2007 at 09:55:45AM -0700, H. Peter Anvin wrote:
> 
> Apparently XEN does not keep the contents of the 48-bit gdt_48 data
> structure that is passed to lgdt in the XEN machine state. Instead it
> appears to save the _address_ of the 48-bit descriptor
> somewhere. Unfortunately this data happens to reside on the stack and
> is probably no longer available at the time of the actual protected
> mode jump.
> 
> This is a Xen bug but given that there is a one-line patch to work
> around this problem, the linux kernel should probably do this.  My fix
> is to make the gdt_48 description in setup_gdt static (in setup_idt
> this is already the case). This allows the kernel to boot under
> Xen HVM again.

> - struct gdt_ptr gdt;
> + static struct gdt_ptr gdt;

It might make sense to add your above commit message to the code as a comment.
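
As a possible starting point for such a comment, here is a user-space sketch
of the failure mode described in the commit message. Everything in it is an
assumption made for illustration: fake_lgdt() models a hypervisor that keeps
the address of the descriptor rather than a copy, struct gdt_ptr is a
simplified (unpacked) version of the real one, and the numeric values are
arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the 48-bit descriptor passed to lgdt. */
struct gdt_ptr {
	uint16_t len;
	uint32_t ptr;
};

/* Hypothetical model of the reported behaviour: remember where the
 * descriptor lives instead of copying its contents. */
static const struct gdt_ptr *saved_desc;

static void fake_lgdt(const struct gdt_ptr *desc)
{
	saved_desc = desc;
}

void setup_gdt_static(void)
{
	/* Static storage: the object outlives this call, so the saved
	 * address stays valid. With a plain automatic variable, reading
	 * it after return would be undefined behaviour, which is the
	 * situation the workaround avoids. */
	static struct gdt_ptr gdt;

	gdt.len = 0x7ff;	/* arbitrary illustrative values */
	gdt.ptr = 0x1000;
	fake_lgdt(&gdt);
}

/* What the modelled hypervisor reads later, at the protected-mode jump. */
uint16_t saved_gdt_len(void)
{
	assert(saved_desc != NULL);
	return saved_desc->len;
}
```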

Re: [PATCH] Send quota messages via netlink

2007-09-04 Thread Jan Kara

On Tue 04-09-07 16:32:10, Serge E. Hallyn wrote:
> Quoting Jan Kara ([EMAIL PROTECTED]):
> > On Thu 30-08-07 17:14:47, Serge E. Hallyn wrote:
> > > Quoting Jan Kara ([EMAIL PROTECTED]):
> > > >   I imagine it so that you have a machine and on it several virtual
> > > > machines which are sharing a filesystem (or it could be a cluster). Now 
> > > > you
> > > > want UIDs to be independent between these virtual machines. That's it,
> > > > right?
> > > >   Now to continue the example: Alice has UID 100 on machineA, Bob has
> > > >  UID 100 on machineB. These translate to UIDs 1000 and 1001 on the 
> > > > common
> > > > filesystem. Process of Alice writes to a file and Bob becomes to be over
> > > > quota. In this situation, there would be probably two processes (from
> > > > machineA and machineB) listening on the netlink socket. We want to send 
> > > > a
> > > > message so that on Alice's desktop we can show a message: "You caused
> > > > Bob to exceed his quotas" and of Bob's desktop: "Alice has caused that 
> > > > you
> > > > are over quota.".
> > > 
> > > Since this is over NFS, you handle it the way you would any other time
> > > that user Alice on some other machine managed to do this.
> >   I meant this would actually happen over a local filesystem (imagine
> > something like "hostfs" from UML).
> 
> Ok, then that is where I was previously suggesting that we use an api to
> report a uid meaningful in bob's context, where we currently (in the
> absence of meaningful mount uids and uid equivalence) tell Bob that root
> was the one who brought him over quota.  From a user pov 'nobody' would
> make more sense, but I don't think we want the kernel to know about user
> nobody, right?
  But what is the problem with using the filesystem ids? All virtual
machines in my example should have a notion of those...

> So if the msg weren't broadcast, or netlink sockets were tied to one
> user namespace, we could call a
>   int uid_in_user_ns(struct user *, struct user_ns *)
> sending in Alice's user struct and Bob's userns, and use the result in
> the netlink message.  Otherwise I'm not sure what is the right answer.
> We just might need the equivalent of 'struct pid' to struct user, or
> persistent global user namespace ids (persistent after user namespace
> destruction, not across reboot) so we can safely send the user_ns * in a
> netlink msg.
  Yes, that could also be a solution.

> > > >   Because there may not be a notion of Bob on machineA or of Alice on
> > > > machineB, we are in trouble, right? What I like the most is to use the
> > > > filesystem identities (as you suggested in some other email). I. e. 
> > > > because
> > > > both Alice and Bob share a filesystem, identities of both have to make 
> > > > sense
> > > > to it (for example for purposes of permission checking). So we can 
> > > > probably
> > > 
> > > Right, so long as we're talking about local filesystems that's the way
> > > to go.  If a file write was allowed which brought bob over quota,
> > > clearly the person responsible had some uid valid on the filesystem to
> > > allow him to do so.
> >   Fine. So I'll keep UID in the quota netlink protocol with the meaning
> > "the identity of the user for filesystem operations".
> 
> I think that's ok.
> 
> Hopefully when that changes to accommodate user namespaces, we can use
> netlink field versioning to make that transition pretty seamless?
  Yes, we'd just assign the attribute a different number and teach
userspace about the new attribute format...

> If not, then we probably should in fact make some decision now so as not
> to change the api.
> 
> > > > send via netlink these (in our example ids 1000 and 1001) and hope that
> > > > inside machineA and machineB there will be a way to translate these
> > > > identities to names "Alice" and "Bob". So that user can understand what
> > > > is happening. Does this sound plausible?
> > > >   If we go this route, then we only need a kernel function, that will
> > > > for a pair ($filesystem, $task) return identity of that $task used
> > > > for operations on $filesystem...
> > > 
> > > Ok, now I see.  This is again unrelated to user namespaces, it's an
> > > issue regardless.
> > > 
> > > Is there no way to just report Alice as the guilty party to Bob on his
> > > machine as (host=nfsserver,uid=1000)?
> >   You know, in fact this contains all the information but it is quite 
> > useless
> > for an ordinary user. The message should be understandable to average 
> > desktop
> 
> What is the ordinary user going to do about it?  If the user didn't set
> up the nfsserver and/or the second client, the only thing he can do is
> report the guilty user to an admin.  In which case the tuple
> (host=nfsserver,uid=1000) is exactly the data he needs to report.
  Maybe write him an email or go and bang him with a baseball bat ;)
Seriously, if someone (like admin) is able to find a physical identity of the
guilty user, then we should be able to do this

Re: [PATCH 1/7] blk_end_request: add new request completion interface

2007-09-04 Thread Kiyoshi Ueda

Hi Jens,

Thank you for the comments.

On Mon, 3 Sep 2007 09:45:45 +0200, Jens Axboe <[EMAIL PROTECTED]> wrote:
> > +extern int blk_end_request(struct request *rq, int uptodate, int nr_bytes);
> > +extern int __blk_end_request(struct request *rq, int uptodate, int nr_bytes);
> >  extern int end_that_request_first(struct request *, int, int);
> >  extern int end_that_request_chunk(struct request *, int, int);
> >  extern void end_that_request_last(struct request *, int);
> 
> We get in to way too many levels of underscores here. Please change
> this to be blk_end_request() and blk_end_request_locked(), where the
> former grabs the queue lock but the latter assumes it's held. Then have
> the static __blk_end_request() where the lock MUST be held - do this in
> the caller, don't pass it as an argument!

It makes perfect sense but I have a reason I couldn't do it.

The goal of our patch set is to change the role of rq->end_io()
so that the request submitter can set its own procedure of
request completion (please see the patch 7/7).

So if the caller must hold the lock, we have to hold the lock
during the whole rq->end_io(), that includes end_that_request_first().
It would cause performance regression by making the lock held longer.
OTOH, the 'needlock' argument allows the completion handler to hold
the lock during the minimum piece of the code.

If you have any idea to fix the situation, I would appreciate it.

Below is the detailed explanation of the above.
I took your comment as below.
-
static void __blk_end_request(rq, uptodate) {
        blk_queue_end_tag();
        blkdev_dequeue_request();
        end_that_request_last(rq, uptodate);
}
int blk_end_request_locked(rq, uptodate, nr_bytes) {
        if (end_that_request_first(rq, uptodate, nr_bytes))
                return 1;
        add_disk_randomness();
        __blk_end_request(rq, uptodate);
        return 0;
}
EXPORT_SYMBOL_GPL(blk_end_request_locked);
int blk_end_request(rq, uptodate, nr_bytes) {
        if (end_that_request_first(rq, uptodate, nr_bytes))
                return 1;
        add_disk_randomness();
        spin_lock_irqsave();
        __blk_end_request(rq, uptodate);
        spin_unlock_irqrestore();
        return 0;
}
EXPORT_SYMBOL_GPL(blk_end_request);
-

It's quite reasonable on the patch 1/7.
But the goal of this patch-set is to allow to hook the whole request
completion procedures by a single rq->end_io() hook.
(Please see the patch 7/7 for details)
So the callee (function_to_be_set_in_rq_end_io() below) needs to know
whether it has the lock or not.
To prepare for the change of the patch 7/7 and avoid code duplication,
I chose passing the lock information as an argument.
-
static int function_to_be_set_in_rq_end_io(rq, uptodate, nr_bytes, needlock) {
        if (end_that_request_first(rq, uptodate, nr_bytes))
                return 1;
        add_disk_randomness();
        if (needlock)
                spin_lock_irqsave();
        __blk_end_request(rq, uptodate);
        if (needlock)
                spin_unlock_irqrestore();
        return 0;
}
int blk_end_request_locked(rq, uptodate, nr_bytes) {
        if (rq->end_io)
                return rq->end_io(rq, uptodate, nr_bytes, 0);

        if (end_that_request_first(rq, uptodate, nr_bytes))

}
int blk_end_request(rq, uptodate, nr_bytes) {
        if (rq->end_io)
                return rq->end_io(rq, uptodate, nr_bytes, 1);

        if (end_that_request_first(rq, uptodate, nr_bytes))

}
-

Thanks,
Kiyoshi Ueda
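
The 'needlock' pattern discussed above can be tried out in user space with a
mock lock. This is only a sketch under stated assumptions: struct mock_lock,
struct request as defined here, complete_bytes() and finish_locked() are
invented stand-ins for the queue lock, end_that_request_first() and the
final completion stage, and none of it is block layer code.

```c
#include <assert.h>

/* Mock queue lock: tracks whether it is held and how often it was taken. */
struct mock_lock {
	int held;
	int acquisitions;
};

/* Hypothetical, heavily simplified request. */
struct request {
	struct mock_lock *queue_lock;
	int remaining;		/* bytes still to complete */
	int finished;
};

void mock_lock_acquire(struct mock_lock *l)
{
	assert(!l->held);
	l->held = 1;
	l->acquisitions++;
}

void mock_lock_release(struct mock_lock *l)
{
	assert(l->held);
	l->held = 0;
}

/* Stand-in for end_that_request_first(): returns 1 while data remains.
 * It does not itself need the lock. */
static int complete_bytes(struct request *rq, int nr_bytes)
{
	rq->remaining -= (nr_bytes < rq->remaining) ? nr_bytes : rq->remaining;
	return rq->remaining > 0;
}

/* Stand-in for the final stage, which must run with the lock held. */
static void finish_locked(struct request *rq)
{
	assert(rq->queue_lock->held);
	rq->finished = 1;
}

/* One shared body; the flag says whether it must take the lock itself. */
static int end_request_common(struct request *rq, int nr_bytes, int needlock)
{
	if (complete_bytes(rq, nr_bytes))
		return 1;
	if (needlock)
		mock_lock_acquire(rq->queue_lock);
	finish_locked(rq);
	if (needlock)
		mock_lock_release(rq->queue_lock);
	return 0;
}

/* Caller does not hold the queue lock. */
int end_request(struct request *rq, int nr_bytes)
{
	return end_request_common(rq, nr_bytes, 1);
}

/* Caller already holds the queue lock. */
int end_request_locked(struct request *rq, int nr_bytes)
{
	return end_request_common(rq, nr_bytes, 0);
}
```

In the unlocked entry point the lock is taken only around the final stage,
which is the shorter hold time the message argues for.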

[mtd] allow modular mtdsuper

2007-09-04 Thread Jason Lunz


Declare mtdsuper to be gpl-licensed so it can access get_mtd_device and
put_mtd_device when loaded as a module.

Signed-off-by: Jason Lunz <[EMAIL PROTECTED]>

---
 drivers/mtd/mtdsuper.c |1 +
 1 file changed, 1 insertion(+)

Index: linux-2.6.22.6-uml/drivers/mtd/mtdsuper.c
===
--- linux-2.6.22.6-uml.orig/drivers/mtd/mtdsuper.c
+++ linux-2.6.22.6-uml/drivers/mtd/mtdsuper.c
@@ -14,6 +14,8 @@
 #include 
 #include 

+MODULE_LICENSE("GPL");
+
 /*
  * compare superblocks to see if they're equivalent
  * - they are if the underlying MTD device is the same

Re: What's happening with the cpuidle code?

2007-09-04 Thread Chuck Ebbert

On 09/04/2007 05:43 PM, Len Brown wrote:
> On Tuesday 04 September 2007 16:47, Chuck Ebbert wrote:
>> A look at the 'cpuidle' branch of git-acpi shows a commit
>> e40cede7d63a029e92712a3fe02faee60cc38fb4, "cpuidle: first
>> round of documentation updates" that doesn't show up in that
>> branch online. The entire Documentation/cpuidle directory
>> is missing from the tree when looking at the web pages, and
>> it's missing from git-acpi.patch in 2.6.23-rc4-mm1 (but the
>> patch shows up in the summary information in the patch
>> header.) Where did it go? And how can -mm be used to test
>> things if its patches don't even match their own headers?
> 
> A later patch in that series, "cpuidle: re-write", reverted
> the documentation from the intermediate patch that you refer to:
> 
> http://git.kernel.org/?p=linux/kernel/git/lenb/linux-acpi-2.6.git;a=commit;h=2305a5920fb8ee6ccec1c62ade05aa8351091d71

It never occurred to me that a patch would just remove documentation.
Thanks for looking into that...


Re: hang with CONFIG_MCYRIXIII

2007-09-04 Thread Mark Hindley

Having established that the oops was not the cause of the hangs I have
been observing with MCYRIXIII, could anyone suggest ways to track down
if this is a compiler or kernel bug?

I have just compiled latest git (with tcp_input.c oops fix)

With CONFIG_MCYRIXIII I got a hang with empty logs and nothing on the
serial line, sysrq unresponsive after about 2 hours

CONFIG_M586MMX has been solid

gcc (GCC) 4.1.3 20070812 (prerelease) (Debian 4.1.2-15)
Copyright (C) 2006 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

GNU ld (GNU Binutils for Debian) 2.17.90.20070812
Copyright 2007 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or (at your option) a later version.
This program has absolutely no warranty.
Mark Hindley

Thanks,

Mark

Re: What's happening with the cpuidle code?

2007-09-04 Thread Len Brown

On Tuesday 04 September 2007 16:47, Chuck Ebbert wrote:
> A look at the 'cpuidle' branch of git-acpi shows a commit
> e40cede7d63a029e92712a3fe02faee60cc38fb4, "cpuidle: first
> round of documentation updates" that doesn't show up in that
> branch online. The entire Documentation/cpuidle directory
> is missing from the tree when looking at the web pages, and
> it's missing from git-acpi.patch in 2.6.23-rc4-mm1 (but the
> patch shows up in the summary information in the patch
> header.) Where did it go? And how can -mm be used to test
> things if its patches don't even match their own headers?

A later patch in that series, "cpuidle: re-write", reverted
the documentation from the intermediate patch that you refer to:

http://git.kernel.org/?p=linux/kernel/git/lenb/linux-acpi-2.6.git;a=commit;h=2305a5920fb8ee6ccec1c62ade05aa8351091d71

The cpuidle branch on git.kernel.org looks okay to me:

http://git.kernel.org/?p=linux/kernel/git/lenb/linux-acpi-2.6.git;a=shortlog;h=cpuidle

the top commit is this one:

commit 8975059a2c1e56cfe83d1bcf031bcf4cb39be743
Author: Adam Belay <[EMAIL PROTECTED]>
Date:   Tue Aug 21 18:27:07 2007 -0400

CPUIDLE: load ACPI properly when CPUIDLE is disabled

And this top patch is indeed included in the latest acpi test branch,
as well as the latest ACPI test patch here:

http://ftp.kernel.org/pub/linux/kernel/people/lenb/acpi/patches/test/2.6.23/acpi-test-20070126-2.6.23-rc5.diff.gz

Note that there were some merge conflicts when pulling cpuidle into 2.6.23,
so you are best off running either the acpi patch above on top of 2.6.23-rc5,
or the acpi test  branch, or the mm tree so you won't have to merge the
cpuidle branch onto your latest kernel again.

cheers,
-Len



Re: ramdisk

2007-09-04 Thread Xu Yang

Hi Dick,
Thanks for the reply.

Then how do I create these device nodes in /dev? From the information I
got from the console (unknown block(1,0)), it seems that I didn't
create the device. I thought the kernel should do this work, right? If
not, how do I create it?

thanks,

regards,


2007/9/4, linux-os (Dick Johnson) <[EMAIL PROTECTED]>:
>
> On Mon, 3 Sep 2007, Xu Yang wrote:
>
> > Hi everyone,
> >
> > I want to use ramdisk to boot my filesystem, as I can't use NFS and 
> > harddisk.
> >
> > I have load the ramdisk into the ram memory (start address :0x400)
> >
> > and in the boot options I specified : root =dev/ram0 initrd=0x400
>
> Since you don't know what the default directory is, perhaps
> root should be /dev/ram0. Also, make sure you actually create
> those device nodes in /dev
>
> [Snipped...]
>
> > but the kernel said it can not find any file system on it.
> >
> > regards,
> >
> > Yang
> >
>
> Cheers,
> Dick Johnson
> Penguin : Linux version 2.6.22.1 on an i686 machine (5588.30 BogoMips).
> My book : http://www.AbominableFirebug.com/

Re: 2.6.23-rc4-mm1

2007-09-04 Thread Stephen Hemminger

On Tue, 4 Sep 2007 10:54:32 -0700
Zach Carter <[EMAIL PROTECTED]> wrote:

> 
> > +ioc3-program-uart-predividers.patch
> > +sky2-fe-chip-support.patch
> > +sky2-use-debugfs-rename.patch
> > +sky2-document-gphy_ctrl-bits.patch
> > +sky2-dont-restrict-config-space-access.patch
> > +sky2-advanced-error-reporting.patch
> > +sky2-use-pci_config-access-functions.patch
> > +sky2-use-net_device-internal-stats.patch
> > +ktime_sub_ns-analog-of-ktime_add_ns.patch
> > +export-reciprocal_value-for-modules.patch
> > +sky2-hardware-receive-timestamp-counter.patch

I already told Andrew to please drop this last patch, because
it causes interrupt messages. It seems masking off the IRQ
in hardware doesn't prevent that interrupt!

Re: [PATCH] Send quota messages via netlink

2007-09-04 Thread Serge E. Hallyn

Quoting Jan Kara ([EMAIL PROTECTED]):
> On Thu 30-08-07 17:14:47, Serge E. Hallyn wrote:
> > Quoting Jan Kara ([EMAIL PROTECTED]):
> > >   Maybe before proceeding further with the discussion I'd like to
> > > understand following: What are these user namespaces supposed to be good
> > > for?
> > 
> > (Please skip to the message end first, as I think you may not care about
> > the next bit of my blathering)
> > 
> > Right now they are only good for providing some separate accounting for
> > uid 1000 in one user namespace versus uid 1000 in another namespace.
> > All security enforcement must be done by actually providing separate
> > filesystems and separate pid namespaces and, hopefully, with a selinux
> > policy.
> > 
> > Eventually the idea will be that uid 1000 in one user namespace and uid
> > 1000 in another namespace will be completely separate entities.  A
> > mounted filesystem will be tied to a particular user namespace, and
> > the kernel will provide any cross-userns access perhaps the way I
> > described, with uid equivalence implemented through the keyring.
>   I see. Thanks for explanation.
> 
> > But note that this isn't really relevant when we get to NFS.  Two user
> > namespaces on one machine should have different network namespaces and
> > network addresses as well, and so should look to the NFS server like two
> > separate machines.
> > 
> > So the user namespaces are only really relevant when talking about local
> > filesystems.
> > 
> > >   I imagine it so that you have a machine and on it several virtual
> > > machines which are sharing a filesystem (or it could be a cluster). Now
> > > you want UIDs to be independent between these virtual machines. That's
> > > it, right?
> > >   Now to continue the example: Alice has UID 100 on machineA, Bob has
> > >  UID 100 on machineB. These translate to UIDs 1000 and 1001 on the common
> > > filesystem. Process of Alice writes to a file and Bob becomes to be over
> > > quota. In this situation, there would be probably two processes (from
> > > machineA and machineB) listening on the netlink socket. We want to send a
> > > message so that on Alice's desktop we can show a message: "You caused
> > > Bob to exceed his quotas" and of Bob's desktop: "Alice has caused that you
> > > are over quota.".
> > 
> > Since this is over NFS, you handle it the way you would any other time
> > that user Alice on some other machine managed to do this.
>   I meant this would actually happen over a local filesystem (imagine
> something like "hostfs" from UML).

Ok, then that is where I was previously suggesting that we use an api to
report a uid meaningful in bob's context, where we currently (in the
absence of meaningful mount uids and uid equivalence) tell Bob that root
was the one who brought him over quota.  From a user pov 'nobody' would
make more sense, but I don't think we want the kernel to know about user
nobody, right?

So if the msg weren't broadcast, or netlink sockets were tied to one
user namespace, we could call a
	int uid_in_user_ns(struct user *, struct user_ns *)
sending in Alice's user struct and Bob's userns, and use the result in
the netlink message.  Otherwise I'm not sure what the right answer is.
We just might need the equivalent of 'struct pid' for struct user, or
persistent global user namespace ids (persistent after user namespace
destruction, not across reboot) so we can safely send the user_ns * in a
netlink msg.
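
To make the uid_in_user_ns() idea concrete, here is a minimal userspace
sketch. Everything in it is an assumption made for illustration: the
struct layouts, the equivalence table and the UID_FALLBACK value are
hypothetical stand-ins, not kernel structures, and the pointers are
const-qualified here although the prototype above does not say so. The
lookup returns the caller's own uid inside its home namespace, an
explicitly registered equivalent uid in another namespace, and otherwise
falls back to root as described earlier in this mail.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these are real kernel types. */
struct user_ns {
	int id;
};

/* "This user is known as `uid` inside namespace `ns`." */
struct uid_equiv {
	const struct user_ns *ns;
	int uid;
};

struct user {
	const struct user_ns *ns;       /* home namespace */
	int uid;                        /* uid inside the home namespace */
	const struct uid_equiv *equiv;  /* equivalences in other namespaces */
	size_t nr_equiv;
};

/* Assumed fallback: with no equivalence, report root (uid 0). */
#define UID_FALLBACK 0

int uid_in_user_ns(const struct user *user, const struct user_ns *target)
{
	assert(user != NULL && target != NULL);

	/* Same namespace: the uid is meaningful as it stands. */
	if (user->ns == target)
		return user->uid;

	/* Otherwise look for a registered equivalence in the target. */
	for (size_t i = 0; i < user->nr_equiv; i++)
		if (user->equiv[i].ns == target)
			return user->equiv[i].uid;

	return UID_FALLBACK;
}
```

With Alice as uid 100 in her own namespace and an equivalence to uid
1001 in Bob's, a quota message destined for Bob's namespace would carry
1001, and a namespace with no equivalence would see the fallback.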

> > >   Because there may not be a notion of Bob on machineA or of Alice on
> > > machineB, we are in trouble, right? What I like the most is to use the
> > > filesystem identities (as you suggested in some other email). I.e.,
> > > because both Alice and Bob share a filesystem, identities of both have
> > > to make sense to it (for example for purposes of permission checking).
> > > So we can probably
> > 
> > Right, so long as we're talking about local filesystems that's the way
> > to go.  If a file write was allowed which brought bob over quota,
> > clearly the person responsible had some uid valid on the filesystem to
> > allow him to do so.
>   Fine. So I'll keep UID in the quota netlink protocol with the meaning
> "the identity of the user for filesystem operations".

I think that's ok.

Hopefully when that changes to accommodate user namespaces, we can use
netlink field versioning to make that transition pretty seamless?

If not, then we probably should in fact make some decision now so as not
to change the api.

> > > send via netlink these (in our example ids 1000 and 1001) and hope that
> > > inside machineA and machineB there will be a way to translate these
> > > identities to names "Alice" and "Bob". So that user can understand what
> > > is happening. Does this sound plausible?
> > >   If we go this route, then we only need a kernel function, that will
> > > for a pair ($filesystem, $task) return identity of that $task used
> > > for operations on $filesystem...
> > 
> > Ok, now

Re: [PATCH 6/7] blk_end_request: remove/unexport end_that_request_*

2007-09-04 Thread Jens Axboe

On Tue, Sep 04 2007, Halevy, Benny wrote:
> Boaz raised my attention to this patchset today...
> We suspect we'll still need the extern entry points for handling the bidi 
> request in the scsi_io_completion() path as we only want to call
> end_that_request_chunk on req->next_rq and never
> end_that_request_last.
>  
> (see 
> http://www.bhalevy.com/open-osd/download/linux-2.6.23-rc2_and_iscsi-iscsi-2007_08_09/0005-SCSI-bidi-support.patch)
>  
> If this is ok with you I'd leave these entry points in place rather than
> taking them out and putting them back in later.

There's no point in leaving them in when nothing current needs it, I'd
much rather add it back in should the need arise. That's the proper way
to handle things like this.

-- 
Jens Axboe


RE: [PATCH 6/7] blk_end_request: remove/unexport end_that_request_*

2007-09-04 Thread Halevy, Benny

Boaz raised my attention to this patchset today...
We suspect we'll still need the extern entry points for handling the bidi 
request in the scsi_io_completion() path as we only want to call
end_that_request_chunk on req->next_rq and never
end_that_request_last.
 
(see 
http://www.bhalevy.com/open-osd/download/linux-2.6.23-rc2_and_iscsi-iscsi-2007_08_09/0005-SCSI-bidi-support.patch)
 
If this is ok with you I'd leave these entry points in place rather than
taking them out and putting them back in later.
 
Benny



From: [EMAIL PROTECTED] on behalf of Kiyoshi Ueda
Sent: Sat 2007-09-01 01:43
To: linux-kernel@vger.kernel.org; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: [PATCH 6/7] blk_end_request: remove/unexport end_that_request_*



This patch removes the following functions:
  o end_that_request_first()
  o end_that_request_chunk()
and stops exporting the functions below:
  o end_that_request_last()

Signed-off-by: Kiyoshi Ueda <[EMAIL PROTECTED]>
Signed-off-by: Jun'ichi Nomura <[EMAIL PROTECTED]>
---
 block/ll_rw_blk.c      |   61
 include/linux/blkdev.h |   15
 2 files changed, 21 insertions(+), 55 deletions(-)

diff -rupN 05-ide-cd-change/block/ll_rw_blk.c 06-remove-old-interface/block/ll_rw_blk.c
--- 05-ide-cd-change/block/ll_rw_blk.c  2007-08-24 12:11:02.0 -0400
+++ 06-remove-old-interface/block/ll_rw_blk.c   2007-08-24 12:19:02.0 -0400
@@ -3388,6 +3388,20 @@ static void blk_recalc_rq_sectors(struct
}
 }

+/**
+ * __end_that_request_first - end I/O on a request
+ * @req:  the request being processed
+ * @uptodate: 1 for success, 0 for I/O error, < 0 for specific error
+ * @nr_bytes: number of bytes to complete
+ *
+ * Description:
+ * Ends I/O on a number of bytes attached to @req, and sets it up
+ * for the next range of segments (if any) in the cluster.
+ *
+ * Return:
+ * 0 - we are done with this request, call end_that_request_last()
+ * 1 - still buffers pending for this request
+ **/
 static int __end_that_request_first(struct request *req, int uptodate,
int nr_bytes)
 {
@@ -3498,49 +3512,6 @@ static int __end_that_request_first(stru
return 1;
 }

-/**
- * end_that_request_first - end I/O on a request
- * @req:  the request being processed
- * @uptodate: 1 for success, 0 for I/O error, < 0 for specific error
- * @nr_sectors: number of sectors to end I/O on
- *
- * Description:
- * Ends I/O on a number of sectors attached to @req, and sets it up
- * for the next range of segments (if any) in the cluster.
- *
- * Return:
- * 0 - we are done with this request, call end_that_request_last()
- * 1 - still buffers pending for this request
- **/
-int end_that_request_first(struct request *req, int uptodate, int nr_sectors)
-{
-   return __end_that_request_first(req, uptodate, nr_sectors << 9);
-}
-
-EXPORT_SYMBOL(end_that_request_first);
-
-/**
- * end_that_request_chunk - end I/O on a request
- * @req:  the request being processed
- * @uptodate: 1 for success, 0 for I/O error, < 0 for specific error
- * @nr_bytes: number of bytes to complete
- *
- * Description:
- * Ends I/O on a number of bytes attached to @req, and sets it up
- * for the next range of segments (if any). Like end_that_request_first(),
- * but deals with bytes instead of sectors.
- *
- * Return:
- * 0 - we are done with this request, call end_that_request_last()
- * 1 - still buffers pending for this request
- **/
-int end_that_request_chunk(struct request *req, int uptodate, int nr_bytes)
-{
-   return __end_that_request_first(req, uptodate, nr_bytes);
-}
-
-EXPORT_SYMBOL(end_that_request_chunk);
-
 /*
  * splice the completion data to a local structure and hand off to
  * process_completion_queue() to complete the requests
@@ -3620,7 +3591,7 @@ EXPORT_SYMBOL(blk_complete_request);
 /*
  * queue lock must be held
  */
-void end_that_request_last(struct request *req, int uptodate)
+static void end_that_request_last(struct request *req, int uptodate)
 {
struct gendisk *disk = req->rq_disk;
int error;
@@ -3655,8 +3626,6 @@ void end_that_request_last(struct reques
__blk_put_request(req->q, req);
 }

-EXPORT_SYMBOL(end_that_request_last);
-
 void end_request(struct request *req, int uptodate)
 {
__blk_end_request(req, uptodate, sect2byte(req->hard_cur_sectors));
diff -rupN 05-ide-cd-change/include/linux/blkdev.h 06-remove-old-interface/include/linux/blkdev.h
--- 05-ide-cd-change/include/linux/blkdev.h 2007-08-24 12:21:45.0 -0400
+++ 06-remove-old-interface/include/linux/blkdev.h  2007-08-24 12:21:15.0 -0400
@@ -720,19 +720,16 @@ static inline void blk_run_address_space
 }

 /*
- * end_request() and friends. Must be called

Re: 2.6.23-rc4-mm1

2007-09-04 Thread Wim Van Sebroeck

Hi,

> * Fix this warning:
> 
>   drivers/watchdog/core/watchdog_dev.c:84:
>   warning: format '%i' expects type 'int', but argument 5 has type 'size_t'
> 
> * CONFIG_xxx options are directly usable by preprocessor directives.

Patch works for me. I applied it to the linux-2.6-watchdog-mm tree.
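
For readers who have not met the warning quoted above: a size_t argument
needs the C99 'z' length modifier rather than '%i'. The helper below is
a hypothetical userspace illustration, not the watchdog driver's code;
the function name and the message text are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: format a length into buf.
 * "%zu" matches size_t; "%i" would expect an int and triggers the
 * format warning quoted above.
 */
int format_len(char *buf, size_t bufsize, size_t len)
{
	assert(buf != NULL && bufsize > 0);
	return snprintf(buf, bufsize, "len=%zu", len);
}
```

The return value is the number of characters snprintf() would have
written, so a caller can detect truncation by comparing it with bufsize.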

Greetings,
Wim.

Re: [PATCH 3/6] x86: Convert cpu_sibling_map to be a per cpu variable (v2) (fwd)

2007-09-04 Thread Andrew Morton

> On Tue, 04 Sep 2007 13:29:11 -0700 Mike Travis <[EMAIL PROTECTED]> wrote:
> [Sorry, I did not see this message until Christoph forwarded it to me.  I'm
> guessing we (SGI) still have a problem with our external spam filter?]
> 
> > 
> > -- Forwarded message --
> > Date: Fri, 31 Aug 2007 19:49:03 -0700
> > From: Andrew Morton <[EMAIL PROTECTED]>
> > To: [EMAIL PROTECTED]
> > Cc: Andi Kleen <[EMAIL PROTECTED]>, [EMAIL PROTECTED], 
> > linux-kernel@vger.kernel.org,
> > Christoph Lameter <[EMAIL PROTECTED]>
> > Subject: Re: [PATCH 3/6] x86: Convert cpu_sibling_map to be a per cpu variable (v2)
> > 
> > On Fri, 24 Aug 2007 15:26:57 -0700 [EMAIL PROTECTED] wrote:
> > 
> >> Convert cpu_sibling_map from a static array sized by NR_CPUS to a
> >> per_cpu variable.  This saves sizeof(cpumask_t) * NR unused cpus.
> >> Access is mostly from startup and CPU HOTPLUG functions.
> > 
> > ia64 allmodconfig:
> > 
> > kernel/sched.c: In function `cpu_to_phys_group':
> > kernel/sched.c:5937: error: `per_cpu__cpu_sibling_map' undeclared (first use in this function)
> > kernel/sched.c:5937: error: (Each undeclared identifier is reported only once
> > kernel/sched.c:5937: error: for each function it appears in.)
> > kernel/sched.c:5937: warning: type defaults to `int' in declaration of `type name'
> > kernel/sched.c:5937: error: invalid type argument of `unary *'
> > kernel/sched.c: In function `build_sched_domains':
> > kernel/sched.c:6172: error: `per_cpu__cpu_sibling_map' undeclared (first use in this function)
> > kernel/sched.c:6172: warning: type defaults to `int' in declaration of `type name'
> > kernel/sched.c:6172: error: invalid type argument of `unary *'
> > kernel/sched.c:6183: warning: type defaults to `int' in declaration of `type name'
> > kernel/sched.c:6183: error: invalid type argument of `unary *'
> > 
> 
> I'm thinking that the best approach would be to define a cpu_sibling_map()
> macro to handle the cases where cpu_sibling_map is not a per_cpu variable?
> Perhaps something like:
> 
> #ifdef CONFIG_SCHED_SMT
> #ifndef cpu_sibling_map
> #define cpu_sibling_map(cpu)	cpu_sibling_map[cpu]
> #endif
> #endif
> 
> My question though, would include/linux/smp.h be the appropriate place for
> the above define?  (That is, if the above approach is the correct one... ;-)

It'd be better to convert the unconverted architectures?
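
For reference, the fallback macro from the quoted proposal can be tried
out in userspace. This is only a sketch under stated assumptions: the
cpumask_t typedef, the NR_CPUS value and the two helper functions are
stand-ins invented for the example, not kernel definitions. It shows
that a function-like macro may share its name with the array it wraps,
because the bare identifier without a following '(' is not expanded.

```c
#include <assert.h>

#define NR_CPUS 4                   /* assumed small value for the sketch */
typedef unsigned long cpumask_t;    /* stand-in for the kernel type */

/* The unconverted form: a static array sized by NR_CPUS. */
cpumask_t cpu_sibling_map[NR_CPUS];

/* Fallback accessor, only defined when the architecture supplied none. */
#ifndef cpu_sibling_map
#define cpu_sibling_map(cpu)	cpu_sibling_map[(cpu)]
#endif

/* Hypothetical callers that only ever use the accessor form. */
void set_siblings(int cpu, cpumask_t mask)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	cpu_sibling_map(cpu) = mask;
}

cpumask_t get_siblings(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return cpu_sibling_map(cpu);
}
```

Callers written against the accessor would not need to change if an
architecture later supplied its own definition, which is the appeal of
the macro; converting every architecture avoids carrying the fallback
at all.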

Re: [PATCH] Fix a potential NULL pointer dereference in usbat_check_status() in drivers/usb/storage/shuttle_usbat.c

2007-09-04 Thread Jens Axboe

On Tue, Sep 04 2007, Simon Holm Thøgersen wrote:
> On Tue, 04 Sep 2007 at 13:06 +0200, Jens Axboe wrote:
> > On Tue, Sep 04 2007, Micah Gruber wrote:
> > > This patch fixes a potential null dereference bug where we dereference us 
> > > before a null check. This patch simply moves the dereferencing after the 
> > > null check.
> > > 
> > > Signed-off-by: Micah Gruber <[EMAIL PROTECTED]>
> > 
> > Be careful with stuff like that, if you actually look at the code, a us
> > == NULL doesn't seem to be possible (or usbat_flash_transport() would
> > have oopsed before).
> > 
> If that is true, then
> 	if (!us)
> 		return USB_STOR_TRANSPORT_ERROR;
> is utterly pointless.

Well that was the point I was trying to make, that test and return
should be deleted instead.
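
The pattern under discussion, reduced to a userspace sketch. The struct,
the status values and the error bit are hypothetical stand-ins and not
the real usbat code; the point is only that a NULL test placed after
the first dereference can never fire, so the choices are to test before
dereferencing or, when callers guarantee a valid pointer, to drop the
test.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status values and device structure for the sketch. */
enum { TRANSPORT_GOOD = 0, TRANSPORT_ERROR = 1 };

struct usb_dev_sketch {
	unsigned char *iobuf;
};

/*
 * Problematic shape: the initializer reads dev->iobuf first, so if dev
 * were NULL the fault would happen before the test is reached.
 *
 *	unsigned char *reply = dev->iobuf;
 *	if (!dev)
 *		return TRANSPORT_ERROR;
 */

/* Cleaned-up shape: callers guarantee dev is valid, so no runtime test. */
int check_status(struct usb_dev_sketch *dev)
{
	assert(dev != NULL);	/* documents the caller's contract only */
	unsigned char *reply = dev->iobuf;

	/* Bit 0 is an assumed "error" flag, chosen for illustration. */
	return (reply[0] & 0x01) ? TRANSPORT_ERROR : TRANSPORT_GOOD;
}
```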

-- 
Jens Axboe


Re: [linux-usb-devel] [PATCH] Fix a potential NULL pointer dereference in usbat_check_status() in drivers/usb/storage/shuttle_usbat.c

2007-09-04 Thread Alan Stern

On Tue, 4 Sep 2007, Simon Holm Thøgersen wrote:

> > On Tue, 04 Sep 2007 at 13:06 +0200, Jens Axboe wrote:
> > On Tue, Sep 04 2007, Micah Gruber wrote:
> > > This patch fixes a potential null dereference bug where we dereference us 
> > > before a null check. This patch simply moves the dereferencing after the 
> > > null check.
> > > 
> > > Signed-off-by: Micah Gruber <[EMAIL PROTECTED]>
> > 
> > Be careful with stuff like that, if you actually look at the code, a us
> > == NULL doesn't seem to be possible (or usbat_flash_transport() would
> > have oopsed before).
> > 
> If that is true, then
> 	if (!us)
> 		return USB_STOR_TRANSPORT_ERROR;
> is utterly pointless.

Indeed, so it is.

Alan Stern


[PATCH 2.6.23-rc5] USB Mass Storage: limit "Rockchip ROCK MP3" device (071b:3203) max I/O to 64 sectors per command

2007-09-04 Thread Massimiliano Ghilardi

From: Massimiliano Ghilardi <[EMAIL PROTECTED]>

The MP3/MP4/AVI player "Rockchip ROCK MP3" is seen as a USB disk, but fails
if more than 128 sectors (64kB) are sent or requested in a single read or write
command, and disconnects from the USB bus.

Typical kernel log showing the problem is:

usb 3-1: reset high speed USB device using ehci_hcd and address 6
usb 3-1: reset high speed USB device using ehci_hcd and address 6
sd 14:0:0:0: [sdb] Result: hostbyte=0x07 driverbyte=0x00
end_request: I/O error, dev sdb, sector 32
sd 14:0:0:0: [sdb] Result: hostbyte=0x07 driverbyte=0x00
end_request: I/O error, dev sdb, sector 32
usb 3-1: USB disconnect, address 6

This patch works around the device limitation by adding "Rockchip ROCK MP3"
to the unusual USB devices list and limiting data transfers to 64 sectors (32kB)
per command.
Tested on 2.6.23-rc5 (amd64).
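
The sector figures above assume 512-byte sectors (128 sectors = 64kB).
A small userspace sketch of that arithmetic follows; the helper names
are invented for the example and nothing here is usb-storage code. It
also shows how many commands a transfer needs once each command is
capped at a given number of sectors.

```c
#include <assert.h>

#define SECTOR_SIZE 512u	/* assumed, matching 128 sectors = 64kB */

/* Size in kB of a transfer of the given number of sectors. */
unsigned int sectors_to_kib(unsigned int sectors)
{
	return sectors * SECTOR_SIZE / 1024u;
}

/* Commands needed to move total_sectors with a per-command cap. */
unsigned int commands_needed(unsigned int total_sectors,
			     unsigned int max_sectors)
{
	assert(max_sectors > 0);
	/* Round up so a partial final command is counted. */
	return (total_sectors + max_sectors - 1) / max_sectors;
}
```

Under these assumptions a 240-sector request that previously went out
as one command is split into four commands of at most 32kB each.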

Signed-off-by: Massimiliano Ghilardi <[EMAIL PROTECTED]>
---

 drivers/usb/storage/unusual_devs.h |   16 
 1 file changed, 16 insertions(+)

--- a/drivers/usb/storage/unusual_devs.h	2007-09-01 17:08:55.0 +0200
+++ b/drivers/usb/storage/unusual_devs.h	2007-09-04 22:40:28.0 +0200
@@ -897,6 +897,22 @@
US_SC_DEVICE, US_PR_DEVICE, NULL,
US_FL_FIX_CAPACITY ),
 
+/* Reported by Massimiliano Ghilardi <[EMAIL PROTECTED]>
+ * This USB MP3/AVI player device fails and disconnects if more than 128 sectors (64kB)
+ * are read/written in a single command, and may be present at least in the following products:
+ * 
+ * "Magnex Digital Video Panel DVP 1800"
+ * "MP4 AIGO 4GB SLOT SD"
+ * "Teclast TL-C260 MP3"
+ * "i.Meizu PMP MP3/MP4"
+ * "Speed MV8 MP4 Audio Player"
+ */
+UNUSUAL_DEV(  0x071b, 0x3203, 0x0100, 0x0100,
+   "RockChip",
+   "ROCK MP3",
+   US_SC_DEVICE, US_PR_DEVICE, NULL,
+   US_FL_MAX_SECTORS_64),
+
 /* Reported by Olivier Blondeau <[EMAIL PROTECTED]> */
 UNUSUAL_DEV(  0x0727, 0x0306, 0x0100, 0x0100,
"ATMEL",

Re: origin of __tmp1930643048 network device name: kernel-space or user-space

2007-09-04 Thread Kay Sievers

On 9/4/07, davide rossetti <[EMAIL PROTECTED]> wrote:
> I'm trying to track down a problem on a Sun V40Z server with 4 network
> devices grabbing random ethX device names. now, trying to force the
> device names to what I want, I got a __tmpX form of device name,
> which I think is a half-configured device... but which piece of
> software is to blame ??? kernel, udev, hotplug
>
> it is a Fedora Core 6, fully updated (kernel-2.6.22.2-42.fc6
> udev-095-17.fc6.x86_64)

I don't think any of the mentioned tools is to blame. Please use the
distro's bugtracker.

Thanks,
Kay

[PATCH 2/2] Fix (improve) deadlock condition on module removal netfilter socket option removal

2007-09-04 Thread Neil Horman

Hey-
2nd of two patches.  This patch enhances modprobe to operate like rmmod
in non-blocking mode.  It also adds a -w option to allow for explicit blocking
operation.
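
The option handling described above comes down to one bit operation. A
userspace sketch, using a hypothetical helper name: removal starts from
O_NONBLOCK|O_EXCL, as in the patch below, and -w clears O_NONBLOCK so
that the removal may block. How the flags reach the kernel is not part
of this sketch.

```c
#include <assert.h>
#include <fcntl.h>

/*
 * Hypothetical helper mirroring the patch's option handling:
 * non-blocking by default, blocking only when -w/--wait was given.
 */
int removal_flags(int wait_requested)
{
	int flags = O_NONBLOCK | O_EXCL;	/* default from the patch */

	if (wait_requested)
		flags &= ~O_NONBLOCK;		/* what "case 'w'" does */

	assert(flags & O_EXCL);			/* O_EXCL is never cleared */
	return flags;
}
```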

Regards
Neil

Signed-off-by: Neil Horman <[EMAIL PROTECTED]>


 modprobe.8 |    9 +
 modprobe.c |   21 ++---
 2 files changed, 23 insertions(+), 7 deletions(-)


diff --git a/modprobe.8 b/modprobe.8
index 6910b5a..83f3229 100644
--- a/modprobe.8
+++ b/modprobe.8
@@ -109,6 +109,15 @@ sense to specify module parameters when removing modules).
 There is usually no reason to remove modules, but some
 buggy modules require it.  Your kernel may not support
 removal of modules.
+
+.TP
+\fB-w --wait \fR
+This option is applicable only with the -r or --remove option.
+It causes modprobe to block in the kernel waiting for the specified
+modules reference count to reach zero.  Default operation is for
+modprobe to operate like rmmod, which exits with EWOULDBLOCK if the 
+modules reference count is non-zero.
+
 .TP
 \fB-V --version \fR
 Show version of program, and exit.  See below for caveats when run on older kernels.
diff --git a/modprobe.c b/modprobe.c
index ea8de74..c9bd6af 100644
--- a/modprobe.c
+++ b/modprobe.c
@@ -913,7 +913,8 @@ static void rmmod(struct list_head *list,
  struct module_command *commands,
  int ignore_commands,
  int ignore_inuse,
- const char *cmdline_opts)
+ const char *cmdline_opts,
+ int flags)
 {
const char *command;
unsigned int usecount = 0;
@@ -967,7 +968,7 @@ static void rmmod(struct list_head *list,
/* Now do things we depend. */
if (!list_empty(list))
rmmod(list, NULL, 0, warn, dry_run, verbose, commands,
- 0, 1, cmdline_opts);
+ 0, 1, cmdline_opts, flags);
return;
 
 nonexistent_module:
@@ -1333,7 +1334,8 @@ static void handle_module(const char *modname,
  int strip_vermagic,
  int strip_modversion,
  int unknown_silent,
- const char *cmdline_opts)
+ const char *cmdline_opts,
+ int flags)
 {
if (list_empty(todo_list)) {
const char *command;
@@ -1355,7 +1357,7 @@ static void handle_module(const char *modname,
 
if (remove)
rmmod(todo_list, newname, first_time, error, dry_run, verbose,
- commands, ignore_commands, 0, cmdline_opts);
+ commands, ignore_commands, 0, cmdline_opts, flags);
else
insmod(todo_list, NOFAIL(strdup(options)), newname,
   first_time, error, dry_run, verbose, modoptions,
@@ -1368,6 +1370,7 @@ static struct option options[] = { { "verbose", 0, NULL, 'v' },
   { "config", 1, NULL, 'C' },
   { "name", 1, NULL, 'o' },
   { "remove", 0, NULL, 'r' },
+  { "wait", 0, NULL, 'w' },
   { "showconfig", 0, NULL, 'c' },
   { "autoclean", 0, NULL, 'k' },
   { "quiet", 0, NULL, 'q' },
@@ -1430,6 +1433,7 @@ int main(int argc, char *argv[])
char *newname = NULL;
char *aliasfilename, *symfilename;
errfn_t error = fatal;
+   int flags = O_NONBLOCK|O_EXCL;
 
/* Prepend options from environment. */
argv = merge_args(getenv("MODPROBE_OPTIONS"), argv, );
@@ -1444,7 +1448,7 @@ int main(int argc, char *argv[])
try_old_version("modprobe", argv);
 
uname();
-   while ((opt = getopt_long(argc, argv, "vVC:o:rknqQsclt:aifb", options, NULL)) != -1){
+   while ((opt = getopt_long(argc, argv, "vVC:o:rknqQsclt:aifbw", options, NULL)) != -1){
switch (opt) {
case 'v':
add_to_env_var("-v");
@@ -1529,6 +1533,9 @@ int main(int argc, char *argv[])
case 'b':
use_blacklist = 1;
break;
+   case 'w':
+   flags &= ~O_NONBLOCK;
+   break;
case 1:
strip_vermagic = 1;
break;
@@ -1651,7 +1658,7 @@ int main(int argc, char *argv[])
  ignore_proc, strip_vermagic,
  strip_modversion,
  unknown_silent,
- optstring);
+ optstring, flags);
 
aliases = aliases->next;
INIT_LIST_HEAD();
@@ -1666,7 +1673,7 @@ int main(int argc, char *argv[])

Re: [PATCH] Revised timerfd() interface

2007-09-04 Thread Michael Kerrisk

> > > The ABI change doesn't really matter, since timerfd() was broken in
> > > 2.6.22 anyway.
> > > 
> > > Both previous APIs provided the features I have described provide:
> > > 
> > > * the ability to fetch the old timer value when applying
> > >   a new setting
> > > 
> > > * the ability to non-destructively fetch the amount of time remaining
> > >   on a timer.
> > > 
> > > This is clearly useful for timers -- but you have not explained why
> > > you think this is not necessary for timerfd timers.
> > 
> > 
> > 
> > I'd have thought that the existing stuff would be near-useless without
> > the capabilities which you describe?
> 
> Useless like it'd be a motorcycle w/out a cup-holder :)
> Seriously, the ability to get the previous values from "something" could
> have a meaning if this something is a shared global resource (like
> signals for example). In the timerfd case this makes little sense, since
> you can create as many timerfd as you like and you do not need to share a
> single one by changing/restoring the original context.

Davide,

As I think about this more, I see more problems with
your argument.  timerfd needs the ability to get and 
get-while-setting just as much as the earlier APIs.
Consider a library that creates a timerfd file descriptor that
is handed off to an application: that library may want
to modify the timer settings without having to create a
new file descriptor (the app may not be able to be told about
the new fd).  Your argument just doesn't hold, AFAICS.

Cheers,

Michael
-- 
Michael Kerrisk
maintainer of Linux man pages Sections 2, 3, 4, 5, and 7 

Want to help with man page maintenance?  
Grab the latest tarball at
http://www.kernel.org/pub/linux/docs/manpages , 
read the HOWTOHELP file and grep the source 
files for 'FIXME'.


What's happening with the cpuidle code?

2007-09-04 Thread Chuck Ebbert

A look at the 'cpuidle' branch of git-acpi shows a commit
e40cede7d63a029e92712a3fe02faee60cc38fb4, "cpuidle: first
round of documentation updates" that doesn't show up in that
branch online. The entire Documentation/cpuidle directory
is missing from the tree when looking at the web pages, and
it's missing from git-acpi.patch in 2.6.23-rc4-mm1 (but the
patch shows up in the summary information in the patch
header.) Where did it go? And how can -mm be used to test
things if its patches don't even match their own headers?


Re: [PATCH] slub - Use local_t protection

2007-09-04 Thread Christoph Lameter

On Tue, 4 Sep 2007, Mathieu Desnoyers wrote:

> @@ -1566,12 +1565,13 @@ redo:
>   object[c->offset]) != object))
>   goto redo;
>  
> - put_cpu();
> + local_exit(flags);
>   if (unlikely((gfpflags & __GFP_ZERO)))
>   memset(object, 0, c->objsize);
>  
>   return object;
>  slow:
> + local_exit(flags);

Here we can be rescheduled to another processor.

>   return __slab_alloc(s, gfpflags, node, addr, c)

c may point to the wrong processor.

Re: [PATCH 3/6] x86: Convert cpu_sibling_map to be a per cpu variable (v2) (fwd)

2007-09-04 Thread Mike Travis

[Sorry, I did not see this message until Christoph forwarded it to me.  I'm
guessing we (SGI) still have a problem with our external spam filter?]

> 
> -- Forwarded message --
> Date: Fri, 31 Aug 2007 19:49:03 -0700
> From: Andrew Morton <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Cc: Andi Kleen <[EMAIL PROTECTED]>, [EMAIL PROTECTED], 
> linux-kernel@vger.kernel.org,
> Christoph Lameter <[EMAIL PROTECTED]>
> Subject: Re: [PATCH 3/6] x86: Convert cpu_sibling_map to be a per cpu variable
> (v2)
> 
> On Fri, 24 Aug 2007 15:26:57 -0700 [EMAIL PROTECTED] wrote:
> 
>> Convert cpu_sibling_map from a static array sized by NR_CPUS to a
>> per_cpu variable.  This saves sizeof(cpumask_t) * NR unused cpus.
>> Access is mostly from startup and CPU HOTPLUG functions.
> 
> ia64 allmodconfig:
> 
> kernel/sched.c: In function `cpu_to_phys_group':
> kernel/sched.c:5937: error: `per_cpu__cpu_sibling_map' undeclared (first use in this function)
> kernel/sched.c:5937: error: (Each undeclared identifier is reported only once
> kernel/sched.c:5937: error: for each function it appears in.)
> kernel/sched.c:5937: warning: type defaults to `int' in declaration of `type name'
> kernel/sched.c:5937: error: invalid type argument of `unary *'
> kernel/sched.c: In function `build_sched_domains':
> kernel/sched.c:6172: error: `per_cpu__cpu_sibling_map' undeclared (first use in this function)
> kernel/sched.c:6172: warning: type defaults to `int' in declaration of `type name'
> kernel/sched.c:6172: error: invalid type argument of `unary *'
> kernel/sched.c:6183: warning: type defaults to `int' in declaration of `type name'
> kernel/sched.c:6183: error: invalid type argument of `unary *'
>

I'm thinking that the best approach would be to define a cpu_sibling_map() macro
to handle the cases where cpu_sibling_map is not a per_cpu variable?  Perhaps
something like:

#ifdef CONFIG_SCHED_SMT
#ifndef cpu_sibling_map
#define cpu_sibling_map(cpu)	cpu_sibling_map[cpu]
#endif
#endif

My question though, would include/linux/smp.h be the appropriate place for
the above define?  (That is, if the above approach is the correct one... ;-)
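
As a rough userspace illustration of the fallback idea (not kernel code): an
unconverted arch keeps the plain array, a converted arch would supply its own
cpu_sibling_map(cpu) accessor, and generic code only ever uses the accessor.
The cpumask_t typedef, NR_CPUS value and nr_siblings() helper below are
stand-ins chosen for the example, and the CONFIG_SCHED_SMT guard is left out.

```c
#define NR_CPUS 4
typedef unsigned long cpumask_t;	/* stand-in for the kernel type */

/* Unconverted arch: still a static array sized by NR_CPUS. */
static cpumask_t cpu_sibling_map[NR_CPUS];

/*
 * Fallback accessor. An arch that made the map a per-cpu variable would
 * already have defined cpu_sibling_map(cpu), so this block is skipped there.
 * A function-like macro is only expanded when the name is followed by '(',
 * so the array itself stays reachable under the same name.
 */
#ifndef cpu_sibling_map
#define cpu_sibling_map(cpu)	cpu_sibling_map[cpu]
#endif

/* Generic code, written once against the accessor. */
static int nr_siblings(int cpu)
{
	cpumask_t mask = cpu_sibling_map(cpu);
	int n = 0;

	for (; mask; mask &= mask - 1)	/* clear lowest set bit each pass */
		n++;
	return n;
}
```

With this shape, kernel/sched.c would not need to know which representation
the architecture picked.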

Thanks,
Mike

Re: Fwd: That whole "Linux stealing our code" thing

2007-09-04 Thread linux-os (Dick Johnson)

On Tue, 4 Sep 2007, Chris Friesen wrote:

> Daniel Hazelton wrote:
>> On Tuesday 04 September 2007 09:27:02 Krzysztof Halasa wrote:
>>
>>> Daniel Hazelton <[EMAIL PROTECTED]> writes:
>>>
>>>> US Copyright law. A copyright holder, regardless of what license he/she
>>>> may have released the work under, can still revoke the license for a
>>>> specific person or group of people. (There are some exceptions, but they
>>>> do not apply to the situation that is being discussed)
>
> The OpenBSD policy page doesn't agree with you:
>
> "...That means that having granted a permission, the copyright holder
> can not retroactively say that an individual or class of individuals are
> no longer granted those permissions. Likewise should the copyright
> holder decide to "go commercial" he can not revoke permissions already
> granted for the use of the work as distributed, though he may impose
> more restrictive permissions in his future distributions of that work."
>
> http://www.openbsd.org/policy.html
>
>
> Chris
> -

There are other enforceability issues as well. For instance in the
US, Copyright Law applies as soon as something is written. So,
does Copyright Law apply if I write, "You cannot read this."
Of course, it's a trivial example. Revocation of a license to
read a work is absurd. Using this theory, once somebody's
written "work" has been distributed under some license, a
different license would likely be regarded as unenforceable
by a court.

Cheers,
Dick Johnson
Penguin : Linux version 2.6.22.1 on an i686 machine (5588.30 BogoMips).
My book : http://www.AbominableFirebug.com/

[PATCH 1/2] Fix (improve) deadlock condition on module removal netfilter socket option removal

2007-09-04 Thread Neil Horman

Patch 1/2 to fix netfilter socket option removal

This patch changes netfilter socket options to do reference counting on the
module refcounter (And saves us 4 bytes in the structure to boot :) ).

regards

Neil

Signed-off-by: Neil Horman <[EMAIL PROTECTED]>


 include/linux/netfilter.h                      |    5 +--
 net/bridge/netfilter/ebtables.c                |    1 +
 net/ipv4/ipvs/ip_vs_ctl.c                      |    1 +
 net/ipv4/netfilter/arp_tables.c                |    1 +
 net/ipv4/netfilter/ip_tables.c                 |    1 +
 net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c |    1 +
 net/ipv6/netfilter/ip6_tables.c                |    1 +
 net/netfilter/nf_sockopt.c                     |   36 +++--
 8 files changed, 19 insertions(+), 28 deletions(-)


diff --git a/include/linux/netfilter.h b/include/linux/netfilter.h
index 0eed0b7..1dd075e 100644
--- a/include/linux/netfilter.h
+++ b/include/linux/netfilter.h
@@ -88,9 +88,8 @@ struct nf_sockopt_ops
int (*compat_get)(struct sock *sk, int optval,
void __user *user, int *len);
 
-   /* Number of users inside set() or get(). */
-   unsigned int use;
-   struct task_struct *cleanup_task;
+   /* Use the module struct to lock set/get code in place */
+   struct module *owner;
 };
 
 /* Each queued (to userspace) skbuff has one of these. */
diff --git a/net/bridge/netfilter/ebtables.c b/net/bridge/netfilter/ebtables.c
index 4169a2a..6018d0e 100644
--- a/net/bridge/netfilter/ebtables.c
+++ b/net/bridge/netfilter/ebtables.c
@@ -1513,6 +1513,7 @@ static struct nf_sockopt_ops ebt_sockopts =
.get_optmin = EBT_BASE_CTL,
.get_optmax = EBT_SO_GET_MAX + 1,
.get= do_ebt_get_ctl,
+   .owner  = THIS_MODULE,
 };
 
 static int __init ebtables_init(void)
diff --git a/net/ipv4/ipvs/ip_vs_ctl.c b/net/ipv4/ipvs/ip_vs_ctl.c
index e1052bc..0234122 100644
--- a/net/ipv4/ipvs/ip_vs_ctl.c
+++ b/net/ipv4/ipvs/ip_vs_ctl.c
@@ -2340,6 +2340,7 @@ static struct nf_sockopt_ops ip_vs_sockopts = {
.get_optmin = IP_VS_BASE_CTL,
.get_optmax = IP_VS_SO_GET_MAX+1,
.get= do_ip_vs_get_ctl,
+   .owner  = THIS_MODULE,
 };
 
 
diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c
index d1149ab..29114a9 100644
--- a/net/ipv4/netfilter/arp_tables.c
+++ b/net/ipv4/netfilter/arp_tables.c
@@ -1161,6 +1161,7 @@ static struct nf_sockopt_ops arpt_sockopts = {
.get_optmin = ARPT_BASE_CTL,
.get_optmax = ARPT_SO_GET_MAX+1,
.get= do_arpt_get_ctl,
+   .owner  = THIS_MODULE,
 };
 
 static int __init arp_tables_init(void)
diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c
index e1b402c..6486894 100644
--- a/net/ipv4/netfilter/ip_tables.c
+++ b/net/ipv4/netfilter/ip_tables.c
@@ -2296,6 +2296,7 @@ static struct nf_sockopt_ops ipt_sockopts = {
 #ifdef CONFIG_COMPAT
.compat_get = compat_do_ipt_get_ctl,
 #endif
+   .owner  = THIS_MODULE,
 };
 
 static struct xt_match icmp_matchstruct __read_mostly = {
diff --git a/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c b/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c
index 64552af..fd65337 100644
--- a/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c
+++ b/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c
@@ -403,6 +403,7 @@ static struct nf_sockopt_ops so_getorigdst = {
.get_optmin = SO_ORIGINAL_DST,
.get_optmax = SO_ORIGINAL_DST+1,
.get= ,
+   .owner  = THIS_MODULE,
 };
 
 struct nf_conntrack_l3proto nf_conntrack_l3proto_ipv4 __read_mostly = {
diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c
index aeda617..cd9df02 100644
--- a/net/ipv6/netfilter/ip6_tables.c
+++ b/net/ipv6/netfilter/ip6_tables.c
@@ -1462,6 +1462,7 @@ static struct nf_sockopt_ops ip6t_sockopts = {
.get_optmin = IP6T_BASE_CTL,
.get_optmax = IP6T_SO_GET_MAX+1,
.get= do_ip6t_get_ctl,
+   .owner  = THIS_MODULE,
 };
 
 static struct xt_match icmp6_matchstruct __read_mostly = {
diff --git a/net/netfilter/nf_sockopt.c b/net/netfilter/nf_sockopt.c
index 8b8ece7..e32761c 100644
--- a/net/netfilter/nf_sockopt.c
+++ b/net/netfilter/nf_sockopt.c
@@ -55,18 +55,7 @@ EXPORT_SYMBOL(nf_register_sockopt);
 
 void nf_unregister_sockopt(struct nf_sockopt_ops *reg)
 {
-   /* No point being interruptible: we're probably in cleanup_module() */
- restart:
mutex_lock(_sockopt_mutex);
-   if (reg->use != 0) {
-   /* To be woken by nf_sockopt call... */
-   /* FIXME: Stuart Young's name appears gratuitously. */
-   set_current_state(TASK_UNINTERRUPTIBLE);
-   reg->cleanup_task = current;
-   mutex_unlock(_sockopt_mutex);
-   schedule();
-   goto restart;
-   }
list_del(>list);

Re: [PATCH] Fix a potential NULL pointer dereference in usbat_check_status() in drivers/usb/storage/shuttle_usbat.c

2007-09-04 Thread Simon Holm Thøgersen

On Tue, 04 09 2007 at 13:06 +0200, Jens Axboe wrote:
> On Tue, Sep 04 2007, Micah Gruber wrote:
> > This patch fixes a potential null dereference bug where we dereference us 
> > before a null check. This patch simply moves the dereferencing after the 
> > null check.
> > 
> > Signed-off-by: Micah Gruber <[EMAIL PROTECTED]>
> 
> Be careful with stuff like that, if you actually look at the code, a us
> == NULL doesn't seem to be possible (or usbat_flash_transport() would
> have oopsed before).
> 
If that is true, then
        if (!us)
                return USB_STOR_TRANSPORT_ERROR;
is utterly pointless.


Simon Holm Thøgersen


[PATCH 0/2] Fix (improve) deadlock condition on module removal netfilter socket option removal

2007-09-04 Thread Neil Horman

Hey all-
So I've had a deadlock reported to me.  I've found that the sequence of
events goes like this:

1) process A (modprobe) runs to remove ip_tables.ko

2) process B (iptables-restore) runs and calls setsockopt on a netfilter socket,
increasing the ip_tables socket_ops use count

3) process A acquires a file lock on the file ip_tables.ko, calls remove_module
in the kernel, which in turn executes the ip_tables module cleanup routine,
which calls nf_unregister_sockopt

4) nf_unregister_sockopt, seeing that the use count is non-zero, puts the
calling process into uninterruptible sleep, expecting the process using the
socket option code to wake it up when it exits the kernel

5) the user of the socket option code (process B), in do_ipt_get_ctl, calls
ipt_find_table_lock, which in this case calls request_module to load
ip_tables_nat.ko

6) request_module forks a copy of modprobe (process C) to load the module and
blocks until modprobe exits.

7) Process C, forked by request_module, processes the dependencies of
ip_tables_nat.ko, of which ip_tables.ko is one.

8) Process C attempts to lock the requested module and all its dependencies; it
blocks when it attempts to lock ip_tables.ko (which was previously locked in
step 3)

There's not really any great permanent solution to this that I can see, but I've
developed a two-part solution that corrects the problem.

Part 1) Modifies the nf_sockopt registration code so that, instead of using a
use counter internal to the nf_sockopt_ops structure, we instead use a pointer
to the registering module's owner to do module reference counting when nf_sockopt
calls a module's set/get routine.  This prevents the deadlock by preventing step 4
from happening.

Part 2) Enhances the modprobe utility so that by default it performs non-blocking
remove operations (the same way rmmod does), and adds an option to explicitly
request blocking operation.  So if you select blocking operation in modprobe you
can still cause the above deadlock, but only if you explicitly try (and since
root can do any old stupid thing it would like :) ).
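
The two parts can be modelled in a few lines of single-threaded userspace C.
This is only a sketch of the idea, not the patch itself: module_model,
model_try_get(), model_put(), model_nf_sockopt() and model_try_remove() are
invented stand-ins for struct module, try_module_get(), module_put(), the
nf_sockopt call path and a non-blocking module removal, and all locking is
ignored.

```c
#include <stdbool.h>

/* Stand-in for struct module: a refcount plus a "being removed" flag. */
struct module_model {
	int refcnt;
	bool going;
};

/* Models try_module_get(): no new references once removal has begun. */
static bool model_try_get(struct module_model *m)
{
	if (m->going)
		return false;
	m->refcnt++;
	return true;
}

/* Models module_put(). */
static void model_put(struct module_model *m)
{
	m->refcnt--;
}

struct sockopt_ops_model {
	struct module_model *owner;	/* replaces the private use counter */
	int (*get)(int optval);
};

/* Part 1: pin the owning module for the duration of the handler call. */
static int model_nf_sockopt(struct sockopt_ops_model *ops, int optval)
{
	int ret;

	if (!model_try_get(ops->owner))
		return -1;		/* module is on its way out */
	ret = ops->get(optval);
	model_put(ops->owner);
	return ret;
}

/* Part 2: non-blocking removal fails at once instead of sleeping on users. */
static bool model_try_remove(struct module_model *m)
{
	if (m->refcnt != 0)
		return false;		/* busy: caller may retry later */
	m->going = true;
	return true;
}

/* Trivial handler used to exercise the model. */
static int echo_get(int optval)
{
	return optval;
}
```

Because the remover never sleeps while holding the module file lock, the
circular wait between processes A, B and C cannot form in this model.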

The following 2 patches have been tested out by me.

Thanks & Regards
Neil

-- 
/***
 *Neil Horman
 *Software Engineer
 *Red Hat, Inc.
 [EMAIL PROTECTED]
 *gpg keyid: 1024D / 0x92A74FA1
 *http://pgp.mit.edu
 ***/

[PATCH] local_t protection (critical section)

2007-09-04 Thread Mathieu Desnoyers

local_t protection (critical section)

Adds local_enter(flags) and local_exit(flags) as primitives to surround critical
sections using local_t types.

On architectures providing fast atomic primitives, this turns into a preempt
disable/enable().
However, on architectures not providing such fast primitives, such as ia64, it
turns into a local irq disable/enable so that we can use *_local primitives that
are non atomic.

This is only the primary work here: made for testing ia64 with cmpxchg_local
(other local_* primitives still use atomic_long_t operations as fallback).
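
For readers who want to see the intended usage pattern in isolation, here is
a small userspace sketch. It assumes no-op stand-ins for local_enter() and
local_exit() (userspace cannot disable preemption or interrupts), copies the
non-atomic compare-and-exchange idea, and uses a hypothetical freelist_pop()
helper where the first word of each free object points to the next one.

```c
#include <stddef.h>

/* Stand-ins: the real primitives disable preemption or local interrupts. */
#define local_enter(flags)	do { (flags) = 0; } while (0)
#define local_exit(flags)	do { (void)(flags); } while (0)

/* Non-atomic compare-and-exchange; only valid inside the critical section. */
static inline void *cmpxchg_local(void **p, void *old_val, void *new_val)
{
	void *before = *p;

	if (before == old_val)
		*p = new_val;
	return before;
}

/* Pop the first object off a freelist; returns NULL when the list is empty. */
static void *freelist_pop(void **head)
{
	unsigned long flags;
	void *object;

	local_enter(flags);
	do {
		object = *head;
		if (!object)
			break;
		/* Retry if the head changed between the read and the exchange. */
	} while (cmpxchg_local(head, object, *(void **)object) != object);
	local_exit(flags);

	return object;
}
```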

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
CC: Christoph Lameter <[EMAIL PROTECTED]>
---
 include/asm-generic/local.h   |    3 +++
 include/asm-i386/local.h      |    3 +++
 include/asm-ia64/intrinsics.h |   14 ++++++++++++--
 3 files changed, 18 insertions(+), 2 deletions(-)

Index: linux-2.6-lttng/include/asm-generic/local.h
===
--- linux-2.6-lttng.orig/include/asm-generic/local.h 2007-09-04 15:32:02.0 -0400
+++ linux-2.6-lttng/include/asm-generic/local.h 2007-09-04 15:36:41.0 -0400
@@ -46,6 +46,9 @@ typedef struct
 #define local_add_unless(l, a, u) atomic_long_add_unless((&(l)->a), (a), (u))
 #define local_inc_not_zero(l) atomic_long_inc_not_zero(&(l)->a)
 
+#define local_enter(flags) local_irq_save(flags)
+#define local_exit(flags) local_irq_restore(flags)
+
 /* Non-atomic variants, ie. preemption disabled and won't be touched
  * in interrupt, etc.  Some archs can optimize this case well. */
 #define __local_inc(l) local_set((l), local_read(l) + 1)
Index: linux-2.6-lttng/include/asm-i386/local.h
===
--- linux-2.6-lttng.orig/include/asm-i386/local.h 2007-09-04 15:28:52.0 -0400
+++ linux-2.6-lttng/include/asm-i386/local.h 2007-09-04 15:31:54.0 -0400
@@ -194,6 +194,9 @@ static __inline__ long local_sub_return(
 })
 #define local_inc_not_zero(l) local_add_unless((l), 1, 0)
 
+#define local_enter(flags) preempt_disable()
+#define local_exit(flags) preempt_enable()
+
 /* On x86, these are no better than the atomic variants. */
 #define __local_inc(l) local_inc(l)
 #define __local_dec(l) local_dec(l)
Index: linux-2.6-lttng/include/asm-ia64/intrinsics.h
===
--- linux-2.6-lttng.orig/include/asm-ia64/intrinsics.h 2007-09-04 15:47:24.0 -0400
+++ linux-2.6-lttng/include/asm-ia64/intrinsics.h 2007-09-04 15:49:41.0 -0400
@@ -160,8 +160,18 @@ extern long ia64_cmpxchg_called_with_bad
 #define cmpxchg(ptr,o,n)   cmpxchg_acq(ptr,o,n)
 #define cmpxchg64(ptr,o,n) cmpxchg_acq(ptr,o,n)
 
-#define cmpxchg_local      cmpxchg
-#define cmpxchg64_local    cmpxchg64
+/* Must be executed between local_enter/local_exit. */
+static inline void *cmpxchg_local(void **p, void *old, void *new)
+{
+   unsigned long flags;
+   void *before;
+
+   before = *p;
+   if (likely(before == old))
+   *p = new;
+   return before;
+}
+#define cmpxchg64_local    cmpxchg_local
 
 #ifdef CONFIG_IA64_DEBUG_CMPXCHG
 # define CMPXCHG_BUGCHECK_DECL int _cmpxchg_bugcheck_count = 128;
-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

[PATCH] slub - Use local_t protection

2007-09-04 Thread Mathieu Desnoyers

slub - Use local_t protection

Use local_enter/local_exit for protection in the fast path.

Depends on the cmpxchg_local slub patch.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
CC: Christoph Lameter <[EMAIL PROTECTED]>
---
 mm/slub.c |   18 ++
 1 file changed, 10 insertions(+), 8 deletions(-)

Index: linux-2.6-lttng/mm/slub.c
===
--- linux-2.6-lttng.orig/mm/slub.c  2007-09-04 15:47:20.0 -0400
+++ linux-2.6-lttng/mm/slub.c   2007-09-04 15:52:07.0 -0400
@@ -1456,7 +1456,6 @@ static void *__slab_alloc(struct kmem_ca
unsigned long flags;
 
local_irq_save(flags);
-   put_cpu_no_resched();
if (!c->page)
/* Slab was flushed */
goto new_slab;
@@ -1480,7 +1479,6 @@ load_freelist:
 out:
slab_unlock(c->page);
local_irq_restore(flags);
-   preempt_check_resched();
if (unlikely((gfpflags & __GFP_ZERO)))
memset(object, 0, c->objsize);
return object;
@@ -1524,7 +1522,6 @@ new_slab:
goto load_freelist;
}
local_irq_restore(flags);
-   preempt_check_resched();
return NULL;
 debug:
object = c->page->freelist;
@@ -1552,8 +1549,10 @@ static void __always_inline *slab_alloc(
 {
void **object;
struct kmem_cache_cpu *c;
+   unsigned long flags;
 
-   c = get_cpu_slab(s, get_cpu());
+   local_enter(flags);
+   c = get_cpu_slab(s, smp_processor_id());
 redo:
object = c->freelist;
if (unlikely(!object))
@@ -1566,12 +1565,13 @@ redo:
object[c->offset]) != object))
goto redo;
 
-   put_cpu();
+   local_exit(flags);
if (unlikely((gfpflags & __GFP_ZERO)))
memset(object, 0, c->objsize);
 
return object;
 slow:
+   local_exit(flags);
return __slab_alloc(s, gfpflags, node, addr, c);
 
 }
@@ -1605,7 +1605,6 @@ static void __slab_free(struct kmem_cach
void **object = (void *)x;
unsigned long flags;
 
-   put_cpu();
local_irq_save(flags);
slab_lock(page);
 
@@ -1670,10 +1669,12 @@ static void __always_inline slab_free(st
void **object = (void *)x;
void **freelist;
struct kmem_cache_cpu *c;
+   unsigned long flags;
 
debug_check_no_locks_freed(object, s->objsize);
 
-   c = get_cpu_slab(s, get_cpu());
+   local_enter(flags);
+   c = get_cpu_slab(s, smp_processor_id());
if (unlikely(c->node < 0))
goto slow;
 redo:
@@ -1687,9 +1688,10 @@ redo:
!= freelist))
goto redo;
 
-   put_cpu();
+   local_exit(flags);
return;
 slow:
+   local_exit(flags);
__slab_free(s, page, x, addr, c->offset);
 }

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

Re: [PATCH] SLUB use cmpxchg_local

2007-09-04 Thread Mathieu Desnoyers

* Christoph Lameter ([EMAIL PROTECTED]) wrote:
> Measurements on IA64 slub w/per cpu vs slub w/per cpu/cmpxchg_local 
> emulation. Results are not good:
> 

Hi Christoph,

I tried to come up with a patch set implementing the basics of a new
critical section: local_enter(flags) and local_exit(flags).

Can you try those on ia64 and tell me if the results are better ?

See the 2 next posts...

Mathieu
-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

Re: Fwd: That whole "Linux stealing our code" thing

2007-09-04 Thread Michael Poole

Chris Friesen writes:

> Daniel Hazelton wrote:
>> On Tuesday 04 September 2007 09:27:02 Krzysztof Halasa wrote:
>>
>>>Daniel Hazelton <[EMAIL PROTECTED]> writes:
>>>
>>>> US Copyright law. A copyright holder, regardless of what license he/she
>>>> may have released the work under, can still revoke the license for a
>>>> specific person or group of people. (There are some exceptions, but they
>>>> do not apply to the situation that is being discussed)
>
> The OpenBSD policy page doesn't agree with you:
>
> "...That means that having granted a permission, the copyright holder
> can not retroactively say that an individual or class of individuals
> are no longer granted those permissions. Likewise should the copyright
> holder decide to "go commercial" he can not revoke permissions already
> granted for the use of the work as distributed, though he may impose
> more restrictive permissions in his future distributions of that work."
>
> http://www.openbsd.org/policy.html

By my reading, this is supported by 17 USC 203(a)(3):

  (3) Termination of the grant may be effected at any time during a
  period of five years beginning at the end of thirty-five years
  from the date of execution of the grant; or, if the grant covers
  the right of publication of the work, the period begins at the
  end of thirty-five years from the date of publication of the
  work under the grant or at the end of forty years from the date
  of execution of the grant, whichever term ends earlier.

(from http://www.law.cornell.edu/uscode/html/uscode17/usc_sec_17_0203000-.html )

I would be interested to see what other legal basis is alleged as
grounds to rescind a license.

Michael

Re: Fwd: That whole "Linux stealing our code" thing

2007-09-04 Thread Chris Friesen


Daniel Hazelton wrote:
> On Tuesday 04 September 2007 09:27:02 Krzysztof Halasa wrote:
>
>> Daniel Hazelton <[EMAIL PROTECTED]> writes:
>>
>>> US Copyright law. A copyright holder, regardless of what license he/she
>>> may have released the work under, can still revoke the license for a
>>> specific person or group of people. (There are some exceptions, but they
>>> do not apply to the situation that is being discussed)

The OpenBSD policy page doesn't agree with you:

"...That means that having granted a permission, the copyright holder
can not retroactively say that an individual or class of individuals are
no longer granted those permissions. Likewise should the copyright
holder decide to "go commercial" he can not revoke permissions already
granted for the use of the work as distributed, though he may impose
more restrictive permissions in his future distributions of that work."

http://www.openbsd.org/policy.html

Chris

Re: huge improvement with per-device dirty throttling

2007-09-04 Thread Martin Knoblauch


--- Leroy van Logchem <[EMAIL PROTECTED]> wrote:

> Andrea Arcangeli wrote:
> > On Wed, Aug 22, 2007 at 01:05:13PM +0200, Andi Kleen wrote:
> >> Ok perhaps the new adaptive dirty limits helps your single disk
> >> a lot too. But your improvements seem to be more "collateral damage" @)
> >>
> >> But if that was true it might be enough to just change the dirty limits
> >> to get the same effect on your system. You might want to play with
> >> /proc/sys/vm/dirty_*
> > 
> > The adaptive dirty limit is per task so it can't be reproduced with
> > global sysctl. It made quite some difference when I researched into it
> > in function of time. This isn't in function of time but it certainly
> > makes a lot of difference too, actually it's the most important part
> > of the patchset for most people, the rest is for the corner cases that
> > aren't handled right currently (writing to a slow device with
> > writeback cache has always been hanging the whole thing).
> 
> 
> Self-tuning > static sysctl's. The last years we needed to use very
> small values for dirty_ratio and dirty_background_ratio to soften the
> latency problems we have during sustained writes. Imo these patches
> really help in many cases, please commit to mainline.
> 
> -- 
> Leroy
> 

 while it helps in some situations, I did some tests today with
2.6.22.6+bdi-v9 (Peter was so kind) which seem to indicate that it
hurts NFS writes. Anyone seen similar effects?

 Otherwise I would just second your request. It definitely helps the
problematic performance of my CCISS based RAID5 volume.

Martin

--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www:   http://www.knobisoft.de

Re: [-mm PATCH] Memory controller improve user interface (v3)

2007-09-04 Thread Dave Hansen

On Sun, 2007-09-02 at 16:20 +0530, Balbir Singh wrote:
> 
> +Setting a limit to a number that is not a multiple of page size causes
> +rounding up of the value. The user must check back to see (by reading
> +memory.limit_in_bytes), to check for differences between desired values and
> +committed values. Currently, all accounting is done in multiples of PAGE_SIZE

I wonder if we can say this in a bit more generic fashion.

A successful write to this file does not guarantee a successful
set of this limit to the value written into the file.  This can
be due to a number of factors, such as rounding up to page
boundaries or the total availability of memory on the system.
The user is required to re-read this file after a write to
guarantee the value committed by the kernel.

This keeps a user from saying "I page aligned the value I stuck in
there, so I don't have to check it."
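
From the user's side, the write-then-re-read rule could look something like
the sketch below. The helper name set_limit_checked() is invented for the
example and the path is whatever control file the caller passes in; nothing
here is specific to the memory controller beyond the idea of reading the
value back.

```c
#include <stdio.h>

/*
 * Write the requested limit to a control file, then re-read the file to
 * learn what value was actually committed.
 * Returns 0 on success, -1 on any I/O or parse error.
 */
static int set_limit_checked(const char *path, unsigned long long want,
			     unsigned long long *committed)
{
	FILE *f = fopen(path, "w");

	if (!f)
		return -1;
	if (fprintf(f, "%llu\n", want) < 0) {
		fclose(f);
		return -1;
	}
	if (fclose(f) != 0)		/* the write may only fail on flush */
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;
	if (fscanf(f, "%llu", committed) != 1) {
		fclose(f);
		return -1;
	}
	fclose(f);
	return 0;
}
```

The caller then compares *committed with the value it asked for and decides
whether a rounded or clamped limit is acceptable.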

-- Dave


Re: [PATCH 1/1] pata_it821x: fix lost interrupt with atapi devices

2007-09-04 Thread Mikael Pettersson

Jeff Norden writes:
 >  From: Jeff Norden <[EMAIL PROTECTED]>
 > 
 > Fix "lost" interrupt problem when using dma with CD/DVD drives in some
 > configurations.  This problem can make installing linux from media
 > impossible for distro's that have switched to libata-only configurations.
 > 
 > The simple fix is to eliminate the use of dma for reading drive status, etc,
 > by checking the number of bytes to be transferred.
 > 
 > This change will only affect the behavior of atapi devices, not disks.
 > There is more info at http://bugzilla.redhat.com/show_bug.cgi?id=242229
 > This patch is for 2.6.22.1
 > 
 > Signed-off-by: Jeff Norden <[EMAIL PROTECTED]>
 > 
 > ---
 > 
 > --- pata_it821x.c.orig   2007-08-16 14:20:49.0 -0500
 > +++ pata_it821x.c2007-08-31 16:09:22.0 -0500
 > @@ -533,6 +533,10 @@ static int it821x_check_atapi_dma(struct
 >  struct ata_port *ap = qc->ap;
 >  struct it821x_dev *itdev = ap->private_data;
 >  
 > +/* Only use dma for transfers to/from the media. */
 > +if (qc->nbytes < 2048)
 > +return -EOPNOTSUPP;
 > +
 >  /* No ATAPI DMA in smart mode */
 >  if (itdev->smart)
 >  return -EOPNOTSUPP;
 > 

This looks like a gross hack. Aren't you supposed to inspect
the command instead and whitelist the ones you know are OK,
like pata_pdc2027x.c and sata_promise.c do?
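
A command whitelist of the kind meant here might look roughly like the
following. This is a standalone sketch, not code from either driver: the
function name atapi_cmd_may_use_dma() is invented, and the set of opcodes is
an illustrative guess at "bulk media transfers" rather than the exact list
those drivers use. The opcode values themselves are the standard SCSI/MMC
ones.

```c
#include <stdbool.h>

/* Standard SCSI/MMC opcodes for bulk transfers to or from the media. */
enum {
	OP_READ_10  = 0x28,
	OP_WRITE_10 = 0x2a,
	OP_READ_12  = 0xa8,
	OP_WRITE_12 = 0xaa,
	OP_READ_CD  = 0xbe,
};

/*
 * Decide from the command itself, not from the byte count, whether
 * DMA should be attempted for an ATAPI packet command.
 */
static bool atapi_cmd_may_use_dma(const unsigned char *cdb)
{
	switch (cdb[0]) {
	case OP_READ_10:
	case OP_WRITE_10:
	case OP_READ_12:
	case OP_WRITE_12:
	case OP_READ_CD:
		return true;	/* known-good media transfers */
	default:
		return false;	/* status, mode pages, etc.: use PIO */
	}
}
```

A check_atapi_dma hook would then refuse DMA whenever this returns false,
which avoids depending on a magic 2048-byte threshold.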

Re: [PATCH] Fix out-by-one error in traps.c

2007-09-04 Thread Rusty Russell

On Fri, 2007-08-31 at 11:24 -0700, Linus Torvalds wrote:
> 
> On Sat, 1 Sep 2007, Rusty Russell wrote:
> > 
> > This is only for the initial booting stack (init_thread_union); see
> > arch/i386/kernel/head.S:
> > /* Set up the stack pointer */
> > lss stack_start,%esp
> > ...
> > pushl $0 # fake return address for unwinder
> 
> Ok, we should fix that. We should just make it look like all other stack 
> frames.
> 
> There is other code in the kernel that "knows" that all kernel stacks have 
> the fields for the user stack return on it, namely the ptrace code etc. 
> Now, the initial stack is hopefully never *accessed* by that kind of code, 
> but this kind of special-case code is just wrong.

Yes, but -ETIMEDOUT.  Maybe for 2.6.24...

> IOW, how 
> about this one, which just declares a structure that describes the stack 
> frame thing? That just makes everything clearer, since we can then use 
> "sizeof(that structure)" instead of using the magic "2*sizeof(unsigned 
> long)".

Much nicer, thanks.

Rusty.


Re: 2.6.23-rc4-mm1

2007-09-04 Thread Zach Carter


> +ioc3-program-uart-predividers.patch
> +sky2-fe-chip-support.patch
> +sky2-use-debugfs-rename.patch
> +sky2-document-gphy_ctrl-bits.patch
> +sky2-dont-restrict-config-space-access.patch
> +sky2-advanced-error-reporting.patch
> +sky2-use-pci_config-access-functions.patch
> +sky2-use-net_device-internal-stats.patch
> +ktime_sub_ns-analog-of-ktime_add_ns.patch
> +export-reciprocal_value-for-modules.patch
> +sky2-hardware-receive-timestamp-counter.patch
> +sky2-avoid-divide-in-receive-path.patch
> +sky2-118.patch

Folks,

I've got these messages since installing 2.6.23-rc4-mm1:

sky2 :07:00.0: error interrupt status=0x8000
printk: 4 messages suppressed.
sky2 :07:00.0: error interrupt status=0x8000
printk: 4 messages suppressed.
sky2 :07:00.0: error interrupt status=0x8000
printk: 4 messages suppressed.
sky2 :07:00.0: error interrupt status=0x8000
printk: 4 messages suppressed.
sky2 :07:00.0: error interrupt status=0x8000
printk: 5 messages suppressed.
sky2 :07:00.0: error interrupt status=0x8000
printk: 5 messages suppressed.
sky2 :07:00.0: error interrupt status=0x8000
printk: 4 messages suppressed.
sky2 :07:00.0: error interrupt status=0x8000
printk: 4 messages suppressed.
sky2 :07:00.0: error interrupt status=0x8000
printk: 4 messages suppressed.
sky2 :07:00.0: error interrupt status=0x8000

The laptop is a Sony VAIO SZ430N/B

Despite the errors, the interface appears to be working well enough.

I'd be happy to supply additional information, try out patches, or 
submit a bugzilla if needed.

thanks!

-Zach

Re: 2.6.23-rc4-mm1 net bitops compile error

2007-09-04 Thread Jiri Slaby

Adrian Bunk wrote:
> defconfig fails with the following error on parisc:
> 
> <--  snip  -->
> 
> ...
>   CC  net/core/gen_estimator.o
> In file included from include2/asm/bitops.h:111,
>  from 
> /home/bunk/linux/kernel-2.6/linux-2.6.23-rc4-mm1/net/core/gen_estimator.c:18:
> /home/bunk/linux/kernel-2.6/linux-2.6.23-rc4-mm1/include/asm-generic/bitops/non-atomic.h:
>  
> In function '__set_bit':
> /home/bunk/linux/kernel-2.6/linux-2.6.23-rc4-mm1/include/asm-generic/bitops/non-atomic.h:17:
>  
> error: implicit declaration of function 'BIT_MASK'
> /home/bunk/linux/kernel-2.6/linux-2.6.23-rc4-mm1/include/asm-generic/bitops/non-atomic.h:18:
>  
> error: implicit declaration of function 'BIT_WORD'
> make[3]: *** [net/core/gen_estimator.o] Error 1
> 
> <--  snip  -->
> 
> Either #include  must become forbidden and #error or the 
> move of the #define's to include/linux/bitops.h reverted.

Just to let you know that I'm working on the former.
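
For readers following along, here is a minimal user-space sketch of what the two undeclared macros are used for. It assumes the usual definitions (bit nr lives in word nr / BITS_PER_LONG, at position nr % BITS_PER_LONG); sketch_set_bit() and sketch_test_bit() are hypothetical stand-ins written for illustration, not the kernel's __set_bit().

```c
#include <assert.h>
#include <limits.h>

/* Assumed to mirror the kernel definitions; BITS_PER_LONG is derived
 * from the host compiler here rather than from kernel configuration. */
#define BITS_PER_LONG ((int)(sizeof(unsigned long) * CHAR_BIT))
#define BIT_MASK(nr)  (1UL << ((nr) % BITS_PER_LONG))
#define BIT_WORD(nr)  ((nr) / BITS_PER_LONG)

/* Hypothetical non-atomic helper in the style of __set_bit(). */
static inline void sketch_set_bit(int nr, unsigned long *addr)
{
	assert(nr >= 0);
	addr[BIT_WORD(nr)] |= BIT_MASK(nr);
}

/* Returns nonzero if bit nr is set in the bitmap at addr. */
static inline int sketch_test_bit(int nr, const unsigned long *addr)
{
	assert(nr >= 0);
	return (addr[BIT_WORD(nr)] & BIT_MASK(nr)) != 0;
}
```

Any header that uses these macros only compiles if the file defining them has been included first, which is why the include order matters in the error above.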

thanks,
-- 
http://www.fi.muni.cz/~xslaby/            Jiri Slaby
faculty of informatics, masaryk university, brno, cz

Re: [PATCH] Make rcutorture RNG use temporal entropy

2007-09-04 Thread Paul E. McKenney

On Tue, Sep 04, 2007 at 09:14:19AM -0700, Paul E. McKenney wrote:
> On Tue, Sep 04, 2007 at 11:16:50AM +0530, Satyam Sharma wrote:
> > Hi Paul,
> > 
> > On Wed, 15 Aug 2007, Paul E. McKenney wrote:
> > > 
> > > The locking used by get_random_bytes() can conflict with the
> > > preempt_disable() and synchronize_sched() form of RCU.  This patch changes
> > > rcutorture's RNG to gather entropy from the new cpu_clock() interface
> > > (relying on interrupts, preemption, daemons, and rcutorture's reader
> > > thread's rock-bottom scheduling priority to provide useful entropy),
> > > and also adds an EXPORT_SYMBOL_GPL() to make that interface available
> > > to GPLed kernel modules such as rcutorture.
> > 
> > Honestly, rcutorture goes to some amazing lengths just to have this
> > randomizing-the-delays-that-read/write-test-threads-spend-inside-or-
> > outside-the-critical-sections thing :-) Especially, seeing that
> > synchro-test, the other "comparable" module, just doesn't bother with
> > all this at all. (especially check out its load == interval == do_sched
> > == 0 case! :-)
> 
> Yep.  The need for that level of randomization in rcutorture has been made
> painfully clear to me over a period of more than a decade.  Of course,
> the overhead of the re-seeding does get diluted by a factor of 10,000 or
> 100,000, depending on what version you are using.  So, from a throughput
> standpoint, the overhead is essentially that of a linear congruential
> random-number generator.  This is critically important given the low
> overhead of rcu_read_lock() and rcu_read_unlock().
> 
> Still, this is indeed not what you want on a fastpath of a realtime
> system, where average performance means nothing -- only the worst case
> counts.  And this is why I am -not- putting the rcutorture RNG forward
> for general-purpose use.  So we are at least in agreement on that piece!
> 
> And, as you hint below, anyone running rcutorture while also running
> a production realtime workload needs to seriously rethink their design.  ;-)
> (If you are instead running it to provide a test load for your realtime
> testing, fine and good.)
> 
> > So IMHO, considering that rcutorture isn't a "serious" user of randomness
> > in the first place (of even a "fast-and-loose version" for that matter),
> > you could consider a solution where you gather all the randomness you need
> > at module_init time itself and save it somewhere, and then use it wherever
> > you're calling into rcu_random()->cpu_clock() [ or get_random_bytes() ]
> > in the current code. You could even make some trivial updates to those
> > random numbers after every RCU_RANDOM_REFRESH uses, like present.
> 
> Well, assuming that the Linux kernel really needs a central implementation
> of a "pretty fast" and "pretty good" RNG, one could imagine all sorts of
> designs:
> 
> 1.  Use an LCRNG feeding into an array, as the old Berkeley random()
>     does (or see Knuth for an earlier citation), but make it per-CPU.
>     When pulling out randomness, do an MDn hash on the array
>     along with a per-task counter and the per-CPU preempt counter.
>     Increment the per-task counter on each use.  Do an LCRNG step
>     on each use.  Since this is a fixed array, the collisions in
>     CONFIG_PREEMPT due to preemption can be permitted to happen
>     without penalty.
> 
>     This approach avoids all locking, all interrupt disabling, and
>     all preemption disabling.  But the MD hashes aren't the fastest
>     things in the kernel, from what I understand.
> 
>     Question: will this be fast enough?  If so, which of the MD
>     hashes should be used?
> 
> 2.  As in #1 above, but use some simpler hash, such as addition or
>     XOR.  Maybe CRC.  (Benchmark for speed.)
> 
> 3.  Just use a simple LCRNG with per-task state.  Perturb from some
>     statistical counter (the per-CPU RCU grace-period counter might
>     be appropriate).  Or don't even bother doing that.
> 
>     This would be -much- faster than any of the above, and would
>     be deterministic, hence good for realtime use.  LCRNG might not
>     satisfy more-demanding users, especially the paranoid ones.
> 
>     (This is what you are proposing above, correct?)
> 
> 4.  Just use LCRNG into an array like Berkeley random(), but replicate
>     on a per-CPU basis.  Maybe or maybe not perturb occasionally
>     from some statistical counter as in #3 above.
> 
>     This would be reasonably fast, and should satisfy most users.
>     People needing cryptographically secure RNGs should of course
>     stick with get_random_bytes().
> 
>     [If I had some blazing reason to implement this -right- -now-,
>     this would be the approach I would take.]
> 
> 5.  Stick with the current situation where people needing fast
>     and dirty RNGs roll their own.
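
Option 3 in the list above (a simple LCRNG with private state, occasionally perturbed from some counter) can be sketched in user-space C roughly as follows. This is an illustration only: the struct, the function name, the refresh interval and the choice of the well-known Numerical Recipes LCG constants are all assumptions made for the sketch, and none of it is taken from rcutorture.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative constants, not taken from any kernel source. */
#define SKETCH_MULT    1664525u     /* LCG multiplier (Numerical Recipes) */
#define SKETCH_ADD     1013904223u  /* LCG increment */
#define SKETCH_REFRESH 10000        /* steps between perturbations */

struct sketch_rng {
	uint32_t state;
	int      count;  /* steps left before the next perturbation */
};

/* One LCG step, x' = x * MULT + ADD (mod 2^32).  Once the step budget
 * is used up, the state is first perturbed with a caller-supplied value
 * (for example some statistical counter) and the budget is reset. */
uint32_t sketch_rng_next(struct sketch_rng *r, uint32_t perturb)
{
	assert(r != NULL);
	if (--r->count < 0) {
		r->state += perturb;
		r->count = SKETCH_REFRESH;
	}
	r->state = r->state * SKETCH_MULT + SKETCH_ADD;
	/* The low bits of a power-of-two LCG have short periods, so swap
	 * the 16-bit halves to move the better-mixed high bits down. */
	return (r->state >> 16) | (r->state << 16);
}
```

Because the perturbation happens only once per SKETCH_REFRESH steps, its cost is amortized and the common path is a multiply, an add and a rotate, with no locking.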

Or, better yet, as suggested by Rusty:

6.  Use random32() from lib/random32.c and be happy.  This does

Re: Race condition: calling remove_proc_entry in cleanup_module (module_exit) while someone's using procfile

2007-09-04 Thread anon... anon.al

On 9/4/07, Alexey Dobriyan <[EMAIL PROTECTED]> wrote:
> For regular proc files, this is fixed in 2.6.23-rc1 and later.

Thanks!

I see you've been working on it:
fix-rmmod-read-write-races-in-proc-entries...
http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.22-rc6/2.6.22-rc6-mm1/broken-out/

((
older relevant posts:
http://groups.google.com/group/linux.kernel/browse_thread/thread/4c74bbea17727e6e/809c15bd0e6fa8f9?
http://groups.google.com/group/fa.linux.kernel/browse_thread/thread/fb03e1a500fcb258/563a7f11acdec992?
))

Thanks, Albert

Re: GPL weasels and the atheros stink

2007-09-04 Thread Daniel Hazelton

On Tuesday 04 September 2007 11:10:52 [EMAIL PROTECTED] wrote:
> On Mon, 03 Sep 2007 17:23:37 PDT, David Schwartz said:
> > > Wrong - I said "You can't complain about Person A doing X when
> > > you let Person
> > > B do X without complaint".
> >
> > Yes, I can. There is no inconsistency between acting in one case and
> > failing to act in another. We need not act in every possible case where
> > we could act to preserve our right to act in a particular case.
>
> You *do* however need to be aware that in some cases, public inactivity
> and/or statements that something will not be acted on may estop any
> future attempts to enforce something.

Exactly. However, inactivity can be construed as lack of knowledge that 
something is occurring. (ie: the RIAA didn't start acting on file-sharing 
until they became aware of Napster)

But for said estoppel to not be a factor you will have to prosecute and/or 
file suit on all instances of the activity. (And filing a suit and then 
dropping it against some of the people that you filed suit against - ie: you 
file suit against persons A and B and then drop the suit against person A in 
its entirety - because you only wanted to prosecute person B for the action, 
you are leaving yourself open to lawsuits on a number of counts.)

DRH

-- 
Dialup is like pissing through a pipette. Slow and excruciatingly painful.

RE: [PATCH 2.6.23-rc4] irq: irq and pci_ids patch for Intel Tolapai

2007-09-04 Thread Gaston, Jason D

>> Please do submit new PCI device IDs to pciids.sf.net project.
>
>Yep.


FYI:  I have already posted the Tolapai DID's and device strings to
pciids.sf.net.

Thanks,

Jason

origin of __tmp1930643048 network device name: kernel-space or user-space

2007-09-04 Thread davide rossetti

dear all,
I'm trying to track down a problem on a Sun V40Z server with 4 network
devices grabbing random ethX device names. now, trying to force the
device names to what I want, I got a __tmpX form of device name,
which I think is a half-configured device... but which piece of
software is to blame ??? kernel, udev, hotplug

it is a Fedora Core 6, fully updated (kernel-2.6.22.2-42.fc6
udev-095-17.fc6.x86_64)

ifconfig reports it as:
__tmp1930643048 Link encap:Ethernet  HWaddr 00:04:23:CA:BC:CB
  BROADCAST MULTICAST  MTU:1500  Metric:1
  RX packets:0 errors:0 dropped:0 overruns:0 frame:0
  TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:1000
  RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
  Base address:0x3040 Memory:e70a-e70c

bond0 Link encap:Ethernet  HWaddr 00:04:23:CA:BC:CA
  inet addr:10.0.0.139  Bcast:10.0.255.255  Mask:255.255.0.0
  inet6 addr: fe80::204:23ff:feca:bcca/64 Scope:Link
  UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
  RX packets:7409 errors:0 dropped:0 overruns:0 frame:0
  TX packets:3855 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:0
  RX bytes:1300152 (1.2 MiB)  TX bytes:509271 (497.3 KiB)

eth1  Link encap:Ethernet  HWaddr 00:04:23:CA:BC:CA
  inet6 addr: fe80::204:23ff:feca:bcca/64 Scope:Link
  UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
  RX packets:7409 errors:0 dropped:0 overruns:0 frame:0
  TX packets:3855 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:1000
  RX bytes:1300152 (1.2 MiB)  TX bytes:509271 (497.3 KiB)
  Base address:0x3000 Memory:e708-e70a


help appreciated... so much :(

davide

--
[EMAIL PROTECTED] ICQ:290677265 SKYPE:d.rossetti



Re: Fwd: That whole "Linux stealing our code" thing

2007-09-04 Thread Daniel Hazelton

On Tuesday 04 September 2007 09:27:02 Krzysztof Halasa wrote:
> Daniel Hazelton <[EMAIL PROTECTED]> writes:
> > US Copyright law. A copyright holder, regardless of what license he/she
> > may have released the work under, can still revoke the license for a
> > specific person or group of people. (There are some exceptions, but they
> > do not apply to the situation that is being discussed)
>
> Oh come on, I thought some small country in maybe central Africa,
> but certainly not USA.

US Law is a twisted maze - you wouldn't believe the contradictions that exist 
between different sections of the US Federal Code. (And it's worse as you move 
down to the State and the Local levels)

> What you write would essentially mean GPL (and any other such licence)
> is invalid in the USA.

Nope. The GPL is an explicit grant of rights and is fully legal and active as 
it stands.

> The licence is basically a promise not to sue. It wouldn't make any
> sense to promise if you could revoke at will.

If I were to revoke the license on something I held copyright to, I'd be forced 
to make an attempt to contact everyone that may have received a copy of the 
work under that license before I could ever begin filing lawsuits. This 
process will take at least a month - more if the various localities where 
someone might be living have laws about what constitutes an attempt to 
contact. (For instance, here in Pennsylvania an attempt to contact is taking 
out large format classified ads in every newspaper in the area where the 
person is known to reside - or statewide if the region is not known. The ads 
have to run for a minimum of one week)

This means that it'd take no less than five weeks - and might take as much as 
six months - before I could begin filing lawsuits. (And even then I'd have to 
have proof that the person in question was violating my copyright at the time 
the lawsuit was filed)

> > Ah, see - in the US the license(s) in question (and licenses in general)
> > are grants of rights, not a "statements of will".
>
> Right here grants of rights are some sort of statements of will.

Difference in terminology ?

A "Grant of Rights" is where you say 'Normally only I could do this, but I am 
giving you the legal right to do it as well'. A "statement of will" is 'This 
is what I want to have happen, in perpetuity'. In the US, a "statement of 
will" can include or imply a "Grant of Rights" and vice-versa, but they are 
separate entities.

> > (Truthfully, in the US a license
> > should be read with an implicit "All rights reserved")
>
> Actually (and I think it's the same in the USA), a copyrighted work
> has an implicit "all rights reserved". A licence is just exception.

And? The fact remains that "All Rights Reserved" means "I am reserving all 
rights I do not specifically grant or waive". ie: If a license doesn't 
state 'The licenser hereby waives the right to revoke this license at any 
time' then that right hasn't been lost. (A license acquired through a 
purchase - as might apply to a novel - is a lot different. And contracts are 
a different beast entirely)

DRH

-- 
Dialup is like pissing through a pipette. Slow and excruciatingly painful.

Re: kernel BUG at mm/slab.c:2980 (was Re: [] xfs_bmap_search_multi_extents+0x6f/0xe0)

2007-09-04 Thread Christoph Lameter

On Tue, 4 Sep 2007, Marco Berizzi wrote:

> After a week uptime I got this error. I hope it
> will be useful for you.

Yes indeed, but this is a different type of failure. Looks like a higher-order 
allocation failure in the networking code. Someone created objects that
required an order-2 allocation, which failed.

> SLUB: Genslabs=22, HWalign=32, Order=0-1, MinObjects=4, CPUs=1, Nodes=1

Slub is restricted to order 0 and 1 allocs.

>   PREFETCH window: e5c0-e7cf
> NET: Registered protocol family 2
> IP route cache hash table entries: 2048 (order: 1, 8192 bytes)
> TCP established hash table entries: 8192 (order: 4, 65536 bytes)
> TCP bind hash table entries: 8192 (order: 3, 32768 bytes)

H.

> eth2:  setting full-duplex.
> swapper: page allocation failure. order:2, mode:0x4020
>  [] __alloc_pages+0x1ed/0x2e0
>  [] allocate_slab+0x4b/0x90
>  [] new_slab+0x32/0x150
>  [] __slab_alloc+0xcb/0x120
>  [] __slab_alloc+0x79/0x120
>  [] tcp_collapse+0x113/0x3b0

tcp_collapse? This is due to a network configuration that required an
order 2 kmalloc block. Jumbo frames?
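
For reference, an order-n allocation is a block of 2^n physically contiguous pages, so an order-2 request needs four free pages in a row. The small user-space sketch below computes the order for a given size. It assumes 4 KiB pages, which is consistent with the hash-table lines quoted above (order 1 = 8192 bytes, order 4 = 65536 bytes); sketch_order_for() is a hypothetical helper written for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumption: 4 KiB pages */

/* Smallest order such that (SKETCH_PAGE_SIZE << order) >= size. */
unsigned int sketch_order_for(size_t size)
{
	unsigned int order = 0;
	size_t span = SKETCH_PAGE_SIZE;

	assert(size > 0);
	while (span < size) {
		span <<= 1;
		order++;
	}
	return order;
}
```

Under this assumption anything larger than 8192 bytes and up to 16384 bytes lands in order 2, which is the size class that failed here.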


Re: Kernel panic with 2.6.23-rc5

2007-09-04 Thread Paulo Marques


Tilman Schmidt wrote:
> Paulo Marques wrote:
>> I just tried booting a brand new 2.6.23-rc5 and after a few minutes it
>> just panicked: machine totally frozen, blinking keyboard leds.
>>
>> [...]
>> Maybe someone out there has a good suggestion that I could try before
>> bisecting...
>
> A probable candidate would be:
>
> http://lkml.org/lkml/2007/9/2/219

I've been running with that patch applied for a few hours now and 
everything seems to be working fine. Without the patch the kernel would 
hang in a few minutes, so I guess this fixed it.


Thanks for the help,

--
Paulo Marques - www.grupopie.com

"The Computer made me do it."

Re: Fwd: That whole "Linux stealing our code" thing

2007-09-04 Thread Daniel Hazelton

On Tuesday 04 September 2007 04:50:34 James Bruce wrote:
> Daniel Hazelton wrote:
> > On Monday 03 September 2007 14:26:29 Krzysztof Halasa wrote:
> >> Daniel Hazelton <[EMAIL PROTECTED]> writes:
> >>> The fact
> >>> remains that the person making a work available under *ANY* form of
> >>> copyright
> >>> license has the right to revoke said grant of license to anyone.
> >>
> >> Not after the licence has been given and accepted (and there might be
> >> restrictions), unless of course the licence contained such reservation.
> >
> > I hate to belabor the point, but you seem to be making the mistake of
> > "The license applies to the copyright holder" that I've seen a lot of
> > people make (and kept quiet about).
>
> I believe you are making the mistake that the license on code has
> anything to do with what the author chooses to do in the future.
> Releasing something as BSD does not force the author to do anything in
> the future with his code, and he/she could add and relicence as he/she
> feels fit.  HOWEVER, that particular code has already been released as
> BSD, and the author no longer has control over that release.

I may be mistaken, but it has always been my understanding that, unless you 
specifically waive your rights, they are automatically retained. (Under the 
law in the US, at least).

Hence, a copyright holder can do such, where the license has not been acquired 
by money changing hands.

(And actually, my above statement isn't rendered false by your rebuttal - it 
still appears that the person I replied to believes that a copyright license 
applies to the person holding the copyright in the same manner it applies to 
the person receiving the item under said license. Though I will admit it if I 
am wrong - publicly)

> > The person holding the copyright has all the legal standing to revoke a
> > license grant at any time. Licenses such as the GPL are not signed
> > contracts, and that means there are limits to what effect they can have
> > on the copyright holder.
>
> I believe you are confusing the fact that an author can decide to
> release code under another license, with the existence of code under
> that earlier license.  The license grant comes from THE CODE (which
> bears a license), not THE AUTHOR.  I can use GPL code I get in the mail
> because the license on the work says I can do so, not because I
> contacted the author and got a specific grant.  If such a grant were
> only verbal, your theory might hold, but that doesn't apply to any OSS
> software under discussion here.

The license is a direct grant from the author. If the author so wished, he/she 
could pull the license - either entirely or in part. About the only caveat is 
that the author would have to publish and attempt to contact everyone who may 
have acquired the item under that license to inform them of such a change - 
this does make it difficult, hell, makes it nearly impossible, but it can be 
done. (IANAL, but this does appear to be what the law says)

> If your legal theory were true, I could sell you a book and then later
> demand that you destroy it.  I could also release something as public
> domain, and then later rescind that (I still hold the copyright on what
> I produced), and charge money from anyone who used it.  I think its safe
> to say that this does not happen in practice.  Please provide some
> examples to the contrary or caselaw if you want to convince me otherwise.

Actually, no. A purchase does automatically grant the rights inherent in 
ownership - but that is a *PURCHASE*. Mere transfer of an item with no 
exchange of money cannot convey those rights. As far as the 'public domain' 
argument goes... That smells of a straw-man and is as different from a grant 
of license as it is from a purchase. When you release something into the 
public domain you are waiving *ALL* of your rights as copyright holder. 
(Which, I am told, cannot be done in Germany and some other countries)

> Furthermore, BSD/GPL software could not really exist under your legal
> theory; A programmer who wrote 30 year old core BSD code could wake up
> tomorrow and decide to require all BSD derivatives to remove his code or
> pay him for it (and the next day he could change the price again).  Open
> source software would not exist if such a liability were true, and
> companies like Sun could not be built up off of derivatives of it.
> Linux 0.01 is still available under a pre-GPL license if you can find a
> copy, and neither Linus (nor anyone else) can change that.

He could, but AFAICT, thirty years ago BSD was still run entirely by UC 
Berkeley and any copyrights that might be held are held entirely by UC Berkeley 
and not the individuals that contributed to such. (What's more, a 30 year old 
version of BSD doesn't meet the requirements of the AT&T agreement, so it's 
only legal in-so-far as it massively predates that agreement (and the lawsuit 
which spawned it) :)

And yes, Linus actually could revoke the license on any copy of

Re: Race condition: calling remove_proc_entry in cleanup_module (module_exit) while someone's using procfile

2007-09-04 Thread Alexey Dobriyan

On Tue, Sep 04, 2007 at 06:39:33PM +0200, anon... anon.al wrote:
> There is a race condition if an instance is executing "__exit
> device_exit" and calls remove_proc_entry, while someone is still using
> the procfile, right?.
> 
> static void __exit device_exit(void)
> {
>   // what if the procfile is still in use?
>   remove_proc_entry(PROC_FILE_NAME, &proc_root);
> }

For regular proc files, this is fixed in 2.6.23-rc1 and later.

[GIT PULL] x86 setup: work around bug in Xen HVM

2007-09-04 Thread H. Peter Anvin

Hi Linus,

Please pull:

  git://git.kernel.org/pub/scm/linux/kernel/git/hpa/linux-2.6-x86setup.git 
for-linus

Christian Ehrhardt (1):
  [x86 setup] Work around bug in Xen HVM

 arch/i386/boot/pm.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

[Log messages and full diffs follow]

commit 92ea189254b87727d6be407558d9c18fed0937bb
Author: Christian Ehrhardt <[EMAIL PROTECTED]>
Date:   Mon Sep 3 20:32:38 2007 +0200

[x86 setup] Work around bug in Xen HVM

Apparently XEN does not keep the contents of the 48-bit gdt_48 data
structure that is passed to lgdt in the XEN machine state. Instead it
appears to save the _address_ of the 48-bit descriptor
somewhere. Unfortunately this data happens to reside on the stack and
is probably no longer available at the time of the actual protected
mode jump.

This is a Xen bug but given that there is a one-line patch to work
around this problem, the linux kernel should probably do this.  My fix
is to make the gdt_48 description in setup_gdt static (in setup_idt
this is already the case). This allows the kernel to boot under
Xen HVM again.

Signed-off-by: Christian Ehrhardt <[EMAIL PROTECTED]>
Signed-off-by: H. Peter Anvin <[EMAIL PROTECTED]>

diff --git a/arch/i386/boot/pm.c b/arch/i386/boot/pm.c
index 6be9ca8..f7958f1 100644
--- a/arch/i386/boot/pm.c
+++ b/arch/i386/boot/pm.c
@@ -122,7 +122,7 @@ static void setup_gdt(void)
/* DS: data, read/write, 4 GB, base 0 */
[GDT_ENTRY_BOOT_DS] = GDT_ENTRY(0xc093, 0, 0xf),
};
-   struct gdt_ptr gdt;
+   static struct gdt_ptr gdt;
 
gdt.len = sizeof(boot_gdt)-1;
gdt.ptr = (u32)&boot_gdt + (ds() << 4);
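
Why the storage class matters can be shown with a small user-space sketch. It assumes, as the commit message suggests, that something keeps the address of the descriptor rather than a copy of its contents; the struct and function names are hypothetical and only loosely modelled on the boot code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the boot code's descriptor-table pointer. */
struct sketch_gdt_ptr {
	uint16_t len;
	uint32_t ptr;
};

/* With 'static', the object has one fixed address for the life of the
 * program, so an address captured during the call stays valid after the
 * function returns.  An automatic (stack) variable would not: its slot
 * can be reused as soon as the function returns. */
const struct sketch_gdt_ptr *sketch_setup(uint16_t len, uint32_t base)
{
	static struct sketch_gdt_ptr gdt;

	assert(len != 0);	/* a zero limit would describe an empty table */
	gdt.len = len;
	gdt.ptr = base;
	return &gdt;		/* legal only because gdt is static */
}
```

Returning or retaining the address of a non-static local instead would leave a dangling pointer, which matches the failure described above when the saved address is only dereferenced later.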

Problem to recognize that the file system is full

2007-09-04 Thread Guilherme Vilela

Hi,
I'm trying to mount an NFS file system with the async option and run a
program that writes to it. The problem is that the program keeps
writing even when the file system is full. It appears that NFS doesn't
see that the file system is full and keeps writing to the cache. This
problem doesn't occur when mounting with the sync option, or with async
plus the noac option, but then performance becomes very poor, and
performance is important to my application. The problem doesn't occur
with the local file system either. I'm running CentOS 5.0.

Below is my test program.
Any idea how to resolve this problem?

Thanks,
Guilherme


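//io-stress.cpp

#include <fstream>
#include <iostream>
#include <string>

using namespace std;

int main()
{
    size_t i = 0;
    const string roller( "|/-\\" );
    const string sample( 1024, '*' );  // 1024 '*' characters
    ofstream ofs( "io-stress.txt" );

    try
    {
        // Throw ios::failure as soon as an open, write or flush fails.
        ofs.exceptions( ios::failbit | ios::badbit );

        while ( ofs )
        {
            ofs << sample << endl;  // endl already flushes the stream
            if ( i == roller.size() )
                i = 0;
            cout << "\b" << roller[i++] << flush;
        }
        ofs.close();
    }
    catch ( const ios::failure &e )
    {
        cerr << "\nwrite failed: " << e.what() << endl;
        return 1;
    }
    return 0;
}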
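
One thing worth checking in a case like this: with write-back caching, write() can return success once the data is in the local cache, and an out-of-space error may only be reported later, when the data is pushed out, which the application sees through fsync() or close(). Flushing a C++ stream only hands the data to write(); it does not call fsync(). The C sketch below checks all three return values. sketch_save() is a hypothetical helper, not part of any library, and whether it changes the behaviour on this particular NFS setup is an assumption that would need testing.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Write len bytes to path and force them out to storage.
 * Returns 0 on success or a negative errno value (e.g. -ENOSPC). */
int sketch_save(const char *path, const char *buf, size_t len)
{
	int fd, err = 0;

	assert(path != NULL && buf != NULL);
	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd < 0)
		return -errno;

	while (len > 0) {
		ssize_t n = write(fd, buf, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			err = -errno;
			break;
		}
		if (n == 0) {		/* no progress, give up */
			err = -EIO;
			break;
		}
		buf += n;
		len -= (size_t)n;
	}
	/* Deferred write errors may only show up here ... */
	if (err == 0 && fsync(fd) < 0)
		err = -errno;
	/* ... or here, so close() is checked as well. */
	if (close(fd) < 0 && err == 0)
		err = -errno;
	return err;
}
```

Calling fsync() after every record would cost much of the async performance, so a real program would more likely sync at checkpoints and treat a failure there as applying to everything written since the previous successful sync.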

Re: Race condition: calling remove_proc_entry in cleanup_module (module_exit) while someone's using procfile

2007-09-04 Thread anon... anon.al

On 9/4/07, anon... anon.al <[EMAIL PROTECTED]> wrote:

> If yes: which mechanism can be used?

I was thinking about using an atomic counter in procfile_write

  proc_f = create_proc_entry(PROC_FILE_NAME, 0644, NULL);
  //...
  proc_f->write_proc = procfile_write;

int procfile_write(struct file *filp, const char *buffer, \
   unsigned long len, void *data)
{
  //"StackXXX"
  atomic_inc(_procfile_users);

  printk(KERN_ALERT "Hi there!\n");

  atomic_dec(_procfile_users);
  wake_up_interruptible(&queue);
  return len;
}

and then in cleanup_module using:

wait_event_interruptible(queue,   \
( \
 spin_lock_irqsave(, flags), \
 cnt = atomic_read(_procfile_users),  \
 ((cnt == 0)  \
  ? 1 \
  : (spin_unlock_irqrestore(, flags), 0))\
));
remove_proc_entry(PROC_FILE_NAME, &proc_root);
spin_unlock_irqrestore(, flags);

But:
x1)
Could it happen that code is already in function procfile_write at "StackXXX"
(before atomic_inc(_procfile_users)) when the scheduler switches
to another task??
((Or is the "entering into a function, up to the function's first
statement" atomic??))

x2)
Could it happen that the scheduler switches, after
atomic_dec(_procfile_users) but before
return len??

If so, then it could happen that we're in spin_lock_irqsave, while
someone else is still using the procfile; and then this code still
fails miserably.
?
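
Both windows are real: entering a function is not atomic, and a task can be preempted (or be running on another CPU) before the increment or after the decrement. A counter therefore only helps if it is taken and dropped by code that stays loaded, i.e. by the caller of the handler rather than by the handler itself. Below is a user-space sketch of such a gate using C11 atomics. All names are hypothetical, and this is an illustration of the pattern, not the kernel's actual implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define SKETCH_DYING 0x40000000u  /* set once removal has begun */

/* Low bits count active users; SKETCH_DYING marks removal in progress. */
struct sketch_gate {
	atomic_uint word;
};

/* Try to become a user.  Fails once removal has started, so no new
 * caller can slip in after sketch_gate_close(). */
bool sketch_gate_enter(struct sketch_gate *g)
{
	unsigned int v = atomic_load(&g->word);

	do {
		if (v & SKETCH_DYING)
			return false;
		/* on failure the CAS reloads v, and the flag is rechecked */
	} while (!atomic_compare_exchange_weak(&g->word, &v, v + 1));
	return true;
}

/* Drop a reference taken by a successful sketch_gate_enter(). */
void sketch_gate_exit(struct sketch_gate *g)
{
	unsigned int old = atomic_fetch_sub(&g->word, 1);

	assert((old & ~SKETCH_DYING) > 0);
}

/* Refuse new users from now on. */
void sketch_gate_close(struct sketch_gate *g)
{
	atomic_fetch_or(&g->word, SKETCH_DYING);
}

/* True once the gate is closed and the last user has left. */
bool sketch_gate_drained(struct sketch_gate *g)
{
	return atomic_load(&g->word) == SKETCH_DYING;
}
```

The remover would call sketch_gate_close(), wait until sketch_gate_drained() is true, and only then free the entry. The check and the increment happen in one atomic step, which is what the separate spin_lock plus atomic_read in the code above cannot guarantee.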

Regards -Albert

[PATCH 1/1] pata_it821x: fix lost interrupt with atapi devices

2007-09-04 Thread Jeff Norden

 From: Jeff Norden <[EMAIL PROTECTED]>

Fix "lost" interrupt problem when using dma with CD/DVD drives in some
configurations.  This problem can make installing linux from media
impossible for distros that have switched to libata-only configurations.

The simple fix is to eliminate the use of dma for reading drive status, etc,
by checking the number of bytes to be transferred.

This change will only affect the behavior of atapi devices, not disks.
There is more info at http://bugzilla.redhat.com/show_bug.cgi?id=242229
This patch is for 2.6.22.1

Signed-off-by: Jeff Norden <[EMAIL PROTECTED]>

---

--- pata_it821x.c.orig  2007-08-16 14:20:49.0 -0500
+++ pata_it821x.c   2007-08-31 16:09:22.0 -0500
@@ -533,6 +533,10 @@ static int it821x_check_atapi_dma(struct
struct ata_port *ap = qc->ap;
struct it821x_dev *itdev = ap->private_data;
 
+   /* Only use dma for transfers to/from the media. */
+   if (qc->nbytes < 2048)
+   return -EOPNOTSUPP;
+
/* No ATAPI DMA in smart mode */
if (itdev->smart)
return -EOPNOTSUPP;


Race condition: calling remove_proc_entry in cleanup_module (module_exit) while someone's using procfile

2007-09-04 Thread anon... anon.al

Hi!

There is a race condition if an instance is executing "__exit
device_exit" and calls remove_proc_entry, while someone is still using
the procfile, right?.

static void __exit device_exit(void)
{
  // what if the procfile is still in use?
  remove_proc_entry(PROC_FILE_NAME, &proc_root);
}

To remove this race condition, the code in "__exit device_exit" must
a) be sure that no other instance is in procfile functions
b) call remove_proc_entry before any other instance accesses the procfile

*** Is this at all possible if a race-condition is to be avoided?
If yes: which mechanism can be used?

Thanks -Albert

Re: Linux 2.6.23-rc5

2007-09-04 Thread S.Çağlar Onur

Hi Again;

On Tue, 04 Sep 2007, S.Çağlar Onur wrote:
> Hi;
>
> On Sat, 01 Sep 2007, Linus Torvalds wrote:
> > So have fun, give it a go, and expect a quiet week next week.
>
> After upgrading to -rc5 (I'm currently using Linus's latest git + AppArmor and
> bootsplash patchset) my CD/DVD-ROM suddenly disappeared :).
>
> [EMAIL PROTECTED] ~ $ diff -u rc4 rc5 | grep cd-rom -i
> -Sep  3 11:30:33 localhost kernel: [   19.328120] scsi 4:0:0:0: CD-ROM
> MATSHITA DVD-RAM UJ-851S  1.50 PQ: 0 ANSI: 5
> -Sep  3 11:30:33 localhost kernel: [   21.258819] Uniform CD-ROM driver
> Revision: 3.20
> -Sep  3 11:30:33 localhost kernel: [   21.258873] sr 4:0:0:0: Attached scsi
> CD-ROM sr0

Forget it :), it was a hardware problem caused by the CD-ROM bay.

Cheers
-- 
S.Çağlar Onur <[EMAIL PROTECTED]>
http://cekirdek.pardus.org.tr/~caglar/

Linux is like living in a teepee. No Windows, no Gates and an Apache in house!

Re: Linux 2.6.23-rc5

2007-09-04 Thread Prakash Punnoor

On Sunday 02 September 2007, Prakash Punnoor wrote:
> Hi,
>
> 2.6.23-rc5 locks up hard (even the Magic SysRq keys don't work) after a few
> minutes of work on x86_64. 2.6.23-rc4 was fine. I'll try git-bisect to find
> out what is causing the trouble. Yes, I am using the nvidia binary driver,
> but it hasn't caused trouble in ages... Once I have found the culprit, I'll
> check whether it works without the nvidia binary.

It seems my system is stable again with the patch from

http://lkml.org/lkml/2007/9/2/219

Though I didn't get a light show on hang...

-- 
(°= =°)
//\ Prakash Punnoor /\\
V_/ \_V



Re: [PATCH] Make rcutorture RNG use temporal entropy

2007-09-04 Thread Paul E. McKenney

On Tue, Sep 04, 2007 at 11:16:50AM +0530, Satyam Sharma wrote:
> Hi Paul,
> 
> 
> On Wed, 15 Aug 2007, Paul E. McKenney wrote:
> > 
> > The locking used by get_random_bytes() can conflict with the
> > preempt_disable() and synchronize_sched() form of RCU.  This patch changes
> > rcutorture's RNG to gather entropy from the new cpu_clock() interface
> > (relying on interrupts, preemption, daemons, and rcutorture's reader
> > thread's rock-bottom scheduling priority to provide useful entropy),
> > and also adds an EXPORT_SYMBOL_GPL() to make that interface available
> > to GPLed kernel modules such as rcutorture.
> 
> Honestly, rcutorture goes to some amazing lengths just to have this
> randomizing-the-delays-that-read/write-test-threads-spend-inside-or-
> outside-the-critical-sections thing :-) Especially, seeing that
> synchro-test, the other "comparable" module, just doesn't bother with
> all this at all. (especially check out its load == interval == do_sched
> == 0 case! :-)

Yep.  The need for that level of randomization in rcutorture has been made
painfully clear to me over a period of more than a decade.  Of course,
the overhead of the re-seeding does get diluted by a factor of 10,000 or
100,000, depending on what version you are using.  So, from a throughput
standpoint, the overhead is essentially that of a linear congruential
random-number generator.  This is critically important given the low
overhead of rcu_read_lock() and rcu_read_unlock().
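
The dilution argument above can be illustrated with a minimal user-space sketch. This is not rcutorture's actual code: the `sketch_` names, the LCG constants, and the use of `clock()` as a stand-in for a kernel timing source such as cpu_clock() are all assumptions made for illustration. The slow reseeding path runs once per `SKETCH_RNG_REFRESH + 1` draws, so the common case costs one multiply and one add.

```c
#include <stddef.h>
#include <time.h>

#define SKETCH_RNG_MULT    39916801UL   /* illustrative LCG multiplier (assumed) */
#define SKETCH_RNG_ADD     479001599UL  /* illustrative LCG increment (assumed) */
#define SKETCH_RNG_REFRESH 10000L       /* fast-path draws between reseeds */

struct sketch_rng {
	unsigned long state;            /* current LCG state */
	long count;                     /* draws left before the next reseed */
	unsigned long reseeds;          /* reseeds so far, kept for inspection */
	unsigned long (*entropy)(void); /* timing source; NULL disables reseeding */
};

/* User-space stand-in for a timing source such as the kernel's cpu_clock(). */
static unsigned long sketch_clock_entropy(void)
{
	return (unsigned long)clock();
}

static void sketch_rng_init(struct sketch_rng *r, unsigned long seed,
			    unsigned long (*entropy)(void))
{
	r->state = seed;
	r->count = 0;
	r->reseeds = 0;
	r->entropy = entropy;
}

static unsigned long sketch_rng_next(struct sketch_rng *r)
{
	/* Slow path: taken on the first draw, then once per REFRESH + 1 draws. */
	if (--r->count < 0) {
		if (r->entropy) {
			r->state += r->entropy();
			r->reseeds++;
		}
		r->count = SKETCH_RNG_REFRESH;
	}
	/* Fast path: one LCG step, wrapping modulo the width of unsigned long. */
	r->state = r->state * SKETCH_RNG_MULT + SKETCH_RNG_ADD;
	return r->state;
}
```

A caller would initialise one `struct sketch_rng` per thread with `sketch_rng_init(&r, seed, sketch_clock_entropy)` and then draw with `sketch_rng_next(&r)`. Because the modulus is a power of two, the low-order bits of the result cycle with short periods, so a user of such a generator should prefer the high-order bits.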

Still, this is indeed not what you want on a fastpath of a realtime
system, where average performance means nothing -- only the worst case
counts.  And this is why I am -not- putting the rcutorture RNG forward
for general-purpose use.  So we are at least in agreement on that piece!

And, as you hint below, anyone running rcutorture while also running
a production realtime workload needs to seriously rethink their design.  ;-)
(If you are instead running it to provide a test load for your realtime
testing, fine and good.)

> So IMHO, considering that rcutorture isn't a "serious" user of randomness
> in the first place (of even a "fast-and-loose version" for that matter),
> you could consider a solution where you gather all the randomness you need
> at module_init time itself and save it somewhere, and then use it wherever
> you're calling into rcu_random()->cpu_clock() [ or get_random_bytes() ]
> in the current code. You could even make some trivial updates to those
> random numbers after every RCU_RANDOM_REFRESH uses, like present.

Well, assuming that the Linux kernel really needs a central implementation
of a "pretty fast" and "pretty good" RNG, one could imagine all sorts of
designs:

1.  Use an LCRNG feeding into an array, as the old Berkeley random()
    does (or see Knuth for an earlier citation), but make it per-CPU.
    When pulling out randomness, do an MDn hash on the array
    along with a per-task counter and the per-CPU preempt counter.
    Increment the per-task counter on each use.  Do an LCRNG step
    on each use.  Since this is a fixed array, the collisions in
    CONFIG_PREEMPT due to preemption can be permitted to happen
    without penalty.

    This approach avoids all locking, all interrupt disabling, and
    all preemption disabling.  But the MD hashes aren't the fastest
    things in the kernel, from what I understand.

    Question: will this be fast enough?  If so, which of the MD
    hashes should be used?

2.  As in #1 above, but use some simpler hash, such as addition or
    XOR.  Maybe CRC.  (Benchmark for speed.)

3.  Just use a simple LCRNG with per-task state.  Perturb from some
    statistical counter (the per-CPU RCU grace-period counter might
    be appropriate).  Or don't even bother doing that.

    This would be -much- faster than any of the above, and would
    be deterministic, hence good for realtime use.  LCRNG might not
    satisfy more-demanding users, especially the paranoid ones.

    (This is what you are proposing above, correct?)

4.  Just use LCRNG into an array like Berkeley random(), but replicate
    on a per-CPU basis.  Maybe or maybe not perturb occasionally
    from some statistical counter as in #3 above.

    This would be reasonably fast, and should satisfy most users.
    People needing cryptographically secure RNGs should of course
    stick with get_random_bytes().

    [If I had some blazing reason to implement this -right- -now-,
    this would be the approach I would take.]

5.  Stick with the current situation where people needing fast
    and dirty RNGs roll their own.
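
As a rough illustration of what option #4 might look like, here is a user-space sketch of an additive-feedback table seeded by an LCG, with one table per CPU. Everything named `sketch_` or `SKETCH_` is hypothetical. The table size and lag (31 and 3), the LCG constants, and the fixed CPU count are assumptions chosen for illustration, and a plain array index stands in for real per-CPU data. The sketch ignores concurrency and occasional perturbation entirely.

```c
#include <stdint.h>

#define SKETCH_DEG     31  /* table size (assumed) */
#define SKETCH_SEP     3   /* distance between the two taps (assumed) */
#define SKETCH_NR_CPUS 4   /* hypothetical CPU count */

struct sketch_afg {
	uint32_t tab[SKETCH_DEG]; /* feedback table */
	int f;                    /* front tap index */
	int r;                    /* rear tap index */
};

/* Stand-in for per-CPU data: one generator per CPU index. */
static struct sketch_afg sketch_percpu[SKETCH_NR_CPUS];

/* Fill the table with successive LCG outputs derived from the seed. */
static void sketch_afg_seed(struct sketch_afg *g, uint32_t seed)
{
	uint32_t x = seed;

	for (int i = 0; i < SKETCH_DEG; i++) {
		x = x * 1103515245u + 12345u; /* illustrative LCG step */
		g->tab[i] = x;
	}
	g->f = SKETCH_SEP;
	g->r = 0;
}

/* One draw: add the rear tap into the front tap, return the top 31 bits. */
static uint32_t sketch_afg_next(struct sketch_afg *g)
{
	uint32_t out;

	g->tab[g->f] += g->tab[g->r];
	out = g->tab[g->f] >> 1; /* drop the weakest (lowest) bit */
	if (++g->f >= SKETCH_DEG)
		g->f = 0;
	if (++g->r >= SKETCH_DEG)
		g->r = 0;
	return out;
}

/* Give each CPU its own table so no state is shared between CPUs. */
static void sketch_percpu_seed(uint32_t seed)
{
	for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		sketch_afg_seed(&sketch_percpu[cpu], seed + (uint32_t)cpu);
}

static uint32_t sketch_percpu_next(int cpu)
{
	return sketch_afg_next(&sketch_percpu[cpu]);
}
```

After a single `sketch_percpu_seed(seed)` call, each draw is `sketch_percpu_next(cpu)` with the caller's CPU index: one addition, one shift, and two index updates, with no locking because each CPU touches only its own table.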

> Agreed, anybody running rcutorture isn't really looking for performance,
> but why call get_random_bytes() or cpu_clock() (and the smp_processor_id()
> + irq_save/restore + export_symbol() that goes with it) when it isn't
> _really_ "required" as such ...

Well, that would in fact be why the

Re: Linux 2.6.23-rc5

2007-09-04 Thread S.Çağlar Onur

Hi;

On Sat, 01 Sep 2007, Linus Torvalds wrote:
> So have fun, give it a go, and expect a quiet week next week.

After upgrading to -rc5 (I'm currently using Linus's latest git + AppArmor
and bootsplash patchset), my CD/DVD-ROM suddenly disappeared :).

[EMAIL PROTECTED] ~ $ diff -u rc4 rc5 | grep cd-rom -i
-Sep  3 11:30:33 localhost kernel: [   19.328120] scsi 4:0:0:0: CD-ROM MATSHITA DVD-RAM UJ-851S  1.50 PQ: 0 ANSI: 5
-Sep  3 11:30:33 localhost kernel: [   21.258819] Uniform CD-ROM driver Revision: 3.20
-Sep  3 11:30:33 localhost kernel: [   21.258873] sr 4:0:0:0: Attached scsi CD-ROM sr0

You can find the logs and config at [1]. I'm on vacation with an unstable
network connection, so I may not respond quickly, and I won't have time to
bisect for a while :(.

[1] http://cekirdek.pardus.org.tr/~caglar/kernel/

Cheers
-- 
S.Çağlar Onur <[EMAIL PROTECTED]>
http://cekirdek.pardus.org.tr/~caglar/

Linux is like living in a teepee. No Windows, no Gates and an Apache in house!

