Re: [PATCH 2/3] cciss: add support for blktrace

2007-11-19 Thread Andrew Morton
On Mon, 19 Nov 2007 16:07:17 -0600 Mike Miller <[EMAIL PROTECTED]> wrote:

> Patch 2 of 3
> This patch adds support for the blktrace utility. Please consider this for
> inclusion. Seems there was already a call to blk_add_trace. This patch adds
> ifdef's and includes the header file.
> 
> Signed-off-by: Mike Miller <[EMAIL PROTECTED]>
> 
> 
> diff --git a/drivers/block/cciss.c b/drivers/block/cciss.c
> index 2ba5a89..61bc0f3 100644
> --- a/drivers/block/cciss.c
> +++ b/drivers/block/cciss.c
> @@ -41,6 +41,10 @@
>  #include 
>  #include 
>  
> +#ifdef CONFIG_BLK_DEV_IO_TRACE
> +#include 
> +#endif /* CONFIG_BLK_DEV_IO_TRACE */

The ifdefs shouldn't be needed here.  If they are needed, blktrace_api.h needs
fixing.

>  #include 
>  #include 
>  #include 
> @@ -3013,7 +3017,9 @@ after_error_processing:
>   }
>   cmd->rq->data_len = 0;
>   cmd->rq->completion_data = cmd;
> +#ifdef CONFIG_BLK_DEV_IO_TRACE
>   blk_add_trace_rq(cmd->rq->q, cmd->rq, BLK_TA_COMPLETE);
> +#endif /* CONFIG_BLK_DEV_IO_TRACE */
>   blk_complete_request(cmd->rq);
>  }

And if you remove the first set of ifdefs, these ifdefs can also be
removed.
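
The pattern being asked for can be sketched in plain C as follows. This is
only an illustration, not the contents of blktrace_api.h: the config symbol
CONFIG_DEMO_IO_TRACE, the type demo_request and both demo_* functions are
made-up names. The idea is that the header supplies an empty inline stub
when the feature is configured out, so callers need no ifdefs of their own.

```c
#include <stdio.h>

/* Hypothetical request type; not the kernel's struct request. */
struct demo_request {
	int id;
	int completed;
};

/* Counts trace events so the effect of the stub is observable. */
static int demo_trace_events;

#ifdef CONFIG_DEMO_IO_TRACE	/* hypothetical config symbol */
static inline void demo_add_trace_rq(struct demo_request *rq)
{
	demo_trace_events++;
	printf("trace: request %d completed\n", rq->id);
}
#else
/* Tracing configured out: the header supplies an empty inline stub. */
static inline void demo_add_trace_rq(struct demo_request *rq)
{
	(void)rq;
}
#endif

/* The caller needs no #ifdef; the stub compiles away when tracing is off. */
static void demo_complete_request(struct demo_request *rq)
{
	demo_add_trace_rq(rq);
	rq->completed = 1;
}
```

Built without -DCONFIG_DEMO_IO_TRACE, demo_complete_request() still compiles
and simply records no trace events.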

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/1] x86: convert-cpuinfo_x86-array-to-a-per_cpu-array fix

2007-11-19 Thread Thomas Gleixner
On Wed, 14 Nov 2007, Mike Travis wrote:

> Hi Andrew,
> 
> It appears that this patch is missing from the latest 2.6.24 git kernel?
> 
> (Suresh noticed that it is still a problem.)
> 
> Thanks,
> Mike
> 
> This fix corrects the problem that early_identify_cpu() sets
> cpu_index to '0' (needed when called by setup_arch) after
> smp_store_cpu_info() had set it to the correct value.
> 
> Signed-off-by: Mike Travis <[EMAIL PROTECTED]>
> ---
>  arch/x86_64/kernel/smpboot.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> --- linux.orig/arch/x86_64/kernel/smpboot.c   2007-10-12 14:28:45.0 -0700
> +++ linux/arch/x86_64/kernel/smpboot.c 2007-10-12 14:53:42.753508152 -0700
> @@ -141,8 +141,8 @@ static void __cpuinit smp_store_cpu_info
>   struct cpuinfo_x86 *c = &cpu_data(id);
>  
>   *c = boot_cpu_data;
> - c->cpu_index = id;
>   identify_cpu(c);
> + c->cpu_index = id;
>   print_cpu_info(c);
>  }
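
The ordering issue described in the quoted changelog can be sketched in
userspace C. This illustrates the quoted patch only, not the mainline commit
referenced below; demo_cpuinfo and the demo_* functions are hypothetical
stand-ins, with demo_identify_cpu() assumed to reset cpu_index to 0 the way
the changelog says the early identify path does.

```c
/* Hypothetical stand-in for struct cpuinfo_x86. */
struct demo_cpuinfo {
	int cpu_index;
};

/* Stand-in for identify_cpu(): assumed to reset cpu_index to 0. */
static void demo_identify_cpu(struct demo_cpuinfo *c)
{
	c->cpu_index = 0;
}

/* Original order: the index is stored, then overwritten by identify. */
static void demo_store_index_first(struct demo_cpuinfo *c, int id)
{
	c->cpu_index = id;
	demo_identify_cpu(c);
}

/* Patched order: the index is stored after identify, so it survives. */
static void demo_store_index_last(struct demo_cpuinfo *c, int id)
{
	demo_identify_cpu(c);
	c->cpu_index = id;
}
```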

The correct fix is already in mainline:

commit 699d934d5f958d7944d195c03c334f28cc0b3669

tglx


Re: [PATCH] wait_task_stopped: pass correct exit_code to wait_noreap_copyout

2007-11-19 Thread Scott James Remnant
On Mon, 2007-11-19 at 22:43 -0800, Andrew Morton wrote:
> On Sun, 18 Nov 2007 09:13:24 + Scott James Remnant <[EMAIL PROTECTED]> 
> wrote:
> 
> > In wait_task_stopped() exit_code already contains the right value for
> > the si_status member of siginfo, and this is simply set in the non
> > WNOWAIT case.
> > 
> > Pass it unchanged to wait_noreap_copyout();  we would only need to
> > shift it and add 0x7f if we were returning it in the user status field
> > and that isn't used for any function that permits WNOWAIT.
> > 
> Is this bug visible to userspace?  If so, I'm surprised that none of the
> various testsuites (which like to exercise this sort of interface) has
> detected it.
> 
Absolutely;  if you call waitid() with a stopped or traced process,
you'll get the signal in siginfo.si_status as expected -- however if you
call waitid(WNOWAIT) at the same time, you'll get the signal << 8 | 0x7f
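
For illustration, the encoded value mentioned above is the classic
wait()-style status word for a stopped child. A minimal sketch, where
stopped_wait_status() is a hypothetical helper and not a kernel function:

```c
#include <signal.h>
#include <sys/wait.h>

/*
 * Build the wait()-style status word for a child stopped by sig.
 * This is the shifted form; si_status is expected to carry the plain
 * signal number instead.
 */
static int stopped_wait_status(int sig)
{
	return (sig << 8) | 0x7f;
}
```

The WIFSTOPPED() and WSTOPSIG() macros decode this form, which is why it
belongs in a wait() status and not in siginfo.si_status.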

Scott
-- 
Scott James Remnant
[EMAIL PROTECTED]




Re: [PATCH 1/3] cciss: export more sysfs attributes

2007-11-19 Thread Andrew Morton
On Mon, 19 Nov 2007 16:03:07 -0600 Mike Miller <[EMAIL PROTECTED]> wrote:

> Patch 1 of 3
> This patch creates more sysfs attributes to be exported by cciss. Hopefully
> we can work better with udev. Please consider this patch for inclusion.
> 

It would be appropriate if the changelog were to describe what the problem
is with udev, and how this patch attempts to address it.

> 
> diff --git a/drivers/block/cciss.c b/drivers/block/cciss.c
> index 7d70496..2ba5a89 100644
> --- a/drivers/block/cciss.c
> +++ b/drivers/block/cciss.c
> @@ -229,20 +229,483 @@ static inline CommandList_struct 
> *removeQ(CommandList_struct **Qptr,
>   return c;
>  }
>  
> +static inline int find_drv_index(int ctlr, drive_info_struct *drv){
> +int i;
> +for (i=0; i < CISS_MAX_LUN; i++) {
> +if (hba[ctlr]->drv[i].LunID == drv->LunID)
> +return i;
> +}
> +return i;
> +}

Please pass all patches through scripts/checkpatch.pl before sending.  It
will detect things like the coding style errors in the above code.

Also, that function seems to be too large to be inlined.
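
As a rough sketch of the style being asked for, here is the same lookup
written with kernel-style layout (opening brace on its own line, tab
indentation, spaces around operators) and without the inline keyword. The
types are simplified stand-ins so the fragment builds on its own:
demo_drive and DEMO_MAX_LUN are hypothetical, and the drive array is passed
in directly instead of going through hba[ctlr]. Behaviour is unchanged,
including returning the array size when nothing matches.

```c
#define DEMO_MAX_LUN 4	/* hypothetical stand-in for CISS_MAX_LUN */

/* Hypothetical stand-in for drive_info_struct. */
struct demo_drive {
	unsigned int lun_id;
};

/* Return the index of the drive whose LUN id matches, or DEMO_MAX_LUN. */
static int find_drv_index(const struct demo_drive *drives,
			  const struct demo_drive *drv)
{
	int i;

	for (i = 0; i < DEMO_MAX_LUN; i++) {
		if (drives[i].lun_id == drv->lun_id)
			return i;
	}
	return i;
}
```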

>  #include "cciss_scsi.c"  /* For SCSI tape support */
>  
> +#define ENG_GIG 10
> +#define ENG_GIG_FACTOR (ENG_GIG/512)
>  #define RAID_UNKNOWN 6
> +static const char *raid_label[] = { "0", "4", "1(1+0)", "5", "5+1", "ADG",
> + "UNKNOWN"};
> +
> +
> +static spinlock_t sysfs_lock = SPIN_LOCK_UNLOCKED;

And that's a bug which checkpatch would have detected.  Please use
DEFINE_SPINLOCK() to avoid confusing lockdep.

> +static void cciss_sysfs_stat_inquiry(int ctlr, int logvol,
> + int withirq, drive_info_struct *drv)
> +{
> + int return_code;
> + InquiryData_struct *inq_buff;
> +
> + /* If there are no heads then this is the controller disk and
> +  * not a valid logical drive so don't query it.
> +  */
> + if (!drv->heads)
> + return;
> +
> + inq_buff = kzalloc(sizeof(InquiryData_struct), GFP_KERNEL);
> + if (!inq_buff) {
> + printk(KERN_ERR "cciss: out of memory\n");
> + goto err;
> + }
> +
> + if (withirq)
> + return_code = sendcmd_withirq(CISS_INQUIRY, ctlr,
> + inq_buff, sizeof(*inq_buff), 1, logvol ,0, TYPE_CMD);
> + else
> + return_code = sendcmd(CISS_INQUIRY, ctlr, inq_buff,
> + sizeof(*inq_buff), 1, logvol , 0, NULL, TYPE_CMD);
> + if (return_code == IO_OK) {
> + memcpy(drv->vendor, &inq_buff->data_byte[8], 8);
> + drv->vendor[8]='\0';
> + memcpy(drv->model, &inq_buff->data_byte[16], 16);
> + drv->model[16] = '\0';
> + memcpy(drv->rev, &inq_buff->data_byte[32], 4);
> + drv->rev[4] = '\0';
> + } else { /* Get geometry failed */
> + printk(KERN_WARNING "cciss: inquiry for VPD page 0 failed\n");
> + }
> +
> + if (withirq)
> + return_code = sendcmd_withirq(CISS_INQUIRY, ctlr,
> + inq_buff, sizeof(*inq_buff), 1, logvol ,0x83, TYPE_CMD);
> + else
> + return_code = sendcmd(CISS_INQUIRY, ctlr, inq_buff,
> + sizeof(*inq_buff), 1, logvol , 0x83, NULL, TYPE_CMD);
> +
> + if (return_code == IO_OK) {
> + memcpy(drv->uid, &inq_buff->data_byte[8], 16);
> + } else { /* Get geometry failed */
> + printk(KERN_WARNING "cciss: id logical drive failed\n");
> + }
> +
> + kfree(inq_buff);
> +err:
> + drv->vendor[8] = '\0';
> + drv->model[16] = '\0';
> + drv->rev[4] = '\0';
> +
> +}
> +
> +static ssize_t cciss_show_raid_level(struct device *dev,
> +  struct device_attribute *attr, char *buf)
> +{
> + struct drv_dynamic *d;
> + drive_info_struct *drv;
> + ctlr_info_t *h;
> + unsigned long flags;
> + int raid;
> +
> + d = container_of(dev, struct drv_dynamic, dev);
> + spin_lock(&sysfs_lock);
> + if (!d->disk) {
> + spin_unlock(&sysfs_lock);
> + return -ENOENT;
> + }
> +
> + h = get_host(d->disk);
> +
> + spin_lock_irqsave(CCISS_LOCK(h->ctlr), flags);
> + if (h->busy_configuring) {
> + spin_unlock_irqrestore(CCISS_LOCK(h->ctlr), flags);
> + spin_unlock(&sysfs_lock);
> + return snprintf(buf, 30, "Device busy configuring\n");
> + }
> +
> + drv = d->disk->private_data;
> + if ((drv->raid_level < 0) || (drv->raid_level) > 5)
> + raid = RAID_UNKNOWN;
> + else
> + raid = drv->raid_level;
> +
> + spin_unlock_irqrestore(CCISS_LOCK(h->ctlr), flags);
> + spin_unlock(&sysfs_lock);
> + return snprintf(buf, 20, "RAID %s\n", raid_label[raid]);
> +}
> +
> +static ssize_t cciss_show_disk_size(struct device *dev,
> + struct device_attribute *attr, char *buf)
> +{
> + struct drv_dynamic *d;
> + drive_info_struct *drv;
> + ctlr_info_t *h;
> 

netconsole=y and rtl8139=m

2007-11-19 Thread Jan Engelhardt
Hi,



I get this during boot:

[   40.821740] netconsole: eth1 doesn't exist, aborting.

Given that CONFIG_NETCONSOLE=y and CONFIG_8139TOO=m, I can imagine.
Is there a way to get this working without making 8139TOO=y or 
NETCONSOLE=m?



thanks,
Jan


Re: [PATCH 54/59] net/irda: Add missing "space"

2007-11-19 Thread David Miller
From: Joe Perches <[EMAIL PROTECTED]>
Date: Mon, 19 Nov 2007 17:53:41 -0800

> 
> Signed-off-by: Joe Perches <[EMAIL PROTECTED]>

Applied.


Re: [PATCH] net/ipv4/arp.c: Fix arp reply when sender ip 0

2007-11-19 Thread David Miller
From: Bill Fink <[EMAIL PROTECTED]>
Date: Tue, 20 Nov 2007 00:16:07 -0500

> On Mon, 19 Nov 2007, Alexey Kuznetsov wrote:
> 
> > 2. What's about your suggestion, I thought about this and I am going to 
> > agree.
> > 
> >Arguments, which convinced me are:
> > 
> >- arping still works.
> >- any piece of reasonable software should work.
> >- if Windows understands DaD (is it really true? I cannot believe)
> >  and it is unhappy about our response and does not block use
> >  of duplicate address only due to this, we _must_ accommodate ASAP.
> >- if we do, we have to use 0 protocol address, no choice.
> 
> I agree the target protocol address should be 0 in this case.

Patches, someone :-)


Re: [PATCH 55/59] net/sctp: Add missing "space"

2007-11-19 Thread David Miller
From: Joe Perches <[EMAIL PROTECTED]>
Date: Mon, 19 Nov 2007 17:53:42 -0800

> 
> Signed-off-by: Joe Perches <[EMAIL PROTECTED]>

Applied.


Re: [PATCH 56/59] net/sunrpc: Add missing "space"

2007-11-19 Thread David Miller
From: Joe Perches <[EMAIL PROTECTED]>
Date: Mon, 19 Nov 2007 17:53:43 -0800

> 
> Signed-off-by: Joe Perches <[EMAIL PROTECTED]>

Applied.


Re: [PATCH 53/59] net/ipv6: Add missing "space"

2007-11-19 Thread David Miller
From: Joe Perches <[EMAIL PROTECTED]>
Date: Mon, 19 Nov 2007 17:53:40 -0800

> 
> Signed-off-by: Joe Perches <[EMAIL PROTECTED]>

Applied.


Re: [PATCH 50/59] net/bridge: Add missing "space"

2007-11-19 Thread David Miller
From: Joe Perches <[EMAIL PROTECTED]>
Date: Mon, 19 Nov 2007 17:53:37 -0800

> 
> Signed-off-by: Joe Perches <[EMAIL PROTECTED]>

Applied.


Re: [PATCH 52/59] net/ipv4: Add missing "space"

2007-11-19 Thread David Miller
From: Joe Perches <[EMAIL PROTECTED]>
Date: Mon, 19 Nov 2007 17:53:39 -0800

> 
> Signed-off-by: Joe Perches <[EMAIL PROTECTED]>

Applied.


Re: [PATCH 51/59] net/dccp: Add missing "space"

2007-11-19 Thread David Miller
From: Joe Perches <[EMAIL PROTECTED]>
Date: Mon, 19 Nov 2007 17:53:38 -0800

> Signed-off-by: Joe Perches <[EMAIL PROTECTED]>

Applied.


Re: [PATCH 06/59] arch/sparc: Add missing "space"

2007-11-19 Thread Joe Perches
On Mon, 2007-11-19 at 23:45 -0800, David Miller wrote:
> From: Joe Perches <[EMAIL PROTECTED]>
> Date: Mon, 19 Nov 2007 17:47:58 -0800
> > Signed-off-by: Joe Perches <[EMAIL PROTECTED]>
> Please check your patches, for trailing white "space".
> Adds trailing whitespace.
> diff:10:  prom_printf("PCIC: Error, cannot map " 
> Adds trailing whitespace.
> diff:19:  prom_printf("PCIC: Error, cannot map " 
> warning: 2 lines add whitespace errors.
> I've fixed it up this time.

It doesn't add whitespace, but it does keep the trailing
whitespace that's already there.
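
A check of this kind looks only at the added line as a whole, which would
explain why a re-added line that keeps old trailing whitespace is still
flagged. A minimal sketch of such a test follows; has_trailing_whitespace()
is a hypothetical helper and is not taken from git or checkpatch.

```c
#include <stdbool.h>
#include <string.h>

/* Return true if the line ends in a space or tab before its terminator. */
static bool has_trailing_whitespace(const char *line)
{
	size_t len = strlen(line);

	/* Ignore the line terminator itself. */
	while (len > 0 && (line[len - 1] == '\n' || line[len - 1] == '\r'))
		len--;

	return len > 0 && (line[len - 1] == ' ' || line[len - 1] == '\t');
}
```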

cheers, Joe



Re: [PATCH 06/59] arch/sparc: Add missing "space"

2007-11-19 Thread David Miller
From: Joe Perches <[EMAIL PROTECTED]>
Date: Mon, 19 Nov 2007 17:47:58 -0800

> Signed-off-by: Joe Perches <[EMAIL PROTECTED]>

Please check your patches, for trailing white "space".

Adds trailing whitespace.
diff:10:prom_printf("PCIC: Error, cannot map " 
Adds trailing whitespace.
diff:19:prom_printf("PCIC: Error, cannot map " 
warning: 2 lines add whitespace errors.

I've fixed it up this time.


Re: [PATCH 07/59] arch/sparc64: Add missing "space"

2007-11-19 Thread David Miller
From: Joe Perches <[EMAIL PROTECTED]>
Date: Mon, 19 Nov 2007 17:47:59 -0800

> 
> Signed-off-by: Joe Perches <[EMAIL PROTECTED]>

Applied.


Re: [PATCH 01/59] arch/ia64: Add missing "space"

2007-11-19 Thread Simon Horman
On Mon, Nov 19, 2007 at 11:09:25PM -0800, Joe Perches wrote:
> On Tue, 2007-11-20 at 15:54 +0900, Simon Horman wrote:
> > Is it really necessary for this fragment to create a line that
> > is greater than 80 characters long? Presumably the entire reason
> > that the printk line was split in the first place was to avoid
> > a long line.
> 
> No.  Many other lines in that source file are > 80 char.

That may be so, but surely adding another one makes
things slightly worse.

> My initial preference was to reformat the indented lines to the
> printk open parenthesis, but the minimal change seemed better.
> 
> cheers, Joe

-- 
Horms



Re: CONFIG_IRQBALANCE for 64-bit x86 ?

2007-11-19 Thread Nick Piggin
On Tuesday 20 November 2007 16:37, Arjan van de Ven wrote:
> On Tue, 20 Nov 2007 15:17:15 +1100

> > For that matter, I'd like to know why it has been decided that the
> > best place for IRQ balancing is in userspace. It should be in kernel
> > IMO, and it would probably allow better power saving, performance,
> > fairness, etc. if it were to be integrated with the task balancer as
> > well.
>
> actually no. IRQ balancing is not a "fast" decision; every time you

I didn't say anything of the sort. But IRQ load could still fluctuate
a lot more rapidly than we'd like to wake up the irqbalancer.


> move an interrupt around, you end up causing a really a TON of cache
> line bounces, and generally really bad performance

All the more reason why the kernel should do it. When I say move it to
the kernel, I don't mean because I want to move IRQs 1 000 000 times
per second and can't sustain enough context switches to do it in
userspace. Userspace basically has insufficient information to do it
as well as kernel.

We do task balancing in the kernel too, it's a pretty similar problem
(although granted it is less feasible for userspace because tasks are
created and destroyed very often)


> (esp if you do it 
> for networking ones, since you destroy the packet reassembly stuff in
> the tcp/ip stack).
>
> Instead, what ends up working is if you do high level categories of
> interrupt classes and balance within those (so that no 2 networking
> irqs are on the same core/package unless you have more nics than cores)

Sure, but you say that like it is difficult information for the kernel
to know about. Actually it is much easier. Note that you can still
bind interrupts to specific CPUs.


> etc. Balancing on a 10 second scale seems to work quite well; no need
> to pull that complexity into the kernel

My perspective is that it isn't a good idea to have such a critical
piece of infrastructure outside the kernel.

I want the kernel to balance interrupts and tasks fairly; maybe move
interrupts closer to the tasks they are interacting with (instead of,
or combined with our current policy of moving tasks near the interrupts,
which can be much more damaging for cache and NUMA); move all interrupts
to a single core when there is enough capacity and we are balancing for
power savings; do exponential interrupt balancing backoff when it isn't
required; etc. Not easy to do all that in userspace.

Any reason you actually think it is a good idea, aside from the fact
that a userspace solution was able to be better than a crappy old
kernel one?


Re: [BUG?] OOM with large cache....(x86_64, 2.6.24-rc3-git1, nohz)

2007-11-19 Thread Nick Piggin
On Tuesday 20 November 2007 16:46, Ingo Molnar wrote:
> * Nick Piggin <[EMAIL PROTECTED]> wrote:
> > Unfortunately, we don't show NR_ANON_PAGES in these stats, [...]
>
> sidenote: the way i combat these missing pieces of instrumentation in
> the scheduler is to add them immediately to the cfs-debug-info.sh script
> (and to /proc/sched_debug if needed). I.e. if we get one report that
> misses a piece of critical information is OK, but if it's two reports
> and we still havent made it easy to report the right kind of information
> that is our fault entirely. This constant ping-ponging for information
> that goes on for basically every MM problem - which information could
> have been provided in the first message (by running a single, easy to
> download tool) is getting pretty hindering i believe.

I do usually add the stats as I've needed them. I haven't
specifically needed NR_ANON_PAGES for an oom-killer problem
before, but I've added plenty of other output there.

(it's in /proc/meminfo of course, which is the most useful...)


Re: [ALSA PATCH] alsa-git merge request

2007-11-19 Thread Takashi Iwai
At Mon, 19 Nov 2007 19:35:09 +0100 (CET),
Jaroslav Kysela wrote:
> 
> 
> Linus, please pull from [the linus branch at]:
> 
>   master.kernel.org:/pub/scm/linux/kernel/git/perex/alsa.git linus
> gitweb interface:
>   http://www.kernel.org/git/?p=linux/kernel/git/perex/alsa.git

Grrr, please hold this.  The tree looks broken.

At least, the last patch
   [ALSA] hda-codec - Revert volume knob controls in STAC codecs
wasn't applied properly, and resulted in an empty change.

This is likely because the patch used was for mm tree and not cleanly
applied to 2.6.24-rc3.  It'll require a manual adjustment.


In addition, the following patches are real fixes that are missing from this
push:

5543  emu10k1 - Check value ranges in ctl callbacks
5532  fix private data pointer calculation in CS4270 driver
5530  emu10k1: Add mixer controls parameter checking.
5495  portman2x4 - Fix probe error
5485  ca0106 - Fix write proc assignment
5476  s3c2443-ac97: compilation fix

thanks,

Takashi

> 
> The GNU patch is available at:
> 
>   ftp://ftp.alsa-project.org/pub/kernel-patches/alsa-git-2007-11-19.patch.gz
> 
> Additional notes:
> 
>   Just fixes and cleanups.
> 
> 
> The following files will be updated:
> 
>  include/sound/version.h|2 +-
>  sound/drivers/mpu401/mpu401_uart.c |   12 +++---
>  sound/pci/ca0106/ca0106_mixer.c|   18 ++-
>  sound/pci/cmipci.c |5 +--
>  sound/pci/hda/hda_codec.c  |   40 +--
>  sound/pci/hda/hda_local.h  |1 +
>  sound/pci/hda/patch_analog.c   |8 +++
>  7 files changed, 56 insertions(+), 30 deletions(-)
> 
> 
> The following things were done:
> 
> Clemens Ladisch (2):
>   [ALSA] cmipci: fix FLINKON/OFF bits
>   [ALSA] mpu401: fix recursive locking in timer
> 
> Jaroslav Kysela (1):
>   [ALSA] version 1.0.15
> 
> Takashi Iwai (4):
>   [ALSA] hda-codec - Disable shared stream on AD1986A
>   [ALSA] hda-codec - Check PINCAP only for PIN widgets
>   [ALSA] ca0106 - Check value range in ctl callbacks
>   [ALSA] hda-codec - Revert volume knob controls in STAC codecs
> 
> -
> Jaroslav Kysela <[EMAIL PROTECTED]>
> Linux Kernel Sound Maintainer
> ALSA Project, SUSE Labs
> 


Re: [PATCH 01/59] arch/ia64: Add missing "space"

2007-11-19 Thread Joe Perches
On Tue, 2007-11-20 at 15:54 +0900, Simon Horman wrote:
> Is it really necessary for this fragment to create a line that
> is greater than 80 characters long? Presumably the entire reason
> that the printk line was split in the first place was to avoid
> a long line.

No.  Many other lines in that source file are > 80 char.

My initial preference was to reformat the indented lines to the
printk open parenthesis, but the minimal change seemed better.

cheers, Joe



Re: [PATCHv4 1/6] actual sys_indirect code

2007-11-19 Thread Ingo Molnar

cool patchset. Small nit, the series is not bisectable:

> +#include 

> --- kernel/Makefile
> +++ kernel/Makefile
> @@ -9,7 +9,7 @@ obj-y = sched.o fork.o exec_domain.o panic.o printk.o 
> profile.o \
>   rcupdate.o extable.o params.o posix-timers.o \
>   kthread.o wait.o kfifo.o sys_ni.o posix-cpu-timers.o mutex.o \
>   hrtimer.o rwsem.o latency.o nsproxy.o srcu.o \
> - utsname.o notifier.o
> + utsname.o notifier.o indirect.o

indirect.o is built unconditionally, but it won't build at this stage
because asm/indirect.h is only introduced in later patches. I suspect a 
CONFIG_ARCH_SUPPORTS_INDIRECT_SYSCALLS Kconfig flag, and its use in the 
Makefile ought to do the trick. (it also reduces object size on 
architectures that have no support for indirect syscalls (yet))

Ingo


Re: [PATCH 1/3] tty: Add the new termios2 ioctls to the compatible list.

2007-11-19 Thread Andrew Morton
On Mon, 19 Nov 2007 13:52:06 +0100 Heiko Carstens <[EMAIL PROTECTED]> wrote:

> From: Heiko Carstens <[EMAIL PROTECTED]>
> 
> Make them depend on TCGETS2. If that one is implemented the rest
> should be there as well.
> 
> Cc: Alan Cox <[EMAIL PROTECTED]>
> Signed-off-by: Heiko Carstens <[EMAIL PROTECTED]>
> ---
> 
>  fs/compat_ioctl.c |6 ++
>  1 file changed, 6 insertions(+)
> 
> Index: linux-2.6/fs/compat_ioctl.c
> ===
> --- linux-2.6.orig/fs/compat_ioctl.c
> +++ linux-2.6/fs/compat_ioctl.c
> @@ -1954,6 +1954,12 @@ ULONG_IOCTL(TIOCSCTTY)
>  COMPATIBLE_IOCTL(TIOCGPTN)
>  COMPATIBLE_IOCTL(TIOCSPTLCK)
>  COMPATIBLE_IOCTL(TIOCSERGETLSR)
> +#ifdef TCGETS2
> +COMPATIBLE_IOCTL(TCGETS2)
> +COMPATIBLE_IOCTL(TCSETS2)
> +COMPATIBLE_IOCTL(TCSETSW2)
> +COMPATIBLE_IOCTL(TCSETSF2)
> +#endif
>  /* Little f */
>  COMPATIBLE_IOCTL(FIOCLEX)
>  COMPATIBLE_IOCTL(FIONCLEX)

Should this be in 2.6.24?


Re: [PATCH 52/59] net/ipv4: Add missing "space"

2007-11-19 Thread Simon Horman
On Mon, Nov 19, 2007 at 05:53:39PM -0800, Joe Perches wrote:
> 
> Signed-off-by: Joe Perches <[EMAIL PROTECTED]>

Acked-by: Simon Horman <[EMAIL PROTECTED]>

> ---
>  net/ipv4/ipvs/ip_vs_core.c   |2 +-
>  net/ipv4/netfilter/iptable_raw.c |2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/net/ipv4/ipvs/ip_vs_core.c b/net/ipv4/ipvs/ip_vs_core.c
> index 20c884a..8fba202 100644
> --- a/net/ipv4/ipvs/ip_vs_core.c
> +++ b/net/ipv4/ipvs/ip_vs_core.c
> @@ -637,7 +637,7 @@ static int ip_vs_out_icmp(struct sk_buff *skb, int 
> *related)
>   verdict = NF_DROP;
>  
>   if (IP_VS_FWD_METHOD(cp) != 0) {
> - IP_VS_ERR("shouldn't reach here, because the box is on the"
> + IP_VS_ERR("shouldn't reach here, because the box is on the "
> "half connection in the tun/dr module.\n");
>   }
>  
> diff --git a/net/ipv4/netfilter/iptable_raw.c 
> b/net/ipv4/netfilter/iptable_raw.c
> index 5de6e57..f867865 100644
> --- a/net/ipv4/netfilter/iptable_raw.c
> +++ b/net/ipv4/netfilter/iptable_raw.c
> @@ -66,7 +66,7 @@ ipt_local_hook(unsigned int hook,
>   if (skb->len < sizeof(struct iphdr) ||
>   ip_hdrlen(skb) < sizeof(struct iphdr)) {
>   if (net_ratelimit())
> - printk("iptable_raw: ignoring short SOCK_RAW"
> + printk("iptable_raw: ignoring short SOCK_RAW "
>  "packet.\n");
>   return NF_ACCEPT;
>   }
> -- 
> 1.5.3.5.652.gf192c
> 
> -
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to [EMAIL PROTECTED]
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 
Horms



Re: 2.6.24-rc2-mm1: kcryptd vs lockdep

2007-11-19 Thread Torsten Kaiser
On Nov 19, 2007 10:00 PM, Milan Broz <[EMAIL PROTECTED]> wrote:
> Torsten Kaiser wrote:
> > Anything I could try, apart from more boots with slub_debug=F?

One time it triggered with slub_debug=F, but no additional output.
With slub_debug=FP I have not seen it again, so I can't say if that
would yield more info.

> Please could you try which patch from the dm-crypt series cause this ?
> (agk-dm-dm-crypt* names.)
>
> I suspect agk-dm-dm-crypt-move-bio-submission-to-thread.patch because
> there is one work struct used subsequently in two threads...
> (io thread already started while crypt thread is processing lockdep_map
> after calling f(work)...)

After reverting only
agk-dm-dm-crypt-move-bio-submission-to-thread.patch I also have not
seen the 'held lock freed' message again.

If it happens again with this revert, I will post that output.

Thanks for the hint.

Torsten


Re: [PATCH 01/59] arch/ia64: Add missing "space"

2007-11-19 Thread Simon Horman
[snip]

> diff --git a/arch/ia64/kernel/kprobes.c b/arch/ia64/kernel/kprobes.c
> index 5fd65d8..90518e4 100644
> --- a/arch/ia64/kernel/kprobes.c
> +++ b/arch/ia64/kernel/kprobes.c
> @@ -182,8 +182,8 @@ static int __kprobes unsupported_inst(uint template, uint 
>  slot,
>   qp = kprobe_inst & 0x3f;
>   if (is_cmp_ctype_unc_inst(template, slot, major_opcode, kprobe_inst)) {
>   if (slot == 1 && qp)  {
> - printk(KERN_WARNING "Kprobes on cmp unc"
> - "instruction on slot 1 at <0x%lx>"
> + printk(KERN_WARNING "Kprobes on cmp unc "
> + "instruction on slot 1 at <0x%lx> "
>   "is not supported\n", addr);
>   return -EINVAL;
>  
> @@ -221,8 +221,8 @@ static int __kprobes unsupported_inst(uint template, uint 
>  slot,
>* bit 12 to be equal to 1
>*/
>   if (slot == 1 && qp) {
> - printk(KERN_WARNING "Kprobes on test bit"
> - "instruction on slot at <0x%lx>"
> + printk(KERN_WARNING "Kprobes on test bit "
> + "instruction on slot at <0x%lx> "
>   "is not supported\n", addr);
>   return -EINVAL;
>   }

Is it really necessary for this fragment to create a line that
is greater than 80 characters long? Presumably the entire reason
that the printk line was split in the first place was to avoid
a long line.
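
For context, the series exists because C joins adjacent string literals with
nothing in between, so a split message needs its separating space inside one
of the pieces. A small sketch using a shortened form of the message from the
hunk above (the array names are made up for illustration):

```c
/* Adjacent literals are concatenated as-is: the words run together. */
static const char broken_msg[] = "Kprobes on cmp unc"
				 "instruction on slot 1";

/* With the trailing space added, as the patch does. */
static const char fixed_msg[] = "Kprobes on cmp unc "
				"instruction on slot 1";
```

Here broken_msg reads "Kprobes on cmp uncinstruction on slot 1", while
fixed_msg reads "Kprobes on cmp unc instruction on slot 1".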

[snip]

-- 
Horms



[PATCHv4 4/6] Allow setting FD_CLOEXEC flag for new sockets

2007-11-19 Thread Ulrich Drepper
This is a first user of sys_indirect.  Several of the socket-related system
calls which produce a file handle now can be passed an additional parameter
to set the FD_CLOEXEC flag.
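
For comparison, without such a parameter userspace sets the flag in a
separate step after the socket has been created. A minimal sketch of that
two-step approach follows; socket_cloexec() is a hypothetical helper, not
part of this patch. Between the socket() and fcntl() calls the descriptor
exists without FD_CLOEXEC, which is presumably the window this patch is
meant to close.

```c
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a socket, then mark it close-on-exec in a second step. */
static int socket_cloexec(int domain, int type, int protocol)
{
	int fd = socket(domain, type, protocol);
	int flags;

	if (fd < 0)
		return -1;

	/* The descriptor is visible without FD_CLOEXEC until this succeeds. */
	flags = fcntl(fd, F_GETFD);
	if (flags < 0 || fcntl(fd, F_SETFD, flags | FD_CLOEXEC) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```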

 arch/x86/ia32/Makefile|1 +
 arch/x86/ia32/sys_ia32.c  |4 
 include/asm-x86/ia32_unistd.h |1 +
 include/linux/indirect.h  |   33 +
 kernel/Makefile   |2 ++
 kernel/indirect.c |4 
 net/socket.c  |   21 +
 7 files changed, 58 insertions(+), 8 deletions(-)

--- arch/x86/ia32/Makefile
+++ arch/x86/ia32/Makefile
@@ -36,6 +36,7 @@ $(obj)/vsyscall-sysenter.so.dbg 
$(obj)/vsyscall-syscall.so.dbg: \
 $(obj)/vsyscall-%.so.dbg: $(src)/vsyscall.lds $(obj)/vsyscall-%.o FORCE
$(call if_changed,syscall)
 
+CFLAGS_sys_ia32.o = -Wno-undef
 AFLAGS_vsyscall-sysenter.o = -m32 -Wa,-32
 AFLAGS_vsyscall-syscall.o = -m32 -Wa,-32
 
--- kernel/Makefile
+++ kernel/Makefile
@@ -67,6 +67,8 @@ ifneq ($(CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER),y)
 CFLAGS_sched.o := $(PROFILING) -fno-omit-frame-pointer
 endif
 
+CFLAGS_indirect.o = -Wno-undef
+
 $(obj)/configs.o: $(obj)/config_data.h
 
 # config_data.h contains the same information as ikconfig.h but gzipped.
diff -u net/socket.c net/socket.c
--- net/socket.c
+++ net/socket.c
@@ -344,11 +344,11 @@
  * but we take care of internal coherence yet.
  */
 
-static int sock_alloc_fd(struct file **filep)
+static int sock_alloc_fd(struct file **filep, int flags)
 {
int fd;
 
-   fd = get_unused_fd();
+   fd = get_unused_fd_flags(flags);
if (likely(fd >= 0)) {
struct file *file = get_empty_filp();
 
@@ -391,10 +391,10 @@
return 0;
 }
 
-int sock_map_fd(struct socket *sock)
+static int sock_map_fd_flags(struct socket *sock, int flags)
 {
struct file *newfile;
-   int fd = sock_alloc_fd(&newfile);
+   int fd = sock_alloc_fd(&newfile, flags);
 
if (likely(fd >= 0)) {
int err = sock_attach_fd(sock, newfile);
@@ -409,6 +409,11 @@
return fd;
 }
 
+int sock_map_fd(struct socket *sock)
+{
+   return sock_map_fd_flags(sock, 0);
+}
+
 static struct socket *sock_from_file(struct file *file, int *err)
 {
if (file->f_op == &socket_file_ops)
@@ -1208,7 +1213,7 @@
if (retval < 0)
goto out;
 
-   retval = sock_map_fd(sock);
+   retval = sock_map_fd_flags(sock, INDIRECT_PARAM(file_flags, flags));
if (retval < 0)
goto out_release;
 
@@ -1249,13 +1254,13 @@
if (err < 0)
goto out_release_both;
 
-   fd1 = sock_alloc_fd(&newfile1);
+   fd1 = sock_alloc_fd(&newfile1, INDIRECT_PARAM(file_flags, flags));
if (unlikely(fd1 < 0)) {
err = fd1;
goto out_release_both;
}
 
-   fd2 = sock_alloc_fd(&newfile2);
+   fd2 = sock_alloc_fd(&newfile2, INDIRECT_PARAM(file_flags, flags));
if (unlikely(fd2 < 0)) {
err = fd2;
put_filp(newfile1);
@@ -1411,7 +1416,7 @@
 */
__module_get(newsock->ops->owner);
 
-   newfd = sock_alloc_fd(&newfile);
+   newfd = sock_alloc_fd(&newfile, INDIRECT_PARAM(file_flags, flags));
if (unlikely(newfd < 0)) {
err = newfd;
sock_release(newsock);
diff -u arch/x86/ia32/sys_ia32.c arch/x86/ia32/sys_ia32.c
--- arch/x86/ia32/sys_ia32.c
+++ arch/x86/ia32/sys_ia32.c
@@ -902,6 +902,10 @@
 
switch (INDIRECT_SYSCALL32(&regs))
{
+#define INDSYSCALL(name) __NR_ia32_##name
+#include <linux/indirect.h>
+   break;
+
default:
return -EINVAL;
}
diff -u include/linux/indirect.h include/linux/indirect.h
--- include/linux/indirect.h
+++ include/linux/indirect.h
@@ -1,6 +1,39 @@
+#ifndef INDSYSCALL
 #ifndef _LINUX_INDIRECT_H
 #define _LINUX_INDIRECT_H
 
 #include 
 
+
+union indirect_params {
+  struct {
+int flags;
+  } file_flags;
+};
+
+#define INDIRECT_PARAM(set, name) current->indirect_params.set.name
+
+#endif
+#else
+
+/* Here comes the list of system calls which can be called through
+   sys_indirect.  When the list of supported system calls is needed the
+   file including this header is supposed to define a macro "INDSYSCALL"
+   which adds a prefix fitting to the use.  If the resulting macro is
+   defined we generate a line
+   case MACRO:
+   */
+#if INDSYSCALL(accept)
+  case INDSYSCALL(accept):
+#endif
+#if INDSYSCALL(socket)
+  case INDSYSCALL(socket):
+#endif
+#if INDSYSCALL(socketcall)
+  case INDSYSCALL(socketcall):
+#endif
+#if INDSYSCALL(socketpair)
+  case INDSYSCALL(socketpair):
+#endif
+
 #endif
diff -u kernel/indirect.c kernel/indirect.c
--- kernel/indirect.c
+++ kernel/indirect.c
@@ -19,6 +19,10 @@
 
switch (INDIRECT_SYSCALL (&regs))
{
+#define INDSYSCALL(name) __NR_##name
+#include <linux/indirect.h>
+   break;
+
default:
return -EINVAL;
}
--- include/asm-x86/ia32_unistd.h
+++ include/asm-x86/ia32_unistd.h
@@ -12,6 +12,7 @@
 

[PATCHv4 1/6] actual sys_indirect code

2007-11-19 Thread Ulrich Drepper
This is the actual architecture-independent part of the system call
implementation.

 include/linux/indirect.h |6 ++
 include/linux/sched.h|4 
 include/linux/syscalls.h |4 
 kernel/Makefile  |2 +-
 kernel/indirect.c|   36 
 5 files changed, 51 insertions(+), 1 deletion(-)

--- /dev/null
+++ include/linux/indirect.h
@@ -0,0 +1,6 @@
+#ifndef _LINUX_INDIRECT_H
+#define _LINUX_INDIRECT_H
+
+#include <asm/indirect.h>
+
+#endif
--- include/linux/sched.h
+++ include/linux/sched.h
@@ -80,6 +80,7 @@ struct sched_param {
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -1174,6 +1175,9 @@ struct task_struct {
int make_it_fail;
 #endif
struct prop_local_single dirties;
+
+   /* Additional system call parameters.  */
+   union indirect_params indirect_params;
 };
 
 /*
--- include/linux/syscalls.h
+++ include/linux/syscalls.h
@@ -54,6 +54,7 @@ struct compat_stat;
 struct compat_timeval;
 struct robust_list_head;
 struct getcpu_cache;
+struct indirect_registers;
 
 #include 
 #include 
@@ -611,6 +612,9 @@ asmlinkage long sys_timerfd(int ufd, int clockid, int flags,
const struct itimerspec __user *utmr);
 asmlinkage long sys_eventfd(unsigned int count);
 asmlinkage long sys_fallocate(int fd, int mode, loff_t offset, loff_t len);
+asmlinkage long sys_indirect(struct indirect_registers __user *userregs,
+void __user *userparams, size_t paramslen,
+int flags);
 
 int kernel_execve(const char *filename, char *const argv[], char *const 
envp[]);
 
--- /dev/null
+++ kernel/indirect.c
@@ -0,0 +1,36 @@
+#include 
+#include 
+#include 
+#include 
+
+
+asmlinkage long sys_indirect(struct indirect_registers __user *userregs,
+void __user *userparams, size_t paramslen,
+int flags)
+{
+   struct indirect_registers regs;
+   long result;
+
+   if (unlikely(flags != 0))
+   return -EINVAL;
+
+   if (copy_from_user(&regs, userregs, sizeof(regs)))
+   return -EFAULT;
+
+   switch (INDIRECT_SYSCALL (&regs))
+   {
+   default:
+   return -EINVAL;
+   }
+
+   if (paramslen > sizeof(union indirect_params))
+   return -EINVAL;
+
+   result = -EFAULT;
+   if (!copy_from_user(&current->indirect_params, userparams, paramslen))
+   result = CALL_INDIRECT(&regs);
+
+   memset(&current->indirect_params, '\0', paramslen);
+
+   return result;
+}
--- kernel/Makefile
+++ kernel/Makefile
@@ -9,7 +9,7 @@ obj-y = sched.o fork.o exec_domain.o panic.o printk.o 
profile.o \
rcupdate.o extable.o params.o posix-timers.o \
kthread.o wait.o kfifo.o sys_ni.o posix-cpu-timers.o mutex.o \
hrtimer.o rwsem.o latency.o nsproxy.o srcu.o \
-   utsname.o notifier.o
+   utsname.o notifier.o indirect.o
 
 obj-$(CONFIG_SYSCTL) += sysctl_check.o
 obj-$(CONFIG_STACKTRACE) += stacktrace.o


[PATCHv4 5/6] Allow setting O_NONBLOCK flag for new sockets

2007-11-19 Thread Ulrich Drepper
This patch adds support for setting the O_NONBLOCK flag of the file
descriptors returned by socket, socketpair, and accept.

 socket.c |   15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

--- net/socket.c
+++ net/socket.c
@@ -362,7 +362,7 @@ static int sock_alloc_fd(struct file **filep, int flags)
return fd;
 }
 
-static int sock_attach_fd(struct socket *sock, struct file *file)
+static int sock_attach_fd(struct socket *sock, struct file *file, int flags)
 {
struct dentry *dentry;
struct qstr name = { .name = "" };
@@ -384,7 +384,7 @@ static int sock_attach_fd(struct socket *sock, struct file 
*file)
init_file(file, sock_mnt, dentry, FMODE_READ | FMODE_WRITE,
  &socket_file_ops);
SOCK_INODE(sock)->i_fop = &socket_file_ops;
-   file->f_flags = O_RDWR;
+   file->f_flags = O_RDWR | (flags & O_NONBLOCK);
file->f_pos = 0;
file->private_data = sock;
 
@@ -397,7 +397,7 @@ static int sock_map_fd_flags(struct socket *sock, int flags)
int fd = sock_alloc_fd(&newfile, flags);
 
if (likely(fd >= 0)) {
-   int err = sock_attach_fd(sock, newfile);
+   int err = sock_attach_fd(sock, newfile, flags);
 
if (unlikely(err < 0)) {
put_filp(newfile);
@@ -1268,12 +1268,14 @@ asmlinkage long sys_socketpair(int family, int type, 
int protocol,
goto out_release_both;
}
 
-   err = sock_attach_fd(sock1, newfile1);
+   err = sock_attach_fd(sock1, newfile1,
+INDIRECT_PARAM(file_flags, flags));
if (unlikely(err < 0)) {
goto out_fd2;
}
 
-   err = sock_attach_fd(sock2, newfile2);
+   err = sock_attach_fd(sock2, newfile2,
+INDIRECT_PARAM(file_flags, flags));
if (unlikely(err < 0)) {
fput(newfile1);
goto out_fd1;
@@ -1423,7 +1425,8 @@ asmlinkage long sys_accept(int fd, struct sockaddr __user 
*upeer_sockaddr,
goto out_put;
}
 
-   err = sock_attach_fd(newsock, newfile);
+   err = sock_attach_fd(newsock, newfile,
+INDIRECT_PARAM(file_flags, flags));
if (err < 0)
goto out_fd_simple;
 


[PATCHv4 2/6] x86 support for sys_indirect

2007-11-19 Thread Ulrich Drepper
This part adds support for sys_indirect on x86 and x86-64.

 arch/x86/ia32/ia32entry.S  |2 ++
 arch/x86/ia32/sys_ia32.c   |   31 +++
 arch/x86/kernel/syscall_table_32.S |1 +
 include/asm-x86/indirect.h |5 +
 include/asm-x86/indirect_32.h  |   23 +++
 include/asm-x86/indirect_64.h  |   34 ++
 include/asm-x86/unistd_32.h|3 ++-
 include/asm-x86/unistd_64.h|2 ++
 8 files changed, 100 insertions(+), 1 deletion(-)

--- arch/x86/ia32/ia32entry.S
+++ arch/x86/ia32/ia32entry.S
@@ -400,6 +400,7 @@ END(ia32_ptregs_common)
 
.section .rodata,"a"
.align 8
+   .globl ia32_sys_call_table
 ia32_sys_call_table:
.quad sys_restart_syscall
.quad sys_exit
@@ -726,4 +727,5 @@ ia32_sys_call_table:
.quad compat_sys_timerfd
.quad sys_eventfd
.quad sys32_fallocate
+   .quad sys32_indirect/* 325  */
 ia32_syscall_end:
--- arch/x86/ia32/sys_ia32.c
+++ arch/x86/ia32/sys_ia32.c
@@ -887,3 +887,37 @@ asmlinkage long sys32_fallocate(int fd, int mode, unsigned 
offset_lo,
return sys_fallocate(fd, mode, ((u64)offset_hi << 32) | offset_lo,
 ((u64)len_hi << 32) | len_lo);
 }
+
+asmlinkage long sys32_indirect(struct indirect_registers32 __user *userregs,
+  void __user *userparams, size_t paramslen,
+  int flags)
+{
+   extern long (*ia32_sys_call_table[])(u32, u32, u32, u32, u32, u32);
+
+   struct indirect_registers32 regs;
+   long result;
+
+   if (flags != 0)
+   return -EINVAL;
+
+   if (copy_from_user(&regs, userregs, sizeof(regs)))
+   return -EFAULT;
+
+   switch (INDIRECT_SYSCALL32(&regs))
+   {
+   default:
+   return -EINVAL;
+   }
+
+   if (paramslen > sizeof(union indirect_params))
+   return -EINVAL;
+   result = -EFAULT;
+   if (!copy_from_user(&current->indirect_params, userparams, paramslen))
+   result = ia32_sys_call_table[regs.eax](regs.ebx, regs.ecx,
+  regs.edx, regs.esi,
+  regs.edi, regs.ebp);
+
+   memset(&current->indirect_params, '\0', paramslen);
+
+   return result;
+}
--- arch/x86/kernel/syscall_table_32.S
+++ arch/x86/kernel/syscall_table_32.S
@@ -324,3 +324,4 @@ ENTRY(sys_call_table)
.long sys_timerfd
.long sys_eventfd
.long sys_fallocate
+   .long sys_indirect  /* 325 */
--- /dev/null
+++ include/asm-x86/indirect_32.h
@@ -0,0 +1,23 @@
+#ifndef _ASM_X86_INDIRECT_32_H
+#define _ASM_X86_INDIRECT_32_H
+
+struct indirect_registers {
+   __u32 eax;
+   __u32 ebx;
+   __u32 ecx;
+   __u32 edx;
+   __u32 esi;
+   __u32 edi;
+   __u32 ebp;
+};
+
+#define INDIRECT_SYSCALL(regs) (regs)->eax
+
+#define CALL_INDIRECT(regs) \
+  ({ extern long (*sys_call_table[]) (__u32, __u32, __u32, __u32, __u32, __u32); \
+ sys_call_table[INDIRECT_SYSCALL(regs)] ((regs)->ebx, (regs)->ecx, \
+(regs)->edx, (regs)->esi, \
+(regs)->edi, (regs)->ebp); \
+ })
+
+#endif
--- /dev/null
+++ include/asm-x86/indirect_64.h
@@ -0,0 +1,34 @@
+#ifndef _ASM_X86_INDIRECT_64_H
+#define _ASM_X86_INDIRECT_64_H
+
+struct indirect_registers {
+   __u64 rax;
+   __u64 rdi;
+   __u64 rsi;
+   __u64 rdx;
+   __u64 r10;
+   __u64 r8;
+   __u64 r9;
+};
+
+struct indirect_registers32 {
+   __u32 eax;
+   __u32 ebx;
+   __u32 ecx;
+   __u32 edx;
+   __u32 esi;
+   __u32 edi;
+   __u32 ebp;
+};
+
+#define INDIRECT_SYSCALL(regs) (regs)->rax
+#define INDIRECT_SYSCALL32(regs) (regs)->eax
+
+#define CALL_INDIRECT(regs) \
+  ({ extern long (*sys_call_table[]) (__u64, __u64, __u64, __u64, __u64, __u64); \
+ sys_call_table[INDIRECT_SYSCALL(regs)] ((regs)->rdi, (regs)->rsi, \
+(regs)->rdx, (regs)->r10, \
+(regs)->r8, (regs)->r9); \
+ })
+
+#endif
--- /dev/null
+++ include/asm-x86/indirect.h
@@ -0,0 +1,5 @@
+#ifdef CONFIG_X86_32
+# include "indirect_32.h"
+#else
+# include "indirect_64.h"
+#endif
--- include/asm-x86/unistd_32.h
+++ include/asm-x86/unistd_32.h
@@ -330,10 +330,11 @@
 #define __NR_timerfd   322
 #define __NR_eventfd   323
 #define __NR_fallocate 324
+#define __NR_indirect  325
 
 #ifdef __KERNEL__
 
-#define NR_syscalls 325
+#define NR_syscalls 326
 
 #define __ARCH_WANT_IPC_PARSE_VERSION
 #define __ARCH_WANT_OLD_READDIR
--- include/asm-x86/unistd_64.h
+++ include/asm-x86/unistd_64.h
@@ -635,6 +635,8 @@ __SYSCALL(__NR_timerfd, sys_timerfd)
 __SYSCALL(__NR_eventfd, sys_eventfd)
 #define __NR_fallocate  

[PATCHv4 3/6] UML support for sys_indirect

2007-11-19 Thread Ulrich Drepper
This part adds support for sys_indirect for UML.

 indirect.h |6 ++
 1 file changed, 6 insertions(+)

--- /dev/null
+++ include/asm-um/indirect.h
@@ -0,0 +1,6 @@
+#ifndef __UM_INDIRECT_H
+#define __UM_INDIRECT_H
+
+#include "asm/arch/indirect.h"
+
+#endif


[PATCHv4 0/6] sys_indirect system call

2007-11-19 Thread Ulrich Drepper

The following patches provide an alternative implementation of the
sys_indirect system call, which has been discussed a few times.
This new system call allows us to extend existing system call
interfaces without adding more system calls.

Davide's previous implementation is IMO far more complex than
warranted.  This code here is trivial, as you can see.  I've
discussed this approach with Linus last week and for a brief moment
we actually agreed on something.

We pass an additional block of data to the kernel, it is copied into
the task_struct, and then it is up to the function implementing the system
call to interpret the data.  Each system call, which is meant to be
extended this way, has to be white-listed in sys_indirect.  The
alternative is to filter out those system calls which absolutely cannot
be handled using sys_indirect (like clone, execve) since they require
the stack layout of an ordinary system call.  This is more dangerous
since it is too easy to miss a call.
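
The flow described above can be sketched in plain userspace C. This is only a simulation under stated assumptions: demo_current stands in for the kernel's current task, memcpy() stands in for copy_from_user(), and all demo_* and DEMO_* names are invented for the example.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for union indirect_params. */
union demo_params {
	struct {
		int flags;
	} file_flags;
};

/* Hypothetical stand-in for the relevant part of task_struct. */
struct demo_task {
	union demo_params indirect_params;
};

static struct demo_task demo_current;

enum { DEMO_NR_SOCKET = 1, DEMO_NR_CLONE = 2 };

/* A wrapped "system call" that interprets the extra data itself. */
static long demo_socket(void)
{
	return demo_current.indirect_params.file_flags.flags;
}

static long demo_indirect(int nr, const void *params, size_t len)
{
	long result;

	/* White-list: anything not named here is rejected. */
	switch (nr) {
	case DEMO_NR_SOCKET:
		break;
	default:
		return -EINVAL;
	}

	if (len > sizeof(union demo_params))
		return -EINVAL;

	/* Copy the extra block to where the callee can find it. */
	memcpy(&demo_current.indirect_params, params, len);
	result = demo_socket();
	/* Clear it again so a later ordinary call sees zeros. */
	memset(&demo_current.indirect_params, 0, len);

	return result;
}
```

With a white-list, a call that was never reviewed (DEMO_NR_CLONE here) fails with -EINVAL by default, which is the safer failure mode the paragraph argues for.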

The code for x86 and x86-64 gets by without a single line of assembly
code.  This is likely to be true for most/all the other archs as well.
There is architecture-dependent code, though.  For x86 and x86-64 I've
also fixed up UML (although only x86-64 is tested, that's my setup).
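
To show why no assembly is needed, here is a small userspace sketch of the dispatch idea: the registers arrive as a plain struct, and the call goes through an ordinary table of function pointers. The demo_* names and the two toy "system calls" are assumptions for the example. Unlike the patch, which relies on the white-list to keep the index in range, the sketch adds an explicit bounds check.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register block: call number plus six arguments. */
struct demo_registers {
	uint64_t nr;      /* plays the role of rax/eax */
	uint64_t arg[6];  /* the six argument registers */
};

typedef long (*demo_call_fn)(uint64_t, uint64_t, uint64_t,
			     uint64_t, uint64_t, uint64_t);

static long demo_add(uint64_t a, uint64_t b, uint64_t c,
		     uint64_t d, uint64_t e, uint64_t f)
{
	(void)c; (void)d; (void)e; (void)f;
	return (long)(a + b);
}

static long demo_negate(uint64_t a, uint64_t b, uint64_t c,
			uint64_t d, uint64_t e, uint64_t f)
{
	(void)b; (void)c; (void)d; (void)e; (void)f;
	return -(long)a;
}

static const demo_call_fn demo_table[] = { demo_add, demo_negate };

/* Look up the handler by number and pass the six arguments along. */
static long demo_call_indirect(const struct demo_registers *regs)
{
	size_t n = sizeof(demo_table) / sizeof(demo_table[0]);

	if (regs->nr >= n)
		return -1;
	return demo_table[regs->nr](regs->arg[0], regs->arg[1],
				    regs->arg[2], regs->arg[3],
				    regs->arg[4], regs->arg[5]);
}
```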

The last three patches show the first application of the functionality.
They also show a complication: we need the test for valid sub-syscalls in the
main implementation and in the compatibility code.  And more: the actual
sources and generated binary for the test are very different (the numbers
differ).  Duplicating the information is a big problem, though.  I've used
some macro tricks to avoid this.  All the information about the flags and
the system calls using them is concentrated in one header.  This should
make maintenance bearable.
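
The macro trick can be tried out in isolation. In the sketch below the NR_demo_* numbers are invented, and NR_demo_socketpair is deliberately left undefined. Inside #if, an identifier that is not a macro evaluates to 0, so the case label for a call that does not exist on this "architecture" simply disappears. This is presumably why the Makefiles add -Wno-undef. One caveat of this scheme is that a call numbered 0 would also be treated as absent.

```c
/* Hypothetical call numbers; socketpair is intentionally missing. */
#define NR_demo_socket 41
#define NR_demo_accept 43

/* The includer chooses the prefix, as in the patch. */
#define INDSYSCALL(name) NR_demo_##name

/* Returns 1 if nr is on the white-list built from the #if blocks. */
static int demo_is_whitelisted(long nr)
{
	switch (nr) {
#if INDSYSCALL(accept)
	case INDSYSCALL(accept):
#endif
#if INDSYSCALL(socket)
	case INDSYSCALL(socket):
#endif
#if INDSYSCALL(socketpair)
	case INDSYSCALL(socketpair):
#endif
		return 1;
	default:
		return 0;
	}
}
```

In the patch the list of #if blocks lives in one header that is included twice with different INDSYSCALL definitions, once for the native and once for the compat numbers.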

This patch to use sys_indirect is just the beginning.  More will follow,
but I want to see how these patches are received before I spend more time
on it.  This code is enough to test the implementation with the following
test program.  Adjust it for architectures other than x86 and x86-64.


#include <fcntl.h>
#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/syscall.h>

typedef uint32_t __u32;
typedef uint64_t __u64;

union indirect_params {
  struct {
int flags;
  } file_flags;
};

#ifdef __x86_64__
# define __NR_indirect 286
struct indirect_registers {
  __u64 rax;
  __u64 rdi;
  __u64 rsi;
  __u64 rdx;
  __u64 r10;
  __u64 r8;
  __u64 r9;
};
#elif defined __i386__
# define __NR_indirect 325
struct indirect_registers {
  __u32 eax;
  __u32 ebx;
  __u32 ecx;
  __u32 edx;
  __u32 esi;
  __u32 edi;
  __u32 ebp;
};
#else
# error "need to define __NR_indirect and struct indirect_params"
#endif

#define FILL_IN(var, values...) \
  var = (struct indirect_registers) { values }

int
main (void)
{
  int fd = socket (AF_INET, SOCK_DGRAM, IPPROTO_IP);
  int s1 = fcntl (fd, F_GETFD);
  int t1 = fcntl (fd, F_GETFL);
  printf ("old: FD_CLOEXEC %s set, NONBLOCK %s set\n",
  s1 == 0 ? "not" : "is", (t1 & O_NONBLOCK) ? "is" : "not");
  close (fd);

  union indirect_params i;
  i.file_flags.flags = O_CLOEXEC|O_NONBLOCK;

  struct indirect_registers r;
#ifdef __NR_socketcall
# define SOCKOP_socket   1
  long args[3] = { AF_INET, SOCK_DGRAM, IPPROTO_IP };
  FILL_IN (r, __NR_socketcall, SOCKOP_socket, (long) args);
#else
  FILL_IN (r, __NR_socket, AF_INET, SOCK_DGRAM, IPPROTO_IP);
#endif

  fd = syscall (__NR_indirect, &r, &i, sizeof (i));
  int s2 = fcntl (fd, F_GETFD);
  int t2 = fcntl (fd, F_GETFL);
  printf ("new: FD_CLOEXEC %s set, NONBLOCK %s set\n",
  s2 == 0 ? "not" : "is", (t2 & O_NONBLOCK) ? "is" : "not");
  close (fd);

  i.file_flags.flags = O_CLOEXEC;
  sigset_t ss;
  sigemptyset(&ss);
  FILL_IN(r, __NR_signalfd, -1, (long) &ss, 8);
  fd = syscall (__NR_indirect, &r, &i, sizeof (i));
  int s3 = fcntl (fd, F_GETFD);
  printf ("signalfd: FD_CLOEXEC %s set\n", s3 == 0 ? "not" : "is");
  close (fd);

  FILL_IN(r, __NR_eventfd, 8);
  fd = syscall (__NR_indirect, &r, &i, sizeof (i));
  int s4 = fcntl (fd, F_GETFD);
  printf ("eventfd: FD_CLOEXEC %s set\n", s4 == 0 ? "not" : "is");
  close (fd);

  return s1 != 0 || s2 == 0 || t1 != 0 || t2 == 0 || s3 == 0 || s4 == 0;
}


Signed-off-by: Ulrich Drepper <[EMAIL PROTECTED]>


 arch/x86/ia32/Makefile |1 
 arch/x86/ia32/ia32entry.S  |2 +
 arch/x86/ia32/sys_ia32.c   |   37 +-
 arch/x86/kernel/syscall_table_32.S |1 
 include/asm-um/indirect.h  |6 +
 include/asm-x86/ia32_unistd.h  |1 
 include/asm-x86/indirect.h |5 
 include/asm-x86/indirect_32.h  |   23 +
 include/asm-x86/indirect_64.h  |   34 +++
 include/asm-x86/unistd_32.h|3 +-
 include/asm-x86/unistd_64.h|2 +
 

[PATCHv4 6/6] FD_CLOEXEC support for eventfd, signalfd, timerfd

2007-11-19 Thread Ulrich Drepper
This patch adds support to set the FD_CLOEXEC flag for the file descriptors
returned by eventfd, signalfd, timerfd.

 fs/anon_inodes.c  |   15 +++
 fs/eventfd.c  |5 +++--
 fs/signalfd.c |6 --
 fs/timerfd.c  |6 --
 include/asm-x86/ia32_unistd.h |3 +++
 include/linux/anon_inodes.h   |3 +++
 include/linux/indirect.h  |3 +++
 7 files changed, 31 insertions(+), 10 deletions(-)

--- fs/anon_inodes.c
+++ fs/anon_inodes.c
@@ -70,9 +70,9 @@ static struct dentry_operations 
anon_inodefs_dentry_operations = {
  * hence saving memory and avoiding code duplication for the file/inode/dentry
  * setup.
  */
-int anon_inode_getfd(int *pfd, struct inode **pinode, struct file **pfile,
-const char *name, const struct file_operations *fops,
-void *priv)
+int anon_inode_getfd_flags(int *pfd, struct inode **pinode, struct file 
**pfile,
+  const char *name, const struct file_operations *fops,
+  void *priv, int flags)
 {
struct qstr this;
struct dentry *dentry;
@@ -85,7 +85,7 @@ int anon_inode_getfd(int *pfd, struct inode **pinode, struct 
file **pfile,
if (!file)
return -ENFILE;
 
-   error = get_unused_fd();
+   error = get_unused_fd_flags(flags);
if (error < 0)
goto err_put_filp;
fd = error;
@@ -138,6 +138,13 @@ err_put_filp:
put_filp(file);
return error;
 }
+
+int anon_inode_getfd(int *pfd, struct inode **pinode, struct file **pfile,
+const char *name, const struct file_operations *fops,
+void *priv)
+{
+   return anon_inode_getfd_flags(pfd, pinode, pfile, name, fops, priv, 0);
+}
 EXPORT_SYMBOL_GPL(anon_inode_getfd);
 
 /*
--- fs/eventfd.c
+++ fs/eventfd.c
@@ -215,8 +215,9 @@ asmlinkage long sys_eventfd(unsigned int count)
 * When we call this, the initialization must be complete, since
 * anon_inode_getfd() will install the fd.
 */
-   error = anon_inode_getfd(&fd, &inode, &file, "[eventfd]",
-&eventfd_fops, ctx);
+   error = anon_inode_getfd_flags(&fd, &inode, &file, "[eventfd]",
+  &eventfd_fops, ctx,
+  INDIRECT_PARAM(file_flags, flags));
if (!error)
return fd;
 
--- fs/signalfd.c
+++ fs/signalfd.c
@@ -224,8 +224,10 @@ asmlinkage long sys_signalfd(int ufd, sigset_t __user 
*user_mask, size_t sizemas
 * When we call this, the initialization must be complete, since
 * anon_inode_getfd() will install the fd.
 */
-   error = anon_inode_getfd(&ufd, &inode, &file, "[signalfd]",
-&signalfd_fops, ctx);
+   error = anon_inode_getfd_flags(&ufd, &inode, &file,
+  "[signalfd]", &signalfd_fops,
+  ctx, INDIRECT_PARAM(file_flags,
+  flags));
if (error)
goto err_fdalloc;
} else {
--- fs/timerfd.c
+++ fs/timerfd.c
@@ -182,8 +182,10 @@ asmlinkage long sys_timerfd(int ufd, int clockid, int 
flags,
 * When we call this, the initialization must be complete, since
 * anon_inode_getfd() will install the fd.
 */
-   error = anon_inode_getfd(&ufd, &inode, &file, "[timerfd]",
-&timerfd_fops, ctx);
+   error = anon_inode_getfd_flags(&ufd, &inode, &file, "[timerfd]",
+  &timerfd_fops, ctx,
+  INDIRECT_PARAM(file_flags,
+ flags));
if (error)
goto err_tmrcancel;
} else {
--- include/asm-x86/ia32_unistd.h
+++ include/asm-x86/ia32_unistd.h
@@ -15,5 +15,8 @@
 #define __NR_ia32_socketcall   102
 #define __NR_ia32_sigreturn119
 #define __NR_ia32_rt_sigreturn 173
+#define __NR_ia32_signalfd 321
+#define __NR_ia32_timerfd  322
+#define __NR_ia32_eventfd  323
 
 #endif /* _ASM_X86_64_IA32_UNISTD_H_ */
--- include/linux/anon_inodes.h
+++ include/linux/anon_inodes.h
@@ -8,6 +8,9 @@
 #ifndef _LINUX_ANON_INODES_H
 #define _LINUX_ANON_INODES_H
 
+int anon_inode_getfd_flags(int *pfd, struct inode **pinode, struct file 
**pfile,
+  const char *name, const struct file_operations *fops,
+  void *priv, int flags);
 int anon_inode_getfd(int *pfd, struct inode **pinode, struct file **pfile,
 const char *name, const struct file_operations *fops,
 void *priv);
--- include/linux/indirect.h
+++ include/linux/indirect.h
@@ -35,5 +35,8 @@ union indirect_params {
 #if INDSYSCALL(socketpair)
   case 

Re: Is there any word about this bug in gcc ?

2007-11-19 Thread Herbert Xu
On Mon, Nov 19, 2007 at 10:47:59PM -0800, H. Peter Anvin wrote:
> 
> This one is definitely messy.  There is absolutely no way to know what 
> gcc has miscompiled.  It looks to me that both gcc 4.2 and 4.3 are 
> affected, any others?

I just tested it here and gcc 3.3 is also affected so presumably
everything in between is too.  Gcc 2.95 is not affected.  I don't
have the intervening versions to test.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH] x86/paravirt: revert exports to restore old behaviour

2007-11-19 Thread Takashi Iwai
At Mon, 19 Nov 2007 17:14:15 -0800,
Jeremy Fitzhardinge wrote:
> 
> Takashi Iwai wrote:
> > I took a look at this problem (as I have an nvidia card on one of my
> > workstations), and found out that the following suffer from
> > EXPORT_SYMBOL_GPL changes:
> >   
> 
> Which kernel version are you using?  This is different in .24-rc
> compared to .23.

24-rc2.  23 has no problem, as you know :)

> > * local_disable_irq(), local_irq_save*(), etc.
> >   
> 
> These should be OK either way.  pv_irq_ops is not _GPL.

Right.  I thought it somehow involved with other pv ops indirectly,
but it seems not.

> > * MSR-related macros like rdmsr(), wrmsr(), read_cr0(), etc.
> >   wbinvd(), too.
> >   
> 
> These could reasonably use the native_* versions anyway, since the
> driver won't be being used in an environment where these won't work. 
> Perhaps they should be split out separate from the gdt/ldt operations,
> which they should have no business touching.

Yes, that's possible.

> > * pmd_val(), pgd_val(), etc are all involved with pv_mm_ops.
> >   pmd_large() and pmd_bad() is also indirectly involved.
> >   __flush_tlb() and friends suffer, too.
> >   
> 
> Yeah, I guess they can be expected to play with pagetables.
> 
> > The easiest workaround I found was to undefine CONFIG_PARAVIRT before
> > inclusion of linux kernel headers, but it is really ugly and hacky.
> >   
> 
> Yeah.  It will explode if you are running in a virtual environment which
> still gives the virtual machine graphics hardware access.

Yes.  Moreover, there is no guarantee that this will be built
properly in the future.  It's a kind of coincidence that the driver is
built.  If any non-paravirt implementation accesses an exported symbol
instead of inlining, then this won't work either.

> > Redefining with raw_*() and native_*() is another way, but it takes
> > much more work than defining these primitive functions in assembly.
> >
> > So, in short, with EXPORT_SYMBOL_GPL change, it's pretty hard to write
> > a non-GPL driver in the same manner...
> >   
> 
> Yeah.  I think removing the difference between PARAVIRT and non-PARAVIRT
> is enough to justify the exports.  If we want to make the policy
> decision that modules can't use pagetable or msr operations at all, then
> that's a separate decision which can be applied uniformly to PARAVIRT
> and non-PARAVIRT.

Agreed.


thanks,

Takashi


Re: [PATCH 1/5] arch/sparc64: Add missing pci_dev_put

2007-11-19 Thread David Miller
From: Julia Lawall <[EMAIL PROTECTED]>
Date: Mon, 19 Nov 2007 09:02:22 +0100 (CET)

> From: Julia Lawall <[EMAIL PROTECTED]>
> 
> There should be a pci_dev_put when breaking out of a loop that iterates
> over calls to pci_get_device and similar functions.
 ..
> Signed-off-by: Julia Lawall <[EMAIL PROTECTED]>

Patch applied, but something in your email client adds
extra spaces to the second column of several lines in
your patch.

Please correct this before making future patch submissions.

Thank you.


Re: Is there any word about this bug in gcc ?

2007-11-19 Thread H. Peter Anvin

Herbert Xu wrote:
> David Miller <[EMAIL PROTECTED]> wrote:
> >
> > Because the compiler knows things about the inputs and can
> > thus apply optimizations that a static implementation in glibc
> > that has to handle all forms of inputs cannot.
>
> On an unrelated note, I wonder if distros will be treating this
> with the same level of urgency as security vulnerabilities,
> especially in light of Shamir's recent note on maths errors.
>

This one is definitely messy.  There is absolutely no way to know what 
gcc has miscompiled.  It looks to me that both gcc 4.2 and 4.3 are 
affected, any others?


-hpa


Re: [PATCH] Time-based RFC 4122 UUID generator

2007-11-19 Thread H. Peter Anvin

David Schwartz wrote:
>
> Any UUID generator that can produce duplicate UUIDs with probability
> significantly less than purely random UUIDs is so badly broken that it
> should not ever be used. Anyone who finds such a UUID generator should
> immediately either fix it or throw it on the junk heap. Anyone who knowingly
> uses such a UUID generator should be publicly shamed.
>
> Rather than (or at the very least, in addition to) adding a new UUID
> generator, let's fix the one(s) we have.



I presume you mean "significantly higher."

Realistically speaking, a random UUID is probably the best you're going 
to ever get.  I highly suspect that any time- and MAC-address-based 
solution is going to suffer from mis-set clocks and misprogrammed MAC 
addresses more often than you will have collisions in a 122-bit random 
number.
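
To put a rough number on the collision risk for 122 random bits, the usual birthday approximation p ~ n(n-1)/2 / 2^bits can be evaluated directly. This is only a back-of-the-envelope sketch; the function name is made up, and the approximation is meaningful only while the result stays small.

```c
/* Approximate probability of at least one collision among n values
 * drawn uniformly at random from 2^bits possibilities (birthday
 * bound; valid while the result is much smaller than 1). */
static double collision_probability(double n, int bits)
{
	double space = 1.0;
	int i;

	/* Build 2^bits by repeated doubling, which is exact in a double. */
	for (i = 0; i < bits; i++)
		space *= 2.0;

	return n * (n - 1.0) / 2.0 / space;
}
```

For a billion random UUIDs, collision_probability(1e9, 122) comes out on the order of 1e-19.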


-hpa


Re: [PATCH] wait_task_stopped: pass correct exit_code to wait_noreap_copyout

2007-11-19 Thread Andrew Morton
On Sun, 18 Nov 2007 09:13:24 + Scott James Remnant <[EMAIL PROTECTED]> 
wrote:

> In wait_task_stopped() exit_code already contains the right value for
> the si_status member of siginfo, and this is simply set in the non
> WNOWAIT case.
> 
> Pass it unchanged to wait_noreap_copyout();  we would only need to
> shift it and add 0x7f if we were returning it in the user status field
> and that isn't used for any function that permits WNOWAIT.
> 
> Signed-off-by: Scott James Remnant <[EMAIL PROTECTED]>
> Signed-off-by: Oleg Nesterov <[EMAIL PROTECTED]>
> Signed-off-by: Roland McGrath <[EMAIL PROTECTED]>
> 
> --- a/kernel/exit.c
> +++ b/kernel/exit.c
> @@ -1389,7 +1389,7 @@ static int wait_task_stopped(struct task_struct
> *p, int delayed_group_leader,
>   if (unlikely(!exit_code) || unlikely(p->exit_state))
>   goto bail_ref;
>   return wait_noreap_copyout(p, pid, uid,
> -why, (exit_code << 8) | 0x7f,
> +why, exit_code,
>  infop, ru);
>   }

Is this bug visible to userspace?  If so, I'm surprised that none of the
various testsuites (which like to exercise this sort of interface) has
detected it.
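
For reference, the difference between the two encodings can be seen from userspace. The classic wait()-style status word for a stopped child is the signal number shifted left by 8 with 0x7f in the low byte, which is what WIFSTOPPED() and WSTOPSIG() decode on Linux. The si_status field filled in by waitid() carries the bare signal number instead. The helper below is a hypothetical name and only rebuilds the status word.

```c
#include <signal.h>
#include <sys/wait.h>

/* Hypothetical helper: the wait()-style status word for a child
 * stopped by signal sig, as computed by the code the patch removes. */
static int demo_stopped_status(int sig)
{
	return (sig << 8) | 0x7f;
}
```

A waitid() caller that compares si_status with SIGSTOP would see a mismatch if it were handed the shifted value.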



Re: Is there any word about this bug in gcc ?

2007-11-19 Thread Herbert Xu
David Miller <[EMAIL PROTECTED]> wrote:
>
> Because the compiler knows things about the inputs and can
> thus apply optimizations that a static implementation in glibc
> that has to handle all forms of inputs cannot.

On an unrelated note, I wonder if distros will be treating this
with the same level of urgency as security vulnerabilities,
especially in light of Shamir's recent note on maths errors.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH] - TPM device driver layer (tpm.c|h) - repost

2007-11-19 Thread Andrew Morton
On Tue, 25 Sep 2007 15:14:50 +0200 Richard MUSIL <[EMAIL PROTECTED]> wrote:

> Hello all,
> 
> sometime ago I submitted patch to TPM layer, originally I thought this
> patch could be accepted into kernel (see below). However,
> since this did not happen, I wonder, if there are some problems with the
> patch or whether I am expected to do/provide something else, in order to
> have it accepted.
> 
> The patch follows even more below.
> 

Thanks.  We prefer that contributors sign off their work as per
Documentation/SubmittingPatches.  Please review that and if agreeable,
send a Signed-off-by: for this work.

>  /*
> + * Once all references to platform device are down to 0,
> + * release all allocated structures.
> + * In case vendor provided release function,
> + * call it too.
> + */
> +static void tpm_dev_release(struct device *dev)
> +{
> + struct tpm_chip *chip = dev_get_drvdata(dev);
> + /* call vendor release, if defined */

That's not the most useful of comments ;)

> + if (chip->vendor.release)
> + chip->vendor.release(dev);
> +
> + /* it *should* be: chip->release != NULL */

And that one's actually wrong in the context of kernel coding practices. 
But whatever.

> + if (likely(chip->release))
> + chip->release(dev);

From my reading, neither of these fields can ever be NULL, so the tests
simply aren't needed?

> + clear_bit(chip->dev_num, dev_mask);
> + kfree(chip->vendor.miscdev.name);
> + kfree(chip);
> +}
> +
> +/*
>   * Called from tpm_<specific>.c probe function only for devices 
>   * the driver has determined it should claim.  Prior to calling
>   * this function the specific probe function has called pci_enable_device
> @@ -1136,23 +1153,21 @@ struct tpm_chip *tpm_register_hardware(struct device *dev, const struct tpm_vend
>  
>   chip->vendor.miscdev.parent = dev;
>   chip->dev = get_device(dev);
> + chip->release = dev->release;
> + dev->release = tpm_dev_release;
> + dev_set_drvdata(dev, chip);
>  
>   if (misc_register(&chip->vendor.miscdev)) {
>   dev_err(chip->dev,
>   "unable to misc_register %s, minor %d\n",
>   chip->vendor.miscdev.name,
>   chip->vendor.miscdev.minor);
> - put_device(dev);
> - clear_bit(chip->dev_num, dev_mask);
> - kfree(chip);
> - kfree(devname);
> + put_device(chip->dev);
>   return NULL;
>   }
>  
>   spin_lock(&driver_lock);
>  
> - dev_set_drvdata(dev, chip);
> -
>   list_add(&chip->list, &tpm_chip_list);
>  
>   spin_unlock(&driver_lock);
> @@ -1160,10 +1175,7 @@ struct tpm_chip *tpm_register_hardware(struct device *dev, const struct tpm_vend
>   if (sysfs_create_group(&dev->kobj, chip->vendor.attr_group)) {
>   list_del(&chip->list);
>   misc_deregister(&chip->vendor.miscdev);
> - put_device(dev);
> - clear_bit(chip->dev_num, dev_mask);
> - kfree(chip);
> - kfree(devname);
> + put_device(chip->dev);
>   return NULL;
>   }
>  
> diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
> index b2e2b00..f1c265e 100644
> --- a/drivers/char/tpm/tpm.h
> +++ b/drivers/char/tpm/tpm.h
> @@ -74,6 +74,7 @@ struct tpm_vendor_specific {
>   int (*send) (struct tpm_chip *, u8 *, size_t);
>   void (*cancel) (struct tpm_chip *);
>   u8 (*status) (struct tpm_chip *);
> + void (*release) (struct device *);
>   struct miscdevice miscdev;
>   struct attribute_group *attr_group;
>   struct list_head list;
> @@ -106,6 +107,7 @@ struct tpm_chip {
>   struct dentry **bios_dir;
>  
>   struct list_head list;
> + void (*release) (struct device *);
>  };
>  
>  #define to_tpm_chip(n) container_of(n, struct tpm_chip, vendor)
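
The release-hook chaining in the patch above (save the device's original release callback, install a wrapper, call the saved hook from the wrapper) can be sketched in plain userspace C. This is only an illustration under stated assumptions: `struct device`, `struct chip`, `chip_attach()` and `platform_release()` are hypothetical stand-ins, not the kernel's types or the TPM driver's code.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for a device object; NOT the kernel's struct device. */
struct device {
	void (*release)(struct device *dev);
	void *drvdata;
};

/* Hypothetical chip object that remembers the hook it displaced. */
struct chip {
	void (*release)(struct device *dev);	/* saved original hook */
	int released;				/* set once cleanup has run */
};

static int platform_release_calls;

/* Example of an original hook installed by the device's owner. */
static void platform_release(struct device *dev)
{
	(void)dev;
	platform_release_calls++;
}

/* Wrapper: run the saved hook first, then do the chip's own cleanup. */
static void chip_dev_release(struct device *dev)
{
	struct chip *chip = dev->drvdata;

	assert(chip != NULL);
	if (chip->release)	/* the stand-in device may have had no hook */
		chip->release(dev);
	chip->released = 1;
}

/* Save the current hook and install the wrapper in its place. */
static void chip_attach(struct device *dev, struct chip *chip)
{
	chip->release = dev->release;
	dev->release = chip_dev_release;
	dev->drvdata = chip;
}
```

The sketch keeps a NULL check only because its stand-in device may start without a hook; whether such a check is needed in the real driver is exactly the question raised in the review.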



Re: [PATCH] printk.c: use ints instead of longs for logbuf index

2007-11-19 Thread Andrew Morton
On Sun, 18 Nov 2007 19:32:12 -0800 Denys Vlasenko <[EMAIL PROTECTED]> wrote:

> Subject: [PATCH] printk.c: use ints instead of longs for logbuf index

"unsigned ints".  It matters - using ints would fill the code with bugs.

> Date: Sun, 18 Nov 2007 19:32:12 -0800
> User-Agent: KMail/1.9.1
> 
> Hi Andrew,
> 
> This patch stops using unsigned _longs_ for printk
> buffer indexes. Log buffer is way smaller than 2 gigabytes
> and unsigned ints will work too . Indeed, they do work nicely
> on all 32-bit platforms where longs and ints are the same.
> 
> With this patch, we have following size savings on amd64:
> 
>    text    data     bss     dec     hex filename
>    5997     313   17736   24046    5dee 2.6.23.1.t64/kernel/printk.o
>    5858     313   17700   23871    5d3f 2.6.23.1.printk.t64/kernel/printk.o

I can imagine someone using an 8GB log buffer for crazy
i-cant-be-bothered-using-relayfs stuff.

Oh well, they'll live.
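
To illustrate why "unsigned" matters for the index type, here is a small userspace sketch of a free-running ring-buffer index. It is not the printk code: the buffer size and the names `emit_log_char()` and `chars_since()` are assumptions made for the example. Unsigned overflow wraps in a well-defined way, so both the masked index and the difference between two indexes stay correct across the wrap; signed overflow would be undefined behaviour.

```c
#include <assert.h>
#include <limits.h>

#define LOG_BUF_LEN  (1u << 4)		/* hypothetical size, power of two */
#define LOG_BUF_MASK (LOG_BUF_LEN - 1)

static_assert((LOG_BUF_LEN & (LOG_BUF_LEN - 1)) == 0,
	      "LOG_BUF_LEN must be a power of two");

static char log_buf[LOG_BUF_LEN];
static unsigned int log_end;		/* free-running index, never reset */

/* Store one character; the mask maps the free-running index into the buffer. */
static void emit_log_char(char c)
{
	log_buf[log_end & LOG_BUF_MASK] = c;
	log_end++;			/* unsigned wraparound is well defined */
}

/* Number of characters written since 'start', correct across a wrap. */
static unsigned int chars_since(unsigned int start)
{
	return log_end - start;
}
```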


Re: [PATCH] Time-based RFC 4122 UUID generator

2007-11-19 Thread Matt Mackall
On Sun, Nov 18, 2007 at 10:40:34PM +0100, Helge Deller wrote:
> On Sunday 18 November 2007, Andrew Morton wrote:
> > On Sun, 18 Nov 2007 20:38:21 +0100 Helge Deller <[EMAIL PROTECTED]> wrote:
> > 
> > > Title: Add time-based RFC 4122 UUID generator
> > > 
> > > The current Linux kernel currently contains the generate_random_uuid() 
> > > function, which creates - based on RFC 4122 - truly random UUIDs and 
> > > provides them to userspace through /proc/sys/kernel/random/boot_id and 
> > > /proc/sys/kernel/random/uuid.
> > > 
> > > This patch additionally adds the "Time-based UUID" variant of RFC 4122, 
> > > with which userspace applications can easily get real unique time-based 
> > > UUIDs through /proc/sys/kernel/random/uuid_time.
> > > A new /proc/sys/kernel/random/uuid_time_clockseq sysfs entry is available,
> > > so that the clock_seq value can be retained across system bootups (which
> > > is required by RFC 4122).
> > > 
> > > The attached implementation uses getnstimeofday() to get very fine-grained
> > > granularity. This helps, so that userspace tools can get a lot more UUIDs 
> > > (if needed) per time than before.
> > > > A mutex takes care of the proper locking against a mistaken double 
> > > > creation of UUIDs for simultaneously running processes.
> 
> 
> > Who will use this feature, and for what?
> > (In fact, who uses the existing UUID generators, and for what?)
> 
> Current users I know of (but there are more):
> - e2fsprogs uses it e.g. to create unique UUIDs for disks (it ships an own 
> library for that)
> - http://commons.apache.org/sandbox/id/uuid.html uses it with own libraries
> - SAP Netweaver on Linux uses it 
> (http://www.sap.com/platform/netweaver/index.epx)
> 
> I'm mostly interested in fixing problems I see with SAP (I'm working for SAP).
> SAP Netweaver often needs during a very short time frame lots of unique UUIDs 
> (to reference the data afterwards) when new data is imported into the 
> database.
> Main problem with current implementations is that they don't 100% 
> guarantee uniqueness of the generated UUIDs. Sometimes, esp. on very fast 
> multi-processor machines, duplicate UUIDs are generated and returned to the 
> application which is very bad and may result in unreliable behaviour.
> 
> Current implementations use userspace libraries. In userspace you e.g. can't 
> easily protect the uniqueness of a UUID against other running _processes_.
> If you try to, you'll need to do locking e.g. with shared memory, which can 
> get very expensive.

Even with a futex? Or userspace atomics? I think something as simple
as a server stuffing a bunch of clock sequence numbers into a pipe
for clients to pop into their generated UUIDs should be plenty fast
enough.
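
As a rough sketch of the "userspace atomics" idea: a single atomic counter can hand out distinct clock sequence values without a lock. The names `next_seq` and `alloc_clock_seq()` are hypothetical, and the sketch assumes one process; to coordinate several processes the counter would have to live in memory they share, which is not shown here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define CLOCK_SEQ_MASK 0x3fffu	/* clock sequence is a 14-bit field */

static_assert((CLOCK_SEQ_MASK & (CLOCK_SEQ_MASK + 1)) == 0,
	      "mask must be of the form 2^n - 1");

/* Shared counter; atomic_fetch_add() gives each caller a distinct value. */
static atomic_uint next_seq;

/* Return the next clock sequence value, wrapping within 14 bits. */
static uint16_t alloc_clock_seq(void)
{
	return (uint16_t)(atomic_fetch_add(&next_seq, 1) & CLOCK_SEQ_MASK);
}
```

Values repeat after 16384 allocations, so a caller would still need to pair them with a timestamp that has advanced in the meantime.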

> The problem will get even worse with virtualization technologies like XEN and
> containers. There it's even impossible to protect against processes in other 
> VMs.

Nor does it make sense to try! A virtual machine is an independent machine
after all.

> Another user which could benefit from it are embedded devices. They could 
> drop their userspace-implementations in favour of this smaller kernel version
> to create UUIDs for their disks, using it in the webservers, ...

That's a silly tradeoff. It's an unusual embedded device that ships
with any need for a UUID, especially mkfs. And generally, putting a
feature in the kernel has no inherent size advantage. In fact, it has
a size disadvantage: it's no longer pageable.

ps: I'm the listed random.c maintainer so you'll want to cc: me in the
future.
-- 
Mathematics is the supreme nostalgia of our time.


Re: [PATCH] TPM TIS device driver locality request

2007-11-19 Thread Andrew Morton
On Mon, 19 Nov 2007 00:32:51 +0100 Marcel Selhorst <[EMAIL PROTECTED]> wrote:

> Dear all,
> 
> during the initialization of the TPM TIS driver, the necessary
> locality has to be requested earlier in the init-process. Depending on
> the used TPM chip, this leads to wrong information.
> For example: Lenovo X61s with Atmel TPM:
> 
> tpm_tis 00:0a: 1.2 TPM (device-id 0x, rev-id 255)
> 
> But correct is:
> 
> tpm_tis 00:0c: 1.2 TPM (device-id 0x3203, rev-id 9)
> 
> This short patch fixes this issue.

Is this bug sufficiently serious to warrant inclusion of the fix in
2.6.24?  2.6.23.x?

It looks like it's just a cosmetic thing, but I'd like to check...

> Signed-Off-by Marcel Selhorst <[EMAIL PROTECTED]>

Signed-off-by:, please.

> ---
> --- tpm_tis.c.orig  2007-11-19 00:21:09.0 +0100
> +++ tpm_tis.c   2007-11-19 00:21:23.0 +0100

`patch -p1' form, please.  This should have been

--- a/drivers/char/tpm/tpm_tis.c
+++ a/drivers/char/tpm/tpm_tis.c

> @@ -450,6 +450,11 @@ static int tpm_tis_init(struct device *d
> goto out_err;
> }
> 
> +   if (request_locality(chip, 0) != 0) {
> +   rc = -ENODEV;
> +   goto out_err;
> +   }
> +

Your email client is converting tabs to spaces.

http://mbligh.org/linuxdocs/Email/Clients/Thunderbird has help for this.

Thanks.


Re: Is there any word about this bug in gcc ?

2007-11-19 Thread David Miller
From: WANG Cong <[EMAIL PROTECTED]>
Date: Tue, 20 Nov 2007 13:39:05 +0800

> And you mean abs() is not in glibc, then where is it? Built in gcc?
> And what's more, why not put it in glibc?

Because the compiler knows things about the inputs and can
thus apply optimizations that a static implementation in glibc
that has to handle all forms of inputs cannot.


Re: [PATCH 7/24] consolidate msr.h

2007-11-19 Thread Steven Rostedt

On Tue, 20 Nov 2007, Ingo Molnar wrote:
>
> i dont think there's ever any true need (and good cause) to force
> integer type casts like that at the callee site.

Unless you mean we should do something like this:

static inline void __wrmsrl(unsigned int msr, unsigned long long val);
#define wrmsr(msr, val) __wrmsrl(msr, (unsigned long long)val)
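
A userspace sketch of that idea: a typed function does the work and a macro applies the cast at each call site, so callers can pass either integers or pointers. `raw_wrmsrl()` is a hypothetical stand-in that only records its arguments; nothing here touches a real MSR. The sketch assumes pointers are no wider than 64 bits, and on a 32-bit build the pointer cast would still draw a size warning.

```c
#include <assert.h>

static_assert(sizeof(unsigned long long) >= sizeof(void *),
	      "a pointer must fit in the 64-bit value");

static unsigned int last_msr;
static unsigned long long last_val;

/* Stand-in for the typed write function: just records what it was given. */
static inline void raw_wrmsrl(unsigned int msr, unsigned long long val)
{
	last_msr = msr;
	last_val = val;
}

/* The cast lives in the macro, so call sites need no explicit cast. */
#define wrmsrl(msr, val) raw_wrmsrl((msr), (unsigned long long)(val))
```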

-- Steve




Re: [PATCH] Time-based RFC 4122 UUID generator

2007-11-19 Thread Andrew Morton
On Sun, 18 Nov 2007 20:38:21 +0100 Helge Deller <[EMAIL PROTECTED]> wrote:

> Andrew,
> 
> could you please consider adding this patch to your 2.6.25 patch series?

please cc netdev on networking-related things

> This is the third version of the patch, in which I cleaned up and fixed quite 
> a few things according to feedback from Ted.
> I assume this version is OK, since I haven't received any further feedback 
> in two weeks: http://lkml.org/lkml/2007/11/4/128.
> 
> Thanks,
> Helge
> ---
> Title: Add time-based RFC 4122 UUID generator
> 
> The current Linux kernel currently contains the generate_random_uuid() 
> function, which creates - based on RFC 4122 - truly random UUIDs and 
> provides them to userspace through /proc/sys/kernel/random/boot_id and 
> /proc/sys/kernel/random/uuid.
> 
> This patch additionally adds the "Time-based UUID" variant of RFC 4122, 
> with which userspace applications can easily get real unique time-based 
> UUIDs through /proc/sys/kernel/random/uuid_time.
> A new /proc/sys/kernel/random/uuid_time_clockseq sysfs entry is available,
> so that the clock_seq value can be retained across system bootups (which
> is required by RFC 4122).
> 
> The attached implementation uses getnstimeofday() to get very fine-grained
> granularity. This helps, so that userspace tools can get a lot more UUIDs 
> (if needed) per time than before.
> A mutex takes care of the proper locking against a mistaken double creation 
> of UUIDs for simultaneously running processes.
> 
> Signed-off-by: Helge Deller <[EMAIL PROTECTED]>
> 
>  drivers/char/random.c  |  205 
> -
>  include/linux/sysctl.h |5 -
>  2 files changed, 190 insertions(+), 20 deletions(-)
> 
> diff --git a/drivers/char/random.c b/drivers/char/random.c
> index 5fee056..fc48c29 100644
> --- a/drivers/char/random.c
> +++ b/drivers/char/random.c
> @@ -6,6 +6,9 @@
>   * Copyright Theodore Ts'o, 1994, 1995, 1996, 1997, 1998, 1999.  All
>   * rights reserved.
>   *
> + * Time based UUID (RFC 4122) generator:
> + * Copyright Helge Deller <[EMAIL PROTECTED]>, 2007
> + *
>   * Redistribution and use in source and binary forms, with or without
>   * modification, are permitted provided that the following conditions
>   * are met:
> @@ -239,6 +242,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  #include 
> @@ -1174,12 +1178,169 @@ EXPORT_SYMBOL(generate_random_uuid);
>  static int min_read_thresh = 8, min_write_thresh;
>  static int max_read_thresh = INPUT_POOL_WORDS * 32;
>  static int max_write_thresh = INPUT_POOL_WORDS * 32;
> -static char sysctl_bootid[16];
> +static unsigned char sysctl_bootid[16] __read_mostly;
>  
>  /*
> - * These functions is used to return both the bootid UUID, and random
> - * UUID.  The difference is in whether table->data is NULL; if it is,
> - * then a new UUID is generated and returned to the user.
> + * Helper functions and variables for time based UUID generator
> + */
> +static unsigned int clock_seq;
> +static const unsigned int clock_seq_max = 0x3fff; /* 14 bits */

There isn't a lot of point in `static const'.  Hopefully the compiler will
do the right thing with it (use literal constant and elide the storage if
nothing takes its address) but why not just do #define UPPER_CASE_THING in
the time-honoured manner?
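
For illustration, the #define form could look like the sketch below. `CLOCK_SEQ_MAX` and `clamp_clock_seq()` are names chosen for this example, not taken from the patch.

```c
#include <assert.h>

#define CLOCK_SEQ_MAX 0x3fffu	/* 14 bits */

static_assert(CLOCK_SEQ_MAX == (1u << 14) - 1, "expected a 14-bit mask");

/* Keep only the low 14 bits; the constant is a literal, no storage needed. */
static unsigned int clamp_clock_seq(unsigned int seq)
{
	return seq & CLOCK_SEQ_MAX;
}
```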

> +static int clock_seq_initialized __read_mostly;
> +
> +static void init_clockseq(void)
> +{
> + get_random_bytes(&clock_seq, sizeof(clock_seq));
> + clock_seq &= clock_seq_max;
> + clock_seq_initialized = 1;
> +}
> +
> +static int proc_dointvec_clockseq(struct ctl_table *table, int write,
> + struct file *filp, void __user *buffer,
> + size_t *lenp, loff_t *ppos)
> +{
> + int ret;
> +
> + if (!write && !clock_seq_initialized)
> + init_clockseq();

Seems there's a straightforward race here where multiple tasks can run
init_clockseq() concurrently.

Can't we use a regular initcall here and make init_clockseq() __init?  I
guess that would cast doubt over the quality of the thing which
get_random_bytes() returned, but that's already the case - super-early
userspace in initramfs could mount /proc and trigger this call.  We end up
with a predictable sequence number?
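
A loose userspace analogy for the initcall suggestion: run the initialisation exactly once at startup, before any reader can get in, instead of checking a flag lazily on each read. The sketch relies on the GCC/Clang `constructor` attribute, which is an assumption about the toolchain, and uses an unseeded `rand()` purely as a placeholder for the random source, so the value really is predictable here.

```c
#include <assert.h>
#include <stdlib.h>

#define CLOCK_SEQ_MAX 0x3fffu	/* 14 bits */

static_assert(CLOCK_SEQ_MAX == (1u << 14) - 1, "expected a 14-bit mask");

static unsigned int clock_seq;
static int clock_seq_initialized;

/* Runs once before main() (GCC/Clang extension), so readers never race. */
__attribute__((constructor))
static void init_clockseq(void)
{
	/* Placeholder for a real random source; unseeded rand() is predictable. */
	clock_seq = (unsigned int)rand() & CLOCK_SEQ_MAX;
	clock_seq_initialized = 1;
}

/* Readers no longer need an "initialised yet?" check. */
static unsigned int read_clock_seq(void)
{
	return clock_seq;
}
```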

> + ret = proc_dointvec(table, write, filp, buffer, lenp, ppos);
> +
> + if (write && ret >= 0) {
> + clock_seq_initialized = 1;
> + clock_seq &= clock_seq_max;
> + }
> +
> + return ret;
> +}
> +
> +/*
> + * Generate time based UUID (RFC 4122)
> + *
> + * This function is protected with a mutex to ensure system-wide
> + * uniqueness of the new time based UUID.
> + */
> +static void generate_random_uuid_time(unsigned char uuid_out[16])
> +{
> + static DEFINE_MUTEX(uuid_mutex);
> + static u64 last_time_all;
> + static unsigned int clock_seq_started;
> + static unsigned char last_mac[ETH_ALEN];
> +
> + struct timespec ts;
> + u64 time_all;
> + unsigned 

Re: BUG: soft lockup detected on CPU#0! in 2.6.23.8

2007-11-19 Thread David Miller
From: Erik de Castro Lopo <[EMAIL PROTECTED]>
Date: Tue, 20 Nov 2007 16:22:25 +1100

> I've just compiled 2.6.23.8 from kernel.org sources and I'm getting a
> bunch of these soft lockups detected.

Yes, this is getting hit by everyone, a fix is in the works.


Re: [PATCH 7/24] consolidate msr.h

2007-11-19 Thread Steven Rostedt


On Tue, 20 Nov 2007, Ingo Molnar wrote:

> * Steven Rostedt <[EMAIL PROTECTED]> wrote:
>
> > With PVOPS on it gives compiler warnings without that explicit cast.
> > Without looking at the code, IIRC with non-PVOPS it is a macro
> > directly into asm, so it didn't matter what the cast was. But with
> > PVOPS as a function, it gave compiler warnings.
> >
> > Take it out and try compiling it for both i386 and x86_64. One of them
> > gave warnings. But maybe it's not a problem now.
>
> i dont think there's ever any true need (and good cause) to force
> integer type casts like that at the callee site.

I guess the problem is that we converted a macro to a function, where the
macro did no type checking. Now we need to pick between integers and
pointers. Some places use integers in wrmsrl and some use pointers. So
changing this to a typechecking protocol is not going to be nice.

Looking at the current code now, we have this:


checking_wrmsrl(MSR_IA32_SYSENTER_CS, (u64)__KERNEL_CS);
checking_wrmsrl(MSR_IA32_SYSENTER_ESP, 0ULL);
checking_wrmsrl(MSR_IA32_SYSENTER_EIP, (u64)ia32_sysenter_target);

wrmsrl(MSR_CSTAR, ia32_cstar_target);


A typecast is already used in that same area.

-- Steve




Re: Is there any word about this bug in gcc ?

2007-11-19 Thread WANG Cong
On Tue, Nov 20, 2007 at 02:03:12PM +0800, Li Zefan wrote:
>WANG Cong wrote:
>> On Mon, Nov 19, 2007 at 09:10:44PM -0800, H. Peter Anvin wrote:
>>> WANG Cong wrote:
>>>> On Tue, Nov 20, 2007 at 10:13:42AM +0800, zhengyi wrote:
>>>>> Is there any relevance to the kernel ?
>>>>>
>>>>> I found the following code here:
>>>>> http://linux.solidot.org/article.pl?sid=07/11/19/0512218=rss
>>>>>
>>>>> ---
>>>>> int main( void )
>>>>> {
>>>>> int i=2;
>>>>> if( -10*abs (i-1) == 10*abs(i-1) )
>>>>>   printf ("OMG,-10==10 in linux!\n");
>>>>> else
>>>>>   printf ("nothing special here\n") ;
>>>>>
>>>>> return 0 ;
>>>>> }
>>>> I think no. It is considered a bug in abs(), kernel, of course,
>>>> doesn't use glibc's abs().
>>>>
>>> Wrong.
>>>
>>> abs() is internal to gcc, and the above is optimized out at compile 
>>> time, so any user of abs() as a function at all is vulnerable.
>> 
>> This is an urgent bug, I think.
>> 
>> And you mean abs() is not in glibc, then where is it? Built in gcc?
>> And what's more, why not put it in glibc?
>> 
>
>Gcc optimises abs() to use gcc's built-in abs(). So if we use -fno-builtin, 
>we'll get the correct result. That is to say the bug has nothing to do with
>glibc.
>
>And this bug has been fixed just several days ago.
>
>http://www.nabble.com/-PATCH--Fix-PR34130,-extract_muldiv-broken-t4826688.html
>

Good explanation! Thank you!




Re: [patch/backport] CFS scheduler, -v24, for v2.6.24-rc3, v2.6.23.8,v2.6.22.13, v2.6.21.7

2007-11-19 Thread Ingo Molnar

* David <[EMAIL PROTECTED]> wrote:

> On Monday, 19 November 2007, Ingo Molnar wrote:
> > * David <[EMAIL PROTECTED]> wrote:
> > > I have removed all other patches, and applied only cfs v24 above
> > > 2.6.23.8, and the compiler ran into (with CONFIG_FAIR_GROUP_SCHED
> > > enabled):
> >
> > does the patch below help?
> >
> > Ingo
> 
> Yes, now sched.c compiles without errors, but linking fails at:
> 
> 
>   LD  init/built-in.o
>   LD  .tmp_vmlinux1
> kernel/built-in.o:(.data+0x4a8): undefined reference to 
> `sysctl_sched_min_bal_int_shares'
> kernel/built-in.o:(.data+0x4d4): undefined reference to 
> `sysctl_sched_max_bal_int_shares'
> make: *** [.tmp_vmlinux1] Error 1

does the patch below do the trick?

Ingo

Index: linux/kernel/sysctl.c
===
--- linux.orig/kernel/sysctl.c
+++ linux/kernel/sysctl.c
@@ -309,7 +309,7 @@ static struct ctl_table kern_table[] = {
.mode   = 644,
.proc_handler   = &proc_dointvec,
},
-#ifdef CONFIG_FAIR_GROUP_SCHED
+#if defined(CONFIG_FAIR_GROUP_SCHED) && defined(CONFIG_SMP)
{
.ctl_name   = CTL_UNNUMBERED,
.procname   = "sched_min_bal_int_shares",


Re: [stable] Soft lockups since stable kernel upgrade to 2.6.23.8

2007-11-19 Thread Ingo Molnar

* Chuck Ebbert <[EMAIL PROTECTED]> wrote:

> On 11/17/2007 07:55 PM, Ingo Molnar wrote:
> > * Greg KH <[EMAIL PROTECTED]> wrote:
> > 
> >> Great, thanks for tracking this down.
> >>
> >> Ingo, this corresponds to changeset 
> >> a115d5caca1a2905ba7a32b408a6042b20179aaa in mainline.  Is that patch 
> >> incorrect?  Should this patch in the -stable tree be reverted?
> > 
> > hm, there are no such problems in .24 and the cpu_clock() and other 
> > fixes i did were not picked up. Find the missing fixes below. They 
> > should work just fine in .23 as it has the cpu_clock() functionality 
> > too.
> > 
> > [ NOTE: the most robust thing is to make the .23 version match the .24
> >   version of kernel/softlockup.c, so i included two other harmless
> >   changes in this diff as well. ]
> > 
> > Ingo
> > 
> > --->
> > commit a5f2ce3c6024a5bb895647b6bd88ecae5001020a
> > Author: Ingo Molnar <[EMAIL PROTECTED]>
> > Date:   Tue Oct 16 23:26:08 2007 -0700
> > 
> > commit 43581a10075492445f65234384210492ff333eba
> > Author: Ingo Molnar <[EMAIL PROTECTED]>
> > Date:   Tue Oct 16 23:26:08 2007 -0700
> 
> Those are just cosmetic / cleanup changes.
> 
> Don't you need commit a3b13c23f186ecb57204580cc1f2dbe9c284953a ??

yes:

> > [...] the cpu_clock() and other fixes i did were not picked up.

i just forgot to attach the cpu_clock() changes - they are in a3b13c23.

Ingo


Re: Is there any word about this bug in gcc ?

2007-11-19 Thread Li Zefan
WANG Cong wrote:
> On Mon, Nov 19, 2007 at 09:10:44PM -0800, H. Peter Anvin wrote:
>> WANG Cong wrote:
>>> On Tue, Nov 20, 2007 at 10:13:42AM +0800, zhengyi wrote:
>>>> Is there any relevance to the kernel ?
>>>>
>>>> I found the following code here:
>>>> http://linux.solidot.org/article.pl?sid=07/11/19/0512218=rss
>>>>
>>>> ---
>>>> int main( void )
>>>> {
>>>> int i=2;
>>>> if( -10*abs (i-1) == 10*abs(i-1) )
>>>>   printf ("OMG,-10==10 in linux!\n");
>>>> else
>>>>   printf ("nothing special here\n") ;
>>>>
>>>> return 0 ;
>>>> }
>>> I think no. It is considered a bug in abs(), kernel, of course,
>>> doesn't use glibc's abs().
>>>
>> Wrong.
>>
>> abs() is internal to gcc, and the above is optimized out at compile 
>> time, so any user of abs() as a function at all is vulnerable.
> 
> This is an urgent bug, I think.
> 
> And you mean abs() is not in glibc, then where is it? Built in gcc?
> And what's more, why not put it in glibc?
> 

Gcc optimises abs() to use gcc's built-in abs(). So if we use -fno-builtin, 
we'll get the correct result. That is to say the bug has nothing to do with
glibc.

And this bug has been fixed just several days ago.

http://www.nabble.com/-PATCH--Fix-PR34130,-extract_muldiv-broken-t4826688.html

> 
>> However, the Linux kernel defines abs() as a macro:
>>
>> #define abs(x) ({   \
>>int __x = (x);  \
>>(__x < 0) ? -__x : __x; \
>>})
>>
>> ... which means gcc never sees it.  So the kernel isn't affected, 
>> because it doesn't use *gcc's* abs().
> 
> Thanks for clarifying this!
> 
> Regards.
> 
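
For reference, the macro approach quoted above can be tried in userspace with a sketch like the one below. It uses the same GNU statement-expression form (a GCC/Clang extension) under the hypothetical name `kabs()`, so the comparison from the report is evaluated on plain integer code rather than on a call the compiler recognises as `abs()`. The helper names are made up for this example.

```c
#include <assert.h>

/* Kernel-style abs() as a macro; a GNU statement expression. */
#define kabs(x) ({				\
		int __x = (x);			\
		(__x < 0) ? -__x : __x;		\
	})

/* Re-creates the comparison from the report for a given i. */
static int products_equal(int i)
{
	return -10 * kabs(i - 1) == 10 * kabs(i - 1);
}

/* Sanity check: with i == 2 the two products are -10 and 10. */
static void self_check(void)
{
	assert(!products_equal(2));
}
```

The two sides are equal only when `i - 1` is zero, which is why `products_equal(1)` is true while `products_equal(2)` is not.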



Re: Is there any word about this bug in gcc ?

2007-11-19 Thread H. Peter Anvin

WANG Cong wrote:
> This is an urgent bug, I think.
> 
> And you mean abs() is not in glibc, then where is it? Built in gcc?
> And what's more, why not put it in glibc?

If you need answers to this type of questions, this is not the place for it.

-hpa


Re: [stable] Soft lockups since stable kernel upgrade to 2.6.23.8

2007-11-19 Thread Ingo Molnar

* Jeremy Fitzhardinge <[EMAIL PROTECTED]> wrote:

> Greg KH wrote:
> > Can you try applying the patch below to see if that solves the problem
> > for you?
> >   
> 
> I don't think this patch will help; it only has cosmetic changes in
> addition to the original message printing fix.  I think it also needs
> change a3b13c23f186ecb57204580cc1f2dbe9c284953a:
> 
> diff -r 79f0ea1e0e70 -r 06f060ab58aa kernel/softlockup.c

yes, it does need the cpu_clock() changes as i mentioned.

  commit a3b13c23f186ecb57204580cc1f2dbe9c284953a
  Author: Ingo Molnar <[EMAIL PROTECTED]>
  Date:   Tue Oct 16 23:26:06 2007 -0700

softlockup: use cpu_clock() instead of sched_clock()

sched_clock() is not a reliable time-source, use cpu_clock() instead.

but we only have cpu_clock() from v2.6.23 onwards - so we should not 
apply the original patch to v2.6.22. (we should not have applied your 
patch that started the mess to begin with - but that's another matter.)

Ingo


Re: High priority tasks break SMP balancer?

2007-11-19 Thread Ingo Molnar

* Micah Dowty <[EMAIL PROTECTED]> wrote:

> > this one is being triggered whenever a cpu becomes idle (schedule() 
> > --> idle_balance() --> load_balance_newidle()).
> > 
> > (this flag is a bit #1 == 2)
> > 
> > cat /proc/sys/kernel/sched_domain/cpu0/domain0/flags
> 
> Hmm. I don't have this file on my system:
> 
> [EMAIL PROTECTED]:/proc/sys/kernel/sched_domain/cpu0/domain0# ls
> busy_factor  busy_idx  forkexec_idx  idle_idx  imbalance_pct  max_interval  
> min_interval  newidle_idx  wake_idx
> [EMAIL PROTECTED]:/proc/sys/kernel/sched_domain/cpu0/domain0# uname -a
> Linux micah-64 2.6.23.1 #1 SMP Fri Nov 2 12:25:47 PDT 2007 x86_64 GNU/Linux
> 
> Is there a config option I'm missing?

yes, CONFIG_SCHED_DEBUG.

Ingo
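
Once the flags file is available, checking "bit #1 == 2" on its contents is a one-line mask test. A minimal sketch, assuming the file holds a single integer; `FLAG_NEWIDLE` and the helper names are hypothetical labels for this example, not kernel identifiers.

```c
#include <assert.h>
#include <stdlib.h>

#define FLAG_NEWIDLE_BIT 1				/* hypothetical name */
#define FLAG_NEWIDLE     (1u << FLAG_NEWIDLE_BIT)

static_assert(FLAG_NEWIDLE == 2, "bit #1 has the value 2");

/* Parse the text read from the flags file and test bit #1. */
static int newidle_enabled(const char *flags_text)
{
	unsigned long flags = strtoul(flags_text, NULL, 0);

	return (flags & FLAG_NEWIDLE) != 0;
}

/* Value to write back in order to clear bit #1 and keep the other bits. */
static unsigned long without_newidle(unsigned long flags)
{
	return flags & ~(unsigned long)FLAG_NEWIDLE;
}
```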


Re: [PATCH] ipconfig.c : implement DHCP Class-identifier

2007-11-19 Thread David Miller
From: Francois Romieu <[EMAIL PROTECTED]>
Date: Wed, 14 Nov 2007 23:11:19 +0100

> Rainer Jochem <[EMAIL PROTECTED]> :
> [...]
> > --- net/ipv4/ipconfig.c.orig2007-11-14 09:16:15.800566536 +0100
> > +++ net/ipv4/ipconfig.c 2007-11-14 10:34:22.471219274 +0100
> > @@ -139,6 +139,8 @@ __be32 ic_servaddr = NONE;  /* Boot serve
> >  __be32 root_server_addr = NONE;/* Address of NFS server */
> >  u8 root_server_path[256] = { 0, }; /* Path to mount as root */
> >  
> > +static char vendor_class_identifier[253]; /* vendor class identifier */
> > +
> 
> ic_dhcp_init_options is __init. Should it not be __initdata ?

I think so and I've made that change in net-2.6.25, thanks.


Re: 2.6.24-rc2 STD with s2disk fails to activate suspended system after loading - now 2.6.24-rc3

2007-11-19 Thread Ingo Molnar

* Rafael J. Wysocki <[EMAIL PROTECTED]> wrote:

> > increasing CONFIG_BLK_DEV_RAM_SIZE from  to 131072 hasn't 
> > changed the non-functioning of 2.6.24-rc3
> > 
> > s2disk works with 2.6.23.8 ; I tested 4 cycles in a row, 2 from 
> > console and 2 from within X
> 
> I've attached a patch to the bugzilla entry, please test it.
[...]
> --- linux-2.6.orig/init/do_mounts_initrd.c
> +++ linux-2.6/init/do_mounts_initrd.c
> @@ -55,6 +55,8 @@ static void __init handle_initrd(void)
>   sys_mount(".", "/", NULL, MS_MOVE, NULL);
>   sys_chroot(".");
>  
> + current->flags |= PF_NOFREEZE;
> +
>   pid = kernel_thread(do_linuxrc, "/linuxrc", SIGCHLD);
>   if (pid > 0)
>   while (pid != sys_wait4(-1, NULL, 0, NULL)) {

this is not the first time (and not the last time) that a missing 
PF_NOFREEZE is causing hard to debug suspend problems. I think we should 
be more robust about this and at minimum include some debug mechanism 
that determines when a PF_NOFREEZE annotation is missing. (or at least 
detect the condition somehow and report it) This bug took 10 days to 
track down.

Ingo


Re: [PATCH] PPC: Fix potential NULL dereference

2007-11-19 Thread Kumar Gala
On Thu, 15 Nov 2007, Cyrill Gorcunov wrote:

> This patch does fix potential NULL pointer dereference
> that could take place inside of strcmp() if
> of_get_property() call failed.
>
> Signed-off-by: Cyrill Gorcunov <[EMAIL PROTECTED]>
> ---
>
>  arch/powerpc/platforms/83xx/usb.c |8 
>  1 files changed, 4 insertions(+), 4 deletions(-)
>

CC the [EMAIL PROTECTED] list in the future.

applied.

- k


Re: [PATCH 7/24] consolidate msr.h

2007-11-19 Thread Ingo Molnar

* Steven Rostedt <[EMAIL PROTECTED]> wrote:

> > On Fri, Nov 09, 2007 at 04:42:48PM -0200, Glauber de Oliveira Costa wrote:
> > > - wrmsrl(MSR_CSTAR, ia32_cstar_target);
> > > + wrmsrl(MSR_CSTAR, (u64)ia32_cstar_target);
> >
> > Hmm, why do you add explicit casts? The compiler should convert that
> > correctly on its own.
> >
> > > +static inline void wrmsrl(unsigned int msr, unsigned long long val)
> >
> > Hmm, long long is 64 bit on all x86, but why not use explicit u64 to
> > show that?
> 
> (quick reply)
> 
> With PVOPS on it gives compiler warnings without that explicit cast. 
> Without looking at the code, IIRC with non-PVOPS it is a macro 
> directly into asm, so it didn't matter what the cast was. But with 
> PVOPS as a function, it gave compiler warnings.
> 
> Take it out and try compiling it for both i386 and x86_64. One of them 
> gave warnings. But maybe it's not a problem now.

i dont think there's ever any true need (and good cause) to force 
integer type casts like that at the callee site.

Ingo


Re: [PATCH 2/2] sata_nv: fix ATAPI issues with memory over 4GB (v3)

2007-11-19 Thread Tejun Heo
Robert Hancock wrote:
> It looks like the problem is that even though we set the DMA mask after
> we allocate the PRD and pad buffers, when the other port is set up, the
> DMA mask is already over 64-bit and so it allocates its buffers over 4GB
> and fails. I think we just need to explicitly set to 32-bit first,
> getting the reporter to try that one now.

Ah.. right.  That makes sense.  Thanks.

-- 
tejun


Re: [BUG?] OOM with large cache....(x86_64, 2.6.24-rc3-git1, nohz)

2007-11-19 Thread Ingo Molnar

* Nick Piggin <[EMAIL PROTECTED]> wrote:

> Unfortunately, we don't show NR_ANON_PAGES in these stats, [...]

sidenote: the way i combat these missing pieces of instrumentation in 
the scheduler is to add them immediately to the cfs-debug-info.sh script 
(and to /proc/sched_debug if needed). I.e. one report that misses a 
piece of critical information is OK, but if it's two reports and we 
still haven't made it easy to report the right kind of information, 
that is our fault entirely. This constant ping-ponging for information 
that goes on for basically every MM problem - information which could 
have been provided in the first message (by running a single, easy to 
download tool) - is getting pretty hindering i believe.

Ingo


Re: [BUG] 2.6.24-rc2-mm1 - kernel bug on nfs v4

2007-11-19 Thread Andrew Morton
On Sun, 18 Nov 2007 14:18:06 -0500 Trond Myklebust <[EMAIL PROTECTED]> wrote:

> > 
> > Torsten
> 
> I had already fixed that one in my own stack. Attached are the 3 patches
> that I've got. 1 from SteveD, 2 fixes.
> 
> Andrew, could you please unapply the sillyrename patches you've got, and
> apply these 3 instead?

I'd expect to see things like this appear in git-nfs.patch.  Did something 
change?


Re: Is there any word about this bug in gcc ?

2007-11-19 Thread WANG Cong
On Mon, Nov 19, 2007 at 09:10:44PM -0800, H. Peter Anvin wrote:
>WANG Cong wrote:
>>On Tue, Nov 20, 2007 at 10:13:42AM +0800, zhengyi wrote:
>>>Is there any relevance to the kernel ?
>>>
>>>I found the following code here:
>>>http://linux.solidot.org/article.pl?sid=07/11/19/0512218=rss
>>>
>>>---
>>>int main( void )
>>>{
>>> int i=2;
>>> if( -10*abs (i-1) == 10*abs(i-1) )
>>>   printf ("OMG,-10==10 in linux!\n");
>>> else
>>>   printf ("nothing special here\n") ;
>>>
>>> return 0 ;
>>>}
>>
>>I think no. It is considered a bug in abs(), kernel, of course,
>>doesn't use glibc's abs().
>>
>
>Wrong.
>
>abs() is internal to gcc, and the above is optimized out at compile 
>time, so any user of abs() as a function at all is vulnerable.

This is an urgent bug, I think.

And you mean abs() is not in glibc, then where is it? Built in gcc?
And what's more, why not put it in glibc?

Thanks.

>
>However, the Linux kernel defines abs() as a macro:
>
>#define abs(x) ({   \
>int __x = (x);  \
>(__x < 0) ? -__x : __x; \
>})
>
>... which means gcc never sees it.  So the kernel isn't affected, 
>because it doesn't use *gcc's* abs().

Thanks for clarifying this!

Regards.




Re: CONFIG_IRQBALANCE for 64-bit x86 ?

2007-11-19 Thread Arjan van de Ven
On Tue, 20 Nov 2007 15:17:15 +1100
Nick Piggin <[EMAIL PROTECTED]> wrote:

> On Tuesday 20 November 2007 15:12, Mark Lord wrote:
> > On 32-bit x86, we have CONFIG_IRQBALANCE available,
> > but not on 64-bit x86.  Why not?

because the in-kernel one is actually quite bad.


> > My QuadCore box works very well in 32-bit mode with IRQBALANCE,
> > but responsiveness sucks bigtime when run in 64-bit mode (no
> > IRQBALANCE) during periods of multiple heavy I/O streams (USB flash
> > drives).

please run the userspace irq balancer, see http://www.irqbalance.org
afaik most distros ship that by default anyway.


> > As near as I can tell, when IRQBALANCE is not configured,
> > all I/O device interrupts go to CPU#0.

that depends on your chipset; some chipsets do worse than that.

>
> > I don't think our CPU scheduler takes that into account when
> > assigning tasks to CPUs, so anything sent to CPU0 runs with very
> > high latencies.
> >
> > Or something like that.
> >
> > Why no IRQ_BALANCE in 64-bit mode ?
> 
> For that matter, I'd like to know why it has been decided that the
> best place for IRQ balancing is in userspace. It should be in kernel
> IMO, and it would probably allow better power saving, performance,
> fairness, etc. if it were to be integrated with the task balancer as
> well.

actually no. IRQ balancing is not a "fast" decision; every time you
move an interrupt around, you end up causing a TON of cache
line bounces, and generally really bad performance (esp if you do it
for networking ones, since you destroy the packet reassembly stuff in
the tcp/ip stack).

Instead, what ends up working is if you do high level categories of
interrupt classes and balance within those (so that no 2 networking
irqs are on the same core/package unless you have more nics than cores)
etc. Balancing on a 10 second scale seems to work quite well; no need
to pull that complexity into the kernel 
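
A rough sketch of the "balance within classes" idea, not the actual irqbalance algorithm: every name here (irq_class, irq_slot, place_irqs) is hypothetical, and the placement is a plain per-class round-robin. Something like this would be re-run on a slow timer (the 10 second scale mentioned above) rather than per interrupt:

```c
#include <assert.h>
#include <stddef.h>

/* Coarse interrupt categories; the set of classes is an assumption. */
enum irq_class { CLASS_NET, CLASS_STORAGE, CLASS_OTHER, CLASS_COUNT };

struct irq_slot {
	int irq;		/* interrupt number */
	enum irq_class cls;	/* coarse category */
	int cpu;		/* filled in by place_irqs() */
};

/* Spread the IRQs of each class round-robin over the CPUs, so two IRQs of
 * the same class share a CPU only when the class has more IRQs than CPUs. */
void place_irqs(struct irq_slot *irqs, size_t n, int ncpus)
{
	int next[CLASS_COUNT];
	size_t i;
	int c;

	assert(ncpus > 0);

	/* Start each class on a different CPU so the first IRQ of every
	 * class does not pile up on CPU 0. */
	for (c = 0; c < CLASS_COUNT; c++)
		next[c] = c % ncpus;

	for (i = 0; i < n; i++) {
		int *slot = &next[irqs[i].cls];

		irqs[i].cpu = *slot;
		*slot = (*slot + 1) % ncpus;
	}
}
```

With three networking IRQs and four CPUs, each networking IRQ lands on its own CPU; a storage IRQ may share a CPU with one of them, which is the trade-off this kind of per-class placement accepts.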

-- 
If you want to reach me at my work email, use [EMAIL PROTECTED]
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org


[PATCH] e100: free IRQ to remove warning when rebooting

2007-11-19 Thread Ian Wienand
Hi,

When rebooting today I got

Will now restart.
ACPI: PCI interrupt for device :00:03.0 disabled
GSI 20 (level, low) -> CPU 1 (0x0100) vector 53 unregistered
Destroying IRQ53 without calling free_irq
WARNING: at 
/home/insecure/ianw/programs/git-kernel/linux-2.6/kernel/irq/chip.c:76 
dynamic_irq_cleanup()

Call Trace:
 [] show_stack+0x40/0xa0
sp=e0407c927b40 bsp=e0407c920eb8
 [] dump_stack+0x30/0x60
sp=e0407c927d10 bsp=e0407c920ea0
 [] dynamic_irq_cleanup+0x160/0x1e0
sp=e0407c927d10 bsp=e0407c920e70
 [] destroy_and_reserve_irq+0x30/0xc0
sp=e0407c927d10 bsp=e0407c920e40
 [] iosapic_unregister_intr+0x5b0/0x5e0
sp=e0407c927d10 bsp=e0407c920dd8
 [] acpi_unregister_gsi+0x30/0x60
sp=e0407c927d10 bsp=e0407c920db8
 [] acpi_pci_irq_disable+0x140/0x160
sp=e0407c927d10 bsp=e0407c920d88
 [] pcibios_disable_device+0xa0/0xc0
sp=e0407c927d20 bsp=e0407c920d68
 [] pci_disable_device+0x130/0x160
sp=e0407c927d20 bsp=e0407c920d38
 [] e100_shutdown+0x1c0/0x220
sp=e0407c927d30 bsp=e0407c920d08
 [] pci_device_shutdown+0x80/0xc0
sp=e0407c927d30 bsp=e0407c920ce8
 [] device_shutdown+0xf0/0x180
sp=e0407c927d30 bsp=e0407c920cc8
 [] kernel_restart+0x60/0x120
sp=e0407c927d30 bsp=e0407c920ca8
 [] sys_reboot+0x3b0/0x480
sp=e0407c927d30 bsp=e0407c920c30
 [] ia64_ret_from_syscall+0x0/0x20
sp=e0407c927e30 bsp=e0407c920c30
 [] ia64_ivt+0x00010620/0x400
sp=e0407c928000 bsp=e0407c920c30
Restarting system.

I think the solution might be to free the IRQ in e100_shutdown() before
pci_disable_device() is called.

Signed-off-by: Ian Wienand <[EMAIL PROTECTED]>

---

 e100.c |1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/e100.c b/drivers/net/e100.c
index 3dbaec6..8ae5ac3 100644
--- a/drivers/net/e100.c
+++ b/drivers/net/e100.c
@@ -2782,6 +2782,7 @@ static void e100_shutdown(struct pci_dev *pdev)
 		pci_enable_wake(pdev, PCI_D3cold, 0);
 	}
 
+	free_irq(pdev->irq, netdev);
 	pci_disable_device(pdev);
 	pci_set_power_state(pdev, PCI_D3hot);
 }


Re: CONFIG_IRQBALANCE for 64-bit x86 ?

2007-11-19 Thread H. Peter Anvin

Nick Piggin wrote:
> On Tuesday 20 November 2007 15:37, Adrian Bunk wrote:
>> On Tue, Nov 20, 2007 at 05:29:29AM +0100, Willy Tarreau wrote:
>>> Agreed. When userspace has something to do with the way IRQs are
>>> delivered, it's going to smell as bad as micro-kernels...
>>
>> The next step to a micro-kernel would then be hardware drivers and file
>> systems in userspace?  ;-)
>
> We already have those. So the next step would be to pretend the
> performance critical ones can be in userspace and remain competitive,
> wouldn't it? ;)


Hey, I have a great idea... we can create a microkernel^W hypervisor and 
make a single process^W domain do all the I/O...


-hpa


Re: [PATCH] Keep UML Kconfig in sync with x86

2007-11-19 Thread WANG Cong
On Mon, Nov 19, 2007 at 02:02:24PM -0500, Jeff Dike wrote:
>Fix a 2.6.24-rc3 UML build breakage introduced by commit
>1032c0ba9da5c5b53173ad2dcf8b2a2da78f8b17 - it introduces X86_32, with
>many things which UML needs depending on it.
>
>This patch adds definitions of X86_32 and RWSEM_XCHGADD_ALGORITHM to
>the UML/i386 Kconfig.
>
>Signed-off-by: Jeff Dike <[EMAIL PROTECTED]>

Tested-by: WANG Cong <[EMAIL PROTECTED]>

Thanks, Jeff. With this and that patch[1], uml building works fine. ;)

[1] http://lkml.org/lkml/2007/11/15/231



BUG: soft lockup detected on CPU#0! in 2.6.23.8

2007-11-19 Thread Erik de Castro Lopo
Hi all,

I've just compiled 2.6.23.8 from kernel.org sources and I'm getting a
bunch of these soft lockups detected. This seems to be similar to
the problems reported here:

http://lkml.org/lkml/2007/11/19/345

but I am not as far as I am aware using the ondemand governor.

> cat /proc/acpi/processor/CPU0/info 
processor id:0
acpi id: 0
bus mastering control:   no
power management:no
throttling control:  no
limit interface: no

What is suspicious is "Clocksource tsc unstable".

Dump from dmesg follows. Any light people can shed on this is much
appreciated.

Cheers,
Erik


[   80.215847] Marking TSC unstable due to: cpufreq changes.
[   80.217903] Time: acpi_pm clocksource has been installed.
[   80.217908] BUG: soft lockup detected on CPU#0!
[   80.217924]  [] softlockup_tick+0xbb/0xf0
[   80.217934]  [] update_process_times+0x32/0x80
[   80.217941]  [] tick_sched_timer+0x75/0x1c0
[   80.217952]  [] hrtimer_interrupt+0x170/0x1c0
[   80.217960]  [] smp_apic_timer_interrupt+0x3d/0x80
[   80.217966]  [] apic_timer_interrupt+0x28/0x30
[   80.217973]  [] pci_pcbios_init+0x18b/0x260
[   80.217979]  [] default_idle+0x2b/0x40
[   80.217984]  [] cpu_idle+0x48/0x80
[   80.217987]  [] start_kernel+0x21f/0x2a0
[   80.217992]  [] unknown_bootoption+0x0/0x1d0
[   80.217997]  ===
[   80.757447] Clocksource tsc unstable (delta = -9062 ns)
[   98.773859] BUG: soft lockup detected on CPU#0!
[   98.773881]  [] softlockup_tick+0xbb/0xf0
[   98.773893]  [] update_process_times+0x32/0x80
[   98.773901]  [] tick_sched_timer+0x75/0x1c0
[   98.773913]  [] hrtimer_interrupt+0x170/0x1c0
[   98.773922]  [] smp_apic_timer_interrupt+0x3d/0x80
[   98.773930]  [] apic_timer_interrupt+0x28/0x30
[   98.773938]  [] hci_sock_setsockopt+0x0/0x1b0
[   98.773947]  ===
[   85.153581] BUG: soft lockup detected on CPU#0!
[   85.153602]  [] softlockup_tick+0xbb/0xf0
[   85.153614]  [] update_process_times+0x32/0x80
[   85.153621]  [] tick_sched_timer+0x75/0x1c0
[   85.153631]  [] hrtimer_interrupt+0x170/0x1c0
[   85.153639]  [] smp_apic_timer_interrupt+0x3d/0x80
[   85.153645]  [] apic_timer_interrupt+0x28/0x30
[   85.153653]  [] hci_sock_setsockopt+0x0/0x1b0
[   85.153660]  ===
[   72.741364] usb 1-1: reset full speed USB device using uhci_hcd and address 2
[   99.596976] BUG: soft lockup detected on CPU#0!
[   99.596995]  [] softlockup_tick+0xbb/0xf0
[   99.597006]  [] update_process_times+0x32/0x80
[   99.597013]  [] tick_sched_timer+0x75/0x1c0
[   99.597024]  [] hrtimer_interrupt+0x170/0x1c0
[   99.597032]  [] smp_apic_timer_interrupt+0x3d/0x80
[   99.597038]  [] apic_timer_interrupt+0x28/0x30
[   99.597045]  [] pci_pcbios_init+0x18b/0x260
[   99.597050]  [] default_idle+0x2b/0x40
[   99.597054]  [] cpu_idle+0x48/0x80
[   99.597057]  [] start_kernel+0x21f/0x2a0
[   99.597063]  [] unknown_bootoption+0x0/0x1d0
[   99.597068]  ===
[  124.441285] BUG: soft lockup detected on CPU#0!
[  124.441305]  [] softlockup_tick+0xbb/0xf0
[  124.441315]  [] update_process_times+0x32/0x80
[  124.441323]  [] tick_sched_timer+0x75/0x1c0
[  124.441336]  [] hrtimer_interrupt+0x170/0x1c0
[  124.441345]  [] smp_apic_timer_interrupt+0x3d/0x80
[  124.441352]  [] apic_timer_interrupt+0x28/0x30
[  124.441360]  [] pci_pcbios_init+0x18b/0x260
[  124.441366]  [] default_idle+0x2b/0x40
[  124.441371]  [] cpu_idle+0x48/0x80
[  124.441375]  [] start_kernel+0x21f/0x2a0
[  124.441381]  [] unknown_bootoption+0x0/0x1d0
[  124.441388]  ===


-- 
-
Erik de Castro Lopo
-
Linux, the UNIX defragmentation tool.


Re: build #337 failed for 2.6.24-rc1-gb1d08ac In function `usbnet_set_settings':

2007-11-19 Thread David Miller
From: Adrian Bunk <[EMAIL PROTECTED]>
Date: Thu, 8 Nov 2007 04:30:10 +0100

> @davem:
> 
> Please look at net/ipv4/arp.c:arp_process()
> 
> Am I right that CONFIG_NET_ETHERNET=n and CONFIG_NETDEV_1000=y or 
> CONFIG_NETDEV_1=y will not be handled correctly there?
> 
> And the best solution is to nuke all #ifdef's in this function and make 
> the code unconditionally available?

I think removing those specific ifdefs in arp_process()
is the best option, yes.


Re: [PATCH 2/2] sata_nv: fix ATAPI issues with memory over 4GB (v3)

2007-11-19 Thread Robert Hancock

Tejun Heo wrote:
> Robert Hancock wrote:
>> Tejun Heo wrote:
>>> Robert Hancock wrote:
>>>> This fixes some problems with ATAPI devices on nForce4 controllers in
>>>> ADMA mode on systems with memory located above 4GB. We need to delay
>>>> setting the 64-bit DMA mask until the PRD table and padding buffer are
>>>> allocated so that they don't get allocated above 4GB and break legacy
>>>> mode (which is needed for ATAPI devices).
>>>>
>>>> Signed-off-by: Robert Hancock <[EMAIL PROTECTED]>
>>>
>>> applied to #tj-upstream-fixes.
>>
>> I have a report that these patches crashed but the previous patch worked:
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=351451
>>
>> So there may still be a problem here.
>
> Any progress?


It looks like the problem is that even though we set the DMA mask after 
we allocate the PRD and pad buffers, when the other port is set up, the 
DMA mask is already 64-bit and so it allocates its buffers over 4GB 
and fails. I think we just need to explicitly set it to 32-bit first; 
I'm getting the reporter to try that one now.
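
The ordering problem can be modelled in userspace. This is only a toy sketch: fake_dev, fake_alloc() and the two port_start_*() helpers are hypothetical and are not the sata_nv code or the kernel DMA API. It assumes both ports share one device-wide mask and that the allocator places buffers as high as the mask permits:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MASK_32 0xffffffffULL
#define MASK_64 0xffffffffffffffffULL

/* One mask shared by both ports of the controller (assumption). */
struct fake_dev {
	uint64_t dma_mask;
};

/* Worst-case allocator: returns an address above 4GB whenever the
 * current mask permits one. */
static uint64_t fake_alloc(const struct fake_dev *d)
{
	return d->dma_mask > MASK_32 ? 0x100000000ULL : 0x10000000ULL;
}

/* Port setup without the explicit reset: allocate, then raise the mask.
 * The second port to run this sees the mask the first port left behind. */
uint64_t port_start_unfixed(struct fake_dev *d)
{
	uint64_t buf;

	assert(d != NULL);
	buf = fake_alloc(d);
	d->dma_mask = MASK_64;
	return buf;
}

/* Port setup with the proposed ordering: force 32-bit, allocate the
 * legacy-mode buffers, and only then raise the mask. */
uint64_t port_start_fixed(struct fake_dev *d)
{
	uint64_t buf;

	assert(d != NULL);
	d->dma_mask = MASK_32;
	buf = fake_alloc(d);
	d->dma_mask = MASK_64;
	return buf;
}
```

In this model the first call to port_start_unfixed() gets a buffer below 4GB and the second one does not, while port_start_fixed() stays below 4GB for every port.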


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/



Re: CONFIG_IRQBALANCE for 64-bit x86 ?

2007-11-19 Thread Nick Piggin
On Tuesday 20 November 2007 15:37, Adrian Bunk wrote:
> On Tue, Nov 20, 2007 at 05:29:29AM +0100, Willy Tarreau wrote:

> > Agreed. When userspace has something to do with the way IRQs are
> > delivered, it's going to smell as bad as micro-kernels...
>
> The next step to a micro-kernel would then be hardware drivers and file
> systems in userspace?  ;-)

We already have those. So the next step would be to pretend the
performance critical ones can be in userspace and remain competitive,
wouldn't it? ;)


Re: [PATCH] net/ipv4/arp.c: Fix arp reply when sender ip 0 (was: Strange behavior in arp probe reply, bug or feature?)

2007-11-19 Thread Bill Fink
On Mon, 19 Nov 2007, Alexey Kuznetsov wrote:

> Hello!
> 
> > Is there a reason that the target hardware address isn't the target
> > hardware address?
> 
> It is bound only to the fact that linux uses protocol address
> of the machine, which responds. It would be highly confusing
> (more than confusing :-)), if we used our protocol address and hardware
> address of requestor.
> 
> But if you use zero protocol address as source, you really can use
> any hw address.
> 
> > The dhcp clients I examined, and the implementation of the arpcheck
> > that I use will compare the target hardware field of the arp-reply and
> > match it against its own mac, to verify the reply. And this fails with
> > the current implementation in the kernel.
> 
> 1. Do not do this. Mainly, because you already know that this does not work
>with linux. :-) Logically, target hw address in arp reply is just
>a nonsensical redundancy, it should not be checked or even looked at.

Repeating what I posted earlier from the ARP RFC 826:

"The target hardware address is included for completeness and
network monitoring.  It has no meaning in the request form,
since it is this number that the machine is requesting.  Its
meaning in the reply form is the address of the machine making
the request.  In some implementations (which do not get to look
at the 14.byte ethernet header, for example) this may save some
register shuffling or stack space by sending this field to the
hardware driver as the hardware destination address of the
packet."

Unless there is some other RFC that supersedes this, which doesn't appear
to be the case since it's also STD37, it appears to me that the current
Linux behavior is wrong.  It clearly states that for the ARP reply, the
target hardware address is "the address of the machine making the request",
and not the address of the machine making the reply as Linux is apparently
doing.
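
A small sketch of the reply construction as RFC 826 describes it (swap the fields, then put the local addresses in the sender fields). The struct layout and the name build_reply() are illustrative assumptions, not the code in net/ipv4/arp.c, and the opcode is kept in host byte order for simplicity:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ARP_OP_REQUEST 1
#define ARP_OP_REPLY   2

/* Address fields of an Ethernet/IPv4 ARP packet (simplified). */
struct arp_eth_ipv4 {
	uint16_t op;		/* request or reply, host order here */
	uint8_t  sha[6];	/* sender hardware address */
	uint8_t  spa[4];	/* sender protocol address */
	uint8_t  tha[6];	/* target hardware address */
	uint8_t  tpa[4];	/* target protocol address */
};

/* Build a reply: the requester's addresses become the target fields and
 * the local addresses become the sender fields. */
void build_reply(const struct arp_eth_ipv4 *req, const uint8_t my_mac[6],
		 struct arp_eth_ipv4 *rep)
{
	assert(req->op == ARP_OP_REQUEST);

	rep->op = ARP_OP_REPLY;
	memcpy(rep->sha, my_mac, 6);	/* we are the sender now */
	memcpy(rep->spa, req->tpa, 4);	/* the address that was asked about */
	memcpy(rep->tha, req->sha, 6);	/* machine that made the request */
	memcpy(rep->tpa, req->spa, 4);	/* stays 0.0.0.0 for a probe */
}
```

For a probe sent with a zero sender protocol address, this yields a reply whose target hardware address is the prober's MAC and whose target protocol address is 0.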

> 2. What's about your suggestion, I thought about this and I am going to agree.
> 
>Arguments, which convinced me are:
> 
>- arping still works.
>- any piece of reasonable software should work.
>- if Windows understands DaD (is it really true? I cannot believe)
>  and it is unhappy about our response and does not block use
>  of duplicate address only due to this, we _must_ accommodate ASAP.
>- if we do, we have to use 0 protocol address, no choice.

I agree the target protocol address should be 0 in this case.

-Bill


[patch] vt: bitlock fix

2007-11-19 Thread Nick Piggin
Don't know who maintains vt.c, but Antonino's name comes up regularly ;)

--
vt is missing a memory barrier to close the critical section. Use a real
spinlock for this.

Signed-off-by: Nick Piggin <[EMAIL PROTECTED]>
---
Index: linux-2.6/drivers/char/vt.c
===
--- linux-2.6.orig/drivers/char/vt.c
+++ linux-2.6/drivers/char/vt.c
@@ -2400,13 +2400,15 @@ static void vt_console_print(struct cons
 {
 	struct vc_data *vc = vc_cons[fg_console].d;
 	unsigned char c;
-	static unsigned long printing;
+	static DEFINE_SPINLOCK(printing_lock);
 	const ushort *start;
 	ushort cnt = 0;
 	ushort myx;
 
 	/* console busy or not yet initialized */
-	if (!printable || test_and_set_bit(0, &printing))
+	if (!printable)
+		return;
+	if (!spin_trylock(&printing_lock))
 		return;
 
 	if (kmsg_redirect && vc_cons_allocated(kmsg_redirect - 1))
@@ -2481,7 +2483,7 @@ static void vt_console_print(struct cons
 	notify_update(vc);
 
 quit:
-	clear_bit(0, &printing);
+	spin_unlock(&printing_lock);
 }
 
 static struct tty_driver *vt_console_device(struct console *c, int *index)
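
For readers outside the kernel, the same trylock shape can be sketched with C11 atomics. This is a userspace analogy under stated assumptions, not the vt.c code: print_trylock(), print_unlock() and guarded_bump() are hypothetical names. The point it illustrates is that the unlock side needs release ordering so that writes made inside the critical section become visible before the flag is seen as clear:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

static atomic_flag printing = ATOMIC_FLAG_INIT;

/* Try to enter the critical section. Acquire ordering on the
 * test-and-set keeps the section's accesses from moving above it. */
bool print_trylock(void)
{
	return !atomic_flag_test_and_set_explicit(&printing,
						  memory_order_acquire);
}

/* Leave the critical section. Release ordering is the barrier that
 * closes the section. */
void print_unlock(void)
{
	atomic_flag_clear_explicit(&printing, memory_order_release);
}

/* Mirrors the early return in the patch: give up instead of waiting
 * when someone else is already inside. */
bool guarded_bump(int *counter)
{
	assert(counter != NULL);

	if (!print_trylock())
		return false;
	(*counter)++;
	print_unlock();
	return true;
}
```

A caller that finds the flag already set simply returns, which matches the "console busy" early exit in vt_console_print().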


Re: [rfc-patch 0/9] Immediate Values for 2.6.24-rc2-git5

2007-11-19 Thread Borislav Petkov
On Mon, Nov 19, 2007 at 10:31:39AM -0500, Mathieu Desnoyers wrote:
> * Borislav Petkov ([EMAIL PROTECTED]) wrote:
> > On Fri, Nov 16, 2007 at 03:02:38PM -0500, Mathieu Desnoyers wrote:
> > Hi, 
> > just a conventions proposal: have you thought of shortening all those
> > "immediate_foo" prefixes to 'imm_foo', for example? This'll make the 
> > code much more readable, i think.
> > 
> 
> Hrm, a quick grep in the kernel tree shows me that the imm_* namespace
> is already quite clobbered (although I do not detect any imm_read or
> imm_set). 

Right, something called "low level driver for the IOMEGA MatchMaker"
(drivers/scsi/imm.c) has gotten hold of the imm_* prefix already so there might
be a problem later, probably. Nevertheless, you could use the imm_ prefix or
choose some other 3-n letter prefix:

immed_* (3 matches)
imme_* (no matches but dumb)
imd_* (none!)
immv_* (none, v like [v]alues), etc.
...

-- 
Regards/Gruß,
Boris.


Re: Is there any word about this bug in gcc ?

2007-11-19 Thread H. Peter Anvin

WANG Cong wrote:
> On Tue, Nov 20, 2007 at 10:13:42AM +0800, zhengyi wrote:
>> Is there any relevance to the kernel ?
>>
>> I found the following code here:
>> http://linux.solidot.org/article.pl?sid=07/11/19/0512218=rss
>>
>> ---
>> int main( void )
>> {
>>  int i=2;
>>  if( -10*abs (i-1) == 10*abs(i-1) )
>>    printf ("OMG,-10==10 in linux!\n");
>>  else
>>    printf ("nothing special here\n") ;
>>
>>  return 0 ;
>> }
>
> I think no. It is considered a bug in abs(), kernel, of course,
> doesn't use glibc's abs().



Wrong.

abs() is internal to gcc, and the above is optimized out at compile 
time, so any user of abs() as a function at all is vulnerable.


However, the Linux kernel defines abs() as a macro:

#define abs(x) ({   \
int __x = (x);  \
(__x < 0) ? -__x : __x; \
})

... which means gcc never sees it.  So the kernel isn't affected, 
because it doesn't use *gcc's* abs().
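
To make that concrete, here is a small sketch that runs the comparison from the test program against a macro of the same shape. The macro is renamed kabs() (a made-up name) so it cannot collide with the C library's abs(), and it relies on the GNU C statement-expression extension, so it assumes gcc or a compatible compiler:

```c
#include <assert.h>

/* Same shape as the kernel macro quoted above, under a hypothetical name. */
#define kabs(x) ({				\
		int __x = (x);			\
		(__x < 0) ? -__x : __x;		\
	})

/* The two sides of the comparison from the test program. */
int lhs(int i) { return -10 * kabs(i - 1); }
int rhs(int i) { return  10 * kabs(i - 1); }

/* Nonzero when both sides agree, which should only happen for i == 1. */
int sides_equal(int i)
{
	return lhs(i) == rhs(i);
}

/* Quick self-check of the macro itself. */
void check_kabs(void)
{
	assert(kabs(-3) == 3);
	assert(kabs(3) == 3);
}
```

Because the macro expands to an ordinary comparison and negation, the compiler has no abs() call to fold, which is the sense in which gcc "never sees it".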


-hpa



Re: [EMAIL PROTECTED] created...

2007-11-19 Thread Greg KH
On Tue, Nov 20, 2007 at 09:41:31AM +0500, Alexander E. Patrakov wrote:
> David Miller wrote:
>> From: Greg KH <[EMAIL PROTECTED]>
>> Date: Mon, 19 Nov 2007 18:29:15 -0800
>>> [EMAIL PROTECTED] would be great to have. 
>> Created, enjoy.
>
> It would be nice to have the archives of this list and the nntp interface 
> on gmane.

I'm sure they will migrate once I post the information to the lists themselves 
:)

thanks,

greg k-h


Re: [Patch] mm/sparse.c: Check the return value of sparse_index_alloc().

2007-11-19 Thread WANG Cong
On Mon, Nov 19, 2007 at 01:17:02PM -0800, Dave Hansen wrote:
>On Thu, 2007-11-15 at 21:54 +0800, WANG Cong wrote:
>> Since sparse_index_alloc() can return NULL on memory allocation failure,
>> we must deal with the failure condition when calling it.
>> 
>> Signed-off-by: WANG Cong <[EMAIL PROTECTED]>
>> Cc: Christoph Lameter <[EMAIL PROTECTED]>
>> Cc: Rik van Riel <[EMAIL PROTECTED]>
>> 
>> ---
>> 
>> diff --git a/Makefile b/Makefile
>> diff --git a/mm/sparse.c b/mm/sparse.c
>> index e06f514..d245e59 100644
>> --- a/mm/sparse.c
>> +++ b/mm/sparse.c
>> @@ -83,6 +83,8 @@ static int __meminit sparse_index_init(unsigned long 
>> section_nr, int nid)
>>  return -EEXIST;
>> 
>>  section = sparse_index_alloc(nid);
>> +if (!section)
>> +return -ENOMEM;
>>  /*
>>   * This lock keeps two different sections from
>>   * reallocating for the same index
>
>Oddly enough, sparse_add_one_section() doesn't seem to like to check
>its allocations.  The usemap is checked, but not freed on error.  If you
>want to fix this up, I think it needs a little more love than just two
>lines.  

Er, right. I missed this point.

>
>Do you want to try to add some actual error handling to
>sparse_add_one_section()?

Yes, I will have a try. And memory_present() also doesn't check it.
More patches around this will come up soon. Since Andrew has already
included the above patch, I won't redo it together with the others.

Andrew, is this OK for you?

Thanks.
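
As a sketch of the kind of error handling discussed above (check every allocation, and free what was already allocated when a later one fails), assuming nothing about the real mm/sparse.c internals: section_sketch, alloc_fn, add_section_sketch() and always_fail() are all hypothetical names, and the sizes are arbitrary:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct section_sketch {
	void *memmap;
	void *usemap;
};

/* Allocator hook, so a caller can inject failures when testing. */
typedef void *(*alloc_fn)(size_t);

/* Allocate both pieces or neither: on failure of the second allocation
 * the first is released and the section is left untouched. */
int add_section_sketch(struct section_sketch *s,
		       alloc_fn alloc_memmap, alloc_fn alloc_usemap)
{
	void *memmap, *usemap;

	assert(s != NULL);

	memmap = alloc_memmap(64);
	if (!memmap)
		return -ENOMEM;

	usemap = alloc_usemap(16);
	if (!usemap) {
		free(memmap);	/* do not leak the first allocation */
		return -ENOMEM;
	}

	s->memmap = memmap;
	s->usemap = usemap;
	return 0;
}

/* Failure injector for the sketch. */
void *always_fail(size_t n)
{
	(void)n;
	return NULL;
}
```

Passing always_fail for either hook exercises both error paths without needing real memory pressure.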





Re: [EMAIL PROTECTED] created...

2007-11-19 Thread Greg KH
On Mon, Nov 19, 2007 at 07:13:34PM -0800, David Miller wrote:
> From: Greg KH <[EMAIL PROTECTED]>
> Date: Mon, 19 Nov 2007 19:12:32 -0800
> 
> > Actually, if we are going to stick with this new list, can we just call
> > it "[EMAIL PROTECTED]" instead of the "-devel" stuff?
> 
> Done.

Great, thanks so much for this.

greg k-h
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] Add packet filtering based on process's security context.

2007-11-19 Thread Tetsuo Handa
This patch allows LSM modules to filter incoming connections/datagrams
based on the security context of the process that is attempting to pick
them up.

There are already hooks to filter incoming connections/datagrams
based on the socket's security context, but these hooks are not
applicable when one wants to do TCP Wrapper-like filtering
(e.g. App1 is permitted to accept TCP connections from 192.168.0.0/16).



There is a side effect, which is unlikely to occur.

If a socket is shared by multiple processes with different policies,
a process that should be able to accept this connection
will not be able to accept it
because socket_post_accept() aborts the connection.
But if socket_post_accept() doesn't abort the connection,
a process that must not be able to accept it
will repeat accept() forever, which is a worse side effect.

Similarly, if a socket is shared by multiple processes with different policies,
a process that should be able to pick up this datagram
will not be able to pick it up
because socket_post_recv_datagram() discards the datagram.
But if socket_post_recv_datagram() doesn't discard the datagram,
a process that must not be able to pick it up
will repeat recvmsg() forever, which is a worse side effect.

Signed-off-by: Kentaro Takeda <[EMAIL PROTECTED]>
Signed-off-by: Tetsuo Handa <[EMAIL PROTECTED]>

 include/linux/security.h |   34 +-
 net/core/datagram.c  |   26 --
 net/socket.c |7 +--
 security/dummy.c |   13 ++---
 security/security.c  |   10 --
 5 files changed, 76 insertions(+), 14 deletions(-)

--- linux-2.6.24-rc2-mm1.orig/include/linux/security.h
+++ linux-2.6.24-rc2-mm1/include/linux/security.h
@@ -778,8 +778,12 @@ struct request_sock;
  * @socket_post_accept:
  * This hook allows a security module to copy security
  * information into the newly created socket's inode.
+ * This hook also allows a security module to filter connections
+ * from unwanted peers based on the process accepting this connection.
+ * The connection will be aborted if this hook returns nonzero.
  * @sock contains the listening socket structure.
  * @newsock contains the newly created server socket for connection.
+ * Return 0 if permission is granted.
  * @socket_sendmsg:
  * Check permission before transmitting a message to another socket.
  * @sock contains the socket structure.
@@ -793,6 +797,15 @@ struct request_sock;
  * @size contains the size of message structure.
  * @flags contains the operational flags.
  * Return 0 if permission is granted.  
+ * @socket_post_recv_datagram:
+ * Check permission after receiving a datagram.
+ * This hook allows a security module to filter packets
+ * from unwanted peers based on the process receiving this datagram.
+ * The packet will be discarded if this hook returns nonzero.
+ * @sk contains the socket.
+ * @skb contains the socket buffer (may be NULL).
+ * @flags contains the operational flags.
+ * Return 0 if permission is granted.
  * @socket_getsockname:
  * Check permission before the local address (name) of the socket object
  * @sock is retrieved.
@@ -1384,12 +1397,13 @@ struct security_operations {
   struct sockaddr * address, int addrlen);
int (*socket_listen) (struct socket * sock, int backlog);
int (*socket_accept) (struct socket * sock, struct socket * newsock);
-   void (*socket_post_accept) (struct socket * sock,
-   struct socket * newsock);
+   int (*socket_post_accept) (struct socket *sock, struct socket *newsock);
int (*socket_sendmsg) (struct socket * sock,
   struct msghdr * msg, int size);
int (*socket_recvmsg) (struct socket * sock,
   struct msghdr * msg, int size, int flags);
+   int (*socket_post_recv_datagram) (struct sock *sk, struct sk_buff *skb,
+ unsigned int flags);
int (*socket_getsockname) (struct socket * sock);
int (*socket_getpeername) (struct socket * sock);
int (*socket_getsockopt) (struct socket * sock, int level, int optname);
@@ -2294,10 +2308,12 @@ int security_socket_bind(struct socket *
 int security_socket_connect(struct socket *sock, struct sockaddr *address, int 
addrlen);
 int security_socket_listen(struct socket *sock, int backlog);
 int security_socket_accept(struct socket *sock, struct socket *newsock);
-void security_socket_post_accept(struct socket *sock, struct socket *newsock);
+int security_socket_post_accept(struct socket *sock, struct socket *newsock);
 int security_socket_sendmsg(struct socket *sock, struct msghdr *msg, int size);
 int security_socket_recvmsg(struct socket *sock, struct msghdr *msg,
int size, int flags);
+int 

Re: [EMAIL PROTECTED] created...

2007-11-19 Thread Alexander E. Patrakov

David Miller wrote:
> From: Greg KH <[EMAIL PROTECTED]>
> Date: Mon, 19 Nov 2007 18:29:15 -0800
>
>> [EMAIL PROTECTED] would be great to have.
>
> Created, enjoy.


It would be nice to have the archives of this list and the nntp interface on 
gmane.

--
Alexander E. Patrakov


Re: CONFIG_IRQBALANCE for 64-bit x86 ?

2007-11-19 Thread Adrian Bunk
On Tue, Nov 20, 2007 at 05:29:29AM +0100, Willy Tarreau wrote:
> On Tue, Nov 20, 2007 at 03:17:15PM +1100, Nick Piggin wrote:
> > On Tuesday 20 November 2007 15:12, Mark Lord wrote:
> > > On 32-bit x86, we have CONFIG_IRQBALANCE available,
> > > but not on 64-bit x86.  Why not?
> > >
> > > I ask, because this feature seems almost essential to obtaining
> > > reasonable latencies during heavy I/O with fast devices.
> > >
> > > My 32-bit Core2Duo MythTV box drops audio frames without it,
> > > but works perfectly *with* IRQBALANCE.
> > >
> > > My QuadCore box works very well in 32-bit mode with IRQBALANCE,
> > > but responsiveness sucks bigtime when run in 64-bit mode (no IRQBALANCE)
> > > during periods of multiple heavy I/O streams (USB flash drives).
> > >
> > > That's with both the 32 and 64 bit versions of Kubuntu Gutsy,
> > > so the software uses pretty much identical versions either way.
> > >
> > > As near as I can tell, when IRQBALANCE is not configured,
> > > all I/O device interrupts go to CPU#0.
> > >
> > > I don't think our CPU scheduler takes that into account when assigning
> > > tasks to CPUs, so anything sent to CPU0 runs with very high latencies.
> > >
> > > Or something like that.
> > >
> > > Why no IRQ_BALANCE in 64-bit mode ?
> > 
> > For that matter, I'd like to know why it has been decided that the
> > best place for IRQ balancing is in userspace. It should be in kernel
> > IMO, and it would probably allow better power saving, performance,
> > fairness, etc. if it were to be integrated with the task balancer as
> > well.
> 
> Agreed. When userspace has something to do with the way IRQs are
> delivered, it's going to smell as bad as micro-kernels...

The next step to a micro-kernel would then be hardware drivers and file 
systems in userspace?  ;-)

> Willy

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed



Re: CONFIG_IRQBALANCE for 64-bit x86 ?

2007-11-19 Thread Willy Tarreau
On Tue, Nov 20, 2007 at 03:17:15PM +1100, Nick Piggin wrote:
> On Tuesday 20 November 2007 15:12, Mark Lord wrote:
> > On 32-bit x86, we have CONFIG_IRQBALANCE available,
> > but not on 64-bit x86.  Why not?
> >
> > I ask, because this feature seems almost essential to obtaining
> > reasonable latencies during heavy I/O with fast devices.
> >
> > My 32-bit Core2Duo MythTV box drops audio frames without it,
> > but works perfectly *with* IRQBALANCE.
> >
> > My QuadCore box works very well in 32-bit mode with IRQBALANCE,
> > but responsiveness sucks bigtime when run in 64-bit mode (no IRQBALANCE)
> > during periods of multiple heavy I/O streams (USB flash drives).
> >
> > That's with both the 32 and 64 bit versions of Kubuntu Gutsy,
> > so the software uses pretty much identical versions either way.
> >
> > As near as I can tell, when IRQBALANCE is not configured,
> > all I/O device interrupts go to CPU#0.
> >
> > I don't think our CPU scheduler takes that into account when assigning
> > tasks to CPUs, so anything sent to CPU0 runs with very high latencies.
> >
> > Or something like that.
> >
> > Why no IRQ_BALANCE in 64-bit mode ?
> 
> For that matter, I'd like to know why it has been decided that the
> best place for IRQ balancing is in userspace. It should be in kernel
> IMO, and it would probably allow better power saving, performance,
> fairness, etc. if it were to be integrated with the task balancer as
> well.

Agreed. When userspace has something to do with the way IRQs are
delivered, it's going to smell as bad as micro-kernels...

Willy



Re: [PATCHv3 0/4] sys_indirect system call

2007-11-19 Thread H. Peter Anvin

Ulrich Drepper wrote:


> But I still don't see that the magic encoding is a valid solution, it
> doesn't address the limited parameter number.  Plus, using sys_indirect
> could in future be used to transport entire parameters (like a sigset_t)
> along with other information, thereby saving individual copy operations.



The limited number of parameters is a non-issue; we already have a
convention for that: for more than 6 parameters, pass a pointer to
arguments 6 and higher in the register normally used for parameter 6.


Now, for the specific case of x86-64 (as well as some of the RISC 
architectures), this meshes poorly with the C calling convention, which 
is that parameter 7+ are passed on the stack.  We would obtain a more 
efficient calling convention by allocating an additional register for 
such an indirect pointer and/or adopt the convention that the additional 
parameters are simply stored on the user stack starting at a specific 
offset, presumably +8.


I would really like to see a systematic calling convention that doesn't 
have limits that we have already broken several times.  In particular, I 
would like to see a convention that can be mapped 1:1 onto the platform 
C calling convention in a syscall-independent way.  We *almost* have 
this today, but sys_indirect would blow that out of the water in a 
particularly ugly way.(*)



> I think the sys_indirect approach is the way forward.  I'll submit a
> last version of the patch in a bit.


I think it is a horrible kluge.  It's yet another multiplexer, which we 
are trying desperately to avoid in the kernel.  Just to make things more 
painful, it is a multiplexer which creates yet another ad hoc calling 
convention, whereas we should strive to make the kernel calling 
convention as uniform as possible.


If it is NOT going to be used in the common case, then it really doesn't 
matter -- we can just use manually or automatically generated thunks. 
It's not like we have thousands of system calls, and it's not like "a 
system call" is a precious thing.


If it IS going to be used in the common case, then we should use 
something more streamlined.


-hpa



Re: CONFIG_IRQBALANCE for 64-bit x86 ?

2007-11-19 Thread Nick Piggin
On Tuesday 20 November 2007 15:12, Mark Lord wrote:
> On 32-bit x86, we have CONFIG_IRQBALANCE available,
> but not on 64-bit x86.  Why not?
>
> I ask, because this feature seems almost essential to obtaining
> reasonable latencies during heavy I/O with fast devices.
>
> My 32-bit Core2Duo MythTV box drops audio frames without it,
> but works perfectly *with* IRQBALANCE.
>
> My QuadCore box works very well in 32-bit mode with IRQBALANCE,
> but responsiveness sucks bigtime when run in 64-bit mode (no IRQBALANCE)
> during periods of multiple heavy I/O streams (USB flash drives).
>
> That's with both the 32 and 64 bit versions of Kubuntu Gutsy,
> so the software uses pretty much identical versions either way.
>
> As near as I can tell, when IRQBALANCE is not configured,
> all I/O device interrupts go to CPU#0.
>
> I don't think our CPU scheduler takes that into account when assigning
> tasks to CPUs, so anything sent to CPU0 runs with very high latencies.
>
> Or something like that.
>
> Why no IRQ_BALANCE in 64-bit mode ?

For that matter, I'd like to know why it has been decided that the
best place for IRQ balancing is in userspace. It should be in kernel
IMO, and it would probably allow better power saving, performance,
fairness, etc. if it were to be integrated with the task balancer as
well.


Re: Is there any word about this bug in gcc ?

2007-11-19 Thread WANG Cong
On Tue, Nov 20, 2007 at 10:13:42AM +0800, zhengyi wrote:
>Is there any relevance to the kernel ?
>
>I found the following code here:
>http://linux.solidot.org/article.pl?sid=07/11/19/0512218=rss
>
>---
>#include <stdio.h>
>#include <stdlib.h>
>
>int main( void )
>{
>  int i=2;
>  if( -10*abs (i-1) == 10*abs(i-1) )
>printf ("OMG,-10==10 in linux!\n");
>  else
>printf ("nothing special here\n") ;
>
>  return 0 ;
>}

I think not. It is considered a bug in abs(); the kernel, of course,
doesn't use glibc's abs().

Regards.



Re: CONFIG_IRQBALANCE for 64-bit x86 ?

2007-11-19 Thread Ismail Dönmez
Written on Tuesday 20 November 2007 at 06:12:21:
> On 32-bit x86, we have CONFIG_IRQBALANCE available,
> but not on 64-bit x86.  Why not?
>
> I ask, because this feature seems almost essential to obtaining
> reasonable latencies during heavy I/O with fast devices.
>
> My 32-bit Core2Duo MythTV box drops audio frames without it,
> but works perfectly *with* IRQBALANCE.
>
> My QuadCore box works very well in 32-bit mode with IRQBALANCE,
> but responsiveness sucks bigtime when run in 64-bit mode (no IRQBALANCE)
> during periods of multiple heavy I/O streams (USB flash drives).
>
> That's with both the 32 and 64 bit versions of Kubuntu Gutsy,
> so the software uses pretty much identical versions either way.
>
> As near as I can tell, when IRQBALANCE is not configured,
> all I/O device interrupts go to CPU#0.
>
> I don't think our CPU scheduler takes that into account when assigning
> tasks to CPUs, so anything sent to CPU0 runs with very high latencies.
>
> Or something like that.
>
> Why no IRQ_BALANCE in 64-bit mode ?

Have you tried running irqbalance in userspace? Check out
http://irqbalance.org/ . AFAIK CONFIG_IRQBALANCE is deprecated and eats
battery power.

Regards,
ismail

-- 
Faith is believing what you know isn't so -- Mark Twain


Re: [BUG?] OOM with large cache....(x86_64, 2.6.24-rc3-git1, nohz)

2007-11-19 Thread Nick Piggin
On Tuesday 20 November 2007 11:59, Ian Kumlien wrote:
> Hi,
>
> I have had this before and sent a mail about it.
>
> It seems like the diskcache is still in use and is never shrunk. This
> happened with an odd load though, trackerd started indexing a bit late
> and the other workload which is a large bittorrent seed/download.
>
> The bittorrent app is the one that drives up the diskcache.
>
> I don't think that trackerd was triggering it, I actually upgraded
> kernel since it kept happening on 2.6.23...
>
> I really don't know what other information I can provide.
>
> free from now (some hours later)
> vmstat from now ^
>
> and the dmesg log.
>
> Ideas? Comments?
>
> free:
>              total       used       free     shared    buffers     cached
> Mem:       2056484    2039736      16748          0      20776    1585408
> -/+ buffers/cache:     433552    1622932
> Swap:      2530180     426020    2104160
> ---
>
> vmstat:
> procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
>  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
>  0  0 426020  16612  20580 1585848   26   21   68456     34   51  5  3 88  4
> ---
>
> --- 8<--- 8<---
> ntpd invoked oom-killer: gfp_mask=0x1201d2, order=0, oomkilladj=0
>
> Call Trace:
>  [] oom_kill_process+0xf6/0x110
>  [] out_of_memory+0x1b6/0x200
>  [] __alloc_pages+0x387/0x3c0
>  [] __do_page_cache_readahead+0x103/0x260
>  [] filemap_fault+0x2f1/0x420
>  [] __do_fault+0x6b/0x410
>  [] recalc_sigpending+0xe/0x40
>  [] handle_mm_fault+0x1bd/0x7a0
>  [] save_i387+0x9a/0xe0
>  [] do_page_fault+0x176/0x790
>  [] sys_rt_sigreturn+0x35f/0x400
>  [] error_exit+0x0/0x51
>
> Mem-info:
> DMA per-cpu:
> CPU0: Hot: hi:0, btch:   1 usd:   0   Cold: hi:0, btch:   1 usd:   0
> CPU1: Hot: hi:0, btch:   1 usd:   0   Cold: hi:0, btch:   1 usd:   0
> DMA32 per-cpu:
> CPU0: Hot: hi:  186, btch:  31 usd: 148   Cold: hi:   62, btch:  15 usd:  60
> CPU1: Hot: hi:  186, btch:  31 usd: 116   Cold: hi:   62, btch:  15 usd:  18
> Active:241172 inactive:241825 dirty:0 writeback:0 unstable:0
>  free:3388 slab:8095 mapped:149 pagetables:6263 bounce:0
> DMA free:7908kB min:20kB low:24kB high:28kB active:0kB inactive:0kB present:7436kB pages_scanned:0 all_unreclaimable? yes
> lowmem_reserve[]: 0 2003 2003 2003
> DMA32 free:5644kB min:5716kB low:7144kB high:8572kB active:964688kB inactive:967188kB present:2052008kB pages_scanned:5519125 all_unreclaimable? yes
> lowmem_reserve[]: 0 0 0 0
> DMA: 5*4kB 4*8kB 3*16kB 4*32kB 6*64kB 5*128kB 4*256kB 3*512kB 0*1024kB 0*2048kB 1*4096kB = 7908kB
> DMA32: 95*4kB 2*8kB 0*16kB 0*32kB 0*64kB 1*128kB 0*256kB 2*512kB 0*1024kB 0*2048kB 1*4096kB = 5644kB
> Swap cache: add 1979600, delete 1979592, find 144656/307405, race 1+17
> Free swap  = 0kB
> Total swap = 2530180kB
> Free swap:0kB
> 524208 pages of RAM
> 10149 reserved pages
> 5059 pages shared
> 8 pages swap cached
> Out of memory: kill process 8421 (trackerd) score 1016524 or a child
> Killed process 8421 (trackerd)

It's also used up all your 2.5GB of swap. The output of your `free` shows
a fair bit of disk cache there, but it also shows a lot of swap free, which
isn't the case at oom-time.

Unfortunately, we don't show NR_ANON_PAGES in these stats, but at a guess,
I'd say that the file cache is mostly shrunk and you still don't have
enough memory. trackerd probably has a memory leak in it, or else is just
trying to allocate more memory than you have. Is this a regression?


CONFIG_IRQBALANCE for 64-bit x86 ?

2007-11-19 Thread Mark Lord

On 32-bit x86, we have CONFIG_IRQBALANCE available,
but not on 64-bit x86.  Why not?

I ask, because this feature seems almost essential to obtaining
reasonable latencies during heavy I/O with fast devices.

My 32-bit Core2Duo MythTV box drops audio frames without it,
but works perfectly *with* IRQBALANCE.

My QuadCore box works very well in 32-bit mode with IRQBALANCE,
but responsiveness sucks bigtime when run in 64-bit mode (no IRQBALANCE)
during periods of multiple heavy I/O streams (USB flash drives).

That's with both the 32 and 64 bit versions of Kubuntu Gutsy,
so the software uses pretty much identical versions either way.

As near as I can tell, when IRQBALANCE is not configured,
all I/O device interrupts go to CPU#0.

I don't think our CPU scheduler takes that into account when assigning
tasks to CPUs, so anything sent to CPU0 runs with very high latencies.

Or something like that.

Why no IRQ_BALANCE in 64-bit mode ?


Re: [v4l-dvb-maintainer] [PATCH 25/59] drivers/media/video: Add missing "space"

2007-11-19 Thread Joe Perches
On Mon, 2007-11-19 at 18:24 -0800, Brandon Philips wrote:
> On 17:48 Mon 19 Nov 2007, Joe Perches wrote:
> > v4l_dbg(1, cx25840_debug, client, "hblank %i, hactive %i, "
> > -   "vblank %i , vactive %i, vblank656 %i, src_dec %i,"
> > +   "vblank %i , vactive %i, vblank656 %i, src_dec %i, "
>   ^
>   can you remove that unintended

diff --git a/drivers/media/video/cx25840/cx25840-vbi.c b/drivers/media/video/cx25840/cx25840-vbi.c
index ced13fe..d2949e5 100644
--- a/drivers/media/video/cx25840/cx25840-vbi.c
+++ b/drivers/media/video/cx25840/cx25840-vbi.c
@@ -180,7 +180,7 @@ void cx25840_vbi_setup(struct i2c_client *client)
fsc/100,fsc%100);
 
v4l_dbg(1, cx25840_debug, client, "hblank %i, hactive %i, "
-   "vblank %i , vactive %i, vblank656 %i, src_dec %i,"
+   "vblank %i, vactive %i, vblank656 %i, src_dec %i, "
"burst 0x%02x, luma_lpf %i, uv_lpf %i, comb 0x%02x,"
" sc 0x%06x\n",
hblank, hactive, vblank, vactive, vblank656,





Re: [rfc 03/45] Generic CPU operations: Core piece

2007-11-19 Thread Mathieu Desnoyers
* Christoph Lameter ([EMAIL PROTECTED]) wrote:
> On Mon, 19 Nov 2007, Mathieu Desnoyers wrote:
> 
> > 
> > Very interesting patch! I did not expect we could mix local atomic ops
> > with per CPU offsets in an atomic manner.. brilliant :)
> > 
> > Some nitpicking follows...
> 
> Well this is a draft so I was not that thorough. The beast is getting too 
> big. It would be good if I could get the first patches merged that just 
> deal with the two allocators and then gradually work the rest.
> 
> > I think you could use extra () around old, new etc.. ?
> 
> Right.
> 
> > Same here.
> > 
> > > + (x);\
> > 
> > () seems unneeded here, since x is local.
> 
> But (x) is returned to the "caller" of the macro so it should be specially
> marked.
> 

I don't think that it really matters.. the preprocessor already wraps
all the ({ }) in a single statement, doesn't it ?


Grepping for usage of ({ in include/linux shows that the return value is
never surrounded by supplementary ().

> > > + * In that case we can simply disable preemption which
> > > + * may be free if the kernel is compiled without preemption.
> > > + */
> > > +
> > > +#define _CPU_READ(addr)  \
> > > +({   \
> > > + (__CPU_READ(addr)); \
> > > +})
> > 
> > ({ }) seems to be unneeded here.
> 
> Hmmm I wanted a consistent style.
> 

Since checkpatch.pl emits a warning when a one liner if() uses brackets,
I guess compactness of code is preferred to a consistent style.

Just my 2 cents though :)

Mathieu

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


Re: [PATCH] ext4: dir inode reservation V3

2007-11-19 Thread Coly Li
Thanks for the feedback :-)

Mingming Cao wrote:
> On Tue, 2007-11-13 at 22:12 +0800, Coly Li wrote:
>> Basic idea of my dir inode reservation patch can be found here,
>> http://lists.openwall.net/linux-ext4/2007/11/05/3
>>
>> 1, What does dir inode reservation do
>> Dir inode reservation tries to reserve several inodes in the inode table for
>> a directory when the directory is created. When a new file is created under
>> this directory, the allocator tries to take its inode from the reserved
>> inode area. This is called the dir_ireserve inode allocator.
>>
> Thanks for the update.
> 
> Let me try to understand your method:
> 
> So the basic idea is not do linear inode allocation for directory? Inode
> structure block for directory file is only coming from block 0, N, N
> +N,... where the number of skipped blocks N is stored in the in-core
> superblock structure. 

N is not stored in the in-core superblock. N = s_dir_ireserve_nr /
inodes_per_block. What is stored in the in-core superblock is the number of
inodes to be reserved for each directory.

> 
> When ever need to allocate an inode for directory, skip N reserved bits
> (space for N*16 inodes) if the previous block is already allocated. That
> way place two directories with the hole of N*16 inodes structures, then
> allow files under the first directory stay closer with their parent
> directory. Is this correct?

The hole is (s_dir_ireserve_nr - 1), not N * s_dir_ireserve_nr. Because the
directory inode also uses an inode slot from the reserved area, the number of
slots left for files is (s_dir_ireserve_nr - 1).
Except for the number of reserved inodes, your understanding exactly matches my
idea.

>  
> 
>> 4, Dir inode reservation is optional
>> Dir inode reservation is optional. Use -o followed by one of these options
>> to enable dir inode reservation when mounting an ext4 file system:
>>  dir_ireserve=low
>>  dir_ireserve=normal
>>  dir_ireserve=high
> 
> Would be nice to pass the tuning info low/normal/high(16/64/128 blocks
> correspondingly) via something else rather than mount options. 

Sure, I agree with you. I am also wondering whether this patch should let the
user enter the number of reserved inodes directly rather than low/normal/high.
I am also looking for ways to display the tuning info more conveniently to
users.

>  
>> Currently, 'low' reserves 15 file inodes for each directory, 'normal'
>> reserves 31 inodes and 'high' reserves 127 inodes. Reserving more than 127
>> inodes does not obviously help performance.
>>
>>
>> 5, Performance number
>> On a Core-Duo, 2MB DDM memory, 7200 RPM SATA PC, I built a 50GB ext4
>> partition, and tried to create 5 directories, and create 15 (1KB) files in
>> each directory alternatively. After a remount, I tried to remove all the
>> directories and files recursively by a 'rm -rf'. Below is the benchmark
>> result,
>>                  normal ext4           ext4 with dir inode reservation
>>  mount options:  -o data=writeback     -o data=writeback,dir_ireserve=low
>>  Create dirs:    real 0m49.101s        real 2m59.703s
>>  Create files:   real 24m17.962s       real 21m8.161s
>>  Unlink all:     real 24m43.788s       real 17m29.862s
>> Creating dirs with dir inode reservation is slower than normal ext4 as
>> predicted, because allocating directory inodes in non-linear order will
>> cause extra hard disk seeking and block I/O.
> 
> Hmm...I suspect there is bug in your patch, the extra seek should not
> contribute to 4 times slower

I agree with you :-)

> 
>>  #include 
>> @@ -478,6 +480,75 @@ static int find_group_other(struct super_block *sb, struct inode *parent,
>>  return -1;
>>  }
>>
>> +static int ext4_ino_from_ireserve(handle_t *handle, struct inode *dir,
>> +  int mode, ext4_group_t *group, unsigned long *ino)
>> +{
>> +struct super_block *sb;
>> +struct ext4_sb_info *sbi;
>> +struct ext4_group_desc *gdp = NULL;
>> +struct buffer_head *gdp_bh = NULL, *bitmap_bh = NULL;
>> +ext4_group_t ires_group = *group;
>> +unsigned long ires_ino;
>> +int i, bit;
>> +
>> +sb = dir->i_sb;
>> +sbi = EXT4_SB(sb);
>> +
>> +/* if the inode number is not for directory,
>> + * only try to allocate after directory's inode
>> + */
>> +if (!S_ISDIR(mode)) {
>> +*ino = dir->i_ino % EXT4_INODES_PER_GROUP(sb);
>> +return 0;
>> +}
>> +
>> +/* reserve inodes for new directory */
>> +for (i = 0; i < sbi->s_groups_count; i++) {
>> +gdp = ext4_get_group_desc(sb, ires_group, &gdp_bh);
>> +if (!gdp)
>> +goto fail;
>> +bit = 0;
>> +try_same_group:
>> +if (bit < EXT4_INODES_PER_GROUP(sb)) {
>> +brelse(bitmap_bh);
>> +bitmap_bh = read_inode_bitmap(sb, ires_group);
>> +if (!bitmap_bh)
>> +

Re: [rfc 00/45] [RFC] CPU ops and a rework of per cpu data handling on x86_64

2007-11-19 Thread David Miller
From: Andi Kleen <[EMAIL PROTECTED]>
Date: Tue, 20 Nov 2007 04:25:34 +0100

> 
> > Although we have a per-cpu area base in a fixed global register
> > for addressing, the above isn't beneficial on sparc64 because
> > the atomic is much slower than doing a:
> >
> > local_irq_disable();
> > nonatomic_percpu_memory_op();
> > local_irq_enable();
> 
> Again might be pointing out the obvious, but you 
> need of course save_flags()/restore_flags(), not disable/enable().

Right, but the cost is the same for that on sparc64 unlike
x86 et al.


Re: [linux-usb-devel] [EMAIL PROTECTED] created...

2007-11-19 Thread David Miller
From: David Brownell <[EMAIL PROTECTED]>
Date: Mon, 19 Nov 2007 19:26:02 -0800

> On Monday 19 November 2007, David Miller wrote:
> > From: Greg KH <[EMAIL PROTECTED]>
> > Date: Mon, 19 Nov 2007 19:12:32 -0800
> > 
> > > Actually, if we are going to stick with this new list, can we just call
> > > it "[EMAIL PROTECTED]" instead of the "-devel" stuff?
> > 
> > Done.
> 
> Subscribe/unsubscribe ... how?

Just like any other list at vger.kernel.org:

http://vger.kernel.org/majordomo-info.html

or the quick version:

bash$ echo "subscribe linux-usb" | mail [EMAIL PROTECTED]


Re: [rfc 08/45] cpu alloc: x86 support

2007-11-19 Thread Nick Piggin
On Tuesday 20 November 2007 13:02, Christoph Lameter wrote:
> On Mon, 19 Nov 2007, H. Peter Anvin wrote:
> > You're making the assumption here that NUMA = large number of CPUs. This
> > assumption is flat-out wrong.
>
> Well maybe. Usually one gets to NUMA because the hardware gets too big to
> be handled the UMA way.
>
> > On x86-64, most two-socket systems are still NUMA, and I would expect
> > that most distro kernels probably compile in NUMA.  However,
> > burning megabytes of memory on a two-socket dual-core system when we're
> > talking about tens of kilobytes used would be more than a wee bit insane.
>
> Yeah yea but the latencies are minimal making the NUMA logic too expensive
> for most loads ... If you put a NUMA kernel onto those then performance
> drops (I think someone measures 15-30%?)

Small socket count systems are going to increasingly be NUMA in future.
If CONFIG_NUMA hurts performance by that much on those systems, then the
kernel is broken IMO.


Re: [rfc 08/45] cpu alloc: x86 support

2007-11-19 Thread Christoph Lameter
On Tue, 20 Nov 2007, Andi Kleen wrote:

> I might be pointing out the obvious, but on x86-64 there is definitely not 
> 256TB of VM available for this.

Well maybe in the future.

One of the issues that I ran into is that I had to place the cpu area
in between to make the offsets link right.

However, it would be best if the cpuarea came *after* the modules area. We 
only need linking that covers the per cpu area of processor 0.

So I think we have a 2GB area right?

1GB kernel
1GB - 1x per cpu area (128M?) modules?
cpu area 0
 2GB limit
cpu area 1 
cpu area 2


For that we would need to move the kernel down a bit. Can we do that?



Re: [PATCH] radix_tree.h trivial comment correction

2007-11-19 Thread Nick Piggin
On Mon, Nov 19, 2007 at 11:17:48AM -0800, Tim Pepper wrote:
> There is an unmatched parenthesis in the locking commentary of radix_tree.h
> which is trivially fixed by the patch below.
> 
> Signed-off-by: Tim Pepper <[EMAIL PROTECTED]>
> Cc: Nick Piggin <[EMAIL PROTECTED]>

Acked-by: Nick Piggin <[EMAIL PROTECTED]>

> 
> ---
> 
> diff --git a/include/linux/radix-tree.h b/include/linux/radix-tree.h
> --- a/include/linux/radix-tree.h
> +++ b/include/linux/radix-tree.h
> @@ -91,7 +91,7 @@ do { \
>   *
>   * For API usage, in general,
>   * - any function _modifying_ the tree or tags (inserting or deleting
> - *   items, setting or clearing tags must exclude other modifications, and
> + *   items, setting or clearing tags) must exclude other modifications, and
>   *   exclude any functions reading the tree.
>   * - any function _reading_ the tree or tags (looking up items or tags,
>   *   gang lookups) must exclude modifications to the tree, but may occur


Re: wrong NUMA detection on HP385 G2

2007-11-19 Thread Andi Kleen
Pavel Krauz <[EMAIL PROTECTED]> writes:

> Hello
> my HP 385 G2 - 2x dual core Opteron 2216 running 2.6.23.1 with NUMA support 
> says the following:

Can you post a full boot log? 
-Andi


Re: [rfc 08/45] cpu alloc: x86 support

2007-11-19 Thread Nick Piggin
On Tuesday 20 November 2007 13:02, Christoph Lameter wrote:
> On Mon, 19 Nov 2007, H. Peter Anvin wrote:
> > You're making the assumption here that NUMA = large number of CPUs. This
> > assumption is flat-out wrong.
>
> Well maybe. Usually one gets to NUMA because the hardware gets too big to
> be handled the UMA way.

Not the way things are going with multicore and multithread, though
(that is, the hardware can be one socket and still have many cpus).

The chip might have several memory controllers on it, but they could
well be connected to the caches with a crossbar, so it needn't be
NUMA at all. Future scalability work shouldn't rely on many cores
~= many nodes, IMO.


Re: [rfc 00/45] [RFC] CPU ops and a rework of per cpu data handling on x86_64

2007-11-19 Thread Christoph Lameter
On Tue, 20 Nov 2007, Andi Kleen wrote:

> 
> > Although we have a per-cpu area base in a fixed global register
> > for addressing, the above isn't beneficial on sparc64 because
> > the atomic is much slower than doing a:
> >
> > local_irq_disable();
> > nonatomic_percpu_memory_op();
> > local_irq_enable();
> 
> Again might be pointing out the obvious, but you 
> need of course save_flags()/restore_flags(), not disable/enable().
> 
> If it was just disable/enable x86 could do it much faster too
> and Christoph probably would never have felt the need to approach
> this project for his SLUB fast path.

I already have no need for that anymore with the material now in Andrew's
tree. However, this cuts out another 6 cycles from the fastpath and I
found that the same principles reduce overhead all over the kernel.



