Re: [PATCH] include linux/pagemap.h in asm-generic/tlb.h

2007-08-25 Thread Rob Landley
On Friday 24 August 2007 7:20:27 pm Andrew Morton wrote:
> On Fri, 24 Aug 2007 14:46:16 -0400
>
> Jeff Dike <[EMAIL PROTECTED]> wrote:
> > [ This looks non-urgent to me ]
> >
> > Without linux/pagemap.h, asm-generic/tlb.h is missing declarations of
> > page_cache_release and release_pages.
> >
> > Signed-off-by: Jeff Dike <[EMAIL PROTECTED]>
> > --
> >  include/asm-generic/tlb.h |1 +
> >  1 file changed, 1 insertion(+)
> >
> > Index: linux-2.6.22/include/asm-generic/tlb.h
> > ===
> > --- linux-2.6.22.orig/include/asm-generic/tlb.h 2007-07-08 19:32:17.0 -0400
> > +++ linux-2.6.22/include/asm-generic/tlb.h  2007-08-22 17:29:45.0 -0400
> > @@ -13,6 +13,7 @@
> >  #ifndef _ASM_GENERIC__TLB_H
> >  #define _ASM_GENERIC__TLB_H
> >
> > +#include <linux/pagemap.h>
> >  #include 
> >  #include 
> >  #include 
>
> This is worrisome.  If you look at pagemap.h, it includes a pile of things
> which could easily themselves try to include tlb.h via some path or
> another.  I fear that this patch will cause explosions with some config
> and/or architecture.
>
> If you like, tlb.h is a low-level sort of thing whereas pagemap.h is a
> higher-level VFS thing.  It is more appropriate that pagemap.h be including
> tlb.h.
>
>
> So I think a better fix would be better, but I'm not able to suggest what,
> as there is little detail about the failure here and I can find no mention
> of page_cache_release and release_pages in asm-generic/tlb.h.

To reproduce it, do this in -rc3:

cat > mini.conf << EOF
CONFIG_MODE_SKAS=y
CONFIG_BINFMT_ELF=y
CONFIG_HOSTFS=y
CONFIG_SYSCTL=y
CONFIG_STDERR_CONSOLE=y
CONFIG_UNIX98_PTYS=y
CONFIG_BLK_DEV_LOOP=y
CONFIG_LBD=y
CONFIG_EXT2_FS=y
CONFIG_PROC_FS=y
EOF
make ARCH=um allnoconfig KCONFIG_ALLCONFIG=mini.conf
make ARCH=um

Rob

(P.S. I note that in order for CONFIG_BLK_DEV_LOOP to actually trigger work 
now, I have to add CONFIG_BLK_DEV=y to the above, which I didn't have to do 
in 2.6.22 or in any previous version all the way back to 2.6.12.  Not a major 
regression, but still a bit of a surprise.  That said, the above is what 
triggered the break for me.)
-- 
"One of my most productive days was throwing away 1000 lines of code."
  - Ken Thompson.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Add all thread stats for TASKSTATS_CMD_ATTR_TGID

2007-08-25 Thread Balbir Singh
Guillaume Chazarain wrote:
> Le Mon, 20 Aug 2007 22:31:08 +0530,
> Balbir Singh <[EMAIL PROTECTED]> a écrit :
> 
>>> --- a/kernel/taskstats.c Sat Aug 18 17:15:17 2007 -0700
>>> +++ b/kernel/taskstats.c Sun Aug 19 17:20:15 2007 +0200
>>> @@ -246,6 +246,8 @@ static int fill_tgid(pid_t tgid, struct 
>>>
>>> stats->nvcsw += tsk->nvcsw;
>>> stats->nivcsw += tsk->nivcsw;
>>> +   bacct_add_tsk(stats, tsk);
>>> +   xacct_add_tsk(stats, tsk);
>> I'm afraid this is still not good enough. bacct_add_tsk() will assign
>> values and do nothing in the loop (HINT: no summation).
> 
> Hi Balbir, thank you for your review. I agree with everything you said
> and am on my way to do it as time permits, but I have some trouble
> understanding this part. You stated that bacct_add_tsk() would overwrite
> the stats of each thread in the tgid stats, but the other part of the
> patch is the (actually wrong) combination of stats in xxx_add_tsk()
> using min/max/sum.
> 

Hi, Guillaume,

The CSA code was written by Jay Lan, but I'll see if I can answer your
questions. In the current implementation, CSA does not use TGID exit
notifications. In fill_pid(), both bacct_add_tsk() and xacct_add_tsk()
are called (which is correct), they complement each other.

The code needs refactoring to ensure that they can work together
in the fill_tgid() scenario.
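
The difference between assigning and summing inside a per-thread loop can be sketched in plain C. This is a minimal userspace illustration, not the kernel code: struct demo_task, struct demo_stats and the demo_* helpers are hypothetical stand-ins for the real taskstats structures and for bacct_add_tsk()/xacct_add_tsk().

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; not the kernel's task or taskstats types. */
struct demo_task {
	uint64_t utime;	/* user CPU time of one thread */
	uint64_t stime;	/* system CPU time of one thread */
};

struct demo_stats {
	uint64_t utime;
	uint64_t stime;
};

/* Per-task style filler: it assigns, so each call overwrites the previous one. */
static void demo_fill_assign(struct demo_stats *stats, const struct demo_task *tsk)
{
	stats->utime = tsk->utime;
	stats->stime = tsk->stime;
}

/* Thread-group style filler: it accumulates into the running totals. */
static void demo_fill_accumulate(struct demo_stats *stats, const struct demo_task *tsk)
{
	stats->utime += tsk->utime;
	stats->stime += tsk->stime;
}

/* Walk all threads of a group and apply the given filler to each one. */
static struct demo_stats demo_fill_group(const struct demo_task *threads, size_t n,
		void (*fill)(struct demo_stats *, const struct demo_task *))
{
	struct demo_stats stats = { 0, 0 };
	size_t i;

	for (i = 0; i < n; i++)
		fill(&stats, &threads[i]);
	return stats;
}
```

Calling demo_fill_group() with demo_fill_assign leaves only the last thread's values in the result, while demo_fill_accumulate yields the per-group totals, which is the behaviour a TGID query needs.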

> Also, I don't understand why the code to update btime:
> 
> /* calculate task elapsed time in timespec */
> do_posix_clock_monotonic_gettime(&uptime);
> ts = timespec_sub(uptime, tsk->start_time);
>   ...
> stats->ac_btime = get_seconds() - ts.tv_sec;
> 
> does not simply use tsk->start_time or tsk->real_start_time without
> comparing it to the current time.
> 

From what I understand, task->start_time and task->real_start_time
are taken from the realtime clock. The accounting in CSA seems
to be very similar to the accounting done in do_acct_process()
(kernel/acct.c).
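
The arithmetic in the quoted snippet can be sketched as a pure function. This is only an illustration under one assumption: the start stamp is measured on the same clock as the uptime value it is subtracted from. struct demo_timespec, demo_timespec_sub() and demo_btime() are hypothetical names, not kernel interfaces.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct timespec. */
struct demo_timespec {
	int64_t tv_sec;
	long tv_nsec;
};

/* a - b, normalised so that 0 <= tv_nsec < 1e9 (assumes a >= b). */
static struct demo_timespec demo_timespec_sub(struct demo_timespec a,
					      struct demo_timespec b)
{
	struct demo_timespec r;

	r.tv_sec = a.tv_sec - b.tv_sec;
	r.tv_nsec = a.tv_nsec - b.tv_nsec;
	if (r.tv_nsec < 0) {
		r.tv_sec--;
		r.tv_nsec += 1000000000L;
	}
	return r;
}

/*
 * Begin time in wall-clock seconds: "now" on the wall clock minus the
 * time the task has been running, where the running time is computed
 * from two stamps taken on the same (uptime-style) clock.
 */
static int64_t demo_btime(int64_t wall_now_sec, struct demo_timespec uptime,
			  struct demo_timespec start)
{
	struct demo_timespec elapsed = demo_timespec_sub(uptime, start);

	return wall_now_sec - elapsed.tv_sec;
}
```

The subtraction is what converts a stamp taken on one clock into a wall-clock value; using the start stamp directly would only be correct if it were already a wall-clock time.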

Jay/Jonathan any comments?


-- 
Warm Regards,
Balbir Singh
Linux Technology Center
IBM, ISTL


Re: [RFC 0/7] Postphone reclaim laundry to write at high water marks

2007-08-25 Thread Rik van Riel

Christoph Lameter wrote:
> On Wed, 22 Aug 2007, Peter Zijlstra wrote:
>
>>> That is an extreme case that AFAIK we currently ignore and could be
>>> avoided with some effort.
>>
>> Its not extreme, not even rare, and its handled now. Its what
>> PF_MEMALLOC is for.
>
> No its not. If you have all pages allocated as anonymous pages and your
> writeout requires more pages than available in the reserves then you are
> screwed either way regardless if you have PF_MEMALLOC set or not.


Only if the _first_ writeout needs more pages.

If the sum of all writeouts need more pages than you have
available, that is fine.  After all, buffer heads and some
other metadata is freed on IO completion.

Recursive reclaim will also be able to free the data pages
after IO completion, and really fix the problem.

--
Politics is the struggle between those who want to make their country
the best in the world, and those who believe it already is.  Each group
calls the other unpatriotic.


Re: 2.6.22.5 SATA Failure

2007-08-25 Thread Robert Hancock

Dong Feng wrote:

I have been using 2.6.21.1. It seems to work well: all my
disk partitions are mapped as "/dev/sda*" and the performance looks
good. After upgrading to 2.6.22.5 with the exact same configuration,
all the disk devices turn into "/dev/hda*" and the performance degrades
noticeably.

While I boot with 2.6.21.1, the logs regarding disk are:

Aug 26 06:11:26 localhost kernel: ata_piix :00:1f.2: MAP [ P0 P2 IDE IDE ]
Aug 26 06:11:26 localhost kernel: ACPI: PCI Interrupt :00:1f.2[B]
-> GSI 17 (level, low) -> IRQ 17
Aug 26 06:11:26 localhost kernel: ata: 0x170 IDE port busy
Aug 26 06:11:26 localhost kernel: ata: conflict with ide1
Aug 26 06:11:26 localhost kernel: ata1: SATA max UDMA/133 cmd
0x000101f0 ctl 0x000103f6 bmdma 0x0001bfa0 irq 14
Aug 26 06:11:26 localhost kernel: ata2: DUMMY
Aug 26 06:11:26 localhost kernel: scsi0 : ata_piix
Aug 26 06:11:26 localhost kernel: ata1.00: ATA-7: ST9120821AS, 8.03,
max UDMA/133
Aug 26 06:11:26 localhost kernel: ata1.00: 234441648 sectors, multi 8:
LBA48 NCQ (depth 0/32)
Aug 26 06:11:26 localhost kernel: ata1.00: configured for UDMA/133
Aug 26 06:11:26 localhost kernel: scsi1 : ata_piix
Aug 26 06:11:26 localhost kernel: scsi 0:0:0:0: Direct-Access ATA
ST9120821AS  8.03 PQ: 0 ANSI: 5
Aug 26 06:11:26 localhost kernel: SCSI device sda: 234441648 512-byte
hdwr sectors (120034 MB)
Aug 26 06:11:26 localhost kernel: sda: Write Protect is off
Aug 26 06:11:26 localhost kernel: SCSI device sda: write cache:
enabled, read cache: enabled, doesn't support DPO or FUA
Aug 26 06:11:26 localhost kernel: SCSI device sda: 234441648 512-byte
hdwr sectors (120034 MB)
Aug 26 06:11:26 localhost kernel: sda: Write Protect is off
Aug 26 06:11:26 localhost kernel: SCSI device sda: write cache:
enabled, read cache: enabled, doesn't support DPO or FUA
Aug 26 06:11:26 localhost kernel:  sda: sda1 sda2 sda3 < sda5 sda6 sda7 >



While booting with 2.6.22.5, the logs are:

Aug 26 09:30:04 localhost kernel: ACPI: PCI Interrupt :00:1f.2[B]
-> GSI 17 (level, low) -> IRQ 17
Aug 26 09:30:04 localhost kernel: ata_piix :00:1f.2: 0x1F0 IDE port busy
Aug 26 09:30:04 localhost kernel: ata_piix :00:1f.2: 0x170 IDE port busy
Aug 26 09:30:04 localhost kernel: ata_piix :00:1f.2: no available
legacy port



It seems the dramatic change in libata.c incurs this failure.


It looks like you have some CONFIG_IDE options enabled in your kernel 
configuration that result in drivers/ide trying to drive part or all of 
that controller, preventing libata from doing so. Likely the easiest 
thing to do is just set CONFIG_IDE=n entirely.


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/



Re: [PATCH] Fix boot-time hang on G31/G33 PC

2007-08-25 Thread Robert Hancock

Matthew Wilcox wrote:

This patch, loosely based on a patch from Robert Hancock, which was in
turn based on a patch from Jesse Barnes, fixes a boot-time hang on my
shiny new PC.  The 'conflict' mentioned in the patch in my case happens
to be between mmconfig and the graphics card, but it could easily be
between any pair of devices if they are left enabled by the BIOS and
mapped in the wrong place.

Signed-off-by: Matthew Wilcox <[EMAIL PROTECTED]>



We've already got a patch for this in Greg's PCI tree, hopefully it 
should go in for 2.6.24.


Are you getting MMCONFIG enabled on your system with 2.6.23? If not this 
problem shouldn't matter. In the cases I've seen that have caused 
problems in the past (Intel boards mainly), where the MMCONFIG area 
overlaps with where the graphics card BAR ends up during BAR sizing, the 
BIOS happened to not reserve the MMCONFIG table in the E820 memory map, 
so current mainline will turn off MMCONFIG. However, it's quite possible 
that some systems will pass the old E820 validation check and turn on 
MMCONFIG where the overlap happens.


There's a patch in Andi's tree (also hopefully for 2.6.24) to loosen the 
MMCONFIG validation to check against ACPI reservations instead of the 
E820 map (which isn't required to have a reservation for MMCONFIG). This 
makes the disable-decode change more critical.
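
The disable-decode sequence under discussion can be sketched against a simulated device. This is a minimal userspace model, not the kernel implementation: struct demo_dev and the demo_* helpers are hypothetical, only a single 32-bit memory BAR is modelled, and the host-bridge exception from the patch is left out. The two command bits use the standard PCI values.

```c
#include <stdint.h>

#define DEMO_PCI_COMMAND_IO	0x1u	/* I/O space decode enable */
#define DEMO_PCI_COMMAND_MEMORY	0x2u	/* memory space decode enable */

/* Hypothetical model of one device with a single 32-bit memory BAR. */
struct demo_dev {
	uint16_t command;		/* PCI command register */
	uint32_t bar;			/* current BAR value */
	uint32_t bar_size;		/* power of two, at least 16 */
	int decoded_while_sizing;	/* set if the BAR moved while decoding */
};

/* Model a BAR write: address bits below the size and the flag bits stick. */
static void demo_write_bar(struct demo_dev *d, uint32_t val)
{
	if (d->command & (DEMO_PCI_COMMAND_IO | DEMO_PCI_COMMAND_MEMORY))
		d->decoded_while_sizing = 1;
	d->bar = (val & ~(d->bar_size - 1)) | (d->bar & 0xfu);
}

/* Size the BAR with decoding switched off, then restore everything. */
static uint32_t demo_size_bar(struct demo_dev *d)
{
	uint16_t orig_cmd = d->command;
	uint32_t orig_bar = d->bar;
	uint32_t probe;

	/* Stop the device from responding while the BAR holds a bogus address. */
	d->command = (uint16_t)(orig_cmd &
			~(DEMO_PCI_COMMAND_MEMORY | DEMO_PCI_COMMAND_IO));

	demo_write_bar(d, 0xffffffffu);	/* all ones: writable bits read back set */
	probe = d->bar;
	demo_write_bar(d, orig_bar);	/* put the original address back */

	d->command = orig_cmd;		/* re-enable decoding */

	/* Mask the low flag bits; the size is the two's complement of the rest. */
	return ~(probe & ~0xfu) + 1;
}
```

In this model the BAR temporarily claims the top of the address space during the probe write; with decoding enabled that window could overlap another region such as MMCONFIG, which is the conflict the patch description refers to.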



diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 34b8dae..51ef450 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -180,11 +180,26 @@ static inline int is_64bit_memory(u32 mask)
return 0;
 }
 
+/*
+ * Sizing PCI BARs requires us to disable decoding, otherwise we may run
+ * into conflicts with other devices while trying to size the BAR.  Normally
+ * this isn't a problem, but it happens on some machines normally, and can
+ * happen on others during PCI device hotplug.  Don't disable BARs for host
+ * bridges, though.  Some of them do silly things like disable accesses to
+ * RAM from the CPU
+ */
 static void pci_read_bases(struct pci_dev *dev, unsigned int howmany, int rom)
 {
unsigned int pos, reg, next;
u32 l, sz;
struct resource *res;
+   u16 orig_cmd;
+
+   if ((dev->class >> 8) != PCI_CLASS_BRIDGE_HOST) {
+   pci_read_config_word(dev, PCI_COMMAND, &orig_cmd);
+   pci_write_config_word(dev, PCI_COMMAND,
+   orig_cmd & ~(PCI_COMMAND_MEMORY | PCI_COMMAND_IO));
+   }
 
 	for(pos=0; pos<howmany; pos = next) {
u64 l64;
@@ -283,6 +298,9 @@ static void pci_read_bases(struct pci_dev *dev, unsigned int howmany, int rom)
}
}
}
+
+   if ((dev->class >> 8) != PCI_CLASS_BRIDGE_HOST)
+   pci_write_config_word(dev, PCI_COMMAND, orig_cmd);
 }
 
 void __devinit pci_read_bridge_bases(struct pci_bus *child)





Re: cciss: fix error reporting for SG_IO

2007-08-25 Thread Stephen Cameron
> > This fixes a problem with the way cciss was filling out the "errors"
> > field of the request structure upon completion of requests.
> > Previously, it just put a 1 or a 0 in there and used the negation
> > of this as the uptodate parameter to one of the functions in the
> > block layer, being a block device.  For the SG_IO ioctl, this was not
> > sufficient, and we noticed that, for example, sg_turs from sg3_utils
> > did not correctly detect problems due to cciss having set rq->errors
> > incorrectly.
> 
> Do we think this problem is sufficiently serious to merit merging
> this (largeish) patch into 2.6.23?
> 
> I'm thinking "no", but that might be wrong...

Without saying too much (I hope): if you want device-mapper-based
multipath I/O to cciss devices to work (I can't say which specific
devices match that description without getting myself into trouble),
then you want this patch.  If my understanding is correct, some DM
multipath code depends on the TUR response to know whether a path has
failed.  If you don't care about that, then you can skip it.
 
-- steve



   



Re: [4/4] 2.6.23-rc3: known regressions v3

2007-08-25 Thread Satyam Sharma


On Fri, 24 Aug 2007, Michal Piotrowski wrote:
> 
> Alpha
> 
> Subject : -Werror compilation problem - make[1]: *** 
> [arch/alpha/kernel/sys_titan.o] Error 1
> References  : http://lkml.org/lkml/2007/8/6/137
> Last known good : ?
> Submitter   : Oliver Falk <[EMAIL PROTECTED]>
> Caused-By   : ?
> Handled-By  : ?
> Status  : unknown

This was fixed by commit f6901e639800e745457b1dcd99c52647981438d7.


Re: [PATCH RT] - rebalance_domains incorrect parameter

2007-08-25 Thread Sven-Thorsten Dietrich
On Fri, 2007-08-24 at 11:15 +0200, Ingo Molnar wrote:
> * Sven-Thorsten Dietrich <[EMAIL PROTECTED]> wrote:
> 
> > Same issue has been fixed in mainline by: 
> > 
> > diff-tree de0cf899bbf06b6f64a5dce9c59d74c41b6b4232 (from
> > 5d2b3d3695a841231b65b55
> > Author: Oleg Nesterov <[EMAIL PROTECTED]>
> > Date:   Sun Aug 12 18:08:19 2007 +0200
> 
> > signed-off-by: Sven-Thorsten Dietrich <[EMAIL PROTECTED]>
> 
> ah, you mean we should pick up an upstream fix for -rt?

Well, um yes - I had made the patch a few days ago, but then realized it
was already fixed in mainline.

>  We'll do that 
> and we'll pick up much more: all the other ~100 CFS commits that 
> happened meanwhile. (Btw., there's no need to sign off on patch 
> forwarding or backport requests - the signoff made me first believe this 
> is some new patch.)
> 

Ok. Thanks

Sven



Sleep problems with kernels >= 2.6.21 on powerpc

2007-08-25 Thread Rogério Brito
Hi.

Unfortunately, it seems that kernels later than 2.6.21 have problems
letting my powerpc iBook (G3 processor) go to sleep (suspend to
RAM).

The userland that I am using is a Debian testing (lenny) and the
default kernel that comes with it is 2.6.22, with some patches applied
and pbbuttonsd (as the daemon for making the machine sleep).

With kernel 2.6.21 from Debian (and other earlier kernels), when I
press the power button the machine goes to sleep and the LED that
indicates that the machine is sleeping blinks normally.

If I, on the other hand, use Debian's kernel 2.6.22 or compile my own
kernel with just the necessary parts for my work (version 2.6.23-rc3
taken from kernel.org), then I can't make the machine sleep: when I
press the button, it acts as if I had immediately afterwards pressed
something to wake it up (say, shift).

I have already grabbed Linus's git tree and I am willing to do some
cycles of "git bisect" to discover the point where it stopped working.

I just thought that others may have seen such behaviour before or, if
not, that being informative about what I see is of use for debugging
this.

I would also appreciate any guidance on this as I wish kernel 2.6.23
to be working again on powerpc machines.

Please, if any tests are required, don't hesitate to ask me and I will
try whatever is needed to restore the correct behaviour of sleep
with the Linux kernel.

I have copied the mailing lists that I think are relevant. If they
aren't, then please let me know. I would also appreciate being
kept on carbon copies as I am only subscribed to debian-powerpc at the
moment.


Regards, Rogério Brito.

P.S.: It unfortunately doesn't matter if I switch to a console or if I
am in X when I press the power button with recent kernels.


Re: [TOMOYO 14/15] Conditional permission support.

2007-08-25 Thread Tetsuo Handa
Hello.

Pavel Machek wrote:
> What is that? Language parser in kernel?

Yes. This is a policy parser in kernel.

TOMOYO Linux's policy is passed to and from the kernel as a plain-text
(i.e. printable ASCII) file via the /proc/tomoyo interface.

For example, to add a permission to allow /usr/sbin/sshd
to execute /bin/bash if the authenticated user's uid = 500,
the administrator runs

# /bin/cat > /proc/tomoyo/domain_policy << EOF
select  /usr/sbin/sshd
1 /bin/bash if task.uid=500
EOF

and to remove this permission, the administrator runs

# /bin/cat > /proc/tomoyo/domain_policy << EOF
select  /usr/sbin/sshd
delete 1 /bin/bash if task.uid=500
EOF

The patch [TOMOYO 14/15] handles "if task.uid=500" part.

No compilation is needed in userspace, and only the difference between
the old and new policy is written.
This is similar to LDAP manipulation using LDIF format.

(To be exact, only programs that are registered in
/proc/tomoyo/manager can modify policy via /proc/tomoyo interface.
You need to use /usr/lib/ccs/loadpolicy or something
instead of /bin/cat .)
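
To give a feel for how little parsing a line such as "1 /bin/bash if task.uid=500" needs, here is a minimal userspace sketch in C. It is not TOMOYO's parser: struct demo_rule, demo_parse_rule() and demo_rule_matches() are hypothetical names, and only the single "task.uid=" condition from the example above is handled.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical in-memory form of one policy line. */
struct demo_rule {
	int is_delete;		/* line started with "delete " */
	unsigned perm;		/* leading number, e.g. 1 in the example */
	char path[128];		/* program path, e.g. /bin/bash */
	unsigned uid;		/* value of the task.uid condition */
};

/* Parse "[delete ]<perm> <path> if task.uid=<uid>"; 1 on success, else 0. */
static int demo_parse_rule(const char *line, struct demo_rule *r)
{
	char tail;

	memset(r, 0, sizeof(*r));
	if (strncmp(line, "delete ", 7) == 0) {
		r->is_delete = 1;
		line += 7;
	}
	/*
	 * Exactly three conversions must succeed; if the trailing %c also
	 * matches, there was unexpected text after the uid and we reject.
	 */
	return sscanf(line, "%u %127s if task.uid=%u%c",
		      &r->perm, r->path, &r->uid, &tail) == 3;
}

/* Does the rule apply to this program path and caller uid? */
static int demo_rule_matches(const struct demo_rule *r, const char *path,
			     unsigned uid)
{
	return strcmp(r->path, path) == 0 && r->uid == uid;
}
```

A "delete" line parses to the same rule with is_delete set, which mirrors the add/remove pair shown above: only the difference is sent, not the whole policy.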


[PATCH] Fix boot-time hang on G31/G33 PC

2007-08-25 Thread Matthew Wilcox

This patch, loosely based on a patch from Robert Hancock, which was in
turn based on a patch from Jesse Barnes, fixes a boot-time hang on my
shiny new PC.  The 'conflict' mentioned in the patch in my case happens
to be between mmconfig and the graphics card, but it could easily be
between any pair of devices if they are left enabled by the BIOS and
mapped in the wrong place.

Signed-off-by: Matthew Wilcox <[EMAIL PROTECTED]>

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 34b8dae..51ef450 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -180,11 +180,26 @@ static inline int is_64bit_memory(u32 mask)
return 0;
 }
 
+/*
+ * Sizing PCI BARs requires us to disable decoding, otherwise we may run
+ * into conflicts with other devices while trying to size the BAR.  Normally
+ * this isn't a problem, but it happens on some machines normally, and can
+ * happen on others during PCI device hotplug.  Don't disable BARs for host
+ * bridges, though.  Some of them do silly things like disable accesses to
+ * RAM from the CPU
+ */
 static void pci_read_bases(struct pci_dev *dev, unsigned int howmany, int rom)
 {
unsigned int pos, reg, next;
u32 l, sz;
struct resource *res;
+   u16 orig_cmd;
+
+   if ((dev->class >> 8) != PCI_CLASS_BRIDGE_HOST) {
+   pci_read_config_word(dev, PCI_COMMAND, &orig_cmd);
+   pci_write_config_word(dev, PCI_COMMAND,
+   orig_cmd & ~(PCI_COMMAND_MEMORY | PCI_COMMAND_IO));
+   }
 
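 	for(pos=0; pos<howmany; pos = next) {
 u64 l64;
@@ -283,6 +298,9 @@ static void pci_read_bases(struct pci_dev *dev, unsigned int howmany, int rom)
 }
 }
 }
+
+   if ((dev->class >> 8) != PCI_CLASS_BRIDGE_HOST)
+   pci_write_config_word(dev, PCI_COMMAND, orig_cmd);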
 }
 
 void __devinit pci_read_bridge_bases(struct pci_bus *child)

-- 
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."


2.6.22.5 SATA Failure

2007-08-25 Thread Dong Feng
I have been using 2.6.21.1. It seems to work well: all my
disk partitions are mapped as "/dev/sda*" and the performance looks
good. After upgrading to 2.6.22.5 with the exact same configuration,
all the disk devices turn into "/dev/hda*" and the performance degrades
noticeably.

While I boot with 2.6.21.1, the logs regarding disk are:

Aug 26 06:11:26 localhost kernel: ata_piix :00:1f.2: MAP [ P0 P2 IDE IDE ]
Aug 26 06:11:26 localhost kernel: ACPI: PCI Interrupt :00:1f.2[B]
-> GSI 17 (level, low) -> IRQ 17
Aug 26 06:11:26 localhost kernel: ata: 0x170 IDE port busy
Aug 26 06:11:26 localhost kernel: ata: conflict with ide1
Aug 26 06:11:26 localhost kernel: ata1: SATA max UDMA/133 cmd
0x000101f0 ctl 0x000103f6 bmdma 0x0001bfa0 irq 14
Aug 26 06:11:26 localhost kernel: ata2: DUMMY
Aug 26 06:11:26 localhost kernel: scsi0 : ata_piix
Aug 26 06:11:26 localhost kernel: ata1.00: ATA-7: ST9120821AS, 8.03,
max UDMA/133
Aug 26 06:11:26 localhost kernel: ata1.00: 234441648 sectors, multi 8:
LBA48 NCQ (depth 0/32)
Aug 26 06:11:26 localhost kernel: ata1.00: configured for UDMA/133
Aug 26 06:11:26 localhost kernel: scsi1 : ata_piix
Aug 26 06:11:26 localhost kernel: scsi 0:0:0:0: Direct-Access ATA
ST9120821AS  8.03 PQ: 0 ANSI: 5
Aug 26 06:11:26 localhost kernel: SCSI device sda: 234441648 512-byte
hdwr sectors (120034 MB)
Aug 26 06:11:26 localhost kernel: sda: Write Protect is off
Aug 26 06:11:26 localhost kernel: SCSI device sda: write cache:
enabled, read cache: enabled, doesn't support DPO or FUA
Aug 26 06:11:26 localhost kernel: SCSI device sda: 234441648 512-byte
hdwr sectors (120034 MB)
Aug 26 06:11:26 localhost kernel: sda: Write Protect is off
Aug 26 06:11:26 localhost kernel: SCSI device sda: write cache:
enabled, read cache: enabled, doesn't support DPO or FUA
Aug 26 06:11:26 localhost kernel:  sda: sda1 sda2 sda3 < sda5 sda6 sda7 >



While booting with 2.6.22.5, the logs are:

Aug 26 09:30:04 localhost kernel: ACPI: PCI Interrupt :00:1f.2[B]
-> GSI 17 (level, low) -> IRQ 17
Aug 26 09:30:04 localhost kernel: ata_piix :00:1f.2: 0x1F0 IDE port busy
Aug 26 09:30:04 localhost kernel: ata_piix :00:1f.2: 0x170 IDE port busy
Aug 26 09:30:04 localhost kernel: ata_piix :00:1f.2: no available
legacy port



It seems the dramatic change in libata.c incurs this failure.


Re: [PATCH 09/30] mtd: Don't cast kmalloc() return value in drivers/mtd/maps/pmcmsp-flash.c

2007-08-25 Thread Kyle Moffett

On Aug 25, 2007, at 20:36:32, Jesper Juhl wrote:
> On 26/08/07, Robert P. J. Day <[EMAIL PROTECTED]> wrote:
>> technically, nothing.  but if you're not going to use kcalloc()
>> when you're explicitly allocating an array of identical objects
>> (that you want zero-filled, as a bonus), then what's the point of
>> ever having defined a kcalloc() routine in the first place?
>
> I wonder a bit about that myself...
>
> I have found some other issues in that function that I want to fix,
> so I'll be respinning the patch as a patch series instead - and why
> not; I'll just go with kcalloc() and see what the maintainers have
> to say, it's not like I personally care much one way or the other.


I think the original reasoning behind kcalloc() was that it did some  
extra input checking, so that if the product of the two numbers  
overflowed, it would fail with NULL instead of allocating  
insufficient space.  In the kernel it doesn't matter in practice  
since you MUST have additional checking on the size of allocated  
memory anyways, not even considering the fact that >PAGE_SIZE  
allocations are probably going to fail with decent frequency regardless.
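
The overflow check described above can be sketched in userspace C. This is an illustration of the idea only, not the kernel's kcalloc(): demo_naive_bytes() and demo_calloc() are hypothetical names, and malloc()/memset() stand in for the kernel allocator.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* What an unchecked "n * size" yields; it silently wraps on overflow. */
static size_t demo_naive_bytes(size_t n, size_t size)
{
	return n * size;
}

/* Zero-filled array allocation that fails instead of under-allocating. */
static void *demo_calloc(size_t n, size_t size)
{
	void *p;

	/* n * size would exceed SIZE_MAX: report failure with NULL. */
	if (size != 0 && n > SIZE_MAX / size)
		return NULL;

	p = malloc(n * size);
	if (p)
		memset(p, 0, n * size);
	return p;
}
```

With n = SIZE_MAX / 2 + 2 and size = 2 the unchecked product wraps around to 2 bytes, so a plain allocation of n * size would succeed with far too little space, while demo_calloc() returns NULL.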


Cheers,
Kyle Moffett




Re: [PATCH RT 3/3 - take two ] fix get_monotonic_cycles for latency tracer

2007-08-25 Thread Steven Rostedt

--
On Sat, 25 Aug 2007, Frank Ch. Eigler wrote:

>
> Steven Rostedt <[EMAIL PROTECTED]> writes:
>
> > [...]
> > +* [...] We don't need to grab
> > +* any locks, we just keep trying until get all the
> > +* calculations together in one state.
> > +*
> > +* In fact, we __cant__ grab any locks. This
> > +* function is called from the latency_tracer which can
> > +* be called anywhere. To grab any locks (including
> > +* seq_locks) we risk putting ourselves into a deadlock.
>
> Perhaps you could add a comment about why the loop, which appears
> potentially infinite as written, avoids livelock.  (It looks rather
> like a seqlock read loop.)
>

I guess I need to rewrite that comment. It shouldn't look like an infinite
loop, since it is basically: do { x=A; func() } while (x != A); where, to
me, the while condition is most likely to be false unless something touches A.

But yes, it _is_ basically a seq lock, but its on what we are working
with.  And we don't even need any memory barriers that a seq lock might
do, since it is very obvious to gcc that the function call can modify the
variables that we are testing.

But do you still think that looks infinite?  If so, I'll reword it.
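
The do { x=A; func() } while (x != A) pattern can be sketched as a single-threaded simulation in C. Everything here is hypothetical (struct demo_clock, demo_read_cycles(), demo_get_monotonic_ns()); the "concurrent" update is faked by letting the read function change the state once, and one cycle is treated as one nanosecond. It is not the latency tracer code.

```c
#include <stdint.h>

/* Hypothetical clock state that an updater changes from time to time. */
struct demo_clock {
	uint64_t last_cycles;	/* "A": cycle count at the last update */
	uint64_t base_ns;	/* nanoseconds accumulated up to last_cycles */
};

static struct demo_clock demo_clk = { 100, 1000 };
static uint64_t demo_cycles = 150;	/* current cycle counter value */
static int demo_pending_updates = 1;	/* simulated racing updates left */
static unsigned demo_retries;		/* how often the loop had to retry */

/* func(): read the counter; the first call also simulates a racing update. */
static uint64_t demo_read_cycles(void)
{
	if (demo_pending_updates > 0) {
		demo_pending_updates--;
		demo_clk.base_ns += demo_cycles - demo_clk.last_cycles;
		demo_clk.last_cycles = demo_cycles;
	}
	return demo_cycles;
}

/* Lock-free read: retry only when the state changed underneath us. */
static uint64_t demo_get_monotonic_ns(void)
{
	uint64_t snap, base, now;

	do {
		snap = demo_clk.last_cycles;		/* x = A */
		base = demo_clk.base_ns;
		now = demo_read_cycles();		/* func() */
		if (snap != demo_clk.last_cycles)
			demo_retries++;
	} while (snap != demo_clk.last_cycles);	/* loop while x != A */

	return base + (now - snap);
}
```

The loop repeats only for as many passes as there are updates racing with it: in this simulation the first call retries once and later calls go straight through, which is why the loop is not expected to spin forever unless A keeps changing on every pass.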

-- Steve



Re: [PATCH 1/1] hotplug cpu: migrate a task within its cpuset

2007-08-25 Thread Rusty Russell
On Sun, 2007-08-26 at 05:46 +0530, Gautham R Shenoy wrote:
> On Sat, Aug 25, 2007 at 01:47:40PM +0400, Oleg Nesterov wrote:
> > Before this patch, process leaves its ->cpuset and migrates to some "random"
> > any_online_cpu(). With this patch it stays within ->cpuset and migrates to
> > CPU 3.
> 
> The decision to bind a task to a specific cpu, was taken by the userspace
> for a reason, which is _unknown_ to the kernel.
> So logically, shouldn't the userspace decide what should be 
> the fate of those exclusive-affined tasks, whose cpu is about to go
> offline? After all, the reason to offline the cpu is, again, unknown to
> the kernel.

Userspace is not monolithic.  If you refuse to take a CPU offline
because a task is affine, then any user can prevent a CPU from going
offline.

You could, perhaps, introduce a "gentle" offline which fails if process
affinity can no longer be met.
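
The fallback order discussed in this thread can be sketched with plain bitmasks. This is a minimal illustration, not the scheduler code: demo_cpumask, demo_first_cpu() and demo_pick_dest_cpu() are hypothetical, and the cpuset step reflects the behaviour described for the patch rather than its actual implementation.

```c
#include <stdint.h>

typedef uint64_t demo_cpumask;	/* bit n set means CPU n is in the mask */

/* Index of the lowest set bit, or -1 if the mask is empty. */
static int demo_first_cpu(demo_cpumask m)
{
	int cpu;

	for (cpu = 0; cpu < 64; cpu++)
		if (m & ((demo_cpumask)1 << cpu))
			return cpu;
	return -1;
}

/* Pick a destination for a task whose CPU is going offline. */
static int demo_pick_dest_cpu(demo_cpumask online, demo_cpumask node_cpus,
			      demo_cpumask allowed, demo_cpumask cpuset_cpus)
{
	int cpu;

	/* 1) online, on the dead CPU's node, and allowed for the task */
	cpu = demo_first_cpu(online & node_cpus & allowed);
	if (cpu >= 0)
		return cpu;

	/* 2) any online CPU the task is allowed to run on */
	cpu = demo_first_cpu(online & allowed);
	if (cpu >= 0)
		return cpu;

	/* 3) with the patch: stay inside the task's cpuset if possible */
	cpu = demo_first_cpu(online & cpuset_cpus);
	if (cpu >= 0)
		return cpu;

	/* 4) last resort: any online CPU at all */
	return demo_first_cpu(online);
}
```

As an illustrative case, take CPUs 0, 1 and 3 online, CPU 2 going down, a task pinned to CPU 2 and a cpuset covering CPUs 2 and 3: steps 1 and 2 find nothing, step 3 selects CPU 3, and without the cpuset step the task would land on CPU 0.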

Rusty.



Re: [PATCH 09/30] mtd: Don't cast kmalloc() return value in drivers/mtd/maps/pmcmsp-flash.c

2007-08-25 Thread Jesper Juhl
On 26/08/07, Robert P. J. Day <[EMAIL PROTECTED]> wrote:
> On Sun, 26 Aug 2007, Jesper Juhl wrote:
>
> > On 26/08/07, Robert P. J. Day <[EMAIL PROTECTED]> wrote:
>
> > > i was thinking more along the lines of
> > >
> > > msp_parts[i] = kcalloc(pcnt, sizeof(struct mtd_partition), GFP_KERNEL);
> > >
> > > which was kind of the obvious implication, no?
> >
> > I guess
> >
> > > unless there's a reason kcalloc() wouldn't work here, this is
> > > pretty much what kcalloc() was designed for.
> > >
> > When Denys brought up the zeroing thing and mentioned kzalloc() I
> > did consider kcalloc() instead, but kzalloc() makes this allocation
> > nicely look like the preceding ones visually and I couldn't convince
> > myself that kcalloc() would give us any real benefit here.
> >
> > What exactly would using kcalloc() over kzalloc() here buy us?
>
> technically, nothing.  but if you're not going to use kcalloc() when
> you're explicitly allocating an array of identical objects (that you
> want zero-filled, as a bonus), then what's the point of ever having
> defined a kcalloc() routine in the first place?
>
I wonder a bit about that myself...

I have found some other issues in that function that I want to fix, so
I'll be respinning the patch as a patch series instead - and why not;
I'll just go with kcalloc() and see what the maintainers have to say,
it's not like I personally care much one way or the other.

-- 
Jesper Juhl <[EMAIL PROTECTED]>
Don't top-post  http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please  http://www.expita.com/nomime.html


Re: [PATCH 09/30] mtd: Don't cast kmalloc() return value in drivers/mtd/maps/pmcmsp-flash.c

2007-08-25 Thread Robert P. J. Day
On Sun, 26 Aug 2007, Jesper Juhl wrote:

> On 26/08/07, Robert P. J. Day <[EMAIL PROTECTED]> wrote:

> > i was thinking more along the lines of
> >
> > msp_parts[i] = kcalloc(pcnt, sizeof(struct mtd_partition), GFP_KERNEL);
> >
> > which was kind of the obvious implication, no?
>
> I guess
>
> > unless there's a reason kcalloc() wouldn't work here, this is
> > pretty much what kcalloc() was designed for.
> >
> When Denys brought up the zeroing thing and mentioned kzalloc() I
> did consider kcalloc() instead, but kzalloc() makes this allocation
> nicely look like the preceding ones visually and I couldn't convince
> myself that kcalloc() would give us any real benefit here.
>
> What exactly would using kcalloc() over kzalloc() here buy us?

technically, nothing.  but if you're not going to use kcalloc() when
you're explicitly allocating an array of identical objects (that you
want zero-filled, as a bonus), then what's the point of ever having
defined a kcalloc() routine in the first place?

rday
-- 

Robert P. J. Day
Linux Consulting, Training and Annoying Kernel Pedantry
Waterloo, Ontario, CANADA

http://crashcourse.ca



Re: [PATCH 09/30] mtd: Don't cast kmalloc() return value in drivers/mtd/maps/pmcmsp-flash.c

2007-08-25 Thread Jesper Juhl
On 26/08/07, Robert P. J. Day <[EMAIL PROTECTED]> wrote:
> On Sun, 26 Aug 2007, Jesper Juhl wrote:
>
> > On 24/08/07, Robert P. J. Day <[EMAIL PROTECTED]> wrote:
>
> > > actually, i would think kcalloc would be more appropriate here, no?
> > >
> >
> > Why?
> >
> > msp_parts[i] = kzalloc(pcnt * sizeof(struct mtd_partition), GFP_KERNEL);
> >
> > seems better to me than
> >
> > msp_parts[i] = kcalloc(1, pcnt * sizeof(struct mtd_partition), GFP_KERNEL);
>
> i was thinking more along the lines of
>
> msp_parts[i] = kcalloc(pcnt, sizeof(struct mtd_partition), GFP_KERNEL);
>
> which was kind of the obvious implication, no?

I guess

> unless there's a
> reason kcalloc() wouldn't work here, this is pretty much what
> kcalloc() was designed for.
>
When Denys brought up the zeroing thing and mentioned kzalloc() I did
consider kcalloc() instead, but kzalloc() makes this allocation nicely
look like the preceding ones visually and I couldn't convince myself
that kcalloc() would give us any real benefit here.

What exactly would using kcalloc() over kzalloc() here buy us?

-- 
Jesper Juhl <[EMAIL PROTECTED]>
Don't top-post  http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please  http://www.expita.com/nomime.html


Re: [PATCH 1/1] hotplug cpu: migrate a task within its cpuset

2007-08-25 Thread Gautham R Shenoy
On Sat, Aug 25, 2007 at 01:47:40PM +0400, Oleg Nesterov wrote:
> On 08/24, Andrew Morton wrote:
> >
> > On Fri, 24 Aug 2007 17:18:06 -0500
> > Cliff Wickman <[EMAIL PROTECTED]> wrote:
> > 
> > > When a cpu is disabled, move_task_off_dead_cpu() is called for tasks
> > > that have been running on that cpu.
> > > 
> > > Currently, such a task is migrated:
> > >  1) to any cpu on the same node as the disabled cpu, which is both online
> > > and among that task's cpus_allowed
> > >  2) to any cpu which is both online and among that task's cpus_allowed
> > > 
> > > It is typical of a multithreaded application running on a large NUMA 
> > > system
> > > to have its tasks confined to a cpuset so as to cluster them near the
> > > memory that they share. Furthermore, it is typical to explicitly place 
> > > such
> > > a task on a specific cpu in that cpuset.  And in that case the task's
> > > cpus_allowed includes only a single cpu.
> > 
> > operator error..
> > 
> > > This patch would insert a preference to migrate such a task to some cpu 
> > > within
> > > its cpuset (and set its cpus_allowed to its entire cpuset).
> > > 
> > > With this patch, migrate the task to:
> > >  1) to any cpu on the same node as the disabled cpu, which is both online
> > > and among that task's cpus_allowed
> > >  2) to any online cpu within the task's cpuset
> > >  3) to any cpu which is both online and among that task's cpus_allowed
> > 
> > Wouldn't it be saner to refuse the offlining request if the CPU has tasks
> > which cannot be migrated to any other CPU?  I mean, the operator has gone
> > and asked the machine to perform two inconsistent/incompatible things at
> > the same time.
> 
> I don't think so (regardless of this patch and CONFIG_CPUSETS). Any user
> can bind its process to (say) CPU 4. This shouldn't block cpu-unplug.
> 
> Now, let's suppose that this process is a member of some cpuset which
> contains CPUs 3 and 4, and CPU 4 goes down.
> 
> Before this patch, process leaves its ->cpuset and migrates to some "random"
> any_online_cpu(). With this patch it stays within ->cpuset and migrates to
> CPU 3.
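
The preference order from the patch description, applied to Oleg's
example, can be sketched with plain bitmasks. This is a toy userspace
model under stated assumptions: one bit per CPU, a made-up node layout,
and hypothetical names throughout. It does not model the patch also
resetting cpus_allowed to the whole cpuset.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model: one bit per CPU, up to 64 CPUs. All names are illustrative. */
typedef uint64_t cpumask64;

#define MASK_OF(n) ((cpumask64)1 << (n))

/* Lowest-numbered CPU in the mask, or -1 if the mask is empty. */
static int first_cpu_in(cpumask64 mask)
{
	int cpu;

	for (cpu = 0; cpu < 64; cpu++)
		if (mask & MASK_OF(cpu))
			return cpu;
	return -1;
}

/*
 * Pick a destination for a task leaving a dead CPU:
 *  1) an online CPU on the dead CPU's node that is in cpus_allowed
 *  2) any online CPU in the task's cpuset (the step the patch adds)
 *  3) any online CPU in cpus_allowed
 * Pass use_cpuset = 0 to model the behaviour without the patch.
 * Returns -1 when nothing matches and affinity would have to be broken.
 */
static int pick_dest_cpu(cpumask64 node, cpumask64 online, cpumask64 allowed,
			 cpumask64 cpuset, int use_cpuset)
{
	int cpu;

	cpu = first_cpu_in(node & online & allowed);
	if (cpu < 0 && use_cpuset)
		cpu = first_cpu_in(cpuset & online);
	if (cpu < 0)
		cpu = first_cpu_in(online & allowed);
	return cpu;
}

/* Oleg's example: task pinned to CPU 4, cpuset = {3, 4}, CPU 4 goes down. */
static void oleg_example(void)
{
	cpumask64 online = MASK_OF(0) | MASK_OF(1) | MASK_OF(2) | MASK_OF(3);
	cpumask64 node = MASK_OF(4) | MASK_OF(5);	/* assumed node layout */
	cpumask64 allowed = MASK_OF(4);
	cpumask64 cpuset = MASK_OF(3) | MASK_OF(4);

	assert(pick_dest_cpu(node, online, allowed, cpuset, 1) == 3);
	assert(pick_dest_cpu(node, online, allowed, cpuset, 0) == -1);
}
```

In the model, the -1 result without the cpuset step corresponds to the
case where the kernel falls back to an arbitrary online CPU.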

The decision to bind a task to a specific cpu was taken by userspace
for a reason which is _unknown_ to the kernel.
So logically, shouldn't userspace decide the fate of those
exclusively-affined tasks whose cpu is about to go offline? After all,
the reason to offline the cpu is, again, unknown to the kernel.

Though we have been breaking such a task's affinity in the
/* No more Mr. Nice Guy. */ part, IMO a nicer way to do it would be to:
- Fail the cpu-offline operation with -EBUSY, if there exist userspace tasks 
  exclusively affined to that cpu.
- Provide some kind of infrastructure like a sysfs file
  /sys/devices/system/cpu/cpuX/exclusive_tasks which will help
  the administrator take proactive steps before requesting a
  cpu offline, instead of the kernel taking the reactive step once the
  offline is done.

The side effect, of course, would be that it would break some of the
existing applications, which are not used to cpu-offline failing unless
the cpu was already offline or there was only one online cpu. But is the
side effect so critical that we continue with this funny contradiction in
the kernel?! Or is there something important I'm missing here?

> 
> > Look at it this way.  If we were to merge this patch then it would be
> > logical to also merge a patch which has the following description:
> >
> >   "if a process attempts to pin itself onto a presently-offlined CPU,
> >the kernel will choose a different CPU according to  and
> >will pin the process to that CPU instead".
> 
> set_cpus_allowed() just returns -EINVAL in that case, this looks a bit
> more logical.
> 

Yup, it sure does!

> Oleg.
> 

Thanks and Regards
gautham.
-- 
Gautham R Shenoy
Linux Technology Center
IBM India.
"Freedom comes with a price tag of responsibility, which is still a bargain,
because Freedom is priceless!"


Re: [PATCH 09/30] mtd: Don't cast kmalloc() return value in drivers/mtd/maps/pmcmsp-flash.c

2007-08-25 Thread Robert P. J. Day
On Sun, 26 Aug 2007, Jesper Juhl wrote:

> On 24/08/07, Robert P. J. Day <[EMAIL PROTECTED]> wrote:

> > actually, i would think kcalloc would be more appropriate here, no?
> >
>
> Why?
>
> msp_parts[i] = kzalloc(pcnt * sizeof(struct mtd_partition), GFP_KERNEL);
>
> seems better to me than
>
> msp_parts[i] = kcalloc(1, pcnt * sizeof(struct mtd_partition), GFP_KERNEL);

i was thinking more along the lines of

msp_parts[i] = kcalloc(pcnt, sizeof(struct mtd_partition), GFP_KERNEL);

which was kind of the obvious implication, no?  unless there's a
reason kcalloc() wouldn't work here, this is pretty much what
kcalloc() was designed for.

rday

-- 

Robert P. J. Day
Linux Consulting, Training and Annoying Kernel Pedantry
Waterloo, Ontario, CANADA

http://crashcourse.ca



Re: 2.6.23-rc3-mm1

2007-08-25 Thread Randy Dunlap
On Sun, 26 Aug 2007 01:26:21 +0200 Tilman Schmidt wrote:

> On 25.08.2007 02:38, Pallipadi, Venkatesh wrote:
> >>
> >> FATAL: Error inserting processor
> >> (/lib/modules/2.6.23-rc3-mm1-testing/kernel/drivers/acpi/processor.ko):
> >> Input/output error
> >> WARNING: Error inserting processor
> >> (/lib/modules/2.6.23-rc3-mm1-testing/kernel/drivers/acpi/processor.ko):
> >> Input/output error
> > 
> > This is indeed related to CPUIDLE.
> > 
> > Tilman: Can you configure CONFIG_CPU_IDLE in your config (under Power
> > Management option) and double check that the frequency part works after
> > that.
> 
> Strangely enough, I do not see that option in "make xconfig".
> The "Power Management" subtree ends with "CPU Frequency scaling".
> In "make menuconfig" the option is there, though.

That sounds strange.  Please share your .config file.

> After activating it, these two errors are indeed gone, and the
> "thermal: Unknown symbol acpi_processor_set_thermal_limit" one
> as well.



---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***


Re: 2.6.23-rc3-mm1

2007-08-25 Thread Tilman Schmidt
On 25.08.2007 09:55, Paul Rolland wrote:

> On Sat, 25 Aug 2007 00:28:09 -0400
> Dave Jones <[EMAIL PROTECTED]> wrote:
> 
>> On Fri, Aug 24, 2007 at 08:30:00PM -0700, Andrew Morton wrote:
>>  
>>  > > -<6>rtc_cmos 00:03: rtc core: registered rtc_cmos as rtc0
>>  > > -<4>rtc_cmos: probe of 00:03 failed with error -16
>>  > 
>>  > I wonder if that was supposed to happen.  It's also happening in
>>  > 2.6.23-rc3 base.
>>  
>> EBUSY. I've seen this happen when you have both CONFIG_RTC and
>> CONFIG_RTC_DRV_CMOS set.
> 
> This one is becoming quite worth an entry in a FAQ, it pops up once every
> month ;)
> There was a discussion about preventing both being set at the same time
> when configuring, but I don't remember how it ended...

I must have missed that discussion. I have:
CONFIG_RTC=y
CONFIG_RTC_DRV_CMOS=m
because both of these options claim in their help texts that you
should select them if you want to access the PC RTC.

-- 
Tilman Schmidt  E-Mail: [EMAIL PROTECTED]
Bonn, Germany
This message consists of 100% recycled bits.
Unopened, best before: (see reverse side)





Re: 2.6.23-rc3-mm1

2007-08-25 Thread Tilman Schmidt
On 25.08.2007 02:38, Pallipadi, Venkatesh wrote:
>>
>> FATAL: Error inserting processor
>> (/lib/modules/2.6.23-rc3-mm1-testing/kernel/drivers/acpi/processor.ko):
>> Input/output error
>> WARNING: Error inserting processor
>> (/lib/modules/2.6.23-rc3-mm1-testing/kernel/drivers/acpi/processor.ko):
>> Input/output error
> 
> This is indeed related to CPUIDLE.
> 
> Tilman: Can you configure CONFIG_CPU_IDLE in your config (under Power
> Management option) and double check that the frequency part works after
> that.

Strangely enough, I do not see that option in "make xconfig".
The "Power Management" subtree ends with "CPU Frequency scaling".
In "make menuconfig" the option is there, though.
After activating it, these two errors are indeed gone, and the
"thermal: Unknown symbol acpi_processor_set_thermal_limit" one
as well.

HTH
T.

-- 
Tilman Schmidt  E-Mail: [EMAIL PROTECTED]
Bonn, Germany
This message consists of 100% recycled bits.
Unopened, best before: (see reverse side)





Re: division and cpu usage

2007-08-25 Thread Luka Napotnik
Hello again.

I have the following code:

clock_t c_sum, j, p;
cputime_t j_tmp;
...
c_sum = cputime64_to_clock_t(task->utime) +
cputime64_to_clock_t(task->stime);
cur_j = jiffies;
j_tmp =  jiffies64_to_cputime64(cur_j);
j = cputime64_to_clock_t(j_tmp);
p = (c_sum * 100) / j;


And if I check the p value of a certain process it gives wrong results.
For example for a process using 99% of the CPU it shows 20. What am I
doing wrong?

Greets,
Luka

Jan Engelhardt wrote:
> On Aug 24 2007 07:34, linux-os (Dick Johnson) wrote:
>>> I'm new to kernel development and have some questions.
>>>
>>> 1. Why can't I divide with regular casting to double ((double)a /
>>> (double)b)? It gives me strange errors when compiling:
>>>
>>> WARNING: "__divdf3" [/root] undefined!
>>> WARNING: "__addf3" [/root/...] undefined!
>>> WARNING: "__floatsidf" [/root/...] undefined!
>>>
>>> And if I compile with normal integers, I get zero as the result.
>>>
>>> 2. I'm trying to get the percentage of CPU used for a certain
>>> task_struct and figured the following formula:
>>>
>>> (task->utime + task->stime) / jiffies
>>>
>>> Before calculating I convert all the variables to jiffies. Is this correct?
> 
> * So use integer math: (task->utime + task->stime) * 100 / jiffies
>   and you get the 'common' percentage. In integer, that is.
> 
> * I am not sure about the use of jiffies when it comes to CONFIG_NO_HZ=y.
> 
>> Floating point operations are not allowed in the kernel. Often,
> 
> IIRC they are allowed since ... recently (2.6.16, .17? can't remember). When
> the kernel tries to execute an FP instruction (and traps as a result), more
> kernel code ensures that the FP stack gets properly switched when a process
> changes between userspace and kernelspace.
> 
>> You can use "long long" for high precision math if necessary.
> 
> That will give link failure for __udivdi3. Use do_div().
> 
> 
>   Jan
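
One possible reading of the 20 vs. 99 discrepancy (an assumption, the
thread does not confirm it): jiffies roughly counts time since boot, so
(utime + stime) * 100 / jiffies averages over the whole uptime rather
than over the interval in which the task was actually running. A minimal
userspace sketch of the integer arithmetic, with the interval passed in
explicitly; cpu_percent() and its parameters are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Integer percentage of CPU time used over one sampling interval.
 * task_ticks: utime + stime accumulated during the interval
 * wall_ticks: elapsed ticks during the same interval
 * Both must be in the same unit. In kernel code the 64-bit division
 * would go through do_div() rather than the plain '/' used here.
 */
static unsigned int cpu_percent(uint64_t task_ticks, uint64_t wall_ticks)
{
	/* Guard the multiplication below against 64-bit overflow. */
	assert(task_ticks <= UINT64_MAX / 100);

	if (wall_ticks == 0)
		return 0;
	return (unsigned int)((task_ticks * 100) / wall_ticks);
}
```

For a task that used 990 of the last 1000 ticks, cpu_percent(990, 1000)
gives 99, while dividing the same 990 ticks by an uptime of 5000 ticks
gives 19.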



Re: CFS review

2007-08-25 Thread Ingo Molnar

* Al Boldi <[EMAIL PROTECTED]> wrote:

> > ok. I think i might finally have found the bug causing this. Could 
> > you try the fix below, does your webserver thread-startup test work 
> > any better?
> 
> It seems to help somewhat, but the problem is still visible.  Even 
> v20.3 on 2.6.22.5 didn't help.
> 
> It does look related to ia-boosting, so I turned off __update_curr 
> like Roman mentioned, which had an enormous smoothing effect, but then 
> nice levels completely break down and lockup the system.

you can turn sleeper-fairness off via:

   echo 28 > /proc/sys/kernel/sched_features

another thing to try would be:

   echo 12 > /proc/sys/kernel/sched_features

(that's the new-task penalty turned off.)
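
To see what separates the two values: 28 is 16 + 8 + 4 and 12 is 8 + 4,
so going from 28 to 12 clears exactly the bit worth 16. A small sketch of
that bit arithmetic follows. Mapping the value 16 to
SCHED_FEAT_START_DEBIT is an inference from the description above, not
something checked against the source, and the helper names are
hypothetical.

```c
#include <assert.h>

/* Assumed value, inferred from 28 -> 12 clearing exactly this bit. */
#define FEAT_START_DEBIT_ASSUMED 16u

/* Bits that differ between two sched_features settings. */
static unsigned int feature_diff(unsigned int a, unsigned int b)
{
	return a ^ b;
}

/* Non-zero if every bit of 'flag' is set in 'features'. */
static int feature_enabled(unsigned int features, unsigned int flag)
{
	return (features & flag) == flag;
}

/* The two settings suggested above differ in one bit only. */
static void sched_features_example(void)
{
	assert(feature_diff(28, 12) == FEAT_START_DEBIT_ASSUMED);
	assert(feature_enabled(28, FEAT_START_DEBIT_ASSUMED));
	assert(!feature_enabled(12, FEAT_START_DEBIT_ASSUMED));
}
```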

Another thing to try would be to edit this:

	if (sysctl_sched_features & SCHED_FEAT_START_DEBIT)
		p->se.wait_runtime = -(sched_granularity(cfs_rq) / 2);

to:

	if (sysctl_sched_features & SCHED_FEAT_START_DEBIT)
		p->se.wait_runtime = -(sched_granularity(cfs_rq));

and could you also check 20.4 on 2.6.22.5 perhaps, or very latest -git? 
(Peter has experienced smaller spikes with that.)

Ingo


Re: [TOMOYO 14/15] Conditional permission support.

2007-08-25 Thread Toshiharu Harada
Hi,

2007/8/25, Pavel Machek <[EMAIL PROTECTED]>:
> Hi!
>
> > This patch allows administrators to use conditional permission.
> > TOMOYO Linux supports conditional permission based on
> > process's UID,GID etc. and/or requested pathname's UID/GID.
> >
> > Signed-off-by: Kentaro Takeda <[EMAIL PROTECTED]>
> > Signed-off-by: Tetsuo Handa <[EMAIL PROTECTED]>
>
> > + * Since the trailing spaces are removed by tmy_normalize_line(),
> > + * the last "\040if\040" sequence corresponds to condition part.
> > + */
> > +char *tmy_find_condition_part(char *data)
> > +{
> > + char *cp = strstr(data, " if ");
> > + if (cp) {
> > + char *cp2;
> > + while ((cp2 = strstr(cp + 3, " if ")) != NULL)
> > + cp = cp2;
> > + *cp++ = '\0';
> > + }
> > + return cp;
> > +}
> ...
>
> > + unsigned long left_min = 0;
> > + unsigned long left_max = 0;
> > + unsigned long right_min = 0;
> > + unsigned long right_max = 0;
> > + if (strncmp(condition, "if ", 3))
> > + return NULL;
> > + condition += 3;
> > + start = condition;
> > + while (*condition) {
> > + if (*condition == ' ')
> > + condition++;
> > + for (left = 0; left < MAX_KEYWORD; left++) {
> > + if (strncmp(condition, cc_keyword[left].keyword,
> > + cc_keyword[left].keyword_len))
> > + continue;
> > + condition += cc_keyword[left].keyword_len;
> > + break;
> > + }
> > + if (left == MAX_KEYWORD) {
> > + if (!tmy_parse_ulong(&left_min, &condition))
> > + goto out;
> > + counter++; /* body */
> > + if (*condition != '-')
> > + goto not_range1;
> > + condition++;
> > + if (!tmy_parse_ulong(&left_max, &condition)
> > + || left_min > left_max)
> > + goto out;
> > + counter++; /* body */
> > +not_range1: ;
> > + }
> > + if (strncmp(condition, "!=", 2) == 0)
> > + condition += 2;
> > + else if (*condition == '=')
> > + condition++;
> > + else
> > + goto out;
> > + counter++; /* header */
> > + for (right = 0; right < MAX_KEYWORD; right++) {
> > + if (strncmp(condition, cc_keyword[right].keyword,
> > + cc_keyword[right].keyword_len))
> > + continue;
> > + condition += cc_keyword[right].keyword_len;
> > + break;
> > + }
>
> What is that? Language parser in kernel?
>
> Pavel

The key idea of TOMOYO Linux is to let each process remember its program
(path) names. Names are stored in the task struct and "appended" to the
list when execve is called.

An example of /usr/lib/cups/backend/lpd.
(picked up from
http://tomoyo.sourceforge.jp/cgi-bin/lxr/source/centos4.4/domain_policy.txt?v=policy-sample)

/etc/rc.d/init.d/cups (fork)
 /sbin/initlog (fork)
   /usr/sbin/cupsd (fork)
 /bin/bash (fork)
   /usr/lib/cups/backend/lpd (current process)

SELinux and other DTE implementations need domain definitions to work.
It is the administrator's task to design the domains and name each of
them. TOMOYO Linux can be used as DTE MAC, but administrators don't
have to define and name domains, because TOMOYO Linux automatically
defines domains and names them (from boot to shutdown).

I wrote "TOMOYO Linux can be used as MAC", because
users can just view the domain transitions and analyze systems
with TOMOYO Linux. Or they can use TOMOYO Linux to
get logs with process invocation histories instead of a simple
program name.

TOMOYO Linux policy consists of path names and they are currently
handled as strings.
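
As an illustration of that string handling, here is a minimal userspace
sketch of the idea behind the quoted tmy_find_condition_part(): the
policy line is cut at the last " if " and the remainder is treated as the
condition. split_condition() is a hypothetical name, and the sample line
in the note below is made up for the example, not taken from a real
policy.

```c
#include <assert.h>
#include <string.h>

/*
 * Cut 'line' at the last " if " and return a pointer to the condition
 * part ("if ..."), or NULL when the line carries no condition.
 * The buffer is modified in place.
 */
static char *split_condition(char *line)
{
	char *cp;
	char *next;

	assert(line != NULL);

	cp = strstr(line, " if ");
	if (!cp)
		return NULL;
	/* Keep the last occurrence, as the comment in the patch specifies. */
	while ((next = strstr(cp + 3, " if ")) != NULL)
		cp = next;
	*cp++ = '\0';	/* terminate the leading part, step onto "if" */
	return cp;
}
```

For the input "4 /tmp/a if b if task.uid=0" the leading part becomes
"4 /tmp/a if b" and the returned condition is "if task.uid=0".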

Thanks.

--
Toshiharu Harada
NTT DATA CORPORATION


Re: 2.6.23-rc3-mm1

2007-08-25 Thread Tilman Schmidt
On 25.08.2007 02:21, john stultz wrote:

>> Tilman Schmidt <[EMAIL PROTECTED]> wrote:

>>> - on console early during boot, also in SuSE's /var/log/boot.msg:
>>>
>>> your system time is not correct:
>>> Wed Jul 13 13:15:31 UTC 1910
>>> setting system time to:
>>> Tue Jul 24 00:00:00 UTC 2007
> 
> Hrmm. I'm not super familiar w/ SuSE's init scripts, but I'm guessing
> that's the ntpdate call.

Nope. The ntpdate call comes much later, and finally sets the system clock
correctly so that there are no lasting effects of all this.

> And "Tuesday Jul 24th"? Sounds about a month
> off, is this just stale info?

I have no idea where that might come from.

>>> /dev/system/root: Superblock last mount time is in the future.  FIXED.
>>> /dev/system/root: Superblock last write time is in the future.  FIXED.
> 
> Does this show up before or after the above date stuff?

After the "your system time is not correct" messages, and before the
regular "Try to get initial date and time via NTP" message accompanying
the ntpdate call.

> Does the issue go away using an older kernel (I want to eliminate easy
> stuff like CMOS batteries giving up)?

It does. Booting 2.6.23-rc3 after that, the system comes up with none
of these messages.

> Also you're not using Linus' CMOS corrupting suspend/resume debugging
> trick, right (I'm forgetting the CONFIG name).

PM_TRACE? No. The entire PM_DEBUG branch is turned off.

-- 
Tilman Schmidt  E-Mail: [EMAIL PROTECTED]
Bonn, Germany
This message consists of 100% recycled bits.
Unopened, best before: (see reverse side)





Re: CFS review

2007-08-25 Thread Al Boldi
Ingo Molnar wrote:
> * Al Boldi <[EMAIL PROTECTED]> wrote:
> > > > The problem is that consecutive runs don't give consistent results
> > > > and sometimes stalls.  You may want to try that.
> > >
> > > well, there's a natural saturation point after a few hundred tasks
> > > (depending on your CPU's speed), at which point there's no idle time
> > > left. From that point on things get slower progressively (and the
> > > ability of the shell to start new ping tasks is impacted as well),
> > > but that's expected on an overloaded system, isnt it?
> >
> > Of course, things should get slower with higher load, but it should be
> > consistent without stalls.
> >
> > To see this problem, make sure you boot into /bin/sh with the normal
> > VGA console (ie. not fb-console).  Then try each loop a few times to
> > show different behaviour; loops like:
> >
> > # for ((i=0; i<; i++)); do ping 10.1 -A > /dev/null & done
> >
> > # for ((i=0; i<; i++)); do nice -99 ping 10.1 -A > /dev/null & done
> >
> > # { for ((i=0; i<; i++)); do
> > ping 10.1 -A > /dev/null &
> > done } > /dev/null 2>&1
> >
> > Especially the last one sometimes causes a complete console lock-up,
> > while the other two sometimes stall then surge periodically.
>
> ok. I think i might finally have found the bug causing this. Could you
> try the fix below, does your webserver thread-startup test work any
> better?

It seems to help somewhat, but the problem is still visible.  Even v20.3 on 
2.6.22.5 didn't help.

It does look related to ia-boosting, so I turned off __update_curr like Roman 
mentioned, which had an enormous smoothing effect, but then nice levels 
completely break down and lockup the system.

There is another way to show the problem visually under X (vesa-driver), by 
starting 3 gears simultaneously, which after laying them out side-by-side 
need some settling time before smoothing out.  Without __update_curr it's 
absolutely smooth from the start.


Thanks!

--
Al



Re: [PATCH 09/30] mtd: Don't cast kmalloc() return value in drivers/mtd/maps/pmcmsp-flash.c

2007-08-25 Thread Jesper Juhl
On 24/08/07, Robert P. J. Day <[EMAIL PROTECTED]> wrote:
> On Fri, 24 Aug 2007, Denys Vlasenko wrote:
>
> > On Friday 24 August 2007 00:52, Jesper Juhl wrote:
> > > kmalloc() returns a void pointer.
> > > No need to cast it.
> >
> > > -   msp_flash = (struct mtd_info **)kmalloc(
> > > -   fcnt * sizeof(struct map_info *), GFP_KERNEL);
> > > -   msp_parts = (struct mtd_partition **)kmalloc(
> > > -   fcnt * sizeof(struct mtd_partition *), GFP_KERNEL);
> > > -   msp_maps = (struct map_info *)kmalloc(
> > > -   fcnt * sizeof(struct mtd_info), GFP_KERNEL);
> > > +   msp_flash = kmalloc(fcnt * sizeof(struct map_info *), GFP_KERNEL);
> > > +   msp_parts = kmalloc(fcnt * sizeof(struct mtd_partition *), 
> > > GFP_KERNEL);
> > > +   msp_maps = kmalloc(fcnt * sizeof(struct mtd_info), GFP_KERNEL);
> > > memset(msp_maps, 0, fcnt * sizeof(struct mtd_info));
> >
> > This one wants kzalloc.
> >
> > > -   msp_parts[i] = (struct mtd_partition *)kmalloc(
> > > -   pcnt * sizeof(struct mtd_partition), GFP_KERNEL);
> > > +   msp_parts[i] = kmalloc(pcnt * sizeof(struct mtd_partition),
> > > +   GFP_KERNEL);
> > > memset(msp_parts[i], 0, pcnt * sizeof(struct mtd_partition));
> > >
> > > /* now initialize the devices proper */
> >
> > Same
>
> actually, i would think kcalloc would be more appropriate here, no?
>

Why?

msp_parts[i] = kzalloc(pcnt * sizeof(struct mtd_partition), GFP_KERNEL);

seems better to me than

msp_parts[i] = kcalloc(1, pcnt * sizeof(struct mtd_partition), GFP_KERNEL);


-- 
Jesper Juhl <[EMAIL PROTECTED]>
Don't top-post  http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please  http://www.expita.com/nomime.html


Re: USB Key light on/off state depending on mount

2007-08-25 Thread Stefan Richter
Guennadi Liakhovetski wrote:
> I might imagine how windows turns the LED off on 
> unmount. Try "eject /dev/sdX", where sdX is your USB storage, after you 
> unmount it. Be careful, especially if you have SATA (or SCSI) discs in 
> your system or if you use libata for PATA discs not to eject the wrong 
> one...

If there is only one USB disk connected:
# eject /dev/disk/by-path/*usb*:0

Provided you let udev create links for you.  BTW, the /dev/disk/by-id/
symlinks are nice for static mount points in /etc/fstab.

After a disk was mounted, eject also accepts the mountpoint as parameter
and will unmount the disk before it tries to eject it.
-- 
Stefan Richter
-=-=-=== =--- ==-=-
http://arcgraph.de/sr/


[PATCH, RFC] wake up from a serial port

2007-08-25 Thread Guennadi Liakhovetski
Enable wakeup from serial ports, make it run-time configurable over sysfs, 
e.g.,

echo enabled > /sys/devices/platform/serial8250.0/tty/ttyS0/power/wakeup

Requires

# CONFIG_SYSFS_DEPRECATED is not set

Signed-off-by: Guennadi Liakhovetski <[EMAIL PROTECTED]>

---

I've sent this rfc/patch earlier to linuxppc-dev (I need it for a ppc
platform) and to linux-serial, and got no comments, but no objections
from either of them. So, re-sending to a broader and hopefully more
relevant audience this time.

Thanks
Guennadi

diff --git a/drivers/serial/8250.c b/drivers/serial/8250.c
index 0b3ec38..5118914 100644
--- a/drivers/serial/8250.c
+++ b/drivers/serial/8250.c
@@ -130,6 +130,7 @@ struct uart_8250_port {
 	unsigned char		mcr_mask;	/* mask of user bits */
 	unsigned char		mcr_force;	/* mask of forced bits */
 	unsigned char		lsr_break_flag;
+	char			suspended;
 
 	/*
 	 * We provide a per-port pm hook.
@@ -2673,6 +2674,14 @@ static int __devexit serial8250_remove(struct platform_device *dev)
 	return 0;
 }
 
+static int serial8250_match_port(struct device *dev, void *data)
+{
+	struct uart_port *port = data;
+	dev_t devt = MKDEV(serial8250_reg.major, serial8250_reg.minor) + port->line;
+
+	return dev->devt == devt; /* Actually, only one tty per port */
+}
+
 static int serial8250_suspend(struct platform_device *dev, pm_message_t state)
 {
 	int i;
@@ -2680,8 +2689,16 @@ static int serial8250_suspend(struct platform_device *dev, pm_message_t state)
 	for (i = 0; i < UART_NR; i++) {
 		struct uart_8250_port *up = &serial8250_ports[i];
 
-		if (up->port.type != PORT_UNKNOWN && up->port.dev == &dev->dev)
-			uart_suspend_port(&serial8250_reg, &up->port);
+		if (up->port.type != PORT_UNKNOWN && up->port.dev == &dev->dev) {
+			struct device *tty_dev = device_find_child(up->port.dev, &up->port,
+								   serial8250_match_port);
+			if (device_may_wakeup(tty_dev))
+				enable_irq_wake(up->port.irq);
+			else {
+				uart_suspend_port(&serial8250_reg, &up->port);
+				up->suspended = 1;
+			}
+		}
 	}
 
 	return 0;
@@ -2694,8 +2711,13 @@ static int serial8250_resume(struct platform_device *dev)
 	for (i = 0; i < UART_NR; i++) {
 		struct uart_8250_port *up = &serial8250_ports[i];
 
-		if (up->port.type != PORT_UNKNOWN && up->port.dev == &dev->dev)
-			serial8250_resume_port(i);
+		if (up->port.type != PORT_UNKNOWN && up->port.dev == &dev->dev) {
+			if (up->suspended) {
+				serial8250_resume_port(i);
+				up->suspended = 0;
+			} else
+				disable_irq_wake(up->port.irq);
+		}
 	}
 
 	return 0;
diff --git a/drivers/serial/serial_core.c b/drivers/serial/serial_core.c
index 9c57486..716fbe2 100644
--- a/drivers/serial/serial_core.c
+++ b/drivers/serial/serial_core.c
@@ -2266,6 +2266,7 @@ int uart_add_one_port(struct uart_driver *drv, struct uart_port *port)
 {
 	struct uart_state *state;
 	int ret = 0;
+	struct device *tty_dev;
 
 	BUG_ON(in_interrupt());
 
@@ -2301,7 +2302,13 @@ int uart_add_one_port(struct uart_driver *drv, struct uart_port *port)
 	 * Register the port whether it's detected or not.  This allows
 	 * setserial to be used to alter this ports parameters.
 	 */
-	tty_register_device(drv->tty_driver, port->line, port->dev);
+	tty_dev = tty_register_device(drv->tty_driver, port->line, port->dev);
+	if (likely(!IS_ERR(tty_dev))) {
+		device_can_wakeup(tty_dev) = 1;
+		device_set_wakeup_enable(tty_dev, 0);
+	} else
+		printk(KERN_ERR "Cannot register tty device on line %d\n",
+		       port->line);
 
 	/*
 	 * If this driver supports console, and it hasn't been


Re: [PATCH] debloat aic7xxx and aic79xx drivers by deinlining

2007-08-25 Thread Arjan van de Ven
On Sat, 25 Aug 2007 22:57:07 +0100
Denys Vlasenko <[EMAIL PROTECTED]> wrote:

> Hi,
> 
> Attached patch deinlines and moves big functions from .h to .c files
> in drivers/scsi/aic7xxx/*. I also had to add prototypes for
> ahc_lookup_scb and ahd_lookup_scb to .h files.
> 

one question... how many of these can actually be static (or would be
if they were in their only-caller-c-file) ? Did you run the find static
script or are you waiting for Adrian to do that ;-)


Re: [PATCH] FS: Make RAMFS both selectable and tristate.

2007-08-25 Thread Al Viro
On Sat, Aug 25, 2007 at 05:40:00PM -0400, Robert P. J. Day wrote:
> On Sat, 25 Aug 2007, Al Viro wrote:
> 
> > On Sat, Aug 25, 2007 at 03:40:23PM -0400, Robert P. J. Day wrote:
> > >
> > > Allow RAMFS to be user-selectable, and to be built as a module.
> > >
> > > Signed-off-by: Robert P. J. Day <[EMAIL PROTECTED]>
> > >
> > > ---
> > >
> > >   given that the help content for that option suggests it can be built
> > > as a module, it just makes sense to make it selectable and tristate,
> > > unless someone has a compelling argument against it.
> >
> > How about "check if the kernel builds if you do that"?
> 
> i did.  i did a simple "make defconfig" and "make", and the kernel
> built fine.  that patch didn't change the status of RAMFS in any way,
> it was still selected as default "y", so why would that patch have
> made any difference to the eventual build?

Your patch allows making it a module.  That seems to be the only point of
your patch.  So check if it builds when RAMFS is made "m" or "n".


Re: [patch 2/2] Sort module list by pointer address to get coherent sleepable seq_file iterators

2007-08-25 Thread Mathieu Desnoyers
* Rusty Russell ([EMAIL PROTECTED]) wrote:
> On Fri, 2007-08-24 at 11:45 -0400, Mathieu Desnoyers wrote:
> > * Rusty Russell ([EMAIL PROTECTED]) wrote:
> > > On Mon, 2007-08-20 at 16:26 -0400, Mathieu Desnoyers wrote:
> > > > plain text document attachment (module.c-sort-module-list.patch)
> > > > A race appears both in /proc/modules and in kallsyms: if, between the
> > > > seq file reads, the process is put to sleep and at this moment a module
> > > > is added to or removed from the module list, the listing will skip an
> > > > amount of modules/symbols corresponding to the amount of elements
> > > > present in the unloaded module, but at the current position in the list
> > > > if the iteration is located after the removed module.
> > > > 
> > > > The cleanest way I found to deal with this problem is to sort the module
> > > > list. We can then keep the old struct module * as the old iterator,
> > > > knowing that it may be removed between the seq file reads, but we only
> > > > use it as "get next". If it is not present in the module list, the next
> > > > pointer will be used.
> > > > 
> > > > By doing this, removing a given module will now only fuzz the output
> > > > related to this specific module, not any random module anymore. Since
> > > > modprobe uses /proc/modules, it might be important to make sure multiple
> > > > concurrently running modprobes won't interfere with each other.
> > > 
> > > You've reduced, but not eliminated, the problem.  A new module inserted
> > > is quite likely to reuse the same address.
> > > 
> > 
> > Hi Rusty,
> > 
> > Please tell me if I'm wrong, but I think it would not be a problem:
> > 
> > - seq_read() makes sure that a buffer large enough is available so that
> >   m_show() can fully extract and print the information relative to 1
> >   module.
> > - m_start() and m_stop() takes the module_mutex, therefore within one
> >   seq_read(), once m_start has returned, the struct module * that we
> >   have is valid and will be consistent during the whole seq_read
> >   operation.
> > - If a module is removed, and then a different one is inserted at the
> >   same address, while we are between two seq_reads for this given struct
> >   module address, the seq_reads will copy to user-space the information
> >   that is still in the buffer for the _first_ struct module encountered,
> >   not the new one.
> > - After that, iteration will continue to the new struct module address,
> >   effectively skipping the newly inserted module.
> 
> Indeed, I thought that this was a general problem: the seq_list code was
> never intended to work on modifiable lists unless you get them in one
> big read.
> 
> If we accept this problem, what do we do about all the other users?
> 

Hum, I guess it would be best for them to switch to the proposed seq
sorted list too. I think that having one example (module.c) that shows
well how this works will be an incentive for other developers to port
their seq_file code to the sorted list (I am thinking, among others,
about kallsyms).
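
To make the "sorted list, old pointer as get-next" idea concrete, here is a
minimal userspace sketch. It is not the code from the patch: struct entry,
sorted_insert(), sorted_remove() and resume_at() are hypothetical names, and
resuming at the first address >= the saved cursor is an assumption of this
sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct module; only the list linkage matters. */
struct entry {
	struct entry *next;
	const char *name;
};

/* Insert n so that the list stays sorted by ascending entry address. */
static void sorted_insert(struct entry **head, struct entry *n)
{
	struct entry **pp = head;

	assert(n != NULL);
	while (*pp && (uintptr_t)*pp < (uintptr_t)n)
		pp = &(*pp)->next;
	n->next = *pp;
	*pp = n;
}

/* Unlink n if it is on the list; the caller still owns the memory. */
static void sorted_remove(struct entry **head, struct entry *n)
{
	struct entry **pp = head;

	while (*pp && *pp != n)
		pp = &(*pp)->next;
	if (*pp)
		*pp = n->next;
}

/*
 * Resume iteration from a saved cursor.  The cursor is only compared,
 * never dereferenced, so it may point at an entry that has been removed.
 * Returns the first entry whose address is >= cursor, or NULL at the end.
 */
static struct entry *resume_at(struct entry *head, const void *cursor)
{
	while (head && (uintptr_t)head < (uintptr_t)cursor)
		head = head->next;
	return head;
}
```

Removing an entry below the cursor no longer shifts the position, and removing
the entry the cursor points at simply resumes at its successor. The sketch
cannot tell a new entry at a reused address from the old one, which is the case
Rusty raised.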

Mathieu

> Rusty.
> 

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


Re: [PATCH] FS: Make RAMFS both selectable and tristate.

2007-08-25 Thread Robert P. J. Day
On Sat, 25 Aug 2007, Al Viro wrote:

> On Sat, Aug 25, 2007 at 03:40:23PM -0400, Robert P. J. Day wrote:
> >
> > Allow RAMFS to be user-selectable, and to be built as a module.
> >
> > Signed-off-by: Robert P. J. Day <[EMAIL PROTECTED]>
> >
> > ---
> >
> >   given that the help content for that option suggests it can be built
> > as a module, it just makes sense to make it selectable and tristate,
> > unless someone has a compelling argument against it.
>
> How about "check if the kernel builds if you do that"?

ah, i see what you mean -- selecting it as a module.  apparently,
then, the "help" text telling me "To compile this as a module, choose
M here: the module will be called ramfs." was overly optimistic.

rday
--

Robert P. J. Day
Linux Consulting, Training and Annoying Kernel Pedantry
Waterloo, Ontario, CANADA

http://crashcourse.ca



[PATCH] allow send/recv(MSG_DONTWAIT) on non-sockets

2007-08-25 Thread Denys Vlasenko
Hello list,

This patch attempts to make it possible to do a nonblocking read or
write of fd's pointing to possibly shared struct file's in a non-racy
manner, i.e. without using fcntl.

One use case is when you want to read standard input, but
don't want to switch fd 0 to O_NONBLOCK mode: if you get SIGKILLed,
O_NONBLOCK will stay and your parent (e.g. a shell) can be upset.

On Tuesday 14 August 2007 13:33, Alan Cox wrote:
> > b) Make recv(fd, buf, size, flags) and send(fd, buf, size, flags);
> >    work with non-socket fds too, for flags==0 or flags==MSG_DONTWAIT.
> >    (it's ok to fail with "socket op on non-socket fd" for other values
> >    of flags)
>
> I think that makes a lot of sense, and to be honest other MSG_ flags make
> useful sense and have meaningful semantics that might be helpful
> elsewhere if ever coded that way.

Yes, that's my feeling too.

> If you want to do this the first job is going to be to sort out the way
> non-block is propagated to device driver read/write handlers. At the
> moment they all check filp->f_flags

...which happens in ~250 files. I'd rather not touch that much
of code, if possible.

Attached patch detects send/recv(fd, buf, size, MSG_DONTWAIT) on
non-sockets and turns them into non-blocking write/read.
Since filp->f_flags appears to be read and modified without any locking,
I cannot modify it without potentially affecting other processes
accessing the same file through shared struct file.
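
The sharing is easy to observe from userspace. A minimal sketch, assuming two
descriptors that refer to the same open file (for example after dup() or
fork()); set_nonblock() and nonblock_is_set() are hypothetical helper names:

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Set O_NONBLOCK through fd; returns 0 on success, -1 on error. */
static int set_nonblock(int fd)
{
	int fl;

	assert(fd >= 0);
	fl = fcntl(fd, F_GETFL);
	if (fl < 0)
		return -1;
	return fcntl(fd, F_SETFL, fl | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Returns 1 if O_NONBLOCK is set, 0 if clear, -1 on error. */
static int nonblock_is_set(int fd)
{
	int fl = fcntl(fd, F_GETFL);

	if (fl < 0)
		return -1;
	return (fl & O_NONBLOCK) ? 1 : 0;
}
```

After d = dup(fd) and set_nonblock(d), nonblock_is_set(fd) returns 1 as well,
because both descriptors share one struct file. This is why toggling the flag
around a single read or write is visible to every other holder of the file.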

Therefore I simply make a temporary copy of struct file, set
O_NONBLOCK in it and pass it to vfs_read/write.
Is this heresy? ;) I see only one spinlock in struct file:

#ifdef CONFIG_EPOLL
        spinlock_t              f_ep_lock;
#endif /* #ifdef CONFIG_EPOLL */

Do I need to take it?

Also attached is ndelaytest.c which can be used to test that
send(MSG_DONTWAIT) indeed is failing with EAGAIN if write would block
and that other processes never see O_NONBLOCK set.

Comments?
--
vda
#include <sys/types.h>
#include <sys/socket.h>
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

#define SECONDS 10

#define STR "."
//#define STR "123456789 123456789 123456789 123456789 "

/* To see send() resulting in EAGAIN:
 * strace -ff -o log ndelaytest | while sleep 11; do break; done
 * log.$PID:
 * send(1, "123456789 123456789 123456789 12"..., 40, MSG_DONTWAIT)
 *= -1 EAGAIN (Resource temporarily unavailable)
 */

int main()
{
	pid_t pid;
	time_t t;
	int fl;

	puts("starting");
	t = time(0);

	pid = fork();
	if (pid == 0) {
		/* child */
		while ((time(0) - t) < SECONDS-1) {
#if 0 
			/* Uncomment this part and simply run the executable
			 * to see race detection code in action */
#define OP "write"
			fcntl(1, F_SETFL, fcntl(1, F_GETFL) | O_NONBLOCK);
			fl = write(1, STR, sizeof(STR) - 1);
			fcntl(1, F_SETFL, fcntl(1, F_GETFL) & ~O_NONBLOCK);
#else
			/* This part tests whether send(MSG_DONTWAIT)
			 * is racy or not */
#define OP "send"
			fl = send(1, STR, sizeof(STR) - 1, MSG_DONTWAIT);
#endif
			if (fl < 0) {
				perror(OP);
				kill(getppid(), SIGKILL);
				return 1;
			}
		}
		return 0;
	}

	while ((time(0) - t) < SECONDS) {
		fl = fcntl(1, F_GETFL);
		if (fl & O_NONBLOCK) {
			fprintf(stderr, "NONBLOCK:1\n");
			kill(pid, SIGKILL);
			fcntl(1, F_SETFL, fl & ~O_NONBLOCK);
			return 1;
		}
	}
	fprintf(stderr, "NONBLOCK:0\n");
	return 0;
}
--- linux-2.6.22-rc6.src/fs/read_write.c	Fri Jun 15 19:30:05 2007
+++ linux-2.6.22-rc6_ndelay/fs/read_write.c	Sun Aug 19 10:43:24 2007
@@ -15,6 +15,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "read_write.h"
 
 #include 
@@ -351,6 +352,36 @@
 static inline void file_pos_write(struct file *file, loff_t pos)
 {
 	file->f_pos = pos;
+}
+
+/* Helper for send/recv on non-sockets */
+ssize_t rw_with_flags(struct file *file, int fput_needed, void __user *buf, size_t count, unsigned flags)
+{
+	int err;
+	loff_t pos;
+	struct file *file_copy;
+
+	file_copy = file;
+	if (flags & MSG_DONTWAIT) {
+		/* We make copy even if O_NONBLOCK is already set. */
+		/* We don't want it to change under our feet. */
+		file_copy = kmalloc(sizeof(*file_copy), GFP_KERNEL);
+		memcpy(file_copy, file, sizeof(*file_copy));
+		file_copy->f_flags |= O_NONBLOCK;
+	}
+
+	pos = file_pos_read(file);
+	if (flags & MSG_OOB) /* MSG_OOB is reused to mean 'write' */
+		err = vfs_write(file_copy, buf, count, &pos);
+	else
+		err = vfs_read(file_copy, buf, count, &pos);
+	file_pos_write(file, pos);
+
+	if (flags & MSG_DONTWAIT) {
+		kfree(file_copy);
+	}
+	fput_light(file, fput_needed);
+	return err;
 }
 
 asmlinkage ssize_t sys_read(unsigned int fd, char __user * buf, size_t count)
--- linux-2.6.22-rc6.src/include/linux/fs.h	Wed Jun 27 21:24:18 2007
+++ linux-2.6.22-rc6_ndelay/include/linux/fs.h	Sun Aug 19 10:32:20 2007
@@ -1154,6 +1154,9 @@
 extern ssize_t vfs_writev(struct file *, const struct iovec __user *,
 		unsigned long, loff_t *);
 
+extern ssize_t rw_with_flags(struct file *, int, void __user *, size_t,
+		unsigned);
+
 /*
  * NOTE: write_inode, delete_inode, 

Re: [PATCH] FS: Make RAMFS both selectable and tristate.

2007-08-25 Thread Robert P. J. Day
On Sat, 25 Aug 2007, Al Viro wrote:

> On Sat, Aug 25, 2007 at 03:40:23PM -0400, Robert P. J. Day wrote:
> >
> > Allow RAMFS to be user-selectable, and to be built as a module.
> >
> > Signed-off-by: Robert P. J. Day <[EMAIL PROTECTED]>
> >
> > ---
> >
> >   given that the help content for that option suggests it can be built
> > as a module, it just makes sense to make it selectable and tristate,
> > unless someone has a compelling argument against it.
>
> How about "check if the kernel builds if you do that"?

i did.  i did a simple "make defconfig" and "make", and the kernel
built fine.  that patch didn't change the status of RAMFS in any way,
it was still selected as default "y", so why would that patch have
made any difference to the eventual build?

rday
-- 

Robert P. J. Day
Linux Consulting, Training and Annoying Kernel Pedantry
Waterloo, Ontario, CANADA

http://crashcourse.ca



Re: [patch 2/2] Sort module list by pointer address to get coherent sleepable seq_file iterators

2007-08-25 Thread Rusty Russell
On Fri, 2007-08-24 at 11:45 -0400, Mathieu Desnoyers wrote:
> * Rusty Russell ([EMAIL PROTECTED]) wrote:
> > On Mon, 2007-08-20 at 16:26 -0400, Mathieu Desnoyers wrote:
> > > plain text document attachment (module.c-sort-module-list.patch)
> > > A race that appears both in /proc/modules and in kallsyms: if, between the
> > > seq file reads, the process is put to sleep and at this moment a module is
> > > added or removed from the module list, the listing will skip an amount of
> > > modules/symbols corresponding to the amount of elements present in the unloaded
> > > module, but at the current position in the list if the iteration is located
> > > after the removed module.
> > > 
> > > The cleanest way I found to deal with this problem is to sort the module list.
> > > We can then keep the old struct module * as the old iterator, knowing that it may
> > > be removed between the seq file reads, but we only use it as "get next". If it
> > > is not present in the module list, the next pointer will be used.
> > > 
> > > By doing this, removing a given module will now only fuzz the output related to
> > > this specific module, not any random module anymore. Since modprobe uses
> > > /proc/modules, it might be important to make sure multiple concurrent running
> > > modprobes won't interfere with each other.
> > 
> > You've reduced, but not eliminated, the problem.  A new module inserted
> > is quite likely to reuse the same address.
> > 
> 
> Hi Rusty,
> 
> Please tell me if I'm wrong, but I think it would not be a problem:
> 
> - seq_read() makes sure that a buffer large enough is available so that
>   m_show() can fully extract and print the information relative to 1
>   module.
> - m_start() and m_stop() takes the module_mutex, therefore within one
>   seq_read(), once m_start has returned, the struct module * that we
>   have is valid and will be consistent during the whole seq_read
>   operation.
> - If a module is removed, and then a different one is inserted at the
>   same address, while we are between two seq_reads for this given struct
>   module address, the seq_reads will copy to user-space the information
>   that is still in the buffer for the _first_ struct module encountered,
>   not the new one.
> - After that, iteration will continue to the new struct module address,
>   effectively skipping the newly inserted module.

Indeed, I thought that this was a general problem: the seq_list code was
never intended to work on modifiable lists unless you get them in one
big read.

If we accept this problem, what do we do about all the other users?

Rusty.




Re: [PATCH] FS: Make RAMFS both selectable and tristate.

2007-08-25 Thread Al Viro
On Sat, Aug 25, 2007 at 03:40:23PM -0400, Robert P. J. Day wrote:
> 
> Allow RAMFS to be user-selectable, and to be built as a module.
> 
> Signed-off-by: Robert P. J. Day <[EMAIL PROTECTED]>
> 
> ---
> 
>   given that the help content for that option suggests it can be built
> as a module, it just makes sense to make it selectable and tristate,
> unless someone has a compelling argument against it.

How about "check if the kernel builds if you do that"?


Re: [patch 1/4] Linux Kernel Markers - Architecture Independent Code

2007-08-25 Thread Mathieu Desnoyers
* Rusty Russell ([EMAIL PROTECTED]) wrote:
> On Fri, 2007-08-24 at 12:26 -0400, Mathieu Desnoyers wrote:
> > * Rusty Russell ([EMAIL PROTECTED]) wrote:
> > > On Mon, 2007-08-20 at 16:27 -0400, Mathieu Desnoyers wrote:
> > > > +{
> > > > +   struct hlist_head *head;
> > > > +   struct hlist_node *node;
> > > > +   struct marker_entry *e;
> > > > +   size_t len = strlen(name) + 1;
> > > > +   u32 hash = jhash(name, len-1, 0);
> > > > +
> > > > +   head = &marker_table[hash & ((1 << MARKER_HASH_BITS)-1)];
> > > > +   hlist_for_each_entry(e, node, head, hlist) {
> > > > +   if (!strcmp(name, e->name))
> > > > +   return e;
> > > > +   }
> > > > +   return NULL;
> > > > +}
> > > 
> > > OK, don't understand the strlen, len, len-1 dance here?
> > > 
> > 
> > Let's say we have abc\0 for marker name as name input.
> > 
> > len = 3 + 1 = 4 (including \0)
> > hash is done only on the 3 first chars, excluding the \0 (therefore the
> >   len-1 there)
> > 
> > Actually, it's like this only for a matter of consistency between
> > add_marker and remove_marker, which are quite similar, but add_marker
> > needs name_len to include the \0 value. It would be odd to change the
> > logic between the two functions to one including the \0 and the other
> > excluding it.
> 
> Sure, but that doesn't really explain why the code does:
> 
>   size_t len = strlen(name) + 1;
>   u32 hash = jhash(name, len-1, 0);
> 
> Rather than:
> 
>   u32 hash = jhash(name, strlen(name), 0);
> 

Yup, good point. Fixed.
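
For illustration, the simplified lookup can be sketched in userspace as
follows. This is not the marker code: hash_bytes() is FNV-1a standing in for
the kernel's jhash(), and HASH_BITS, struct named_entry, add_entry() and
get_entry() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HASH_BITS 6	/* hypothetical table size: 64 buckets */

struct named_entry {
	struct named_entry *next;
	const char *name;
};

static struct named_entry *table[1 << HASH_BITS];

/* FNV-1a, a stand-in for jhash(); hashes exactly len bytes. */
static uint32_t hash_bytes(const void *data, size_t len)
{
	const unsigned char *p = data;
	uint32_t h = 2166136261u;

	while (len--) {
		h ^= *p++;
		h *= 16777619u;
	}
	return h;
}

/* Pick the bucket for name; strlen() already excludes the trailing NUL. */
static struct named_entry **bucket_for(const char *name)
{
	uint32_t hash = hash_bytes(name, strlen(name));

	return &table[hash & ((1u << HASH_BITS) - 1)];
}

/* Push e onto the front of its bucket. */
static void add_entry(struct named_entry *e)
{
	struct named_entry **head;

	assert(e != NULL && e->name != NULL);
	head = bucket_for(e->name);
	e->next = *head;
	*head = e;
}

/* Walk the bucket and compare full names, as the quoted function does. */
static struct named_entry *get_entry(const char *name)
{
	struct named_entry *e;

	for (e = *bucket_for(name); e; e = e->next)
		if (!strcmp(name, e->name))
			return e;
	return NULL;
}
```

Since both the add and the lookup path go through bucket_for(), they cannot
disagree on whether the NUL is hashed, and no separate len variable is needed.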

Thanks,

Mathieu

> > Thanks for the review,
> 
> That's fine, just some light reading...
> 
> Cheers,
> Rusty.
> > 
> 

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


Re: [git pull request] scheduler updates

2007-08-25 Thread Peter Zijlstra
On Sat, 2007-08-25 at 19:23 +0200, Ingo Molnar wrote:

> Peter and me tested this all day with various workloads and extreme-load 
> behavior has improved all over the place
 
Adaptive granularity makes a large difference for me on my somewhat
ancient laptop (1200 MHz). When browsing the interweb using firefox (or
trying to) while doing a (non-niced) kbuild -j5 the difference in
interactivity is significant.
 
[ kbuild -j5 was quite unbearable on 2.6.22 - so CFS is a clear win in
any case ]
 
The reduced latency is clearly noticeable in a much smoother scroll
behaviour. Whereas both still present a usable browsing experience the
clear reduction in latency spikes makes it much more pleasant.




Re: [-mm patch] enforce noreplace-smp in alternative_instructions()

2007-08-25 Thread Frederik Deweerdt
On Sat, Aug 25, 2007 at 02:23:24PM +0200, Frederik Deweerdt wrote:
> On Sat, Aug 25, 2007 at 10:07:29PM +1000, Rusty Russell wrote:
> > Do you need noreplace-smp even on 2.6.23-rc3,
> > or only 2.6.23-rc3-mm1?
> I'll try ASAP.
> 
Ok, tested: I need noreplace-smp + patch to make it work on mainline too.

Regards,
Frederik


2.6.23-rc3 USB segfaults + urb status -32

2007-08-25 Thread Lasse Kärkkäinen
My system is unusably unstable using this kernel. On last boot it
started flooding urb status -32 to kernel log at a rate of several
megabytes per second. Now it printed segfaults before the system had
finished booting and then some other errors... The full log is here:

I couldn't find information on these bugs. If you need more debug info,
please contact me. I can also reproduce the errors without the Nvidia
kernel module, if that is really necessary (note, however, that the
first segfaults in this log happen before the module loads).

I think that part of the USB problems may be related to M-Audio
FastTrack Pro USB sound card, because I have managed to crash the kernel
USB system before (with a 32 bit kernel, and also on another computer)
by switching bConfigurationValue of that card.

Running on x86-64 with Core2 Q6600 B3.

Linux version 2.6.23-rc3 ([EMAIL PROTECTED]) (gcc version 4.1.1 (Gentoo
4.1.1-r3)) #1 SMP PREEMPT Sat Aug 25 10:01:23 EEST 2007
Command line: root=/dev/md3 usbhid.mousepoll=2 usb-storage.delay_use=0
BIOS-provided physical RAM map:
 BIOS-e820:  - 0009f800 (usable)
 BIOS-e820: 0009f800 - 000a (reserved)
 BIOS-e820: 000f - 0010 (reserved)
 BIOS-e820: 0010 - 7fee (usable)
 BIOS-e820: 7fee - 7fee3000 (ACPI NVS)
 BIOS-e820: 7fee3000 - 7fef (ACPI data)
 BIOS-e820: 7fef - 7ff0 (reserved)
 BIOS-e820: f000 - f400 (reserved)
 BIOS-e820: fec0 - 0001 (reserved)
Entering add_active_range(0, 0, 159) 0 entries of 256 used
Entering add_active_range(0, 256, 524000) 1 entries of 256 used
end_pfn_map = 1048576
DMI 2.4 present.
ACPI: RSDP 000F6CF0, 0014 (r0 GBT   )
ACPI: RSDT 7FEE3040, 0034 (r1 GBTGBTUACPI 42302E31 GBTU  1010101)
ACPI: FACP 7FEE30C0, 0074 (r1 GBTGBTUACPI 42302E31 GBTU  1010101)
ACPI: DSDT 7FEE3180, 49F4 (r1 GBTGBTUACPI 1000 MSFT  10C)
ACPI: FACS 7FEE, 0040
ACPI: HPET 7FEE7CC0, 0038 (r1 GBTGBTUACPI 42302E31 GBTU   98)
ACPI: MCFG 7FEE7D40, 003C (r1 GBTGBTUACPI 42302E31 GBTU  1010101)
ACPI: APIC 7FEE7BC0, 0084 (r1 GBTGBTUACPI 42302E31 GBTU  1010101)
Entering add_active_range(0, 0, 159) 0 entries of 256 used
Entering add_active_range(0, 256, 524000) 1 entries of 256 used
Zone PFN ranges:
  DMA             0 ->     4096
  DMA32        4096 ->  1048576
  Normal    1048576 ->  1048576
Movable zone start PFN for each node
early_node_map[2] active PFN ranges
    0:        0 ->      159
    0:      256 ->   524000
On node 0 totalpages: 523903
  DMA zone: 56 pages used for memmap
  DMA zone: 10 pages reserved
  DMA zone: 3933 pages, LIFO batch:0
  DMA32 zone: 7108 pages used for memmap
  DMA32 zone: 512796 pages, LIFO batch:31
  Normal zone: 0 pages used for memmap
  Movable zone: 0 pages used for memmap
ACPI: PM-Timer IO Port: 0x408
ACPI: Local APIC address 0xfee0
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Processor #0 (Bootup-CPU)
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
Processor #1
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x03] enabled)
Processor #3
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x02] enabled)
Processor #2
ACPI: LAPIC_NMI (acpi_id[0x00] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x01] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x02] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x03] dfl dfl lint[0x1])
ACPI: IOAPIC (id[0x02] address[0xfec0] gsi_base[0])
IOAPIC[0]: apic_id 2, address 0xfec0, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
ACPI: IRQ0 used by override.
ACPI: IRQ2 used by override.
ACPI: IRQ9 used by override.
Setting APIC routing to flat
ACPI: HPET id: 0x8086a201 base: 0xfed0
Using ACPI (MADT) for SMP configuration information
Allocating PCI resources starting at 8000 (gap: 7ff0:7010)
SMP: Allowing 4 CPUs, 0 hotplug CPUs
PERCPU: Allocating 31200 bytes of per cpu data
Built 1 zonelists in Zone order.  Total pages: 516729
Kernel command line: root=/dev/md3 usbhid.mousepoll=2
usb-storage.delay_use=0
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 32768 bytes)
time.c: Detected 2400.000 MHz processor.
Console: colour VGA+ 80x25
console [tty0] enabled
Dentry cache hash table entries: 262144 (order: 9, 2097152 bytes)
Inode-cache hash table entries: 131072 (order: 8, 1048576 bytes)
Checking aperture...
Memory: 2055772k/2096000k available (4847k kernel code, 39476k reserved,
1890k data, 268k init)
Calibrating delay using timer specific routine.. 4802.57 BogoMIPS
(lpj=2401287)
Mount-cache hash table entries: 256
CPU: L1 I cache: 32K, L1 D cache: 32K
CPU: L2 cache: 4096K
using mwait in idle threads.
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 0
SMP alternatives: switching to UP code
ACPI: Core revision 20070126
Using local APIC timer interrupts.
result 1654
Detected 16.666 MHz APIC timer.
SMP 

Re: [patch 1/4] Linux Kernel Markers - Architecture Independent Code

2007-08-25 Thread Rusty Russell
On Fri, 2007-08-24 at 12:26 -0400, Mathieu Desnoyers wrote:
> * Rusty Russell ([EMAIL PROTECTED]) wrote:
> > On Mon, 2007-08-20 at 16:27 -0400, Mathieu Desnoyers wrote:
> > > +{
> > > + struct hlist_head *head;
> > > + struct hlist_node *node;
> > > + struct marker_entry *e;
> > > + size_t len = strlen(name) + 1;
> > > + u32 hash = jhash(name, len-1, 0);
> > > +
> > > + head = &marker_table[hash & ((1 << MARKER_HASH_BITS)-1)];
> > > + hlist_for_each_entry(e, node, head, hlist) {
> > > + if (!strcmp(name, e->name))
> > > + return e;
> > > + }
> > > + return NULL;
> > > +}
> > 
> > OK, don't understand the strlen, len, len-1 dance here?
> > 
> 
> Let's say we have abc\0 for marker name as name input.
> 
> len = 3 + 1 = 4 (including \0)
> hash is done only on the 3 first chars, excluding the \0 (therefore the
>   len-1 there)
> 
> Actually, it's like this only for a matter of consistency between
> add_marker and remove_marker, which are quite similar, but add_marker
> needs name_len to include the \0 value. It would be odd to change the
> logic between the two functions to one including the \0 and the other
> excluding it.

Sure, but that doesn't really explain why the code does:

size_t len = strlen(name) + 1;
u32 hash = jhash(name, len-1, 0);

Rather than:

u32 hash = jhash(name, strlen(name), 0);

> Thanks for the review,

That's fine, just some light reading...

Cheers,
Rusty.
> 



Re: [git pull request] scheduler updates

2007-08-25 Thread Ingo Molnar

* Ingo Molnar <[EMAIL PROTECTED]> wrote:

>git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched.git
> 
> Find the shortlog further below. There are 3 commits in it: adaptive 
> granularity, a subsequent cleanup, and a lockdep sysctl bug Peter 
> noticed while hacking on this. (the bug was introduced with the 
> initial CFS commits but nobody noticed because the lockdep sysctls are 
> rarely used.)

hm, a small (and mostly harmless) buglet sneaked into it: the wakeup 
granularity and the runtime limit are now dependent on sched_latency - 
while they should be dependent on min_granularity and latency. To pick up 
that fix please pull from:

git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched.git

(on top of your very latest git tree)

the effect of this bug was too high wakeup latency that could cause 
audio skipping on small-audio-buffer setups. (didn't happen on mine, they 
have large enough buffers.)

Ingo

-->
Ingo Molnar (1):
  sched: s/sched_latency/sched_min_granularity

 sched.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)


[PATCH -mm 2/2] Hibernation: Arbitrary boot kernel support on x86_64 (updated)

2007-08-25 Thread Rafael J. Wysocki
On Saturday, 25 August 2007 20:27, Rafael J. Wysocki wrote:
> On Friday, 24 August 2007 22:46, Pavel Machek wrote:
> > Hi!
> > 
> > > From: Rafael J. Wysocki <[EMAIL PROTECTED]>
> > > 
> > > Make it possible to restore a hibernation image on x86_64 with the help 
> > > of a
> > > kernel different from the one in the image.
> > > 
> > > The idea is to split the core restoration code into two separate parts 
> > > and to
> > > place each of them in a different page.  The first part belongs to the 
> > > boot
> > 
> > What happens in case where both parts want to be
> > at the same place? (Like kernel being restored is 4KB smaller, so that
> > routines now collide?)
> 
> Bad things, but I can't see how to avoid that reliably.

Below is an analogous patch without this problem.  The slightly ugly thing
about it is that all pages in the temporary mapping have the NX bit cleared
now, so that we can run some code out of one of them.  Still, IMO, that isn't
really important, because the temporary page tables are dropped as soon as
we jump to restore_registers.

Greetings,
Rafael

---
From: Rafael J. Wysocki <[EMAIL PROTECTED]>

Make it possible to restore a hibernation image on x86_64 with the help of a
kernel different from the one in the image.

The idea is to split the core restoration code into two separate parts and to
place each of them in a different page.  The first part belongs to the boot
kernel and is executed as the last step of the image kernel's memory restoration
procedure.  Before being executed, it is relocated to a safe page that won't be
overwritten while copying the image kernel pages.

The final operation performed by it is a jump to the second part of the core
restoration code that belongs to the image kernel and has just been restored.
This code makes the CPU switch to the image kernel's page tables and
restores the state of general purpose registers (including the stack pointer)
from before the hibernation.

The main issue with this idea is that in order to jump to the second part of the
core restoration code the boot kernel needs to know its address.  However, this
address may be passed to it in the image header.  Namely, the part of the image
header previously used for checking if the version of the image kernel is
correct can be replaced with some architecture specific data that will allow
the boot kernel to jump to the right address within the image kernel.  These
data should also be used for checking if the image kernel is compatible with
the boot kernel (as far as the memory restoration procedure is concerned).
It can be done, for example, with the help of a "magic" value that has to be
equal in both kernels, so that they can be regarded as compatible.

Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]>
---
 arch/x86_64/Kconfig  |5 +++
 arch/x86_64/kernel/suspend.c |   54 ++-
 arch/x86_64/kernel/suspend_asm.S |   41 -
 include/asm-x86_64/suspend.h |3 ++
 4 files changed, 95 insertions(+), 8 deletions(-)

Index: linux-2.6.23-rc3/arch/x86_64/kernel/suspend_asm.S
===
--- linux-2.6.23-rc3.orig/arch/x86_64/kernel/suspend_asm.S  2007-08-25 22:09:53.0 +0200
+++ linux-2.6.23-rc3/arch/x86_64/kernel/suspend_asm.S   2007-08-25 22:10:25.0 +0200
@@ -2,8 +2,8 @@
  *
  * Distribute under GPLv2.
  *
- * swsusp_arch_resume may not use any stack, nor any variable that is
- * not "NoSave" during copying pages:
+ * swsusp_arch_resume must not use any stack or any nonlocal variables while
+ * copying pages:
  *
  * Its rewriting one kernel image with another. What is stack in "old"
  * image could very well be data page in "new" image, and overwriting
@@ -36,6 +36,10 @@ ENTRY(swsusp_arch_suspend)
    pushfq
    popq    pt_regs_eflags(%rax)
 
+   /* save the address of restore_registers */
+   movq    $restore_registers, %rax
+   movq    %rax, restore_jump_address(%rip)
+
    call swsusp_save
    ret
 
@@ -54,7 +58,16 @@ ENTRY(restore_image)
    movq    %rcx, %cr3;
    movq    %rax, %cr4;  # turn PGE back on
 
+   /* prepare to jump to the image kernel */
+   movq    restore_jump_address(%rip), %rax
+
+   /* prepare to copy image data to their original locations */
    movq    restore_pblist(%rip), %rdx
+   movq    relocated_restore_code(%rip), %rcx
+   jmpq    *%rcx
+
+   /* code below has been relocated to a safe page */
+ENTRY(core_restore_code)
 loop:
    testq   %rdx, %rdx
    jz  done
@@ -62,7 +75,7 @@ loop:
    /* get addresses from the pbe and copy the page */
    movq    pbe_address(%rdx), %rsi
    movq    pbe_orig_address(%rdx), %rdi
-   movq    $512, %rcx
+   movq    $(PAGE_SIZE >> 3), %rcx
    rep
    movsq
 
@@ -70,6 +83,20 @@ loop:
    movq    pbe_next(%rdx), %rdx
    jmp loop
 done:
+   /* jump to 

Re: [PATCH -mm 2/2] Hibernation: Arbitrary boot kernel support on x86_64

2007-08-25 Thread Rafael J. Wysocki
On Saturday, 25 August 2007 01:23, Andrew Morton wrote:
> On Fri, 24 Aug 2007 12:11:54 +0200
> "Rafael J. Wysocki" <[EMAIL PROTECTED]> wrote:
> 
> > Index: linux-2.6.23-rc3/include/asm-x86_64/suspend.h
> > ===
> > --- linux-2.6.23-rc3.orig/include/asm-x86_64/suspend.h  2007-08-21 20:36:49.0 +0200
> > +++ linux-2.6.23-rc3/include/asm-x86_64/suspend.h   2007-08-21 20:37:47.0 +0200
> > @@ -43,4 +43,10 @@ extern void fix_processor_context(void);
> >  /* routines for saving/restoring kernel state */
> >  extern int acpi_save_state_mem(void);
> >  
> > +#define ARCH_HAS_HIBERNATION_HEADER
> 
> The preferred way of doing this is via Kconfig, please.  ie: add a
> CONFIG_HIBERNATION_HEADER to arch/x86_64/Kconfig.

OK

> > +
> > +/* arch/x86_64/kernel/suspend.c */
> > +extern int arch_hibernation_header_save(void *addr, unsigned int max_size);
> > +extern int arch_hibernation_header_restore(void *addr);
> 
> Given that these are called from non-arch-specific code, they must have the
> same signature across all architectures.  So there's no point in putting
> the prototypes into an arch-specific header file.
> 
> It would be better to do something like this in (say) suspend.h:
> 
> #ifdef CONFIG_HIBERNATION_HEADER
> extern int arch_hibernation_header_save(void *addr, unsigned int max_size);
> extern int arch_hibernation_header_restore(void *addr);
> #else
> static inline int arch_hibernation_header_save(void *addr,
>   unsigned int max_size)
> {
>   return 0;
> }
> 
> static inline int arch_hibernation_header_restore(void *addr)
> {
>   return 0;
> }
> #endif
> 
> then go nuke some ifdefs from the .c files.

The ifdefs in snapshot.c are necessary anyway, because they are around some
code that is only compiled when the CONFIG_HIBERNATION_HEADER is undefined.
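
For reference, a userspace sketch of what such a hook pair could do, using the
two signatures quoted above and the "magic" value plus jump address described
in the patch changelog. The struct layout, the magic constant and the error
codes are assumptions of this sketch, not taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical magic value; both kernels must agree on it. */
#define SKETCH_RESTORE_MAGIC 0x0123456789ABCDEFULL

/* Hypothetical layout of the arch-specific part of the image header. */
struct sketch_restore_header {
	uint64_t jump_address;	/* where the boot kernel should jump */
	uint64_t magic;		/* compatibility check */
};

/* Stand-in for the address saved by the image kernel before suspend. */
static uint64_t restore_jump_address;

/* Image kernel side: write the header into the buffer at addr. */
int arch_hibernation_header_save(void *addr, unsigned int max_size)
{
	struct sketch_restore_header hdr;

	assert(addr != NULL);
	if (max_size < sizeof(hdr))
		return -EOVERFLOW;
	hdr.jump_address = restore_jump_address;
	hdr.magic = SKETCH_RESTORE_MAGIC;
	memcpy(addr, &hdr, sizeof(hdr));
	return 0;
}

/* Boot kernel side: validate the header and pick up the jump address. */
int arch_hibernation_header_restore(void *addr)
{
	struct sketch_restore_header hdr;

	assert(addr != NULL);
	memcpy(&hdr, addr, sizeof(hdr));
	if (hdr.magic != SKETCH_RESTORE_MAGIC)
		return -EINVAL;	/* kernels are not compatible */
	restore_jump_address = hdr.jump_address;
	return 0;
}
```

The boot kernel only proceeds with the jump if the restore call returns 0, so
an image written by an incompatible kernel is rejected before any page is
copied.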

I'll post the reworked patches in a new thread once again after the other issue
raised by Pavel gets settled.


Re: [Suspend2-devel] Problem with CFS V20 and Suspend2/tuxonice

2007-08-25 Thread Christian Hesse
On Saturday 25 August 2007, Ingo Molnar wrote:
> * Christian Hesse <[EMAIL PROTECTED]> wrote:
> > > Reproduced on a Intel Centrino based laptop with gentoo kamikaze7
> > > sources (http://forums.gentoo.org/viewtopic-t-577970.html)
> >
> > Same problem here: Core Duo, Kernel 2.6.22.5, Suspend 2.2.10, CFS v20.2.
>
> please try the patch below - does it fix the problem?

Works for me. Thanks a lot!
-- 
Regards,
Chris




Re: Problem with CFS V20 and Suspend2/tuxonice

2007-08-25 Thread David Rodriguez
2007/8/25, Ingo Molnar <[EMAIL PROTECTED]>:
>
> * David Rodriguez <[EMAIL PROTECTED]> wrote:
>
> > I'm using 2.6.22.5 with cfs v20.3 and suspend2 2.2.10.2. With that
> > combination, suspend is not working anymore (with cfs v19 was
> > working). Stops on suspend in "Suspending tasks" Looking at cfs patch,
> > I managed to change the migration_thread, adding again the
> > try_to_freeze() removed in last patch and now the suspend finished,
> > but resume not work. Of course I don't know why that was removed, and
> > rewriting it is not a solution, but I want to report it.
>
> could you try the patch below, does it fix this problem?
>
> Ingo
>
> Index: linux-cfs-2.6.22.5.q/kernel/sched.c
> ===
> --- linux-cfs-2.6.22.5.q.orig/kernel/sched.c
> +++ linux-cfs-2.6.22.5.q/kernel/sched.c
> @@ -5043,6 +5043,8 @@ static int migration_thread(void *data)
> struct migration_req *req;
> struct list_head *head;
>
> +   try_to_freeze();
> +
> spin_lock_irq(&rq->lock);
>
> if (cpu_is_offline(cpu)) {
> @@ -5399,6 +5401,7 @@ migration_call(struct notifier_block *nf
> p = kthread_create(migration_thread, hcpu, "migration/%d", cpu);
> if (IS_ERR(p))
> return NOTIFY_BAD;
> +   p->flags |= PF_NOFREEZE;
> kthread_bind(p, cpu);
> /* Must be high prio: stop_machine expects to yield to it. */
> rq = task_rq_lock(p, &flags);
>



Yes, it fixes the problem. Thanks


[PATCH] FS: Make RAMFS both selectable and tristate.

2007-08-25 Thread Robert P. J. Day

Allow RAMFS to be user-selectable, and to be built as a module.

Signed-off-by: Robert P. J. Day <[EMAIL PROTECTED]>

---

  given that the help content for that option suggests it can be built
as a module, it just makes sense to make it selectable and tristate,
unless someone has a compelling argument against it.

diff --git a/fs/Kconfig b/fs/Kconfig
index 58a0650..e79ac86 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -1003,7 +1003,7 @@ config HUGETLB_PAGE
def_bool HUGETLBFS

 config RAMFS
-   bool
+   tristate "Ramfs support"
default y
---help---
  Ramfs is a file system which keeps all files in RAM. It allows
-- 

Robert P. J. Day
Linux Consulting, Training and Annoying Kernel Pedantry
Waterloo, Ontario, CANADA

http://crashcourse.ca



Re: [PATCH -mm 2/2] Hibernation: Arbitrary boot kernel support on x86_64

2007-08-25 Thread Rafael J. Wysocki
On Saturday, 25 August 2007 20:32, [EMAIL PROTECTED] wrote:
> On Sat, 25 Aug 2007, Rafael J. Wysocki wrote:
> 
> > On Friday, 24 August 2007 22:46, Pavel Machek wrote:
> >> Hi!
> >>
> >>> From: Rafael J. Wysocki <[EMAIL PROTECTED]>
> >>>
> >>> Make it possible to restore a hibernation image on x86_64 with the help of a
> >>> kernel different from the one in the image.
> >>>
> >>> The idea is to split the core restoration code into two separate parts and to
> >>> place each of them in a different page.  The first part belongs to the boot
> >>
> >> What happens in case where both parts want to be
> >> at the same place? (Like kernel being restored is 4KB smaller, so that
> >> routines now collide?)
> >
> > Bad things, but I can't see how to avoid that reliably.
> 
> can you at least detect it reliably? (feed a program both kernel images 
> and have it tell you 'yes/no')

Well, I have an idea how to handle that, but I need to test it.  Stay tuned. :-)

Greetings,
Rafael


Re: Problem with CFS V20 and Suspend2/tuxonice

2007-08-25 Thread Rafael J. Wysocki
On Saturday, 25 August 2007 21:01, Ingo Molnar wrote:
> 
> * David Rodriguez <[EMAIL PROTECTED]> wrote:
> 
> > I'm using 2.6.22.5 with cfs v20.3 and suspend2 2.2.10.2. With that 
> > combination, suspend is not working anymore (with cfs v19 was 
> > working). Stops on suspend in "Suspending tasks" Looking at cfs patch, 
> > I managed to change the migration_thread, adding again the 
> > try_to_freeze() removed in last patch and now the suspend finished, 
> > but resume not work. Of course I don't know why that was removed, and 
> > rewriting it is not a solution, but I want to report it.
> 
> could you try the patch below, does it fix this problem?
> 
>   Ingo
> 
> Index: linux-cfs-2.6.22.5.q/kernel/sched.c
> ===
> --- linux-cfs-2.6.22.5.q.orig/kernel/sched.c
> +++ linux-cfs-2.6.22.5.q/kernel/sched.c
> @@ -5043,6 +5043,8 @@ static int migration_thread(void *data)
>   struct migration_req *req;
>   struct list_head *head;
>  
> + try_to_freeze();
> +
>   spin_lock_irq(&rq->lock);
>  
>   if (cpu_is_offline(cpu)) {
> @@ -5399,6 +5401,7 @@ migration_call(struct notifier_block *nf
>   p = kthread_create(migration_thread, hcpu, "migration/%d", cpu);
>   if (IS_ERR(p))
>   return NOTIFY_BAD;
> + p->flags |= PF_NOFREEZE;

Yeah.

In 2.6.23-rc all kernel threads are PF_NOFREEZE by default.

>   kthread_bind(p, cpu);
>   /* Must be high prio: stop_machine expects to yield to it. */
>   rq = task_rq_lock(p, &flags);
> -

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth


Re: [PATCH] isdn/sc: remove unused REQUEST_IRQ and unnecessary header file

2007-08-25 Thread Karsten Keil

[PATCH] isdn/sc: remove unused REQUEST_IRQ and unnecessary header file

REQUEST_IRQ is never used, so delete it. In the process get rid of the
macro FREE_IRQ which makes the code unnecessarily difficult to read.

Signed-off-by: Fernando Luis Vazquez Cao <[EMAIL PROTECTED]>
Acked-by: Karsten Keil <[EMAIL PROTECTED]>
---

diff -urNp linux-2.6.23-rc3-orig/drivers/isdn/sc/debug.h linux-2.6.23-rc3/drivers/isdn/sc/debug.h
--- linux-2.6.23-rc3-orig/drivers/isdn/sc/debug.h   2007-07-09 08:32:17.0 +0900
+++ linux-2.6.23-rc3/drivers/isdn/sc/debug.h    1970-01-01 09:00:00.0 +0900
@@ -1,19 +0,0 @@
-/* $Id: debug.h,v 1.2.8.1 2001/09/23 22:24:59 kai Exp $
- *
- * Copyright (C) 1996  SpellCaster Telecommunications Inc.
- *
- * This software may be used and distributed according to the terms
- * of the GNU General Public License, incorporated herein by reference.
- *
- * For more information, please contact [EMAIL PROTECTED] or write:
- *
- * SpellCaster Telecommunications Inc.
- * 5621 Finch Avenue East, Unit #3
- * Scarborough, Ontario  Canada
- * M1B 2T9
- * +1 (416) 297-8565
- * +1 (416) 297-6433 Facsimile
- */
-
-#define REQUEST_IRQ(a,b,c,d,e) request_irq(a,b,c,d,e)
-#define FREE_IRQ(a,b) free_irq(a,b)
diff -urNp linux-2.6.23-rc3-orig/drivers/isdn/sc/includes.h linux-2.6.23-rc3/drivers/isdn/sc/includes.h
--- linux-2.6.23-rc3-orig/drivers/isdn/sc/includes.h    2007-07-09 08:32:17.0 +0900
+++ linux-2.6.23-rc3/drivers/isdn/sc/includes.h 2007-08-24 20:55:45.0 +0900
@@ -14,4 +14,3 @@
 #include 
 #include 
 #include 
-#include "debug.h"
diff -urNp linux-2.6.23-rc3-orig/drivers/isdn/sc/init.c linux-2.6.23-rc3/drivers/isdn/sc/init.c
--- linux-2.6.23-rc3-orig/drivers/isdn/sc/init.c    2007-07-09 08:32:17.0 +0900
+++ linux-2.6.23-rc3/drivers/isdn/sc/init.c 2007-08-24 20:31:58.0 +0900
@@ -404,7 +404,7 @@ static void __exit sc_exit(void)
/*
 * Release the IRQ
 */
-   FREE_IRQ(sc_adapter[i]->interrupt, NULL);
+   free_irq(sc_adapter[i]->interrupt, NULL);
 
/*
 * Reset for a clean start

-- 
Karsten Keil
SuSE Labs
ISDN and VOIP development
SUSE LINUX Products GmbH, Maxfeldstr.5 90409 Nuernberg, GF: Markus Rex, HRB 
16746 (AG Nuernberg)


[PATCH 2/4] hpt366: UltraDMA filter for SATA cards (take 3)

2007-08-25 Thread Sergei Shtylyov
The Marvell bridge chips used on HighPoint SATA cards do not seem to support
the UltraDMA modes 1, 2, and 3 as well as any MWDMA modes, so the driver needs
to account for this in the udma_filter() method.  In order to achieve that, do
the following changes:

- install the method for all chips, not only HPT36x/370 and improve the code
  formatting by killing the extra tabs while at it;

- add to the end of the 'switch' statement in the method cases for HPT372[AN]
  and HPT374 chips upon which the known SATA cards are based;

- use hwif->ultra_mask as a default mask for the ide_dma_filter() method to
  behave correctly;

- move the HPT370[A] cases below the HPT36x case for consistency...
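
The mask arithmetic behind the SATA case can be checked in isolation. A minimal sketch, assuming bit n of the mask stands for UltraDMA mode n; sata_udma_filter() is a hypothetical stand-in, not the driver's hpt3xx_udma_filter():

```c
#include <assert.h>
#include <stdint.h>

/*
 * Start from the chip's own mask and, for a SATA device behind the
 * bridge, clear bits 1..3 (UltraDMA modes 1, 2 and 3).
 */
static uint8_t sata_udma_filter(uint8_t ultra_mask, int is_sata)
{
	uint8_t mask = ultra_mask;

	/* Only UDMA0..6 are modelled in this sketch. */
	assert((ultra_mask & 0x80) == 0);

	if (is_sata)
		mask &= (uint8_t)~0x0e;
	return mask;
}
```

With a chip mask of 0x7f this leaves 0x71, i.e. modes 0, 4, 5 and 6.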

Signed-off-by: Sergei Shtylyov <[EMAIL PROTECTED]>

---
Argh!  I've managed to put = instead of &= here and there, so please disregard
the take #2... :-/
This version doesn't use explicit UltraDMA masks, so converting them to the
ATA_UDMA* is left for another, global patch.  This patch is against the current
Linus' tree and unfortunately I was able to only compile test it since that
tree gives MODPOST warning and dies early on bootup.
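
The difference between the two takes is easy to see on a bare u8. The sketch below is standalone C with made-up helper names (drop_mode_assign, drop_mode_and); it only reproduces the arithmetic, not the driver code.

```c
#include <assert.h>
#include <stdint.h>

/* Take 2 behaviour: plain assignment discards the incoming mask. */
static uint8_t drop_mode_assign(uint8_t mask, uint8_t bit)
{
	assert(bit != 0);
	mask = (uint8_t)~bit;	/* every other bit becomes 1 */
	return mask;
}

/* Take 3 behaviour: only the named bit is cleared. */
static uint8_t drop_mode_and(uint8_t mask, uint8_t bit)
{
	assert(bit != 0);
	mask &= (uint8_t)~bit;
	return mask;
}
```

Starting from 0x1f and dropping bit 0x10, the assignment yields 0xef (modes 5 to 7 switched on by accident), while the AND yields the intended 0x0f.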

 drivers/ide/pci/hpt366.c |   78 ---
 1 files changed, 40 insertions(+), 38 deletions(-)

Index: linux-2.6/drivers/ide/pci/hpt366.c
===
--- linux-2.6.orig/drivers/ide/pci/hpt366.c
+++ linux-2.6/drivers/ide/pci/hpt366.c
@@ -1,5 +1,5 @@
 /*
- * linux/drivers/ide/pci/hpt366.c  Version 1.11Aug 11, 2007
+ * linux/drivers/ide/pci/hpt366.c  Version 1.12Aug 25, 2007
  *
  * Copyright (C) 1999-2003 Andre Hedrick <[EMAIL PROTECTED]>
  * Portions Copyright (C) 2001 Sun Microsystems, Inc.
@@ -114,6 +114,7 @@
  *   unify HPT36x/37x timing setup code and the speedproc handlers by joining
  *   the register setting lists into the table indexed by the clock selected
  * - set the correct hwif->ultra_mask for each individual chip
+ * - add UltraDMA mode filtering for the HPT37[24] based SATA cards
  * Sergei Shtylyov, <[EMAIL PROTECTED]> or <[EMAIL PROTECTED]>
  */
 
@@ -524,36 +525,38 @@ static int check_in_drive_list(ide_drive
 
 static u8 hpt3xx_udma_filter(ide_drive_t *drive)
 {
-   struct hpt_info *info   = pci_get_drvdata(HWIF(drive)->pci_dev);
-   u8 mask;
+   ide_hwif_t *hwif= HWIF(drive);
+   struct hpt_info *info   = pci_get_drvdata(hwif->pci_dev);
+   u8 mask = hwif->ultra_mask;
 
switch (info->chip_type) {
-   case HPT370A:
-   if (!HPT370_ALLOW_ATA100_5 ||
-   check_in_drive_list(drive, bad_ata100_5))
-   return 0x1f;
-   else
-   return 0x3f;
-   case HPT370:
-   if (!HPT370_ALLOW_ATA100_5 ||
-   check_in_drive_list(drive, bad_ata100_5))
-   mask = 0x1f;
-   else
-   mask = 0x3f;
-   break;
case HPT36x:
-   if (!HPT366_ALLOW_ATA66_4 ||
+   if (HPT366_ALLOW_ATA66_4 &&
check_in_drive_list(drive, bad_ata66_4))
-   mask = 0x0f;
-   else
-   mask = 0x1f;
+   mask &= ~0x10;
 
-   if (!HPT366_ALLOW_ATA66_3 ||
+   if (HPT366_ALLOW_ATA66_3 &&
check_in_drive_list(drive, bad_ata66_3))
-   mask = 0x07;
+   mask &= ~0x08;
break;
+   case HPT370 :
+   case HPT370A:
+   if (HPT370_ALLOW_ATA100_5 &&
+   check_in_drive_list(drive, bad_ata100_5))
+   mask &= ~0x20;
+
+   if (info->chip_type == HPT370A)
+   return mask;
+   break;
+   case HPT372 :
+   case HPT372A:
+   case HPT372N:
+   case HPT374 :
+   if (ide_dev_is_sata(drive->id))
+   mask &= ~0x0e;
+   /* Fall thru */
default:
-   return 0x7f;
+   return mask;
}
 
return check_in_drive_list(drive, bad_ata33) ? 0x00 : mask;
@@ -1236,25 +1239,24 @@ static unsigned int __devinit init_chips
 
 static void __devinit init_hwif_hpt366(ide_hwif_t *hwif)
 {
-   struct pci_dev  *dev= hwif->pci_dev;
-   struct hpt_info *info   = pci_get_drvdata(dev);
-   int serialize   = HPT_SERIALIZE_IO;
-   u8  scr1 = 0, ata66 = hwif->channel ? 0x01 : 0x02;
-   u8  chip_type   = info->chip_type;
-   u8  new_mcr, old_mcr= 0;
+   struct pci_dev  *dev= hwif->pci_dev;
+   struct hpt_info *info   = pci_get_drvdata(dev);
+   int serialize   = HPT_SERIALIZE_IO;
+   u8  scr1 = 0, ata66 = hwif->channel ? 0x01 : 0x02;
+   u8  chip_type   = 

Re: USB Key light on/off state depending on mount

2007-08-25 Thread Guennadi Liakhovetski
On Fri, 24 Aug 2007, Josh Boyer wrote:

> On 8/24/07, Casey Dahlin <[EMAIL PROTECTED]> wrote:
> > Most USB keys nowadays have a small LED somewhere inside of them that
> > lights up when they are plugged in. On a windows box, the key is lit up
> > whenever it is mounted, and as soon as it is unmounted it turns off,
> > giving a handy physical indicator that the key is safe to remove. On
> > linux, the light is simply on whenever the key is plugged in.
> >
> > Should linux toggle the light depending on mount state? Is it as trivial
> > as it seems or does this reflect some larger issue?
> 
> I think that depends on the key.  My Corsair keys have the light
> flicker whenever I/O is on-going.

Yeah, it does, I am not a big expert in USB storage, and I haven't seen 
many USB sticks, but I might imagine how windows turns the LED off on 
unmount. Try "eject /dev/sdX", where sdX is your USB storage, after you 
unmount it. Be careful, especially if you have SATA (or SCSI) discs in 
your system or if you use libata for PATA discs not to eject the wrong 
one... For example, if you think you have to eject /dev/sdc, check before 
ejecting it with "mount" if there are any /dev/sdc* partitions mounted.

As for LED going on only when mounting, don't know. Maybe they also issue 
eject after initial enumeration? In which case the LED would go shortly on 
on plug in, then off, then on again on mount. You can try it under Linux 
too, if you issue eject after plugging in but before mount. I think, you 
won't be able to mount it directly after ejecting, you'd have to force 
re-enumeration under Linux.

Thanks
Guennadi
---
Guennadi Liakhovetski


[PATCH 2/4] hpt366: UltraDMA filter for SATA cards (take 2)

2007-08-25 Thread Sergei Shtylyov
The Marvell bridge chips used on HighPoint SATA cards do not seem to support
the UltraDMA modes 1, 2, and 3 as well as any MWDMA modes, so the driver needs
to account for this in the udma_filter() method.  In order to achieve that, do
the following changes:

- install the method for all chips, not only HPT36x/370 and improve the code
  formatting by killing the extra tabs while at it;

- add to the end of the 'switch' statement in the method cases for HPT372[AN]
  and HPT374 chips upon which the known SATA cards are based;

- use hwif->ultra_mask as a default mask for the ide_dma_filter() method to
  behave correctly;

- move the HPT370[A] cases below the HPT36x case for consistency...

---
This version doesn't use explicit UltraDMA masks, so converting them to the
ATA_UDMA* is left for another, global patch.  This patch is against the current
Linus' tree and unfortunately I was able to only compile test it since that
tree gives MODPOST warning and dies early on bootup.

Signed-off-by: Sergei Shtylyov <[EMAIL PROTECTED]>

 drivers/ide/pci/hpt366.c |   78 ---
 1 files changed, 40 insertions(+), 38 deletions(-)

Index: linux-2.6/drivers/ide/pci/hpt366.c
===
--- linux-2.6.orig/drivers/ide/pci/hpt366.c
+++ linux-2.6/drivers/ide/pci/hpt366.c
@@ -1,5 +1,5 @@
 /*
- * linux/drivers/ide/pci/hpt366.c  Version 1.11Aug 11, 2007
+ * linux/drivers/ide/pci/hpt366.c  Version 1.12Aug 25, 2007
  *
  * Copyright (C) 1999-2003 Andre Hedrick <[EMAIL PROTECTED]>
  * Portions Copyright (C) 2001 Sun Microsystems, Inc.
@@ -114,6 +114,7 @@
  *   unify HPT36x/37x timing setup code and the speedproc handlers by joining
  *   the register setting lists into the table indexed by the clock selected
  * - set the correct hwif->ultra_mask for each individual chip
+ * - add UltraDMA mode filtering for the HPT37[24] based SATA cards
  * Sergei Shtylyov, <[EMAIL PROTECTED]> or <[EMAIL PROTECTED]>
  */
 
@@ -524,36 +525,38 @@ static int check_in_drive_list(ide_drive
 
 static u8 hpt3xx_udma_filter(ide_drive_t *drive)
 {
-   struct hpt_info *info   = pci_get_drvdata(HWIF(drive)->pci_dev);
-   u8 mask;
+   ide_hwif_t *hwif= HWIF(drive);
+   struct hpt_info *info   = pci_get_drvdata(hwif->pci_dev);
+   u8 mask = hwif->ultra_mask;
 
switch (info->chip_type) {
-   case HPT370A:
-   if (!HPT370_ALLOW_ATA100_5 ||
-   check_in_drive_list(drive, bad_ata100_5))
-   return 0x1f;
-   else
-   return 0x3f;
-   case HPT370:
-   if (!HPT370_ALLOW_ATA100_5 ||
-   check_in_drive_list(drive, bad_ata100_5))
-   mask = 0x1f;
-   else
-   mask = 0x3f;
-   break;
case HPT36x:
-   if (!HPT366_ALLOW_ATA66_4 ||
+   if (HPT366_ALLOW_ATA66_4 &&
check_in_drive_list(drive, bad_ata66_4))
-   mask = 0x0f;
-   else
-   mask = 0x1f;
+   mask = ~0x10;
 
-   if (!HPT366_ALLOW_ATA66_3 ||
+   if (HPT366_ALLOW_ATA66_3 &&
check_in_drive_list(drive, bad_ata66_3))
-   mask = 0x07;
+   mask = ~0x08;
break;
+   case HPT370 :
+   case HPT370A:
+   if (HPT370_ALLOW_ATA100_5 &&
+   check_in_drive_list(drive, bad_ata100_5))
+   mask = ~0x20;
+
+   if (info->chip_type == HPT370A)
+   return mask;
+   break;
+   case HPT372 :
+   case HPT372A:
+   case HPT372N:
+   case HPT374 :
+   if (ide_dev_is_sata(drive->id))
+   mask &= ~0x0e;
+   /* Fall thru */
default:
-   return 0x7f;
+   return mask;
}
 
return check_in_drive_list(drive, bad_ata33) ? 0x00 : mask;
@@ -1236,25 +1239,24 @@ static unsigned int __devinit init_chips
 
 static void __devinit init_hwif_hpt366(ide_hwif_t *hwif)
 {
-   struct pci_dev  *dev= hwif->pci_dev;
-   struct hpt_info *info   = pci_get_drvdata(dev);
-   int serialize   = HPT_SERIALIZE_IO;
-   u8  scr1 = 0, ata66 = hwif->channel ? 0x01 : 0x02;
-   u8  chip_type   = info->chip_type;
-   u8  new_mcr, old_mcr= 0;
+   struct pci_dev  *dev= hwif->pci_dev;
+   struct hpt_info *info   = pci_get_drvdata(dev);
+   int serialize   = HPT_SERIALIZE_IO;
+   u8  scr1 = 0, ata66 = hwif->channel ? 0x01 : 0x02;
+   u8  chip_type   = info->chip_type;
+   u8  new_mcr, old_mcr= 0;
 
/* Cache the channel's MISC. control 

Re: Linux-Kernel MAINTAINERS jffs-dev list entry alive?

2007-08-25 Thread Josh Boyer
On 8/25/07, Jörn Engel <[EMAIL PROTECTED]> wrote:
> On Fri, 24 August 2007 17:18:54 -0700, Joe Perches wrote:
> >
> > JOURNALLING FLASH FILE SYSTEM V2 (JFFS2)
> > P:David Woodhouse
> > M:[EMAIL PROTECTED]
> > L:[EMAIL PROTECTED]
> > W:http://sources.redhat.com/jffs2/
> >
> > The L: [EMAIL PROTECTED] entry doesn't seem active.
>
> [EMAIL PROTECTED] seems more appropriate these days.

Agreed.

josh


[PATCH 1/4] ide: add ide_dev_is_sata() helper (take 2)

2007-08-25 Thread Sergei Shtylyov
Make the SATA drive detection code from eighty_ninty_three() into inline
ide_dev_is_sata() helper fixing it along the way to be more strict while
checking word 80 for the reserved values...

Signed-off-by: Sergei Shtylyov <[EMAIL PROTECTED]>

---
This version doesn't flip the MSB, it just casts word80 to 'signed short'.
This is still against the current Linus' tree and unfortunately I was able to
only compile test it since that tree gives MODPOST warning and dies early on
bootup (yes, Bart, this is still the case).  Also, I have no SATA bridge/drive
here at hand...

 drivers/ide/ide-iops.c |3 +--
 include/linux/ide.h|   13 +
 2 files changed, 14 insertions(+), 2 deletions(-)

Index: linux-2.6/drivers/ide/ide-iops.c
===
--- linux-2.6.orig/drivers/ide/ide-iops.c
+++ linux-2.6/drivers/ide/ide-iops.c
@@ -615,8 +615,7 @@ u8 eighty_ninty_three (ide_drive_t *driv
if (hwif->cbl != ATA_CBL_PATA80 && !ivb)
goto no_80w;
 
-   /* Check for SATA but only if we are ATA5 or higher */
-   if (id->hw_config == 0 && (id->major_rev_num & 0x7FE0))
+   if (ide_dev_is_sata(id))
return 1;
 
/*
Index: linux-2.6/include/linux/ide.h
===
--- linux-2.6.orig/include/linux/ide.h
+++ linux-2.6/include/linux/ide.h
@@ -1380,6 +1380,19 @@ static inline int ide_dev_has_iordy(stru
return ((id->field_valid & 2) && (id->capability & 8)) ? 1 : 0;
 }
 
+static inline int ide_dev_is_sata(struct hd_driveid *id)
+{
+   /*
+* See if word 93 is 0 AND drive is at least ATA-5 compatible
+* verifying that word 80 by casting it to a signed type --
+* this trick allows us to filter out the reserved values of
+* 0x0000 and 0xFFFF along with the earlier ATA revisions...
+*/
+   if (id->hw_config == 0 && (short)id->major_rev_num >= 0x0020)
+   return 1;
+   return 0;
+}
+
 u8 ide_dump_status(ide_drive_t *, const char *, u8);
 
 typedef struct ide_pio_timings_s {



Re: [Suspend2-devel] Problem with CFS V20 and Suspend2/tuxonice

2007-08-25 Thread Ingo Molnar

* Christian Hesse <[EMAIL PROTECTED]> wrote:

> > Reproduced on a Intel Centrino based laptop with gentoo kamikaze7
> > sources (http://forums.gentoo.org/viewtopic-t-577970.html)
> 
> Same problem here: Core Duo, Kernel 2.6.22.5, Suspend 2.2.10, CFS v20.2.

please try the patch below - does it fix the problem?

Ingo

Index: linux-cfs-2.6.22.5.q/kernel/sched.c
===
--- linux-cfs-2.6.22.5.q.orig/kernel/sched.c
+++ linux-cfs-2.6.22.5.q/kernel/sched.c
@@ -5043,6 +5043,8 @@ static int migration_thread(void *data)
struct migration_req *req;
struct list_head *head;
 
+   try_to_freeze();
+
spin_lock_irq(&rq->lock);
 
if (cpu_is_offline(cpu)) {
@@ -5399,6 +5401,7 @@ migration_call(struct notifier_block *nf
p = kthread_create(migration_thread, hcpu, "migration/%d", cpu);
if (IS_ERR(p))
return NOTIFY_BAD;
+   p->flags |= PF_NOFREEZE;
kthread_bind(p, cpu);
/* Must be high prio: stop_machine expects to yield to it. */
rq = task_rq_lock(p, &flags);


Re: [Suspend2-devel] Problem with CFS V20 and Suspend2/tuxonice

2007-08-25 Thread Ingo Molnar

* Matthias Hensler <[EMAIL PROTECTED]> wrote:

> > Same problem here: Core Duo, Kernel 2.6.22.5, Suspend 2.2.10, CFS 
> > v20.2.
> 
> Me too for 2.6.22.5, TuxOnIce 2.2.10 and Centrino based notebook.

possible bugfix below.

Ingo

Index: linux-cfs-2.6.22.5.q/kernel/sched.c
===
--- linux-cfs-2.6.22.5.q.orig/kernel/sched.c
+++ linux-cfs-2.6.22.5.q/kernel/sched.c
@@ -5043,6 +5043,8 @@ static int migration_thread(void *data)
struct migration_req *req;
struct list_head *head;
 
+   try_to_freeze();
+
spin_lock_irq(&rq->lock);
 
if (cpu_is_offline(cpu)) {
@@ -5399,6 +5401,7 @@ migration_call(struct notifier_block *nf
p = kthread_create(migration_thread, hcpu, "migration/%d", cpu);
if (IS_ERR(p))
return NOTIFY_BAD;
+   p->flags |= PF_NOFREEZE;
kthread_bind(p, cpu);
/* Must be high prio: stop_machine expects to yield to it. */
rq = task_rq_lock(p, &flags);



Re: Problem with CFS V20 and Suspend2/tuxonice

2007-08-25 Thread Ingo Molnar

* David Rodriguez <[EMAIL PROTECTED]> wrote:

> I'm using 2.6.22.5 with cfs v20.3 and suspend2 2.2.10.2. With that 
> combination, suspend is not working anymore (with cfs v19 was 
> working). Stops on suspend in "Suspending tasks" Looking at cfs patch, 
> I managed to change the migration_thread, adding again the 
> try_to_freeze() removed in last patch and now the suspend finished, 
> but resume not work. Of course I don't know why that was removed, and 
> rewriting it is not a solution, but I want to report it.

could you try the patch below, does it fix this problem?

Ingo

Index: linux-cfs-2.6.22.5.q/kernel/sched.c
===
--- linux-cfs-2.6.22.5.q.orig/kernel/sched.c
+++ linux-cfs-2.6.22.5.q/kernel/sched.c
@@ -5043,6 +5043,8 @@ static int migration_thread(void *data)
struct migration_req *req;
struct list_head *head;
 
+   try_to_freeze();
+
spin_lock_irq(&rq->lock);
 
if (cpu_is_offline(cpu)) {
@@ -5399,6 +5401,7 @@ migration_call(struct notifier_block *nf
p = kthread_create(migration_thread, hcpu, "migration/%d", cpu);
if (IS_ERR(p))
return NOTIFY_BAD;
+   p->flags |= PF_NOFREEZE;
kthread_bind(p, cpu);
/* Must be high prio: stop_machine expects to yield to it. */
rq = task_rq_lock(p, &flags);


Re: [Suspend2-devel] Problem with CFS V20 and Suspend2/tuxonice

2007-08-25 Thread Matthias Hensler
Hi!

On Sat, Aug 25, 2007 at 06:52:20PM +0200, Christian Hesse wrote:
> On Saturday 25 August 2007, Fabio Comolli wrote:
> > On 8/25/07, David Rodriguez <[EMAIL PROTECTED]> wrote:
> > > I'm using 2.6.22.5 with cfs v20.3 and  suspend2  2.2.10.2.  With
> > > that combination, suspend is not working anymore (with cfs v19 was
> > > working).  Stops on suspend in "Suspending tasks"
> > Reproduced on a Intel Centrino based laptop with gentoo kamikaze7
> > sources (http://forums.gentoo.org/viewtopic-t-577970.html)
> 
> Same problem here: Core Duo, Kernel 2.6.22.5, Suspend 2.2.10, CFS
> v20.2.

Me too for 2.6.22.5, TuxOnIce 2.2.10 and Centrino based notebook.

Regards,
Matthias




Re: [PATCH -mm 2/2] Hibernation: Arbitrary boot kernel support on x86_64

2007-08-25 Thread david

On Sat, 25 Aug 2007, Rafael J. Wysocki wrote:

> On Friday, 24 August 2007 22:46, Pavel Machek wrote:
>> Hi!
>>
>>> From: Rafael J. Wysocki <[EMAIL PROTECTED]>
>>>
>>> Make it possible to restore a hibernation image on x86_64 with the help of a
>>> kernel different from the one in the image.
>>>
>>> The idea is to split the core restoration code into two separate parts and to
>>> place each of them in a different page.  The first part belongs to the boot
>>
>> What happens in case where both parts want to be
>> at the same place? (Like kernel being restored is 4KB smaller, so that
>> routines now collide?)
>
> Bad things, but I can't see how to avoid that reliably.

can you at least detect it reliably? (feed a program both kernel images
and have it tell you 'yes/no')


David Lang

Re: [PATCH] Fix kobject uevent string handling errors

2007-08-25 Thread Mathieu Desnoyers
* Kay Sievers ([EMAIL PROTECTED]) wrote:
 >  env->envp[env->envp_idx++] = &env->buf[env->buflen];
> > env->buflen += len + 1;
> > Index: linux-2.6-lttng/drivers/firmware/dmi-id.c
> > ===
> > --- linux-2.6-lttng.orig/drivers/firmware/dmi-id.c  2007-08-25 
> > 00:07:24.0 -0400
> > +++ linux-2.6-lttng/drivers/firmware/dmi-id.c   2007-08-25 
> > 00:07:58.0 -0400
> > @@ -152,9 +152,10 @@ static int dmi_dev_uevent(struct device 
> > if (add_uevent_var(env, "MODALIAS="))
> > return -ENOMEM;
> > len = get_modalias(&env->buf[env->buflen - 1],
> > -  sizeof(env->buf) - env->buflen);
> > -   if (len >= (sizeof(env->buf) - env->buflen))
> > +  sizeof(env->buf) - (env->buflen - 1));
> > +   if (len >= (sizeof(env->buf) - (env->buflen - 1)))
> > return -ENOMEM;
> > +   env->buflen += len + 1;
> 
> The increment for the trailing '\0' is already done in add_uevent_var(),
> so this change is not needed, I think.

Oh, you are right, since we replace the existing \0 which is already
accounted for, we don't have to do len +1 here.
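
The accounting can be checked with a small userspace model. This is a sketch only: struct env_model, add_var() and append_to_last() are invented for the illustration and merely mimic the behaviour described in this thread (each variable's '\0' is already counted in buflen).

```c
#include <assert.h>
#include <string.h>

struct env_model {
	char buf[64];
	int buflen;	/* bytes used, including each variable's trailing '\0' */
};

/* Append a new variable; its terminator is counted immediately. */
static int add_var(struct env_model *env, const char *s)
{
	size_t len = strlen(s);

	assert(env != NULL);
	if (len + 1 > sizeof(env->buf) - (size_t)env->buflen)
		return -1;
	memcpy(&env->buf[env->buflen], s, len + 1);
	env->buflen += (int)len + 1;
	return 0;
}

/* Extend the last variable in place, overwriting its '\0'. */
static int append_to_last(struct env_model *env, const char *s)
{
	size_t room, len = strlen(s);

	assert(env != NULL && env->buflen > 0);
	room = sizeof(env->buf) - (size_t)(env->buflen - 1);
	if (len >= room)
		return -1;
	memcpy(&env->buf[env->buflen - 1], s, len + 1);
	env->buflen += (int)len;	/* not len + 1: the '\0' was already counted */
	return 0;
}

/* Builds "MODALIAS=" + suffix and returns the resulting buflen, or -1. */
static int modalias_buflen(const char *suffix)
{
	struct env_model env = { .buflen = 0 };

	if (add_var(&env, "MODALIAS=") || append_to_last(&env, suffix))
		return -1;
	return env.buflen;
}
```

"MODALIAS=" is 9 characters, so buflen is 10 after add_var(); appending a 5-character suffix gives 15, which is exactly the string length plus one terminator.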

Thanks,

Mathieu


-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


Re: [2.6 patch] remove securebits

2007-08-25 Thread Adrian Bunk
On Fri, Aug 24, 2007 at 08:50:10PM -0700, Andrew Morgan wrote:
> 
> FWIW, in the mm kernel, I've actually already removed them when one
> configures without capabilities.
> 
> http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc3/2.6.23-rc3-mm1/broken-out/v3-file-capabilities-alter-behavior-of-cap_setpcap.patch
> 
> Other than writing a custom module, so far as I can tell, there is/was
> no way to set them anyway.
> 
> I'd obviously prefer to wait for the mm-merge process to complete and
> minimize the churn in this area, but I basically agree that the bits as
> implemented are pretty useless in their current form. In a per-process
> mode (with filesystem capability support) they are much more useful...

It was in the tree for nine years (sic) without a single user...

Are you only improving a dead horse, or do you also have a rider for the 
improved dead horse?

> Cheers
> 
> Andrew

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed



Re: [PATCH -mm 2/2] Hibernation: Arbitrary boot kernel support on x86_64

2007-08-25 Thread Rafael J. Wysocki
On Friday, 24 August 2007 22:46, Pavel Machek wrote:
> Hi!
> 
> > From: Rafael J. Wysocki <[EMAIL PROTECTED]>
> > 
> > Make it possible to restore a hibernation image on x86_64 with the help of a
> > kernel different from the one in the image.
> > 
> > The idea is to split the core restoration code into two separate parts and to
> > place each of them in a different page.  The first part belongs to the boot
> 
> What happens in case where both parts want to be
> at the same place? (Like kernel being restored is 4KB smaller, so that
> routines now collide?)

Bad things, but I can't see how to avoid that reliably.

Greetings,
Rafael


[PATCH] sysctl: Update sysctl_check to handle compiled out code.

2007-08-25 Thread Eric W. Biederman

Currently sysctl_check_table will complain if a strategy routine
is missing when we have sys_sysctl compiled out, or a proc_handler
is missing when we have procfs compiled out.  At least some
of the custom handlers actually expand to NULL when this is the
case so the warning is actually a problem.

[EMAIL PROTECTED] writes:
> As far as "something else wrong", I'm still seeing these in -rc3-mm1
>
> [    0.628000] sysctl table check failed: /kernel/ostype .1.1 Missing strategy
> [    0.628000] sysctl table check failed: /kernel/osrelease .1.2 Missing strategy
> [    0.628000] sysctl table check failed: /kernel/version .1.4 Missing strategy
> [    0.628000] sysctl table check failed: /kernel/hostname .1.7 Missing strategy
> [    0.628000] sysctl table check failed: /kernel/domainname .1.8 Missing strategy
> [    0.628000] sysctl table check failed: /kernel/shmmax .1.34 Missing strategy
> [    0.628000] sysctl table check failed: /kernel/shmall .1.41 Missing strategy
> [    0.628000] sysctl table check failed: /kernel/shmmni .1.45 Missing strategy
> [    0.628000] sysctl table check failed: /kernel/msgmax .1.35 Missing strategy
> [    0.628000] sysctl table check failed: /kernel/msgmni .1.42 Missing strategy
> [    0.628000] sysctl table check failed: /kernel/msgmnb .1.36 Missing strategy
> [    0.628000] sysctl table check failed: /kernel/sem .1.43 Missing strategy


So don't worry about missing strategy routines, or missing proc_handler
routines when they will never be called.

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
---
 kernel/sysctl_check.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/kernel/sysctl_check.c b/kernel/sysctl_check.c
index aa5b6f6..10dd744 100644
--- a/kernel/sysctl_check.c
+++ b/kernel/sysctl_check.c
@@ -1552,14 +1552,18 @@ int sysctl_check_table(struct ctl_table *table)
 					set_fail(&fail, table, "No max");
 			}
 		}
+#ifdef CONFIG_SYSCTL_SYSCALL
 		if (table->ctl_name && !table->strategy)
 			set_fail(&fail, table, "Missing strategy");
+#endif
 #if 0
 		if (!table->ctl_name && table->strategy)
 			set_fail(&fail, table, "Strategy without ctl_name");
 #endif
+#ifdef CONFIG_PROC_FS
 		if (table->procname && !table->proc_handler)
 			set_fail(&fail, table, "No proc_handler");
+#endif
 #if 0
 		if (!table->procname && table->proc_handler)
 			set_fail(&fail, table, "proc_handler without procname");
-- 
1.5.3.rc6.17.g1911



Re: [PATCH 1/2] sysctl: Properly register the irda binary sysctl numbers.

2007-08-25 Thread Eric W. Biederman
[EMAIL PROTECTED] writes:

> On Sat, 25 Aug 2007 06:57:17 MDT, Eric W. Biederman said:
>
>> It's good to have confirmation that my sysctl_check routine
>> didn't find something else wrong.
>
> If I understand the code, anything it whinges about is either an outright bug
> or it's a round of ammo already chambered. ;)

Pretty much.  The heuristics aren't perfect but they are pretty good.

> As far as "something else wrong", I'm still seeing these in -rc3-mm1, but
> they've been reported before against -rc2-mm2, I think:

Interesting.  No I haven't seen this one.  This appears to be one of
those silly little corner cases I failed to account for in my checks.
It looks like you don't have CONFIG_SYSCTL_SYSCALL defined, and it
appears utsname_syscall and ipcdata_syscall both become NULL pointers
if they aren't needed.  So the complaint is a false positive.
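
To make the false positive concrete, here is a minimal userspace sketch
rather than the kernel code.  Everything in it is a made-up stand-in
(struct ctl_entry, demo_strategy, check_unguarded(), check_guarded()), and
CONFIG_SYSCTL_SYSCALL is simply left undefined to imitate a kernel built
without sys_sysctl.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct ctl_table: only the fields the check reads. */
struct ctl_entry {
	int ctl_name;              /* binary sysctl number, 0 if none */
	const char *procname;
	int (*strategy)(void);     /* handler used by sys_sysctl */
};

#ifdef CONFIG_SYSCTL_SYSCALL
static int demo_strategy_impl(void) { return 0; }
#define demo_strategy demo_strategy_impl
#else
/* With sys_sysctl compiled out, the handler name expands to NULL. */
#define demo_strategy NULL
#endif

/* The check as it was: complains even though the handler can never be called. */
static const char *check_unguarded(const struct ctl_entry *e)
{
	if (e->ctl_name && !e->strategy)
		return "Missing strategy";
	return NULL;
}

/* The check with the guard: only looks at ->strategy when sys_sysctl exists. */
static const char *check_guarded(const struct ctl_entry *e)
{
#ifdef CONFIG_SYSCTL_SYSCALL
	if (e->ctl_name && !e->strategy)
		return "Missing strategy";
#else
	(void)e;   /* nothing to verify: the strategy routine is never used */
#endif
	return NULL;
}
```

With an entry such as { 1, "ostype", demo_strategy }, the unguarded check
reports "Missing strategy" while the guarded one stays quiet, which is the
behaviour the complaint above calls for.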

> [    0.628000] sysctl table check failed: /kernel/ostype .1.1 Missing strategy
> [    0.628000] sysctl table check failed: /kernel/osrelease .1.2 Missing strategy
> [    0.628000] sysctl table check failed: /kernel/version .1.4 Missing strategy
> [    0.628000] sysctl table check failed: /kernel/hostname .1.7 Missing strategy
> [    0.628000] sysctl table check failed: /kernel/domainname .1.8 Missing strategy
> [    0.628000] sysctl table check failed: /kernel/shmmax .1.34 Missing strategy
> [    0.628000] sysctl table check failed: /kernel/shmall .1.41 Missing strategy
> [    0.628000] sysctl table check failed: /kernel/shmmni .1.45 Missing strategy
> [    0.628000] sysctl table check failed: /kernel/msgmax .1.35 Missing strategy
> [    0.628000] sysctl table check failed: /kernel/msgmni .1.42 Missing strategy
> [    0.628000] sysctl table check failed: /kernel/msgmnb .1.36 Missing strategy
> [    0.628000] sysctl table check failed: /kernel/sem .1.43 Missing strategy
>
> And this isn't on an allyesconfig or allmodconfig. There may well be sysctl
> code I didn't hit - my /lib/modules/2.6.23-rc3-mm1 is only about 10M, and
> the Fedora kernels are weighing in at about 75M of /lib/modules a
> pop.

Yes.  Thank you.  I figure that as long as we are reasonably close, people
should catch most if not all things before this is merged into
Linus's tree.

Patch in a moment.

Eric


Re: Linux 2.6.20.17

2007-08-25 Thread Randy Dunlap
On Sat, 25 Aug 2007 15:38:10 + Willy Tarreau wrote:

> 
> I've just released Linux 2.6.20.17.
> 
> As a reminder, it fixes these 3 security issues :
>   CVE-2007-3105
>   CVE-2007-3848
>   CVE-2007-3851
> 
> I'll also be replying to this message with a copy of the patch between
> 2.6.20.16 and 2.6.20.17.
> 
> The patch and changelog will appear soon at the following locations:
>   ftp://ftp.all.kernel.org/pub/linux/kernel/v2.6/
>   ftp://ftp.all.kernel.org/pub/linux/kernel/v2.6/patch-2.6.20.17.bz2
>   ftp://ftp.all.kernel.org/pub/linux/kernel/v2.6/ChangeLog-2.6.20.17
> 
> Git repository:
>   git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-2.6.20.y.git
>   http://www.kernel.org/pub/scm/linux/kernel/git/stable/linux-2.6.20.y.git
> 
> Git repository through the gitweb interface:
>   http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.20.y.git

I'm getting build errors:

x86_64 allyesconfig, allmodconfig:

drivers/ata/pata_atiixp.c:286: error: 'PCI_DEVICE_ID_ATI_IXP700_IDE' undeclared here (not in a function)
net/bluetooth/rfcomm/tty.c:275: error: 'struct rfcomm_dev' has no member named 'tty_dev'
net/bluetooth/rfcomm/tty.c:278: error: 'struct rfcomm_dev' has no member named 'tty_dev'


---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***


Re: Problem with CFS V20 and Suspend2/tuxonice

2007-08-25 Thread rb6

Same problem on my centrino duo.  Reverting to cfs v19 fixes it.

Please personally CC me on replies, as I am not on the list.



Reproduced on an Intel Centrino based laptop with gentoo kamikaze7
sources (http://forums.gentoo.org/viewtopic-t-577970.html)

On 8/25/07, David Rodriguez <[EMAIL PROTECTED]> wrote:

I'm using 2.6.22.5 with cfs v20.3 and suspend2 2.2.10.2.
With that combination, suspend no longer works (it worked with cfs
v19).
It stops during suspend at "Suspending tasks".
Looking at the cfs patch, I changed migration_thread to re-add the
try_to_freeze() that the last patch removed. With that the suspend
finishes, but resume does not work. Of course I don't know why the
call was removed, and putting it back is not a solution, but I wanted
to report it.

All of this is on a Core Duo machine.


Re: [PATCH] sigqueue_free: fix the race with collect_signal()

2007-08-25 Thread Oleg Nesterov
On 08/25, Sukadev Bhattiprolu wrote:
> 
> Yes. I see it now. I had missed the SIGQUEUE_PREALLOC in __sigqueue_free().

Thanks for looking at this. This code looks simple, but it is not obvious, at
least for me.

Oleg.



Re: PROBLEM: Caught SIGFPE exceptions aren't reset

2007-08-25 Thread Clark Cooper
Thank you for your prompt answer. You wrote:


> Your handler can do this by masking
>  exceptions, changing the operands of the failed FP instruction,
> or by changing the PC so that the failed instruction is skipped
> (your handler may want to emulate the instruction in this case).

Given the current SIGFPE handling in the kernel, a userland handler
CAN'T mask the exceptions or clear the exception flags. This is
demonstrated in my example program where I call both feclearexcept
and fedisableexcept in the handler. The reason these don't work is
that any change you make to these FPU registers is overwritten with
the saved FPU context on return from the signal handler. Changing the
PC won't help since the exception flag is still set and unmasked on
signal return.
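
If returning from the handler cannot work for the reason given above, one
way to keep a program alive is not to return at all and to jump back into
the main flow instead. The following is only a sketch of that idea under
stated assumptions: the helper names (fpe_handler, run_with_fpe_guard,
raise_fpe, no_fault) are made up, the fault is simulated with raise(SIGFPE)
rather than a real FPU trap, and whether the live FPU flags still need an
feclearexcept() after the jump is not something this thread establishes.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf fpe_escape;

/* Leave the handler without going through the normal signal return path. */
static void fpe_handler(int sig)
{
	(void)sig;
	siglongjmp(fpe_escape, 1);
}

/*
 * Run risky() with a SIGFPE handler installed.
 * Returns 1 if a SIGFPE was caught, 0 if risky() finished normally,
 * -1 if the handler could not be installed.
 */
static int run_with_fpe_guard(void (*risky)(void))
{
	struct sigaction sa, old;
	int recovered = 0;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = fpe_handler;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGFPE, &sa, &old) != 0)
		return -1;

	/* savemask = 1, so the signal mask is restored by siglongjmp(). */
	if (sigsetjmp(fpe_escape, 1) == 0)
		risky();
	else
		recovered = 1;

	sigaction(SIGFPE, &old, NULL);
	return recovered;
}

/* Simulated fault for the demonstration; a real program would trap in the FPU. */
static void raise_fpe(void) { raise(SIGFPE); }
static void no_fault(void) { }
```

Calling run_with_fpe_guard(raise_fpe) comes back with 1 and execution
continues after the call, while run_with_fpe_guard(no_fault) returns 0.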

Now either the kernel should clear the corresponding exception flags
or it should present the correct FPU state to the handler for
manipulation. Currently neither of these appears to be the case. So if
there is no problem with the kernel, then the signal man pages should
indicate that while you can catch a SIGFPE, you can't actually return
from it.

If I am completely wrong on this, could you please indicate what kind
of change could be made to the handler in my example program to allow
it to reach the final printf statement in the main program?

Thank you,
Clark Cooper


Re: [PATCH] sigqueue_free: fix the race with collect_signal()

2007-08-25 Thread Sukadev Bhattiprolu

Oleg Nesterov wrote:

On 08/24, Sukadev Bhattiprolu wrote:
  

Oleg Nesterov wrote:


On 08/24, taoyue wrote:
 
  

Oleg Nesterov wrote:
   


collect_signal:                     sigqueue_free:

    list_del_init(&first->list);
                                        spin_lock_irqsave(lock, flags);
                                        ^^
                                        if (!list_empty(&q->list))
                                            list_del_init(&q->list);
                                        spin_unlock_irqrestore(lock, flags);

                                        q->flags &= ~SIGQUEUE_PREALLOC;

    __sigqueue_free(first);             __sigqueue_free(q);
  
   


collect_signal() is always called under ->siglock which is also taken by
sigqueue_free(), so this is not possible.


 
  

I know that current->sighand->siglock is used to prevent one sigqueue
from being freed twice. I want to know whether it is possible for the two
functions to be called in different threads. If so, the spin_lock is useless.
   


Not sure I understand. Yes, it is possible they are called by 2 different
threads, that is why we had a race. But all threads in the same thread
group have the same ->sighand, and thus the same ->sighand->siglock.
 
  
Oleg, if one thread can be in collect_signal() and another in
sigqueue_free() and both operate on the exact same sigqueue object, it's
not clear how we prevent two calls to __sigqueue_free() on
the same object. In that case the lock (or some lock) should be around
__sigqueue_free() - no ?


i.e if we enter sigqueue_free(), we will call __sigqueue_free() 
regardless of the state.



Yes. They both will call __sigqueue_free(). But please note that
__sigqueue_free() checks SIGQUEUE_PREALLOC, which is cleared by
sigqueue_free().

IOW, when sigqueue_free() unlocks ->siglock, we know that it can't be used
by collect_signal() from another thread. So we can clear SIGQUEUE_PREALLOC
and free sigqueue. We don't need this lock around sigqueue_free() to prevent
the race. collect_signal() can "see" only those sigqueues which are on list.

IOW, when sigqueue_free() takes ->siglock, collect_signal() can't run, because
it needs the same lock. Now we delete this sigqueue from list, nobody can
see it, it can't have other references. So we can unlock ->siglock, mark
sigqueue as freeable (clear SIGQUEUE_PREALLOC), and free it.
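
The argument above can be walked through with a small single-threaded
sketch. It is not the kernel code: struct sigqueue_sketch and the sketch_*
functions are made-up stand-ins, the lock is only marked by comments, and
"freeing" is reduced to a counter so that a double free would be visible.

```c
#include <stdbool.h>

#define SIGQUEUE_PREALLOC 1

/* Hypothetical miniature of struct sigqueue. */
struct sigqueue_sketch {
	int flags;
	bool on_list;     /* stands in for !list_empty(&q->list) */
	int free_count;   /* how many times the object was really released */
};

/* Like __sigqueue_free(): a preallocated entry is never released here. */
static void sketch_free(struct sigqueue_sketch *q)
{
	if (q->flags & SIGQUEUE_PREALLOC)
		return;
	q->free_count++;
}

/* Like collect_signal(): the caller is assumed to hold ->siglock. */
static void sketch_collect_signal(struct sigqueue_sketch *q)
{
	q->on_list = false;
	sketch_free(q);   /* no-op while SIGQUEUE_PREALLOC is still set */
}

/* Like sigqueue_free(): unlink under the lock, then mark freeable and free. */
static void sketch_sigqueue_free(struct sigqueue_sketch *q)
{
	/* spin_lock_irqsave(lock, flags) would be taken here */
	if (q->on_list)
		q->on_list = false;
	/* spin_unlock_irqrestore(lock, flags) would be dropped here */

	q->flags &= ~SIGQUEUE_PREALLOC;
	sketch_free(q);
}
```

Running sketch_collect_signal() and then sketch_sigqueue_free() on the same
object leaves free_count at 1: the first call unlinks the entry but does not
release it, and only the second call, which clears SIGQUEUE_PREALLOC,
actually frees.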

Do you agree?
  


Yes. I see it now. I had missed the SIGQUEUE_PREALLOC in __sigqueue_free().

Thanks for clarifying

Suka




Re: [git pull request] scheduler updates

2007-08-25 Thread Ingo Molnar

* Ingo Molnar <[EMAIL PROTECTED]> wrote:

> We'll see - people can still readily tweak these values under 
> CONFIG_SCHED_DEBUG=y and give us feedback, and if there's enough 
> demand for very-finegrained scheduling (throughput be damned), we 
> could introduce yet another config option or enable the runtime 
> tunable unconditionally. (Or maybe the best option is Peter Zijlstra's 
> adaptive granularity idea, that gives the best of both worlds.)

hm, glxgears smoothness regresses with the de-HZ-ification change: with 
an increasing background load the turning of the wheels quickly becomes 
visually ugly - with small ruckles instead of smooth rotation.

The reason for that is that the 20 msec granularity on my testbox (which 
is a dual-core box, so the default 10msec turns into 20msec) turns into 
40, 60, 80, 100 msec 'observed latency' for glxgears as load increases 
to 2x, 3x, 4x etc - and a 100 msec pause in rotation is easily 
perceivable to the human eye (and brain). Before that the delay curve 
with increasing load was 4msec/8msec/12msec etc.

Due to the removal of the HZ dependency we now have upset the 
granularity picture anyway, so i believe we should do the adaptive 
granularity thing right now. That will aim for a 40msec task-observable 
latency, in a load-independent manner. (!) (This is an approach we 
couldn't even dream of with the previous, fixed-timeslice scheduler.)

The code is simple (and it is all in the slowpath), it in essence boils 
down to this new code:

 +static long
 +sched_granularity(struct cfs_rq *cfs_rq)
 +{
 +	unsigned int gran = sysctl_sched_latency;
 +	unsigned int nr = cfs_rq->nr_running;
 +
 +	if (nr > 1) {
 +		gran = gran/nr - gran/nr/nr;
 +		gran = max(gran, sysctl_sched_granularity);
 +	}
 +
 +	return gran;
 +}
 
IMO it is a good compromise between long slicing and short slicing: 
there are two values, one is the "CPU-bound task latency the scheduler 
aims for", the second one is a minimum granularity (to not do too many 
context-switches).
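
To get a feel for the numbers, the formula can be restated as a plain
userspace function. This is only a sketch: granularity_sketch() and its
parameters are illustrative stand-ins for the two sysctls, and the values
used below are examples, not necessarily the kernel defaults.

```c
/* Userspace restatement of the formula; the parameters replace the sysctls. */
static unsigned int granularity_sketch(unsigned int latency_ns,
				       unsigned int min_gran_ns,
				       unsigned int nr_running)
{
	unsigned int gran = latency_ns;

	if (nr_running > 1) {
		/* lat/nr - lat/nr/nr, in integer arithmetic as in the patch */
		gran = gran / nr_running - gran / nr_running / nr_running;
		/* never go below the minimum granularity */
		if (gran < min_gran_ns)
			gran = min_gran_ns;
	}

	return gran;
}
```

With a 40 msec latency target and a 2 msec minimum, two runnable tasks get
10 msec each, four get 7.5 msec each, and twenty are clamped to the 2 msec
minimum. A full round of nr tasks then takes lat - lat/nr (20 msec and
30 msec in the first two cases), so the observed latency stays at or below
the target until the minimum granularity takes over.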

Peter and I tested this all day with various workloads and extreme-load 
behavior has improved all over the place - while the server benchmarks 
(which want less preemption) are still fine too. The glxgear ruckles are 
all gone.

If you do not disagree with this (it's pretty late in the game with more 
than 1 month spent from the kernel cycle already), please pull the 
latest scheduler tree from:

   git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched.git

Find the shortlog further below. There are 3 commits in it: adaptive 
granularity, a subsequent cleanup, and a lockdep sysctl bug Peter 
noticed while hacking on this. (the bug was introduced with the initial 
CFS commits but nobody noticed because the lockdep sysctls are rarely 
used.)

The linecount increase is mostly due to the comments added to explain 
the "gran = lat/nr - lat/nr/nr" magic formula and due to the extra 
parameter.

Tested on 32-bit and 64-bit x86, and with a few make randconfig build 
tests too.

Ingo

-->
Ingo Molnar (1):
  sched: cleanup, sched_granularity -> sched_min_granularity

Peter Zijlstra (2):
  sched: fix CONFIG_SCHED_DEBUG dependency of lockdep sysctls
  sched: adaptive scheduler granularity

 include/linux/sched.h |    3 +
 kernel/sched.c        |   16 ++
 kernel/sched_fair.c   |   77 ++
 kernel/sysctl.c       |   33 ++---
 4 files changed, 99 insertions(+), 30 deletions(-)


Re: [Suspend2-devel] Problem with CFS V20 and Suspend2/tuxonice

2007-08-25 Thread Christian Hesse
On Saturday 25 August 2007, Fabio Comolli wrote:
> On 8/25/07, David Rodriguez <[EMAIL PROTECTED]> wrote:
> > I'm using 2.6.22.5 with cfs v20.3 and suspend2 2.2.10.2.
> > With that combination, suspend no longer works (it worked with cfs
> > v19).
> > It stops during suspend at "Suspending tasks".
> > Looking at the cfs patch, I changed migration_thread to re-add the
> > try_to_freeze() that the last patch removed. With that the suspend
> > finishes, but resume does not work. Of course I don't know why the
> > call was removed, and putting it back is not a solution, but I wanted
> > to report it.
>
> Reproduced on an Intel Centrino based laptop with gentoo kamikaze7
> sources (http://forums.gentoo.org/viewtopic-t-577970.html)

Same problem here: Core Duo, Kernel 2.6.22.5, Suspend 2.2.10, CFS v20.2.
-- 
Regards,
Chris




Re: [PATCH 0/6] x86: Reduce Memory Usage and Inter-Node message traffic (v2)

2007-08-25 Thread Randy Dunlap
On Sat, 25 Aug 2007 11:24:35 +0200 Andi Kleen wrote:

> On Fri, Aug 24, 2007 at 05:50:18PM -0700, Siddha, Suresh B wrote:
> > On Fri, Aug 24, 2007 at 03:26:54PM -0700, [EMAIL PROTECTED] wrote:
> > > Previous Intro:
> > 
> > Thanks for doing this.
> > 
> > > In x86_64 and i386 architectures most arrays that are sized
> > > using NR_CPUS lay in local memory on node 0.  Not only will most
> > > (99%?) of the systems not use all the slots in these arrays,
> > > particularly when NR_CPUS is increased to accommodate future
> > > very high cpu count systems, but a number of cache lines are
> > > passed unnecessarily on the system bus when these arrays are
> > > referenced by cpus on other nodes.
> > 
> > Can we move cpuinfo_x86 also to per cpu area? Though critical run
> 
> I worry how much impact that would be? boot_cpu_data is quite 
> widely used. 
> 
> > Wonder if this confusion is the reason for git commit f3fa8ebc
> 
> What git commit (full id) ? 

Looks like it's
commit f3fa8ebc25129bb69929e20b0c84049c39029d8d
Author: Rohit Seth <[EMAIL PROTECTED]>
Date:   Mon Jun 26 13:58:17 2006 +0200

[PATCH] x86_64: moving phys_proc_id and cpu_core_id to cpuinfo_x86

---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***


Re: [PATCH] IdealTEK URTC1000 support for usbtouchscreen

2007-08-25 Thread Daniel Ritz
[ please don't send patches as attachment.. ]

On Thursday 23 August 2007 23:55:28 Ondrej Zary wrote:
> Hello,
> this patch adds support for IdealTEK URTC1000 touchscreen controllers.
> 

looks good, just a few minor things...

> Documentation can be downloaded at 
> http://projects.tbmn.org/cgi-bin/trac.cgi/wiki/urtc-1000
> 
> I'm not sure that the code is correct, especially because the .rept_size is 
> set to 8 bytes. I've tried 5, 6, 7, 8, 12 and 16. Only 8 and 16 work. Other 
> values resulted in -EOVERFLOW in usbtouch_irq() and infinite retries.
> 

8 is fine.

> Signed-off-by: Ondrej Zary <[EMAIL PROTECTED]>
> 

+	if (pkt[0] == 1) {
+		/* response string - ignore it */
+		char *end = strchr(pkt, '\r');
+		end[0] = 0;
+		dbg("IdealTEK response: %s\n", &pkt[1]);
+		return 0;

just remove this block

+	} else if ((pkt[0] & 0x98) == 0x88) {
+		/* touch data in IdealTEK mode */
+		dev->x = (pkt[1] << 5) | (pkt[2] >> 2);
+		dev->y = (pkt[3] << 5) | (pkt[4] >> 2);
+		dev->touch = (pkt[0] & 0x40) ? 1 : 0;
+		return 1;
+	} else if ((pkt[0] & 0x98) == 0x98) {
+		/* touch data in MT emulation mode */
+		dev->x = (pkt[2] << 5) | (pkt[1] >> 2);
+		dev->y = (pkt[4] << 5) | (pkt[3] >> 2);
+		dev->touch = (pkt[0] & 0x40) ? 1 : 0;
+		return 1;
+	} else {
+		dbg("IdealTEK: invalid data\n");
+		return 0;
+	}

and this one as well...and do a 'return 0;' as last statement
outside the 'if'

+   .process_pkt= usbtouch_process_multi,

so you're using the support for packets split across two
URBs? is it required? if so, you have to ensure that the function
is compiled in. ie. extend this statement in the driver with your
config option:

#if defined(CONFIG_TOUCHSCREEN_USB_EGALAX) || defined(CONFIG_TOUCHSCREEN_USB_ETURBO)
#define MULTI_PACKET
#endif
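
extended, that could look roughly like the sketch below. the option name
CONFIG_TOUCHSCREEN_USB_IDEALTEK is only a guess at what the new config
symbol might be called, and multi_packet_enabled() is a made-up helper that
exists purely so the result can be checked outside the driver.

```c
/* For the demonstration only: pretend the new (hypothetical) option is set. */
#define CONFIG_TOUCHSCREEN_USB_IDEALTEK 1

#if defined(CONFIG_TOUCHSCREEN_USB_EGALAX) || \
    defined(CONFIG_TOUCHSCREEN_USB_ETURBO) || \
    defined(CONFIG_TOUCHSCREEN_USB_IDEALTEK)
#define MULTI_PACKET
#endif

/* Returns 1 when the multi-packet path would be compiled in, 0 otherwise. */
static int multi_packet_enabled(void)
{
#ifdef MULTI_PACKET
	return 1;
#else
	return 0;
#endif
}
```

in the driver itself only the extra defined(...) term would be added; the
#define at the top normally comes from the kernel configuration.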

the rest looks fine.

rgds
-daniel


Re: Corrupted filesystem with new Firewire stack

2007-08-25 Thread Stefan Richter
> Martin K. Petersen wrote:
>> Happens here on both a G4 and an intel Mini.  Both running FC7.
>>
>> Aug 17 08:24:08 mini kernel: firewire_sbp2: status write for unknown orb
>> Aug 17 08:25:08 mini kernel: firewire_sbp2: sbp2_scsi_abort
>> Aug 17 08:26:36 mini kernel: firewire_sbp2: status write for unknown orb
>> Aug 17 08:27:36 mini kernel: firewire_sbp2: sbp2_scsi_abort
>> Aug 17 08:33:51 mini kernel: firewire_sbp2: status write for unknown orb
>> Aug 17 08:34:51 mini kernel: firewire_sbp2: sbp2_scsi_abort

The infrequent occurrences of this on my test setup vanished after
Kristian's patch.  The patch is now in linux1394-2.6.git and in
http://me.in-berlin.de/~s5r6/linux1394/updates/.
-- 
Stefan Richter
-=-=-=== =--- ==--=
http://arcgraph.de/sr/


Re: gettimeofday() jumping into the future

2007-08-25 Thread Michael Smith
>
> Hmm. That does sound like unsynced TSCs. Normally Intel systems don't
> skew unless they are NUMA systems or you're entering low power states.
> We try to catch both of those cases, so I'm not sure how your box is
> slipping through.
>
> Can you run the following test to verify that the TSCs are skewed?

I ran this for the past two days, multiple copies to put some load on
the system (the initial problem with gettimeofday() was easier to
reproduce under load). Nothing.

So, I guess that rules out unsynced TSCs as the cause? Or perhaps it
only happens in other conditions, for some reason. I'm on holidays
right now, but my coworkers should be able to continue investigating
other possible causes.

Thanks for the advice/test program.

Mike


[GIT PULL] FireWire fixes

2007-08-25 Thread Stefan Richter
Linus, please pull from the for-linus branch at

git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6.git for-linus

to receive the following fixes for the new and old 1394 subsystems:

Kristian Høgsberg (1):
  firewire: Add ref-counting for sbp2 orbs (fix command abortion)

Stefan Richter (2):
  ieee1394: sbp2: fix sbp2_remove_device for error cases
  firewire: fix unloading of fw-ohci while devices are attached


 drivers/firewire/fw-card.c |    6 +++-
 drivers/firewire/fw-sbp2.c |   49 +--
 drivers/ieee1394/sbp2.c    |   14 +-
 3 files changed, 51 insertions(+), 18 deletions(-)


commit e57d2011a6276d55a87f26653a0395f302ce0d51
Author: Kristian Høgsberg <[EMAIL PROTECTED]>
Date:   Fri Aug 24 18:59:58 2007 -0400

firewire: Add ref-counting for sbp2 orbs (fix command abortion)

This handles the case where we get the status write before getting the
complete_transaction callback ("status write for unknown orb").  In
this case, we just assume that the initial orb pointer transaction
succeeded and finish the orb.  To prevent the transaction callback
from touching freed memory, we ref-count the orb structures.

Signed-off-by: Kristian Høgsberg <[EMAIL PROTECTED]>
Signed-off-by: Stefan Richter <[EMAIL PROTECTED]>
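
The reference counting described in the commit message can be illustrated
with a small userspace sketch. It is not the driver code: struct orb_sketch
and the orb_* functions are made-up stand-ins, a plain int replaces struct
kref (so this version is not safe against concurrent callers), and the
"freed" flag exists only so the moment of release can be observed.

```c
#include <stdlib.h>

/* Hypothetical userspace stand-in for an orb with an embedded refcount. */
struct orb_sketch {
	int refcount;
	int *freed;      /* set to 1 by the release so callers can observe it */
};

static void orb_get(struct orb_sketch *orb)
{
	orb->refcount++;
}

/* Drop one reference; the last one releases the memory. */
static void orb_put(struct orb_sketch *orb)
{
	if (--orb->refcount == 0) {
		*orb->freed = 1;
		free(orb);
	}
}

/* Allocate an orb and take the two extra references before "sending" it. */
static struct orb_sketch *orb_send(int *freed)
{
	struct orb_sketch *orb = malloc(sizeof(*orb));

	if (orb == NULL)
		return NULL;
	orb->refcount = 1;   /* creator's reference, as after kref_init() */
	orb->freed = freed;
	*freed = 0;
	orb_get(orb);        /* reference held on behalf of the orb list */
	orb_get(orb);        /* reference held by the transaction callback */
	return orb;
}

/* Each handler only drops the reference it owns, in whatever order they run. */
static void orb_status_write(struct orb_sketch *orb)         { orb_put(orb); }
static void orb_complete_transaction(struct orb_sketch *orb) { orb_put(orb); }
```

Whether the status write or the transaction callback runs first, the memory
stays valid until both have run and the creator has dropped its own
reference with a final orb_put().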

diff --git a/drivers/firewire/fw-sbp2.c b/drivers/firewire/fw-sbp2.c
index ba816ef..238730f 100644
--- a/drivers/firewire/fw-sbp2.c
+++ b/drivers/firewire/fw-sbp2.c
@@ -159,6 +159,7 @@ struct sbp2_pointer {
 
 struct sbp2_orb {
 	struct fw_transaction t;
+	struct kref kref;
 	dma_addr_t request_bus;
 	int rcode;
 	struct sbp2_pointer pointer;
@@ -280,6 +281,14 @@ static const struct {
 };
 
 static void
+free_orb(struct kref *kref)
+{
+	struct sbp2_orb *orb = container_of(kref, struct sbp2_orb, kref);
+
+	kfree(orb);
+}
+
+static void
 sbp2_status_write(struct fw_card *card, struct fw_request *request,
 		  int tcode, int destination, int source,
 		  int generation, int speed,
@@ -312,8 +321,8 @@ sbp2_status_write(struct fw_card *card, struct fw_request *request,
 	spin_lock_irqsave(&card->lock, flags);
 	list_for_each_entry(orb, &sd->orb_list, link) {
 		if (STATUS_GET_ORB_HIGH(status) == 0 &&
-		    STATUS_GET_ORB_LOW(status) == orb->request_bus &&
-		    orb->rcode == RCODE_COMPLETE) {
+		    STATUS_GET_ORB_LOW(status) == orb->request_bus) {
+			orb->rcode = RCODE_COMPLETE;
 			list_del(&orb->link);
 			break;
 		}
@@ -325,6 +334,8 @@ sbp2_status_write(struct fw_card *card, struct fw_request *request,
 	else
 		fw_error("status write for unknown orb\n");
 
+	kref_put(&orb->kref, free_orb);
+
 	fw_send_response(card, request, RCODE_COMPLETE);
 }
 
@@ -335,13 +346,27 @@ complete_transaction(struct fw_card *card, int rcode,
 	struct sbp2_orb *orb = data;
 	unsigned long flags;
 
-	orb->rcode = rcode;
-	if (rcode != RCODE_COMPLETE) {
-		spin_lock_irqsave(&card->lock, flags);
+	/*
+	 * This is a little tricky.  We can get the status write for
+	 * the orb before we get this callback.  The status write
+	 * handler above will assume the orb pointer transaction was
+	 * successful and set the rcode to RCODE_COMPLETE for the orb.
+	 * So this callback only sets the rcode if it hasn't already
+	 * been set and only does the cleanup if the transaction
+	 * failed and we didn't already get a status write.
+	 */
+	spin_lock_irqsave(&card->lock, flags);
+
+	if (orb->rcode == -1)
+		orb->rcode = rcode;
+	if (orb->rcode != RCODE_COMPLETE) {
 		list_del(&orb->link);
-		spin_unlock_irqrestore(&card->lock, flags);
 		orb->callback(orb, NULL);
 	}
+
+	spin_unlock_irqrestore(&card->lock, flags);
+
+	kref_put(&orb->kref, free_orb);
 }
 
 static void
@@ -360,6 +385,10 @@ sbp2_send_orb(struct sbp2_orb *orb, struct fw_unit *unit,
 	list_add_tail(&orb->link, &sd->orb_list);
 	spin_unlock_irqrestore(&device->card->lock, flags);
 
+	/* Take a ref for the orb list and for the transaction callback. */
+	kref_get(&orb->kref);
+	kref_get(&orb->kref);
+
 	fw_send_request(device->card, &orb->t, TCODE_WRITE_BLOCK_REQUEST,
 			node_id, generation, device->max_speed, offset,
 			&orb->pointer, sizeof(orb->pointer),
@@ -416,6 +445,7 @@ sbp2_send_management_orb(struct fw_unit *unit, int node_id, int generation,
 	if (orb == NULL)
 		return -ENOMEM;
 
+	kref_init(&orb->base.kref);
 	orb->response_bus =
 		dma_map_single(device->card->device, &orb->response,
 			       sizeof(orb->response), DMA_FROM_DEVICE);
@@ -490,7 +520,7 @@ sbp2_send_management_orb(struct fw_unit *unit, 

cpuset: attach_task() vs sched_setaffinity() race?

2007-08-25 Thread Oleg Nesterov
After a brief look at kernel/cpuset.c, it seems that attach_task() should
guarantee that the task can't use CPUs outside of cpuset->cpus_allowed.

But this looks racy wrt sched_setaffinity() which does

cpus_allowed = cpuset_cpus_allowed(p);
// callback_mutex is free
set_cpus_allowed(p);

What if attach_task()->set_cpus_allowed() happens in between?
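
The interleaving in question can be spelled out step by step in a small
single-threaded sketch. All names in it are illustrative stand-ins rather
than the kernel functions, CPU masks are reduced to plain bit sets, and it
assumes that sched_setaffinity() limits the requested mask by the value it
sampled from the cpuset earlier.

```c
/* CPU masks as plain bit sets; every name here is illustrative only. */
typedef unsigned long mask_t;

struct task_sketch   { mask_t cpus_allowed; };
struct cpuset_sketch { mask_t cpus_allowed; };

/* Step 1 of sched_setaffinity(): sample the cpuset mask, no lock held after. */
static mask_t affinity_sample(const struct cpuset_sketch *cs)
{
	return cs->cpus_allowed;
}

/* Step 2 of sched_setaffinity(): apply the request, limited by the sample. */
static void affinity_apply(struct task_sketch *p, mask_t requested,
			   mask_t sampled)
{
	p->cpus_allowed = requested & sampled;
}

/* attach_task(): move the task to a new cpuset and restrict it accordingly. */
static void attach_sketch(struct task_sketch *p,
			  const struct cpuset_sketch *new_cs)
{
	p->cpus_allowed = new_cs->cpus_allowed;
}

/* Replays the suspected race in a fixed order and returns the final mask. */
static mask_t racy_interleaving(void)
{
	struct cpuset_sketch old_cs = { 0x3 };   /* CPUs 0-1 */
	struct cpuset_sketch new_cs = { 0xc };   /* CPUs 2-3 */
	struct task_sketch p = { 0x3 };

	mask_t sampled = affinity_sample(&old_cs);  /* setaffinity caller */
	attach_sketch(&p, &new_cs);                 /* attach_task() in between */
	affinity_apply(&p, 0x1, sampled);           /* setaffinity caller resumes */

	return p.cpus_allowed;
}
```

In this replay the task ends up allowed on CPU 0 only, which lies outside
the new cpuset (CPUs 2-3), because the second step still uses the mask
sampled from the old cpuset.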


Another question: update_cpumask(cs) does nothing with the tasks attached to
that cpuset, why? It may take a while before the task actually migrates to the
new CPU.

Thanks,

Oleg.



Re: Linux 2.6.20.17

2007-08-25 Thread Willy Tarreau
diff --git a/Makefile b/Makefile
index b3806cb..bce2fbf 100644
--- a/Makefile
+++ b/Makefile
@@ -1,7 +1,7 @@
 VERSION = 2
 PATCHLEVEL = 6
 SUBLEVEL = 20
-EXTRAVERSION = .16
+EXTRAVERSION = .17
 NAME = Homicidal Dwarf Hamster
 
 # *DOCUMENTATION*
diff --git a/arch/i386/kernel/cpu/cpufreq/acpi-cpufreq.c b/arch/i386/kernel/cpu/cpufreq/acpi-cpufreq.c
index 10baa35..18c8b67 100644
--- a/arch/i386/kernel/cpu/cpufreq/acpi-cpufreq.c
+++ b/arch/i386/kernel/cpu/cpufreq/acpi-cpufreq.c
@@ -167,11 +167,13 @@ static void do_drv_read(struct drv_cmd *cmd)
 
 static void do_drv_write(struct drv_cmd *cmd)
 {
-	u32 h = 0;
+	u32 lo, hi;
 
 	switch (cmd->type) {
 	case SYSTEM_INTEL_MSR_CAPABLE:
-		wrmsr(cmd->addr.msr.reg, cmd->val, h);
+		rdmsr(cmd->addr.msr.reg, lo, hi);
+		lo = (lo & ~INTEL_MSR_RANGE) | (cmd->val & INTEL_MSR_RANGE);
+		wrmsr(cmd->addr.msr.reg, lo, hi);
 		break;
 	case SYSTEM_IO_CAPABLE:
 		acpi_os_write_port((acpi_io_address)cmd->addr.io.port,
@@ -372,7 +374,6 @@ static int acpi_cpufreq_target(struct cpufreq_policy *policy,
 	struct cpufreq_freqs freqs;
 	cpumask_t online_policy_cpus;
 	struct drv_cmd cmd;
-	unsigned int msr;
 	unsigned int next_state = 0; /* Index into freq_table */
 	unsigned int next_perf_state = 0; /* Index into perf table */
 	unsigned int i;
@@ -417,11 +418,7 @@ static int acpi_cpufreq_target(struct cpufreq_policy *policy,
 	case SYSTEM_INTEL_MSR_CAPABLE:
 		cmd.type = SYSTEM_INTEL_MSR_CAPABLE;
 		cmd.addr.msr.reg = MSR_IA32_PERF_CTL;
-		msr =
-		    (u32) perf->states[next_perf_state].
-		    control & INTEL_MSR_RANGE;
-		cmd.val = get_cur_val(online_policy_cpus);
-		cmd.val = (cmd.val & ~INTEL_MSR_RANGE) | msr;
+		cmd.val = (u32) perf->states[next_perf_state].control;
 		break;
 	case SYSTEM_IO_CAPABLE:
 		cmd.type = SYSTEM_IO_CAPABLE;
diff --git a/arch/sparc/kernel/entry.S b/arch/sparc/kernel/entry.S
index 831f540..eac3838 100644
--- a/arch/sparc/kernel/entry.S
+++ b/arch/sparc/kernel/entry.S
@@ -1749,8 +1749,8 @@ fpload:
 __ndelay:
 	save	%sp, -STACKFRAME_SZ, %sp
 	mov	%i0, %o0
-	call	.umul
-	 mov	0x1ad, %o1		! 2**32 / (1 000 000 000 / HZ)
+	call	.umul			! round multiplier up so large ns ok
+	 mov	0x1ae, %o1		! 2**32 / (1 000 000 000 / HZ)
 	call	.umul
 	 mov	%i1, %o1		! udelay_val
 	ba	delay_continue
@@ -1760,11 +1760,17 @@ __ndelay:
 __udelay:
 	save	%sp, -STACKFRAME_SZ, %sp
 	mov	%i0, %o0
-	sethi	%hi(0x10c6), %o1
+	sethi	%hi(0x10c7), %o1	! round multiplier up so large us ok
 	call	.umul
-	 or	%o1, %lo(0x10c6), %o1	! 2**32 / 1 000 000
+	 or	%o1, %lo(0x10c7), %o1	! 2**32 / 1 000 000
 	call	.umul
 	 mov	%i1, %o1		! udelay_val
+	sethi	%hi(0x028f4b62), %l0	! Add in rounding constant * 2**32,
+	or	%g0, %lo(0x028f4b62), %l0
+	addcc	%o0, %l0, %o0		! 2**32 * 0.009 999
+	bcs,a	3f
+	 add	%o1, 0x01, %o1
+3:
 	call	.umul
 	 mov	HZ, %o0			! >>32 earlier for wider range
 
diff --git a/arch/sparc/lib/memset.S b/arch/sparc/lib/memset.S
index a65eba4..1c37ea8 100644
--- a/arch/sparc/lib/memset.S
+++ b/arch/sparc/lib/memset.S
@@ -162,7 +162,7 @@ __bzero:
 8:
 	 add	%o0, 1, %o0
 	subcc	%o1, 1, %o1
-	bne,a	8b
+	bne	8b
 	 EX(stb	%g3, [%o0 - 1], add %o1, 1)
 0:
 	retl
diff --git a/arch/sparc64/kernel/head.S b/arch/sparc64/kernel/head.S
index 06459ae..0e19369 100644
--- a/arch/sparc64/kernel/head.S
+++ b/arch/sparc64/kernel/head.S
@@ -458,7 +458,6 @@ tlb_fixup_done:
 	or	%g6, %lo(init_thread_union), %g6
 	ldx	[%g6 + TI_TASK], %g4
 	mov	%sp, %l6
-	mov	%o4, %l7
 
 	wr	%g0, ASI_P, %asi
 	mov	1, %g1
diff --git a/arch/um/os-Linux/user_syms.c b/arch/um/os-Linux/user_syms.c
index 3f33165..419b2d5 100644
--- a/arch/um/os-Linux/user_syms.c
+++ b/arch/um/os-Linux/user_syms.c
@@ -5,7 +5,8 @@
  * so I *must* declare good prototypes for them and then EXPORT them.
  * The kernel code uses the macro defined by include/linux/string.h,
  * so I undef macros; the userspace code does not include that and I
- * add an EXPORT for the glibc one.*/
+ * add an EXPORT for the glibc one.
+ */
 
 #undef strlen
 #undef strstr
@@ -61,12 +62,18 @@ EXPORT_SYMBOL_PROTO(dup2);
 EXPORT_SYMBOL_PROTO(__xstat);
 EXPORT_SYMBOL_PROTO(__lxstat);
 EXPORT_SYMBOL_PROTO(__lxstat64);
+EXPORT_SYMBOL_PROTO(__fxstat64);
 EXPORT_SYMBOL_PROTO(lseek);
 EXPORT_SYMBOL_PROTO(lseek64);
 EXPORT_SYMBOL_PROTO(chown);
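
The sparc __ndelay/__udelay hunks earlier in this patch replace truncated
fixed-point multipliers with rounded-up ones. A minimal sketch of that
arithmetic follows, assuming HZ=100 (which is what the 0x1ad/0x1ae and
"0.009 999" constants correspond to). The helper names are hypothetical and
this is not a transcription of the assembly; it only shows why a truncated
multiplier under-counts.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration; none of these names are kernel code. */

/* 2**32 / d, truncated. Matches the old constants 0x10c6 and 0x1ad. */
static uint32_t recip_floor(uint32_t d)
{
    return (uint32_t)((UINT64_C(1) << 32) / d);
}

/* 2**32 / d, rounded up. Matches the new constants 0x10c7 and 0x1ae. */
static uint32_t recip_ceil(uint32_t d)
{
    return (uint32_t)(((UINT64_C(1) << 32) + d - 1) / d);
}

/* Approximate n / d as the high 32 bits of n * mult. */
static uint32_t scale(uint32_t n, uint32_t mult)
{
    return (uint32_t)(((uint64_t)n * mult) >> 32);
}
```

With the truncated multiplier, scaling exactly 1 000 000 comes out one short
of the true quotient, while the rounded-up multiplier gives the expected
value. A delay routine that errs on the long side is the safer failure mode.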

Linux 2.6.20.17

2007-08-25 Thread Willy Tarreau

I've just released Linux 2.6.20.17.

As a reminder, it fixes these 3 security issues :
  CVE-2007-3105
  CVE-2007-3848
  CVE-2007-3851

I'll also be replying to this message with a copy of the patch between
2.6.20.16 and 2.6.20.17.

The patch and changelog will appear soon at the following locations:
  ftp://ftp.all.kernel.org/pub/linux/kernel/v2.6/
  ftp://ftp.all.kernel.org/pub/linux/kernel/v2.6/patch-2.6.20.17.bz2
  ftp://ftp.all.kernel.org/pub/linux/kernel/v2.6/ChangeLog-2.6.20.17

Git repository:
   git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-2.6.20.y.git
  http://www.kernel.org/pub/scm/linux/kernel/git/stable/linux-2.6.20.y.git

Git repository through the gitweb interface:
  http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.20.y.git

Willy

---

 Makefile   |2 
 arch/i386/kernel/cpu/cpufreq/acpi-cpufreq.c|   13 +--
 arch/sparc/kernel/entry.S  |   14 ++-
 arch/sparc/lib/memset.S|2 
 arch/sparc64/kernel/head.S |1 
 arch/um/os-Linux/user_syms.c   |   20 +---
 drivers/ata/ata_piix.c |2 
 drivers/ata/pata_atiixp.c  |1 
 drivers/base/cpu.c |2 
 drivers/char/drm/i915_dma.c|   14 ++-
 drivers/char/drm/i915_drv.h|1 
 drivers/char/random.c  |9 +-
 drivers/char/sx.c  |4 
 drivers/cpufreq/cpufreq_ondemand.c |   30 --
 drivers/kvm/svm.c  |6 +
 drivers/kvm/svm.h  |3 
 drivers/md/dm-crypt.c  |5 -
 drivers/md/dm-exception-store.c|   11 +-
 drivers/md/dm-mpath.c  |3 
 drivers/md/dm-snap.c   |   11 --
 drivers/md/dm.c|9 ++
 drivers/md/raid10.c|   10 ++
 drivers/media/video/v4l2-common.c  |   19 +++-
 drivers/media/video/wm8739.c   |2 
 drivers/media/video/wm8775.c   |2 
 drivers/net/forcedeth.c|  108 ++---
 drivers/pcmcia/cs.c|3 
 drivers/scsi/aacraid/linit.c   |4 
 drivers/usb/core/hub.c |   10 +-
 drivers/video/macmodes.c   |5 -
 drivers/video/macmodes.h   |8 -
 drivers/video/stifb.c  |   19 ++--
 fs/9p/conv.c   |1 
 fs/direct-io.c |1 
 fs/exec.c  |   13 ++-
 fs/ext4/extents.c  |2 
 fs/jbd/commit.c|3 
 fs/jbd2/commit.c   |3 
 fs/nfsd/vfs.c  |2 
 fs/splice.c|5 -
 include/linux/Kbuild   |1 
 include/linux/netfilter_ipv4/ipt_iprange.h |2 
 include/net/bluetooth/rfcomm.h |1 
 include/net/xfrm.h |1 
 ipc/shm.c  |2 
 kernel/lockdep_proc.c  |2 
 mm/hugetlb.c   |   15 ++-
 mm/mlock.c |5 -
 mm/readahead.c |   12 ++
 net/bluetooth/rfcomm/tty.c |   34 ++-
 net/core/gen_estimator.c   |   82 +++---
 net/core/netpoll.c |2 
 net/ieee80211/softmac/ieee80211softmac_assoc.c |5 -
 net/ieee80211/softmac/ieee80211softmac_wx.c|   11 +-
 net/ipv6/addrconf.c|1 
 net/ipv6/anycast.c |1 
 net/ipv6/icmp.c|2 
 net/ipv6/tcp_ipv6.c|1 
 net/sctp/ipv6.c|4 
 net/sunrpc/auth_gss/svcauth_gss.c  |9 +-
 net/xfrm/xfrm_policy.c |2 
 61 files changed, 410 insertions(+), 168 deletions(-)

Summary of changes from 2.6.20.16 to 2.6.20.17


Adrian Bunk (2):
  Missing header include in ipt_iprange.h
  drivers/video/macmodes.c:mac_find_mode() mustn't be __devinit

Alan Cox (1):
  aacraid: fix security hole

Alan Stern (1):
  USB: fix warning caused by autosuspend counter going negative

Alexander Shmelev (1):
  Fix sparc32 memset()

Alexey Dobriyan (1):
  Fix leak on /proc/lockdep_stats

Arne Redlich (1):
  md: handle writes to broken raid10 arrays gracefully

Ayaz Abdulla (2):
  forcedeth bug fix: cicada phy
  forcedeth bug fix: 

Re: [Linux-fbdev-devel] [PATCH] uvesafb: select connector in Kconfig

2007-08-25 Thread Randy Dunlap
On Sat, 25 Aug 2007 10:59:31 +0200 Michal Januszewski wrote:

> Make uvesafb select connector instead of depending on it being already 
> selected.
> 
> Signed-off-by: Michal Januszewski <[EMAIL PROTECTED]>
> ---
> diff --git a/drivers/video/Kconfig b/drivers/video/Kconfig
> index f1cc899..e152eed 100644
> --- a/drivers/video/Kconfig
> +++ b/drivers/video/Kconfig
> @@ -594,7 +594,8 @@ config FB_TGA
>  
>  config FB_UVESA
>   tristate "Userspace VESA VGA graphics support"
> - depends on FB && CONNECTOR
> + depends on FB
> + select CONNECTOR
>   select FB_CFB_FILLRECT
>   select FB_CFB_COPYAREA
>   select FB_CFB_IMAGEBLIT

There are 2 problems with this.

a.  CONNECTOR depends on NET, but select won't enable NET, so if
NET is not enabled otherwise, CONNECTOR still won't build.
This is a longstanding select/kconfig issue.

b.  CONNECTOR depends on NET, so if select did follow the depends
chain and enable NET, it would be enabling the networking subsystem,
which is something that should not be done quietly.

select should only be used for enabling small library-like code
(if even then).  Avoid it whenever possible.


---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Add all thread stats for TASKSTATS_CMD_ATTR_TGID

2007-08-25 Thread Guillaume Chazarain
On Mon, 20 Aug 2007 22:31:08 +0530,
Balbir Singh <[EMAIL PROTECTED]> wrote:

> > --- a/kernel/taskstats.c    Sat Aug 18 17:15:17 2007 -0700
> > +++ b/kernel/taskstats.c    Sun Aug 19 17:20:15 2007 +0200
> > @@ -246,6 +246,8 @@ static int fill_tgid(pid_t tgid, struct 
> > 
> > stats->nvcsw += tsk->nvcsw;
> > stats->nivcsw += tsk->nivcsw;
> > +   bacct_add_tsk(stats, tsk);
> > +   xacct_add_tsk(stats, tsk);
> 
> I'm afraid this is still not good enough. bacct_add_tsk() will assign
> values and do nothing in the loop (HINT: no summation).

Hi Balbir, thank you for your review. I agree with everything you said
and am on my way to do it as time permits, but I have some trouble
understanding this part. You stated that bacct_add_tsk() would overwrite
the stats of each thread in the tgid stats, but the other part of the
patch is the (actually wrong) combination of stats in xxx_add_tsk()
using min/max/sum.
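
The distinction Balbir hints at (assignment versus summation inside the
per-thread loop) can be sketched in userspace C. This is only an
illustration under simplified assumptions: the struct and function names
below are hypothetical and are not the taskstats structures or the real
bacct_add_tsk().

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for per-thread and per-tgid stats. */
struct thread_stats { unsigned long long utime, stime; };
struct group_stats  { unsigned long long utime, stime; };

/* Assigns: every call overwrites what the previous thread stored. */
static void fill_assign(struct group_stats *g, const struct thread_stats *t)
{
    g->utime = t->utime;
    g->stime = t->stime;
}

/* Accumulates: every call adds this thread's share to the group total. */
static void fill_sum(struct group_stats *g, const struct thread_stats *t)
{
    g->utime += t->utime;
    g->stime += t->stime;
}

/* Walk all threads of a group, applying the chosen fill function. */
static struct group_stats collect(const struct thread_stats *threads, size_t n,
                                  void (*fill)(struct group_stats *,
                                               const struct thread_stats *))
{
    struct group_stats g = { 0, 0 };
    size_t i;

    for (i = 0; i < n; i++)
        fill(&g, &threads[i]);
    return g;
}
```

With an assigning helper, the group ends up holding only the last thread's
values, which is the "no summation" problem raised in the review.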

Also, I don't understand why the code to update btime:

/* calculate task elapsed time in timespec */
do_posix_clock_monotonic_gettime(&uptime);
ts = timespec_sub(uptime, tsk->start_time);
...
stats->ac_btime = get_seconds() - ts.tv_sec;

does not simply use tsk->start_time or tsk->real_start_time without
comparing it to the current time.

Thanks.

-- 
Guillaume


Re: [PATCH] Add source address to sunrpc svc errors

2007-08-25 Thread Dr. David Alan Gilbert
Hi Randy,
  Thanks for your comments,

* Randy Dunlap ([EMAIL PROTECTED]) wrote:
  in reply to my patch:

> > +static int
> > +svc_printkerr(struct svc_rqst *rqstp, const char* fmt,...)
> 
> add:
>   __attribute__ ((format (printf, 2, 3)))
> so that the compiler can check the args list.

Added.

> > +   printk("svc: %s :", svc_print_addr(rqstp, buf, sizeof(buf)));
> 
> some KERN_* level, please.

I've gone with KERN_WARNING which seems about right.

Here is a v2 of the patch that adds those two fixes and also
tidies up the output a little.
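
For readers unfamiliar with the format attribute Randy asked for, here is a
minimal userspace sketch. It assumes GCC or Clang (the attribute is a
compiler extension), and format_with_addr() is a hypothetical stand-in, not
the svc_printkerr() from the patch: it formats into a caller-supplied buffer
so that the prefix and the message end up in one string.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper. Argument 4 is the format string and the variadic
 * arguments start at position 5, so the compiler can check them against
 * the format just as it does for printf().
 */
static int format_with_addr(char *out, size_t outlen, const char *addr,
                            const char *fmt, ...)
        __attribute__((format(printf, 4, 5)));

static int format_with_addr(char *out, size_t outlen, const char *addr,
                            const char *fmt, ...)
{
    va_list args;
    int prefix, body;

    /* Write the "svc: <address>: " prefix first. */
    prefix = snprintf(out, outlen, "svc: %s: ", addr);
    if (prefix < 0 || (size_t)prefix >= outlen)
        return -1;

    /* Append the caller's message after the prefix. */
    va_start(args, fmt);
    body = vsnprintf(out + prefix, outlen - (size_t)prefix, fmt, args);
    va_end(args);
    if (body < 0 || (size_t)body >= outlen - (size_t)prefix)
        return -1;

    return prefix + body;
}
```

Passing, say, a string where the format expects %d should now produce a
compile-time warning with -Wformat instead of a silent runtime error.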

Dave
-- 
 -Open up your eyes, open up your mind, open up your code ---   
/ Dr. David Alan Gilbert| Running GNU/Linux on Alpha,68K| Happy  \ 
\ gro.gilbert @ treblig.org | MIPS,x86,ARM,SPARC,PPC & HPPA | In Hex /
 \ _|_ http://www.treblig.org   |___/

--
Hi,
  This patch adds the address of the client that caused an
error in sunrpc/svc.c so that you get errors that look like:

svc: 192.168.66.28, port=709: unknown version (3 for prog 13, nfsd)

I've seen machines which get bunches of unknown version or similar
errors from time to time, and while the recent patch to add
the service helps to find which service has the wrong version it doesn't
help find the potentially bad client.

  The patch is against a checkout of Linus's git tree made
on 2007-08-24.

  One observation is that the svc_print_addr function
prints to a buffer which in this case makes life a little more complex;
it just feels as if there must be lots of places that print a
connection address - is there a better function to use anywhere?

I think actually there are a few places with semi duplicated code; e.g.
one_sock_name switches on the address family but only currently
has IPV4; I wonder how many other places are similar.

Dave

Signed-off-by: Dave Gilbert <[EMAIL PROTECTED]>

--- linux-2.6/net/sunrpc/svc.c.predag   2007-08-25 00:50:06.0 +0100
+++ linux-2.6/net/sunrpc/svc.c  2007-08-25 15:53:36.0 +0100
@@ -777,6 +777,30 @@ svc_register(struct svc_serv *serv, int 
 }
 
 /*
+ * Printk the given error with the address of the client that caused it.
+ */
+static int
+__attribute__ ((format (printf, 2, 3)))
+svc_printkerr(struct svc_rqst *rqstp, const char *fmt, ...)
+{
+   va_list args;
+   int r;
+   char buf[RPC_MAX_ADDRBUFLEN];
+
+   if (!net_ratelimit())
+   return 0;
+
+   printk(KERN_WARNING "svc: %s: ",
+   svc_print_addr(rqstp, buf, sizeof(buf)));
+
+   va_start(args, fmt);
+   r = vprintk(fmt, args);
+   va_end(args);
+
+   return r;
+}
+
+/*
  * Process the RPC request.
  */
 int
@@ -963,14 +987,13 @@ svc_process(struct svc_rqst *rqstp)
return 0;
 
 err_short_len:
-   if (net_ratelimit())
-   printk("svc: short len %Zd, dropping request\n", argv->iov_len);
+   svc_printkerr(rqstp, "short len %Zd, dropping request\n",
+   argv->iov_len);
 
goto dropit; /* drop request */
 
 err_bad_dir:
-   if (net_ratelimit())
-   printk("svc: bad direction %d, dropping request\n", dir);
+   svc_printkerr(rqstp, "bad direction %d, dropping request\n", dir);
 
serv->sv_stats->rpcbadfmt++;
goto dropit; /* drop request */
@@ -1000,8 +1023,7 @@ err_bad_prog:
goto sendit;
 
 err_bad_vers:
-   if (net_ratelimit())
-   printk("svc: unknown version (%d for prog %d, %s)\n",
+   svc_printkerr(rqstp, "unknown version (%d for prog %d, %s)\n",
   vers, prog, progp->pg_name);
 
serv->sv_stats->rpcbadfmt++;
@@ -1011,16 +1033,14 @@ err_bad_vers:
goto sendit;
 
 err_bad_proc:
-   if (net_ratelimit())
-   printk("svc: unknown procedure (%d)\n", proc);
+   svc_printkerr(rqstp, "unknown procedure (%d)\n", proc);
 
serv->sv_stats->rpcbadfmt++;
svc_putnl(resv, RPC_PROC_UNAVAIL);
goto sendit;
 
 err_garbage:
-   if (net_ratelimit())
-   printk("svc: failed to decode args\n");
+   svc_printkerr(rqstp, "failed to decode args\n");
 
rpc_stat = rpc_garbage_args;
 err_bad:


[PATCH] avoid negative shifts in radix-tree.c

2007-08-25 Thread Peter Firefly Lund
([EMAIL PROTECTED] only bcc'ed because it's subscribers only,
Lameter addressed because I think he touched the code last, Velikov and
Hellwig because they touched the code first.)

The current code in __maxindex() shifts by a negative amount first and
only then fixes the situation by discarding the result if the shift
amount would have been negative.  This happens to work on almost any
architecture despite not being valid C.

Chapter and verse from the (draft) ISO C99 standard, "6.5.7 Bitwise
shift operators", paragraph 3:

  "The integer promotions are performed on each of the operands. The
   type of the result is that of the promoted left operand. If the
   value of the right operand is negative or is greater than or equal
   to the width of the promoted left operand, the behavior is
   undefined."

Right-shifting by a negative amount causes a "reserved operand fault" on
the VAX with some gcc versions.


Applies to 2.6.22.
Boot tested lightly on an emulated VAX.

(The function is called 7 times on boot with the values 0..6 -- I
checked that the return values were the same + that it returns ~0UL for
height==6 as was clearly the intention.)

-Peter

--- lib/radix-tree-old.c    2007-08-25 15:36:40.0 +0200
+++ lib/radix-tree.c    2007-08-25 15:36:51.0 +0200
@@ -980,12 +980,14 @@ radix_tree_node_ctor(void *node, struct 
 
 static __init unsigned long __maxindex(unsigned int height)
 {
-   unsigned int tmp = height * RADIX_TREE_MAP_SHIFT;
-   unsigned long index = (~0UL >> (RADIX_TREE_INDEX_BITS - tmp - 1)) >> 1;
-
-   if (tmp >= RADIX_TREE_INDEX_BITS)
-   index = ~0UL;
-   return index;
+   unsigned int    tmp   = height * RADIX_TREE_MAP_SHIFT;
+   int             shift = RADIX_TREE_INDEX_BITS - tmp;
+   unsigned long   index;
+
+   if (shift < 0)
+   return ~0UL;
+   else
+   return ~0UL >> shift;
 }
 
 static __init void radix_tree_init_maxindex(void)
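
The C99 paragraph quoted above also makes a shift by an amount equal to the
width of the operand undefined, which is the case that arises for height 0
(the shift amount then equals RADIX_TREE_INDEX_BITS). Below is a minimal
userspace sketch that sidesteps both ends of the range. It assumes
RADIX_TREE_MAP_SHIFT is 6 and that the index is an unsigned long, and
maxindex_sketch() is a hypothetical name, not the kernel function.

```c
#include <limits.h>

/* Assumed values for illustration; the kernel derives these from its config. */
#define RADIX_TREE_MAP_SHIFT   6
#define RADIX_TREE_INDEX_BITS  (CHAR_BIT * sizeof(unsigned long))

/* Largest index representable by a tree of the given height. */
static unsigned long maxindex_sketch(unsigned int height)
{
    unsigned int width = height * RADIX_TREE_MAP_SHIFT;

    if (width == 0)
        return 0;               /* would otherwise shift by the full width */
    if (width >= RADIX_TREE_INDEX_BITS)
        return ~0UL;            /* would otherwise shift by zero or less */
    return ~0UL >> (RADIX_TREE_INDEX_BITS - width);
}
```

Every shift amount that reaches the last return is strictly between 0 and
the operand width, so no undefined shift is ever evaluated.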




Re: [PATCH 1/2] sysctl: Properly register the irda binary sysctl numbers.

2007-08-25 Thread Valdis . Kletnieks
On Sat, 25 Aug 2007 06:57:17 MDT, Eric W. Biederman said:

> It's good to have confirmation that my sysctl_check routine
> didn't find something else wrong.

If I understand the code, anything it whinges about is either an outright bug
or it's a round of ammo already chambered. ;)

As far as "something else wrong", I'm still seeing these in -rc3-mm1, but
they've been reported before against -rc2-mm2, I think:

[0.628000] sysctl table check failed: /kernel/ostype .1.1 Missing strategy
[0.628000] sysctl table check failed: /kernel/osrelease .1.2 Missing strategy
[0.628000] sysctl table check failed: /kernel/version .1.4 Missing strategy
[0.628000] sysctl table check failed: /kernel/hostname .1.7 Missing strategy
[0.628000] sysctl table check failed: /kernel/domainname .1.8 Missing strategy
[0.628000] sysctl table check failed: /kernel/shmmax .1.34 Missing strategy
[0.628000] sysctl table check failed: /kernel/shmall .1.41 Missing strategy
[0.628000] sysctl table check failed: /kernel/shmmni .1.45 Missing strategy
[0.628000] sysctl table check failed: /kernel/msgmax .1.35 Missing strategy
[0.628000] sysctl table check failed: /kernel/msgmni .1.42 Missing strategy
[0.628000] sysctl table check failed: /kernel/msgmnb .1.36 Missing strategy
[0.628000] sysctl table check failed: /kernel/sem .1.43 Missing strategy

And this isn't on an allyesconfig or allmodconfig. There may well be sysctl
code I didn't hit - my /lib/modules/2.6.23-rc3-mm1 is only about 10M, and
the Fedora kernels are weighing in at about 75M of /lib/modules a pop.




Re: [POWERPC] Fix for assembler -g

2007-08-25 Thread Segher Boessenkool

>>>> But there is no lparmap.o!  lparmap.s is the generated file.
>>>
>>> Yeah, tell that to scripts/Makefile.lib:
>>>
>>> _c_flags   = $(CFLAGS) $(EXTRA_CFLAGS) $(CFLAGS_$(basetarget).o)
>>>
>>> What would do what a person expects is $(CFLAGS_$(@F)), I think.
>>
>> Looks good to me.  Sam?  We wanted to set CFLAGS_lparmap.s .
>
> To avoid confusion (in most cases) setting CFLAGS_file.o
> does the expected thing in case of .o, .s, .lst and .i targets.
> So the general and easy to remember rule is to set CFLAGS_file.o
> and then kbuild takes care of the rest.

Yeah, that makes sense in the "normal" case.  In this case, the
generated .s file is actually used in the build process though,
so it was a bit confusing.

> I assume you already did so and it solved your problem - no?

Sure, it was just a question "is this the right thing or not".
In any case, the problematic thing will be removed completely
here :-)

Thanks for the explanation, it all makes sense now,


Segher



Re: [sata_nv] timeout waiting for ADMA IDLE, stat=0x440

2007-08-25 Thread Maarten Maathuis
A broken cable seems like a realistic possibility, so I swapped it for
another cable. I will see whether that solves the problem.

Maarten.

On 8/25/07, Robert Hancock <[EMAIL PROTECTED]> wrote:
> Maarten Maathuis wrote:
> > I have had this problem several times, always with the same hard drive, a
> > Samsung SP2004C. My Samsung HD161HJ and HD321KJ don't seem to suffer
> > from this problem. I do not know exactly when it first happened,
> > but it has happened twice on a 2.6.22 kernel.
> >
> > Is there anything that can be done about this (besides disabling adma
> > for all drives), or any information i can provide to help?
> >
> > Please CC me, as i am not a member of this mailinglist.
> >
> > Sincerely,
> >
> > Maarten Maathuis.
> >
> > dmesg snippet:
> >
> > ata4: timeout waiting for ADMA IDLE, stat=0x440
> > ata4.00: qc timeout (cmd 0x2f)
> > ata4: failed to read log page 10h (errno=-5)
> > ata4.00: exception Emask 0x1 SAct 0x1 SErr 0x38 action 0x2 frozen
> > ata4.00: (CPB resp_flags 0x11: CMD error)
> > ata4.00: cmd 60/80:00:89:b0:30/00:00:02:00:00/40 tag 0 cdb 0x0 data 65536 in
> >  res 51/84:00:02:00:00/84:00:02:00:00/40 Emask 0x10 (ATA bus error)
>
> Sounds like the drive has gotten into a really hosed state after this
> point. The SError register is showing a CRC error, disparity error, and
> 10b to 8b decode error, which indicates that there are some major SATA
> communication problems happening.
>
> It could be a hardware problem (bad drive, bad SATA cable, insufficient
> power, etc.) or maybe this is another drive with broken NCQ..
>
> --
> Robert Hancock  Saskatoon, SK, Canada
> To email, remove "nospam" from [EMAIL PROTECTED]
> Home Page: http://www.roberthancock.com/
>
>


Re: [POWERPC] Fix for assembler -g

2007-08-25 Thread Sam Ravnborg
On Tue, Aug 21, 2007 at 12:29:08AM +0200, Segher Boessenkool wrote:
> >>But there is no lparmap.o!  lparmap.s is the generated file.
> >
> >Yeah, tell that to scripts/Makefile.lib:
> >
> > _c_flags   = $(CFLAGS) $(EXTRA_CFLAGS) $(CFLAGS_$(basetarget).o)
> >
> >What would do what a person expects is $(CFLAGS_$(@F)), I think.
> 
> Looks good to me.  Sam?  We wanted to set CFLAGS_lparmap.s .

To avoid confusion (in most cases) setting CFLAGS_file.o
does the expected thing in case of .o, .s, .lst and .i targets.
So the general and easy to remember rule is to set CFLAGS_file.o
and then kbuild takes care of the rest.

I assume you already did so and it solved your problem - no?

Sam


Re: [PATCH 1/2] sysctl: Properly register the irda binary sysctl numbers.

2007-08-25 Thread Eric W. Biederman
[EMAIL PROTECTED] writes:

> On Thu, 23 Aug 2007 21:53:53 MDT, Eric W. Biederman said:
>> 
>> Grumble.  These numbers should have been in sysctl.h from the
>> beginning if we ever expected anyone to use them.  Oh well put
>> them there now so we can find them and make maintenance easier.
>> 
>> Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
>
> Applied both patches, and now all I get from irda at boot time now is this:
>
> [  292.062000] irda_init()
> [  292.063000] NET: Registered protocol family 23
> [  292.069000] IrCOMM protocol (Dag Brattli)
> [  292.221000] PPP generic driver version 2.4.2
>
> in other words, business as usual. Thanks.
>
> Feel free to stick this on both patches:
>
> Tested-By: Valdis Kletnieks <[EMAIL PROTECTED]>

Thanks.

It's good to have confirmation that my sysctl_check routine
didn't find something else wrong.

Eric


Re: possible BUG while doing gpg --gen-key

2007-08-25 Thread Udo van den Heuvel
Udo van den Heuvel wrote:
> Now it works:

Stranger even:

With audio-entropyd, rngd active and the netdev-random patch working I
cannot reproduce the crash. Even after a fresh boot.


Re: Problem with CFS V20 and Suspend2/tuxonice

2007-08-25 Thread Fabio Comolli
Reproduced on an Intel Centrino-based laptop with Gentoo kamikaze7
sources (http://forums.gentoo.org/viewtopic-t-577970.html)



On 8/25/07, David Rodriguez <[EMAIL PROTECTED]> wrote:
> I'm using 2.6.22.5 with CFS v20.3 and suspend2 2.2.10.2.
> With that combination, suspend no longer works (it worked with
> CFS v19).
> Suspend stops at "Suspending tasks".
> Looking at the CFS patch, I changed migration_thread to add back
> the try_to_freeze() call removed by the last patch. With that, suspend
> completes, but resume does not work. Of course I don't know why that
> call was removed, and restoring it is not a solution, but I wanted to
> report it.
>
> All of this is on a Core Duo machine.


Re: [1/4] 2.6.23-rc3: known regressions v3

2007-08-25 Thread Thomas Meyer
Ivan N. Zlatev wrote:
> On 8/24/07, Michal Piotrowski <[EMAIL PROTECTED]> wrote:
>   
>> ALSA
>>
>> Subject : Master volume control broken
>> References  : http://lkml.org/lkml/2007/8/18/46
>> Last known good : ?
>> Submitter   : Thomas Meyer <[EMAIL PROTECTED]>
>> Caused-By   : Ivan N. Zlatev <[EMAIL PROTECTED]>
>>   commit 5d5d3bc3eddf2ad97b2cb090b92580e7fed6cee1
>> Handled-By  : ?
>> Status  : unknown
>> 
> In the meanwhile if you want to use your "fancy multimedia control
> keys" as a workaround you could try to bind your multimedia keys to
> execute:
> volume up: amixer -q set PCM 10%+
> volume down: amixer -q set PCM 10%-
>
>   
I have these key bindings in the .Xmodmap file

keycode 174 = XF86AudioLowerVolume
keycode 176 = XF86AudioRaiseVolume
keycode 160 = XF86AudioMute
keycode 204 = XF86Eject

Some KDE component will take care of the "XF86AudioLowerVolume" and
"XF86AudioRaiseVolume" events and change the value of the "master"
volume control.

Regards,
thomas



Re: [PATCH] x86 Boot NUMA kernels on non-NUMA hardware with DISCONTIG memory model

2007-08-25 Thread Andy Whitcroft
Andi Kleen wrote:
> On Fri, Aug 24, 2007 at 06:44:38PM +0100, Mel Gorman wrote:
>> On (24/08/07 19:38), Andi Kleen didst pronounce:
>>>> Other than the fact that the memmap must be PMD aligned to use hugepage
>>>> entries for the memmap.
>>> Why is that so?  mem_map should be just part of lowmem anyways.
>>>
>> Not in this case. memmap is allocated node local and mapped in the virtual
>> memory area normally occupied by the end of low memory. The objective was
>> to have memmap for the struct pages node-local. Hence, portions of
>> memmap are really in highmem.
> 
> Ok, but that still doesn't mean it has to be PMD aligned, 
> as long as illegal virtual aliases are prevent in the overlap
> (which is not very hard) 
> 
>>>> It could be mapped with small pages in corner cases
>>>> but the complexity worth it?
>>> You don't need to map it with small pages in the normal case,
>>> the only requirement is that c_p_a() is aware of it so it can
>>> split it if needed.
>>>
>>>> I can't see this type of lifting being done any time soon. As SPARSEMEM works
>>>> and there is hope with the vmemmap work that DISCONTIG will finally go away,
>>>> it may not be the best investment of time.
>>> It's a trivial change, probably less code than your original patch.
>>>
>> I'll have to take your word for it because I haven't looked closely
>> enough. I'll try and find time to look at it but the earliest I'll get around
>> to it is post kernel-summit. In the meantime, SPARSEMEM works.
> 
> Ok, so we disable DISCONTIG i386 NUMA because there's nobody willing
> to maintain it?
>
> I'll take your word SPARSEMEM works, although I was told DISCONTIG NUMA
> works too and then my testing told a quite different story.

That sounds like overkill to me.  The unfixed code works for all actual
NUMA systems I am aware of, else we would have had reports of this
problem before in the years that this code has been in the kernel.  The
fix Mel sent up fixes the code so that it works on systems with
unaligned node ends (which is what triggers the issue).  It does mean
that a little memory is wasted when this kernel is used on non-NUMA
systems with unaligned node ends (only), but it works as designed at
that point.  To be honest it looks very much as if only very small
memory systems are going to trip this, and we have traditionally used
non-NUMA kernels on non-NUMA systems so there is almost zero exposure in
our install base.

Does this sudden interest in this combination indicate a distro-driven
change to using NUMA kernels on non-NUMA systems?

Having been involved in the development of the code originally, I think
Mel's fix is a good compromise to fix the immediate problem.  Clearly
there are bigger problems in this code that need clearing up if we are
to use this code as it is on small memory non-NUMA systems.  For one, the
change merged to fix the "memmap overlapping initrd allocation" severely
wastes memory by pushing the memmap into ZONE_NORMAL even when there is
spare Kernel Virtual Address space available, and also loses the memory
under it where it used to shift to HIGHMEM.

I think that most of this can become moot if we simply pull node-0 out
of this remap scheme, as node-0's memory is already local and the
problem only occurs on node-0.  I have a todo item to look over this,
but as Mel has indicated it's probably not going to be immediate.

I think it makes sense to take Mel's fix as the smallest repair and
we'll spend some time sorting it out cleanly soon.

-apw


Re: [-mm patch] enforce noreplace-smp in alternative_instructions()

2007-08-25 Thread Frederik Deweerdt
On Sat, Aug 25, 2007 at 10:07:29PM +1000, Rusty Russell wrote:
> On Fri, 2007-08-24 at 10:22 +0200, Frederik Deweerdt wrote:
> > [Added Gerd Hoffman and Rusty Russel to cc]
> > On Thu, Aug 23, 2007 at 11:46:52PM -0700, Jeremy Fitzhardinge wrote:
> > > Frederik Deweerdt wrote:
> > > > That means that even when you specify noreplace_smp, some replacing
> > > > takes place anyway. One of the consequences, besides noreplace_smp not
> > > > working as expected, is that lguest crashes when you feed it an SMP kernel
> > > > (I suspect that you can not replace alternatives for smp _and_ paravirt).
> > > >   
> > > 
> > > No, that should be fine.  Why does lguest crash?
> > It dies with:
> > [0.131000] SMP alternatives: switching to UP code
> > lguest: bad stack page 0xc057a000
Hello Rusty,
> 
> How odd!  This means that the guest set the kernel to a stack which it
> hadn't mapped writable (or perhaps not mapped at all).  I always run SMP

I had time to investigate this a little further. It appears that
0xc057a000 is in fact the beginning of the __smp_locks section.

The function call responsible for the crash is in alternative_instructions():
free_init_pages("SMP alternatives",
(unsigned long)__smp_locks,
(unsigned long)__smp_locks_end);

I.e., if I comment this out, I can boot lguest without passing
noreplace_smp.

BTW, to make things clear: the patch I sent does _not_ fix the
lguest/alternatives problem. It just makes noreplace_smp functional
again and hence allows working around the lguest/alternatives bug.

> kernels, and that seems a very strange side effect of a patching
> problem...
> 
> Nonetheless, I did have a previous problem with a bug in the patching
> code which didn't show up native and did show up under lguest.
> 
> Can you send your config? 
Here it is:
http://fdeweerdt.free.fr/lguest_smp/dot_config

> Do you need noreplace-smp even on 2.6.23-rc3,
> or only 2.6.23-rc3-mm1?
I'll try ASAP.

Thanks,
Frederik


Re: [-mm patch] enforce noreplace-smp in alternative_instructions()

2007-08-25 Thread Rusty Russell
On Fri, 2007-08-24 at 10:22 +0200, Frederik Deweerdt wrote:
> [Added Gerd Hoffman and Rusty Russel to cc]
> On Thu, Aug 23, 2007 at 11:46:52PM -0700, Jeremy Fitzhardinge wrote:
> > Frederik Deweerdt wrote:
> > > That means that even when you specify noreplace_smp, some replacing
> > > takes place anyway. One of the consequences, besides noreplace_smp not
> > > working as expected, is that lguest crashes when you feed it an SMP kernel
> > > (I suspect that you can not replace alternatives for smp _and_ paravirt).
> > >   
> > 
> > No, that should be fine.  Why does lguest crash?
> It dies with:
> [0.131000] SMP alternatives: switching to UP code
> lguest: bad stack page 0xc057a000

How odd!  This means that the guest set the kernel to a stack which it
hadn't mapped writable (or perhaps not mapped at all).  I always run SMP
kernels, and that seems a very strange side effect of a patching
problem...

Nonetheless, I did have a previous problem with a bug in the patching
code which didn't show up native and did show up under lguest.

Can you send your config?  Do you need noreplace-smp even on 2.6.23-rc3,
or only 2.6.23-rc3-mm1?

Thanks,
Rusty.




Re: USB Key light on/off state depending on mount

2007-08-25 Thread Éric Piel

25/08/07 12:49, James Bruce wrote/a écrit:

> Robert Hancock wrote:
>> Casey Dahlin wrote:
>>> Most USB keys nowadays have a small LED somewhere inside of them that
>>> lights up when they are plugged in. On a windows box, the key is lit
>>> up whenever it is mounted, and as soon as it is unmounted it turns
>>> off, giving a handy physical indicator that the key is safe to
>>> remove. On linux, the light is simply on whenever the key is plugged in.
>>>
>>> Should linux toggle the light depending on mount state? Is it as
>>> trivial as it seems or does this reflect some larger issue?
>>
>> I think that Windows turns off power to the port when you do the
>> "safely remove hardware" on it, or something like that. Mount/unmount
>> doesn't really indicate whether the device is in use in Linux, though,
>> since it can still be potentially accessed even when the device isn't
>> mounted.
>
> If there is a way to toggle the power state from userspace, then a
> desktop environment or userland tool can emulate the Windows behavior if
> that is desired.  A lot of devices can charge via USB now, and this is
> actually more convenient on Linux than on Windows (in effect Windows
> requires drivers in order to charge something).  However, having direct
> control over this is useful.

Yes, maybe some userspace such as HAL could turn off the usb devices at 
the same time it's unmounted. Actually that would be rather intuitive 
way to tell the user the umount is finished. There doesn't seem to be 
any loss of funcitonality, once it's turned off you can still re-access 
the device, and it's automatically turned on again (at least on my PC).


For the record, here is how one can switch off a USB device (as root).
First go to the directory of the device:
# cd /sys/bus/usb/devices/usb*/[0-9]-[0-9]

Then switch off the interface, followed by the device itself:
# echo -n 2 > *:1.0/power/state
# echo -n 2 > power/state

I use this to turn off my optical mouse when watching movies, but it
works just as well for turning off USB storage devices.

It can also be turned on with
# echo -n 0 > power/state
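
As a rough sketch only, the commands above could be wrapped in two small
shell functions. The names usb_off and usb_on are made up for illustration,
and the sketch assumes the kernel exposes the power/state attribute used
above (other kernel versions may not provide it). The device directory is
passed as an argument instead of relying on cd:

```shell
#!/bin/sh
# Hypothetical helpers wrapping the sysfs writes shown above.
# Assumption: DEVICE_DIR is a USB device directory in sysfs that has a
# power/state attribute; run as root on a real system.

# usb_off DEVICE_DIR: write 2 to the interface(s), then to the device.
usb_off() {
    dev=$1
    if [ ! -e "$dev/power/state" ]; then
        echo "no power/state attribute in $dev" >&2
        return 1
    fi
    # Same glob as the manual command (*:1.0); skip it if nothing matches,
    # because an unmatched glob is left as a literal string.
    for f in "$dev"/*:1.0/power/state; do
        [ -e "$f" ] || continue
        printf 2 > "$f" || return 1
    done
    printf 2 > "$dev/power/state"
}

# usb_on DEVICE_DIR: write 0 to the device, as in the manual command.
usb_on() {
    dev=$1
    if [ ! -e "$dev/power/state" ]; then
        echo "no power/state attribute in $dev" >&2
        return 1
    fi
    printf 0 > "$dev/power/state"
}
```

After sourcing the file, usage would look like "usb_off DEVICE_DIR" and
"usb_on DEVICE_DIR", where DEVICE_DIR is the directory you would otherwise
cd into. A tool such as HAL could in principle call something similar right
after a successful umount.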

See you,
Eric


Re: [PATCH] Fix preemptible lazy mode bug

2007-08-25 Thread Rusty Russell
On Thu, 2007-08-23 at 23:59 -0700, Zachary Amsden wrote:
> Jeremy Fitzhardinge wrote:
> > Hm.  Doing any kind of lazy-state operation with preemption enabled is
> > fundamentally meaningless.  How does it get into a preemptable state
> >   
> 
> Agree 100%.  It is the lazy mode flush that might happen when preempt is 
> enabled, but lazy mode is disabled.  In that case, the code relies on 
> per-cpu variables, which is a bad thing to do in preemptible code.  This 
> can happen in the current code path.

Frankly, we should hoist the per-cpu state into generic paravirt code,
get rid of the FLUSH "state" and only call the lazy_mode hooks when
actually entering or exiting a lazy mode.

The only reason lguest doesn't use a per-cpu var is that guests are
currently UP only.  If that were fixed, we'd have identical VMI, Xen and
lguest lazy state handling.

Cheers,
Rusty.



