Re: [GIT PATCH] driver core fixes against 2.6.25-rc2 git

2008-02-21 Thread Sam Ravnborg
Hi Greg.

On Thu, Feb 21, 2008 at 03:46:49PM -0800, Greg KH wrote:
> Here are a few driver core fixes against your current git tree that fix some
> more problems that have cropped up:
>   - shutdown problem due to logic problem with cpufreq usage of
> kobjects.
>   - build fix for powerpc due to previous kobject changes.
>   - runtime errors when CONFIG_SYSFS=n
>   - UIO code now works properly from my previous messups
>   - proper encoding of the ja_JP stable_kernel_rules.txt file
>   - updates to the stable_kernel_rules.txt file
>   - mark ide=reverse as obsolete in preparation of 2.6.26 (Bart
> wanted this to go in through my tree as I have the .26 patches
> pending.)
>   - other minor fixes.


Do we have any outstanding issues with section mismatch warnings in PCI Core?
I am as always optimistic and hope to get down to zero warnings soon.

Sam
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] ResCounter: Use read_uint in memory controller

2008-02-21 Thread Pavel Emelyanov
[EMAIL PROTECTED] wrote:
> Update the memory controller to use read_uint for its
> limit/usage/failcnt control files, calling the new
> res_counter_read_uint() function.
> 
> Signed-off-by: Paul Menage <[EMAIL PROTECTED]>

Acked-by: Pavel Emelyanov <[EMAIL PROTECTED]>



Re: [rtc-linux] state of GEN_RTC vs rtc subsystem

2008-02-21 Thread J.A. Magallón
On Wed, 20 Feb 2008 18:03:25 +0100, Alessandro Zummo <[EMAIL PROTECTED]> wrote:

> On Wed, 20 Feb 2008 10:11:23 -0600
> Kumar Gala <[EMAIL PROTECTED]> wrote:
> 
> > 
> > Is the functionality provided by drivers/char/gen_rtc.c completely  
> > handled by the rtc subsystem in drivers/rtc?
> > 
> > I ask for two reasons:
> > 1. should we make it mutually exclusive in Kconfig
> > 2. I've enabled both and get (well, my defconfig did):
> 
>  They shouldn't be enabled at once. I think a patch 
>  for Kconfig has been recently submitted to give a warning
>  in such a case.
> 
>  rtc-cmos should be able to handle the vast majority of x86
>  rtcs out there. 
> 

In fact, you have 3 rtc implementations available.
Please, can you take a look at this question also:

http://marc.info/?l=linux-kernel&m=120355254713965&w=2

>  The only real open issue is related to the ntp synchronization
>  mode and will be solved only when we can get rid of it :)
> 


-- 
J.A. Magallon  \   Software is like sex:
 \ It's better when it's free
Mandriva Linux release 2008.1 (Cooker) for i586
Linux 2.6.23-jam05 (gcc 4.2.2 20071128 (4.2.2-2mdv2008.1)) SMP PREEMPT


Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc

2008-02-21 Thread Jens Axboe
On Thu, Feb 21 2008, Andrew Morton wrote:
> On Tue, 19 Feb 2008 09:36:34 +0100 Jens Axboe <[EMAIL PROTECTED]> wrote:
> 
> > But I think the radix 'scan over entire tree' is a bit fragile.
> 
> eek, it had better not be.  Was this an error in the caller?  Hope so.

The cfq use of it, not the radix tree code! It juggled the keys and
wants to make sure that we see all users, modulo raced added ones (ok if
we see them, doesn't matter if we don't).

> > This
> > patch adds a parallel hlist for ease of properly browsing the members,
> 
> Even though io_contexts are fairly uncommon, adding more stuff to a data
> structure was a pretty sad alternative to fixing a bug in
> radix_tree_gang_lookup(), or to fixing a bug in a caller of it.
> 
> IOW: what exactly went wrong here??

I could not convince myself that the current code would always do the
right thing. We should not have been seeing ->key == NULL entries in
there, it implied a double exit of that process. So I decided to fix it
by making the code a lot more readable (the patch in question deleted a
lot more than it added), at the cost of that hlist head + node.

-- 
Jens Axboe



Re: [Rt2400-devel] 2.6.25-rc2 regression in rt61pci wireless driver

2008-02-21 Thread Chris Clayton
On Thursday 21 February 2008, Chris Vine wrote:
> On Thu, 2008-02-21 at 23:04 +, Chris Clayton wrote:
> > On Thursday 21 February 2008, Ivo van Doorn wrote:
> > > On Thursday 21 February 2008, Chris Vine wrote:
> [snip]
> > > > This probably explains the problem another user reported with rt61.
> > > 
> > > Perhaps something similar like:
> > > http://bugzilla.kernel.org/show_bug.cgi?id=10058
> > > in there a reference is made to the following patch:
> > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.25-rc2/2.6.25-rc2-mm1/broken-out/revert-send-a-single-notification-on-device-state-changes.patch
> > > 
> > > Does applying that help?
> > 
> > I'm afraid not, Ivo. The test I ran last night was against 2.6.25-rc2-git4,
> > and that already has this patch applied. Furthermore, I have another card
> > that uses the rtl8180 driver and that works reliably. I therefore suspect
> > that my problem lies within the rt61pci driver or the rt2x00 infrastructure.
> 
> Does the same happen with 2.0.14 under kernel 2.6.24?

Unfortunately, a 2.6.24.2 tree with the drivers/net/wireless/rt2x00 directory 
replaced with that from 2.6.25-rc2-git4 doesn't build:

In file included from drivers/net/wireless/rt2x00/rt2x00dev.c:29:
drivers/net/wireless/rt2x00/rt2x00.h:942: warning: `struct ieee80211_bss_conf' declared inside parameter list
drivers/net/wireless/rt2x00/rt2x00.h:942: warning: its scope is only this definition or declaration, which is probably not what you want

[...]

drivers/net/wireless/rt2x00/rt2x00dev.c: In function `rt2x00lib_configuration_scheduled':
drivers/net/wireless/rt2x00/rt2x00dev.c:484: error: storage size of `bss_conf' isn't known
drivers/net/wireless/rt2x00/rt2x00dev.c:494: error: `BSS_CHANGED_ERP_PREAMBLE' undeclared (first use in this function)
drivers/net/wireless/rt2x00/rt2x00dev.c:494: error: (Each undeclared identifier is reported only once
drivers/net/wireless/rt2x00/rt2x00dev.c:494: error: for each function it appears in.)
drivers/net/wireless/rt2x00/rt2x00dev.c:484: warning: unused variable `bss_conf'
drivers/net/wireless/rt2x00/rt2x00dev.c: In function `rt2x00lib_beacondone_scheduled':
drivers/net/wireless/rt2x00/rt2x00dev.c:511: warning: passing arg 2 of `ieee80211_beacon_get' makes integer from pointer without a cast

> 
> Chris
> 
> 
> 
> 

-- 
Beauty is in the eye of the beerholder.


Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc

2008-02-21 Thread Andrew Morton
On Tue, 19 Feb 2008 09:36:34 +0100 Jens Axboe <[EMAIL PROTECTED]> wrote:

> But I think the radix 'scan over entire tree' is a bit fragile.

eek, it had better not be.  Was this an error in the caller?  Hope so.

> This
> patch adds a parallel hlist for ease of properly browsing the members,

Even though io_contexts are fairly uncommon, adding more stuff to a data
structure was a pretty sad alternative to fixing a bug in
radix_tree_gang_lookup(), or to fixing a bug in a caller of it.

IOW: what exactly went wrong here??


Re: regression: CD burning (k3b) went broke

2008-02-21 Thread Jens Axboe
On Thu, Feb 21 2008, Mike Galbraith wrote:
> Greetings,
> 
> K3b recently (9a4c854..5d9c4a7 pull) began terminally griping about
> buffer underrun upon every attempt to burn a CD.  I can't fully bisect
> the problem  because intervening kernels hang soft during boot.  Using
> git bisect visualize, and converting to postable text:
> 
> bisect/bad   block: add request->raw_data_len 
> (6b00769fe1502b4ad97bb327ef7ac971b208bfb5)
> bisect   block: update bio according to DMA alignment padding 
> (40b01b9bbdf51ae543a04744283bf2d56c4a6afa)
> libata: update ATAPI overflow draining
> bisect/good-e164094964e6e20fe7fce418e06a9dce952bb7a4

Tejun?


> 
> Serial console log of hung kernel 40b01b9bbdf51ae543a04744283bf2d56c4a6afa 
> below
> 
> [0.00] Linux version 2.6.25-rc2-smp ([EMAIL PROTECTED]) (gcc version 
> 4.2.1 (SUSE Linux)) #14 SMP PREEMPT Thu Feb 21 08:49:51 CET 2008
> [0.00] BIOS-provided physical RAM map:
> [0.00]  BIOS-e820:  - 0009fc00 (usable)
> [0.00]  BIOS-e820: 0009fc00 - 000a (reserved)
> [0.00]  BIOS-e820: 000f - 0010 (reserved)
> [0.00]  BIOS-e820: 0010 - 3fff (usable)
> [0.00]  BIOS-e820: 3fff - 3fff3000 (ACPI NVS)
> [0.00]  BIOS-e820: 3fff3000 - 4000 (ACPI data)
> [0.00]  BIOS-e820: fec0 - 0001 (reserved)
> [0.00] 0MB HIGHMEM available.
> [0.00] 1023MB LOWMEM available.
> [0.00] Scan SMP from b000 for 1024 bytes.
> [0.00] Scan SMP from b009fc00 for 1024 bytes.
> [0.00] Scan SMP from b00f for 65536 bytes.
> [0.00] found SMP MP-table at [b00f5320] 000f5320
> [0.00] Zone PFN ranges:
> [0.00]   DMA 0 -> 4096
> [0.00]   Normal   4096 ->   262128
> [0.00]   HighMem262128 ->   262128
> [0.00] Movable zone start PFN for each node
> [0.00] early_node_map[1] active PFN ranges
> [0.00] 0:0 ->   262128
> [0.00] DMI 2.3 present.
> [0.00] ACPI: RSDP 000F6CC0, 0014 (r0 IntelR)
> [0.00] ACPI: RSDT 3FFF3000, 002C (r1 IntelR AWRDACPI 42302E31 AWRD
> 0)
> [0.00] ACPI: FACP 3FFF3040, 0074 (r1 IntelR AWRDACPI 42302E31 AWRD
> 0)
> [0.00] ACPI: DSDT 3FFF30C0, 4139 (r1 INTELR AWRDACPI 1000 MSFT  
> 10E)
> [0.00] ACPI: FACS 3FFF, 0040
> [0.00] ACPI: APIC 3FFF7200, 0068 (r1 IntelR AWRDACPI 42302E31 AWRD
> 0)
> [0.00] ACPI: PM-Timer IO Port: 0x408
> [0.00] ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
> [0.00] Processor #0 15:2 APIC version 20
> [0.00] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
> [0.00] Processor #1 15:2 APIC version 20
> [0.00] WARNING: maxcpus limit of 1 reached. Processor ignored.
> [0.00] ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1])
> [0.00] ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
> [0.00] ACPI: IOAPIC (id[0x02] address[0xfec0] gsi_base[0])
> [0.00] IOAPIC[0]: apic_id 2, version 32, address 0xfec0, GSI 0-23
> [0.00] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
> [0.00] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
> [0.00] Enabling APIC mode:  Flat.  Using 1 I/O APICs
> [0.00] Using ACPI (MADT) for SMP configuration information
> [0.00] Allocating PCI resources starting at 5000 (gap: 
> 4000:bec0)
> [0.00] PM: Registered nosave memory: 0009f000 - 
> 000a
> [0.00] PM: Registered nosave memory: 000a - 
> 000f
> [0.00] PM: Registered nosave memory: 000f - 
> 0010
> [0.00] Built 1 zonelists in Zone order, mobility grouping on.  Total 
> pages: 260081
> [0.00] Kernel command line: root=/dev/sdb3 rootflags=data=writeback 
> vga=0x314 resume=/dev/sdb2 console=ttyS0,115200n8 console=tty splash=silent 
> PROFILE=default 1 maxcpus=1
> [0.00] Enabling fast FPU save and restore... done.
> [0.00] Enabling unmasked SIMD FPU exception support... done.
> [0.00] Initializing CPU#0
> [0.00] Preemptible RCU implementation.
> [0.00] CPU 0 irqstacks, hard=b0427000 soft=b0425000
> [0.00] PID hash table entries: 4096 (order: 12, 16384 bytes)
> [0.00] Detected 2992.603 MHz processor.
> [0.000999] Console: colour dummy device 80x25
> [0.000999] console [tty0] enabled
> [0.000999] console [ttyS0] enabled
> [0.000999] Dentry cache hash table entries: 131072 (order: 7, 524288 
> bytes)
> [0.000999] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
> [0.000999] Memory: 1028968k/1048512k available (1998k kernel code, 18904k 
> reserved, 955k data, 236k init, 0k highmem)
> [ 

Re: linux-next: Tree for Feb 22

2008-02-21 Thread David Miller
From: Al Viro <[EMAIL PROTECTED]>
Date: Fri, 22 Feb 2008 07:25:54 +

> On Fri, Feb 22, 2008 at 06:21:16PM +1100, Stephen Rothwell wrote:
> > On Fri, 22 Feb 2008 17:04:21 +1100 Stephen Rothwell <[EMAIL PROTECTED]> 
> > wrote:
> > >
> > > Status of my local build tests is at
> > > http://kisskb.ellerman.id.au/kisskb/branch/9/.  The sparc builds have
> > > been mostly disabled while I obtain a working cross compiler.
> > 
> > I have reenabled the sparc and sparc64 builds but there is a bug in 32bit
> > sparc (in Linus' tree) at the moment.
> 
> Remove includes of linux/rcupdate.h and linux/mm.h from memcontrol.h;
> patch had been posted some time ago.

Yes, it's clogged up in Andrew's patch queue somewhere.


Re: linux-next: Tree for Feb 22

2008-02-21 Thread Al Viro
On Fri, Feb 22, 2008 at 06:21:16PM +1100, Stephen Rothwell wrote:
> On Fri, 22 Feb 2008 17:04:21 +1100 Stephen Rothwell <[EMAIL PROTECTED]> wrote:
> >
> > Status of my local build tests is at
> > http://kisskb.ellerman.id.au/kisskb/branch/9/.  The sparc builds have
> > been mostly disabled while I obtain a working cross compiler.
> 
> I have reenabled the sparc and sparc64 builds but there is a bug in 32bit
> sparc (in Linus' tree) at the moment.

Remove includes of linux/rcupdate.h and linux/mm.h from memcontrol.h;
patch had been posted some time ago.


Re: 2.6.25-rc1 xen pvops regression

2008-02-21 Thread Ian Campbell

On Thu, 2008-02-21 at 14:58 -0800, H. Peter Anvin wrote:
> 
> Which it is on real hardware, because although it's not *reserved*
> (type 2), it is certainly not made available as *normal memory* (type
> 1).  If Xen maps this as type 1 then I definitely see the problem.
> 
> We can exclude type 1 memory from DMI scan, certainly.

I'd been meaning to ask this. So the machines you have which don't
describe 0xf as reserved also don't describe it as RAM? (I guess
it's either a hole in the table or one of the other e820 types).

So it sounds like it would be acceptable to simply invert the test in my
original patch as below? (actually reverting to my original-original
patch which I never sent out because checking for reserved sounded more
correct at the time, which was dumb of me because I was well aware of
the other possible types, I must have been having one of those days).

Ian.

From 13bdb4ee9d80b83a81c3dbefa52464e511d1b4df Mon Sep 17 00:00:00 2001
From: Ian Campbell <[EMAIL PROTECTED]>
Date: Fri, 22 Feb 2008 07:17:14 +
Subject: [PATCH] x86: Do not scan for DMI if the DMI region is marked as RAM by e820.

Under Xen the memory at 0xf is regular RAM and so can potentially contain a
page table and hence cannot be mapped. The e820 map given to guest reflects
this.

Signed-off-by: Ian Campbell <[EMAIL PROTECTED]>
---
 drivers/firmware/dmi_scan.c |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/drivers/firmware/dmi_scan.c b/drivers/firmware/dmi_scan.c
index 653265a..f8fde74 100644
--- a/drivers/firmware/dmi_scan.c
+++ b/drivers/firmware/dmi_scan.c
@@ -7,6 +7,7 @@
 #include 
 #include 
 #include 
+#include 
 
 static char dmi_empty_string[] = "";
 
@@ -371,6 +372,9 @@ void __init dmi_scan_machine(void)
}
}
else {
+   if (e820_all_mapped(0xF, 0xF+0x1, E820_RAM))
+   goto out;
+
/*
 * no iounmap() for that ioremap(); it would be a no-op, but
 * it's so early in setup that sucker gets confused into doing
-- 
1.5.4.2
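
As a rough illustration of the kind of check the patch relies on, here is a
minimal userspace sketch. It is not the kernel's e820_all_mapped(); the
struct, enum, function name and both sample maps are hypothetical and were
invented for this example, and the 0xF0000-0xFFFFF window is an assumption
based on the conventional legacy BIOS area rather than on values quoted in
this thread. The function asks whether every byte of a range is covered by
map entries of a single type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model of a firmware memory map; not kernel types. */
enum region_type { REGION_RAM = 1, REGION_RESERVED = 2 };

struct region {
	uint64_t addr;
	uint64_t size;
	enum region_type type;
};

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Assumed guest-style map: low memory is one flat block of RAM. */
const struct region ram_only_map[] = {
	{ 0x00000000, 0x40000000, REGION_RAM },
};

/* Assumed PC-style map: the BIOS window is reserved, not RAM. */
const struct region pc_like_map[] = {
	{ 0x00000000, 0x0009fc00, REGION_RAM },
	{ 0x0009fc00, 0x00000400, REGION_RESERVED },
	{ 0x000f0000, 0x00010000, REGION_RESERVED },
	{ 0x00100000, 0x3fef0000, REGION_RAM },
};

/*
 * Return 1 if every byte of [start, end) is covered by entries of the
 * given type, 0 otherwise.  Assumes the map is sorted by address and
 * contains no overlapping entries.
 */
int range_all_mapped(const struct region *map, size_t n,
		     uint64_t start, uint64_t end, enum region_type type)
{
	size_t i;

	assert(map != NULL && start < end);
	for (i = 0; i < n; i++) {
		const struct region *r = &map[i];

		if (r->type != type)
			continue;
		/* Skip entries that do not overlap the remaining range. */
		if (r->addr >= end || r->addr + r->size <= start)
			continue;
		/* Covered from the current start: advance past this entry. */
		if (r->addr <= start)
			start = r->addr + r->size;
		if (start >= end)
			return 1;
	}
	return 0;
}
```

With these sample maps, a DMI scan guarded by such a check would be skipped
for ram_only_map and would still run for pc_like_map.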



-- 
Ian Campbell

Stupidity, like virtue, is its own reward.



Re: Kernel oops with bluetooth usb dongle

2008-02-21 Thread Thomas Gleixner
Quel,

On Fri, 22 Feb 2008, Quel Qun wrote:
> $ addr2line -e vmlinux c012d51d
> /usr/src/linux-2.6.25-rc2-git5kk1/kernel/timer.c:770
> 
> Crap, that is on the next list_for_each_entry in timer.c :(
> 
> I tried to make a similar test loop as you did a few lines above:

Cool.
 
> I thought I got it on the next crash, but the system locked too
>  fast, and the only thing I saw was:
> 
> TTRACE timer f7b52858 fn f8e7c608 addr c012d776
> TTRACE fn l2cap_info_timeout
> TTRACE addr mod_timer
> BUG: unable to handle kernel paging request at 6b6b6b6b

That's what I wanted to see.
 
> I hope the tiny bit of trace can trigger some idea. At least l2cap
> has something to do with bluetooth. l2cap_info_timeout is line 360
> of net/bluetooth/l2cap.c, apparently only called from
> l2cap_conn_add, line 391: setup_timer(&conn->info_timer,
> l2cap_info_timeout, (unsigned long)conn);

Correct. And I don't see how it's guaranteed that the timer is deleted
before l2cap_conn_del() is called which kfree's the l2cap_conn
structure.

> After four hours and ten crashes today, it is the little I
> got. Kernel stuff is tough...

Yes, it is. The little information you got should be enough to solve
this. Thanks for your patience and help !

Does the patch below fix your problem ?

Thanks,

tglx

---
 net/bluetooth/l2cap.c |2 ++
 1 file changed, 2 insertions(+)

Index: linux-2.6/net/bluetooth/l2cap.c
===
--- linux-2.6.orig/net/bluetooth/l2cap.c
+++ linux-2.6/net/bluetooth/l2cap.c
@@ -417,6 +417,8 @@ static void l2cap_conn_del(struct hci_co
l2cap_sock_kill(sk);
}
 
+   del_timer(&conn->info_timer);
+
hcon->l2cap_data = NULL;
kfree(conn);
 }
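
The ordering problem discussed above can be modelled in userspace. The
sketch below is only an analogy: every toy_* name is hypothetical and none of
it is the kernel timer API. It shows an object with an embedded timer that is
linked into a global pending list, and a teardown path that unlinks the timer
before freeing the object. Without that unlink, the list would keep a pointer
into freed memory, which is consistent with the 6b6b6b6b address in the oops
(0x6b is the slab poison byte for freed objects).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins for kernel timers; not the kernel API. */
struct toy_timer {
	struct toy_timer *next;
	int pending;
};

struct toy_conn {
	struct toy_timer info_timer;	/* embedded in the connection object */
};

struct toy_timer *pending_timers;	/* singly linked list of armed timers */

void toy_add_timer(struct toy_timer *t)
{
	assert(t != NULL && !t->pending);
	t->next = pending_timers;
	t->pending = 1;
	pending_timers = t;
}

/* Unlink the timer if it is armed; return 1 if it was pending, else 0. */
int toy_del_timer(struct toy_timer *t)
{
	struct toy_timer **pp;

	for (pp = &pending_timers; *pp; pp = &(*pp)->next) {
		if (*pp == t) {
			*pp = t->next;
			t->next = NULL;
			t->pending = 0;
			return 1;
		}
	}
	return 0;
}

int toy_pending_count(void)
{
	const struct toy_timer *t;
	int n = 0;

	for (t = pending_timers; t; t = t->next)
		n++;
	return n;
}

struct toy_conn *toy_conn_add(void)
{
	struct toy_conn *conn = calloc(1, sizeof(*conn));

	if (conn)
		toy_add_timer(&conn->info_timer);
	return conn;
}

/* Teardown: the timer must leave the pending list before the memory goes. */
void toy_conn_del(struct toy_conn *conn)
{
	toy_del_timer(&conn->info_timer);
	free(conn);
}

/* Scenario helper: add a connection, delete it, report what is still armed. */
int toy_demo_pending_after_del(void)
{
	struct toy_conn *conn = toy_conn_add();

	if (!conn)
		return -1;
	toy_conn_del(conn);
	return toy_pending_count();
}
```

The model is single-threaded, so it says nothing about a handler that is
already running on another CPU; in the kernel that case is what
del_timer_sync() is for.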


Re: linux-next: Tree for Feb 22

2008-02-21 Thread Stephen Rothwell
On Fri, 22 Feb 2008 17:04:21 +1100 Stephen Rothwell <[EMAIL PROTECTED]> wrote:
>
> Status of my local build tests is at
> http://kisskb.ellerman.id.au/kisskb/branch/9/.  The sparc builds have
> been mostly disabled while I obtain a working cross compiler.

I have reenabled the sparc and sparc64 builds but there is a bug in 32bit
sparc (in Linus' tree) at the moment.

-- 
Cheers,
Stephen Rothwell[EMAIL PROTECTED]
http://www.canb.auug.org.au/~sfr/




Re: [PATCH] Document huge memory/cache overhead of memory controller in Kconfig

2008-02-21 Thread Balbir Singh
KAMEZAWA Hiroyuki wrote:
> On Thu, 21 Feb 2008 16:33:33 +0530
> Balbir Singh <[EMAIL PROTECTED]> wrote:
> 
>>> Another issue is that it will slightly increase TLB/cache
>>> cost of the memory controller, but I think that would be a fair
>>> trade off for it being zero cost when disabled but compiled
>>> in.
>>>
>>> Doing it with vmalloc should be easy enough. I can do such
>>> a patch later unless someone beats me to it...
>>>
>> I'll get to it, but I have too many things on my plate at the moment. KAMEZAWA
>> also wanted to look at it. I looked through some vmalloc() internals yesterday
>> and I am worried about allocating all the memory on a single node in a NUMA
>> system and changing VMALLOC_ on every architecture to provide more vmalloc
>> space. I might be missing something obvious.
>>
> 
> I'll post a series of patch to do that later (it's under debug now...)
> I'm glad if people (including you) look it and give me advices.
> 

Thank you so much for your help. I'll definitely look at the patches and
review/test them.

-- 
Warm Regards,
Balbir Singh
Linux Technology Center
IBM, ISTL


Re: linux-next: Tree for Feb 22

2008-02-21 Thread Stephen Rothwell
Hi Frank,

On Fri, 22 Feb 2008 07:43:58 +0100 Frank Seidel <[EMAIL PROTECTED]> wrote:
>
> at least the bz2 tar i just looked at from that URL has a problem
> with the prefix resulting to this toplevel extraction:
> 
> next-20080222arch
> next-20080222block
> ...
> next-20080222virt
> 
> So, i guess you just missed the trailing slash for the prefix
> parameter of git archive?
> (should be --prefix=next-20080222/ afaik)

Indeed.  Thanks. I am regenerating them now.

-- 
Cheers,
Stephen Rothwell[EMAIL PROTECTED]




Re: linux-next: Tree for Feb 22

2008-02-21 Thread Stephen Rothwell
Hi Frank,

On Fri, 22 Feb 2008 07:32:58 +0100 Frank Seidel <[EMAIL PROTECTED]> wrote:
>
> Looks great :-) Of course i also just put a ref to it on the
> wiki.

Thanks.

-- 
Cheers,
Stephen Rothwell[EMAIL PROTECTED]




Re: 2.6.25-rc2 regression - hang on suspend

2008-02-21 Thread Soeren Sonnenburg
On Fri, 2008-02-22 at 00:06 +0100, Rafael J. Wysocki wrote: 
> On Thursday, 21 of February 2008, Soeren Sonnenburg wrote:
> > On Thu, 2008-02-21 at 01:31 +0100, Rafael J. Wysocki wrote:
> > > On Wednesday, 20 of February 2008, Soeren Sonnenburg wrote:
> > > > On Wed, 2008-02-20 at 00:50 +0100, Rafael J. Wysocki wrote:
[...] 
> > Using echo none >/sys/power/pm_test and then
> > echo mem >/sys/power/state I see it hang on ata1 errors again. Waiting
> > about 10-30 seconds it progresses further and finally arrives at 
> > 
> > CPU0 attaching NULL sched-domain
> > CPU1 attaching NULL sched-domain
> > 
> > then hangs.
> 
> Please see if compiling the kernel with CONFIG_SMP unset makes suspend
> work.

*Argh*, this bug is not behaving nicely :( Whatever happened,
git-current now suspends correctly with and without CONFIG_SMP and in all
my CONFIG_PREEMPT_RCU=y and CONFIG_CLASSIC_RCU=y attempts. Also no sata
errors anymore.

However it is not reliably waking up (at least when all of the above
except CLASSIC_RCU is on). Sometimes the display remains black on the
console, but X still works and sometimes it hangs completely on resume.

Also when compiling these many kernels via make -j4 I noted that I could
hardly move the mouse / use the keyboard, but saw random jumps and
key-repetitions...

Soeren


[PATCH] Fix typos on lib/Kconfig

2008-02-21 Thread Satoru Takeuchi
Fix typos on lib/Kconfig.

Signed-off-by: Satoru Takeuchi <[EMAIL PROTECTED]>

---
Index: 2.6.25-rc2/lib/Kconfig
===
--- 2.6.25-rc2.orig/lib/Kconfig 2008-02-22 15:11:56.0 +0900
+++ 2.6.25-rc2/lib/Kconfig  2008-02-22 15:12:41.0 +0900
@@ -64,7 +64,7 @@ config AUDIT_GENERIC
default y
 
 #
-# compression support is select'ed if needed
+# compression support is selected if needed
 #
 config ZLIB_INFLATE
tristate
@@ -85,7 +85,7 @@ config GENERIC_ALLOCATOR
boolean
 
 #
-# reed solomon support is select'ed if needed
+# reed solomon support is selected if needed
 #
 config REED_SOLOMON
tristate
@@ -103,7 +103,7 @@ config REED_SOLOMON_DEC16
boolean
 
 #
-# Textsearch support is select'ed if needed
+# Textsearch support is selected if needed
 #
 config TEXTSEARCH
boolean
@@ -118,7 +118,7 @@ config TEXTSEARCH_FSM
tristate
 
 #
-# plist support is select#ed if needed
+# plist support is selected if needed
 #
 config PLIST
boolean
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Document huge memory/cache overhead of memory controller in Kconfig

2008-02-21 Thread KAMEZAWA Hiroyuki
On Thu, 21 Feb 2008 16:33:33 +0530
Balbir Singh <[EMAIL PROTECTED]> wrote:

> > Another issue is that it will slightly increase TLB/cache
> > cost of the memory controller, but I think that would be a fair
> > trade off for it being zero cost when disabled but compiled
> > in.
> > 
> > Doing it with vmalloc should be easy enough. I can do such
> > a patch later unless someone beats me to it...
> > 
> 
> I'll get to it, but I have too many things on my plate at the moment. KAMEZAWA
> also wanted to look at it. I looked through some vmalloc() internals yesterday
> and I am worried about allocating all the memory on a single node in a NUMA
> system and changing VMALLOC_ on every architecture to provide more vmalloc
> space. I might be missing something obvious.
> 

I'll post a series of patch to do that later (it's under debug now...)
I'm glad if people (including you) look it and give me advices.

Regards,
-Kame



Re: linux-next: Tree for Feb 22

2008-02-21 Thread Frank Seidel
Hi Stephen,

Stephen Rothwell wrote:
> I am now including tarballs in
> http://www.kernel.org/pub/linux/kernel/people/sfr/linux-next/.

at least the bz2 tar i just looked at from that URL has a problem
with the prefix resulting to this toplevel extraction:

next-20080222arch
next-20080222block
...
next-20080222virt

So, i guess you just missed the trailing slash for the prefix
parameter of git archive?
(should be --prefix=next-20080222/ afaik)
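
The symptom above is what plain string concatenation would produce. The
following is a small sketch of that behaviour, assuming the prefix is
prepended verbatim to each member path with no separator added; the function
names are hypothetical and this is not git's code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical model of prefixing an archive member name: the prefix is
 * prepended verbatim and no separator is inserted.
 * Returns 0 on success, -1 if the result does not fit in out.
 */
int prefixed_name(char *out, size_t outlen, const char *prefix,
		  const char *path)
{
	int n;

	assert(out != NULL && prefix != NULL && path != NULL);
	n = snprintf(out, outlen, "%s%s", prefix, path);
	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}

/* Helper for the examples: does prefix + path give the expected name? */
int prefix_yields(const char *prefix, const char *path, const char *expected)
{
	char buf[256];

	if (prefixed_name(buf, sizeof(buf), prefix, path) != 0)
		return 0;
	return strcmp(buf, expected) == 0;
}
```

Under that assumption, the prefix "next-20080222" turns "arch" into
"next-20080222arch", while "next-20080222/" gives "next-20080222/arch".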

Thanks,
Frank


Re: Merging of completely unreviewed drivers

2008-02-21 Thread Ray Lee
On Thu, Feb 21, 2008 at 7:13 PM, Linus Torvalds
<[EMAIL PROTECTED]> wrote:
>  So I'd be happier with warnings about deep indentation (but how do you
>  count it? Will people then try to fake things out by using 4-space indents
>  and then "deep" indentations will look like just a couple of tabs?)

I suspect that 90% of the cases that people really care about would
get caught successfully just by counting brace depth.

ie, by looking at { { {} {} {{{}{}}} } } I bet you can tell me which
section should have been pulled out into a separate routine.


Re: linux-next: Tree for Feb 22

2008-02-21 Thread Frank Seidel
Stephen Rothwell wrote:
> I am now including tarballs in
> http://www.kernel.org/pub/linux/kernel/people/sfr/linux-next/.

Looks great :-) Of course i also just put a ref to it on the
wiki.

Thanks,
Frank


Re: [PATCH 2/2] xen pvfb: Para-virtual framebuffer, keyboard and pointer driver

2008-02-21 Thread Markus Armbruster
Jeremy Fitzhardinge <[EMAIL PROTECTED]> writes:

> Markus Armbruster wrote:
>> This is a pair of Xen para-virtual frontend device drivers:
>> drivers/video/xen-fbfront.c provides a framebuffer, and
>> drivers/input/xen-kbdfront provides keyboard and mouse.
>>   
>
> Unless they're actually inter-dependent, could you post this as two
> separate patches?  I don't know anything about these parts of the
> kernel, so it would be nice to make it very obvious which changes are
> fb vs mouse/keyboard.

I could do that, but the intermediate step (one driver, not
the other) is somewhat problematic: the backend in dom0 needs both
drivers, and will refuse to complete device initialization unless
they're both present.

> (I guess input/* vs video/* should make it obvious, but it looks like
> input has a config dependency on fb, so I'll avoid making too many
> presumptions...)

Framebuffer: fbif.h xen-fbfront.c
Keyboard/mouse: kbdif.h xen-kbdfront.c

I added the config dependency because having one without the other
doesn't make sense, as explained above.

Still want it split into two separate patches?

> (Couple of comments below)
>
>J
>
>> The backends run in dom0 user space.
>>
>> Signed-off-by: Markus Armbruster <[EMAIL PROTECTED]>
>>
>> ---
[...]
>> diff --git a/drivers/input/xen-kbdfront.c b/drivers/input/xen-kbdfront.c
>> new file mode 100644
>> index 000..84f65cf
>> --- /dev/null
>> +++ b/drivers/input/xen-kbdfront.c
>> @@ -0,0 +1,337 @@
[...]
>> +static int __devinit xenkbd_probe(struct xenbus_device *dev,
>> +  const struct xenbus_device_id *id)
>> +{
[...]
>> +if (ret < 0)
>> +goto error;
>> +
>> +return 0;
>> +
>> + error_nomem:
>> +ret = -ENOMEM;
>> +xenbus_dev_fatal(dev, ret, "allocating device memory");
>> + error:
>> +xenkbd_remove(dev);
>>   
>
> This is happy if dev->info is only partially initialized?

It's designed that way.  dev->info is initialized so that
xenkbd_remove() does nothing.  Then stuff is stored into dev->info
only when it's sufficiently initialized for xenkbd_remove() to clean
it up.

>> +return ret;
>> +}
>> +
>> +static int xenkbd_resume(struct xenbus_device *dev)
>> +{
>> +struct xenkbd_info *info = dev->dev.driver_data;
>> +
>> +xenkbd_disconnect_backend(info);
>> +memset(info->page, 0, PAGE_SIZE);
>> +return xenkbd_connect_backend(dev, info);
>> +}
>> +
>> +static int xenkbd_remove(struct xenbus_device *dev)
>> +{
>> +struct xenkbd_info *info = dev->dev.driver_data;
>> +
>> +xenkbd_disconnect_backend(info);
>> +input_unregister_device(info->kbd);
>> +input_unregister_device(info->ptr);
>>   
>
> Does this free kbd and ptr?

Yes.  xenkbd_probe() initializes info->kbd and info->ptr to null, and
changes that to the device only after input_register_device()
succeeds.  If something goes wrong between input_allocate_device() and
input_register_device(), xenkbd_probe() frees the device with
input_free_device().  This is how input_register_device() wants to be
used according to its function comment:

/**
 * input_register_device - register device with input core
 * @dev: device to be registered
 *
 * This function registers device with input core. The device must be
 * allocated with input_allocate_device() and all it's capabilities
 * set up before registering.
 * If function fails the device must be freed with input_free_device().
 * Once device has been successfully registered it can be unregistered
 * with input_unregister_device(); input_free_device() should not be
 * called in this case.
 */

There's another bug here: must not call input_unregister_device() when
the device is still null.  Man, I remember checking cleanup multiple
times when this stuff went into Xen (i.e. quite some time ago), and I
still missed this one.  Going to check cleanup *again*.

>> +free_page((unsigned long)info->page);
>> +kfree(info);
>> +return 0;
>> +}
[...]

Thanks!
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [ofa-general] Re: Merging of completely unreviewed drivers

2008-02-21 Thread Junio C Hamano
Linus Torvalds <[EMAIL PROTECTED]> writes:

> So I'd be happier with warnings about deep indentation (but how do you 
> count it? Will people then try to fake things out by using 4-space indents 
> and then "deep" indentations will look like just a couple of tabs?) and 
> against complex expressions (ie "if ((a = xyz()) == NULL) .." should just 
> be split up into "a = xyz(); if (!a) ..", but there are sometimes reasons 
> for those things too!)

Deep indentation should be fairly easy, given that you
already have rules in place that say "Tabs are 8 characters".
So if you find a line that begins with more than, say, 4 SP, you
can flag that as already bogus (i.e. "does not indent with HT"),
more than 8 SP definitely so.
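
A rough sketch of that check, assuming one line of source is handed in as
a C string. The function names, the two limits and the return codes are
illustrative choices, not anything taken from an existing style checker.

```c
#include <stddef.h>

/* Count the SP characters at the very start of a line (stops at HT or text). */
static size_t leading_spaces(const char *line)
{
	size_t n = 0;

	while (line[n] == ' ')
		n++;
	return n;
}

/*
 * Returns 0 for an acceptable line, 1 for "suspicious" (more than
 * soft_limit leading SP) and 2 for "definitely bogus" (more than
 * hard_limit leading SP).
 */
static int check_indent(const char *line, size_t soft_limit, size_t hard_limit)
{
	size_t n = leading_spaces(line);

	if (n > hard_limit)
		return 2;
	if (n > soft_limit)
		return 1;
	return 0;
}
```

Calling check_indent(line, 4, 8) matches the thresholds suggested above;
a line indented with HT has zero leading SP and always passes.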

I'll leave the harder "complex expressions" to the sparse experts ;-).



Re: [LTP] [PATCH 1/8] Scaling msgmni to the amount of lowmem

2008-02-21 Thread Nadia Derbey

Subrata Modak wrote:
> Nadia Derbey wrote:
>> Matt Helsley wrote:
>>> On Tue, 2008-02-19 at 18:16 +0100, Nadia Derbey wrote:
>>>> +#define MAX_MSGQUEUES  16  /* MSGMNI as defined in linux/msg.h */
>>>> +
>>>
>>> It's not quite the maximum anymore, is it? More like the minimum
>>> maximum ;). A better name might better document what the test is
>>> actually trying to do.
>>>
>>> One question I have is whether the unpatched test is still valuable.
>>> Based on my limited knowledge of the test I suspect it's still a correct
>>> test of message queues. If so, perhaps renaming the old test (so it's
>>> not confused with a performance regression) and adding your patched
>>> version is best?
>>
>> So, here's the new patch based on Matt's points.
>>
>> Subrata, it has to be applied on top of the original ltp-full-20080131.
>> Please tell me if you'd prefer one based on the merged version you've
>> got (i.e. with my Tuesday patch applied).
>
> Nadia, I would prefer a patch on top of the already merged version (on
> top of the latest CVS snapshot as of today). Anyway, thanks for all this
> effort :-)
>
> --Subrata

In attachment, you'll find a patch to apply on top of the patches I sent
you on Tuesday.

Regards,
Nadia

Since msgmni now scales with the memory size, it may reach large values.
To avoid forking 2*msgmni processes and creating msgmni message queues, take
the minimum of the procfs value and MSGMNI (as found in linux/msg.h).
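
As an illustration only, the clamp described above might look like the
sketch below. NR_MSGQUEUES and min() mirror the definitions this patch
adds to ipcmsg.h; queues_to_test() is a hypothetical helper name, and the
way the test cases actually combine the two values is not shown in this
excerpt.

```c
#define NR_MSGQUEUES	16	/* same value the patch defines in ipcmsg.h */

/* Same shape as the min() macro the patch adds to ipcmsg.h. */
#define min(a, b)	(((a) < (b)) ? (a) : (b))

/*
 * Hypothetical helper: given the value read from /proc/sys/kernel/msgmni
 * (or -1 if it could not be read), return how many queues to exercise.
 */
static int queues_to_test(int procfs_msgmni)
{
	if (procfs_msgmni < 0)
		return -1;	/* propagate the read error */
	return min(procfs_msgmni, NR_MSGQUEUES);
}
```

On a large-memory machine where msgmni has scaled into the thousands this
still yields 16, so the test keeps its old process and queue count.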

Also integrated the following in libipc.a:
  . get_max_msgqueues()
  . get_used_msgqueues()

Signed-off-by: Nadia Derbey <[EMAIL PROTECTED]>

---
 testcases/kernel/syscalls/ipc/lib/ipcmsg.h  |7 
 testcases/kernel/syscalls/ipc/lib/libipc.c  |   54 +
 testcases/kernel/syscalls/ipc/msgctl/msgctl08.c |   42 -
 testcases/kernel/syscalls/ipc/msgctl/msgctl09.c |   42 -
 testcases/kernel/syscalls/ipc/msgctl/msgctl10.c |  527 ++
 testcases/kernel/syscalls/ipc/msgctl/msgctl11.c |  696 
 testcases/kernel/syscalls/ipc/msgget/Makefile   |3 
 testcases/kernel/syscalls/ipc/msgget/msgget03.c |   22 
 8 files changed, 1318 insertions(+), 75 deletions(-)

Index: ltp-full-20080131/testcases/kernel/syscalls/ipc/lib/libipc.c
===
--- ltp-full-20080131.orig/testcases/kernel/syscalls/ipc/lib/libipc.c	2008-02-22 07:57:47.0 +0100
+++ ltp-full-20080131/testcases/kernel/syscalls/ipc/lib/libipc.c	2008-02-22 08:02:55.0 +0100
@@ -201,3 +201,57 @@ rm_shm(int shm_id)
 		tst_resm(TINFO, "id = %d", shm_id);
 	}
 }
+
+#define BUFSIZE 512
+
+/*
+ * Get the number of message queues already in use
+ */
+int
+get_used_msgqueues()
+{
+	FILE *f;
+	int used_queues;
+	char buff[BUFSIZE];
+
+	f = popen("ipcs -q", "r");
+	if (!f) {
+		tst_resm(TBROK, "Could not run 'ipcs' to calculate used "
+			"message queues");
+		tst_exit();
+	}
+	/* FIXME: Start at -4 because ipcs prints four lines of header */
+	for (used_queues = -4; fgets(buff, BUFSIZE, f); used_queues++)
+		;
+	pclose(f);
+	if (used_queues < 0) {
+		tst_resm(TBROK, "Could not read output of 'ipcs' to "
+			"calculate used message queues");
+		tst_exit();
+	}
+	return used_queues;
+}
+
+/*
+ * Get the max number of message queues allowed on system
+ */
+int
+get_max_msgqueues()
+{
+	FILE *f;
+	char buff[BUFSIZE];
+
+	/* Get the max number of message queues allowed on system */
+	f = fopen("/proc/sys/kernel/msgmni", "r");
+	if (!f) {
+		tst_resm(TBROK, "Could not open /proc/sys/kernel/msgmni");
+		return -1;
+	}
+	if (!fgets(buff, BUFSIZE, f)) {
+		fclose(f);
+		tst_resm(TBROK, "Could not read /proc/sys/kernel/msgmni");
+		return -1;
+	}
+	fclose(f);
+	return atoi(buff);
+}
Index: ltp-full-20080131/testcases/kernel/syscalls/ipc/lib/ipcmsg.h
===
--- ltp-full-20080131.orig/testcases/kernel/syscalls/ipc/lib/ipcmsg.h	2008-02-22 07:57:47.0 +0100
+++ ltp-full-20080131/testcases/kernel/syscalls/ipc/lib/ipcmsg.h	2008-02-22 08:04:15.0 +0100
@@ -41,7 +41,9 @@ void setup(void);
 #define MSGSIZE	1024		/* a resonable size for a message */
 #define MSGTYPE 1		/* a type ID for a message */
 
-#define MAX_MSGQUEUES	16	/* MSGMNI as defined in linux/msg.h */
+#define NR_MSGQUEUES	16	/* MSGMNI as defined in linux/msg.h */
+
+#define min(a, b)	(((a) < (b)) ? (a) : (b))
 
 typedef struct mbuf {		/* a generic message structure */
 	long mtype;
@@ -61,4 +63,7 @@ void rm_queue(int);
 int getipckey();
 int getuserid(char *);
 
+int get_max_msgqueues(void);
+int get_used_msgqueues(void);
+
 #endif /* ipcmsg.h */
Index: ltp-full-20080131/testcases/kernel/syscalls/ipc/msgctl/msgctl10.c
===
--- /dev/null	1970-01-01 00:00:00.0 +
+++ ltp-full-20080131/testcases/kernel/syscalls/ipc/msgctl/msgctl10.c	2008-02-22 08:05:53.0 +0100
@@ -0,0 +1,527 @@
+/*
+ *
+ *   Copyright (c) International Business Machines  Corp., 2002
+ *
+ *   This program is free software;  you can 

Re: [PATCH] capabilities: implement per-process securebits

2008-02-21 Thread Andrew G. Morgan


Andrew G. Morgan wrote:
| Serge E. Hallyn wrote:
| |> It all looks good to me.
| |
| |> Since we've confirmed that wireshark uses capabilities it must be using
| |> prctl(PR_SET_KEEPCAPS), so running it might be a good way to verify that
| |> your changes to that codepath (with CONFIG_SECURITY_FILE_CAPABILITIES=n)
| |> are 100% correct, and might set minds at ease.  Is that something you're
| |> set up to be able to do?
|
| I guess I need someone to offer an existence proof that this particular
| wireshark code ever worked? (ldd dumpcap|grep libcap). For reference,
| I'm looking at:

Doh! :*)

I had mistaken cap_init() for cap_get_proc() in the wireshark code...

I now see what wireshark is trying to do, and can *confirm* that with
the present patch I do maintain the legacy behavior. :-)
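
For reference, the rule at issue (the new effective set must be a subset
of the new permitted set) can be modelled with plain bitmasks. This is a
sketch, not the kernel's code: capmask_t, cap_bit(), mask_issubset() and
check_new_sets() are made-up names, and -1 merely stands in for -EPERM.
Since cap_init() returns a state with every flag cleared, a set built
from it starts with an empty effective mask, which passes the check.

```c
#include <stdint.h>

/* Capability numbers as defined in linux/capability.h. */
#define SKETCH_CAP_NET_ADMIN	12
#define SKETCH_CAP_NET_RAW	13

typedef uint32_t capmask_t;	/* hypothetical flat mask, one bit per capability */

static capmask_t cap_bit(int cap)
{
	return (capmask_t)1 << cap;
}

/* Returns 1 if every bit set in 'subset' is also set in 'set'. */
static int mask_issubset(capmask_t subset, capmask_t set)
{
	return (subset & ~set) == 0;
}

/* Returns 0 if the new sets are acceptable, -1 (standing in for -EPERM) if not. */
static int check_new_sets(capmask_t effective, capmask_t permitted)
{
	return mask_issubset(effective, permitted) ? 0 : -1;
}
```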

FWIW I've updated capsh in the libcap git tree to add a few more hooks
and with the following sequence can now verify that the keep-caps works
as before:

As root:
# rm -f tcapsh
# cp capsh tcapsh
# chown root.root tcapsh
# chmod u+s tcapsh
# ls -l tcapsh
# ./capsh --uid=500 -- -c "./tcapsh --keep=1 \
   --caps=\"cap_net_raw,cap_net_admin=ip\" --uid=500 \
   --caps=\"cap_net_raw,cap_net_admin=pie\" --print"
# echo $?
0

The wireshark problem, that you have been discussing (in the other
thread), can also be simulated as follows:

# ./capsh --uid=500 -- -c "./tcapsh --keep=1 \
   --caps=\"cap_net_raw,cap_net_admin=ip\" \
   --uid=500 --forkfor=10 --caps= --print \
   --killit=9 --print"

You might like to re-post your fix for that problem as a standalone
patch; I suspect it may be lost in the noise at this point. (You might
also like to update the comment in that fix, since the old comment looks
very stale if you delete the ->euid==0 check. I think it is safe to
simply say /* legacy signal behavior requires that a user can kill any
process running with their uid */)

Cheers

Andrew

|
| wireshark-0.99.7/dumpcap.c:302
|
| void
| relinquish_privs_except_capture(void)
| {
|     /* CAP_NET_ADMIN: Promiscuous mode and a truckload of other
|      *                stuff we don't need (and shouldn't have).
|      * CAP_NET_RAW:   Packet capture (raw sockets).
|      */
|     cap_value_t cap_list[2] = { CAP_NET_ADMIN, CAP_NET_RAW };
|     cap_t caps = cap_init();
|     int cl_len = sizeof(cap_list) / sizeof(cap_value_t);
|
|     if (started_with_special_privs()) {
|         print_caps("Pre drop, pre set");
|         if (prctl(PR_SET_KEEPCAPS, 1, 0, 0, 0) == -1) {
|             perror("prctl()");
|         }
|
|         cap_set_flag(caps, CAP_PERMITTED,   cl_len, cap_list, CAP_SET);
|         cap_set_flag(caps, CAP_INHERITABLE, cl_len, cap_list, CAP_SET);
|
| [ XXX:AGM since (caps.pE > caps.pP) this next line should fail ]
|         if (cap_set_proc(caps)) {
|             perror("capset()");
|         }
|         print_caps("Pre drop, post set");
|     }
|
|     relinquish_special_privs_perm();
|
|     print_caps("Post drop, pre set");
|     cap_set_flag(caps, CAP_EFFECTIVE,   cl_len, cap_list, CAP_SET);
|     if (cap_set_proc(caps)) {
|         perror("capset()");
|     }
|     print_caps("Post drop, post set");
|     cap_free(caps);
| }
| #endif /* HAVE_LIBCAP */
|
| My reading of the above code suggests that the application believes that
| it can raise/retain effective capabilities that are not in its permitted
| set.
|
| Browsing back in my git tree all the way back to 'v2.6.12-rc2', the
| following code (cap_capset_check) correctly requires:
|
|     /* verify the _new_Effective_ is a subset of the _new_Permitted_ */
|     if (!cap_issubset (*effective, *permitted)) {
|         return -EPERM;
|     }
|
| so my question is, why should one expect this wireshark code to work? It
| looks wrong to me.
|
| Thanks
|
| Andrew
|
| |
| |> -serge
| |
| | Thanks
| |
| | Andrew
|
| From 006ddf6903983dd596e360ab1ab8e537b29fab46 Mon Sep 17 00:00:00 2001
| From: Andrew G. Morgan <[EMAIL PROTECTED]>
| Date: Mon, 18 Feb 2008 15:23:28 -0800
| Subject: [PATCH] Implement per-process securebits
|
| [This patch represents a no-op unless CONFIG_SECURITY_FILE_CAPABILITIES
|  is enabled at configure time.]
|
| Filesystem capability support makes it possible to do away with
| (set)uid-0 based privilege and use capabilities instead. That is, with
| filesystem support for capabilities but without this present patch,
| it is (conceptually) possible to manage a system with capabilities
| alone and never need to obtain privilege via (set)uid-0.


linux-next: Tree for Feb 22

2008-02-21 Thread Stephen Rothwell
Hi all,

I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/sfr/linux-next.git.

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log files
in the Next directory.  Between each merge, the tree was built with
allmodconfig for both powerpc and x86_64.

There was only one merge problem and no build failures!

We are up to 30 trees, more are welcome (even if they are currently
empty).  I would encourage architecture maintainers, in particular, to
set up a git branch or quilt tree now to avoid the rush after RC3 :-)

I am now including tarballs in
http://www.kernel.org/pub/linux/kernel/people/sfr/linux-next/.

Status of my local build tests is at
http://kisskb.ellerman.id.au/kisskb/branch/9/.  The sparc builds have
been mostly disabled while I obtain a working cross compiler.

-- 
Cheers,
Stephen Rothwell                    [EMAIL PROTECTED]
http://www.canb.auug.org.au/~sfr/




Re: [PATCH] Fix building lguest as module.

2008-02-21 Thread Tony Breeds
On Wed, Feb 20, 2008 at 11:01:40AM +0100, Ingo Molnar wrote:
> 
> * Tony Breeds <[EMAIL PROTECTED]> wrote:
> 
> > I've attached the .config FWIW
> 
> indeed you are right...
> 
> I fixed this build failure too - could you check whether x86.git#test 
> (which has all these lguest build fixes) works fine for you:
> 
>http://people.redhat.com/mingo/x86.git/README
> 
> ? Thanks,

Sure, that works.  I'm not convinced that your patch is right, as it
treats guest and host support as the same thing.  Having said that, we're
only talking about a few constants, so I don't think it's worth any more
time.  If it's not right it can be fixed later by those that know
better.

Yours Tony

  linux.conf.au    http://www.marchsouth.org/
  Jan 19 - 24 2009 The Australian Linux Technical Conference!



RE: [PATCH 03/10] PCI: AMD SATA IDE mode quirk

2008-02-21 Thread Cai, Crane
> On Thu, Feb 21, 2008 at 03:47:33PM -0800, Greg Kroah-Hartman wrote:
> > +static void __devinit quirk_amd_ide_mode(struct pci_dev *pdev)
> >  {
> > -   /* set sb600 sata to ahci mode */
> > -   if ((pdev->class >> 8) == PCI_CLASS_STORAGE_IDE) {
> > -   u8 tmp;
> > +   /* set sb600/sb700/sb800 sata to ahci mode */
> > +   u8 tmp;
> >  
> > +   pci_read_config_byte(pdev, PCI_CLASS_DEVICE, &tmp);
> > +   if (tmp == 0x01) {
> > pci_read_config_byte(pdev, 0x40, &tmp);
> 
> This seems like a dis-improvement.  Why are we reading a 
> config byte for something we already have in the pci_dev?  
> Why are we now checking against 0x01 instead of a symbolic 
> constant?  Why are we no longer checking that this is 
> PCI_BASE_CLASS_STORAGE?
It is a quirk. pci_ids.h does have PCI_CLASS_STORAGE_IDE
and PCI_BASE_CLASS_STORAGE, but these cannot represent
the situation we want to check. 0x01 is the low byte
(the last two hex digits) of PCI_CLASS_STORAGE_IDE. Also, because it
is a quirk, I do not think we need to change pci_ids.h, so 0x01
is used.
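
To make the two checks under discussion concrete, here is a small
standalone sketch of how the 24-bit PCI class code splits into base
class, sub-class and programming interface. The two constants carry the
values from pci_ids.h; pdev->class is modelled as a plain uint32_t, and
the helper names are invented for the illustration. A byte read at
PCI_CLASS_DEVICE (config offset 0x0a) returns only the sub-class byte.

```c
#include <stdint.h>

/* Values as defined in the kernel's include/linux/pci_ids.h. */
#define PCI_BASE_CLASS_STORAGE	0x01
#define PCI_CLASS_STORAGE_IDE	0x0101

/* Split a 24-bit class code: base class, sub-class, programming interface. */
static uint8_t class_base(uint32_t class24)
{
	return (class24 >> 16) & 0xff;
}

static uint8_t class_sub(uint32_t class24)
{
	return (class24 >> 8) & 0xff;
}

/* Shape of the old check: (pdev->class >> 8) == PCI_CLASS_STORAGE_IDE. */
static int is_ide_by_class(uint32_t class24)
{
	return (class24 >> 8) == PCI_CLASS_STORAGE_IDE;
}

/* Shape of the new check: the sub-class byte alone compared with 0x01. */
static int is_ide_by_subclass_byte(uint32_t class24)
{
	return class_sub(class24) == 0x01;
}
```

The sub-class byte alone does not identify a storage device: a class code
of 0x020100 (a network controller with sub-class 0x01) passes the
byte-only check but fails the 16-bit comparison.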
> > -DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATI, 
> > PCI_DEVICE_ID_ATI_IXP600_SATA, quirk_sb600_sata); 
> > -DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATI, 
> > PCI_DEVICE_ID_ATI_IXP700_SATA, quirk_sb600_sata);
> > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATI, 
> > +PCI_DEVICE_ID_ATI_IXP600_SATA, quirk_amd_ide_mode); 
> > +DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_ATI, 
> > +PCI_DEVICE_ID_ATI_IXP600_SATA, quirk_amd_ide_mode); 
> > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATI, 
> > +PCI_DEVICE_ID_ATI_IXP700_SATA, quirk_amd_ide_mode); 
> > +DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_ATI, 
> > +PCI_DEVICE_ID_ATI_IXP700_SATA, quirk_amd_ide_mode);
> 
> Nothing in the changelog entry suggests why we now need 
> FIXUP_RESUME entries when we didn't before.
> 
PCI configuration space is changed by the BIOS, and then again during PCI
init and restore, so the fixup is also needed on resume.
> --
> Intel are signing my paycheques ... these opinions are still 
> mine "Bill, look, we understand that you're interested in 
> selling us this operating system, but compare it to ours.  We 
> can't possibly take such a retrograde step."
> 
> 



Re: linux-next: first tree

2008-02-21 Thread Stephen Rothwell
Hi Frank,

On Fri, 22 Feb 2008 06:41:00 +0100 Frank Seidel <[EMAIL PROTECTED]> wrote:
>
> Stephen Rothwell wrote:
> > On Fri, 22 Feb 2008 01:07:01 +0100 Frank Seidel <[EMAIL PROTECTED]> wrote:
> >> Hi, i'll provide tars of the current linux-next tree reachable
> >> via my http://linux-next.f-seidel.de wiki ("Tar Downloads").
> >> Is that what you were looking for?
> > 
> > I was going to start providing tarballs yesterday, but other things
> > happened :-( I will provide a next-MMDD.tar.gz file on kernel.org
> > starting today.  I will mention it in today's announcement.
> 
> sorry, i didn't know/consider that you could have something prepared for
> this as well, as i couldn't see any feedback from you to this request for
> quite some time.

Yeah, sorry, but the last few days have been a bit hectic ... should be
better from now on, I hope.  Don't be sorry - I like enthusiasm.

> So, i just thought i could try to take care of this point. But of course
> it's much more handy when you just do it together when releasing a new
> linux-next.

I can easily generate them straight out of the tree on master.kernel.org.

-- 
Cheers,
Stephen Rothwell                    [EMAIL PROTECTED]




Re: linux-next: first tree

2008-02-21 Thread Frank Seidel
Hello Stephen,

Stephen Rothwell wrote:
> On Fri, 22 Feb 2008 01:07:01 +0100 Frank Seidel <[EMAIL PROTECTED]> wrote:
>> Hi, i'll provide tars of the current linux-next tree reachable
>> via my http://linux-next.f-seidel.de wiki ("Tar Downloads").
>> Is that what you were looking for?
> 
> I was going to start providing tarballs yesterday, but other things
> happened :-( I will provide a next-MMDD.tar.gz file on kernel.org
> starting today.  I will mention it in today's announcement.

sorry, i didn't know/consider that you could have something prepared for
this as well, as i couldn't see any feedback from you to this request for
quite some time.
So, i just thought i could try to take care of this point. But of course
it's much more handy when you just do it together when releasing a new
linux-next.

Thanks,
Frank


Re: linux-next: first tree

2008-02-21 Thread Frank Seidel
Greg KH wrote:
> Any reason we can't get these on kernel.org so that the mirror system
> will kick in for the whole world?

Only that i don't have a kernel.org account ;-) But Stephen has and
i suppose he'll put it there.

Thanks,
Frank


Re: linux-next: first tree

2008-02-21 Thread Frank Seidel
Randy Dunlap wrote:
> Looks close.  It needs to be scriptable (not just a dynamically generated
> link) and have predictable names.  As long as those are true, then it
> should be great.

Yes, i would have scripted it if it turned out to be of use for others.
But as i just saw Stephen already has something ready for this :-)

Thanks,
Frank


[PATCH 09/11] xen: make grant table arch portable.

2008-02-21 Thread Isaku Yamahata
Split out the x86-specific part of grant-table.c.

Signed-off-by: Isaku Yamahata <[EMAIL PROTECTED]>
---
 arch/x86/xen/Makefile  |2 +-
 arch/x86/xen/grant-table.c |   91 
 drivers/xen/grant-table.c  |   35 +---
 include/xen/grant_table.h  |6 +++
 4 files changed, 101 insertions(+), 33 deletions(-)
 create mode 100644 arch/x86/xen/grant-table.c

diff --git a/arch/x86/xen/Makefile b/arch/x86/xen/Makefile
index 95c5926..3d8df98 100644
--- a/arch/x86/xen/Makefile
+++ b/arch/x86/xen/Makefile
@@ -1,4 +1,4 @@
 obj-y  := enlighten.o setup.o multicalls.o mmu.o \
-   time.o manage.o xen-asm.o
+   time.o manage.o xen-asm.o grant-table.o
 
 obj-$(CONFIG_SMP)  += smp.o
diff --git a/arch/x86/xen/grant-table.c b/arch/x86/xen/grant-table.c
new file mode 100644
index 000..49ba9b5
--- /dev/null
+++ b/arch/x86/xen/grant-table.c
@@ -0,0 +1,91 @@
+/**
+ * grant_table.c
+ * x86 specific part
+ *
+ * Granting foreign access to our memory reservation.
+ *
+ * Copyright (c) 2005-2006, Christopher Clark
+ * Copyright (c) 2004-2005, K A Fraser
+ * Copyright (c) 2008 Isaku Yamahata 
+ *VA Linux Systems Japan. Split out x86 specific part.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License version 2
+ * as published by the Free Software Foundation; or, when distributed
+ * separately from the Linux kernel or incorporated into other
+ * software packages, subject to the following license:
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this source file (the "Software"), to deal in the Software without
+ * restriction, including without limitation the rights to use, copy, modify,
+ * merge, publish, distribute, sublicense, and/or sell copies of the Software,
+ * and to permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+
+#include 
+
+static int map_pte_fn(pte_t *pte, struct page *pmd_page,
+ unsigned long addr, void *data)
+{
+   unsigned long **frames = (unsigned long **)data;
+
+   set_pte_at(&init_mm, addr, pte, mfn_pte((*frames)[0], PAGE_KERNEL));
+   (*frames)++;
+   return 0;
+}
+
+static int unmap_pte_fn(pte_t *pte, struct page *pmd_page,
+   unsigned long addr, void *data)
+{
+
+   set_pte_at(&init_mm, addr, pte, __pte(0));
+   return 0;
+}
+
+int arch_gnttab_map_shared(unsigned long *frames, unsigned long nr_gframes,
+  unsigned long max_nr_gframes,
+  struct grant_entry **__shared)
+{
+   int rc;
+   struct grant_entry *shared = *__shared;
+
+   if (shared == NULL) {
+   struct vm_struct *area =
+   xen_alloc_vm_area(PAGE_SIZE * max_nr_gframes);
+   BUG_ON(area == NULL);
+   shared = area->addr;
+   *__shared = shared;
+   }
+
+   rc = apply_to_page_range(&init_mm, (unsigned long)shared,
+PAGE_SIZE * nr_gframes,
+map_pte_fn, &frames);
+   return rc;
+}
+
+void arch_gnttab_unmap_shared(struct grant_entry *shared,
+ unsigned long nr_gframes)
+{
+   apply_to_page_range(&init_mm, (unsigned long)shared,
+   PAGE_SIZE * nr_gframes, unmap_pte_fn, NULL);
+}
diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
index 9fcde20..22f5104 100644
--- a/drivers/xen/grant-table.c
+++ b/drivers/xen/grant-table.c
@@ -435,24 +435,6 @@ static inline unsigned int max_nr_grant_frames(void)
return xen_max;
 }
 
-static int map_pte_fn(pte_t *pte, struct page *pmd_page,
- unsigned long addr, void *data)
-{
-   unsigned long **frames = (unsigned long **)data;
-
-   set_pte_at(&init_mm, addr, pte, mfn_pte((*frames)[0], PAGE_KERNEL));
-   (*frames)++;
-   return 0;
-}
-
-static int unmap_pte_fn(pte_t *pte, struct page *pmd_page,
-   unsigned long addr, void *data)
-{
-
-   set_pte_at(&init_mm, addr, pte, __pte(0));

Re: [GIT PULL] XFS update for 2.6.25-rc3

2008-02-21 Thread Lachlan McIlroy

Jeff Garzik wrote:
> Lachlan McIlroy wrote:
>> Remove empty file fs/xfs/Makefile-linux-2.6.
>
> Already in the upstream kernel...

That's funny - I didn't see this change come through.  Oh well... thanks.

> commit 1803f3389b7ac9ed33ea561b3b94e22e2864a95d
> Author: Linus Torvalds <[EMAIL PROTECTED]>
> Date:   Wed Feb 20 19:55:09 2008 -0800
>
>     Remove empty file remnants that were left in the tree by mistake
>
>     Noted by various people (Sam, Jeff, Roland..)
>
>     Commit 58b7983d15a422d9616bdc4e245d5c31dfaefbe2 intended to remove the
>     xfs "Makefile-linux-2.6" file, but it was mistakenly still left in the
>     tree as a empty file, and would cause git to correctly complain about a
>     tracked file being removed after a "make distclean" (which removes empty
>     files as garbage).
>
>     And the asm-x86/desc_64.h file was supposed to be removed by commit
>     c81c6ca45a69478c7877b729af1942d2b80ef582, but instead stayed around
>     containing just a single newline.
>
>     Get rid of them both properly.
>
>     Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>



[PATCH 07/11] xen: make include/xen/page.h portable moving those definitions under asm dir.

2008-02-21 Thread Isaku Yamahata
The definitions in include/asm/xen/page.h are arch specific, and
ia64/xen wants to define its own version. So move them to the arch-specific
directory and keep include/xen/page.h in order not to break compilation.

Signed-off-by: Isaku Yamahata <[EMAIL PROTECTED]>
---
 include/{ => asm-x86}/xen/page.h |0 
 include/xen/page.h   |  181 +-
 2 files changed, 1 insertions(+), 180 deletions(-)
 copy include/{ => asm-x86}/xen/page.h (100%)

diff --git a/include/xen/page.h b/include/asm-x86/xen/page.h
similarity index 100%
copy from include/xen/page.h
copy to include/asm-x86/xen/page.h
diff --git a/include/xen/page.h b/include/xen/page.h
index 031ef22..eaf85fa 100644
--- a/include/xen/page.h
+++ b/include/xen/page.h
@@ -1,180 +1 @@
-#ifndef __XEN_PAGE_H
-#define __XEN_PAGE_H
-
-#include 
-
-#include 
-#include 
-
-#include 
-
-#ifdef CONFIG_X86_PAE
-/* Xen machine address */
-typedef struct xmaddr {
-   unsigned long long maddr;
-} xmaddr_t;
-
-/* Xen pseudo-physical address */
-typedef struct xpaddr {
-   unsigned long long paddr;
-} xpaddr_t;
-#else
-/* Xen machine address */
-typedef struct xmaddr {
-   unsigned long maddr;
-} xmaddr_t;
-
-/* Xen pseudo-physical address */
-typedef struct xpaddr {
-   unsigned long paddr;
-} xpaddr_t;
-#endif
-
-#define XMADDR(x)  ((xmaddr_t) { .maddr = (x) })
-#define XPADDR(x)  ((xpaddr_t) { .paddr = (x) })
-
-/ MACHINE <-> PHYSICAL CONVERSION MACROS /
-#define INVALID_P2M_ENTRY  (~0UL)
-#define FOREIGN_FRAME_BIT  (1UL<<31)
-#define FOREIGN_FRAME(m)   ((m) | FOREIGN_FRAME_BIT)
-
-extern unsigned long *phys_to_machine_mapping;
-
-static inline unsigned long pfn_to_mfn(unsigned long pfn)
-{
-   if (xen_feature(XENFEAT_auto_translated_physmap))
-   return pfn;
-
-   return phys_to_machine_mapping[(unsigned int)(pfn)] &
-   ~FOREIGN_FRAME_BIT;
-}
-
-static inline int phys_to_machine_mapping_valid(unsigned long pfn)
-{
-   if (xen_feature(XENFEAT_auto_translated_physmap))
-   return 1;
-
-   return (phys_to_machine_mapping[pfn] != INVALID_P2M_ENTRY);
-}
-
-static inline unsigned long mfn_to_pfn(unsigned long mfn)
-{
-   unsigned long pfn;
-
-   if (xen_feature(XENFEAT_auto_translated_physmap))
-   return mfn;
-
-#if 0
-   if (unlikely((mfn >> machine_to_phys_order) != 0))
-   return max_mapnr;
-#endif
-
-   pfn = 0;
-   /*
-* The array access can fail (e.g., device space beyond end of RAM).
-* In such cases it doesn't matter what we return (we return garbage),
-* but we must handle the fault without crashing!
-*/
-   __get_user(pfn, &machine_to_phys_mapping[mfn]);
-
-   return pfn;
-}
-
-static inline xmaddr_t phys_to_machine(xpaddr_t phys)
-{
-   unsigned offset = phys.paddr & ~PAGE_MASK;
-   return XMADDR(PFN_PHYS((u64)pfn_to_mfn(PFN_DOWN(phys.paddr))) | offset);
-}
-
-static inline xpaddr_t machine_to_phys(xmaddr_t machine)
-{
-   unsigned offset = machine.maddr & ~PAGE_MASK;
-   return XPADDR(PFN_PHYS((u64)mfn_to_pfn(PFN_DOWN(machine.maddr))) | offset);
-}
-
-/*
- * We detect special mappings in one of two ways:
- *  1. If the MFN is an I/O page then Xen will set the m2p entry
- * to be outside our maximum possible pseudophys range.
- *  2. If the MFN belongs to a different domain then we will certainly
- * not have MFN in our p2m table. Conversely, if the page is ours,
- * then we'll have p2m(m2p(MFN))==MFN.
- * If we detect a special mapping then it doesn't have a 'struct page'.
- * We force !pfn_valid() by returning an out-of-range pointer.
- *
- * NB. These checks require that, for any MFN that is not in our reservation,
- * there is no PFN such that p2m(PFN) == MFN. Otherwise we can get confused if
- * we are foreign-mapping the MFN, and the other domain as m2p(MFN) == PFN.
- * Yikes! Various places must poke in INVALID_P2M_ENTRY for safety.
- *
- * NB2. When deliberately mapping foreign pages into the p2m table, you *must*
- *  use FOREIGN_FRAME(). This will cause pte_pfn() to choke on it, as we
- *  require. In all the cases we care about, the FOREIGN_FRAME bit is
- *  masked (e.g., pfn_to_mfn()) so behaviour there is correct.
- */
-static inline unsigned long mfn_to_local_pfn(unsigned long mfn)
-{
-   extern unsigned long max_mapnr;
-   unsigned long pfn = mfn_to_pfn(mfn);
-   if ((pfn < max_mapnr)
-   && !xen_feature(XENFEAT_auto_translated_physmap)
-   && (phys_to_machine_mapping[pfn] != mfn))
-   return max_mapnr; /* force !pfn_valid() */
-   return pfn;
-}
-
-static inline void set_phys_to_machine(unsigned long pfn, unsigned long mfn)
-{
-   if (xen_feature(XENFEAT_auto_translated_physmap)) {
-   BUG_ON(pfn != mfn && mfn != INVALID_P2M_ENTRY);
-   return;
-   }
-   phys_to_machine_mapping[pfn] = mfn;
-}
-
-/* VIRT <-> MACHINE 

[PATCH 11/11] xen: import arch generic part of xencomm.

2008-02-21 Thread Isaku Yamahata
On xen/ia64 and xen/powerpc, hypercall arguments are passed by pseudo
physical address (guest physical address), so it is necessary to
convert from virtual addresses into pseudo physical addresses. The
framework that does this is called xencomm.
Import the arch-generic part of xencomm.
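
As a rough, self-contained illustration of the per-page walk such a
conversion needs, the sketch below splits a buffer into page-sized chunks
and records where each chunk starts. It follows the same offset and chunk
size arithmetic as xencomm_init() in the patch below, but
SKETCH_PAGE_SIZE and split_into_pages() are made-up names, and the
virtual-to-pseudo-physical translation is left out entirely.

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumed page size for the illustration */

/*
 * Walk [vaddr, vaddr + bytes) one page at a time and record the start
 * address of each chunk. Returns the number of chunks recorded, or -1
 * if max_chunks is too small. A real implementation would translate
 * each address before storing it.
 */
static int split_into_pages(unsigned long vaddr, unsigned long bytes,
			    unsigned long *chunk_start, int max_chunks)
{
	unsigned long recorded = 0;
	int i = 0;

	while (recorded < bytes) {
		unsigned long cur = vaddr + recorded;
		unsigned long offset = cur % SKETCH_PAGE_SIZE;	/* partial page */
		unsigned long chunksz = SKETCH_PAGE_SIZE - offset;

		if (chunksz > bytes - recorded)
			chunksz = bytes - recorded;
		if (i >= max_chunks)
			return -1;
		chunk_start[i++] = cur;
		recorded += chunksz;
	}
	return i;
}
```

A buffer that is not page aligned needs one more entry than its size
alone suggests, which is why the descriptor is sized from the page span
of the buffer rather than from its byte count.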

Signed-off-by: Isaku Yamahata <[EMAIL PROTECTED]>
---
 drivers/xen/Makefile|1 +
 drivers/xen/xencomm.c   |  232 +++
 include/xen/interface/xencomm.h |   41 +++
 include/xen/xencomm.h   |   77 +
 4 files changed, 351 insertions(+), 0 deletions(-)
 create mode 100644 drivers/xen/xencomm.c
 create mode 100644 include/xen/interface/xencomm.h
 create mode 100644 include/xen/xencomm.h

diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
index 823ce78..43f014c 100644
--- a/drivers/xen/Makefile
+++ b/drivers/xen/Makefile
@@ -1,2 +1,3 @@
 obj-y  += grant-table.o features.o events.o
 obj-y  += xenbus/
+obj-$(CONFIG_XEN_XENCOMM)  += xencomm.o
diff --git a/drivers/xen/xencomm.c b/drivers/xen/xencomm.c
new file mode 100644
index 000..797cb4e
--- /dev/null
+++ b/drivers/xen/xencomm.c
@@ -0,0 +1,232 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307 USA
+ *
+ * Copyright (C) IBM Corp. 2006
+ *
+ * Authors: Hollis Blanchard <[EMAIL PROTECTED]>
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#ifdef __ia64__
+#include/* for is_kern_addr() */
+#endif
+
+#ifdef HAVE_XEN_PLATFORM_COMPAT_H
+#include 
+#endif
+
+static int xencomm_init(struct xencomm_desc *desc,
+   void *buffer, unsigned long bytes)
+{
+   unsigned long recorded = 0;
+   int i = 0;
+
+   while ((recorded < bytes) && (i < desc->nr_addrs)) {
+   unsigned long vaddr = (unsigned long)buffer + recorded;
+   unsigned long paddr;
+   int offset;
+   int chunksz;
+
+   offset = vaddr % PAGE_SIZE; /* handle partial pages */
+   chunksz = min(PAGE_SIZE - offset, bytes - recorded);
+
+   paddr = xencomm_vtop(vaddr);
+   if (paddr == ~0UL) {
+   printk(KERN_DEBUG "%s: couldn't translate vaddr %lx\n",
+  __func__, vaddr);
+   return -EINVAL;
+   }
+
+   desc->address[i++] = paddr;
+   recorded += chunksz;
+   }
+
+   if (recorded < bytes) {
+   printk(KERN_DEBUG
+  "%s: could only translate %ld of %ld bytes\n",
+  __func__, recorded, bytes);
+   return -ENOSPC;
+   }
+
+   /* mark remaining addresses invalid (just for safety) */
+   while (i < desc->nr_addrs)
+   desc->address[i++] = XENCOMM_INVALID;
+
+   desc->magic = XENCOMM_MAGIC;
+
+   return 0;
+}
+
+static struct xencomm_desc *xencomm_alloc(gfp_t gfp_mask,
+ void *buffer, unsigned long bytes)
+{
+   struct xencomm_desc *desc;
+   unsigned long buffer_ulong = (unsigned long)buffer;
+   unsigned long start = buffer_ulong & PAGE_MASK;
+   unsigned long end = (buffer_ulong + bytes) | ~PAGE_MASK;
+   unsigned long nr_addrs = (end - start + 1) >> PAGE_SHIFT;
+   unsigned long size = sizeof(*desc) +
+   sizeof(desc->address[0]) * nr_addrs;
+
+   /*
+* slab allocator returns at least sizeof(void*) aligned pointer.
+* When sizeof(*desc) > sizeof(void*), struct xencomm_desc might
+* cross page boundary.
+*/
+   if (sizeof(*desc) > sizeof(void *)) {
+   unsigned long order = get_order(size);
+   desc = (struct xencomm_desc *)__get_free_pages(gfp_mask,
+  order);
+   if (desc == NULL)
+   return NULL;
+
+   desc->nr_addrs =
+   ((PAGE_SIZE << order) - sizeof(struct xencomm_desc)) /
+   sizeof(*desc->address);
+   } else {
+   desc = kmalloc(size, gfp_mask);
+   if (desc == NULL)
+   return NULL;
+
+   desc->nr_addrs = nr_addrs;
+   }
+   return desc;
+}
+
+void xencomm_free(struct xencomm_handle *desc)
+{
+   

[PATCH 10/11] xen: import include/xen/interface/callback.h which ia64/xen needs.

2008-02-21 Thread Isaku Yamahata

Signed-off-by: Isaku Yamahata <[EMAIL PROTECTED]>
---
 include/xen/interface/callback.h |  119 ++
 1 files changed, 119 insertions(+), 0 deletions(-)
 create mode 100644 include/xen/interface/callback.h

diff --git a/include/xen/interface/callback.h b/include/xen/interface/callback.h
new file mode 100644
index 000..04c8b5d
--- /dev/null
+++ b/include/xen/interface/callback.h
@@ -0,0 +1,119 @@
+/**
+ * callback.h
+ *
+ * Register guest OS callbacks with Xen.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ *
+ * Copyright (c) 2006, Ian Campbell
+ */
+
+#ifndef __XEN_PUBLIC_CALLBACK_H__
+#define __XEN_PUBLIC_CALLBACK_H__
+
+#include "xen.h"
+
+/*
+ * Prototype for this hypercall is:
+ *   long callback_op(int cmd, void *extra_args)
+ * @cmd        == CALLBACKOP_??? (callback operation).
+ * @extra_args == Operation-specific extra arguments (NULL if none).
+ */
+
+/* ia64, x86: Callback for event delivery. */
+#define CALLBACKTYPE_event 0
+
+/* x86: Failsafe callback when guest state cannot be restored by Xen. */
+#define CALLBACKTYPE_failsafe  1
+
+/* x86/64 hypervisor: Syscall by 64-bit guest app ('64-on-64-on-64'). */
+#define CALLBACKTYPE_syscall   2
+
+/*
+ * x86/32 hypervisor: Only available on x86/32 when supervisor_mode_kernel
+ * feature is enabled. Do not use this callback type in new code.
+ */
+#define CALLBACKTYPE_sysenter_deprecated   3
+
+/* x86: Callback for NMI delivery. */
+#define CALLBACKTYPE_nmi   4
+
+/*
+ * x86: sysenter is only available as follows:
+ * - 32-bit hypervisor: with the supervisor_mode_kernel feature enabled
+ * - 64-bit hypervisor: 32-bit guest applications on Intel CPUs
+ *  ('32-on-32-on-64', '32-on-64-on-64')
+ *  [nb. also 64-bit guest applications on Intel CPUs
+ *   ('64-on-64-on-64'), but syscall is preferred]
+ */
+#define CALLBACKTYPE_sysenter  5
+
+/*
+ * x86/64 hypervisor: Syscall by 32-bit guest app on AMD CPUs
+ *('32-on-32-on-64', '32-on-64-on-64')
+ */
+#define CALLBACKTYPE_syscall32 7
+
+/*
+ * Disable event deliver during callback? This flag is ignored for event and
+ * NMI callbacks: event delivery is unconditionally disabled.
+ */
+#define _CALLBACKF_mask_events 0
+#define CALLBACKF_mask_events  (1U << _CALLBACKF_mask_events)
+
+/*
+ * Register a callback.
+ */
+#define CALLBACKOP_register 0
+struct callback_register {
+uint16_t type;
+uint16_t flags;
+xen_callback_t address;
+};
+DEFINE_GUEST_HANDLE_STRUCT(callback_register);
+
+/*
+ * Unregister a callback.
+ *
+ * Not all callbacks can be unregistered. -EINVAL will be returned if
+ * you attempt to unregister such a callback.
+ */
+#define CALLBACKOP_unregister  1
+struct callback_unregister {
+uint16_t type;
+uint16_t _unused;
+};
+DEFINE_GUEST_HANDLE_STRUCT(callback_unregister);
+
+#if __XEN_INTERFACE_VERSION__ < 0x00030207
+#undef CALLBACKTYPE_sysenter
+#define CALLBACKTYPE_sysenter CALLBACKTYPE_sysenter_deprecated
+#endif
+
+#endif /* __XEN_PUBLIC_CALLBACK_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-set-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
1.5.3

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 08/11] xen: replace callers of alloc_vm_area()/free_vm_area() with xen_ prefixed ones.

2008-02-21 Thread Isaku Yamahata
Don't use alloc_vm_area()/free_vm_area() directly; instead define
xen_alloc_vm_area()/xen_free_vm_area() and use them.

alloc_vm_area()/free_vm_area() are used to allocate/free the area used
for grant table mapping. The Xen/x86 grant table is based on virtual
addresses, so alloc_vm_area()/free_vm_area() are suitable.
On the other hand, the Xen/ia64 (and Xen/powerpc) grant table is based on
pseudo physical addresses (guest physical addresses), so allocation
should be done differently.
The original version of xenified Linux/IA64 has its own
alloc_vm_area()/free_vm_area() definitions which, contrary to their
names, don't allocate a vm area.
Now vanilla Linux already has its own definitions, so it's impossible
to have IA64 definitions of alloc_vm_area()/free_vm_area().
Instead introduce xen_alloc_vm_area()/xen_free_vm_area() and use them.

Signed-off-by: Isaku Yamahata <[EMAIL PROTECTED]>
---
 drivers/xen/grant-table.c          |    2 +-
 drivers/xen/xenbus/xenbus_client.c |    6 +++---
 include/asm-x86/xen/hypervisor.h   |    3 +++
 3 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
index 95016fd..9fcde20 100644
--- a/drivers/xen/grant-table.c
+++ b/drivers/xen/grant-table.c
@@ -478,7 +478,7 @@ static int gnttab_map(unsigned int start_idx, unsigned int end_idx)
 
if (shared == NULL) {
struct vm_struct *area;
-   area = alloc_vm_area(PAGE_SIZE * max_nr_grant_frames());
+   area = xen_alloc_vm_area(PAGE_SIZE * max_nr_grant_frames());
BUG_ON(area == NULL);
shared = area->addr;
}
diff --git a/drivers/xen/xenbus/xenbus_client.c b/drivers/xen/xenbus/xenbus_client.c
index 9fd2f70..0f86b0f 100644
--- a/drivers/xen/xenbus/xenbus_client.c
+++ b/drivers/xen/xenbus/xenbus_client.c
@@ -399,7 +399,7 @@ int xenbus_map_ring_valloc(struct xenbus_device *dev, int gnt_ref, void **vaddr)
 
*vaddr = NULL;
 
-   area = alloc_vm_area(PAGE_SIZE);
+   area = xen_alloc_vm_area(PAGE_SIZE);
if (!area)
return -ENOMEM;
 
@@ -409,7 +409,7 @@ int xenbus_map_ring_valloc(struct xenbus_device *dev, int gnt_ref, void **vaddr)
BUG();
 
if (op.status != GNTST_okay) {
-   free_vm_area(area);
+   xen_free_vm_area(area);
xenbus_dev_fatal(dev, op.status,
 "mapping in shared page %d from domain %d",
 gnt_ref, dev->otherend_id);
@@ -508,7 +508,7 @@ int xenbus_unmap_ring_vfree(struct xenbus_device *dev, void *vaddr)
BUG();
 
if (op.status == GNTST_okay)
-   free_vm_area(area);
+   xen_free_vm_area(area);
else
xenbus_dev_error(dev, op.status,
 "unmapping page at handle %d error %d",
diff --git a/include/asm-x86/xen/hypervisor.h b/include/asm-x86/xen/hypervisor.h
index 138ee8a..31836ad 100644
--- a/include/asm-x86/xen/hypervisor.h
+++ b/include/asm-x86/xen/hypervisor.h
@@ -57,6 +57,9 @@ extern struct shared_info *HYPERVISOR_shared_info;
 extern struct start_info *xen_start_info;
 #define is_initial_xendomain() (xen_start_info->flags & SIF_INITDOMAIN)
 
+#define xen_alloc_vm_area(size)	alloc_vm_area(size)
+#define xen_free_vm_area(area)	free_vm_area(area)
+
 /* arch/i386/mach-xen/evtchn.c */
 /* Force a proper event-channel callback from Xen. */
 extern void force_evtchn_callback(void);
-- 
1.5.3



[PATCH 06/11] xen: move arch/x86/xen/events.c under drivers/xen and split out arch specific part.

2008-02-21 Thread Isaku Yamahata
ia64/xen also uses events.c. Clean it up so that ia64/xen can use it.
Make ipi_to_irq globally visible; ia64/xen needs to reference it from
another file.
Introduce resend_irq_on_evtchn(), which ia64 needs.
Introduce xen_do_IRQ() to split out arch specific code.
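
The split described above, with generic code scanning the pending event
bitmaps and handing each IRQ to an arch hook, can be sketched in plain C
as follows. This is a simplified illustration under stated assumptions,
not the events.c code: it has no atomics, barriers, per-cpu data or event
masking, and every sketch_* name (including the function-pointer hook
that stands in for xen_do_IRQ()) is hypothetical.

```c
#include <assert.h>
#include <limits.h>

#define SKETCH_BITS_PER_LONG ((int)(sizeof(unsigned long) * CHAR_BIT))
#define SKETCH_NR_WORDS 2
#define SKETCH_NR_PORTS (SKETCH_NR_WORDS * SKETCH_BITS_PER_LONG)

/* Hypothetical arch hook; plays the role xen_do_IRQ() has in the patch. */
typedef void (*sketch_irq_hook)(int irq, void *ctx);

struct sketch_events {
	unsigned long pending_sel;              /* one bit per pending word */
	unsigned long pending[SKETCH_NR_WORDS]; /* one bit per event port */
	int port_to_irq[SKETCH_NR_PORTS];       /* -1 means "not bound" */
};

/* Index of the lowest set bit; the caller guarantees w != 0. */
static int sketch_lowest_bit(unsigned long w)
{
	int bit = 0;

	while (!(w & 1UL)) {
		w >>= 1;
		bit++;
	}
	return bit;
}

/* Scan the two-level bitmap and feed each bound port to the arch hook. */
static int sketch_do_upcall(struct sketch_events *ev,
			    sketch_irq_hook hook, void *ctx)
{
	int delivered = 0;
	unsigned long sel = ev->pending_sel;

	ev->pending_sel = 0;
	while (sel != 0) {
		int word = sketch_lowest_bit(sel);
		unsigned long bits;

		sel &= ~(1UL << word);
		if (word >= SKETCH_NR_WORDS)
			continue;

		bits = ev->pending[word];
		ev->pending[word] = 0;
		while (bits != 0) {
			int bit = sketch_lowest_bit(bits);
			int port = word * SKETCH_BITS_PER_LONG + bit;
			int irq = ev->port_to_irq[port];

			bits &= ~(1UL << bit);
			if (irq != -1) {
				hook(irq, ctx); /* arch specific delivery */
				delivered++;
			}
		}
	}
	return delivered;
}

/* Example hook for the sketch: sums the IRQ numbers it is given. */
static void sketch_record_irq(int irq, void *ctx)
{
	int *sum = ctx;

	*sum += irq;
}
```

Keeping the delivery step behind a hook is what lets the scanning loop
stay arch generic; on x86 the patch fills that role with a macro that
sets regs->orig_ax and calls do_IRQ().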

Signed-off-by: Isaku Yamahata <[EMAIL PROTECTED]>
---
 arch/x86/xen/Makefile              |    2 +-
 drivers/xen/Makefile               |    2 +-
 {arch/x86 => drivers}/xen/events.c |   34 ++
 include/asm-x86/xen/hypervisor.h   |    7 +++
 include/xen/events.h               |    1 +
 5 files changed, 36 insertions(+), 10 deletions(-)
 rename {arch/x86 => drivers}/xen/events.c (95%)

diff --git a/arch/x86/xen/Makefile b/arch/x86/xen/Makefile
index c5e9aa4..95c5926 100644
--- a/arch/x86/xen/Makefile
+++ b/arch/x86/xen/Makefile
@@ -1,4 +1,4 @@
 obj-y  := enlighten.o setup.o multicalls.o mmu.o \
-   events.o time.o manage.o xen-asm.o
+   time.o manage.o xen-asm.o
 
 obj-$(CONFIG_SMP)  += smp.o
diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
index 609fdda..823ce78 100644
--- a/drivers/xen/Makefile
+++ b/drivers/xen/Makefile
@@ -1,2 +1,2 @@
-obj-y  += grant-table.o features.o
+obj-y  += grant-table.o features.o events.o
 obj-y  += xenbus/
diff --git a/arch/x86/xen/events.c b/drivers/xen/events.c
similarity index 95%
rename from arch/x86/xen/events.c
rename to drivers/xen/events.c
index dcf613e..7474739 100644
--- a/arch/x86/xen/events.c
+++ b/drivers/xen/events.c
@@ -37,7 +37,9 @@
 #include 
 #include 
 
-#include "xen-ops.h"
+#ifdef CONFIG_X86
+# include "../arch/x86/xen/xen-ops.h"
+#endif
 
 /*
  * This lock protects updates to the following mapping and reference-count
@@ -49,7 +51,7 @@ static DEFINE_SPINLOCK(irq_mapping_update_lock);
 static DEFINE_PER_CPU(int, virq_to_irq[NR_VIRQS]) = {[0 ... NR_VIRQS-1] = -1};
 
 /* IRQ <-> IPI mapping */
-static DEFINE_PER_CPU(int, ipi_to_irq[XEN_NR_IPIS]) = {[0 ... XEN_NR_IPIS-1] = -1};
+DEFINE_PER_CPU(int, ipi_to_irq[XEN_NR_IPIS]) = {[0 ... XEN_NR_IPIS-1] = -1};
 
 /* Packed IRQ information: binding type, sub-type index, and event channel. */
 struct packed_irq
@@ -455,7 +457,6 @@ void xen_send_IPI_one(unsigned int cpu, enum ipi_vector vector)
notify_remote_via_irq(irq);
 }
 
-
 /*
  * Search the CPUs pending events bitmasks.  For each one found, map
  * the event number to an irq, and feed it into do_IRQ() for
@@ -474,7 +475,10 @@ void xen_evtchn_do_upcall(struct pt_regs *regs)
 
vcpu_info->evtchn_upcall_pending = 0;
 
-   /* NB. No need for a barrier here -- XCHG is a barrier on x86. */
+#ifndef CONFIG_X86 /* No need for a barrier -- XCHG is a barrier on x86. */
+   /* Clear master flag /before/ clearing selector flag. */
+   rmb();
+#endif
pending_words = xchg(&vcpu_info->evtchn_pending_sel, 0);
while (pending_words != 0) {
unsigned long pending_bits;
@@ -486,10 +490,8 @@ void xen_evtchn_do_upcall(struct pt_regs *regs)
int port = (word_idx * BITS_PER_LONG) + bit_idx;
int irq = evtchn_to_irq[port];
 
-   if (irq != -1) {
-   regs->orig_ax = ~irq;
-   do_IRQ(regs);
-   }
+   if (irq != -1)
+   xen_do_IRQ(irq, regs);
}
}
 
@@ -525,6 +527,22 @@ static void set_affinity_irq(unsigned irq, cpumask_t dest)
rebind_irq_to_cpu(irq, tcpu);
 }
 
+int resend_irq_on_evtchn(unsigned int irq)
+{
+   int masked, evtchn = evtchn_from_irq(irq);
+   struct shared_info *s = HYPERVISOR_shared_info;
+
+   if (!VALID_EVTCHN(evtchn))
+   return 1;
+
+   masked = sync_test_and_set_bit(evtchn, s->evtchn_mask);
+   sync_set_bit(evtchn, s->evtchn_pending);
+   if (!masked)
+   unmask_evtchn(evtchn);
+
+   return 1;
+}
+
 static void enable_dynirq(unsigned int irq)
 {
int evtchn = evtchn_from_irq(irq);
diff --git a/include/asm-x86/xen/hypervisor.h b/include/asm-x86/xen/hypervisor.h
index 8e15dd2..138ee8a 100644
--- a/include/asm-x86/xen/hypervisor.h
+++ b/include/asm-x86/xen/hypervisor.h
@@ -61,6 +61,13 @@ extern struct start_info *xen_start_info;
 /* Force a proper event-channel callback from Xen. */
 extern void force_evtchn_callback(void);
 
+/* macro to avoid header inclusion dependency hell */
+#define xen_do_IRQ(irq, regs)  \
+   do {\
+   (regs)->orig_ax = ~(irq);   \
+   do_IRQ(regs);   \
+   } while (0)
+
 /* Turn jiffies into Xen system time. */
 u64 jiffies_to_st(unsigned long jiffies);
 
diff --git a/include/xen/events.h b/include/xen/events.h
index 2bde54d..574cfa4 100644
--- a/include/xen/events.h
+++ b/include/xen/events.h
@@ -37,6 +37,7 @@ int bind_ipi_to_irqhandler(enum ipi_vector ipi,
 void 

[PATCH 05/11] xen: move features.c from arch/x86/xen/features.c to drivers/xen.

2008-02-21 Thread Isaku Yamahata
ia64/xen uses it too, so move it into a common place.

Signed-off-by: Isaku Yamahata <[EMAIL PROTECTED]>
---
 arch/x86/xen/Makefile                |    2 +-
 drivers/xen/Makefile                 |    2 +-
 {arch/x86 => drivers}/xen/features.c |    0 
 3 files changed, 2 insertions(+), 2 deletions(-)
 rename {arch/x86 => drivers}/xen/features.c (100%)

diff --git a/arch/x86/xen/Makefile b/arch/x86/xen/Makefile
index 343df24..c5e9aa4 100644
--- a/arch/x86/xen/Makefile
+++ b/arch/x86/xen/Makefile
@@ -1,4 +1,4 @@
-obj-y  := enlighten.o setup.o features.o multicalls.o mmu.o \
+obj-y  := enlighten.o setup.o multicalls.o mmu.o \
events.o time.o manage.o xen-asm.o
 
 obj-$(CONFIG_SMP)  += smp.o
diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
index 56592f0..609fdda 100644
--- a/drivers/xen/Makefile
+++ b/drivers/xen/Makefile
@@ -1,2 +1,2 @@
-obj-y  += grant-table.o
+obj-y  += grant-table.o features.o
 obj-y  += xenbus/
diff --git a/arch/x86/xen/features.c b/drivers/xen/features.c
similarity index 100%
rename from arch/x86/xen/features.c
rename to drivers/xen/features.c
-- 
1.5.3



[PATCH 03/11] xen: add missing definitions for xen grant table which ia64/xen needs.

2008-02-21 Thread Isaku Yamahata

Signed-off-by: Isaku Yamahata <[EMAIL PROTECTED]>
---
 drivers/xen/grant-table.c           |    2 +-
 include/asm-x86/xen/interface.h     |   24 
 include/xen/interface/grant_table.h |   11 ---
 3 files changed, 33 insertions(+), 4 deletions(-)

diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
index ea94dba..95016fd 100644
--- a/drivers/xen/grant-table.c
+++ b/drivers/xen/grant-table.c
@@ -466,7 +466,7 @@ static int gnttab_map(unsigned int start_idx, unsigned int end_idx)
 
setup.dom= DOMID_SELF;
setup.nr_frames  = nr_gframes;
-   setup.frame_list = frames;
+   set_xen_guest_handle(setup.frame_list, frames);
 
rc = HYPERVISOR_grant_table_op(GNTTABOP_setup_table, &setup, 1);
if (rc == -ENOSYS) {
diff --git a/include/asm-x86/xen/interface.h b/include/asm-x86/xen/interface.h
index 165c396..49993dd 100644
--- a/include/asm-x86/xen/interface.h
+++ b/include/asm-x86/xen/interface.h
@@ -22,6 +22,30 @@
 #define DEFINE_GUEST_HANDLE(name) __DEFINE_GUEST_HANDLE(name, name)
 #define GUEST_HANDLE(name)        __guest_handle_ ## name
 
+#ifdef __XEN__
+#if defined(__i386__)
+#define set_xen_guest_handle(hnd, val) \
+   do {\
+   if (sizeof(hnd) == 8)   \
+   *(uint64_t *)&(hnd) = 0;\
+   (hnd).p = val;  \
+   } while (0)
+#elif defined(__x86_64__)
+#define set_xen_guest_handle(hnd, val) do { (hnd).p = val; } while (0)
+#endif
+#else
+#if defined(__i386__)
+#define set_xen_guest_handle(hnd, val) \
+   do {\
+   if (sizeof(hnd) == 8)   \
+   *(uint64_t *)&(hnd) = 0;\
+   (hnd) = val;\
+   } while (0)
+#elif defined(__x86_64__)
+#define set_xen_guest_handle(hnd, val) do { (hnd) = val; } while (0)
+#endif
+#endif
+
 #ifndef __ASSEMBLY__
 /* Guest handles for primitive C types. */
 __DEFINE_GUEST_HANDLE(uchar, unsigned char);
diff --git a/include/xen/interface/grant_table.h b/include/xen/interface/grant_table.h
index 2190498..39da93c 100644
--- a/include/xen/interface/grant_table.h
+++ b/include/xen/interface/grant_table.h
@@ -185,6 +185,7 @@ struct gnttab_map_grant_ref {
 grant_handle_t handle;
 uint64_t dev_bus_addr;
 };
+DEFINE_GUEST_HANDLE_STRUCT(gnttab_map_grant_ref);
 
 /*
  * GNTTABOP_unmap_grant_ref: Destroy one or more grant-reference mappings
@@ -206,6 +207,7 @@ struct gnttab_unmap_grant_ref {
 /* OUT parameters. */
 int16_t  status;  /* GNTST_* */
 };
+DEFINE_GUEST_HANDLE_STRUCT(gnttab_unmap_grant_ref);
 
 /*
  * GNTTABOP_setup_table: Set up a grant table for <dom> comprising at least
@@ -223,8 +225,9 @@ struct gnttab_setup_table {
 uint32_t nr_frames;
 /* OUT parameters. */
 int16_t  status;  /* GNTST_* */
-ulong *frame_list;
+GUEST_HANDLE(ulong) frame_list;
 };
+DEFINE_GUEST_HANDLE_STRUCT(gnttab_setup_table);
 
 /*
  * GNTTABOP_dump_table: Dump the contents of the grant table to the
@@ -237,6 +240,7 @@ struct gnttab_dump_table {
 /* OUT parameters. */
 int16_t status;   /* GNTST_* */
 };
+DEFINE_GUEST_HANDLE_STRUCT(gnttab_dump_table);
 
 /*
  * GNTTABOP_transfer_grant_ref: Transfer <frame> to a foreign domain. The
@@ -255,7 +259,7 @@ struct gnttab_transfer {
 /* OUT parameters. */
 int16_t   status;
 };
-
+DEFINE_GUEST_HANDLE_STRUCT(gnttab_transfer);
 
 /*
  * GNTTABOP_copy: Hypervisor based copy
@@ -296,6 +300,7 @@ struct gnttab_copy {
/* OUT parameters. */
int16_t   status;
 };
+DEFINE_GUEST_HANDLE_STRUCT(gnttab_copy);
 
 /*
  * GNTTABOP_query_size: Query the current and maximum sizes of the shared
@@ -313,7 +318,7 @@ struct gnttab_query_size {
 uint32_t max_nr_frames;
 int16_t  status;  /* GNTST_* */
 };
-
+DEFINE_GUEST_HANDLE_STRUCT(gnttab_query_size);
 
 /*
  * Bitfield values for update_pin_status.flags.
-- 
1.5.3



[PATCH 02/11] xen: add missing VIRQ_ARCH_[0-7] definitions which ia64/xen needs.

2008-02-21 Thread Isaku Yamahata

Signed-off-by: Isaku Yamahata <[EMAIL PROTECTED]>
---
 include/xen/interface/xen.h |   12 +++-
 1 files changed, 11 insertions(+), 1 deletions(-)

diff --git a/include/xen/interface/xen.h b/include/xen/interface/xen.h
index 87ad143..9b018da 100644
--- a/include/xen/interface/xen.h
+++ b/include/xen/interface/xen.h
@@ -78,8 +78,18 @@
 #define VIRQ_CONSOLE    2  /* (DOM0) Bytes received on emergency console. */
 #define VIRQ_DOM_EXC    3  /* (DOM0) Exceptional event for some domain.   */
 #define VIRQ_DEBUGGER   6  /* (DOM0) A domain has paused for debugging.   */
-#define NR_VIRQS        8
 
+/* Architecture-specific VIRQ definitions. */
+#define VIRQ_ARCH_0    16
+#define VIRQ_ARCH_1    17
+#define VIRQ_ARCH_2    18
+#define VIRQ_ARCH_3    19
+#define VIRQ_ARCH_4    20
+#define VIRQ_ARCH_5    21
+#define VIRQ_ARCH_6    22
+#define VIRQ_ARCH_7    23
+
+#define NR_VIRQS       24
 /*
  * MMU-UPDATE REQUESTS
  *
-- 
1.5.3



[PATCH 04/11] xen: add missing definitions in include/xen/interface/vcpu.h which ia64/xen needs

2008-02-21 Thread Isaku Yamahata

Signed-off-by: Isaku Yamahata <[EMAIL PROTECTED]>
---
 include/xen/interface/vcpu.h |5 +
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/include/xen/interface/vcpu.h b/include/xen/interface/vcpu.h
index b05d8a6..87e6f8a 100644
--- a/include/xen/interface/vcpu.h
+++ b/include/xen/interface/vcpu.h
@@ -85,6 +85,7 @@ struct vcpu_runstate_info {
 */
uint64_t time[4];
 };
+DEFINE_GUEST_HANDLE_STRUCT(vcpu_runstate_info);
 
 /* VCPU is currently running on a physical CPU. */
 #define RUNSTATE_running  0
@@ -119,6 +120,7 @@ struct vcpu_runstate_info {
 #define VCPUOP_register_runstate_memory_area 5
 struct vcpu_register_runstate_memory_area {
union {
+   GUEST_HANDLE(vcpu_runstate_info) h;
struct vcpu_runstate_info *v;
uint64_t p;
} addr;
@@ -134,6 +136,7 @@ struct vcpu_register_runstate_memory_area {
 struct vcpu_set_periodic_timer {
uint64_t period_ns;
 };
+DEFINE_GUEST_HANDLE_STRUCT(vcpu_set_periodic_timer);
 
 /*
  * Set or stop a VCPU's single-shot timer. Every VCPU has one single-shot
@@ -145,6 +148,7 @@ struct vcpu_set_singleshot_timer {
uint64_t timeout_abs_ns;
uint32_t flags;/* VCPU_SSHOTTMR_??? */
 };
+DEFINE_GUEST_HANDLE_STRUCT(vcpu_set_singleshot_timer);
 
 /* Flags to VCPUOP_set_singleshot_timer. */
  /* Require the timeout to be in the future (return -ETIME if it's passed). */
@@ -164,5 +168,6 @@ struct vcpu_register_vcpu_info {
 uint32_t offset; /* offset within page */
 uint32_t rsvd;   /* unused */
 };
+DEFINE_GUEST_HANDLE_STRUCT(vcpu_register_vcpu_info);
 
 #endif /* __XEN_PUBLIC_VCPU_H__ */
-- 
1.5.3



[PATCH 01/11] xen: add missing __HYPERVISOR_arch_[0-7] definitions which ia64 needs.

2008-02-21 Thread Isaku Yamahata
Signed-off-by: Isaku Yamahata <[EMAIL PROTECTED]>
---
 include/xen/interface/xen.h |   10 ++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/include/xen/interface/xen.h b/include/xen/interface/xen.h
index 518a5bf..87ad143 100644
--- a/include/xen/interface/xen.h
+++ b/include/xen/interface/xen.h
@@ -58,6 +58,16 @@
 #define __HYPERVISOR_physdev_op   33
 #define __HYPERVISOR_hvm_op   34
 
+/* Architecture-specific hypercall definitions. */
+#define __HYPERVISOR_arch_0   48
+#define __HYPERVISOR_arch_1   49
+#define __HYPERVISOR_arch_2   50
+#define __HYPERVISOR_arch_3   51
+#define __HYPERVISOR_arch_4   52
+#define __HYPERVISOR_arch_5   53
+#define __HYPERVISOR_arch_6   54
+#define __HYPERVISOR_arch_7   55
+
 /*
  * VIRTUAL INTERRUPTS
  *
-- 
1.5.3



[PATCH 00/11] Xen arch portability patches

2008-02-21 Thread Isaku Yamahata
Hi. Recently the Xen-IA64 community started working to merge
xen/ia64 Linux upstream. The first step is to merge the domU portion.
This patchset is preparation for xen/ia64 Linux: it makes the current
xen/x86 domU code more arch generic and adds missing definitions and
files.

Diffstat:
 arch/x86/xen/Makefile                |    4 +-
 arch/x86/xen/grant-table.c           |   91 +
 drivers/xen/Makefile                 |    3 +-
 {arch/x86 => drivers}/xen/events.c   |   34 --
 {arch/x86 => drivers}/xen/features.c |    0 
 drivers/xen/grant-table.c            |   37 +-
 drivers/xen/xenbus/xenbus_client.c   |    6 +-
 drivers/xen/xencomm.c                |  232 ++
 include/asm-x86/xen/hypervisor.h     |   10 ++
 include/asm-x86/xen/interface.h      |   24 
 include/{ => asm-x86}/xen/page.h     |    0 
 include/xen/events.h                 |    1 +
 include/xen/grant_table.h            |    6 +
 include/xen/interface/callback.h     |  119 +
 include/xen/interface/grant_table.h  |   11 ++-
 include/xen/interface/vcpu.h         |    5 +
 include/xen/interface/xen.h          |   22 +++-
 include/xen/interface/xencomm.h      |   41 ++
 include/xen/page.h                   |  181 +--
 include/xen/xencomm.h                |   77 +++
 20 files changed, 673 insertions(+), 231 deletions(-)


Re: [PATCH 09/37] Security: Allow kernel services to override LSM settings for task actions

2008-02-21 Thread Casey Schaufler

--- David Howells <[EMAIL PROTECTED]> wrote:

> Allow kernel services to override LSM settings appropriate to the actions
> performed by a task by duplicating a security record, modifying it and then
> using task_struct::act_as to point to it when performing operations on behalf
> of a task.
> 
> This is used, for example, by CacheFiles which has to transparently access
> the cache on behalf of a process that thinks it is doing, say, NFS accesses
> with a potentially inappropriate (with respect to accessing the cache) set of
> security data.
> 
> This patch provides two LSM hooks for modifying a task security record:
> 
>  (*) security_kernel_act_as() which allows modification of the security datum
>  with which a task acts on other objects (most notably files).
> 
>  (*) security_create_files_as() which allows modification of the security
>  datum that is used to initialise the security data on a file that a task
>  creates.
> 
> ...

> --- a/security/smack/smack_lsm.c
> +++ b/security/smack/smack_lsm.c
> @@ -976,6 +976,36 @@ static int smack_task_dup_security(struct task_security *sec)
>  }
>  
>  /**
> + * smack_task_kernel_act_as - Set the subjective context in a security record
> + * @p points to the task that nominated @secid.
> + * @sec points to the task security record to be modified.
> + * @secid specifies the security ID to be set
> + *
> + * Set the security data for a kernel service.
> + */
> +static int smack_task_kernel_act_as(struct task_struct *p,
> + struct task_security *sec, u32 secid)
> +{
> + return -ENOTSUPP;
> +}
> +
> +/**
> + * smack_task_create_files_as - Set the file creation label in a security record
> + * @p points to the task that nominated @inode.
> + * @sec points to the task security record to be modified.
> + * @inode points to the inode to use as a reference.
> + *
> + * Set the file creation context in a security record to the same as the
> + * objective context of the specified inode
> + */
> +static int smack_task_create_files_as(struct task_struct *p,
> +   struct task_security *sec,
> +   struct inode *inode)
> +{
> + return -ENOTSUPP;
> +}

Hum. ENOTSUPP is not very satisfying, is it? I will have to
think on this a bit.

> +
> +/**
>   * smack_task_setpgid - Smack check on setting pgid
>   * @p: the task object
>   * @pgid: unused
> @@ -2444,6 +2474,8 @@ static struct security_operations smack_ops = {
>   .task_alloc_security =  smack_task_alloc_security,
>   .task_free_security =   smack_task_free_security,
>   .task_dup_security =smack_task_dup_security,
> + .task_kernel_act_as =   smack_task_kernel_act_as,
> + .task_create_files_as = smack_task_create_files_as,
>   .task_post_setuid = cap_task_post_setuid,
>   .task_setpgid = smack_task_setpgid,
>   .task_getpgid = smack_task_getpgid,

Except for the fact that the hooks don't do anything, this
looks fine. I'm not sure that I would want these hooks to
do anything; it requires additional thought to determine if
there is a good behavior for them.

Thank you.


Casey Schaufler
[EMAIL PROTECTED]


Re: [GIT PULL?] Create and populate toplevel tests/ for kernel tests

2008-02-21 Thread Ananth N Mavinakayanahalli
On Fri, Feb 22, 2008 at 03:57:13PM +1100, Stephen Rothwell wrote:
> Hi Ananth,
> 
> On Fri, 22 Feb 2008 09:12:31 +0530 Ananth N Mavinakayanahalli <[EMAIL PROTECTED]> wrote:
> >
> > The patchset in question is just a major code movement - basically to
> > move all in-kernel tests to live under a toplevel tests/ directory. As
> > such, all the stakeholders have acked the patchset, but it does look
> > like this is a big enough change to be deferred to the next merge
> > window.
> > 
> > Given that there is general agreement about the patchset, could you
> > please pull in the changes into the linux-next tree?
> > 
> > Sam has setup a git tree for this and you can pull from:
> > ssh://master.kernel.org/pub/scm/linux/kernel/git/sam/tests.git
> >  
> > Link to the thread: http://lkml.org/lkml/2008/2/11/97
> 
> I will include this in the next linux-tree.  It looks like it should not
> cause too many problems (it merges OK on top of the about to be announced
> next-20080222), but if I get a hard to resolve merge problem with it, I
> will drop it first, OK.
> 
> I have noted you as the contact.

Sure! Thanks Stephen.

Ananth


Re: [GIT PULL] XFS update for 2.6.25-rc3

2008-02-21 Thread Jeff Garzik

Lachlan McIlroy wrote:

Remove empty file fs/xfs/Makefile-linux-2.6.


Already in the upstream kernel...



commit 1803f3389b7ac9ed33ea561b3b94e22e2864a95d
Author: Linus Torvalds <[EMAIL PROTECTED]>
Date:   Wed Feb 20 19:55:09 2008 -0800

Remove empty file remnants that were left in the tree by mistake

Noted by various people (Sam, Jeff, Roland..)

Commit 58b7983d15a422d9616bdc4e245d5c31dfaefbe2 intended to remove the
xfs "Makefile-linux-2.6" file, but it was mistakenly still left in the
tree as a empty file, and would cause git to correctly complain about a
tracked file being removed after a "make distclean" (which removes empty
files as garbage).

And the asm-x86/desc_64.h file was supposed to be removed by commit
c81c6ca45a69478c7877b729af1942d2b80ef582, but instead stayed around
containing just a single newline.

Get rid of them both properly.

Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>



Re: [GIT PULL?] Create and populate toplevel tests/ for kernel tests

2008-02-21 Thread Stephen Rothwell
Hi Ananth,

On Fri, 22 Feb 2008 09:12:31 +0530 Ananth N Mavinakayanahalli <[EMAIL PROTECTED]> wrote:
>
> The patchset in question is just a major code movement - basically to
> move all in-kernel tests to live under a toplevel tests/ directory. As
> such, all the stakeholders have acked the patchset, but it does look
> like this is a big enough change to be deferred to the next merge
> window.
> 
> Given that there is general agreement about the patchset, could you
> please pull in the changes into the linux-next tree?
> 
> Sam has setup a git tree for this and you can pull from:
> ssh://master.kernel.org/pub/scm/linux/kernel/git/sam/tests.git
>  
> Link to the thread: http://lkml.org/lkml/2008/2/11/97

I will include this in the next linux-next tree.  It looks like it should not
cause too many problems (it merges OK on top of the about-to-be-announced
next-20080222), but if I get a hard-to-resolve merge problem with it, I
will drop it first, OK?

I have noted you as the contact.
-- 
Cheers,
Stephen Rothwell <[EMAIL PROTECTED]>




Re: [PATCH 07/37] Security: De-embed task security record from task and use refcounting

2008-02-21 Thread Casey Schaufler

--- David Howells <[EMAIL PROTECTED]> wrote:

> Remove the temporarily embedded task security record from task_struct.
> Instead it is made to dangle from the task_struct::sec and
> task_struct::act_as pointers with references counted for each.
> 
> ...
> 
> The LSM hooks for dealing with task security are modified to deal with the
> task security struct directly rather than going via the task_struct as
> appropriate.
> 
> ...

> diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c
> index a49d94f..dbce607 100644
> --- a/security/smack/smack_lsm.c
> +++ b/security/smack/smack_lsm.c
> @@ -957,9 +957,22 @@ static int smack_task_alloc_security(struct task_struct
> *tsk)
>   * points to an immutable list. The blobs never go away.
>   * There is no leak here.
>   */
> -static void smack_task_free_security(struct task_struct *task)
> +static void smack_task_free_security(struct task_security *sec)
>  {
> - task->sec->security = NULL;
> + sec->security = NULL;
> +}
> +
> +/**
> + * task_dup_security - Duplicate task security
> + * @p points to the task_security struct that has been copied
> + *
> + * Duplicate the security structure currently attached to the p->security
> field
> + * and attach back to p->security (the pointer itself was copied, so there's
> + * nothing to be done here).
> + */
> +static int smack_task_dup_security(struct task_security *sec)
> +{
> + return 0;
>  }

Thank you for adding this hook. The comment is helpful.
  
>  /**
> @@ -2276,17 +2289,17 @@ static int smack_inet_conn_request(struct sock *sk,
> struct sk_buff *skb,
>  /**
>   * smack_key_alloc - Set the key security blob
>   * @key: object
> - * @tsk: the task associated with the key
> + * @context: the task security associated with the key
>   * @flags: unused
>   *
>   * No allocation required
>   *
>   * Returns 0
>   */
> -static int smack_key_alloc(struct key *key, struct task_struct *tsk,
> +static int smack_key_alloc(struct key *key, struct task_security *context,
>  unsigned long flags)
>  {
> - key->security = tsk->act_as->security;
> + key->security = context->security;
>   return 0;
>  }
>  
> @@ -2304,14 +2317,14 @@ static void smack_key_free(struct key *key)
>  /*
>   * smack_key_permission - Smack access on a key
>   * @key_ref: gets to the object
> - * @context: task involved
> + * @context: task security involved
>   * @perm: unused
>   *
>   * Return 0 if the task has read and write to the object,
>   * an error code otherwise
>   */
>  static int smack_key_permission(key_ref_t key_ref,
> - struct task_struct *context, key_perm_t perm)
> + struct task_security *context, key_perm_t perm)
>  {
>   struct key *keyp;
>  
> @@ -2327,10 +2340,10 @@ static int smack_key_permission(key_ref_t key_ref,
>   /*
>* This should not occur
>*/
> - if (context->act_as->security == NULL)
> + if (context->security == NULL)
>   return -EACCES;
>  
> - return smk_access(context->act_as->security, keyp->security,
> + return smk_access(context->security, keyp->security,
> MAY_READWRITE);
>  }
>  #endif /* CONFIG_KEYS */
> @@ -2430,6 +2443,7 @@ static struct security_operations smack_ops = {
>  
>   .task_alloc_security =  smack_task_alloc_security,
>   .task_free_security =   smack_task_free_security,
> + .task_dup_security =smack_task_dup_security,
>   .task_post_setuid = cap_task_post_setuid,
>   .task_setpgid = smack_task_setpgid,
>   .task_getpgid = smack_task_getpgid,

No objections from the Smack side. Thank you.


Casey Schaufler
[EMAIL PROTECTED]


Re: [PATCH 06/37] Security: Separate task security context from task_struct

2008-02-21 Thread Casey Schaufler

--- David Howells <[EMAIL PROTECTED]> wrote:

> Separate the task security context from task_struct.  At this point, the
> security data is temporarily embedded in the task_struct with two pointers
> pointing to it.

> ...

> diff --git a/security/smack/smack_access.c b/security/smack/smack_access.c
> index f6b5f6e..722752f 100644
> --- a/security/smack/smack_access.c
> +++ b/security/smack/smack_access.c
> @@ -164,7 +164,7 @@ int smk_curacc(char *obj_label, u32 mode)
>  {
>   int rc;
>  
> - rc = smk_access(current->security, obj_label, mode);
> + rc = smk_access(current->act_as->security, obj_label, mode);
>   if (rc == 0)
>   return 0;
>  
> diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c
> index 25cbfa3..a49d94f 100644
> --- a/security/smack/smack_lsm.c
> +++ b/security/smack/smack_lsm.c
> @@ -102,7 +102,8 @@ static int smack_ptrace(struct task_struct *ptp, struct
> task_struct *ctp)
>   if (rc != 0)
>   return rc;
>  
> - rc = smk_access(ptp->security, ctp->security, MAY_READWRITE);
> + rc = smk_access(ptp->act_as->security, ctp->sec->security,
> + MAY_READWRITE);
>   if (rc != 0 && __capable(ptp, CAP_MAC_OVERRIDE))
>   return 0;
>  
> @@ -120,7 +121,7 @@ static int smack_ptrace(struct task_struct *ptp, struct
> task_struct *ctp)
>  static int smack_syslog(int type)
>  {
>   int rc;
> - char *sp = current->security;
> + char *sp = current->act_as->security;
>  
>   rc = cap_syslog(type);
>   if (rc != 0)
> @@ -359,7 +360,7 @@ static int smack_sb_umount(struct vfsmount *mnt, int
> flags)
>   */
>  static int smack_inode_alloc_security(struct inode *inode)
>  {
> - inode->i_security = new_inode_smack(current->security);
> + inode->i_security = new_inode_smack(current->act_as->security);
>   if (inode->i_security == NULL)
>   return -ENOMEM;
>   return 0;
> @@ -777,7 +778,7 @@ static int smack_file_permission(struct file *file, int
> mask)
>   */
>  static int smack_file_alloc_security(struct file *file)
>  {
> - file->f_security = current->security;
> + file->f_security = current->act_as->security;
>   return 0;
>  }
>  
> @@ -875,7 +876,7 @@ static int smack_file_fcntl(struct file *file, unsigned
> int cmd,
>   */
>  static int smack_file_set_fowner(struct file *file)
>  {
> - file->f_security = current->security;
> + file->f_security = current->act_as->security;
>   return 0;
>  }
>  
> @@ -900,7 +901,7 @@ static int smack_file_send_sigiotask(struct task_struct
> *tsk,
>* struct fown_struct is never outside the context of a struct file
>*/
>   file = container_of(fown, struct file, f_owner);
> - rc = smk_access(file->f_security, tsk->security, MAY_WRITE);
> + rc = smk_access(file->f_security, tsk->sec->security, MAY_WRITE);
>   if (rc != 0 && __capable(tsk, CAP_MAC_OVERRIDE))
>   return 0;
>   return rc;
> @@ -943,7 +944,7 @@ static int smack_file_receive(struct file *file)
>   */
>  static int smack_task_alloc_security(struct task_struct *tsk)
>  {
> - tsk->security = current->security;
> + tsk->sec->security = current->act_as->security;
>  
>   return 0;
>  }
> @@ -958,7 +959,7 @@ static int smack_task_alloc_security(struct task_struct
> *tsk)
>   */
>  static void smack_task_free_security(struct task_struct *task)
>  {
> - task->security = NULL;
> + task->sec->security = NULL;
>  }
>  
>  /**
> @@ -970,7 +971,7 @@ static void smack_task_free_security(struct task_struct
> *task)
>   */
>  static int smack_task_setpgid(struct task_struct *p, pid_t pgid)
>  {
> - return smk_curacc(p->security, MAY_WRITE);
> + return smk_curacc(p->sec->security, MAY_WRITE);
>  }
>  
>  /**
> @@ -981,7 +982,7 @@ static int smack_task_setpgid(struct task_struct *p,
> pid_t pgid)
>   */
>  static int smack_task_getpgid(struct task_struct *p)
>  {
> - return smk_curacc(p->security, MAY_READ);
> + return smk_curacc(p->sec->security, MAY_READ);
>  }
>  
>  /**
> @@ -992,7 +993,7 @@ static int smack_task_getpgid(struct task_struct *p)
>   */
>  static int smack_task_getsid(struct task_struct *p)
>  {
> - return smk_curacc(p->security, MAY_READ);
> + return smk_curacc(p->sec->security, MAY_READ);
>  }
>  
>  /**
> @@ -1004,7 +1005,7 @@ static int smack_task_getsid(struct task_struct *p)
>   */
>  static void smack_task_getsecid(struct task_struct *p, u32 *secid)
>  {
> - *secid = smack_to_secid(p->security);
> + *secid = smack_to_secid(p->sec->security);
>  }
>  
>  /**
> @@ -1016,7 +1017,7 @@ static void smack_task_getsecid(struct task_struct *p,
> u32 *secid)
>   */
>  static int smack_task_setnice(struct task_struct *p, int nice)
>  {
> - return smk_curacc(p->security, MAY_WRITE);
> + return smk_curacc(p->sec->security, MAY_WRITE);
>  }
>  
>  /**
> @@ -1028,7 +1029,7 @@ static int smack_task_setnice(struct task_struct *p,
> int nice)
>   

Re: [PATCH] Document huge memory/cache overhead of memory controller in Kconfig

2008-02-21 Thread Balbir Singh
Andi Kleen wrote:
>> 1. We could create something similar to mem_map, we would need to handle 4
> 
> 4? At least x86 mainline only has two ways now. flatmem and vmemmap.
> 
>> different ways of creating mem_map.
> 
> Well it would be only a single way to create the "aux memory controller
> map" (or however it will be called). Basically just a call to single
> function from a few different places.
> 
>> 2. On x86 with 64 GB ram, 
> 
> First i386 with 64GB just doesn't work, at least not with default 3:1
> split. Just calculate it yourself how much of the lowmem area is left
> after the 64GB mem_map is allocated. Typical rule of thumb is that 16GB
> is the realistic limit for 32bit x86 kernels. Worrying about
> anything more does not make much sense.
> 

I understand what you say Andi, but nothing in the kernel stops us from
supporting 64GB. Should a framework like memory controller make an assumption
that not more than 16GB will be configured on an x86 box?

>> if we decided to use vmalloc space, we would need 64
>> MB of vmalloc'ed memory
> 
> Yes and if you increase mem_map you need exactly the same space
> in lowmem too. So increasing the vmalloc reservation for this is
> equivalent. Just make sure you use highmem backed vmalloc.
> 

I see two problems with using vmalloc. One, the reservation needs to be done
across architectures. Two, a big vmalloc chunk is not node aware, if all the
pages come from the same node, we have a penalty to pay in a NUMA system.
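
The 64 MB figure quoted above can be checked with a little arithmetic. The following is only a userspace sketch, assuming 4 KiB pages and one 4-byte pointer per page frame on 32-bit x86; neither value is stated in this thread, and the function names are hypothetical.

```c
#include <stdint.h>

/* Number of page frames covering ram_bytes; page_size must be non-zero. */
uint64_t nr_page_frames(uint64_t ram_bytes, uint64_t page_size)
{
    return ram_bytes / page_size;
}

/*
 * Size of a side array that stores per_page_bytes for every page frame.
 * per_page_bytes = 4 models one 32-bit pointer per page (an assumption).
 */
uint64_t aux_map_bytes(uint64_t ram_bytes, uint64_t page_size,
                       uint64_t per_page_bytes)
{
    return nr_page_frames(ram_bytes, page_size) * per_page_bytes;
}
```

With these assumptions, aux_map_bytes(64ULL << 30, 4096, 4) comes to 64 MiB, which matches the vmalloc figure above. If struct page were 32 bytes (again an assumption, since its size depends on the configuration), the same call with 32 gives 512 MiB for mem_map itself, which is why lowmem gets tight under the default 3:1 split.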

-- 
Warm Regards,
Balbir Singh
Linux Technology Center
IBM, ISTL


Re: [GIT PULL] XFS update for 2.6.25-rc3

2008-02-21 Thread Lachlan McIlroy

please ignore this request - sent the wrong file, see next pull request...

Lachlan McIlroy wrote:

Please pull from the for-linus branch:
git pull git://oss.sgi.com:8090/xfs/xfs-2.6.git for-linus

This will update the following files:

 fs/xfs/Kbuild |6 --
 fs/xfs/Makefile   |  118 -
 fs/xfs/Makefile-linux-2.6 |  117 
 3 files changed, 117 insertions(+), 124 deletions(-)
 delete mode 100644 fs/xfs/Kbuild

through these commits:

commit 269cdfaf769f5cd831284cc831790c7c5038040f
Author: Lachlan McIlroy <[EMAIL PROTECTED]>
Date:   Wed Nov 28 18:28:09 2007 +1100

[XFS] Added quota targets and removed dmapi directory

Fixes build failures introduced by bad merge to mainline.


commit 794f744b225aaf35742aac9e7b9dda96a9943413
Author: Eric Sandeen <[EMAIL PROTECTED]>
Date:   Tue Nov 27 16:59:56 2007 +1100

[XFS] Fix up xfs out-of-tree builds. (a.k.a. external modules)

Change -I include directives to find headers in the out-of-tree spot. This
allows a directory containing only xfs files to be built as:

SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:29878a

Signed-off-by: Eric Sandeen <[EMAIL PROTECTED]>
Signed-off-by: Donald Douwsma <[EMAIL PROTECTED]>
Signed-off-by: Lachlan McIlroy <[EMAIL PROTECTED]>

commit 58b7983d15a422d9616bdc4e245d5c31dfaefbe2
Author: Andi Kleen <[EMAIL PROTECTED]>
Date:   Tue Nov 27 16:53:47 2007 +1100

[XFS] Remove Makefile wrappers in XFS

Makefile (and Kbuild) would include Makefile-linux-26 I doubt XFS will
really still compile on 2.4; so drop that. This moves Makefile-linux-26
into Makefile and drops Kbuild. Also having wrappers as both Kbuild and
Makefile seemed redundant anyways.

The patch is relatively large because it renames a file, but no functional
changes.

SGI-PV: 971050
SGI-Modid: xfs-linux-melb:xfs-kern:29781a

Signed-off-by: Andi Kleen <[EMAIL PROTECTED]>
Signed-off-by: Donald Douwsma <[EMAIL PROTECTED]>
Signed-off-by: Tim Shimmin <[EMAIL PROTECTED]>
Signed-off-by: Lachlan McIlroy <[EMAIL PROTECTED]>


how to setup struct bio

2008-02-21 Thread zzCherring

Hi,
I am having a problem reading the first several blocks from a disk (/dev/sda1,
for example). I am not sure whether it is because I am not correctly creating
the struct bio, filling in its fields and sending it down to the block device,
or because there are certain rules which prevent the first couple of blocks of
the disk from being read, which sounds ridiculous.

The following is how I am doing it now:

Thanks.


// used in completion routine, trying to sync back to the calling thread.
struct __io_complete_d
{
wait_queue_head_t whead;

unsigned int err;
unsigned int len;
unsigned int flags;
};

// completion routine
int io_complete_r( struct bio *b, unsigned int n, int err )
{
struct __io_complete_d *icd = 0;

if( b != 0 )
{
if( b->bi_private )
{
icd = ( struct __io_complete_d *)b->bi_private;
icd->err = err;
icd->len = n;
icd->flags = 1;
wake_up(&icd->whead);
}

if( b->bi_destructor ) b->bi_destructor(b);
}

return err;
}

// alloc page, send bio down to block device
struct page * read_from_disk(
struct block_device *bdev,
void *d,
bio_end_io_t *io_complete,
sector_t sect_off,
unsigned int od
)
{
struct bio *b = 0;
struct page *pg = 0;

if( bdev == 0 ) return 0;

// having tried alloc_pages(as it is a macro), but failed somehow
pg = __alloc_pages(GFP_KERNEL, od, contig_page_data.node_zonelists +
(GFP_KERNEL & GFP_ZONEMASK));

if( pg )
{
b = bio_alloc(GFP_KERNEL, 1);
if( b == 0)
{
__free_pages( pg, od);
pg = 0;

return 0;
}

b->bi_sector = sect_off;
b->bi_next = 0;
b->bi_bdev = bdev;
b->bi_end_io = io_complete;

b->bi_private = d;

bio_add_page( b, pg, PAGE_SIZE << od, 0 );

submit_bio(0, b);

}

return pg;
}

// example using read_from_disk
void example(
struct block_device *bdev,
unsigned int nblocks
)
{
struct __io_complete_d *icd = 0;
struct page *pg = 0;

if( bdev == 0 )
{
return 0;
}

icd = kmalloc( GFP_KERNEL, sizeof(struct __io_complete_d));
if(icd == 0)
{
goto __end;
}

memset( icd, 0, sizeof(struct __io_complete_d));
icd->flags = 0;
init_waitqueue_head( &icd->whead);

pg = read_from_disk( bdev, icd, io_complete_r, nblock * BLOCK_SIZE
/SECT_SIZE, 0 );
if( pg == 0 ) goto __end;

wait_event( icd->whead, icd->flags != 0 );

if(icd->err == 0)
{
// After successfully reading from disk

}
//.

_end:
if(icd)
{
kfree(icd);
}
if(pg)
{
__free_pages(pg, 0);
}

}
-- 
View this message in context: 
http://www.nabble.com/how-to-setup-struct-bio-tp15627654p15627654.html
Sent from the linux-kernel mailing list archive at Nabble.com.



[GIT PULL] XFS update for 2.6.25-rc3

2008-02-21 Thread Lachlan McIlroy
Please pull from the for-linus branch:
git pull git://oss.sgi.com:8090/xfs/xfs-2.6.git for-linus

This will update the following files:

 0 files changed, 0 insertions(+), 0 deletions(-)
 delete mode 100644 fs/xfs/Makefile-linux-2.6

through these commits:

commit 6e5e93424dc66542c548dfaa3bfebe30d46d50dd
Author: Lachlan McIlroy <[EMAIL PROTECTED]>
Date:   Fri Feb 22 15:36:19 2008 +1100

Remove empty file fs/xfs/Makefile-linux-2.6.


[GIT PULL] XFS update for 2.6.25-rc3

2008-02-21 Thread Lachlan McIlroy
Please pull from the for-linus branch:
git pull git://oss.sgi.com:8090/xfs/xfs-2.6.git for-linus

This will update the following files:

 fs/xfs/Kbuild |6 --
 fs/xfs/Makefile   |  118 -
 fs/xfs/Makefile-linux-2.6 |  117 
 3 files changed, 117 insertions(+), 124 deletions(-)
 delete mode 100644 fs/xfs/Kbuild

through these commits:

commit 269cdfaf769f5cd831284cc831790c7c5038040f
Author: Lachlan McIlroy <[EMAIL PROTECTED]>
Date:   Wed Nov 28 18:28:09 2007 +1100

[XFS] Added quota targets and removed dmapi directory

Fixes build failures introduced by bad merge to mainline.

commit 794f744b225aaf35742aac9e7b9dda96a9943413
Author: Eric Sandeen <[EMAIL PROTECTED]>
Date:   Tue Nov 27 16:59:56 2007 +1100

[XFS] Fix up xfs out-of-tree builds. (a.k.a. external modules)

Change -I include directives to find headers in the out-of-tree spot. This
allows a directory containing only xfs files to be built as:

SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:29878a

Signed-off-by: Eric Sandeen <[EMAIL PROTECTED]>
Signed-off-by: Donald Douwsma <[EMAIL PROTECTED]>
Signed-off-by: Lachlan McIlroy <[EMAIL PROTECTED]>

commit 58b7983d15a422d9616bdc4e245d5c31dfaefbe2
Author: Andi Kleen <[EMAIL PROTECTED]>
Date:   Tue Nov 27 16:53:47 2007 +1100

[XFS] Remove Makefile wrappers in XFS

Makefile (and Kbuild) would include Makefile-linux-26 I doubt XFS will
really still compile on 2.4; so drop that. This moves Makefile-linux-26
into Makefile and drops Kbuild. Also having wrappers as both Kbuild and
Makefile seemed redundant anyways.

The patch is relatively large because it renames a file, but no functional
changes.

SGI-PV: 971050
SGI-Modid: xfs-linux-melb:xfs-kern:29781a

Signed-off-by: Andi Kleen <[EMAIL PROTECTED]>
Signed-off-by: Donald Douwsma <[EMAIL PROTECTED]>
Signed-off-by: Tim Shimmin <[EMAIL PROTECTED]>
Signed-off-by: Lachlan McIlroy <[EMAIL PROTECTED]>


Re: [bug] uml doesn't boot under 2.6.25-rc1 host (was Re: 2.6.24-mm1 bugs)

2008-02-21 Thread Roland McGrath
Thanks for the pointers, guys.  It took a while for me to figure out what
got wrong to foul up UML, but the bug and fix are trivial (posting now).
Some of the testing I thought had got done clearly wasn't done, since
PTRACE_SETREGS was 100% busticated for 32-bit processes calling ptrace on
x86_64 kernels.


Thanks,
Roland


[PATCH] x86 ptrace: fix compat PTRACE_SETREGS

2008-02-21 Thread Roland McGrath

Simple typo fix for regression introduced by the user_regset changes.

Signed-off-by: Roland McGrath <[EMAIL PROTECTED]>
---
 arch/x86/kernel/ptrace.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
index 702c33e..d862e39 100644
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -1160,7 +1160,7 @@ static int genregs32_set(struct task_struct *target,
if (kbuf) {
const compat_ulong_t *k = kbuf;
while (count > 0 && !ret) {
-   ret = putreg(target, pos, *k++);
+   ret = putreg32(target, pos, *k++);
count -= sizeof(*k);
pos += sizeof(*k);
}
@@ -1171,7 +1171,7 @@ static int genregs32_set(struct task_struct *target,
ret = __get_user(word, u++);
if (ret)
break;
-   ret = putreg(target, pos, word);
+   ret = putreg32(target, pos, word);
count -= sizeof(*u);
pos += sizeof(*u);
}


Re: [PATCH 2/2] ResCounter: Use read_uint in memory controller

2008-02-21 Thread Balbir Singh
[EMAIL PROTECTED] wrote:
> Update the memory controller to use read_uint for its
> limit/usage/failcnt control files, calling the new
> res_counter_read_uint() function.
> 
> Signed-off-by: Paul Menage <[EMAIL PROTECTED]>
> 

Hi, Paul,

Looks good, except for the name uint(), can we make it u64(). Integers are 32
bit on both ILP32 and LP64, but we really read/write 64 bit values.
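
To illustrate the naming concern, here is a small userspace sketch (not kernel code) of what happens when a 64-bit counter is squeezed through an unsigned int. The helper names are hypothetical, and it assumes unsigned int is 32 bits wide, as it is on both ILP32 and LP64.

```c
#include <stdint.h>

/* Narrow a 64-bit counter to unsigned int, as a "uint" name might suggest. */
unsigned int as_uint(uint64_t value)
{
    return (unsigned int)value;
}

/* Returns 1 if value survives the round trip through unsigned int. */
int fits_in_uint(uint64_t value)
{
    return (uint64_t)(unsigned int)value == value;
}
```

A usage value of 5 GiB (5ULL << 30) does not fit: as_uint() keeps only the low 32 bits and yields 1 GiB, which is the kind of confusion a u64 name avoids.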

-- 
Warm Regards,
Balbir Singh
Linux Technology Center
IBM, ISTL


Re: [RFC] [PATCH] Fix b43 driver build for arm

2008-02-21 Thread Gordon Farquharson
Hi Sam

On Wed, Feb 20, 2008 at 12:37 PM, Sam Ravnborg <[EMAIL PROTECTED]> wrote:

>  Option 1) is the worst of the three as that can cost
>  of many hours bug-hunting.
>  Option 3) may seem optimal but I do not like to add more
>  complexity to this part of the build. And really I do not
>  know a reliable way to detect when we do cross builds anyway.
>
>  Leaving us with option 2) that is simple, straightforward and harmless.

Are you willing to sign off on and commit the patch?

Gordon

-- 
Gordon Farquharson
GnuPG Key ID: 32D6D676


Re: modular intel-agp does not work on my box

2008-02-21 Thread Dave Jones
On Fri, Feb 22, 2008 at 04:27:33AM +0100, Gabriel C wrote:
 > Gabriel C wrote:
 > > Dave Airlie wrote:
 > >>> Hi,
 > >>>
 > >>> When building agp* modular ( CONFIG_AGP=y/m and CONFIG_AGP_INTEL=m ) 
 > >>> intel-agp does nothing on my box 
 > >>> ( Dell Precision WorkStation 530 MT ) chipset is not being detected.
 > >>>
 > >>> Building both Y fixes that and agpgart works and also detects my chipset.
 > >> Have you got EDAC modules built as well? they might be taking ownership 
 > >> when they shouldn't..
 > >>
 > > 
 > > Yes I have EDAC built modular. I will build latest git without EDAC and 
 > > agp modular
 > > and let you know if that fixes ( workarounds ;) ) the problem.
 > 
 > You are right without EDAC built , agp modular does work fine. I'm on 
 > 2.6.25-rc2-00477-g1a4c6be right now.
 > 
 > So it is an EDAC bug ?
 
No, it's a failing of the pci driver model. It currently doesn't
allow more than one driver to be bound to a single PCI device.
For multi-function devices like bridges, this means we see problems
like the one you mention.

Dave

-- 
http://www.codemonkey.org.uk


Re: [PATCH 03/10] PCI: AMD SATA IDE mode quirk

2008-02-21 Thread Matthew Wilcox
On Thu, Feb 21, 2008 at 03:47:33PM -0800, Greg Kroah-Hartman wrote:
> +static void __devinit quirk_amd_ide_mode(struct pci_dev *pdev)
>  {
> - /* set sb600 sata to ahci mode */
> - if ((pdev->class >> 8) == PCI_CLASS_STORAGE_IDE) {
> - u8 tmp;
> + /* set sb600/sb700/sb800 sata to ahci mode */
> + u8 tmp;
>  
> + pci_read_config_byte(pdev, PCI_CLASS_DEVICE, &tmp);
> + if (tmp == 0x01) {
>   pci_read_config_byte(pdev, 0x40, &tmp);

This seems like a dis-improvement.  Why are we reading a config byte for
something we already have in the pci_dev?  Why are we now checking
against 0x01 instead of a symbolic constant?  Why are we no longer
checking that this is PCI_BASE_CLASS_STORAGE?
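
A small userspace model of the two checks may make the difference clearer. This is only a sketch: the constants are copied from the kernel's pci_ids.h as I recall them, class24 stands in for pci_dev->class (base class, sub-class, prog-if), and the function names are hypothetical.

```c
#include <stdint.h>

#define PCI_BASE_CLASS_STORAGE  0x01
#define PCI_CLASS_STORAGE_IDE   0x0101
#define PCI_CLASS_STORAGE_SATA  0x0106

/* Old check: compares base class and sub-class together. */
int is_ide_full_check(uint32_t class24)
{
    return (class24 >> 8) == PCI_CLASS_STORAGE_IDE;
}

/*
 * New check, modelled: a single byte read at config offset 0x0a returns
 * only the sub-class, so the base class is never looked at.
 */
int is_ide_subclass_only(uint32_t class24)
{
    uint8_t subclass = (uint8_t)((class24 >> 8) & 0xff);

    return subclass == 0x01;
}
```

For a real IDE controller (for example class 0x01018a) both checks agree. For a hypothetical device with class 0x020100 (base class network, sub-class 0x01) only the byte-wide check matches, which is the gap the PCI_BASE_CLASS_STORAGE question points at. In this quirk the vendor and device IDs already narrow the match, so the practical risk is small, but the symbolic comparison documents the intent better.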

> -DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATI, PCI_DEVICE_ID_ATI_IXP600_SATA, 
> quirk_sb600_sata);
> -DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATI, PCI_DEVICE_ID_ATI_IXP700_SATA, 
> quirk_sb600_sata);
> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATI, PCI_DEVICE_ID_ATI_IXP600_SATA, 
> quirk_amd_ide_mode);
> +DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_ATI, PCI_DEVICE_ID_ATI_IXP600_SATA, 
> quirk_amd_ide_mode);
> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATI, PCI_DEVICE_ID_ATI_IXP700_SATA, 
> quirk_amd_ide_mode);
> +DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_ATI, PCI_DEVICE_ID_ATI_IXP700_SATA, 
> quirk_amd_ide_mode);

Nothing in the changelog entry suggests why we now need FIXUP_RESUME
entries when we didn't before.

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."


Re: Keyboard interrupt - request_irq()

2008-02-21 Thread Robert Hancock

Pioz wrote:

Hi all,
  I have a problem.
I want to handle the keyboard interrupt, and for this purpose I have written
this module (I have kernel 2.6.23):


#include 
#include 
#include 

[...]

irqreturn_t
irq_myhandler (int irqn, void *dev)
{
printk (KERN_INFO "Key pressed...\n");
return IRQ_HANDLED;
}

int
init_module ()
{
int res;
printk (KERN_INFO "Hello World!\n");
free_irq (1, NULL);
res = request_irq (1, irq_myhandler, IRQF_SHARED, "bao", dev_id);
printk (KERN_INFO "res: %d\n", res);
return 0;
}

void
cleanup_module ()
{
free_irq (1, NULL);
printk (KERN_INFO "Goodbye World!\n");
}


The return value of the request_irq() function is -EBUSY. Why? Is it because
of the default handler? How can I replace that handler with my function?
Thanks...


Normally one doesn't register multiple interrupt handlers for the same 
device. For a PCI level-triggered interrupt one can do it (for the case 
where multiple devices share the IRQ), but the PC keyboard interrupt is 
edge-triggered and isn't sharable.



Re: Configure MSI-X vectors to target different CPUs

2008-02-21 Thread Robert Hancock

[EMAIL PROTECTED] wrote:

Hi,

In MSI-HOWTO, it's said:

"Using MSI enables the device functions to support two or more vectors, which can be 
configured to target different CPUs to increase scalability."

So how can I set up MSI-X vectors to target different CPUs? I want to allocate 
the same number of MSI-X vectors as CPUs, and equally distribute them to every 
CPU.

Is it automatically done by Linux when I call pci_enable_msix()? If yes, how? 
If not, what should I do? My guess is to set the affinity of the interrupts 
manually. Am I right?

Please CC me ([EMAIL PROTECTED]) on answers/comments in response to this posting.


Thanks,
Ying


If the device actually supports multiple vectors (not all do), I think 
they should show up as separate interrupts in /proc/interrupts and you 
can either set the affinity manually, or maybe irqbalance is smart 
enough for this.


Careful, though, as in some cases this may reduce performance due to 
causing more cache line bouncing between CPUs.
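
For the manual route, the usual interface is writing a hexadecimal CPU bitmask to /proc/irq/<irq>/smp_affinity as root, using the IRQ numbers shown in /proc/interrupts. The sketch below only computes the values one would write; it assumes a simple round-robin placement and fewer than 32 CPUs, and both function names are hypothetical.

```c
#include <stdio.h>

/* Round-robin placement: vector i goes to CPU i modulo nr_cpus. */
unsigned int vector_to_cpu(unsigned int vector, unsigned int nr_cpus)
{
    if (nr_cpus == 0)
        return 0;
    return vector % nr_cpus;
}

/*
 * Format the hex bitmask that selects only `cpu`, in the form accepted
 * by /proc/irq/<irq>/smp_affinity. Handles cpu < 32 only.
 * Returns 0 on success, -1 on error.
 */
int format_affinity_mask(unsigned int cpu, char *buf, size_t len)
{
    int n;

    if (cpu >= 32)
        return -1;
    n = snprintf(buf, len, "%x", 1u << cpu);
    if (n < 0 || (size_t)n >= len)
        return -1;
    return 0;
}
```

For example, the mask for CPU 3 is "8" and for CPU 4 is "10". Whether spreading the vectors this way helps depends on the workload, for the cache-bouncing reason given above.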



Re: [PATCH 1/2] cgroup map files: Add cgroup map data type

2008-02-21 Thread YAMAMOTO Takashi
> The map type is printed in a similar format to /proc/meminfo or
> /proc/<pid>/status, i.e. "$key: $value\n"

this description doesn't seem to match with the code.

YAMAMOTO Takashi

> +static int cgroup_map_add(struct cgroup_map_cb *cb, const char *key, u64 
> value)
> +{
> + struct seq_file *sf = cb->state;
> + return seq_printf(sf, "%s %llu\n", key, value);
> +}
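
The mismatch is easy to see side by side. This userspace sketch formats one entry both ways; the key "rss" used in the usage note and the function names are only illustrative.

```c
#include <stdio.h>

/* Format as the patch description states: "$key: $value\n". */
int format_described(char *buf, size_t len, const char *key,
                     unsigned long long value)
{
    return snprintf(buf, len, "%s: %llu\n", key, value);
}

/* Format as the quoted cgroup_map_add() does: "%s %llu\n". */
int format_as_coded(char *buf, size_t len, const char *key,
                    unsigned long long value)
{
    return snprintf(buf, len, "%s %llu\n", key, value);
}
```

For key "rss" and value 4096 the first yields "rss: 4096\n" and the second "rss 4096\n", so either the description or the format string needs to change.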


Re: [PATCH 06/11] xen: move arch/x86/xen/events.c undedr drivers/xen and split out arch specific part.

2008-02-21 Thread Isaku Yamahata
On Thu, Feb 21, 2008 at 11:38:18AM -0800, Jeremy Fitzhardinge wrote:
> [EMAIL PROTECTED] wrote:
> >diff --git a/arch/x86/xen/events.c b/drivers/xen/events.c
> >similarity index 95%
> >rename from arch/x86/xen/events.c
> >rename to drivers/xen/events.c
> >index dcf613e..7474739 100644
> >--- a/arch/x86/xen/events.c
> >+++ b/drivers/xen/events.c
> >@@ -37,7 +37,9 @@
> > #include 
> > #include 
> > 
> >-#include "xen-ops.h"
> >+#ifdef CONFIG_X86
> >+# include "../arch/x86/xen/xen-ops.h"
> >+#endif
> 
> Hm.  Perhaps it would be better to move whatever definition you need 
> into a header in a common place (or move xen-ops.h entirely).

Thank you for review. The updated version.
changes
- move the xen_vcpu declaration from arch/x86/xen/xen-ops.h to
  include/xen/xen-ops.h which is newly created.


xen: move arch/x86/xen/events.c under drivers/xen and split out arch specific part.

ia64/xen also uses events.c. Clean it up so that ia64/xen can use it.
Make ipi_to_irq globally visible; ia64/xen needs to reference it from another file.
Introduce resend_irq_on_evtchn(), which ia64 needs.
Introduce xen_do_IRQ() to split out arch specific code.

Signed-off-by: Isaku Yamahata <[EMAIL PROTECTED]>
---
 arch/x86/xen/Makefile  |2 +-
 arch/x86/xen/xen-ops.h |2 +-
 drivers/xen/Makefile   |2 +-
 {arch/x86 => drivers}/xen/events.c |   33 -
 include/asm-x86/xen/hypervisor.h   |7 +++
 include/xen/events.h   |1 +
 include/xen/xen-ops.h  |6 ++
 7 files changed, 41 insertions(+), 12 deletions(-)
 rename {arch/x86 => drivers}/xen/events.c (95%)
 create mode 100644 include/xen/xen-ops.h

diff --git a/arch/x86/xen/Makefile b/arch/x86/xen/Makefile
index c5e9aa4..95c5926 100644
--- a/arch/x86/xen/Makefile
+++ b/arch/x86/xen/Makefile
@@ -1,4 +1,4 @@
 obj-y  := enlighten.o setup.o multicalls.o mmu.o \
-   events.o time.o manage.o xen-asm.o
+   time.o manage.o xen-asm.o
 
 obj-$(CONFIG_SMP)  += smp.o
diff --git a/arch/x86/xen/xen-ops.h b/arch/x86/xen/xen-ops.h
index b02a909..caaabf3 100644
--- a/arch/x86/xen/xen-ops.h
+++ b/arch/x86/xen/xen-ops.h
@@ -2,6 +2,7 @@
 #define XEN_OPS_H
 
 #include 
+#include 
 
 /* These are code, but not functions.  Defined in entry.S */
 extern const char xen_hypervisor_callback[];
@@ -9,7 +10,6 @@ extern const char xen_failsafe_callback[];
 
 void xen_copy_trap_info(struct trap_info *traps);
 
-DECLARE_PER_CPU(struct vcpu_info *, xen_vcpu);
 DECLARE_PER_CPU(unsigned long, xen_cr3);
 DECLARE_PER_CPU(unsigned long, xen_current_cr3);
 
diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
index 609fdda..823ce78 100644
--- a/drivers/xen/Makefile
+++ b/drivers/xen/Makefile
@@ -1,2 +1,2 @@
-obj-y  += grant-table.o features.o
+obj-y  += grant-table.o features.o events.o
 obj-y  += xenbus/
diff --git a/arch/x86/xen/events.c b/drivers/xen/events.c
similarity index 95%
rename from arch/x86/xen/events.c
rename to drivers/xen/events.c
index dcf613e..dce2dfc 100644
--- a/arch/x86/xen/events.c
+++ b/drivers/xen/events.c
@@ -33,12 +33,11 @@
 #include 
 #include 
 
+#include 
 #include 
 #include 
 #include 
 
-#include "xen-ops.h"
-
 /*
  * This lock protects updates to the following mapping and reference-count
  * arrays. The lock does not need to be acquired to read the mapping tables.
@@ -49,7 +48,7 @@ static DEFINE_SPINLOCK(irq_mapping_update_lock);
 static DEFINE_PER_CPU(int, virq_to_irq[NR_VIRQS]) = {[0 ... NR_VIRQS-1] = -1};
 
 /* IRQ <-> IPI mapping */
-static DEFINE_PER_CPU(int, ipi_to_irq[XEN_NR_IPIS]) = {[0 ... XEN_NR_IPIS-1] = -1};
+DEFINE_PER_CPU(int, ipi_to_irq[XEN_NR_IPIS]) = {[0 ... XEN_NR_IPIS-1] = -1};
 
 /* Packed IRQ information: binding type, sub-type index, and event channel. */
 struct packed_irq
@@ -455,7 +454,6 @@ void xen_send_IPI_one(unsigned int cpu, enum ipi_vector vector)
notify_remote_via_irq(irq);
 }
 
-
 /*
  * Search the CPUs pending events bitmasks.  For each one found, map
  * the event number to an irq, and feed it into do_IRQ() for
@@ -474,7 +472,10 @@ void xen_evtchn_do_upcall(struct pt_regs *regs)
 
vcpu_info->evtchn_upcall_pending = 0;
 
-   /* NB. No need for a barrier here -- XCHG is a barrier on x86. */
+#ifndef CONFIG_X86 /* No need for a barrier -- XCHG is a barrier on x86. */
+   /* Clear master flag /before/ clearing selector flag. */
+   rmb();
+#endif
pending_words = xchg(&vcpu_info->evtchn_pending_sel, 0);
while (pending_words != 0) {
unsigned long pending_bits;
@@ -486,10 +487,8 @@ void xen_evtchn_do_upcall(struct pt_regs *regs)
int port = (word_idx * BITS_PER_LONG) + bit_idx;
int irq = evtchn_to_irq[port];
 
-   if (irq != -1) {
-   regs->orig_ax = ~irq;
-   do_IRQ(regs);
-   }
+   if 

Re: [GIT PULL?] Create and populate toplevel tests/ for kernel tests

2008-02-21 Thread Ananth N Mavinakayanahalli
On Tue, Feb 19, 2008 at 08:21:53PM +0100, Sam Ravnborg wrote:
> Hi Ananth.
> 
> Linus did not pull this in the -rc1 to -rc2 timeframe
> so please resubmit the patch series one week into the
> next merge window (when most of the trees have hit Linus' tree
> and Andrew has made his first merge).
> 
> If you need an extra round of eyeballing then you can submit
> a few weeks before the merge window opens.
> That's typically after an -rc with only a few patches.

Stephen,

The patchset in question is just a major code movement - basically to
move all in-kernel tests to live under a toplevel tests/ directory. As
such, all the stakeholders have acked the patchset, but it does look
like this is a big enough change to be deferred to the next merge
window.

Given that there is general agreement about the patchset, could you
please pull in the changes into the linux-next tree?

Sam has set up a git tree for this and you can pull from:
ssh://master.kernel.org/pub/scm/linux/kernel/git/sam/tests.git
 
Link to the thread: http://lkml.org/lkml/2008/2/11/97

Thanks,
Ananth

> Thanks,
>   Sam
> 
> On Tue, Feb 12, 2008 at 11:39:18PM +0100, Sam Ravnborg wrote:
> > Hi Linus.
> > 
> > Will you consider such a primary code-movement for -rc1
> > or shall we wait until next merge window?
> > 
> > Had we hit -rc2 I would not have sent this pull req and
> > feel free to flame me anyway.
> > 
> > The rationale to get it merged is obviously to avoid
> > merge conflicts and the only reason I ask is that I consider
> > it a low risk patch.
> > 
> > I have not included 8/8 since it was questioned and it
> > will wait until next merge window. But the first 7 was
> > straightforward.
> > 
> > You can pull from:
> > ssh://master.kernel.org/pub/scm/linux/kernel/git/sam/tests.git
> > 
> > diffstat and shortlog below.
> > I also included mail last with a few of the merge related comments.
> > 
> > Sam
> > 
> >  Makefile|1 +
> >  drivers/misc/Makefile   |1 -
> >  kernel/Makefile |4 -
> >  lib/Kconfig.debug   |   71 
> > +
> >  lib/Makefile|1 -
> >  tests/Kconfig   |   79 
> > +++
> >  tests/Makefile  |   10 +++
> >  {kernel => tests}/backtracetest.c   |0 
> >  {drivers/misc => tests}/lkdtm.c |   12 ++--
> >  {lib => tests}/locking-selftest-hardirq.h   |0 
> >  {lib => tests}/locking-selftest-mutex.h |0 
> >  {lib => tests}/locking-selftest-rlock-hardirq.h |0 
> >  {lib => tests}/locking-selftest-rlock-softirq.h |0 
> >  {lib => tests}/locking-selftest-rlock.h |0 
> >  {lib => tests}/locking-selftest-rsem.h  |0 
> >  {lib => tests}/locking-selftest-softirq.h   |0 
> >  {lib => tests}/locking-selftest-spin-hardirq.h  |0 
> >  {lib => tests}/locking-selftest-spin-softirq.h  |0 
> >  {lib => tests}/locking-selftest-spin.h  |0 
> >  {lib => tests}/locking-selftest-wlock-hardirq.h |0 
> >  {lib => tests}/locking-selftest-wlock-softirq.h |0 
> >  {lib => tests}/locking-selftest-wlock.h |0 
> >  {lib => tests}/locking-selftest-wsem.h  |0 
> >  {lib => tests}/locking-selftest.c   |0 
> >  {kernel => tests}/rcutorture.c  |0 
> >  {kernel => tests}/rtmutex-tester.c  |2 +-
> >  {kernel => tests}/test_kprobes.c|0 
> >  27 files changed, 99 insertions(+), 82 deletions(-)
> > 
> > Ananth N Mavinakayanahalli (7):
> >   Create tests/ directory
> >   Move locking selftests to tests/
> >   Move rcutorture to tests/
> >   Move rtmutex-tests to tests/
> >   Move lkdtm to tests/
> >   Move kprobes smoke tests to tests/
> >   Move backtrace tests to tests/
> > 
> > 
> > 
> > On Tue, Feb 12, 2008 at 01:22:46PM -0800, Andrew Morton wrote:
> > > On Tue, 12 Feb 2008 11:44:52 -0500
> > > Christoph Hellwig <[EMAIL PROTECTED]> wrote:
> > > 
> > > > On Mon, Feb 11, 2008 at 04:14:52PM +0530, Ananth N Mavinakayanahalli 
> > > > wrote:
> > > > > The following series of patches create and populate the toplevel 
> > > > > tests/
> > > > > directory. This will henceforth be the place where all in-kernel tests
> > > > > live.
> > > > > 
> > > > > All patches against 2.6.25-rc1 and are just code movement without any
> > > > > change in functionality.
> > > > 
> > > > ACK to patches 1-7, and I agree with Ingo that the x86-specific test
> > > > should stay under arch/x86.
> > > 
> > > OK.  But now is basically the worst time for me (or anyone else) to merge
> > > large code-motion changes like this, because they need to be carried for
> > > two months or more.
> > > 
> > > And even though git can track renames, putting them into a git tree (say,
> > > git-kbuild) won't help, 

Re: Make yield_task_fair more efficient

2008-02-21 Thread Balbir Singh
Jens Axboe wrote:
> On Thu, Feb 21 2008, Jens Axboe wrote:
>> On Thu, Feb 21 2008, Peter Zijlstra wrote:
>>> On Thu, 2008-02-21 at 15:37 +0530, Balbir Singh wrote:
>>>
>>>> You use the empty pointer (missing right child), so why do we need a list?
>>>> Maybe I am missing something.
>>> A fully threaded tree also has back-pointer to traverse backwards
>>> through the ordered elements.
>>>
>>> That said, overloading the right child pointer might not be the best
>>> thing for the linux kernel, as it will impact all the rb-tree lookups
>>> which are open-coded and often performance critical (this is the reason
>>> the colour isn't bit encoded in either of the child pointers either).
>>>
>>> But if you only want a uni directional thread, I guess we can stick it
>>> in the unsigned long we use for the node colour.
>>>
>>> Still, perhaps it's worth it to grow rb_node to 4 words and do the fully
>>> threaded thing as there are also a lot of rb_prev() users in the kernel.
>>> Who knows..
>>>
>>> Anyway, I agree that improving rb_next() is worth looking into for the
>>> scheduler.
>> For the IO scheduler as well, it's used quite extensively! So speeding
>> up rb_next() would definitely help, as it's typically invoked for every
>> bio queued (attempting to back merge with the next request). CFQ and AS
>> additionally does an rb_next() and rb_prev() when trying to decide which
>> request to do next.
> 
> One possible course of action to implement this without eating extra
> space in the rb_node would be:
> 
> - Add rb_right() and rb_set_right() (plus ditto _left variants) to
>   rbtree.h
> - Convert all in-kernel users to use these. Quite extensive, as the
>   rbtree code search/insert functions are coded in situ and not in
>   rbtree.[ch]
> - Now we can overload bit 0 of ->rb_right and ->rb_left to indicate
>   whether this is a node or thread pointer and modify rbtree.c to tag
>   and add the thread links when appropriate.
> 

Exactly along the lines I was thinking of and discussing with David.
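
To make the idea concrete, here is a minimal userspace sketch of the bit 0 trick on a plain (unbalanced) binary search tree. It is not the kernel rbtree code: struct tnode and all tnode_* helpers are hypothetical names invented for this illustration, only the right link carries a thread (successor) pointer, and rebalancing is left out entirely. It relies on nodes being at least 2-byte aligned so that bit 0 of a real pointer is always clear.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical node type; not the kernel's struct rb_node. */
struct tnode {
	int key;
	uintptr_t left;   /* real left child, or 0 */
	uintptr_t right;  /* real right child, or successor thread if bit 0 is set */
};

#define THREAD_BIT ((uintptr_t)1)

static inline struct tnode *tnode_left(const struct tnode *n)
{
	return (struct tnode *)n->left;
}

/* Returns the real right child, or NULL if the link is empty or a thread. */
static inline struct tnode *tnode_right(const struct tnode *n)
{
	if (n->right & THREAD_BIT)
		return NULL;
	return (struct tnode *)n->right;
}

static inline void tnode_set_right(struct tnode *n, struct tnode *child)
{
	n->right = (uintptr_t)child;
}

/* Store a successor pointer in the right link and tag it with bit 0. */
static inline void tnode_set_thread(struct tnode *n, struct tnode *succ)
{
	assert(((uintptr_t)succ & THREAD_BIT) == 0);
	n->right = (uintptr_t)succ | THREAD_BIT;
}

/* In-order successor: follow the thread if there is one, otherwise
 * walk to the leftmost node of the right subtree. No parent pointer needed. */
static struct tnode *tnode_next(const struct tnode *n)
{
	struct tnode *p = (struct tnode *)(n->right & ~THREAD_BIT);

	if ((n->right & THREAD_BIT) || p == NULL)
		return p;
	while (tnode_left(p))
		p = tnode_left(p);
	return p;
}

/* Plain BST insert that keeps the threads up to date. */
static void tnode_insert(struct tnode **root, struct tnode *new)
{
	struct tnode *parent = *root;

	new->left = 0;
	new->right = 0;
	if (!parent) {
		*root = new;
		return;
	}
	for (;;) {
		if (new->key < parent->key) {
			if (tnode_left(parent)) {
				parent = tnode_left(parent);
				continue;
			}
			/* New left child: its successor is the parent. */
			tnode_set_thread(new, parent);
			parent->left = (uintptr_t)new;
			return;
		}
		if (tnode_right(parent)) {
			parent = tnode_right(parent);
			continue;
		}
		/* New right child: it inherits the parent's thread (or 0). */
		new->right = parent->right;
		tnode_set_right(parent, new);
		return;
	}
}
```

Every access to the right link has to go through an accessor that masks or checks bit 0, which is why the open-coded lookups would all need converting first.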

> So we can definitely do this in a compatible fashion. Given that I have
> a flight coming up in a few days time, I may give it a go if no one
> beats me to it :-)
> 

Feel free to do so, please do keep me on the cc. I am very interested in getting
rb threaded trees done, but my bandwidth is a little limited this month.


-- 
Warm Regards,
Balbir Singh
Linux Technology Center
IBM, ISTL
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: modular intel-agp does not work on my box

2008-02-21 Thread Gabriel C
Gabriel C wrote:
> Dave Airlie wrote:
>>> Hi,
>>>
>>> When building agp* modular ( CONFIG_AGP=y/m and CONFIG_AGP_INTEL=m ) 
>>> intel-agp does nothing on my box 
>>> ( Dell Precision WorkStation 530 MT ) chipset is not being detected.
>>>
>>> Building both Y fixes that and agpgart works and also detects my chipset.
>> Have you got EDAC modules built as well? they might be taking ownership 
>> when they shouldn't..
>>
> 
> Yes I have EDAC built modular. I will build latest git without EDAC and agp 
> modular
> and let you know if that fixes ( workarounds ;) ) the problem.

You are right: without EDAC built, modular agp works fine. I'm on 
2.6.25-rc2-00477-g1a4c6be right now.

So is it an EDAC bug?

> 
>>> ..
>>>
>>> Linux agpgart interface v0.102
>>> agpgart: Detected an Intel i860 Chipset.
>>> agpgart: AGP aperture is 256M @ 0xe000
>>>
>>> ..
>>>
>>> Also I've tested kernel 2.6.24.2 and 2.6.25-rc2 both with same result.
>>>
>>> lspci -vvvxxx output :
>>>
>>> 00:00.0 Host bridge: Intel Corporation 82860 860 (Wombat) Chipset Host 
>>> Bridge (MCH) (rev 04)
>>> Subsystem: Dell Unknown device 00d8
>>> Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
>>> Stepping- SERR+ FastB2B- DisINTx-
>>> Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- >> SERR- >> Latency: 0
>>> Region 0: Memory at e000 (32-bit, prefetchable) [size=256M]
>>> Capabilities: [a0] AGP version 2.0
>>> Status: RQ=32 Iso- ArqSz=0 Cal=0 SBA+ ITACoh- GART64- HTrans- 
>>> 64bit- FW+ AGP3- Rate=x1,x2,x4
>>> Command: RQ=1 ArqSz=0 Cal=0 SBA+ AGP+ GART64- 64bit- FW+ Rate=x4
>>> Kernel driver in use: agpgart-intel
>>> Kernel modules: i82860_edac
>>> 00: 86 80 31 25 06 01 90 a0 04 00 00 06 00 00 00 00
>>> 10: 08 00 00 e0 00 00 00 00 00 00 00 00 00 00 00 00
>>> 20: 00 00 00 00 00 00 00 00 00 00 00 00 28 10 d8 00
>>> 30: 00 00 00 00 a0 00 00 00 00 00 00 00 00 00 00 00
>>> 40: d4 d4 92 80 80 80 80 80 80 80 80 80 80 80 80 80
>>> 50: 05 6b 02 00 00 00 00 00 00 10 11 11 01 00 11 11
>>> 60: 10 00 20 08 28 10 28 10 28 10 28 10 28 10 28 10
>>> 70: 28 10 28 10 28 10 28 10 28 10 28 10 28 10 28 10
>>> 80: 00 00 00 00 00 00 00 00 0f 00 00 00 00 00 00 00
>>> 90: 0b 00 0b 00 61 01 01 08 55 19 00 00 81 0a 38 00
>>> a0: 02 00 20 00 17 02 00 1f 14 03 00 00 00 00 00 00
>>> b0: 80 00 00 00 00 00 00 00 00 00 04 27 20 10 8b 00
>>> c0: 44 c0 50 11 00 28 00 00 00 00 00 00 03 00 00 00
>>> d0: 02 28 00 0e 03 00 00 33 af 09 31 b5 01 00 0b 00
>>> e0: 00 00 6a 00 00 00 00 01 2a 25 2b 33 07 00 00 00
>>> f0: 00 00 01 00 74 f8 30 80 38 0f 00 00 00 00 00 00
>>>
>>> 00:01.0 PCI bridge: Intel Corporation 82850 850 (Tehama) Chipset AGP Bridge 
>>> (rev 04) (prog-if 00 [Normal decode])
>>> Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
>>> Stepping- SERR+ FastB2B- DisINTx-
>>> Status: Cap- 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- >> SERR- >> Latency: 64
>>> Bus: primary=00, secondary=01, subordinate=01, sec-latency=64
>>> I/O behind bridge: e000-efff
>>> Memory behind bridge: ff80-ff9f
>>> Prefetchable memory behind bridge: d000-dfff
>>> Secondary status: 66MHz+ FastB2B+ ParErr- DEVSEL=medium >TAbort- 
>>> >> BridgeCtl: Parity- SERR+ NoISA+ VGA+ MAbort- >Reset- FastB2B-
>>> PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
>>> Kernel modules: shpchp
>>> 00: 86 80 32 25 07 01 a0 00 04 00 04 06 00 40 01 00
>>> 10: 00 00 00 00 00 00 00 00 00 01 01 40 e0 e0 a0 22
>>> 20: 80 ff 90 ff 00 d0 f0 df 00 00 00 00 00 00 00 00
>>> 30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0e 00
>>> 40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>> 50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>> 60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>> 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>> 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>> 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>> a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>> b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>> c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>> d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>> e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>> f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>>
>>> 00:02.0 PCI bridge: Intel Corporation 82860 860 (Wombat) Chipset AGP Bridge 
>>> (rev 04) (prog-if 00 [Normal decode])
>>> Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
>>> Stepping- SERR+ FastB2B- DisINTx-
>>> Status: Cap- 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- >> SERR- >> Latency: 64
>>> Bus: primary=00, secondary=02, subordinate=03, sec-latency=0
>>> I/O behind bridge: d000-dfff
>>> Memory behind bridge: ff50-ff7f
>>> Prefetchable memory behind bridge: fff0-000f
>>> Secondary status: 66MHz+ FastB2B+ ParErr- DEVSEL=medium >TAbort- 
>>> >> BridgeCtl: Parity- 

Re: [UPDATED v2][PATCH 0/6] regulator: voltage and current regulator framework

2008-02-21 Thread eric miao
On Fri, Feb 22, 2008 at 12:26 AM, Liam Girdwood
<[EMAIL PROTECTED]> wrote:
> On Thu, 2008-02-21 at 08:41 +, Russell King - ARM Linux wrote:
>  > On Wed, Feb 20, 2008 at 05:08:46PM +, Liam Girdwood wrote:
>  > > This patch series provides a generic framework to allow device drivers
>  > > to control voltage and current regulators on SoC based devices (e.g.
>  > > phones, gps, media players).
>  >
>  > Note that I'm explicitly avoiding commenting on this as far as PXA3xx
>  > devices go, until we're further down the road with PM support on that
>  > SoC.  It's not clear at present whether a generic PMIC framework will
>  > be suitable for this SoC since it's my understanding from Marvell that
>  > we need to talk to the PMIC from IRQs-off contexts.
>  >
>  > So don't take my silence as some sort of acceptance of this code; it
>  > isn't.
>
>  I wasn't ;)
>
>  It then might be worth adding this functionality at a later stage when
>  more can be said about PXA3xx PMIC support. We could always have a
>  version of the _set() functions that are designed to handle this case.
>
>  In the mean time this works well on 3 other SoC CPUs.
>
>  Liam
>

Liam,

I had a quick look at the git tree on opensource.wolfsonmicro.com and
found another PMIC framework there, while this series is a regulator
framework that looks like a simplified or dedicated one. What is the
relationship between the two?

For PMICs that cover additional features, like
  - usb vbus detection (or pull-up/pull-down)
  - audio codec
  - touch screen
  - battery monitor/fuel gauge
  - battery charger
  - possibly many others

how do you plan to handle them?




-- 
Cheers
- eric


Re: [PATCH] Document huge memory/cache overhead of memory controller in Kconfig

2008-02-21 Thread Nick Piggin
On Wednesday 20 February 2008 23:52, Balbir Singh wrote:
> Andi Kleen wrote:
> > Document huge memory/cache overhead of memory controller in Kconfig
> >
> > I was a little surprised that 2.6.25-rc* increased struct page for the
> > memory controller.  At least on many x86-64 machines it will not fit into
> > a single cache line now anymore and also costs considerable amounts of
> > RAM.
>
> The size of struct page earlier was 56 bytes on x86_64 and with 64 bytes it
> won't fit into the cacheline anymore? Please also look at
> http://lwn.net/Articles/234974/

BTW. We'll probably want to increase the width of some counters
in struct page at some point for 64-bit, so then it really will
go over with the memory controller!

Actually, an external data structure is a pretty good idea. We
could probably do it easily with a radix tree (pfn->memory
controller). And that might be a better option for distros.
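
For a feel of the numbers being argued about, here is a small back-of-the-envelope sketch. The function names are made up for this mail, and the 4 KiB page size, the 64-byte cache line and the line-aligned start of the struct page array are assumptions, not something the thread states for any particular machine.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed for the array of page descriptors covering ram_bytes of RAM. */
static uint64_t mem_map_bytes(uint64_t ram_bytes, uint64_t page_size,
			      uint64_t struct_page_size)
{
	assert(page_size != 0);
	return (ram_bytes / page_size) * struct_page_size;
}

/* Does element 'index' of an array of 'size'-byte elements cross a cache
 * line boundary, assuming the array itself starts on a line boundary? */
static int straddles_line(uint64_t index, uint64_t size, uint64_t line)
{
	uint64_t start = index * size;
	uint64_t end = start + size - 1;

	assert(size != 0 && line != 0);
	return start / line != end / line;
}
```

Under those assumptions, going from 56 to 64 bytes costs 8 extra bytes per page, i.e. 8 MiB for 4 GiB of RAM. On the cache side, 64-byte entries in an aligned array never cross a line, whereas 56-byte entries regularly do (entry 1 covers bytes 56..111), so the cache argument can be read either way.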



Re: Kernel oops with bluetooth usb dongle

2008-02-21 Thread Dave Young
On Fri, Feb 22, 2008 at 02:40:41AM +, Quel Qun wrote:
> 
>  -- Original message --
> From: Thomas Gleixner <[EMAIL PROTECTED]>
> > On Thu, 21 Feb 2008, Quel Qun wrote:
> > > > >  > Not that I'm aware off, but this might as well be some old use 
> > > > > after
> > > > >  > free bug which got exposed by some unrelated change. The good news 
> > > > > is
> > > > >  > that it is reproducible. I'll hack up some nasty debug patch which
> > > > >  > lets us - hopefully - decode where the timer was armed.
> > > > >
> > > > >  Quel, before I do that, is there any chance that you retest with the
> > > > >  latest mainline git version ?
> > > > >
> > > > >  
> > > > 
> > http://www.kernel.org/pub/linux/kernel/v2.6/snapshots/patch-2.6.25-rc2-git4.bz2
> > > > 
> > > > And please test with this patch as well:
> > > > 
> > > > http://lkml.org/lkml/2008/2/20/121
> > > > 
> > > Same kind of result unfortunately with this last patch on top of git4:
> > 
> > At least it is fully reproducible. Please apply the patch below to
> > your git4 tree and do not change your .config. The output should show,
> > which code armed the timer.
> > 
> Thomas,
> 
> Thanks for the patch, but that did not work, I never got the trace.
> 
> I switched to git5 and applied the patch.
> 
> First crash (= attached kernlog.9) showed some hald process, so I decided to 
> reduce the number of services and processes to a minimum. Attached are the 
> process list before starting sdptool browse and crashing, the list of modules 
> and the list of services.
> 
> Second crash:
> 
> BUG: unable to handle kernel paging request at 6b6b6b6b
> IP: [] get_next_timer_interrupt+0x11f/0x234
> *pde = 
> Oops:  [#1] SMP
> Modules linked in: hidp rfcomm l2cap nfsd exportfs nfs lockd nfs_acl sunrpc 
> autofs4 af_packet binfmt_misc loop nls_iso8859_1 nls_cp437 vfat fat fuse 
> snd_pcm_oss snd_mixer_oss snd_intel8x0 snd_ac97_codec hci_usb ac97_bus 
> snd_pcm parport_pc snd_timer snd sr_mod i2c_i801 rtc_cmos iTCO_wdt i2c_core 
> parport soundcore iTCO_vendor_support pcspkr snd_page_alloc bluetooth button 
> thermal processor evdev dcdbas tg3 sg ide_disk piix ide_core ata_piix ahci 
> libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd usbcore [last 
> unloaded: scsi_wait_scan]
> 
> Pid: 0, comm: swapper Not tainted (2.6.25-rc2-git5kk1 #1)
> EIP: 0060:[] EFLAGS: 00010002 CPU: 0
> EIP is at get_next_timer_interrupt+0x11f/0x234
> EAX: 6b6b6b6b EBX: 3ffda6f6 ECX: c0432744 EDX: 6b6b6b6b
> ESI: 0027 EDI: c043260c EBP: c03b1ee8 ESP: c03b1eac
>  DS: 007b ES: 007b FS: 00d8 GS:  SS: 0068
> Process swapper (pid: 0, ti=c03b task=c03813a0 task.ti=c03b)
> Stack: fffda6f6 c0431e00  fffda700 0001 00f7 0027 00fffda7
>c043260c c043280c c0432a0c c0432c0c c18090c0 0643e180 fffda6f6 c03b1f2c
>c013fb78 0001 c03a3b08 0046 c03b1f20 0644b3c1 0022 0643e180
> Call Trace:
>  [] ? tick_nohz_stop_sched_tick+0x130/0x337
>  [] ? irq_exit+0x55/0x6e
>  [] ? smp_apic_timer_interrupt+0x59/0x92
>  [] ? apic_timer_interrupt+0x28/0x30
>  [] ? get_signal_to_deliver+0x2d8/0x332
>  [] ? native_safe_halt+0x5/0x7
>  [] ? default_idle+0x4d/0x7f
>  [] ? default_idle+0x0/0x7f
>  [] ? cpu_idle+0x6f/0x100
>  [] ? rest_init+0x49/0x50
>  ===
> Code: 85 e0 8b 4d e0 83 e1 3f 89 4d dc 89 ce 8b 04 f7 8b 10 0f 18 02 90 8d 0c 
> f7 39 c8 0f 84 8d 00 00 00 8b 40 08 39 d8 0f 48 d8 89 d0 <8b> 12 0f 18 02 90 
> 39 c8 75 ec c7 45 cc 01 00 00 00 8b 7d dc 85
> EIP: [] get_next_timer_interrupt+0x11f/0x234
> 
> $ addr2line -e vmlinux c012d51d
> /usr/src/linux-2.6.25-rc2-git5kk1/kernel/timer.c:770
> 
> Crap, that is on the next list_for_each_entry in timer.c :(
> 
> I tried to make a similar test loop as you did a few lines above:
> 
> @@ -718,6 +767,14 @@
>  
>   index = slot = timer_jiffies & TVN_MASK;
>   do {
> + struct list_head *tmp;
> +
> + __list_for_each(tmp, varp->vec + slot) {
> + nte = (struct timer_list *) tmp;
> + if (nte->entry.next == (void *)0x6b6b6b6b)
> + ttrace_find_timer(nte);
> + }
> +
>   list_for_each_entry(nte, varp->vec + slot, entry) {
>   found = 1;
>   if (time_before(nte->expires, expires))
> 
> I thought I got it on the next crash, but the system locked too fast, and the 
> only thing I saw was:
> 
> TTRACE timer f7b52858 fn f8e7c608 addr c012d776
> TTRACE fn l2cap_info_timeout
> TTRACE addr mod_timer
> BUG: unable to handle kernel paging request at 6b6b6b6b
> IP:
> 
> $ addr2line -e vmlinux.kk1 c012d776
> /usr/src/linux-2.6.25-rc2-git5kk1/kernel/timer.c:533
> 
> int mod_timer(struct timer_list *timer, unsigned long expires)
> {
> BUG_ON(!timer->function);
> 
> timer_stats_timer_set_start_info(timer);
> /*
>

Re: Merging of completely unreviewed drivers

2008-02-21 Thread Al Viro
On Fri, Feb 22, 2008 at 03:23:45AM +0100, Krzysztof Halasa wrote:
> Al Viro <[EMAIL PROTECTED]> writes:
> 
> > ... if your style is lousy.  I agree that situation with printks is
> > not normal in that respect and I certainly have no love for the
> > checkpatch nonsense, but pressure to keep the fucking nesting depth
> > low is a Good Thing(tm).
> 
> Indeed. Unfortunately it is orthogonal to the line length limit.

Not quite.  Add such things as choice of sane identifiers.  And sane use of
local variables, while we are at it - things like twenty lines of
foobar[(index + 1) % BLAH]->spork.vomit[12]->field_name = ;
with the only difference in the field_name, except for one line where
we have a typo and see 11 instead of intended 12, are responsible for quite
a few of such overruns.
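
A sketch of the local-variable point, with made-up types (struct record, struct spork, struct foo and fill_record() are hypothetical, and BLAH is given an arbitrary value): evaluating the long expression once means the index 12 is written in exactly one place, so the 11-versus-12 typo has nowhere to hide, and every assignment line becomes short.

```c
#include <assert.h>
#include <stddef.h>

#define BLAH 4

/* Hypothetical types standing in for the ones in the example above. */
struct record { int first, second, third; };
struct spork { struct record *vomit[16]; };
struct foo { struct spork spork; };

static void fill_record(struct foo *foobar[BLAH], unsigned int index)
{
	/* Resolve the deep expression once instead of on every line. */
	struct record *r = foobar[(index + 1) % BLAH]->spork.vomit[12];

	assert(r != NULL);
	r->first = 1;
	r->second = 2;
	r->third = 3;
}
```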

IMO the line length overruns make good warnings.  Not as in "here's a cheap
way to get more changesets", but as in "that code might have other problems
nearby" kind of heuristics.


Re: Merging of completely unreviewed drivers

2008-02-21 Thread Linus Torvalds


On Fri, 22 Feb 2008, Al Viro wrote:
>
> ... if your style is lousy.  I agree that situation with printks is
> not normal in that respect and I certainly have no love for the
> checkpatch nonsense, but pressure to keep the fucking nesting depth
> low is a Good Thing(tm).

I do agree, but that has little to do with line length *directly*.

IOW, I'd personally be happier with a checkpatch that calculated 
"complexity" and indentation over line length.

There is definitely a correlation there: there is no question that complex 
lines with deep indentation tend to be long. So yes, "long lines are 
correlated with bad code" is certainly true to some degree.

But sometimes lines are long just because it's a function call with 
multiple parameters, and it's just three levels indented, and it had a 
string there too. It may be long, but it's not complex, and keeping it on 
one line actually makes it much easier to visually parse (and grep for, 
for that matter).

So I'd be happier with warnings about deep indentation (but how do you 
count it? Will people then try to fake things out by using 4-space indents 
and then "deep" indentations will look like just a couple of tabs?) and 
against complex expressions (ie "if ((a = xyz()) == NULL) .." should just 
be split up into "a = xyz(); if (!a) ..", but there are sometimes reasons 
for those things too!).
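
As a small illustration of that last split, here are both forms side by side. find_colon() is a hypothetical stand-in for xyz(), and the two prefix_len_* functions are invented for this example; they are meant to behave identically.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper standing in for xyz(): may return NULL. */
static const char *find_colon(const char *s)
{
	return strchr(s, ':');
}

/* Assignment and test folded into one condition. */
static size_t prefix_len_combined(const char *s)
{
	const char *a;

	assert(s != NULL);
	if ((a = find_colon(s)) == NULL)
		return strlen(s);
	return (size_t)(a - s);
}

/* The same logic split into "a = xyz(); if (!a) ..". */
static size_t prefix_len_split(const char *s)
{
	const char *a;

	assert(s != NULL);
	a = find_colon(s);
	if (!a)
		return strlen(s);
	return (size_t)(a - s);
}
```

The split version costs one extra line but removes a level of parentheses from the condition.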

Linus


Re: [PATCH] Document huge memory/cache overhead of memory controller in Kconfig

2008-02-21 Thread KOSAKI Motohiro
Hi

> > I think one reason many people get confused so easily is the bad menu
> > hierarchy.
> > I propose moving mem-cgroup to be a child of cgroup and resource counter
> > (= following the depends on).
> 
> > +config CGROUP_MEM_CONT
> > +   bool "Memory controller for cgroups"
> 
> Memory _resource_ controller for cgroups?

Ahhh, my proposal only changes the menu hierarchy.
I don't know the best name and I hope to avoid a rename discussion ;-)

Thanks.


- kosaki



Re: [PATCH 2/2] x86 : relocate uninitialized variable in init DATA section into init BSS section

2008-02-21 Thread H. Peter Anvin

Huang, Ying wrote:


> I think another method is to add a new attribute into GCC to prepend or
> append something to the section name instead of just replacing it, as in
> the following example:
> 
> #define __initdata  __attribute__((section_append(".init")))



Same difference, but less flexible.

-hpa


Re: Kernel oops with bluetooth usb dongle

2008-02-21 Thread Quel Qun

 -- Original message --
From: Thomas Gleixner <[EMAIL PROTECTED]>
> On Thu, 21 Feb 2008, Quel Qun wrote:
> > > >  > Not that I'm aware off, but this might as well be some old use after
> > > >  > free bug which got exposed by some unrelated change. The good news is
> > > >  > that it is reproducible. I'll hack up some nasty debug patch which
> > > >  > lets us - hopefully - decode where the timer was armed.
> > > >
> > > >  Quel, before I do that, is there any chance that you retest with the
> > > >  latest mainline git version ?
> > > >
> > > >  
> > > 
> http://www.kernel.org/pub/linux/kernel/v2.6/snapshots/patch-2.6.25-rc2-git4.bz2
> > > 
> > > And please test with this patch as well:
> > > 
> > > http://lkml.org/lkml/2008/2/20/121
> > > 
> > Same kind of result unfortunately with this last patch on top of git4:
> 
> At least it is fully reproducible. Please apply the patch below to
> your git4 tree and do not change your .config. The output should show,
> which code armed the timer.
> 
Thomas,

Thanks for the patch, but that did not work, I never got the trace.

I switched to git5 and applied the patch.

First crash (= attached kernlog.9) showed some hald process, so I decided to 
reduce the number of services and processes to a minimum. Attached are the 
process list before starting sdptool browse and crashing, the list of modules 
and the list of services.

Second crash:

BUG: unable to handle kernel paging request at 6b6b6b6b
IP: [] get_next_timer_interrupt+0x11f/0x234
*pde = 
Oops:  [#1] SMP
Modules linked in: hidp rfcomm l2cap nfsd exportfs nfs lockd nfs_acl sunrpc 
autofs4 af_packet binfmt_misc loop nls_iso8859_1 nls_cp437 vfat fat fuse 
snd_pcm_oss snd_mixer_oss snd_intel8x0 snd_ac97_codec hci_usb ac97_bus snd_pcm 
parport_pc snd_timer snd sr_mod i2c_i801 rtc_cmos iTCO_wdt i2c_core parport 
soundcore iTCO_vendor_support pcspkr snd_page_alloc bluetooth button thermal 
processor evdev dcdbas tg3 sg ide_disk piix ide_core ata_piix ahci libata 
sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd usbcore [last unloaded: 
scsi_wait_scan]

Pid: 0, comm: swapper Not tainted (2.6.25-rc2-git5kk1 #1)
EIP: 0060:[] EFLAGS: 00010002 CPU: 0
EIP is at get_next_timer_interrupt+0x11f/0x234
EAX: 6b6b6b6b EBX: 3ffda6f6 ECX: c0432744 EDX: 6b6b6b6b
ESI: 0027 EDI: c043260c EBP: c03b1ee8 ESP: c03b1eac
 DS: 007b ES: 007b FS: 00d8 GS:  SS: 0068
Process swapper (pid: 0, ti=c03b task=c03813a0 task.ti=c03b)
Stack: fffda6f6 c0431e00  fffda700 0001 00f7 0027 00fffda7
   c043260c c043280c c0432a0c c0432c0c c18090c0 0643e180 fffda6f6 c03b1f2c
   c013fb78 0001 c03a3b08 0046 c03b1f20 0644b3c1 0022 0643e180
Call Trace:
 [] ? tick_nohz_stop_sched_tick+0x130/0x337
 [] ? irq_exit+0x55/0x6e
 [] ? smp_apic_timer_interrupt+0x59/0x92
 [] ? apic_timer_interrupt+0x28/0x30
 [] ? get_signal_to_deliver+0x2d8/0x332
 [] ? native_safe_halt+0x5/0x7
 [] ? default_idle+0x4d/0x7f
 [] ? default_idle+0x0/0x7f
 [] ? cpu_idle+0x6f/0x100
 [] ? rest_init+0x49/0x50
 ===
Code: 85 e0 8b 4d e0 83 e1 3f 89 4d dc 89 ce 8b 04 f7 8b 10 0f 18 02 90 8d 0c 
f7 39 c8 0f 84 8d 00 00 00 8b 40 08 39 d8 0f 48 d8 89 d0 <8b> 12 0f 18 02 90 39 
c8 75 ec c7 45 cc 01 00 00 00 8b 7d dc 85
EIP: [] get_next_timer_interrupt+0x11f/0x234

$ addr2line -e vmlinux c012d51d
/usr/src/linux-2.6.25-rc2-git5kk1/kernel/timer.c:770

Crap, that is on the next list_for_each_entry in timer.c :(

I tried to make a similar test loop as you did a few lines above:

@@ -718,6 +767,14 @@
 
index = slot = timer_jiffies & TVN_MASK;
do {
+   struct list_head *tmp;
+
+   __list_for_each(tmp, varp->vec + slot) {
+   nte = (struct timer_list *) tmp;
+   if (nte->entry.next == (void *)0x6b6b6b6b)
+   ttrace_find_timer(nte);
+   }
+
list_for_each_entry(nte, varp->vec + slot, entry) {
found = 1;
if (time_before(nte->expires, expires))

I thought I got it on the next crash, but the system locked too fast, and the 
only thing I saw was:

TTRACE timer f7b52858 fn f8e7c608 addr c012d776
TTRACE fn l2cap_info_timeout
TTRACE addr mod_timer
BUG: unable to handle kernel paging request at 6b6b6b6b
IP:

$ addr2line -e vmlinux.kk1 c012d776
/usr/src/linux-2.6.25-rc2-git5kk1/kernel/timer.c:533

int mod_timer(struct timer_list *timer, unsigned long expires)
{
	BUG_ON(!timer->function);

	timer_stats_timer_set_start_info(timer);
	/*
	 * This is a common optimization triggered by the
	 * networking code - if the timer is re-modified
	 * to be the same thing then just return:
	 */
	if (timer->expires == expires && timer_pending(timer))
		return 1;

	return __mod_timer(timer, 

Re: [PATCH] correct inconsistent ntp interval/tick_length usage

2008-02-21 Thread john stultz

On Wed, 2008-02-20 at 18:08 +0100, Roman Zippel wrote:
> > Well, it is a problem if its large. The 500ppm limit is supposed to be
> > for hardware frequency error correction, not hardware frequency +
> > software error correction. Now, if it were 1-10ppm, it wouldn't be that
> > big of an issue, but with the jiffies example above, 153ppm does cut
> > into the correctable space a good bit.
> 
> Again, what kind of crappy hardware do you expect? Aren't clocks supposed 
> to get better and not worse?

Well, while I've seen much worse, I consider crappy hardware to be
100+ppm error. So if the hardware is perfect and the system results in
153ppm error, I'd consider that pretty crappy, especially if it's not the
hardware's fault.

> Where do you get this idea that the 500ppm are exclusively for hardware 
> errors? If you have such bad hardware, there is another simple solution: 
> change HZ to 100 and the error is reduced to 15ppm.

True, it's not exclusively for hardware errors, and if we were talking
about only 15ppm I wouldn't really worry about it. But when we're saying
the system is adding 30% of the maximum error, that's just not good.

> I would see the point if this problem had actually any practically 
> relevance, but this error is not a problem for pretty much all existing 
> standard hardware. Why are you insisting on redesigning timekeeping for 
> broken hardware?

Remember my earlier data? Where I was talking about the acpi_pm being a
multiple of the PIT frequency? By removing CLOCK_TICK_ADJUST we got a
127ppm error when HZ=1000. NO_HZ drops that down to where we don't care,
but this _does_ affect current hardware, so I'd call it relevant.


> > > > My understanding of your approach (removing CLOCK_TICK_ADJUST),
> > > > addresses issues #1 and #3, but hurts issue #2.
> > > 
> > > What exactly is hurt?
> > 
> > By injecting 153ppm of error, the ability for NTP to correct hardware
> > error within 500ppm is hurt.
> 
> There's nothing 'injected', that resolution error is very real and the 
> 500ppm limit is more than enough to deal with this. _Nobody_ is hurt by 
> this.

Sure, 500ppm is enough for most people with good hardware. But remember
the alpha example you brought up earlier? The HZ=1200 case, with the
CLOCK_TICK_RATE=32768? If we don't take CLOCK_TICK_ADJUST into account,
we end up with a **11230ppm** error from the granularity issue. NTP just
won't work on those systems.

Now granted, the three types of alpha systems that actually use that HZ
value is probably as close to "nobody" as you're going to get, but I
don't think we can just throw the granularity issue aside.


> Revert bbe4d18ac2e058c56adb0cd71f49d9ed3216a405 and 
> e13a2e61dd5152f5499d2003470acf9c838eab84 and remove CLOCK_TICK_ADJUST 
> completely. Add an optional kernel parameter ntp_tick_adj instead to allow 
> adjusting of a large base drift and thus keeping ntpd happy.
> The CLOCK_TICK_ADJUST mechanism was introduced at a time PIT was the 
> primary clock, but we have a variety of clock sources now, so a global PIT 
> specific adjustment makes little sense anymore.
> 
> Signed-off-by: Roman Zippel <[EMAIL PROTECTED]>

So thanks so much for sending the patch. It makes your solution clear.

My initial comments: It's simple and that really does have its merits. It
resolves the inconsistent comparison issue, and does not have the
smallish scaling error we talked about as well. As I've said before, I
do like the idea, I'm just worried about the corner cases (mainly
jiffies based systems).

The granularity issue is still present. Depending on the HZ settings and
the clocksource hardware, systems may see large errors added on to the
actual hardware error, and it's possible the kernel error may dominate
the actual hardware error.

The ntp_tick_adjust option does give a way out if you have, for example,
one of those alpha systems where it would be necessary, but I do wish
there was a better way than forcing users to calculate for themselves
what the granularity adjustment should be (esp. given that it is more a
function of the kernel compile options, so different kernels would need
different values for the same system).

So then yes, your patch is simple and corrects the issue that started
the discussion. I think we're closing the gaps. :)

I still think my claims hold that your patch as it stands may worsen the
drift error depending on HZ settings, especially on jiffies based
systems (which means every non-GENERIC_TIME arch). However, if folks
don't really care, then that may be acceptable.

As promised, here is my own patch, which takes the scaling error you
pointed out into account, as well as resolving the granularity issue in
a way similar to what your ntp_tick_adjust option does, only the kernel
will calculate such a granularity correction on a per-clocksource basis,
so users don't have to do the math.

Sadly I've not had the chance to really test and debug this (there's a
lot of shifting logic, so I may have flubbed something there), but I

[PATCH sched-devel 0/7] CPU isolation extensions

2008-02-21 Thread Max Krasnyanskiy

Ingo,

As you suggested I'm sending CPU isolation patches for review/inclusion into 
sched-devel tree. They are against 2.6.25-rc2.

You can also pull them from my GIT tree at
git://git.kernel.org/pub/scm/linux/kernel/git/maxk/cpuisol-2.6.git 
master

Diffstat:
b/Documentation/ABI/testing/sysfs-devices-system-cpu |   41 ++
b/Documentation/cpu-isolation.txt|  114 ++-
b/arch/x86/Kconfig   |1 
b/arch/x86/kernel/genapic_flat_64.c  |5 
b/drivers/base/cpu.c |   48 
b/include/linux/cpumask.h|3 
b/kernel/Kconfig.cpuisol |   15 ++
b/kernel/Makefile|4 
b/kernel/cpu.c   |   49 
b/kernel/sched.c |   37 --
b/kernel/stop_machine.c  |9 +
b/kernel/workqueue.c |   31 +++--
kernel/Kconfig.cpuisol   |   56 ++---
kernel/cpu.c |   16 +-
14 files changed, 356 insertions(+), 73 deletions(-)

List of commits
  cpuisol: Make cpu isolation configrable and export isolated map
  cpuisol: Do not route IRQs to the CPUs isolated at boot
  cpuisol: Do not schedule workqueues on the isolated CPUs
  cpuisol: Move on-stack array used for boot cmd parsing into __initdata
  cpuisol: Documentation updates
  cpuisol: Minor updates to the Kconfig options
  cpuisol: Do not halt isolated CPUs with Stop Machine

This patch series extends CPU isolation support.
The primary idea here is to be able to use some CPU cores as dedicated engines
for running user-space code with minimal kernel overhead/intervention; think of
it as an SPE in the Cell processor. I'd like to be able to run a CPU-intensive
(100%) RT task on one of the processors without adversely affecting or being
affected by the other system activities. System activities here include
_kernel_ activities as well.

I'm personally using this for hard realtime purposes. With CPU isolation it's
very easy to achieve single-digit usec worst-case and around 200 nsec average
response times on off-the-shelf multi-processor/core systems (vanilla kernel
plus these patches), even under extreme system load. I'm working with legal
folks on releasing a hard RT user-space framework for that.
I believe with the current multi-core CPU trend we will see more and more
applications that explore this capability: RT gaming engines, simulators, hard
RT apps, etc.


Hence the proposal is to extend current CPU isolation feature.
The new definition of the CPU isolation would be:
---
1. Isolated CPU(s) must not be subject to scheduler load balancing
 Users must explicitly bind threads in order to run on those CPU(s).

2. By default interrupts must not be routed to the isolated CPU(s)
 User must route interrupts (if any) to those CPUs explicitly.

3. In general kernel subsystems must avoid activity on the isolated CPU(s) as much as possible
 Includes workqueues, per CPU threads, etc.
 This feature is configurable and is disabled by default.
---


I've been maintaining this stuff since around 2.6.18 and it's been running in a
production environment for a couple of years now. It's been tested on all kinds
of machines, from NUMA boxes like HP xw9300/9400 to tiny uTCA boards like
Mercury AXA110.
The messiest part used to be the SLAB garbage collector changes. With the new
SLUB all that mess goes away (i.e. no changes necessary). Also, CFS seems to
handle CPU hotplug much better than O(1) did (i.e. domains are recomputed
dynamically), so isolation can be done at any time (via sysfs).
So this seems like a good time to merge.


We've had scheduler support for CPU isolation ever since the O(1) scheduler
went in. In other words:
#1 is already supported. These patches do not change/affect that functionality
in any way.
#2 is a trivial one-liner change to the IRQ init code.
#3 is addressed by a couple of separate patches. The main problem here is that
an RT thread can prevent kernel threads from running, and the machine gets stuck
because other CPUs are waiting for those threads to run and report back.

Folks involved in the scheduler/cpuset development provided a lot of feedback
on the first series of patches. I believe I managed to explain and clarify
every aspect.
Paul Jackson initially suggested implementing #2 and #3 using the cpusets
subsystem. Paul and I looked at it more closely and determined that exporting
cpu_isolated_map instead is a better option.

Details here
http://marc.info/?l=linux-kernel&m=120180692331461&w=2

Last patch to the stop machine is potentially unsafe and is marked as experimental. Unfortunately 
it's currently the only option that allows dynamic module insertion/removal for above scenarios. 

From the previous discussions it's the only 

Re: [PATCH 2/2] x86 : relocate uninitialized variable in init DATA section into init BSS section

2008-02-21 Thread Huang, Ying
On Thu, 2008-02-21 at 10:53 +0100, Ingo Molnar wrote:
> * Huang, Ying <[EMAIL PROTECTED]> wrote:
> 
> > > > -int __initdata early_ioremap_debug;
> > > > +int __initbss early_ioremap_debug;
> > > 
> > > will we get some sort of build error if we accidentally do:
> > > 
> > >int __initbss early_ioremap_debug = 1;
> > > 
> > > ?
> > 
> > I tested it just now, and there is no build error.
> 
> well, that's bad. We'd silently ignore the " = 1" and boot up with that 
> value at 0, right? At minimum we need some really prominent build-time 
> _errors_ (i.e. aborted builds) if this ever happens. But ideally, 
> shouldn't this whole thing be done at link time? Couldn't the linker sort 
> the variables that are zero initialized into the right section, and move 
> this constant maintenance pressure off the programmer's shoulder?

I think another method is to add a new attribute to GCC that prepends or
appends something to the section name instead of just replacing it, as in
the following example:

#define __initdata  __attribute__((section_append(".init")))

int __initdata early_ioremap_debug_data = 1;
int __initdata early_ioremap_debug_bss;

GCC can deduce the section (.data or .bss) of a global variable based
on whether it is initialized. That is, without the attribute,
early_ioremap_debug_data will be in ".data" and early_ioremap_debug_bss
will be in ".bss". With the section_append attribute,
early_ioremap_debug_data will be in ".data.init" and
early_ioremap_debug_bss will be in ".bss.init".

Best Regards,
Huang Ying



Configure MSI-X vectors to target different CPUs

2008-02-21 Thread caiying
Hi,

In MSI-HOWTO, it's said:

"Using MSI enables the device functions to support two or more vectors, which 
can be configured to target different CPUs to increase scalability."

So how can I set up MSI-X vectors to target different CPUs? I want to allocate 
the same number of MSI-X vectors as CPUs, and equally distribute them to every 
CPU.

Is it automatically done by Linux when I call pci_enable_msix()? If yes, how? 
If not, what should I do? My guess is to set the affinity of the interrupts 
manually. Am I right?

Please CC me ([EMAIL PROTECTED]) on answers/comments in response to this
posting.

Thanks,
Ying




Re: modular intel-agp does not work on my box

2008-02-21 Thread Gabriel C
Dave Airlie wrote:
>> Hi,
>>
>> When building agp* modular ( CONFIG_AGP=y/m and CONFIG_AGP_INTEL=m ) 
>> intel-agp does nothing on my box 
>> ( Dell Precision WorkStation 530 MT ) chipset is not being detected.
>>
>> Building both Y fixes that and agpgart works and also detects my chipset.
> 
> Have you got EDAC modules built as well? they might be taking ownership 
> when they shouldn't..
> 

Yes, I have EDAC built modular. I will build the latest git without EDAC and
with agp modular, and let you know if that fixes ( works around ;) ) the problem.

> 
>> ..
>>
>> Linux agpgart interface v0.102
>> agpgart: Detected an Intel i860 Chipset.
>> agpgart: AGP aperture is 256M @ 0xe000
>>
>> ..
>>
>> Also I've tested kernel 2.6.24.2 and 2.6.25-rc2 both with same result.
>>
>> lspci -vvvxxx output :
>>
>> 00:00.0 Host bridge: Intel Corporation 82860 860 (Wombat) Chipset Host 
>> Bridge (MCH) (rev 04)
>>  Subsystem: Dell Unknown device 00d8
>>  Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
>> Stepping- SERR+ FastB2B- DisINTx-
>>  Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- > SERR- >  Latency: 0
>>  Region 0: Memory at e000 (32-bit, prefetchable) [size=256M]
>>  Capabilities: [a0] AGP version 2.0
>>  Status: RQ=32 Iso- ArqSz=0 Cal=0 SBA+ ITACoh- GART64- HTrans- 
>> 64bit- FW+ AGP3- Rate=x1,x2,x4
>>  Command: RQ=1 ArqSz=0 Cal=0 SBA+ AGP+ GART64- 64bit- FW+ Rate=x4
>>  Kernel driver in use: agpgart-intel
>>  Kernel modules: i82860_edac
>> 00: 86 80 31 25 06 01 90 a0 04 00 00 06 00 00 00 00
>> 10: 08 00 00 e0 00 00 00 00 00 00 00 00 00 00 00 00
>> 20: 00 00 00 00 00 00 00 00 00 00 00 00 28 10 d8 00
>> 30: 00 00 00 00 a0 00 00 00 00 00 00 00 00 00 00 00
>> 40: d4 d4 92 80 80 80 80 80 80 80 80 80 80 80 80 80
>> 50: 05 6b 02 00 00 00 00 00 00 10 11 11 01 00 11 11
>> 60: 10 00 20 08 28 10 28 10 28 10 28 10 28 10 28 10
>> 70: 28 10 28 10 28 10 28 10 28 10 28 10 28 10 28 10
>> 80: 00 00 00 00 00 00 00 00 0f 00 00 00 00 00 00 00
>> 90: 0b 00 0b 00 61 01 01 08 55 19 00 00 81 0a 38 00
>> a0: 02 00 20 00 17 02 00 1f 14 03 00 00 00 00 00 00
>> b0: 80 00 00 00 00 00 00 00 00 00 04 27 20 10 8b 00
>> c0: 44 c0 50 11 00 28 00 00 00 00 00 00 03 00 00 00
>> d0: 02 28 00 0e 03 00 00 33 af 09 31 b5 01 00 0b 00
>> e0: 00 00 6a 00 00 00 00 01 2a 25 2b 33 07 00 00 00
>> f0: 00 00 01 00 74 f8 30 80 38 0f 00 00 00 00 00 00
>>
>> 00:01.0 PCI bridge: Intel Corporation 82850 850 (Tehama) Chipset AGP Bridge 
>> (rev 04) (prog-if 00 [Normal decode])
>>  Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
>> Stepping- SERR+ FastB2B- DisINTx-
>>  Status: Cap- 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- > SERR- >  Latency: 64
>>  Bus: primary=00, secondary=01, subordinate=01, sec-latency=64
>>  I/O behind bridge: e000-efff
>>  Memory behind bridge: ff80-ff9f
>>  Prefetchable memory behind bridge: d000-dfff
>>  Secondary status: 66MHz+ FastB2B+ ParErr- DEVSEL=medium >TAbort- 
>> >  BridgeCtl: Parity- SERR+ NoISA+ VGA+ MAbort- >Reset- FastB2B-
>>  PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
>>  Kernel modules: shpchp
>> 00: 86 80 32 25 07 01 a0 00 04 00 04 06 00 40 01 00
>> 10: 00 00 00 00 00 00 00 00 00 01 01 40 e0 e0 a0 22
>> 20: 80 ff 90 ff 00 d0 f0 df 00 00 00 00 00 00 00 00
>> 30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0e 00
>> 40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> 50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> 60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>
>> 00:02.0 PCI bridge: Intel Corporation 82860 860 (Wombat) Chipset AGP Bridge 
>> (rev 04) (prog-if 00 [Normal decode])
>>  Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
>> Stepping- SERR+ FastB2B- DisINTx-
>>  Status: Cap- 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- > SERR- >  Latency: 64
>>  Bus: primary=00, secondary=02, subordinate=03, sec-latency=0
>>  I/O behind bridge: d000-dfff
>>  Memory behind bridge: ff50-ff7f
>>  Prefetchable memory behind bridge: fff0-000f
>>  Secondary status: 66MHz+ FastB2B+ ParErr- DEVSEL=medium >TAbort- 
>> >  BridgeCtl: Parity- SERR+ NoISA+ VGA- MAbort- >Reset- FastB2B-
>>  PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
>>  Kernel modules: shpchp
>> 00: 86 80 33 25 07 01 a0 00 04 00 04 06 00 40 01 00
>> 10: 00 00 00 00 00 00 00 00 

Re: Merging of completely unreviewed drivers

2008-02-21 Thread Krzysztof Halasa
Al Viro <[EMAIL PROTECTED]> writes:

> ... if your style is lousy.  I agree that situation with printks is
> not normal in that respect and I certainly have no love for the
> checkpatch nonsense, but pressure to keep the fucking nesting depth
> low is a Good Thing(tm).

Indeed. Unfortunately it is orthogonal to the line length limit.

We should limit the nesting level, though I think there is no
universally good value. What is good for one case (a function with a
short multi-level if/for/etc) is bad for another (a long switch()
where any added complexity makes it unparseable).

So I think it just has to meet the author's and reviewers' taste. We
already depend on this.
-- 
Krzysztof Halasa


Re: Module loading/unloading and "The Stop Machine"

2008-02-21 Thread Max Krasnyanskiy

Tejun Heo wrote:
> Max Krasnyanskiy wrote:
>> Tejun Heo wrote:
>>> Max Krasnyanskiy wrote:
>>>> Thanks for the info. I guess I missed that from the code. In any case
>>>> that seems like a pretty heavy refcounting mechanism. In a sense that
>>>> every time something is loaded or unloaded entire machine freezes,
>>>> potentially for several milliseconds. Normally it's not a big deal. But
>>>> once you get more and more CPUs and/or start using realtime apps this
>>>> becomes a big deal.
>>>
>>> Module loading doesn't involve stop_machine last time I checked.  It's a
>>> big deal when unloading a module but it's actually a very good trade off
>>> because it makes much hotter path (module_get/put) much cheaper.  If
>>> your application can't stand stop_machine, simply don't unload a module.
>>
>> static struct module *load_module(void __user *umod,
>>  unsigned long len,
>>  const char __user *uargs)
>> {
>>  ...
>>
>>  /* Now sew it into the lists so we can get lockdep and oops
>> * info during argument parsing.  Noone should access us, since
>> * strong_try_module_get() will fail. */
>>stop_machine_run(__link_module, mod, NR_CPUS);
>>  ...
>> }
>
> Ah... right.  That part doesn't have anything to do with module
> reference counting as the comment suggests and can probably be removed
> by updating how kallsyms synchronize against module load/unload.

That list (updated by __link_module) is accessed in a couple of other places,
i.e. outside the symbol lookup stuff used for kallsyms.

Max


Re: modular intel-agp does not work on my box

2008-02-21 Thread Dave Airlie

> Hi,
> 
> When building agp* modular ( CONFIG_AGP=y/m and CONFIG_AGP_INTEL=m ) 
> intel-agp does nothing on my box 
> ( Dell Precision WorkStation 530 MT ) chipset is not being detected.
> 
> Building both Y fixes that and agpgart works and also detects my chipset.

Have you got EDAC modules built as well? they might be taking ownership 
when they shouldn't..

Dave.

> 
> ..
> 
> Linux agpgart interface v0.102
> agpgart: Detected an Intel i860 Chipset.
> agpgart: AGP aperture is 256M @ 0xe000
> 
> ..
> 
> Also I've tested kernel 2.6.24.2 and 2.6.25-rc2 both with same result.
> 
> lspci -vvvxxx output :
> 
> 00:00.0 Host bridge: Intel Corporation 82860 860 (Wombat) Chipset Host Bridge 
> (MCH) (rev 04)
>   Subsystem: Dell Unknown device 00d8
>   Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
> Stepping- SERR+ FastB2B- DisINTx-
>   Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort-  SERR-Latency: 0
>   Region 0: Memory at e000 (32-bit, prefetchable) [size=256M]
>   Capabilities: [a0] AGP version 2.0
>   Status: RQ=32 Iso- ArqSz=0 Cal=0 SBA+ ITACoh- GART64- HTrans- 
> 64bit- FW+ AGP3- Rate=x1,x2,x4
>   Command: RQ=1 ArqSz=0 Cal=0 SBA+ AGP+ GART64- 64bit- FW+ Rate=x4
>   Kernel driver in use: agpgart-intel
>   Kernel modules: i82860_edac
> 00: 86 80 31 25 06 01 90 a0 04 00 00 06 00 00 00 00
> 10: 08 00 00 e0 00 00 00 00 00 00 00 00 00 00 00 00
> 20: 00 00 00 00 00 00 00 00 00 00 00 00 28 10 d8 00
> 30: 00 00 00 00 a0 00 00 00 00 00 00 00 00 00 00 00
> 40: d4 d4 92 80 80 80 80 80 80 80 80 80 80 80 80 80
> 50: 05 6b 02 00 00 00 00 00 00 10 11 11 01 00 11 11
> 60: 10 00 20 08 28 10 28 10 28 10 28 10 28 10 28 10
> 70: 28 10 28 10 28 10 28 10 28 10 28 10 28 10 28 10
> 80: 00 00 00 00 00 00 00 00 0f 00 00 00 00 00 00 00
> 90: 0b 00 0b 00 61 01 01 08 55 19 00 00 81 0a 38 00
> a0: 02 00 20 00 17 02 00 1f 14 03 00 00 00 00 00 00
> b0: 80 00 00 00 00 00 00 00 00 00 04 27 20 10 8b 00
> c0: 44 c0 50 11 00 28 00 00 00 00 00 00 03 00 00 00
> d0: 02 28 00 0e 03 00 00 33 af 09 31 b5 01 00 0b 00
> e0: 00 00 6a 00 00 00 00 01 2a 25 2b 33 07 00 00 00
> f0: 00 00 01 00 74 f8 30 80 38 0f 00 00 00 00 00 00
> 
> 00:01.0 PCI bridge: Intel Corporation 82850 850 (Tehama) Chipset AGP Bridge 
> (rev 04) (prog-if 00 [Normal decode])
>   Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
> Stepping- SERR+ FastB2B- DisINTx-
>   Status: Cap- 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort-  SERR-Latency: 64
>   Bus: primary=00, secondary=01, subordinate=01, sec-latency=64
>   I/O behind bridge: e000-efff
>   Memory behind bridge: ff80-ff9f
>   Prefetchable memory behind bridge: d000-dfff
>   Secondary status: 66MHz+ FastB2B+ ParErr- DEVSEL=medium >TAbort- 
>BridgeCtl: Parity- SERR+ NoISA+ VGA+ MAbort- >Reset- FastB2B-
>   PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
>   Kernel modules: shpchp
> 00: 86 80 32 25 07 01 a0 00 04 00 04 06 00 40 01 00
> 10: 00 00 00 00 00 00 00 00 00 01 01 40 e0 e0 a0 22
> 20: 80 ff 90 ff 00 d0 f0 df 00 00 00 00 00 00 00 00
> 30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0e 00
> 40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 
> 00:02.0 PCI bridge: Intel Corporation 82860 860 (Wombat) Chipset AGP Bridge 
> (rev 04) (prog-if 00 [Normal decode])
>   Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
> Stepping- SERR+ FastB2B- DisINTx-
>   Status: Cap- 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort-  SERR-Latency: 64
>   Bus: primary=00, secondary=02, subordinate=03, sec-latency=0
>   I/O behind bridge: d000-dfff
>   Memory behind bridge: ff50-ff7f
>   Prefetchable memory behind bridge: fff0-000f
>   Secondary status: 66MHz+ FastB2B+ ParErr- DEVSEL=medium >TAbort- 
>BridgeCtl: Parity- SERR+ NoISA+ VGA- MAbort- >Reset- FastB2B-
>   PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
>   Kernel modules: shpchp
> 00: 86 80 33 25 07 01 a0 00 04 00 04 06 00 40 01 00
> 10: 00 00 00 00 00 00 00 00 00 02 03 00 d0 d0 a0 02
> 20: 50 ff 70 ff f0 ff 00 00 00 00 00 00 00 00 00 00
> 30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 06 00
> 40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 50: 02 28 00 0e 23 14 00 00 00 00 00 00 

Re: Merging of completely unreviewed drivers

2008-02-21 Thread Al Viro
On Fri, Feb 22, 2008 at 12:16:45PM +1030, David Newall wrote:
> Krzysztof Halasa wrote:
> > Linus Torvalds <[EMAIL PROTECTED]> writes:
> >> I'm personally of the opinion that a lot of checkpatch "fixes" are 
> >> anything but. That mainly concerns fixing overlong lines
> >> 
> >
> > Perhaps we should increase line length limit, 132 should be fine.
> > Especially useful with long printk() lines and long arithmetic
> > expressions.
> >   
> 
> 
> Yes; or even longer.  80 characters might have made sense on a screen
> when the alternative was 80 characters on a punched card, but on a
> modern computer it's very restrictive.  That's especially true with the
> deep indents that you quickly get in C

... if your style is lousy.  I agree that situation with printks is
not normal in that respect and I certainly have no love for the
checkpatch nonsense, but pressure to keep the fucking nesting depth
low is a Good Thing(tm).


Re: Module loading/unloading and "The Stop Machine"

2008-02-21 Thread Tejun Heo
Max Krasnyanskiy wrote:
> Tejun Heo wrote:
>> Max Krasnyanskiy wrote:
>>> Thanks for the info. I guess I missed that from the code. In any case
>>> that seems like a pretty heavy refcounting mechanism. In a sense that
>>> every time something is loaded or unloaded entire machine freezes,
>>> potentially for several milliseconds. Normally it's not a big deal. But
>>> once you get more and more CPUs and/or start using realtime apps this
>>> becomes a big deal.
>>
>> Module loading doesn't involve stop_machine last time I checked.  It's a
>> big deal when unloading a module but it's actually a very good trade off
>> because it makes much hotter path (module_get/put) much cheaper.  If
>> your application can't stand stop_machine, simply don't unload a module.
> 
> static struct module *load_module(void __user *umod,
>  unsigned long len,
>  const char __user *uargs)
> {
>  ...
> 
>  /* Now sew it into the lists so we can get lockdep and oops
> * info during argument parsing.  Noone should access us, since
> * strong_try_module_get() will fail. */
>stop_machine_run(__link_module, mod, NR_CPUS);
>  ...
> }

Ah... right.  That part doesn't have anything to do with module
reference counting as the comment suggests and can probably be removed
by updating how kallsyms synchronize against module load/unload.

> I actually rarely unload modules. The way I notice the problem in first
> place is when things started hanging when tun driver was autoloaded or
> when fs automounts triggered some auto loading.
> These days it's kind hard to have a semi-general purpose machine without
> module loading :).

Yeap, agreed.

-- 
tejun


Re: Merging of completely unreviewed drivers

2008-02-21 Thread Krzysztof Halasa
Jeff Garzik <[EMAIL PROTECTED]> writes:

> Every time this discussion comes up, people point out that it remains
> highly common to open multiple 80-column terminal windows, making the
> 80-column limit still highly relevant in modern times.

I guess only because of the limit :-)
Raise the limit, terminal windows will follow.
I'm using 80-column windows, too.
-- 
Krzysztof Halasa


Re: [PATCH] capabilities: implement per-process securebits

2008-02-21 Thread Andrew G. Morgan

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Serge E. Hallyn wrote:
|> It all looks good to me.
|
|> Since we've confirmed that wireshark uses capabilities it must be using
|> prctl(PR_SET_KEEPCAPS), so running it might be a good way to verify that
|> your changes to that codepath (with CONFIG_SECURITY_FILE_CAPABILITIES=n)
|> are 100% correct, and might set minds at ease.  Is that something you're
|> set up to be able to do?

I guess I need someone to offer an existence proof that this particular
wireshark code ever worked? (ldd dumpcap|grep libcap). For reference,
I'm looking at:

wireshark-0.99.7/dumpcap.c:302

void
relinquish_privs_except_capture(void)
{
    /* CAP_NET_ADMIN: Promiscuous mode and a truckload of other
     *                stuff we don't need (and shouldn't have).
     * CAP_NET_RAW:   Packet capture (raw sockets).
     */
    cap_value_t cap_list[2] = { CAP_NET_ADMIN, CAP_NET_RAW };
    cap_t caps = cap_init();
    int cl_len = sizeof(cap_list) / sizeof(cap_value_t);

    if (started_with_special_privs()) {
        print_caps("Pre drop, pre set");
        if (prctl(PR_SET_KEEPCAPS, 1, 0, 0, 0) == -1) {
            perror("prctl()");
        }

        cap_set_flag(caps, CAP_PERMITTED,   cl_len, cap_list, CAP_SET);
        cap_set_flag(caps, CAP_INHERITABLE, cl_len, cap_list, CAP_SET);

[ XXX:AGM since (caps.pE > caps.pP) this next line should fail ]
        if (cap_set_proc(caps)) {
            perror("capset()");
        }
        print_caps("Pre drop, post set");
    }

    relinquish_special_privs_perm();

    print_caps("Post drop, pre set");
    cap_set_flag(caps, CAP_EFFECTIVE,   cl_len, cap_list, CAP_SET);
    if (cap_set_proc(caps)) {
        perror("capset()");
    }
    print_caps("Post drop, post set");
    cap_free(caps);
}
#endif /* HAVE_LIBCAP */

My reading of the above code suggests that the application believes that
it can raise/retain effective capabilities that are not in its permitted
set.

Browsing back in my git tree all the way back to 'v2.6.12-rc2', the
following code (cap_capset_check) correctly requires:

	/* verify the _new_Effective_ is a subset of the _new_Permitted_ */
	if (!cap_issubset (*effective, *permitted)) {
		return -EPERM;
	}

so my question is, why should one expect this wireshark code to work? It
looks wrong to me.

Thanks

Andrew

|
|> -serge
|
| Thanks
|
| Andrew

From 006ddf6903983dd596e360ab1ab8e537b29fab46 Mon Sep 17 00:00:00 2001
From: Andrew G. Morgan <[EMAIL PROTECTED]>
Date: Mon, 18 Feb 2008 15:23:28 -0800
Subject: [PATCH] Implement per-process securebits

[This patch represents a no-op unless CONFIG_SECURITY_FILE_CAPABILITIES
 is enabled at configure time.]

Filesystem capability support makes it possible to do away with
(set)uid-0 based privilege and use capabilities instead. That is, with
filesystem support for capabilities but without this present patch,
it is (conceptually) possible to manage a system with capabilities
alone and never need to obtain privilege via (set)uid-0.

Of course, conceptually isn't quite the same as currently possible
since few user applications, certainly not enough to run a viable
system, are currently prepared to leverage capabilities to exercise
privilege. Further, many applications exist that may never get
upgraded in this way, and the kernel will continue to want to support
their setuid-0 based privilege needs.

Where pure-capability applications evolve and replace setuid-0
binaries, it is desirable that there be a mechanism by which they
can contain their privilege. In addition to leveraging the per-process
bounding and inheritable sets, this should include suppressing the
privilege of the uid-0 superuser from the process' tree of children.

The feature added by this patch can be leveraged to suppress the
privilege associated with (set)uid-0. This suppression requires
CAP_SETPCAP to initiate, and only immediately affects the 'current'
process (it is inherited through fork()/exec()). This
reimplementation differs significantly from the historical support for
securebits which was system-wide, unwieldy and which has ultimately
withered to a dead relic in the source of the modern kernel.

With this patch applied, a process that is capable(CAP_SETPCAP) can
now drop all legacy privilege (through uid=0) for itself and all
subsequently fork()'d/exec()'d children with:

  prctl(PR_SET_SECUREBITS, 0x2f);

[2008/02/18: This version includes an int -> long argument fix from Serge.]

Signed-off-by: Andrew G. Morgan <[EMAIL PROTECTED]>
Acked-by: Serge Hallyn <[EMAIL PROTECTED]>
Reviewed-by: James Morris <[EMAIL PROTECTED]>
---
 include/linux/capability.h |3 +-
 include/linux/init_task.h  |3 +-
 include/linux/prctl.h  |9 +++-
 include/linux/sched.h  |3 +-
 include/linux/securebits.h |   25 ---
 include/linux/security.h   |   14 +++---
 kernel/sys.c   |   25 +--

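
To make the 0x2f value in the patch description less opaque, here is a small user-space sketch that composes it from individual flags. The EX_SECBIT_* macros are local, hypothetical names; their values assume the securebits layout in which each flag bit is followed by its lock bit (NOROOT, NO_SETUID_FIXUP, KEEP_CAPS). It only computes and prints the mask and does not call prctl().

```c
#include <stdio.h>

/* Local copies of the securebits flag values (assumed layout, see above). */
#define EX_SECBIT_NOROOT                 0x01UL
#define EX_SECBIT_NOROOT_LOCKED          0x02UL
#define EX_SECBIT_NO_SETUID_FIXUP        0x04UL
#define EX_SECBIT_NO_SETUID_FIXUP_LOCKED 0x08UL
#define EX_SECBIT_KEEP_CAPS              0x10UL
#define EX_SECBIT_KEEP_CAPS_LOCKED       0x20UL

/* Mask that disables uid-0 privilege and locks every flag in place. */
static unsigned long drop_legacy_root_mask(void)
{
    return EX_SECBIT_NOROOT | EX_SECBIT_NOROOT_LOCKED |
           EX_SECBIT_NO_SETUID_FIXUP | EX_SECBIT_NO_SETUID_FIXUP_LOCKED |
           EX_SECBIT_KEEP_CAPS_LOCKED;
}

/* Print one line per flag, showing whether it is set in 'bits'. */
static void print_securebits(unsigned long bits)
{
    static const struct {
        unsigned long bit;
        const char *name;
    } flags[] = {
        { EX_SECBIT_NOROOT,                 "NOROOT" },
        { EX_SECBIT_NOROOT_LOCKED,          "NOROOT_LOCKED" },
        { EX_SECBIT_NO_SETUID_FIXUP,        "NO_SETUID_FIXUP" },
        { EX_SECBIT_NO_SETUID_FIXUP_LOCKED, "NO_SETUID_FIXUP_LOCKED" },
        { EX_SECBIT_KEEP_CAPS,              "KEEP_CAPS" },
        { EX_SECBIT_KEEP_CAPS_LOCKED,       "KEEP_CAPS_LOCKED" },
    };

    for (size_t i = 0; i < sizeof(flags) / sizeof(flags[0]); i++)
        printf("%-24s %s\n", flags[i].name,
               (bits & flags[i].bit) ? "set" : "clear");
}
```

Under that assumption the mask leaves KEEP_CAPS clear but locked, so the legacy keep-caps behaviour cannot be re-enabled later by the process or its children.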

[PATCH 1/3] exporting capability name/code pairs (final)

2008-02-21 Thread Kohei KaiGai
[1/3] Add a private data field within kobj_attribute structure.

This patch adds a private data field, declared as void *, to the kobj_attribute
structure. The _show() and _store() methods of sysfs attribute entries can
refer to this field to identify which entry is being accessed.
This makes it easier to share a single method implementation among several
similar entries, such as the ones that export the list of capabilities the
running kernel supports.

Signed-off-by: KaiGai Kohei <[EMAIL PROTECTED]>
--
 include/linux/kobject.h |1 +
 include/linux/sysfs.h   |7 +++
 2 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/include/linux/kobject.h b/include/linux/kobject.h
index caa3f41..57d5bf1 100644
--- a/include/linux/kobject.h
+++ b/include/linux/kobject.h
@@ -130,6 +130,7 @@ struct kobj_attribute {
char *buf);
ssize_t (*store)(struct kobject *kobj, struct kobj_attribute *attr,
 const char *buf, size_t count);
+   void *data; /* a private field */
 };

 extern struct sysfs_ops kobj_sysfs_ops;
diff --git a/include/linux/sysfs.h b/include/linux/sysfs.h
index 8027104..6f40ff9 100644
--- a/include/linux/sysfs.h
+++ b/include/linux/sysfs.h
@@ -50,6 +50,13 @@ struct attribute_group {
.store  = _store,   \
 }

+#define __ATTR_DATA(_name,_mode,_show,_store,_data) {  \
+   .attr = {.name = __stringify(_name), .mode = _mode },   \
+   .show   = _show,\
+   .store  = _store,   \
+   .data   = (void *)(_data),  \
+}
+   
 #define __ATTR_RO(_name) { \
.attr   = { .name = __stringify(_name), .mode = 0444 }, \
.show   = _name##_show, \

-- 
OSS Platform Development Division, NEC
KaiGai Kohei <[EMAIL PROTECTED]>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 2/3] exporting capability name/code pairs (final)

2008-02-21 Thread Kohei KaiGai
[2/3] Exporting capability code/name pairs

This patch exports the code/name pairs of the capabilities supported by the
running kernel.

A newer kernel sometimes adds new capabilities, like CAP_MAC_ADMIN
in 2.6.25. However, we have no interface to disclose which capabilities
are supported by the running kernel. Thus, we have to keep the libcap
version in sync with the kernel.

This patch enables libcap to collect the list of capabilities at run time
and provide them to users. It helps to improve the portability of the library.

It exports this information as regular files under /sys/kernel/capability.
Each numeric node exports the capability's name; each symbolic node exports
its code.

Please consider putting this patch on the queue for 2.6.25.

Thanks,

 BEGIN EXAMPLE 
[EMAIL PROTECTED] ~]$ ls -R /sys/kernel/capability/
/sys/kernel/capability/:
codes  names  version

/sys/kernel/capability/codes:
0  10  12  14  16  18  2   21  23  25  27  29  30  32  4  6  8
1  11  13  15  17  19  20  22  24  26  28  3   31  33  5  7  9

/sys/kernel/capability/names:
cap_audit_control    cap_kill              cap_net_raw     cap_sys_nice
cap_audit_write      cap_lease             cap_setfcap     cap_sys_pacct
cap_chown            cap_linux_immutable   cap_setgid      cap_sys_ptrace
cap_dac_override     cap_mac_admin         cap_setpcap     cap_sys_rawio
cap_dac_read_search  cap_mac_override      cap_setuid      cap_sys_resource
cap_fowner           cap_mknod             cap_sys_admin   cap_sys_time
cap_fsetid           cap_net_admin         cap_sys_boot    cap_sys_tty_config
cap_ipc_lock         cap_net_bind_service  cap_sys_chroot
cap_ipc_owner        cap_net_broadcast     cap_sys_module
[EMAIL PROTECTED] ~]$ cat /sys/kernel/capability/version
0x20071026
[EMAIL PROTECTED] ~]$ cat /sys/kernel/capability/codes/30
cap_audit_control
[EMAIL PROTECTED] ~]$ cat /sys/kernel/capability/names/cap_sys_pacct
20
[EMAIL PROTECTED] ~]$
 END EXAMPLE --
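
For illustration, a library such as libcap might consume the layout shown in the example roughly as follows. This is a sketch under assumptions: the /sys/kernel/capability tree exists only with this patch applied, and cap_code_path(), read_first_line() and cap_name_from_code() are hypothetical helper names, not libcap API.

```c
#include <stdio.h>
#include <string.h>

#define CAP_SYSFS_DIR "/sys/kernel/capability"

/* Build "<dir>/codes/<code>". Returns 0 on success, -1 if buf is too small. */
static int cap_code_path(int code, char *buf, size_t len)
{
    int n = snprintf(buf, len, "%s/codes/%d", CAP_SYSFS_DIR, code);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Read the first line of 'path' into 'out', without the trailing newline. */
static int read_first_line(const char *path, char *out, size_t len)
{
    FILE *f = fopen(path, "r");

    if (!f)
        return -1;
    if (!fgets(out, (int)len, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    out[strcspn(out, "\n")] = '\0';
    return 0;
}

/* Look up the symbolic name for a numeric capability code. */
static int cap_name_from_code(int code, char *name, size_t len)
{
    char path[128];

    if (cap_code_path(code, path, sizeof(path)) != 0)
        return -1;
    return read_first_line(path, name, len);
}
```

On a kernel carrying this patch, cap_name_from_code(30, ...) would be expected to yield the same string as "cat /sys/kernel/capability/codes/30" in the example above; on any other kernel it simply returns -1 because the file cannot be opened.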

Signed-off-by: KaiGai Kohei <[EMAIL PROTECTED]>
--
 Documentation/ABI/testing/sysfs-kernel-capability |   23 +
 scripts/mkcapnames.sh |   44 +
 security/Makefile |9 ++
 security/commoncap.c  |   99 +
 4 files changed, 175 insertions(+), 0 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-kernel-capability b/Documentation/ABI/testing/sysfs-kernel-capability
index e69de29..402ef06 100644
--- a/Documentation/ABI/testing/sysfs-kernel-capability
+++ b/Documentation/ABI/testing/sysfs-kernel-capability
@@ -0,0 +1,23 @@
+What:  /sys/kernel/capability
+Date:  Feb 2008
+Contact:   KaiGai Kohei <[EMAIL PROTECTED]>
+Description:
+   The entries under /sys/kernel/capability are used to export
+   the list of capabilities the running kernel supported.
+
+   - /sys/kernel/capability/version
+ returns the most preferable version number for the
+ running kernel.
+ e.g) $ cat /sys/kernel/capability/version
+  0x20071026
+
+   - /sys/kernel/capability/code/
+ returns its symbolic representation, on reading.
+ e.g) $ cat /sys/kernel/capability/codes/30
+  cap_audit_control
+
+   - /sys/kernel/capability/name/
+ returns its numerical representation, on reading.
+ e.g) $ cat /sys/kernel/capability/names/cap_sys_pacct
+  20
+
diff --git a/scripts/mkcapnames.sh b/scripts/mkcapnames.sh
index e69de29..5d36d52 100644
--- a/scripts/mkcapnames.sh
+++ b/scripts/mkcapnames.sh
@@ -0,0 +1,44 @@
+#!/bin/sh
+
+#
+# generate a cap_names.h file from include/linux/capability.h
+#
+
+CAPHEAD="`dirname $0`/../include/linux/capability.h"
+REGEXP='^#define CAP_[A-Z_]+[  ]+[0-9]+$'
+NUMCAP=`cat "$CAPHEAD" | egrep -c "$REGEXP"`
+
+echo '#ifndef CAP_NAMES_H'
+echo '#define CAP_NAMES_H'
+echo
+echo '/*'
+echo ' * Do NOT edit this file directly.'
+echo ' * This file is generated from include/linux/capability.h automatically'
+echo ' */'
+echo
+echo '#if !defined(SYSFS_CAP_NAME_ENTRY) || !defined(SYSFS_CAP_CODE_ENTRY)'
+echo '#error cap_names.h should be included from security/capability.c'
+echo '#else'
+echo "#if $NUMCAP != CAP_LAST_CAP + 1"
+echo '#error mkcapnames.sh cannot collect capabilities correctly'
+echo '#else'
+cat "$CAPHEAD" | egrep "$REGEXP" \
+| awk '{ printf("SYSFS_CAP_NAME_ENTRY(%s,%s);\n", tolower($2), $2); }'
+echo
+echo 'static struct attribute *capability_name_attrs[] = {'
+cat "$CAPHEAD" | egrep "$REGEXP" \
+| awk '{ printf("\t&%s_name_attr.attr,\n", tolower($2)); } END { print "\tNULL," }'
+echo '};'
+
+echo
+cat "$CAPHEAD" | egrep "$REGEXP" \
+| awk '{ printf("SYSFS_CAP_CODE_ENTRY(%s,%s);\n", tolower($2), $2); }'
+echo
+echo 'static struct attribute *capability_code_attrs[] = {'
+cat "$CAPHEAD" | 

[PATCH 3/3] exporting capability name/code pairs (final)

2008-02-21 Thread Kohei KaiGai
[3/3] A new example to use kobject/kobj_attribute

The attached patch provides a new example of using kobject and attribute.
The _show() and _store() methods can read/store the private data field of the
kobj_attribute structure to know which entry is being accessed by the user.
This makes it easier to share a single _show()/_store() method among several
entries.

KaiGai Kohei <[EMAIL PROTECTED]>
--
 samples/kobject/kobject-example.c |   32 
 1 files changed, 32 insertions(+), 0 deletions(-)

diff --git a/samples/kobject/kobject-example.c b/samples/kobject/kobject-example.c
index 08d0d3f..f99d734 100644
--- a/samples/kobject/kobject-example.c
+++ b/samples/kobject/kobject-example.c
@@ -77,6 +77,35 @@ static struct kobj_attribute baz_attribute =
 static struct kobj_attribute bar_attribute =
__ATTR(bar, 0666, b_show, b_store);

+/*
+ * You can store a private data within 'data' field of kobj_attribute.
+ * It enables to share a single _show() or _store() method with several
+ * entries.
+ */
+static ssize_t integer_show(struct kobject *kobj,
+   struct kobj_attribute *attr,
+   char *buf)
+{
+   return scnprintf(buf, PAGE_SIZE, "%d\n", (int) attr->data);
+}
+
+static ssize_t integer_store(struct kobject *kobj,
+struct kobj_attribute *attr,
+const char *buf, size_t count)
+{
+   int code;
+
+   sscanf(buf, "%du", &code);
+   attr->data = (void *) code;
+   return count;
+}
+
+static struct kobj_attribute hoge_attribute =
+   __ATTR_DATA(hoge, 0666, integer_show, integer_store, 123);
+static struct kobj_attribute piyo_attribute =
+   __ATTR_DATA(piyo, 0666, integer_show, integer_store, 456);
+static struct kobj_attribute fuga_attribute =
+   __ATTR_DATA(fuga, 0444, integer_show, NULL, 789);

 /*
  * Create a group of attributes so that we can create and destory them all
@@ -86,6 +115,9 @@ static struct attribute *attrs[] = {
        &foo_attribute.attr,
        &baz_attribute.attr,
        &bar_attribute.attr,
+       &hoge_attribute.attr,
+       &piyo_attribute.attr,
+       &fuga_attribute.attr,
NULL,   /* need to NULL terminate the list of attributes */
 };


-- 
OSS Platform Development Division, NEC
KaiGai Kohei <[EMAIL PROTECTED]>
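
The idea behind the 'data' field can also be tried outside the kernel. The following user-space sketch uses a hypothetical struct ex_attribute in place of kobj_attribute, and one integer_show() shared by two entries. Unlike the sample in the patch, it casts through intptr_t, which avoids the pointer-to-int size warning that a direct (int) cast can trigger on 64-bit builds.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical user-space analogue of kobj_attribute with a private field. */
struct ex_attribute {
    const char *name;
    int (*show)(const struct ex_attribute *attr, char *buf, size_t len);
    void *data;                 /* private per-entry value */
};

/* One show method shared by every entry; the entry's value lives in ->data. */
static int integer_show(const struct ex_attribute *attr, char *buf, size_t len)
{
    return snprintf(buf, len, "%d\n", (int)(intptr_t)attr->data);
}

static const struct ex_attribute hoge = {
    "hoge", integer_show, (void *)(intptr_t)123
};
static const struct ex_attribute piyo = {
    "piyo", integer_show, (void *)(intptr_t)456
};
```

Calling hoge.show(&hoge, buf, sizeof(buf)) and piyo.show(&piyo, buf, sizeof(buf)) runs the same function but produces each entry's own value, which is the sharing the patch description is after.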


Re: [Suspend-devel] 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.

2008-02-21 Thread Jesse Barnes
On Thursday, February 21, 2008 5:28 pm Linus Torvalds wrote:
> On Thu, 21 Feb 2008, Jesse Barnes wrote:
> > So the advantage of the kernel suspend/resume hooks for the DRM layer is
> > that the kernel video drivers can do full state save/restore (which X
> > usually doesn't do, and isn't really designed to do), so that if your
> > platform *doesn't* do it all, you'll still end up with a usable machine
> > in the end.
>
> Well, I'm also hoping that eventually we could even just not do the VT
> switch at all, and the kernel can treat X as "just another user process"
> that it freezes.

Hell yes.

> At least from a mode setting standpoint.
>
> We'd still want to make sure that X repaints the screen if the contents
> were lost, of course. And this is going to depend very intimately on the
> type of graphics card and whether the video RAM is saved by STR or not -
> for the Intel integrated graphics kind of situation, the video RAM will be
> refreshed along with all the other memory, but for other cards we may end
> up having to do the VT switch not so much for modesetting reasons as just
> a way to get X to save and restore all the *other* state.

Drivers supporting kernel modesetting will have to stuff their VRAM somewhere, 
yeah.  Hopefully X won't have much to do with it though...

> How close is the i915 driver from not having to even signal X? Or is that
> just a pipedream of mine?

It's there in the modesetting tree (though the requisite changes to avoid VT 
notification aren't done, it should all work fine).

Jesse


Re: Module loading/unloading and "The Stop Machine"

2008-02-21 Thread Max Krasnyanskiy

Tejun Heo wrote:
> Max Krasnyanskiy wrote:
>> Thanks for the info. I guess I missed that from the code. In any case
>> that seems like a pretty heavy refcounting mechanism. In a sense that
>> every time something is loaded or unloaded entire machine freezes,
>> potentially for several milliseconds. Normally it's not a big deal. But
>> once you get more and more CPUs and/or start using realtime apps this
>> becomes a big deal.
>
> Module loading doesn't involve stop_machine last time I checked.  It's a
> big deal when unloading a module but it's actually a very good trade off
> because it makes much hotter path (module_get/put) much cheaper.  If
> your application can't stand stop_machine, simply don't unload a module.


static struct module *load_module(void __user *umod,
                                  unsigned long len,
                                  const char __user *uargs)
{
        ...

        /* Now sew it into the lists so we can get lockdep and oops
         * info during argument parsing.  Noone should access us, since
         * strong_try_module_get() will fail. */
        stop_machine_run(__link_module, mod, NR_CPUS);
        ...
}

I actually rarely unload modules. The way I noticed the problem in the first
place was that things started hanging when the tun driver was autoloaded or
when fs automounts triggered some auto loading.

These days it's kind of hard to have a semi-general purpose machine without
module loading :).

Max


Re: Merging of completely unreviewed drivers

2008-02-21 Thread David Newall
Krzysztof Halasa wrote:
> Linus Torvalds <[EMAIL PROTECTED]> writes:
>> I'm personally of the opinion that a lot of checkpatch "fixes" are 
>> anything but. That mainly concerns fixing overlong lines
>> 
>
> Perhaps we should increase line length limit, 132 should be fine.
> Especially useful with long printk() lines and long arithmetic
> expressions.
>   


Yes; or even longer.  80 characters might have made sense on a screen
when the alternative was 80 characters on a punched card, but on a
modern computer it's very restrictive.  That's especially true with the
deep indents that you quickly get in C.  Even short lines often need to
be split when you put a few tabs in front of them, and that makes
comprehension that bit harder, not to mention looks ugly.


Re: [Suspend-devel] 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.

2008-02-21 Thread Jesse Barnes
On Thursday, February 21, 2008 5:13 pm Jesse Barnes wrote:
> On Thursday, February 21, 2008 4:54 pm Rafael J. Wysocki wrote:
> > On Friday, 22 of February 2008, Linus Torvalds wrote:
> > > On Fri, 22 Feb 2008, Rafael J. Wysocki wrote:
> > > > -   if (state.event == PM_EVENT_SUSPEND) {
> > > > +   if (state.event == PM_EVENT_SUSPEND && !in_hibernation_power_off()) {
> > >
> > > I don't understand why hibernation just doesn't use a
> > > PM_EVENT_HIBERNATE, and be done with it?
> > >
> > > Why should it be called PM_EVENT_SUSPEND when it isn't?
> > >
> > > Adding some external global variables is absolutely the wrong way to
> > > fix this.
> > >
> > > It's not even like there are very many drivers who actually care about
> > > "state.event" anyway: a 'git grep' returns just 35 users in the whole
> > > > tree, so if this was done this ugly way just to avoid double-checking
> > > the other cases that compare against PM_EVENT_SUSPEND, then it really
> > > wasn't worth it.
> >
> > Please relax, we're debugging the thing right now and the patch doesn't
> > even seem to help on the other affected box.
>
> Actually, looks like I forgot to reboot between tests (just rmmod'd &
> modprobed i915), your patch actually does work.
>
> However, making new PM event messages might be a good thing anyway,
> assuming Linus takes it for 2.6.25, since it should make the migration to
> ->hibernate callbacks easier.

Rafael, I'd actually prefer these changes to the i915 driver.  One is to avoid 
the "green screen" problem and the other is to actually save state at 
hibernate time in case we don't do a POST coming out of S4 (probably not 
common but hey).

Jesse

Make sure hibernation works by not shutting down the video device during 
hibernation power off.  This is important because later stages of the 
hibernation cycle end up touching the video device, which may cause a hang if 
it was disabled early on.  Also make sure the restoration correctly restores 
the AR registers by flipping the ARX register into index mode before doing 
anything.

Depends on Rafael's patch which exports hibernation state to drivers.

Signed-off-by:  Jesse Barnes <[EMAIL PROTECTED]>

diff --git a/drivers/char/drm/i915_drv.c b/drivers/char/drm/i915_drv.c
index 35758a6..5e73869 100644
--- a/drivers/char/drm/i915_drv.c
+++ b/drivers/char/drm/i915_drv.c
@@ -27,6 +27,7 @@
  *
  */
 
+#include 
 #include "drmP.h"
 #include "drm.h"
 #include "i915_drm.h"
@@ -222,6 +223,7 @@ static void i915_restore_vga(struct drm_device *dev)
   dev_priv->saveGR[0x18]);
 
/* Attribute controller registers */
+   inb(st01);
for (i = 0; i < 20; i++)
i915_write_ar(st01, i, dev_priv->saveAR[i], 0);
inb(st01); /* switch back to index mode */
@@ -364,8 +366,8 @@ static int i915_suspend(struct drm_device *dev)
i915_save_vga(dev);
 
/* Shut down the device */
-   pci_disable_device(dev->pdev);
-   pci_set_power_state(dev->pdev, PCI_D3hot);
+   if (!in_hibernation_power_off())
+   pci_set_power_state(dev->pdev, PCI_D3hot);
 
return 0;
 }



Re: tty && pid problems

2008-02-21 Thread Eric W. Biederman
Alan Cox <[EMAIL PROTECTED]> writes:

>> > *ping* - Any further activity on this one?  I got bit by it as well on
>> > the very first attempted boot of 25-rc2-mm1, the instant it tried to leave
>> > single-user and go multi-user.
>> 
>> Valdis, any chance you can try the
>>  "[PATCH] (for -mm only) put_pid: make sure we don't free the live pid"
>> I sent? just to make sure we don't have other problems here.
>
> There is some other iffy locking of the pid objects ever since they were
> changed from pid_t to ref counted structs. Whoever did that didn't add
> any locking for it, and the old code knew it was "safe" not to.
>
> I've added locks in my test tree and now I've finally got -mm to build
> will do some testing then push more stuff upstream

Thanks.  At the tty layer that was probably me.
Most of the instances already appear to be nested in some other kind of
locking, but that doesn't make no additional locking correct or ensure
that it will give a uniform result.

Eric



Re: [BUILD_FAILURE] 2.6.25-rc2-mm1 - Build Failure at acpi_os

2008-02-21 Thread Nish Aravamudan
On 2/21/08, Sam Ravnborg <[EMAIL PROTECTED]> wrote:
> On Thu, Feb 21, 2008 at 10:54:40AM -0800, Nish Aravamudan wrote:
>  > On 2/20/08, Len Brown <[EMAIL PROTECTED]> wrote:
>  > > On Saturday 16 February 2008 14:47, Kamalesh Babulal wrote:
>  > >  > Hi Andrew,
>  > >  >
>  > >  > The 2.6.25-rc2-mm1 kernel with randconfig build option, fails
>  > >  > to build on x86_64 machine
>  > >  >
>  > >  >   CC  drivers/acpi/osl.o
>  > >  > drivers/acpi/osl.c:60:38: error: empty filename in #include
>  > >  > drivers/acpi/osl.c: In function 'acpi_os_table_override':
>  > >  > drivers/acpi/osl.c:399: error: 'AmlCode' undeclared (first use in this function)
>  > >  > drivers/acpi/osl.c:399: error: (Each undeclared identifier is reported only once
>  > >  > drivers/acpi/osl.c:399: error: for each function it appears in.)
>  > >  > make[2]: *** [drivers/acpi/osl.o] Error 1
>  > >  > make[1]: *** [drivers/acpi] Error 2
>  > >  > make: *** [drivers] Error 2
>  > >  >
>  > >  > #
>  > >  > # Automatically generated make config: don't edit
>  > >  > # Linux kernel version: 2.6.25-rc2-mm1
>  > >  > # Sun Feb 17 08:07:17 2008
>  > >  > #
>  > >
>  > >
>  > > > CONFIG_ACPI_CUSTOM_DSDT=y
>  > >  > CONFIG_ACPI_CUSTOM_DSDT_FILE=""
>  > >
>  > >
>  > > garbage in, garbage out.
>  >
>  > garbage explicitly *allowed* by Kconfig in this case, though.
>  >
>  > >  If you don't give this build option a file name where AmlCode lives,
>  > >  then the build will be unable to find AmlCode[].
>  > >
>  > >  http://www.lesswatts.org/projects/acpi/overridingDSDT.php
>  >
>  > So we have a .config option whose sole purpose is to use another
>  > .config option? That seems ... less than ideal. Is there not some
>  > Kconfig voodoo we can do to only require the one option? Maybe
>  > something like how CONFIG_INITRAMFS_SOURCE is done? Adding Sam to the
>  > Cc, in case he has any ideas.
>
>
> Make sure STANDALONE is y for your randconfig builds.
>  See README for examples.

Hrm, if this is needed for randconfig to work, perhaps randconfig
itself should somehow be specifying it?

>  STANDALONE is there exactly to prevent the above but we cannot
>  control randconfig.

While setting STANDALONE does fix the above, it doesn't answer the
more basic question I had -- do we really need both .config options in
this case? If it's simply a case of "That's how it is, won't be fixed,
there are higher priorities", that's good enough by me. Just seems a
shame that we have an option to enable another option, which is
required for the first option to be sensible -- seems like we should
only need the second option...

Thanks,
Nish


Re: Module loading/unloading and "The Stop Machine"

2008-02-21 Thread Tejun Heo
Max Krasnyanskiy wrote:
> Thanks for the info. I guess I missed that from the code. In any case
> that seems like a pretty heavy refcounting mechanism. In a sense that
> every time something is loaded or unloaded entire machine freezes,
> potentially for several milliseconds. Normally it's not a big deal. But
> once you get more and more CPUs and/or start using realtime apps this
> becomes a big deal.

Module loading doesn't involve stop_machine last time I checked.  It's a
big deal when unloading a module but it's actually a very good trade off
because it makes much hotter path (module_get/put) much cheaper.  If
your application can't stand stop_machine, simply don't unload a module.

> And it's plain broken for the use case that I mentioned
> during CPU isolation discussions. ie When user-space thread(s) prevent
> stopmachine kthread from running, in which
> case machine simply hangs until those user-space threads exit.

This I don't know nothing about. :-)

-- 
tejun


Re: [PATCH] libata: Register for dock events when the drive is inside a dock station

2008-02-21 Thread Tejun Heo
> If a device/bay is inside a docking station, we need to register for dock
> events additionally to bay events. If a dock event occurs, the dock driver
> will call the appropriate handler (ata_acpi_ap_notify() or
> ata_acpi_dev_notify()) for us.
> 
> Signed-off-by: Holger Macht <[EMAIL PROTECTED]>
> ---
> 
> diff --git a/drivers/ata/libata-acpi.c b/drivers/ata/libata-acpi.c
> index 9e8ec19..563ad72 100644
> --- a/drivers/ata/libata-acpi.c
> +++ b/drivers/ata/libata-acpi.c
> @@ -191,20 +191,33 @@ void ata_acpi_associate(struct ata_host *host)
>   else
>   ata_acpi_associate_ide_port(ap);
>  
> - if (ap->acpi_handle)
> + if (ap->acpi_handle) {
>   acpi_install_notify_handler (ap->acpi_handle,
>ACPI_SYSTEM_NOTIFY,
>ata_acpi_ap_notify,
>ap);
> +#ifdef CONFIG_ACPI_DOCK_MODULE

Heh, you need

  #if defined(CONFIG_ACPI_DOCK) || defined(CONFIG_ACPI_DOCK_MODULE)

Also, another question.  Is there a way to tell whether the device or
port is connected behind a dock or not?  Just notifying hotplug signal
is fine for hotplugging but to make hot unplug safe for PATA, libata
should be able to tell whether the device is actually gonna go away and
kill it explicitly.
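
The reason for the two-symbol #if suggested above is that a tristate Kconfig option defines CONFIG_FOO when built in (=y) and CONFIG_FOO_MODULE when built as a module (=m). A minimal stand-alone sketch, simulating the built-in case by hand (the EX_* macros and the two helper functions are made up for illustration):

```c
/* Simulate CONFIG_ACPI_DOCK=y; a real build gets this from autoconf.h. */
#define CONFIG_ACPI_DOCK 1

/* Check from the original patch: only true when the option is =m. */
#ifdef CONFIG_ACPI_DOCK_MODULE
#define EX_DOCK_MODULE_ONLY 1
#else
#define EX_DOCK_MODULE_ONLY 0
#endif

/* Suggested check: true for both =y and =m. */
#if defined(CONFIG_ACPI_DOCK) || defined(CONFIG_ACPI_DOCK_MODULE)
#define EX_DOCK_ENABLED 1
#else
#define EX_DOCK_ENABLED 0
#endif

static int dock_check_module_only(void)
{
    return EX_DOCK_MODULE_ONLY;
}

static int dock_check_builtin_or_module(void)
{
    return EX_DOCK_ENABLED;
}
```

With the option built in, the module-only test evaluates to 0 and the dock notifier registration would be compiled out, while the combined test still evaluates to 1.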

Thanks.

-- 
tejun

