Re: [PATCH 20/20] x86_64: Move CPU verification code to common file

2006-11-17 Thread H. Peter Anvin

Oleg Verych wrote:
> On Fri, Nov 17, 2006 at 10:59:32PM -0800, H. Peter Anvin wrote:
> > Oleg Verych wrote:
> > >
> > > It will burn CPU, until power cycle will be done (my AMD64 laptop and
> > > Intel's amd64 desktop PC require that). In case of reboot timeout (or
> > > just reboot with jump to BIOS), i will just choose another image to boot
> > > or will press F8 to have another boot device.
> >
> > That's a fairly stupid argument, since it assumes operator intervention,
> > at which point you have access to the machine anyway.
>
> I would never call *power cycle* stupid, just because from physics
> point of view.
>
> Example. I have my flower.upol.cz many kilometers far away from me.
> I used to boot it from that flash (new hardware, sata problems, etc).
>
> When something goes wrong with rc kernel or power source, bum.
> And i had to move my ass there, just to press reset. Because.

Yes, and you would have to do that to press F8 too.

> While i have "power on, on AC failures" in BIOS, *sometimes* flash
> will not boot (i don't know why, maybe it's GRUB+flash-read,
> or BIOS usb hdd implementation specific).


I was making the point that there is unattended recovery possible.  That 
makes it a significant argument.  That a user on a laptop has to wait 
four seconds pushing the power button is not.


-hpa
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [rfc patch] Re: sched: incorrect argument used in task_hot()

2006-11-17 Thread Ingo Molnar

* Chen, Kenneth W <[EMAIL PROTECTED]> wrote:

> - if (sd->nr_balance_failed > sd->cache_nice_tries)
> + if (sd->nr_balance_failed > sd->cache_nice_tries) {
> + #ifdef CONFIG_SCHEDSTATS
> + if (task_hot(p, rq->most_recent_timestamp, sd))
> + schedstat_inc(sd, lb_hot_gained[idle]);
> + #endif
>   return 1;
> + }

minor nit: preprocessor directives should be aligned to the first 
column.

Ingo
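
To illustrate the nit above, here is a minimal, self-contained sketch of the same hunk with the preprocessor directives moved to the first column while the guarded code keeps its normal indentation. The struct, its fields, and the function name are simplified stand-ins invented for this example, not the scheduler's real definitions.

```c
/* Stand-in for the real config switch; assumed enabled for this sketch. */
#define CONFIG_SCHEDSTATS 1

/* Hypothetical, trimmed-down stand-in for struct sched_domain. */
struct sd_sketch {
	unsigned int nr_balance_failed;
	unsigned int cache_nice_tries;
	unsigned int lb_hot_gained;	/* statistics counter */
};

/* Returns 1 when migration is allowed after repeated balance failures. */
int can_migrate_sketch(struct sd_sketch *sd, int task_is_hot)
{
	if (sd->nr_balance_failed > sd->cache_nice_tries) {
#ifdef CONFIG_SCHEDSTATS
		/* Directives sit in column one; the code inside stays indented. */
		if (task_is_hot)
			sd->lb_hot_gained++;
#endif
		return 1;
	}
	return 0;
}
```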


Re: [PATCH 20/20] x86_64: Move CPU verification code to common file

2006-11-17 Thread Oleg Verych
On Fri, Nov 17, 2006 at 10:59:32PM -0800, H. Peter Anvin wrote:
> Oleg Verych wrote:
> >
> >It will burn CPU, until power cycle will be done (my AMD64 laptop and
> >Intel's amd64 desktop PC require that). In case of reboot timeout (or
> >just reboot with jump to BIOS), i will just choose another image to boot
> >or will press F8 to have another boot device.
> >
> 
> That's a fairly stupid argument, since it assumes operator intervention, 
> at which point you have access to the machine anyway.

I would never call *power cycle* stupid, just because from physics
point of view.

Example. I have my flower.upol.cz many kilometers far away from me.
I used to boot it from that flash (new hardware, sata problems, etc).

When something goes wrong with rc kernel or power source, bum.
And i had to move my ass there, just to press reset. Because.

While i have "power on, on AC failures" in BIOS, *sometimes* flash
will not boot (i don't know why, maybe it's GRUB+flash-read,
or BIOS usb hdd implementation specific).

DTR laptop ~33% doesn't boot that flash. And laptop has no reset button.
Operator is present, so your concern is right here.

> A stronger argument is, again, that some bootloaders can do unattended 
> fallback.




Re: [PATCH] Regard MSRs in lapic_suspend()/lapic_resume()

2006-11-17 Thread Ingo Molnar

* Karsten Wiese <[EMAIL PROTECTED]> wrote:

> Read/Write APIC_LVTPC and APIC_LVTTHMR only if get_maxlvt() returns 
> certain values. This is done like everywhere else in 
> i386/kernel/apic.c, so I guess it's correct. Suspends/Resumes to disk 
> fine and eliminates an smp_error_interrupt() here on a K8.
> 
> Signed-off-by: Karsten Wiese <[EMAIL PROTECTED]>

nice one! I'm tempted to suggest this for 2.6.19 merge because it causes 
the kernel to do less (so it has little risk of breaking something that 
is working) ... who knows what happens on certain (older?) APICs when we 
try to write back those bogus values.

  Acked-by: Ingo Molnar <[EMAIL PROTECTED]>

Ingo
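
The idea discussed above, touching an LVT register only when get_maxlvt() says the APIC has it, can be sketched in user space as follows. This is a model under stated assumptions: the struct layouts and function names are hypothetical, and the thresholds 4 and 5 are assumed for illustration, since the message only says "certain values".

```c
#include <stdint.h>

/* Assumed thresholds; the message above only says "certain values". */
#define MAXLVT_HAS_LVTPC	4
#define MAXLVT_HAS_LVTTHMR	5

/* Hypothetical model of a local APIC with two optional LVT registers. */
struct apic_sketch {
	int maxlvt;		/* what get_maxlvt() would report */
	uint32_t lvtpc;
	uint32_t lvtthmr;
	int writes;		/* counts register writes done on resume */
};

struct apic_saved_sketch {
	uint32_t lvtpc;
	uint32_t lvtthmr;
};

/* Save only the registers this APIC actually implements. */
void suspend_sketch(const struct apic_sketch *apic,
		    struct apic_saved_sketch *saved)
{
	if (apic->maxlvt >= MAXLVT_HAS_LVTPC)
		saved->lvtpc = apic->lvtpc;
	if (apic->maxlvt >= MAXLVT_HAS_LVTTHMR)
		saved->lvtthmr = apic->lvtthmr;
}

/* Restore under the same guards, so stale values are never written back. */
void resume_sketch(struct apic_sketch *apic,
		   const struct apic_saved_sketch *saved)
{
	if (apic->maxlvt >= MAXLVT_HAS_LVTPC) {
		apic->lvtpc = saved->lvtpc;
		apic->writes++;
	}
	if (apic->maxlvt >= MAXLVT_HAS_LVTTHMR) {
		apic->lvtthmr = saved->lvtthmr;
		apic->writes++;
	}
}
```

With the guards in place, the resume path does strictly less work on APICs that lack these entries, which matches the "less risk" reasoning in the reply.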


Re: [RFC: -mm patch] remove kernel/timer.c:wall_jiffies

2006-11-17 Thread Ingo Molnar

* Adrian Bunk <[EMAIL PROTECTED]> wrote:

> "wall_jiffies" was added, but it's completely unused...
> 
> Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

yeah, that's a merge leftover in:

  gtod-persistent-clock-support-core.patch

Acked-by: Ingo Molnar <[EMAIL PROTECTED]>

Ingo


Re: [RFC: -mm patch] make kernel/timer.c:__next_timer_interrupt() static

2006-11-17 Thread Ingo Molnar

* Adrian Bunk <[EMAIL PROTECTED]> wrote:

> This patch makes the needlessly global __next_timer_interrupt() static.
> 
> Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

ok.

Acked-by: Ingo Molnar <[EMAIL PROTECTED]>

Ingo


Re: [PATCH 20/20] x86_64: Move CPU verification code to common file

2006-11-17 Thread H. Peter Anvin

Oleg Verych wrote:
>
> It will burn CPU, until power cycle will be done (my AMD64 laptop and
> Intel's amd64 desktop PC require that). In case of reboot timeout (or
> just reboot with jump to BIOS), i will just choose another image to boot
> or will press F8 to have another boot device.
>

That's a fairly stupid argument, since it assumes operator intervention, 
at which point you have access to the machine anyway.


A stronger argument is, again, that some bootloaders can do unattended 
fallback.


However, this test should probably be pushed earlier, into setup.S, 
where executing a BIOS-clean reboot is much easier.


-hpa


Re: [RFC: 2.6 patch] remove kernel/lockdep.c:lockdep_internal

2006-11-17 Thread Ingo Molnar

* Adrian Bunk <[EMAIL PROTECTED]> wrote:

> This patch removes the no longer used lockdep_internal().
> 
> Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

agreed.

Acked-by: Ingo Molnar <[EMAIL PROTECTED]>

Ingo


Re: [PATCH 1/2] input: make serio_register_driver() return error code

2006-11-17 Thread Akinobu Mita
On Fri, Nov 17, 2006 at 01:37:34AM -0500, Dmitry Torokhov wrote:
> I think I found a way to handle all errors when registering serio driver.
> What do you think about the patch below?
> 

Looks good to me.
I also tested this patch with my patch 2/4 (which actually checks
the return code of serio_register_driver() for each input driver).

Acked-by: Akinobu Mita <[EMAIL PROTECTED]>


Re: [RFC 6/7] Use an external declaration in exit.c for fs_cachep

2006-11-17 Thread Stephen Rothwell
On Sat, 18 Nov 2006 06:44:33 + Oleg Verych <[EMAIL PROTECTED]> wrote:
>
>
> On 2006-11-18, Stephen Rothwell wrote:
> []
> >> --- linux-2.6.19-rc5-mm2.orig/kernel/exit.c 2006-11-15 16:48:11.485511089 -0600
> >> +++ linux-2.6.19-rc5-mm2/kernel/exit.c 2006-11-17 23:04:09.764530373 -0600
> >> @@ -48,6 +48,8 @@
> >>  #include 
> >>  #include 
> >>
> >> +extern kmem_cache_t *fs_cachep;
> >
> > You know what I am going to say, right? :-)
>
> I know, externs must be in headers. Please, explain why.

So that there is only one declaration.  That way if it is changed,
everywhere that uses it will notice.  Also, the same header must be
included by the file that defines the variable or function so any
discrepancy between declaration and definition will be obvious.

i.e. we are protecting ourselves against change and making maintenance
easier.

In this particular case, the type is probably never going to change, but
consistency is good.

--
Cheers,
Stephen Rothwell                    [EMAIL PROTECTED]
http://www.canb.auug.org.au/~sfr/




SCSI init discussion/SAN problem

2006-11-17 Thread Evan Rempel


I have a problem with the order that the SCSI subsystem attaches disk 
devices that shows up in a multipath environment.


My understanding is that during the finishing phase of the SCSI 
subsystem the partition table is read from the drive and the bare drive 
and each partition are registered with the kernel. Please correct me if 
I am wrong because I am not a kernel developer even at the tinkering level.


The problem shows up in a multipath environment where the same physical 
device has its partition table read and then registered with the kernel 
*for each path on which it is available*. I understand the requirement for 
the second (possibly more) devices registered with the kernel, and I 
want this behaviour to continue (how else would multipath work?).
The problem is that reading the partition table on each of the paths 
causes I/O to be generated to the physical disk on each of the paths.
For some disk controllers (any with active/passive controllers) this 
will initiate a failover event from the active to the passive controller. 
This failover can take a few seconds, but multipathing may result in 
hundreds of such paths and failover events, which make the boot time very 
long. I have a machine that takes close to 1 hour to boot due to this behavior.


What I would like to have considered is the ability to get the serial 
number/WWName of the device prior to reading the partition table. If the 
serial number/WWName has already been registered under a different SCSI 
ID, then just use the partition table that was used to load the first 
instance. This will result in I/O only on the first path to each disk.


Another thing that might make things even better is to do something like 
the mp_prio utils of multipathing do and determine which paths are 
active paths, and only read the partition table from the active paths.
This may require a two-pass device registration mechanism because it may 
be possible that none of the paths are active paths, meaning that the 
device did not get registered by the end of the device list. We would 
have to go back to the beginning of the list and, for any device that was 
not yet registered with the kernel, read the serial number/WWName and 
partition table, register with the kernel and then determine if any of 
the other paths are for the same device to load them into the kernel.
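
The first proposal above (look up the serial number/WWName before reading the partition table, and reuse the table from the first path) might be sketched in user-space C as below. Every name here is hypothetical; the disk read is simulated by a counter, and the sketch does not attempt the active/passive path detection or the two-pass registration.

```c
#include <stdio.h>
#include <string.h>

#define MAX_DEVICES	8
#define WWN_LEN		32

/* Hypothetical stand-in for a parsed partition table. */
struct part_table_sketch {
	int nr_partitions;
};

struct known_device {
	char wwn[WWN_LEN];
	struct part_table_sketch table;
};

static struct known_device known[MAX_DEVICES];
static int nr_known;
int table_reads;	/* counts simulated I/O to the physical disk */

/* Stand-in for the real partition scan; each call models disk I/O. */
static struct part_table_sketch read_table_from_disk(void)
{
	struct part_table_sketch t = { 2 };

	table_reads++;
	return t;
}

/*
 * Attach one path. If the WWN is already known, reuse the table read on
 * the first path and issue no I/O; otherwise read it once and remember it.
 * Returns NULL when the fixed-size registry is full.
 */
const struct part_table_sketch *attach_path_sketch(const char *wwn)
{
	int i;

	for (i = 0; i < nr_known; i++)
		if (strcmp(known[i].wwn, wwn) == 0)
			return &known[i].table;

	if (nr_known == MAX_DEVICES)
		return NULL;

	snprintf(known[nr_known].wwn, WWN_LEN, "%s", wwn);
	known[nr_known].table = read_table_from_disk();
	return &known[nr_known++].table;
}
```

In this model, attaching any number of paths to the same WWN costs one table read, which is the behaviour the proposal asks for.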


I hope this is clear enough to start a dialog on how to make SCSI 
initialization faster for large systems on multipath hardware.


Evan Rempel
Senior Programmer Analyst
University of Victoria


Re: [PATCH 18/20] x86_64: Relocatable kernel support

2006-11-17 Thread Andi Kleen
On Sat, Nov 18, 2006 at 05:56:47AM +, Oleg Verych wrote:
> 
> On 2006-11-17, Vivek Goyal wrote:
> []
> >  static void error(char *x)
> > @@ -281,57 +335,8 @@ static void error(char *x)
> > while(1);   /* Halt */
> >  }
> 
> Is it possible to make this optional (using "panic" reboot timeout)?

There is no command line parsing at this point. I guess it would
be possible to implement, but some work. Do you want to submit a patch ?

-Andi


Re: [PATCH 20/20] x86_64: Move CPU verification code to common file

2006-11-17 Thread H. Peter Anvin

Andi Kleen wrote:
> > May hang be done optional? There was a discussion about applying
> > "panic" reboot timeout here. Is it possible to implement somehow?
>
> It would be tricky, but might be possible.  But that would be a completely
> new feature -- the kernel has always hung in this case. If you think you need 
> it submit a (followup) patch. But I don't think it's fair to ask Vivek to do it.
>
> Besides i don't think it would be any useful. panic reboot only
> makes sense if you can recover after reboot. But if your CPU somehow
> suddenly loses its ability to run 64bit code, no reboot of the world will 
> recover.




Not true.  Some bootloaders support a fallback kernel.  This case is 
particularly important if one accidentally installs the wrong kernel for 
the machine.


-hpa


Re: [RFC 6/7] Use an external declaration in exit.c for fs_cachep

2006-11-17 Thread Oleg Verych
On 2006-11-18, Stephen Rothwell wrote:
[]
>> --- linux-2.6.19-rc5-mm2.orig/kernel/exit.c 2006-11-15 16:48:11.485511089 -0600
>> +++ linux-2.6.19-rc5-mm2/kernel/exit.c 2006-11-17 23:04:09.764530373 -0600
>> @@ -48,6 +48,8 @@
>>  #include 
>>  #include 
>>
>> +extern kmem_cache_t *fs_cachep;
>
> You know what I am going to say, right? :-)

I know, externs must be in headers. Please, explain why.

TIA.




Re: [PATCH 20/20] x86_64: Move CPU verification code to common file

2006-11-17 Thread Andi Kleen
> May hang be done optional? There was a discussion about applying
> "panic" reboot timeout here. Is it possible to implement somehow?

It would be tricky, but might be possible.  But that would be a completely
new feature -- the kernel has always hung in this case. If you think you need 
it submit a (followup) patch. But I don't think it's fair to ask Vivek to do it.

Besides i don't think it would be any useful. panic reboot only
makes sense if you can recover after reboot. But if your CPU somehow
suddenly loses its ability to run 64bit code, no reboot of the world will 
recover.

-Andi


Re: [RFC 6/7] Use an external declaration in exit.c for fs_cachep

2006-11-17 Thread Stephen Rothwell
On Fri, 17 Nov 2006 21:44:13 -0800 (PST) Christoph Lameter <[EMAIL PROTECTED]> 
wrote:
>
> Use an external declaration in exit.c for fs_cachep.
>
> fs_cachep is only used in kernel/exit.c and in kernel/fork.c.
> It is defined in kernel/fork.c so we need to add an external
> declaration to kernel/exit.c to be able to drop the
> declaration from slab.h.
>
> Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>
>
> --- linux-2.6.19-rc5-mm2.orig/kernel/exit.c 2006-11-15 16:48:11.485511089 -0600
> +++ linux-2.6.19-rc5-mm2/kernel/exit.c 2006-11-17 23:04:09.764530373 -0600
> @@ -48,6 +48,8 @@
>  #include 
>  #include 
>
> +extern kmem_cache_t *fs_cachep;

You know what I am going to say, right? :-)

--
Cheers,
Stephen Rothwell                    [EMAIL PROTECTED]
http://www.canb.auug.org.au/~sfr/




Re: [RFC 5/7] Use external declaration for filep_cachep

2006-11-17 Thread Stephen Rothwell
On Fri, 17 Nov 2006 21:44:08 -0800 (PST) Christoph Lameter <[EMAIL PROTECTED]> 
wrote:
>
> Use external declaration for filep_cachep.
>
> filp_cachep is used in fs/file_table.c. It's defined in fs/dcache.c.
> The easiest solution here is to add an external declaration to
> fs/file_table.c.
>
> Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>
>
> --- linux-2.6.19-rc5-mm2.orig/fs/file_table.c 2006-11-15 16:47:59.622264626 -0600
> +++ linux-2.6.19-rc5-mm2/fs/file_table.c 2006-11-17 23:04:05.885291107 -0600
> @@ -35,6 +35,8 @@ __cacheline_aligned_in_smp DEFINE_SPINLO
>
>  static struct percpu_counter nr_files __cacheline_aligned_in_smp;
>
> +extern kmem_cache_t *filp_cachep;

Is there no suitable header file to put this in?

--
Cheers,
Stephen Rothwell                    [EMAIL PROTECTED]
http://www.canb.auug.org.au/~sfr/




Re: [RFC 1/7] Remove declaration of sighand_cachep from slab.h

2006-11-17 Thread Stephen Rothwell
On Fri, 17 Nov 2006 21:43:47 -0800 (PST) Christoph Lameter <[EMAIL PROTECTED]> 
wrote:
>
> Remove declaration of sighand_cachep from slab.h
>
> The sighand cache is only used in fs/exec.c and kernel/fork.c. It is defined
> in kernel/fork.c but also used in fs/exec.c. So add an extern declaration to
> fs/exec.c and remove the declaration from slab.h.
>
> Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>
>
> Index: linux-2.6.19-rc5-mm2/fs/exec.c
> ===
> --- linux-2.6.19-rc5-mm2.orig/fs/exec.c 2006-11-15 16:47:59.065579813 -0600
> +++ linux-2.6.19-rc5-mm2/fs/exec.c 2006-11-17 23:03:46.049603927 -0600
> @@ -62,6 +62,8 @@ int core_uses_pid;
>  char core_pattern[128] = "core";
>  int suid_dumpable = 0;
>
> +extern kmem_cache_t  *sighand_cachep;

Is there no suitable header file to put this in?

--
Cheers,
Stephen Rothwell                    [EMAIL PROTECTED]
http://www.canb.auug.org.au/~sfr/




Re: [patch] cpufreq: mark cpufreq_tsc() as core_initcall_sync

2006-11-17 Thread Paul E. McKenney
On Fri, Nov 17, 2006 at 08:51:03PM -0800, Andrew Morton wrote:
> On Fri, 17 Nov 2006 23:33:45 -0500 (EST)
> Alan Stern <[EMAIL PROTECTED]> wrote:
> 
> > On Fri, 17 Nov 2006, Paul E. McKenney wrote:
> >
> > > > Perhaps a better approach to the initialization problem would be to 
> > > > assume
> > > > that either:
> > > >
> > > > 1.  The srcu_struct will be initialized before it is used, or
> > > >
> > > > 2.  When it is used before initialization, the system is running
> > > > only one thread.
> > >
> > > Are these assumptions valid?  If so, they would indeed simplify things
> > > a bit.
> >
> > I don't know.  Maybe Andrew can tell us -- is it true that the kernel runs
> > only one thread up through the time the core_initcalls are finished?
> 
> I don't see why - a core_initcall could go off and do the
> multithreaded-pci-probing thing, or it could call kernel_thread() or
> anything.  I doubt if any core_initcall functions _do_ do that, but there
> are a lot of them.
> 
> > If not, can we create another initcall level that is guaranteed to run
> > before any threads are spawned?
> 
> It's a simple and cheap matter to create a precore_initcall() - one would
> need to document it carefully to be able to preserve whatever guarantees it
> needs.
> 
> However by the time the initcalls get run, various things are already
> happening: SMP is up, the keventd threads are running, the CPU scheduler
> migration threads are running, ksoftirqd, softlockup-detector, etc.
> keventd is the problematic one.
> 
> So I guess you'd need a new linker section and a call from
> do_pre_smp_initcalls() or thereabouts.

Hmmm...  OK then, for the moment, I will stick with the current checks
in the primitives.  Not that I particularly like the "bulking up" of
srcu_read_lock() and srcu_read_unlock() -- but if the super-fast
version is needed, it can easily be provided either within the
confines of the subsystem that needs it, or as yet another set of
RCU-like primitives.  Hopefully this latter option can be avoided!

BTW, the reason for the hardluckref is that I don't want to inflict a
failure return from srcu_read_lock() on you guys.  The non-blocking
synchronization community has repeatedly made that sort of mistake,
and I have no intention of letting it propagate any further.  ;-)

Thanx, Paul


[PATCH -mm] handle BUG=n

2006-11-17 Thread Randy Dunlap
From: Randy Dunlap <[EMAIL PROTECTED]>

Handle BUG=n, GENERIC_BUG=n to prevent build errors:

arch/x86_64/kernel/built-in.o: In function `die':
(.text+0x3b3c): undefined reference to `report_bug'
arch/x86_64/kernel/built-in.o: In function `module_arch_cleanup':
(.text+0x10b60): undefined reference to `module_bug_cleanup'
arch/x86_64/kernel/built-in.o: In function `module_finalize':
(.text+0x10c98): undefined reference to `module_bug_finalize'

Signed-off-by: Randy Dunlap <[EMAIL PROTECTED]>
---
 include/linux/bug.h |   26 ++++++++++++++++++++------
 1 file changed, 20 insertions(+), 6 deletions(-)

--- linux-2619-rc5mm2.orig/include/linux/bug.h
+++ linux-2619-rc5mm2/include/linux/bug.h
@@ -3,6 +3,12 @@
 
 #include 
 
+enum bug_trap_type {
+   BUG_TRAP_TYPE_NONE = 0,
+   BUG_TRAP_TYPE_WARN = 1,
+   BUG_TRAP_TYPE_BUG = 2,
+};
+
 #ifdef CONFIG_GENERIC_BUG
 #include 
 #include 
@@ -12,12 +18,6 @@ static inline int is_warning_bug(const s
return bug->flags & BUGFLAG_WARNING;
 }
 
-enum bug_trap_type {
-   BUG_TRAP_TYPE_NONE = 0,
-   BUG_TRAP_TYPE_WARN = 1,
-   BUG_TRAP_TYPE_BUG = 2,
-};
-
 const struct bug_entry *find_bug(unsigned long bugaddr);
 
 enum bug_trap_type report_bug(unsigned long bug_addr);
@@ -29,5 +29,19 @@ void module_bug_cleanup(struct module *)
 /* These are defined by the architecture */
 int is_valid_bugaddr(unsigned long addr);
 
+#else  /* !CONFIG_GENERIC_BUG */
+
+static inline enum bug_trap_type report_bug(unsigned long bug_addr)
+{
+   return BUG_TRAP_TYPE_BUG;
+}
+static inline int  module_bug_finalize(const Elf_Ehdr *hdr,
+   const Elf_Shdr *sechdrs,
+   struct module *mod)
+{
+   return 0;
+}
+static inline void module_bug_cleanup(struct module *mod) {}
+
 #endif /* CONFIG_GENERIC_BUG */
 #endif /* _LINUX_BUG_H */
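
The pattern the patch relies on (real prototypes when the option is enabled, static inline stubs when it is not, with shared types declared outside the #ifdef) can be sketched in isolation as below. CONFIG_FEATURE_SKETCH and all other names are invented for this example; the option is deliberately left undefined to model the =n build.

```c
/* Shared type lives outside the #ifdef so both branches can use it. */
enum trap_type_sketch {
	TRAP_SKETCH_NONE = 0,
	TRAP_SKETCH_WARN = 1,
	TRAP_SKETCH_BUG = 2,
};

/* CONFIG_FEATURE_SKETCH is intentionally not defined here (the =n case). */
#ifdef CONFIG_FEATURE_SKETCH

/* Real implementation would be compiled and linked from another file. */
enum trap_type_sketch report_sketch(unsigned long addr);

#else /* !CONFIG_FEATURE_SKETCH */

/* Stub: keeps callers linking without the optional code being built. */
static inline enum trap_type_sketch report_sketch(unsigned long addr)
{
	(void)addr;
	return TRAP_SKETCH_BUG;
}

#endif /* CONFIG_FEATURE_SKETCH */

/* The caller is written once and needs no #ifdef of its own. */
int die_sketch(unsigned long addr)
{
	return report_sketch(addr) == TRAP_SKETCH_BUG;
}
```

Keeping the #ifdef in the header rather than at each call site is what removes the "undefined reference" errors without touching the callers.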


---


[PATCH -mm] profile_likely: export do_check_likely

2006-11-17 Thread Randy Dunlap
From: Randy Dunlap <[EMAIL PROTECTED]>

I see MODPOST warnings for all modules in some (random) configs; e.g.:
(This is a short list; I see >100 of these.)

WARNING: "do_check_likely" [net/sched/cls_basic.ko] undefined!
WARNING: "do_check_likely" [net/netfilter/x_tables.ko] undefined!
WARNING: "do_check_likely" [net/key/af_key.ko] undefined!
WARNING: "do_check_likely" [kernel/rcutorture.ko] undefined!
WARNING: "do_check_likely" [fs/xfs/xfs.ko] undefined!
WARNING: "do_check_likely" [fs/sysv/sysv.ko] undefined!
WARNING: "do_check_likely" [fs/reiserfs/reiserfs.ko] undefined!
WARNING: "do_check_likely" [fs/ntfs/ntfs.ko] undefined!
WARNING: "do_check_likely" [fs/minix/minix.ko] undefined!

Signed-off-by: Randy Dunlap <[EMAIL PROTECTED]>
---
 lib/likely_prof.c |2 ++
 1 file changed, 2 insertions(+)

--- linux-2619-rc5mm2.orig/lib/likely_prof.c
+++ linux-2619-rc5mm2/lib/likely_prof.c
@@ -10,6 +10,7 @@
  */
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -50,6 +51,7 @@ int do_check_likely(struct likeliness *l
 
return ret;
 }
+EXPORT_SYMBOL(do_check_likely);
 
 static void * lp_seq_start(struct seq_file *out, loff_t *pos)
 {


---


[PATCH] I2O: handle __copy_from_user

2006-11-17 Thread Randy Dunlap
From: Randy Dunlap <[EMAIL PROTECTED]>

Handle __copy_from_user() return value.

Noticed by inspection, not from build warning.

Signed-off-by: Randy Dunlap <[EMAIL PROTECTED]>
---
 drivers/message/i2o/i2o_config.c |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

--- linux-2619-rc5mm2.orig/drivers/message/i2o/i2o_config.c
+++ linux-2619-rc5mm2/drivers/message/i2o/i2o_config.c
@@ -265,7 +265,11 @@ static int i2o_cfg_swdl(unsigned long ar
return -ENOMEM;
}
 
-   __copy_from_user(buffer.virt, kxfer.buf, fragsize);
+   if (__copy_from_user(buffer.virt, kxfer.buf, fragsize)) {
+   i2o_msg_nop(c, msg);
+   i2o_dma_free(&c->pdev->dev, &buffer);
+   return -EFAULT;
+   }
 
msg->u.head[0] = cpu_to_le32(NINE_WORD_MSG_SIZE | SGL_OFFSET_7);
msg->u.head[1] =
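
The shape of this fix (check the copy's return value, release what was already acquired, return -EFAULT) can be sketched in user space as follows. copy_sketch() is a hypothetical stand-in that mimics the __copy_from_user() convention of returning the number of bytes not copied; a NULL source models a faulting user pointer, and malloc()/free() stand in for the DMA buffer handling.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Stand-in for __copy_from_user(): returns the number of bytes that could
 * NOT be copied, so 0 means success. NULL models a bad user pointer.
 */
static unsigned long copy_sketch(void *dst, const void *src, unsigned long n)
{
	if (!src)
		return n;
	memcpy(dst, src, n);
	return 0;
}

/* Hypothetical download step: allocate, copy, and clean up on failure. */
int download_fragment_sketch(const void *user_buf, unsigned long len)
{
	void *buffer = malloc(len);

	if (!buffer)
		return -ENOMEM;

	if (copy_sketch(buffer, user_buf, len)) {
		free(buffer);	/* undo the allocation before bailing out */
		return -EFAULT;
	}

	/* ... the fragment would be handed to the device here ... */
	free(buffer);
	return 0;
}
```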


---


[PATCH -mm] ipath needs HT_IRQ

2006-11-17 Thread Randy Dunlap
From: Randy Dunlap <[EMAIL PROTECTED]>

IPATH needs HT_IRQ to prevent these build errors:
drivers/built-in.o: In function `ipath_ht_free_irq':
ipath_iba6110.c:(.text+0x15c76b): undefined reference to `ht_destroy_irq'
drivers/built-in.o: In function `ipath_setup_ht_config':
ipath_iba6110.c:(.text+0x15cbb1): undefined reference to `__ht_create_irq'

Signed-off-by: Randy Dunlap <[EMAIL PROTECTED]>
---
 drivers/infiniband/hw/ipath/Kconfig |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- linux-2619-rc5mm2.orig/drivers/infiniband/hw/ipath/Kconfig
+++ linux-2619-rc5mm2/drivers/infiniband/hw/ipath/Kconfig
@@ -1,6 +1,6 @@
 config INFINIBAND_IPATH
tristate "QLogic InfiniPath Driver"
-   depends on PCI_MSI && 64BIT && INFINIBAND
+   depends on PCI_MSI && 64BIT && INFINIBAND && HT_IRQ
---help---
This is a driver for QLogic InfiniPath host channel adapters,
including InfiniBand verbs support.  This driver allows these


---


Re: [PATCH 18/20] x86_64: Relocatable kernel support

2006-11-17 Thread Oleg Verych
On 2006-11-17, Vivek Goyal wrote:
[]
>  static void error(char *x)
> @@ -281,57 +335,8 @@ static void error(char *x)
>   while(1);   /* Halt */
>  }

Is it possible to make this optional (using "panic" reboot timeout)?




[RFC 4/7] Move files_cachep to file.h

2006-11-17 Thread Christoph Lameter
Move files_cachep to file.h

The proper place is in file.h since it's related to file I/O.

Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>

Index: linux-2.6.19-rc5-mm2/include/linux/file.h
===
--- linux-2.6.19-rc5-mm2.orig/include/linux/file.h 2006-11-15 16:48:08.583913536 -0600
+++ linux-2.6.19-rc5-mm2/include/linux/file.h 2006-11-17 23:03:59.254839099 -0600
@@ -101,4 +101,6 @@ struct files_struct *get_files_struct(st
 void FASTCALL(put_files_struct(struct files_struct *fs));
 void reset_files_struct(struct task_struct *, struct files_struct *);
 
+extern kmem_cache_t *files_cachep;
+
 #endif /* __LINUX_FILE_H */
Index: linux-2.6.19-rc5-mm2/include/linux/slab.h
===
--- linux-2.6.19-rc5-mm2.orig/include/linux/slab.h 2006-11-17 23:03:55.587532089 -0600
+++ linux-2.6.19-rc5-mm2/include/linux/slab.h 2006-11-17 23:03:59.268512148 -0600
@@ -298,7 +298,6 @@ static inline void kmem_set_shrinker(kme
 
 /* System wide caches */
 extern kmem_cache_t *names_cachep;
-extern kmem_cache_t *files_cachep;
 extern kmem_cache_t *filp_cachep;
 extern kmem_cache_t *fs_cachep;
 


[RFC 6/7] Use an external declaration in exit.c for fs_cachep

2006-11-17 Thread Christoph Lameter
Use an external declaration in exit.c for fs_cachep.

fs_cachep is only used in kernel/exit.c and in kernel/fork.c.
It is defined in kernel/fork.c so we need to add an external
declaration to kernel/exit.c to be able to drop the
declaration from slab.h.

Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>

Index: linux-2.6.19-rc5-mm2/include/linux/slab.h
===
--- linux-2.6.19-rc5-mm2.orig/include/linux/slab.h 2006-11-17 23:04:05.859898302 -0600
+++ linux-2.6.19-rc5-mm2/include/linux/slab.h 2006-11-17 23:04:09.679562142 -0600
@@ -298,7 +298,6 @@ static inline void kmem_set_shrinker(kme
 
 /* System wide caches */
 extern kmem_cache_t *names_cachep;
-extern kmem_cache_t *fs_cachep;
 
 #endif /* __KERNEL__ */
 
Index: linux-2.6.19-rc5-mm2/kernel/exit.c
===
--- linux-2.6.19-rc5-mm2.orig/kernel/exit.c 2006-11-15 16:48:11.485511089 -0600
+++ linux-2.6.19-rc5-mm2/kernel/exit.c  2006-11-17 23:04:09.764530373 -0600
@@ -48,6 +48,8 @@
 #include 
 #include 
 
+extern kmem_cache_t *fs_cachep;
+
 extern void sem_exit (void);
 
 static void exit_mm(struct task_struct * tsk);


[RFC 7/7] Move names_cachep to fs.h

2006-11-17 Thread Christoph Lameter
Move names_cachep to fs.h

The names_cachep is used for getname() and putname(). So let's
put it into fs.h.

Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>

Index: linux-2.6.19-rc5-mm2/include/linux/slab.h
===
--- linux-2.6.19-rc5-mm2.orig/include/linux/slab.h 2006-11-17 23:04:09.679562142 -0600
+++ linux-2.6.19-rc5-mm2/include/linux/slab.h 2006-11-17 23:04:13.548058299 -0600
@@ -296,9 +296,6 @@ static inline void kmem_set_shrinker(kme
 
 #endif /* CONFIG_SLOB */
 
-/* System wide caches */
-extern kmem_cache_t *names_cachep;
-
 #endif /* __KERNEL__ */
 
 #endif /* _LINUX_SLAB_H */
Index: linux-2.6.19-rc5-mm2/include/linux/fs.h
===
--- linux-2.6.19-rc5-mm2.orig/include/linux/fs.h 2006-11-15 16:48:08.629815618 -0600
+++ linux-2.6.19-rc5-mm2/include/linux/fs.h 2006-11-17 23:04:13.586147506 -0600
@@ -1558,6 +1558,8 @@ extern char * getname(const char __user 
 extern void __init vfs_caches_init_early(void);
 extern void __init vfs_caches_init(unsigned long);
 
+extern kmem_cache_t *names_cachep;
+
 #define __getname() kmem_cache_alloc(names_cachep, SLAB_KERNEL)
 #define __putname(name) kmem_cache_free(names_cachep, (void *)(name))
 #ifndef CONFIG_AUDITSYSCALL


[RFC 0/7] Remove slab cache declarations in slab.h

2006-11-17 Thread Christoph Lameter
One of the strange issues in include/linux/slab.h is that it contains
a list of global slab caches. The following patches remove all the global
declarations from slab.h and find other ways of declaring these caches.

6 of the 7 defined caches are rarely used. One is never used.


[RFC 1/7] Remove declaration of sighand_cachep from slab.h

2006-11-17 Thread Christoph Lameter
Remove declaration of sighand_cachep from slab.h

The sighand cache is only used in fs/exec.c and kernel/fork.c. It is defined
in kernel/fork.c, so add an extern declaration to fs/exec.c and remove the
declaration from slab.h.

Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>

Index: linux-2.6.19-rc5-mm2/fs/exec.c
===
--- linux-2.6.19-rc5-mm2.orig/fs/exec.c 2006-11-15 16:47:59.065579813 -0600
+++ linux-2.6.19-rc5-mm2/fs/exec.c  2006-11-17 23:03:46.049603927 -0600
@@ -62,6 +62,8 @@ int core_uses_pid;
 char core_pattern[128] = "core";
 int suid_dumpable = 0;
 
+extern kmem_cache_t*sighand_cachep;
+
 EXPORT_SYMBOL(suid_dumpable);
 /* The maximal length of core_pattern is also specified in sysctl.c */
 
Index: linux-2.6.19-rc5-mm2/include/linux/slab.h
===
--- linux-2.6.19-rc5-mm2.orig/include/linux/slab.h  2006-11-17 23:02:21.371436329 -0600
+++ linux-2.6.19-rc5-mm2/include/linux/slab.h   2006-11-17 23:03:46.114062585 -0600
@@ -302,7 +302,6 @@ extern kmem_cache_t *names_cachep;
 extern kmem_cache_t*files_cachep;
 extern kmem_cache_t*filp_cachep;
 extern kmem_cache_t*fs_cachep;
-extern kmem_cache_t*sighand_cachep;
 extern kmem_cache_t*bio_cachep;
 
 #endif /* __KERNEL__ */


[RFC 2/7] Remove bio_cachep from slab.h

2006-11-17 Thread Christoph Lameter
Remove bio_cachep from slab.h

bio_cachep is no longer used, it seems.

Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>

Index: linux-2.6.19-rc5-mm2/include/linux/slab.h
===
--- linux-2.6.19-rc5-mm2.orig/include/linux/slab.h  2006-11-17 23:03:46.114062585 -0600
+++ linux-2.6.19-rc5-mm2/include/linux/slab.h   2006-11-17 23:03:51.817677214 -0600
@@ -302,7 +302,6 @@ extern kmem_cache_t *names_cachep;
 extern kmem_cache_t*files_cachep;
 extern kmem_cache_t*filp_cachep;
 extern kmem_cache_t*fs_cachep;
-extern kmem_cache_t*bio_cachep;
 
 #endif /* __KERNEL__ */
 


[RFC 3/7] Move vm_area_cachep to mm.h

2006-11-17 Thread Christoph Lameter
Move vm_area_cachep to mm.h

vm_area_cachep is used to store vm_area_structs, so move it to mm.h.

Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>

Index: linux-2.6.19-rc5-mm2/include/linux/mm.h
===
--- linux-2.6.19-rc5-mm2.orig/include/linux/mm.h 2006-11-15 16:48:09.197243479 -0600
+++ linux-2.6.19-rc5-mm2/include/linux/mm.h 2006-11-17 23:03:55.571905748 -0600
@@ -114,6 +114,8 @@ struct vm_area_struct {
 #endif
 };
 
+extern kmem_cache_t*vm_area_cachep;
+
 /*
  * This struct defines the per-mm list of VMAs for uClinux. If CONFIG_MMU is
  * disabled, then there's a single shared list of VMAs maintained by the
Index: linux-2.6.19-rc5-mm2/include/linux/slab.h
===
--- linux-2.6.19-rc5-mm2.orig/include/linux/slab.h  2006-11-17 23:03:51.817677214 -0600
+++ linux-2.6.19-rc5-mm2/include/linux/slab.h   2006-11-17 23:03:55.587532089 -0600
@@ -297,7 +297,6 @@ static inline void kmem_set_shrinker(kme
 #endif /* CONFIG_SLOB */
 
 /* System wide caches */
-extern kmem_cache_t*vm_area_cachep;
 extern kmem_cache_t*names_cachep;
 extern kmem_cache_t*files_cachep;
 extern kmem_cache_t*filp_cachep;


[RFC 5/7] Use external declaration for filp_cachep

2006-11-17 Thread Christoph Lameter
Use external declaration for filp_cachep.

filp_cachep is used in fs/file_table.c. It's defined in fs/dcache.c.
The easiest solution here is to add an external declaration to
fs/file_table.c.

Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>

Index: linux-2.6.19-rc5-mm2/include/linux/slab.h
===
--- linux-2.6.19-rc5-mm2.orig/include/linux/slab.h  2006-11-17 23:03:59.268512148 -0600
+++ linux-2.6.19-rc5-mm2/include/linux/slab.h   2006-11-17 23:04:05.859898302 -0600
@@ -298,7 +298,6 @@ static inline void kmem_set_shrinker(kme
 
 /* System wide caches */
 extern kmem_cache_t*names_cachep;
-extern kmem_cache_t*filp_cachep;
 extern kmem_cache_t*fs_cachep;
 
 #endif /* __KERNEL__ */
Index: linux-2.6.19-rc5-mm2/fs/file_table.c
===
--- linux-2.6.19-rc5-mm2.orig/fs/file_table.c   2006-11-15 16:47:59.622264626 -0600
+++ linux-2.6.19-rc5-mm2/fs/file_table.c 2006-11-17 23:04:05.885291107 -0600
@@ -35,6 +35,8 @@ __cacheline_aligned_in_smp DEFINE_SPINLO
 
 static struct percpu_counter nr_files __cacheline_aligned_in_smp;
 
+extern kmem_cache_t *filp_cachep;
+
 static inline void file_free_rcu(struct rcu_head *head)
 {
struct file *f =  container_of(head, struct file, f_u.fu_rcuhead);


Re: [PATCH 20/20] x86_64: Move CPU verification code to common file

2006-11-17 Thread Oleg Verych
Hallo.

On 2006-11-17, Vivek Goyal wrote:
[]
> +no_longmode:
> + /* This isn't an x86-64 CPU so hang */
> +1:
> + hlt
> + jmp 1b
> +
> +#include "../../kernel/verify_cpu.S"
> +

Could the hang be made optional? There was a discussion about applying
a "panic" reboot timeout here. Is it possible to implement that somehow?

[]
> diff -puN /dev/null arch/x86_64/kernel/verify_cpu.S
> --- /dev/null 2006-11-17 00:03:10.168280803 -0500
> +++ linux-2.6.19-rc6-reloc-root/arch/x86_64/kernel/verify_cpu.S   2006-11-17 00:14:07.0 -0500
> @@ -0,0 +1,106 @@
> +/*
> + *
> + *   verify_cpu.S - Code for cpu long mode and SSE verification
> + *
> + *   Copyright (c) 2006-2007  Vivek Goyal ([EMAIL PROTECTED])
   
Warning: File verify_cpu.S has modification time in the future...
(preliminary shoot (in the head ;))

[]
> +verify_cpu:
> +
> + pushfl  # Save caller passed flags
> + pushl   $0  # Kill any dangerous flags
> + popfl
> +
> + /* minimum CPUID flags for x86-64 */
> + /* see http://www.x86-64.org/lists/discuss/msg02971.html */

Maybe there's a place for this in Documentation/ ?

> +#define SSE_MASK ((1<<25)|(1<<26))
> +#define REQUIRED_MASK1 ((1<<0)|(1<<3)|(1<<4)|(1<<5)|(1<<6)|(1<<8)|\
> +(1<<13)|(1<<15)|(1<<24))

Maybe there is a more readable way to set up this mask?
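
One possible shape for a more readable mask, as a plain C sketch rather than the assembler-side definition the patch needs: name each bit of CPUID leaf 1 EDX and build the masks from the names. The enum and macro names below are made up for illustration; they are not the kernel's X86_FEATURE_* constants. Only the bit numbers come from the quoted patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bit positions in CPUID leaf 1, EDX. The names are illustrative only;
 * the bit numbers are the ones used in the quoted REQUIRED_MASK1/SSE_MASK.
 */
enum cpuid1_edx_bit {
	BIT_FPU  = 0,	/* x87 FPU on chip */
	BIT_PSE  = 3,	/* page size extension */
	BIT_TSC  = 4,	/* time stamp counter */
	BIT_MSR  = 5,	/* model specific registers */
	BIT_PAE  = 6,	/* physical address extension */
	BIT_CX8  = 8,	/* CMPXCHG8B */
	BIT_PGE  = 13,	/* page global enable */
	BIT_CMOV = 15,	/* conditional move */
	BIT_FXSR = 24,	/* FXSAVE/FXRSTOR */
	BIT_SSE  = 25,
	BIT_SSE2 = 26,
};

#define FEATURE(bit)	(UINT32_C(1) << (bit))

#define NAMED_REQUIRED_MASK1 \
	(FEATURE(BIT_FPU) | FEATURE(BIT_PSE) | FEATURE(BIT_TSC) | \
	 FEATURE(BIT_MSR) | FEATURE(BIT_PAE) | FEATURE(BIT_CX8) | \
	 FEATURE(BIT_PGE) | FEATURE(BIT_CMOV) | FEATURE(BIT_FXSR))

#define NAMED_SSE_MASK	(FEATURE(BIT_SSE) | FEATURE(BIT_SSE2))

/* Returns 1 if every required feature bit is set in the given EDX value. */
static inline int has_required_features(uint32_t edx)
{
	return (edx & NAMED_REQUIRED_MASK1) == NAMED_REQUIRED_MASK1;
}
```

The resulting values are identical to the numeric masks in the patch, so this only changes how the mask reads, not what is checked.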




Re: [patch] cpufreq: mark cpufreq_tsc() as core_initcall_sync

2006-11-17 Thread Andrew Morton
On Fri, 17 Nov 2006 23:33:45 -0500 (EST)
Alan Stern <[EMAIL PROTECTED]> wrote:

> On Fri, 17 Nov 2006, Paul E. McKenney wrote:
> 
> > > Perhaps a better approach to the initialization problem would be to assume
> > > that either:
> > > 
> > > 1.  The srcu_struct will be initialized before it is used, or
> > > 
> > > 2.  When it is used before initialization, the system is running
> > >   only one thread.
> > 
> > Are these assumptions valid?  If so, they would indeed simplify things
> > a bit.
> 
> I don't know.  Maybe Andrew can tell us -- is it true that the kernel runs 
> only one thread up through the time the core_initcalls are finished?

I don't see why - a core_initcall could go off and do the
multithreaded-pci-probing thing, or it could call kernel_thread() or
anything.  I doubt if any core_initcall functions _do_ do that, but there
are a lot of them.

> If not, can we create another initcall level that is guaranteed to run 
> before any threads are spawned?

It's a simple and cheap matter to create a precore_initcall() - one would
need to document it carefully to be able to preserve whatever guarantees it
needs.

However by the time the initcalls get run, various things are already
happening: SMP is up, the keventd threads are running, the CPU scheduler
migration threads are running, ksoftirqd, softlockup-detector, etc. 
keventd is the problematic one.

So I guess you'd need a new linker section and a call from
do_pre_smp_initcalls() or thereabouts.


Re: We're still coping with GCC < 3.0

2006-11-17 Thread Oleg Verych
Hallo.

On 2006-11-17, someone with nick Blaisorblade wrote:
> In arch/i386/kernel/irq.c (current git head) I found this comment:
>
> /*
>  * These should really be __section__(".bss.page_aligned") as well, but
>  * gcc's 3.0 and earlier don't handle that correctly.
>  */
> static char softirq_stack[NR_CPUS * THREAD_SIZE]
> __attribute__((__aligned__(THREAD_SIZE)));
>
> static char hardirq_stack[NR_CPUS * THREAD_SIZE]
> __attribute__((__aligned__(THREAD_SIZE)));
>
> That should be fixed now that we require GCC 3.0, not?
>
> Btw, there are other such comments, like in include/asm-i386/semaphore.h:
> sema_init (for GCC 2.7!). That one might not be the case to fix because of
> the increased stack usage
>
> I've seen other similar tests around, so I thought that it'd be useful to 
> centralize all tests for GCC versions to headers like include/compiler.h so 
> they're promptly removed when deprecating old compilers.
>
> What about this?
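
For illustration only, centralizing the compiler version tests could look roughly like the sketch below: one header computes a single version number, and every workaround is keyed off it. The macro names COMPILER_VERSION_CODE and NEEDS_OLD_GCC_WORKAROUND are hypothetical; only the predefined __GNUC__, __GNUC_MINOR__ and __GNUC_PATCHLEVEL__ macros are real.

```c
#include <assert.h>

/* Pack major.minor.patch into one comparable integer, e.g. 3.0.0 -> 30000. */
#define COMPILER_VERSION_CODE(major, minor, patch) \
	((major) * 10000 + (minor) * 100 + (patch))

/* GCC_VERSION is 0 when the compiler does not identify itself as GCC. */
#ifdef __GNUC__
#define GCC_VERSION \
	COMPILER_VERSION_CODE(__GNUC__, __GNUC_MINOR__, __GNUC_PATCHLEVEL__)
#else
#define GCC_VERSION 0
#endif

/*
 * Hypothetical workaround switch. When the minimum supported compiler is
 * raised, this is the only place that has to be found and deleted.
 */
#if GCC_VERSION && GCC_VERSION < COMPILER_VERSION_CODE(3, 0, 0)
#define NEEDS_OLD_GCC_WORKAROUND 1
#else
#define NEEDS_OLD_GCC_WORKAROUND 0
#endif
```

With the tests gathered like this, a grep for one macro name finds every version-dependent workaround once an old compiler is deprecated.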

Tested patches to Andrew Morton and code maintainers.
See also:


-- e-mail --
> (CC me on replies as I'm not subscribed)
Cc yourself; isn't that a smart idea?
Also, please, honor "Mail-Followup-To" headers and the like.




Re: [patch] cpufreq: mark cpufreq_tsc() as core_initcall_sync

2006-11-17 Thread Alan Stern
On Fri, 17 Nov 2006, Paul E. McKenney wrote:

> > Perhaps a better approach to the initialization problem would be to assume 
> > that either:
> > 
> > 1.  The srcu_struct will be initialized before it is used, or
> > 
> > 2.  When it is used before initialization, the system is running
> > only one thread.
> 
> Are these assumptions valid?  If so, they would indeed simplify things
> a bit.

I don't know.  Maybe Andrew can tell us -- is it true that the kernel runs 
only one thread up through the time the core_initcalls are finished?

If not, can we create another initcall level that is guaranteed to run 
before any threads are spawned?

> For the moment, I cheaped out and used a mutex_trylock.  If this can block,
> I will need to add a separate spinlock to guard per_cpu_ref allocation.

I haven't looked at your revised patch yet...  But it's important to keep 
things as simple as possible.

> Hmmm...  How to test this?  Time for the wrapper around alloc_percpu()
> that randomly fails, I guess.  ;-)

Do you really want things to continue in a highly degraded mode when 
percpu allocation fails?  Maybe it would be better just to pass the 
failure back to the caller.
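
A user-space sketch of the two ideas discussed here, under stated assumptions: a wrapper allocator that can be made to fail on purpose, and an init routine that simply hands the failure back to its caller instead of continuing in a degraded mode. None of the names below (flaky_alloc, counter_set, counter_set_init) come from the SRCU patch, and the injected failure is deterministic rather than random so that the behaviour is reproducible.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Fault injection state: fail every fail_every-th allocation (0 = never). */
static unsigned int alloc_calls;
static unsigned int fail_every;

/* Hypothetical stand-in for a wrapper around alloc_percpu(). */
static void *flaky_alloc(size_t size)
{
	alloc_calls++;
	if (fail_every && alloc_calls % fail_every == 0)
		return NULL;		/* injected failure */
	return malloc(size);
}

/* Stand-in for a structure that owns an array of per-CPU counters. */
struct counter_set {
	long *per_cpu_ref;
	int nr_cpus;
};

/* Returns 0 on success or -ENOMEM; no degraded fallback mode. */
static int counter_set_init(struct counter_set *cs, int nr_cpus)
{
	cs->nr_cpus = 0;
	cs->per_cpu_ref = flaky_alloc((size_t)nr_cpus * sizeof(*cs->per_cpu_ref));
	if (!cs->per_cpu_ref)
		return -ENOMEM;		/* pass the failure back to the caller */

	for (int i = 0; i < nr_cpus; i++)
		cs->per_cpu_ref[i] = 0;
	cs->nr_cpus = nr_cpus;
	return 0;
}

static void counter_set_destroy(struct counter_set *cs)
{
	free(cs->per_cpu_ref);
	cs->per_cpu_ref = NULL;
	cs->nr_cpus = 0;
}
```

The caller then has exactly one error path to test, and the fault-injecting wrapper makes that path easy to exercise.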

Alan Stern



RE: touch_cache() only touches two thirds

2006-11-17 Thread dean gaudet
On Fri, 17 Nov 2006, dean gaudet wrote:

> another pointer chase arranged to fill the L1 (or L2) using many many 
> pages.  i.e. suppose i wanted to traverse 32KiB L1 with 64B cache lines 
> then i'd allocate 512 pages and put one line on each page (pages ordered 
> randomly), but colour them so they fill the L1.  this conveniently happens 
> to fit in a 2MiB huge page on x86, so you could even ameliorate the TLB 
> pressure from the microbenchmark.

btw, for L2-sized measurements you don't need to be so clever... you can 
get away with a random traversal of the L2 on 128B boundaries.  (need to 
avoid the "next-line prefetch" issues on p-m/core/core2, p4 model 3 and 
later.)  there's just so many more pages required to map the L2 than any 
reasonable prefetcher is going to have any time soon.

-dean


> the benchmark i was considering would be like so:
> 
>   switch to cpu m
>   scrub the cache
>   switch to cpu n
>   scrub the cache
>   traverse the coloured list and modify each cache line as we go
>   switch to cpu m
>   start timing
>   traverse the coloured list without modification
>   stop timing


Re: "Unable to handle kernel NULL pointer dereference" in 2.6.18.2 (2.6.18-1.2239.fc5)

2006-11-17 Thread Oleg Verych
On 2006-11-17, somebody from ckeith.clara.net wrote:
[]
> dmesg is attached below, if anything else is needed please let me know.
> And if any one knows of any older kernels that are more stable for this
> hardware configuration (2x dual core opteron 2220SE's with aacraid) please
> let me know.

No one but you. But I would ask you to check a *newer* kernel.
Please check the latest available one, 2.6.19-rc6, from kernel.org.




Re: Read/Write multiple network FDs in a single syscall context switch?

2006-11-17 Thread Stephen Hemminger
On Fri, 17 Nov 2006 22:53:26 -0500
"Marc Snider" <[EMAIL PROTECTED]> wrote:

> Sorry, I must have given the wrong impression with respect to the data.   It 
> is not all the same.   Each ingress socket is associated with an individual 
> egress socket and the packet data being received and transmitted is unique 
> across ingress/egress socket pairs...
>  
> Guess I don't see the difficulties you alluded to below, Stephen.  The 
> userspace app would only ask to receive on sockets where there was already 
> known data available as per Epoll reporting.   I also think it a reasonable 
> constraint to require in this multiple FD operation case that all sockets are 
> mandated as nonblocking, thus a zero or some other unique return value could 
> be provided for each socket that would have blocked in lieu of EWOULDBLOCK.
>  
> Write sockets would only be written to when data was available, so there 
> would be no ambiguity on write operations.   Again, if the request could not 
> be satisfied due to socket buffer overflow or other aberration a nonblocking 
> return code would ensue.
>  
> If all socket FDs referenced were required to be nonblocking then I'm having 
> difficulty understanding how circumstances would differ for a vectorized 
> recvMultiple() or sendMultiple() operation when contrasted with doing 
> multiple individual recv() and/or send() calls on N nonblocking sockets...
>  
> Forgive me if I'm missing something.   It seems to me that the bang for the 
> buck in exponentially reducing the number of context switches required for a 
> userspace application to service many network FDs makes a great deal of sense 
> here 
>  
> Regards,
> Marc Snider
> [EMAIL PROTECTED]
> 

You forget that on Linux system calls are cheap, unlike on other OSes. A
poll/select followed by a receive is going to be as fast as any recv_any()
type interface. Unless you can reduce the number of copies from kernel to
user (or vice versa) there is no point in inventing yet another network
interface.


Re: FW: RTC , ds1307 I2C driver and NTP does not work.

2006-11-17 Thread Oleg Verych
> -Original Message-
> From: 
> Sent: 17 November 2006 17:57
> To: Joakim Tjernlund
> Cc: [EMAIL PROTECTED]
> Subject: Re: RTC , ds1307 I2C driver and NTP does not work.
>
>
> On Nov 17, 2006, at 10:38 AM, Joakim Tjernlund wrote:
>
>> I get this when I activate NTP and ntp "syncs" the time to the I2C HW
>> clock.
>
> You may be better off posting this to lkml and copying the i2c list (and
> rtc if one exists), since it's more a driver issue than anything
> really ppc specific.  Clearly we are doing schedules() in mpc_xfer()
> and maybe we shouldn't be.
>
> - kumar

Please try this patch:

Message-ID: <[EMAIL PROTECTED]>




Re: Read/Write multiple network FDs in a single syscall context switch?

2006-11-17 Thread Willy Tarreau
On Fri, Nov 17, 2006 at 04:40:30PM -0500, Marc Snider wrote:
> I've searched long and hard prior to posting here, but have been unable to 
> locate a kernel mechanism providing the ability to read or write multiple FDs 
> in a single userspace to kernel context switch.
> 
> We've got a userspace network application that uses epoll to wait for packet 
> arrival and then reads a single frame off of dozens of separate FDs 
> (sockets), operates on the payload and then forwards along by writing to 
> dozens of other separate FDs (sockets).   At high loads we invariably have 
> many dozens of socket FDs to read and write.
> 
> If 50 separate frames are received on 50 separate sockets then we are at 
> present doing 50 separate reads and then 50 separate writes, thus resulting 
> in over a hundred distinct (and seemingly unnecessary) user to kernel space 
> and kernel to user space context switches.   Is there a mechanism I've missed 
> which allows many network FDs to be read or written in a single syscall?   
> For example, something analogous to the recv() and send() calls but instead 
> providing a vector for the parameters and return value?
> 
> I picture something like:
> 
>      ssize_t *recvMultiple(int *s, void **buf, size_t *len, int *flags)       
>   and
>      ssize_t *sendMultiple(int *s, void **buf, size_t *len, int *flags)
> 
> 
> The user would have to be careful about not using blocking sockets with these 
> types of multiple FD operations, but it seems to me that such a kernel 
> mechanism would allow a user space process to eliminate dozens or even 
> hundreds of unnecessary context switches when servicing multiple network 
> FDs...    The cycle savings for an application like ours would be huge.   I 
> am confused about why I've been unable to locate such a mechanism considering 
> the perceived performance advantages and ubiquitous nature of user 
> applications that service many network FDs...

You should take a look at the "Kernel Mode Linux" patch. While it doesn't
provide the feature you want, it addresses this specific context switch
problem by making your app run in kernel space, thus considerably reducing
the cost of the system calls.

Regards,
Willy



RE: touch_cache() only touches two thirds

2006-11-17 Thread dean gaudet
On Fri, 10 Nov 2006, Bela Lubkin wrote:

> The corrected code in 
> covers the full cache range.  Granted that modern CPUs may be able to track
> multiple simultaneous cache access streams: how many such streams are they
> likely to be able to follow at once?  It seems like going from 1 to 2 would
> be a big win, 2 to 3 a small win, beyond that it wouldn't likely make much
> incremental difference.  So what do the actual implementations in the field
> support?

p-m family, core, core2 track one stream on each of 12 to 16 pages.  in 
the earlier ones they split the trackers into some forward-only and some 
backward-only, but on core2 i think they're all bidirectional.  if i had 
to guess they round-robin the trackers, so once you hit 17 pages with 
streams they're defeated.

a p4 (0f0403, probably "prescott") i have here is tracking 16 -- seems to 
use LRU or pLRU but i haven't tested really, you need to get out past 32 
streams before it really starts falling off... and even then the next-line 
prefetch in the L2 helps too much (64-byte lines, but 128-byte tags and a 
pair of dirty/state bits -- it prefetches the other half of a pair 
automatically).  oh it can track forward or backward, and is happy with 
strides up to 128.

k8 rev F tracks one stream on each of 20 pages (forward or backward).  it 
also seems to use round-robin, and is defeated as soon as you have 21 
streams.

i swear there was an x86 which did 28 streams, but it was a few years ago 
that i last looked really closely at the prefetchers and i don't have 
access to the data at the moment.

i suggest that streams are the wrong approach.  i was actually considering 
this same problem this week, happy to see your thread.

the approach i was considering was to set up two pointer chases:

one pointer chase covering enough cache lines (and in a prefetchable 
ordering) for "scrubbing" the cache(s).

another pointer chase arranged to fill the L1 (or L2) using many many 
pages.  i.e. suppose i wanted to traverse 32KiB L1 with 64B cache lines 
then i'd allocate 512 pages and put one line on each page (pages ordered 
randomly), but colour them so they fill the L1.  this conveniently happens 
to fit in a 2MiB huge page on x86, so you could even ameliorate the TLB 
pressure from the microbenchmark.
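
a minimal user-space sketch of that second pointer chase, under assumptions that are not stated in the mail: 4 KiB pages, and an L1 whose set index comes from address bits 6 to 11 (e.g. 8 ways x 64 sets), so that giving page p the line slot p % 64 puts 8 lines in every set. 32 KiB / 64 B = 512 lines, one per page, hence 512 x 4 KiB = 2 MiB of address space. all names are made up for the example, and the shuffle uses a small fixed LCG so the order is reproducible.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define LINE_SIZE	64u
#define PAGE_BYTES	4096u
#define L1_SIZE		(32u * 1024u)
#define NR_LINES	(L1_SIZE / LINE_SIZE)		/* 512 lines, one per page */
#define NR_COLOURS	(PAGE_BYTES / LINE_SIZE)	/* 64 line slots per page */
#define ARENA_SIZE	(NR_LINES * PAGE_BYTES)		/* 2 MiB */

/* one node per cache line; the padding keeps sizeof(struct node) == 64 */
struct node {
	struct node *next;
	char pad[LINE_SIZE - sizeof(struct node *)];
};

/* the single line used on a given page: slot (page % 64) within that page */
static struct node *node_at(unsigned char *arena, unsigned int page)
{
	return (struct node *)(arena + (size_t)page * PAGE_BYTES +
			       (page % NR_COLOURS) * LINE_SIZE);
}

/* link all 512 nodes into one cycle, visiting the pages in shuffled order */
static struct node *build_chase(unsigned char *arena, unsigned int seed)
{
	unsigned int order[NR_LINES];
	unsigned int i;

	for (i = 0; i < NR_LINES; i++)
		order[i] = i;

	/* Fisher-Yates shuffle driven by a simple LCG */
	for (i = NR_LINES - 1; i > 0; i--) {
		unsigned int j, tmp;

		seed = seed * 1103515245u + 12345u;
		j = (seed >> 16) % (i + 1);
		tmp = order[i];
		order[i] = order[j];
		order[j] = tmp;
	}

	for (i = 0; i < NR_LINES; i++)
		node_at(arena, order[i])->next =
			node_at(arena, order[(i + 1) % NR_LINES]);

	return node_at(arena, order[0]);
}

/* follow the chain until it returns to head; this is the part to time */
static unsigned int chase_length(const struct node *head)
{
	const struct node *p = head;
	unsigned int n = 0;

	do {
		p = p->next;
		n++;
	} while (p != head);
	return n;
}
```

the arena is expected to be page aligned (for example from aligned_alloc(PAGE_BYTES, ARENA_SIZE)); backing it with a single 2 MiB huge page is left out of the sketch.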

you can actually get away with a pointer every 256 bytes today -- none of 
the prefetchers on today's x86 cores consider a 256 byte stride to be 
prefetchable.  for safety you might want to use 512 byte alignment... this 
lets you get away with fewer pages for colouring larger caches.

the benchmark i was considering would be like so:

switch to cpu m
scrub the cache
switch to cpu n
scrub the cache
traverse the coloured list and modify each cache line as we go
switch to cpu m
start timing
traverse the coloured list without modification
stop timing

-dean


Re: Linux 2.6.19-rc6 - NFSD working again

2006-11-17 Thread Christian Kujau

Hi,

I just wanted to report a 'it works again' for rc6: after encountering 
the very same problems with -rc3 Jeff Garzik described in [0], I 
upgraded to -rc5 and applied the proposed[1] patch[2].
Now, the knfsd behaved a bit better (nfs-mounted /home, X11 
applications created thousands of empty 'configuration'-files),

however 'mkdir' and 'touch' still failed too often:

 $ mkdir /mnt/nfs/compile-farm/foo
 mkdir: /mnt/nfs/compile-farm/foo: Operation not permitted
 $ mkdir /mnt/nfs/compile-farm/foo
 mkdir: /mnt/nfs/compile-farm/foo: File exists

...and things like that.

With -rc6 this seems to be gone. However, I noticed this message in the 
server's (192.168.10.10) syslog:


nfs4_cb: server 127.0.1.1/192.168.10.10 AUTH_UNIX 0 not responding, timed out
nfs4_cb: server 127.0.1.1/192.168.10.10 AUTH_UNIX 0 not responding, timed out

The NFS server is running on 0.0.0.0:2049, what does this mean?
The message occurs once in a while, not sure what triggers it, found 
not much in the archives...


Thanks,
Christian.

[0] http://uwsg.iu.edu/hypermail/linux/kernel/0611.0/1418.html
[1] http://uwsg.iu.edu/hypermail/linux/kernel/0611.0/1491.html
[2] http://www.citi.umich.edu/projects/nfsv4/linux/kernel-patches/2.6.19-rc3-2/linux-2.6.19-rc3-CITI_NFS4_ALL-2.diff
--
BOFH excuse #106:

The electrician didn't know what the yellow cable was so he yanked the ethernet 
out.


tests of kernel modules

2006-11-17 Thread Pavol Gono

Hi

After resolving http://bugzilla.kernel.org/show_bug.cgi?id=7481
I was thinking about how to prevent such bugs with testing. Usually
just a few insmods and rmmods show whether the initialization and
cleanup code of a module is OK or not.

I created a script which automatically finds all modules and
inserts/removes them (see attachment). On my Lifebook E8110,
kernel 2.6.18.2, the following modules were problematic:
arptable_filter pktgen rfcomm rpcsec_gss_krb5 sdhci xt_NFQUEUE
Kernel logs usually say "BUG: unable to handle kernel paging request
at virtual address ..." or "BUG: unable to handle kernel NULL pointer
dereference at virtual address ".

When trying Knoppix 5.0.1, my script quickly causes a total freeze.
Is it worth reporting each buggy module to bugzilla? Reproducing is
quite simple and the effects are common. People just usually don't use
many insmods and rmmods in normal life...

Palo


test_modules
Description: application/shellscript


[patch] smbfs: is obsolete, please use CIFS

2006-11-17 Thread Oleg Verych

Signed-off-by: Oleg Verych <[EMAIL PROTECTED]>
--

Note, some white spaces were killed.

--- linux-2.6-mm/fs/Kconfig~smbfs-is-obsolete+emacs-visiting 2006-11-15 08:58:53.097867250 +
+++ linux-2.6-mm/fs/Kconfig 2006-11-18 03:22:24.055118500 +
@@ -1200,5 +1200,5 @@
help
  If you say Y here, you can use the 'debug' mount option to enable
- debugging output from the driver. 
+ debugging output from the driver.
 
 config BFS_FS
@@ -1326,5 +1326,5 @@
  the kernel or by users (see the attr(5) manual page, or visit
   for details).
- 
+
  If unsure, say N.
 
@@ -1337,8 +1337,8 @@
  Posix Access Control Lists (ACLs) support permissions for users and
  groups beyond the owner/group/world scheme.
- 
+
  To learn more about Access Control Lists, visit the Posix ACLs for
  Linux website .
- 
+
  If you don't know what Access Control Lists are, say N
 
@@ -1352,5 +1352,5 @@
  enables an extended attribute handler for file security
  labels in the jffs2 filesystem.
- 
+
  If you are not using a security module that requires using
  extended attributes for file security labels, say N.
@@ -1852,5 +1852,5 @@
 
 config SMB_FS
-   tristate "SMB file system support (to mount Windows shares etc.)"
+   tristate "SMB file system support (OBSOLETE, please use CIFS)"
depends on INET
select NLS
@@ -1875,6 +1875,6 @@
  Macs is on the WWW at .
 
- To compile the SMB support as a module, choose M here: the module will
- be called smbfs.  Most people say N, however.
+ To compile the SMB support as a module, choose M here:
+ the module will be called smbfs.  Most people say N, however.
 
 config SMB_NLS_DEFAULT
@@ -1908,28 +1908,28 @@
 
 config CIFS
-   tristate "CIFS support (advanced network filesystem for Samba, Window and other CIFS compliant servers)"
+   tristate "CIFS support (advanced network filesystem, SMBFS successor)"
depends on INET
select NLS
help
  This is the client VFS module for the Common Internet File System
- (CIFS) protocol which is the successor to the Server Message Block 
+ (CIFS) protocol which is the successor to the Server Message Block
  (SMB) protocol, the native file sharing mechanism for most early
- PC operating systems.  The CIFS protocol is fully supported by 
- file servers such as Windows 2000 (including Windows 2003, NT 4  
+ PC operating systems.  The CIFS protocol is fully supported by
+ file servers such as Windows 2000 (including Windows 2003, NT 4
  and Windows XP) as well by Samba (which provides excellent CIFS
  server support for Linux and many other operating systems). Limited
- support for Windows ME and similar servers is provided as well. 
+ support for Windows ME and similar servers is provided as well.
  You must use the smbfs client filesystem to access older SMB servers
  such as OS/2 and DOS.
 
  The intent of the cifs module is to provide an advanced
- network file system client for mounting to CIFS compliant servers, 
+ network file system client for mounting to CIFS compliant servers,
  including support for dfs (hierarchical name space), secure per-user
  session establishment, safe distributed caching (oplock), optional
- packet signing, Unicode and other internationalization improvements, 
+ packet signing, Unicode and other internationalization improvements,
  and optional Winbind (nsswitch) integration. You do not need to enable
  cifs if running only a (Samba) server. It is possible to enable both
  smbfs and cifs (e.g. if you are using CIFS for accessing Windows 2003
- and Samba 3 servers, and smbfs for accessing old servers). If you need 
+ and Samba 3 servers, and smbfs for accessing old servers). If you need
  to mount to Samba or Windows from this machine, say Y.
 
@@ -1969,5 +1969,5 @@
  mounts may be less secure than mounts using NTLM or more recent
  security mechanisms if you are on a public network.  Unless you
- have a need to access old SMB servers (and are on a private 
+ have a need to access old SMB servers (and are on a private
  network) you probably want to say N.  Even if this support
  is enabled in the kernel build, they will not be used
@@ -1975,8 +1975,8 @@
  can be set to required (or optional) either in
  /proc/fs/cifs (see fs/cifs/README for more detail) or via an
- option on the mount command. This support is disabled by 
+ option on the mount command. This support is disabled by
  

Re: [PATCH 17/20] x86_64: Remove CONFIG_PHYSICAL_START

2006-11-17 Thread Vivek Goyal
On Sat, Nov 18, 2006 at 10:14:31AM +0900, Magnus Damm wrote:
> Hi Vivek,
> 
> Sorry for not commenting on an earlier version.
> 
> On 11/18/06, Vivek Goyal <[EMAIL PROTECTED]> wrote:
> >I am about to add relocatable kernel support which has essentially
> >no cost so there is no point in retaining CONFIG_PHYSICAL_START
> >and retaining CONFIG_PHYSICAL_START makes implementation of and
> >testing of a relocatable kernel more difficult.
> >
> >Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
> >Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]>
> >---
> >
> > arch/x86_64/Kconfig|   19 ---
> > arch/x86_64/boot/compressed/head.S |6 +++---
> > arch/x86_64/boot/compressed/misc.c |6 +++---
> > arch/x86_64/defconfig  |1 -
> > arch/x86_64/kernel/vmlinux.lds.S   |2 +-
> > arch/x86_64/mm/fault.c |4 ++--
> > include/asm-x86_64/page.h  |2 --
> > 7 files changed, 9 insertions(+), 31 deletions(-)
> 
> [snip]
> 
> >diff -puN arch/x86_64/mm/fault.c~x86_64-Remove-CONFIG_PHYSICAL_START arch/x86_64/mm/fault.c
> >--- linux-2.6.19-rc6-reloc/arch/x86_64/mm/fault.c~x86_64-Remove-CONFIG_PHYSICAL_START   2006-11-17 00:12:50.0 -0500
> >+++ linux-2.6.19-rc6-reloc-root/arch/x86_64/mm/fault.c  2006-11-17 00:12:50.0 -0500
> >@@ -644,9 +644,9 @@ void vmalloc_sync_all(void)
> >start = address + PGDIR_SIZE;
> >}
> >/* Check that there is no need to do the same for the modules area. */
> >-   BUILD_BUG_ON(!(MODULES_VADDR > __START_KERNEL));
> >+   BUILD_BUG_ON(!(MODULES_VADDR > __START_KERNEL_map));
> >BUILD_BUG_ON(!(((MODULES_END - 1) & PGDIR_MASK) ==
> >-   (__START_KERNEL & PGDIR_MASK)));
> >+   (__START_KERNEL_map & PGDIR_MASK)));
> > }
> 
> This code looks either like a bugfix or a bug. If it's a fix then
> maybe it should be broken out and submitted separately for the
> rc-kernels?
> 

Magnus, Eric got rid of __START_KERNEL because he was compiling the kernel
for physical address zero, which made __START_KERNEL and __START_KERNEL_map
the same. That is the reason for the change above.

But compiling for physical address zero has the drawback that one cannot
directly load a vmlinux, as it would have to be loaded at physical
address zero. Hence I changed the behavior back to compiling the kernel for
physical address 2MB. So now __START_KERNEL = __START_KERNEL_map + 2MB.

Now it makes sense to retain __START_KERNEL. I have made the changes.
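
For readers following along, the relationship described above can be checked
at compile time. This is only a userspace sketch, not the kernel's page.h:
the address constants are assumed values for the x86_64 layout of that era,
BUILD_BUG_ON is approximated with C11 static_assert, and kernel_vaddr() is a
hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout constants (illustrative; check page.h and pgtable.h). */
#define START_KERNEL_MAP  UINT64_C(0xffffffff80000000) /* kernel text mapping base */
#define PHYSICAL_START    UINT64_C(0x200000)           /* 2MB, as described above */
#define START_KERNEL      (START_KERNEL_MAP + PHYSICAL_START)
#define MODULES_VADDR     UINT64_C(0xffffffff88000000)
#define MODULES_END       UINT64_C(0xfffffffffff00000)
#define PGDIR_SHIFT       39
#define PGDIR_MASK        (~((UINT64_C(1) << PGDIR_SHIFT) - 1))

/* The two checks from vmalloc_sync_all(), written against START_KERNEL. */
static_assert(MODULES_VADDR > START_KERNEL,
              "modules must sit above the kernel image");
static_assert(((MODULES_END - 1) & PGDIR_MASK) == (START_KERNEL & PGDIR_MASK),
              "modules and kernel must share one top-level page table entry");

/* Virtual address of the kernel image for a given physical load address. */
static inline uint64_t kernel_vaddr(uint64_t phys)
{
    return START_KERNEL_MAP + phys;
}
```

With these assumed values both checks hold whether they are written against
START_KERNEL or START_KERNEL_MAP, since the two differ by only 2MB.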


> >diff -puN include/asm-x86_64/page.h~x86_64-Remove-CONFIG_PHYSICAL_START include/asm-x86_64/page.h
> >--- linux-2.6.19-rc6-reloc/arch/include/asm-x86_64/page.h~x86_64-Remove-CONFIG_PHYSICAL_START    2006-11-17 00:12:50.0 -0500
> >+++ linux-2.6.19-rc6-reloc-root/include/asm-x86_64/page.h   2006-11-17 00:12:50.0 -0500
> >@@ -75,8 +75,6 @@ typedef struct { unsigned long pgprot; }
> >
> > #endif /* !__ASSEMBLY__ */
> >
> >-#define __PHYSICAL_START   _AC(CONFIG_PHYSICAL_START,UL)
> >-#define __START_KERNEL (__START_KERNEL_map + __PHYSICAL_START)
> > #define __START_KERNEL_map _AC(0xffffffff80000000,UL)
> > #define __PAGE_OFFSET   _AC(0xffff810000000000,UL)
> 
> I understand that you want to remove the Kconfig option
> CONFIG_PHYSICAL_START and that is fine with me. I don't however like
> the idea of replacing __PHYSICAL_START and __START_KERNEL with
> hardcoded values. Is there any special reason behind this?
> 

All the hardcoded 2MB values have disappeared in the final version. See the
next patch in the series, which actually implements the relocatable kernel.
The whole logic itself has changed, so these hardcoded values are no longer
required. This patch retains them so that even if somebody removes the top
patch, the kernel can still be compiled and booted.

So, bottom line: none of the hardcoded values remain once all the patches
have been applied.

> The code in page.h already has constants for __START_KERNEL_map and
> __PAGE_OFFSET (thank god) and none of them are adjustable via Kconfig.
> Why not change as little as possible and keep __PHYSICAL_START and
> __START_KERNEL in page.h and the places that use them but remove
> references to CONFIG_PHYSICAL_START in Kconfig, defconfig, and page.h?

Good suggestion. I have now retained __START_KERNEL, but did not feel
the need to retain __PHYSICAL_START. It would be used in only one place
in page.h.

Please find attached the regenerated patch.

Thanks
Vivek



I am about to add relocatable kernel support which has essentially
no cost so there is no point in retaining CONFIG_PHYSICAL_START
and retaining CONFIG_PHYSICAL_START makes implementation of and
testing of a relocatable kernel more difficult.

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]>
---

 arch/x86_64/Kconfig|   19 ---
 

Re: [PATCH] emit logging when a process receives a fatal signal

2006-11-17 Thread Oleg Verych
On Sat, Nov 18, 2006 at 03:04:13AM +0100, Folkert van Heusden wrote:
> > > > I found that sometimes processes disappear on some heavily used system
> > > > of mine without any logging. So I've written a patch against 2.6.18.2
> > > > which emits logging when a process emits a fatal signal.
> > > Why not to patch default signal handlers in glibc, to have not only
> > > stderr, but syslog, or /dev/kmsg copy of fatal messages?
> > Afaik when a proces gets shot because of a segfault, also the libraries
> > it used are shot so to say. iirc some of the more fatal signals are
> > handled directly by the kernel.

The kernel sends signals, no doubt.

Then who do you think prints those "Killed" or "Segmentation fault"
messages to *stderr*?
[Hint: libc's default signal handler (man 2 signal).]

> Also: what about statically build programs?

"-lc" embeds libc in a static binary, no?

IMHO it's not an lkml issue. There are many here who would tell you that
userspace problems are userspace problems.



Re: [PATCH 1/7] paravirtualization: header and stubs for paravirtualizing critical operations

2006-11-17 Thread john stultz
On Wed, 2006-11-01 at 21:27 +1100, Rusty Russell wrote:
> Create a paravirt.h header for all the critical operations which need
> to be replaced with hypervisor calls, and include that instead of
> defining native operations, when CONFIG_PARAVIRT.
> 
> This patch does the dumbest possible replacement of paravirtualized
> instructions: calls through a "paravirt_ops" structure.  Currently
> these are function implementations of native hardware: hypervisors
> will override the ops structure with their own variants.
> 
[snip]

> +struct paravirt_ops paravirt_ops = {
> + .name = "bare hardware",
[snip]
> + .get_wallclock = native_get_wallclock,
> + .set_wallclock = native_set_wallclock,

[snip]

> --- /dev/null
> +++ b/include/asm-i386/time.h
> @@ -0,0 +1,41 @@
> +#ifndef _ASMi386_TIME_H
> +#define _ASMi386_TIME_H
> +
> +#include 
> +#include "mach_time.h"
> +
> +static inline unsigned long native_get_wallclock(void)
> +{
> + unsigned long retval;
> +
> + if (efi_enabled)
> + retval = efi_get_time();
> + else
> + retval = mach_get_cmos_time();
> +
> + return retval;
> +}
> +
> +static inline int native_set_wallclock(unsigned long nowtime)
> +{
> + int retval;
> +
> + if (efi_enabled)
> + retval = efi_set_rtc_mmss(nowtime);
> + else
> + retval = mach_set_rtc_mmss(nowtime);
> +
> + return retval;
> +}
> +
> +#ifdef CONFIG_PARAVIRT
> +#include 
> +#else /* !CONFIG_PARAVIRT */
> +
> +#define get_wallclock() native_get_wallclock()
> +#define set_wallclock(x) native_set_wallclock(x)


Could a better name than "get/set_wallclock" be used here? It's too vague
and could easily be confused with the do_set/gettimeofday() functions.

My suggestion would be to use "persistent_clock" to describe the
battery-backed CMOS/hardware clock. (I assume that is what you intend
this paravirt_op to be, rather than getting the high-resolution system
timeofday.)
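
To make the naming suggestion concrete, here is a minimal userspace sketch
of an ops table using the "persistent_clock" names. Everything in it is
hypothetical: struct clock_ops, the native_* stubs and the fake CMOS value
are stand-ins for illustration, not the paravirt_ops structure from the
patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ops table; field names follow the naming suggested above. */
struct clock_ops {
    const char *name;
    unsigned long (*get_persistent_clock)(void);        /* seconds */
    int (*set_persistent_clock)(unsigned long nowtime); /* 0 on success */
};

/* Stand-in for the battery-backed CMOS/EFI clock on bare hardware. */
static unsigned long fake_cmos_seconds = 1163808000UL;

static unsigned long native_get_persistent_clock(void)
{
    return fake_cmos_seconds;
}

static int native_set_persistent_clock(unsigned long nowtime)
{
    fake_cmos_seconds = nowtime;
    return 0;
}

/* Default table: native implementations. A hypervisor port would overwrite
 * the function pointers with its own variants. */
static struct clock_ops clock_ops = {
    .name = "bare hardware",
    .get_persistent_clock = native_get_persistent_clock,
    .set_persistent_clock = native_set_persistent_clock,
};

/* Callers go through the table and never name the backend directly. */
static unsigned long get_persistent_clock(void)
{
    assert(clock_ops.get_persistent_clock != NULL);
    return clock_ops.get_persistent_clock();
}

static int set_persistent_clock(unsigned long nowtime)
{
    assert(clock_ops.set_persistent_clock != NULL);
    return clock_ops.set_persistent_clock(nowtime);
}
```

The point of the sketch is only that the name at the call site says which
clock is meant, independent of which backend fills in the table.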

thanks
-john




Re: [PATCH] emit logging when a process receives a fatal signal

2006-11-17 Thread Folkert van Heusden
> > > I found that sometimes processes disappear on some heavily used system
> > > of mine without any logging. So I've written a patch against 2.6.18.2
> > > which emits logging when a process emits a fatal signal.
> > Why not to patch default signal handlers in glibc, to have not only
> > stderr, but syslog, or /dev/kmsg copy of fatal messages?
> Afaik when a proces gets shot because of a segfault, also the libraries
> it used are shot so to say. iirc some of the more fatal signals are
> handled directly by the kernel.

Also: what about statically built programs?


Folkert van Heusden

-- 
Feeling generous? -> http://www.vanheusden.com/wishlist.php
--
Phone: +31-6-41278122, PGP-key: 1F28D8AE, www.vanheusden.com


Re: [PATCH] emit logging when a process receives a fatal signal

2006-11-17 Thread Folkert van Heusden
Hi,

> > I found that sometimes processes disappear on some heavily used system
> > of mine without any logging. So I've written a patch against 2.6.18.2
> > which emits logging when a process emits a fatal signal.
> Why not to patch default signal handlers in glibc, to have not only
> stderr, but syslog, or /dev/kmsg copy of fatal messages?

Afaik when a process gets shot because of a segfault, the libraries
it used are also shot, so to say. IIRC some of the more fatal signals are
handled directly by the kernel.


Folkert van Heusden

-- 
www.biglumber.com <- site where one can exchange PGP key signatures 
--
Phone: +31-6-41278122, PGP-key: 1F28D8AE, www.vanheusden.com


[PATCH] Regard MSRs in lapic_suspend()/lapic_resume()

2006-11-17 Thread Karsten Wiese

Read/write APIC_LVTPC and APIC_LVTTHMR only
if get_maxlvt() returns certain values.
This is done as everywhere else in i386/kernel/apic.c,
so I guess it's correct.
Suspends/resumes to disk fine and eliminates an smp_error_interrupt()
here on a K8.

Signed-off-by: Karsten Wiese <[EMAIL PROTECTED]>
---
 arch/i386/kernel/apic.c |   22 ++
 1 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/arch/i386/kernel/apic.c b/arch/i386/kernel/apic.c
index 2fd4b7d..776d9be 100644
--- a/arch/i386/kernel/apic.c
+++ b/arch/i386/kernel/apic.c
@@ -647,23 +647,30 @@ static struct {
 static int lapic_suspend(struct sys_device *dev, pm_message_t state)
 {
unsigned long flags;
+   int maxlvt;
 
if (!apic_pm_state.active)
return 0;
 
+   maxlvt = get_maxlvt();
+
apic_pm_state.apic_id = apic_read(APIC_ID);
apic_pm_state.apic_taskpri = apic_read(APIC_TASKPRI);
apic_pm_state.apic_ldr = apic_read(APIC_LDR);
apic_pm_state.apic_dfr = apic_read(APIC_DFR);
apic_pm_state.apic_spiv = apic_read(APIC_SPIV);
apic_pm_state.apic_lvtt = apic_read(APIC_LVTT);
-   apic_pm_state.apic_lvtpc = apic_read(APIC_LVTPC);
+   if (maxlvt >= 4)
+   apic_pm_state.apic_lvtpc = apic_read(APIC_LVTPC);
apic_pm_state.apic_lvt0 = apic_read(APIC_LVT0);
apic_pm_state.apic_lvt1 = apic_read(APIC_LVT1);
apic_pm_state.apic_lvterr = apic_read(APIC_LVTERR);
apic_pm_state.apic_tmict = apic_read(APIC_TMICT);
apic_pm_state.apic_tdcr = apic_read(APIC_TDCR);
-   apic_pm_state.apic_thmr = apic_read(APIC_LVTTHMR);
+#ifdef CONFIG_X86_MCE_P4THERMAL
+   if (maxlvt >= 5)
+   apic_pm_state.apic_thmr = apic_read(APIC_LVTTHMR);
+#endif

local_irq_save(flags);
disable_local_APIC();
@@ -675,10 +682,13 @@ static int lapic_resume(struct sys_devic
 {
unsigned int l, h;
unsigned long flags;
+   int maxlvt;
 
if (!apic_pm_state.active)
return 0;
 
+   maxlvt = get_maxlvt();
+
local_irq_save(flags);
 
/*
@@ -700,8 +710,12 @@ static int lapic_resume(struct sys_devic
apic_write(APIC_SPIV, apic_pm_state.apic_spiv);
apic_write(APIC_LVT0, apic_pm_state.apic_lvt0);
apic_write(APIC_LVT1, apic_pm_state.apic_lvt1);
-   apic_write(APIC_LVTTHMR, apic_pm_state.apic_thmr);
-   apic_write(APIC_LVTPC, apic_pm_state.apic_lvtpc);
+#ifdef CONFIG_X86_MCE_P4THERMAL
+   if (maxlvt >= 5)
+   apic_write(APIC_LVTTHMR, apic_pm_state.apic_thmr);
+#endif
+   if (maxlvt >= 4)
+   apic_write(APIC_LVTPC, apic_pm_state.apic_lvtpc);
apic_write(APIC_LVTT, apic_pm_state.apic_lvtt);
apic_write(APIC_TDCR, apic_pm_state.apic_tdcr);
apic_write(APIC_TMICT, apic_pm_state.apic_tmict);
-- 
1.4.2.4
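
The guard pattern in the patch above can be modelled in userspace roughly as
follows. This is a sketch under stated assumptions: the register indices are
illustrative and are not real APIC offsets, the hardware is a plain array,
and unlike the patch (which re-reads get_maxlvt() on resume) this version
remembers which registers were saved in a bitmask.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative register indices; not the real APIC register offsets. */
enum lvt_reg { LVT_TIMER, LVT_PC, LVT_THERMAL, LVT_COUNT };

struct lvt_state {
    uint32_t regs[LVT_COUNT];
    unsigned saved_mask;   /* bit n set when regs[n] holds a saved value */
};

/* Smallest maxlvt for which each register is assumed to exist,
 * mirroring the ">= 4" and ">= 5" tests in the patch. */
static int lvt_min_maxlvt(enum lvt_reg r)
{
    switch (r) {
    case LVT_PC:      return 4;
    case LVT_THERMAL: return 5;
    default:          return 0;
    }
}

/* Save only the registers this CPU is assumed to implement. */
static void lvt_save(struct lvt_state *st, const uint32_t *hw, int maxlvt)
{
    assert(st != NULL && hw != NULL);
    st->saved_mask = 0;
    for (int r = 0; r < LVT_COUNT; r++) {
        if (maxlvt < lvt_min_maxlvt((enum lvt_reg)r))
            continue;
        st->regs[r] = hw[r];
        st->saved_mask |= 1u << r;
    }
}

/* Write back only what was saved; untouched registers keep their value. */
static void lvt_restore(const struct lvt_state *st, uint32_t *hw)
{
    assert(st != NULL && hw != NULL);
    for (int r = 0; r < LVT_COUNT; r++)
        if (st->saved_mask & (1u << r))
            hw[r] = st->regs[r];
}
```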


Re: [PATCH 19/20] x86_64: Extend bzImage protocol for relocatable kernel

2006-11-17 Thread Vivek Goyal
On Fri, Nov 17, 2006 at 04:45:46PM -0800, H. Peter Anvin wrote:
> Vivek Goyal wrote:
> >On Fri, Nov 17, 2006 at 04:30:04PM -0800, H. Peter Anvin wrote:
> >>Vivek Goyal wrote:
> >>>o Extend the bzImage protocol (same as i386) to allow bzImage loaders to
> >>> load the protected mode kernel at non-1MB address. Now protected mode
> >>> component is relocatable and can be loaded at non-1MB addresses.
> >>>
> >>>o As of today kdump uses it to run a second kernel from a reserved memory
> >>> area.
> >>>
> >>>Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]>
> >>Do you have a patch for Documentation/i386/boot.txt as well?
> >>
> >
> >Yes. As documentation is shared between i386 and x86_64, It is already 
> >there
> >in Andi's tree and in -mm. I had pushed that with i386 relocatable bzImage
> >changes.
> >
> >http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19-rc5/2.6.19-rc5-mm2/broken-out/x86_64-mm-extend-bzimage-protocol-for-relocatable-protected-mode-kernel.patch
> >
> 
> Your documentation change is buggy.
> 
> The fields at 0230/4 and 0234/1 are 2.05+ not 2.04+
> 
> Please fix, also please update the last revision date.

Thanks for noticing this. I have just sent a patch in a separate thread to
fix this.

Thanks
Vivek


Re: [2.6 patch] mark pci_find_device() as __deprecated

2006-11-17 Thread Alan
On Sat, 18 Nov 2006 01:06:29 +0100
> Oh, and if anything starts complaining "But this adds some warnings to 
> my kernel build!", he should either first fix the 200 kB (sic) of 
> warnings I'm getting in 2.6.19-rc5-mm2 starting at MODPOST or go to hell.
> 
> Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

The only significant user remaining is the old ISDN code so it doesn't
create too many of them. 

Acked-by: Alan Cox <[EMAIL PROTECTED]>


Re: [PATCH] emit logging when a process receives a fatal signal

2006-11-17 Thread Oleg Verych
On 2006-11-18, Folkert van Heusden wrote:
> Hi,
>
> I found that sometimes processes disappear on some heavily used system
> of mine without any logging. So I've written a patch against 2.6.18.2
> which emits logging when a process emits a fatal signal.

Why not patch the default signal handlers in glibc, so that fatal messages
go not only to stderr but also to syslog or /dev/kmsg?
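
As a rough illustration of the userspace route, a program (or a preloaded
library) can install its own handler for the fatal signals. This is a
minimal sketch, not glibc code: fatal_signal_logger(), install_fatal_logger()
and default_fatal_sigs are hypothetical names, and the message goes to
stderr via write(2) because syslog(3) is not async-signal-safe.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical handler: report, then let the default action run. */
static void fatal_signal_logger(int sig)
{
    static const char msg[] = "fatal signal received\n";

    /* write(2) is async-signal-safe; syslog(3) and printf(3) are not. */
    (void)write(STDERR_FILENO, msg, sizeof(msg) - 1);

    /* Restore the default disposition and re-raise, so the process still
     * terminates the way it would have without this handler. */
    signal(sig, SIG_DFL);
    raise(sig);
}

/* An example set of signals to cover. */
static const int default_fatal_sigs[] = {
    SIGSEGV, SIGBUS, SIGILL, SIGFPE, SIGABRT
};

/* Install the logger for n signals; returns 0 on success, -1 on error. */
static int install_fatal_logger(const int *sigs, size_t n)
{
    struct sigaction sa;

    assert(sigs != NULL || n == 0);
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = fatal_signal_logger;
    sigemptyset(&sa.sa_mask);

    for (size_t i = 0; i < n; i++)
        if (sigaction(sigs[i], &sa, NULL) != 0)
            return -1;
    return 0;
}
```

This only helps for programs that opt in and cannot catch SIGKILL, which is
part of what the rest of the thread argues about.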

> Signed-off-by: Folkert van Heusden <[EMAIL PROTECTED]>
>
> --- linux-2.6.18.2/kernel/signal.c    2006-11-04 02:33:58.0 +0100
> +++ linux-2.6.18.2.new/kernel/signal.c    2006-11-17 15:59:13.0 +0100
> @@ -706,6 +706,15 @@
>   struct sigqueue * q = NULL;
>   int ret = 0;
>  
> + if (sig == SIGQUIT || sig == SIGILL  || sig == SIGTRAP ||
> + sig == SIGABRT || sig == SIGBUS  || sig == SIGFPE  ||
> + sig == SIGSEGV || sig == SIGXCPU || sig == SIGXFSZ ||
> + sig == SIGSYS  || sig == SIGSTKFLT)
> + {
> + printk(KERN_WARNING "Sig %d send to %d owned by %d.%d (%s)\n",
> + sig, t -> pid, t -> uid, t -> gid, t -> comm);
> + }
> +
>   /*
>* fast-pathed signals for kernel-internal things like SIGSTOP
>* or SIGKILL.
>
>
> Folkert van Heusden
>
> www.vanheusden.com/multitail - multitail is tail on steroids. multiple
>windows, filtering, coloring, anything you can think of
> --
> Phone: +31-6-41278122, PGP-key: 1F28D8AE, www.vanheusden.com



Re: [PATCH 12/20] x86_64: wakeup.S Misc cleanup

2006-11-17 Thread Vivek Goyal
On Sat, Nov 18, 2006 at 01:19:07AM +0100, Pavel Machek wrote:
> Hi!
> 
> > o Various cleanups. One of the main purpose of cleanups is that make
> >   wakeup.S as close as possible to trampoline.S.
> >
> > o Following are the changes
> > - Indentations for comments.
> > - Changed the gdt table to compact form and to resemble the
> >   one in trampoline.S
> > - Take the jump to 32bit from real mode using ljmpl. Makes code
> >   more readable.
> > - After enabling long mode, directly take a long jump for 64bit
> >   mode. No need to take an extra jump to "reach_comaptibility_mode"
> > - Stack is not used after real mode. So don't load stack in
> >   32 bit mode.
> > - No need to enable PGE here.
> > - No need to do extra EFER read, anyway we trash the read contents.
> > - No need to enable system call (EFER_SCE). Anyway it will be
> >   enabled when original EFER is restored.
> > - No need to set MP, ET, NE, WP, AM bits in cr0. Very soon we will
> >   reload the original cr0 while restroing the processor state.
> >
> > Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
> > Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]>
> 
> ACK, minor nitpicks:
> 
> > +   /* ??? Why I need the accessed bit set in order for this to work? */
> 
> Yes, I'd like to know :-).
> 

I don't know. :-( Maybe it is there because it is present in the original
code too. I just changed it from 9b00 to 9a00 for __KERNEL32_CS
and __KERNEL_CS to mark the entries unaccessed, and it works fine for me.

Eric, any thoughts on this?
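
For reference, the difference between 9b and 9a is a single bit in the
descriptor's access byte. The helpers below are a hypothetical userspace
sketch for decoding it; the full 64-bit descriptor values are assumed to be
the usual flat segments (base 0, limit 0xfffff) and are not taken from this
mail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed full descriptor values: flat segments, base 0, limit 0xfffff. */
#define KERNEL32_CS_DESC UINT64_C(0x00cf9b000000ffff)
#define KERNEL_CS_DESC   UINT64_C(0x00af9b000000ffff)
#define KERNEL_DS_DESC   UINT64_C(0x00cf93000000ffff)

static_assert(((KERNEL32_CS_DESC >> 40) & 0xff) == 0x9b,
              "access byte lives in bits 40..47");

/* The access byte occupies bits 40..47 of a segment descriptor. */
static inline uint8_t desc_access(uint64_t desc)
{
    return (uint8_t)(desc >> 40);
}

/* Bit 0 of the access byte is the "accessed" flag. */
static inline bool desc_accessed(uint64_t desc)
{
    return (desc_access(desc) & 0x01) != 0;
}

/* Bit 53 is the L flag: set for a 64-bit code segment. */
static inline bool desc_long_mode(uint64_t desc)
{
    return ((desc >> 53) & 1) != 0;
}

/* Clearing bit 40 turns access byte 0x9b into 0x9a. */
static inline uint64_t desc_clear_accessed(uint64_t desc)
{
    return desc & ~(UINT64_C(1) << 40);
}
```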

> > +   .quad   0x00cf9b000000ffff  # __KERNEL32_CS
> > +   .quad   0x00af9b000000ffff  # __KERNEL_CS
> > +   .quad   0x00cf93000000ffff  # __KERNEL_DS
> 
> Can we get a comment telling us what to keep it in sync with?
> 

Ok. Just added a line mentioning that it should be kept in sync with
trampoline.S.

Please find attached the revised patch.

Thanks
Vivek




o Various cleanups. One of the main purposes of the cleanups is to make
  wakeup.S as close as possible to trampoline.S.

o Following are the changes
- Indentations for comments.
- Changed the gdt table to compact form and to resemble the
  one in trampoline.S
- Take the jump to 32bit from real mode using ljmpl. Makes code
  more readable.
- After enabling long mode, directly take a long jump for 64bit
  mode. No need to take an extra jump to "reach_compatibility_mode"
- Stack is not used after real mode. So don't load stack in
  32 bit mode.
- No need to enable PGE here.
- No need to do extra EFER read, anyway we trash the read contents.
- No need to enable system call (EFER_SCE). Anyway it will be
  enabled when original EFER is restored.
- No need to set MP, ET, NE, WP, AM bits in cr0. Very soon we will
  reload the original cr0 while restoring the processor state.

Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]>
---

 arch/x86_64/kernel/acpi/wakeup.S |  112 +--
 1 file changed, 40 insertions(+), 72 deletions(-)

diff -puN arch/x86_64/kernel/acpi/wakeup.S~x86_64-wakeup.S-misc-cleanups arch/x86_64/kernel/acpi/wakeup.S
--- linux-2.6.19-rc5-git2-reloc/arch/x86_64/kernel/acpi/wakeup.S~x86_64-wakeup.S-misc-cleanups  2006-11-17 00:29:30.0 -0500
+++ linux-2.6.19-rc5-git2-reloc-root/arch/x86_64/kernel/acpi/wakeup.S   2006-11-17 10:01:10.0 -0500
@@ -30,11 +30,12 @@ wakeup_code:
        cld
        # setup data segment
        movw    %cs, %ax
-       movw    %ax, %ds                        # Make ds:0 point to wakeup_start
+       movw    %ax, %ds        # Make ds:0 point to wakeup_start
        movw    %ax, %ss
-       mov     $(wakeup_stack - wakeup_code), %sp      # Private stack is needed for ASUS board
+       # Private stack is needed for ASUS board
+       mov     $(wakeup_stack - wakeup_code), %sp
 
-       pushl   $0                              # Kill any dangerous flags
+       pushl   $0              # Kill any dangerous flags
        popfl
 
        movl    real_magic - wakeup_code, %eax
@@ -45,7 +46,7 @@ wakeup_code:
        jz      1f
        lcall   $0xc000,$3
        movw    %cs, %ax
-       movw    %ax, %ds                        # Bios might have played with that
+       movw    %ax, %ds        # Bios might have played with that
        movw    %ax, %ss
 1:
 
@@ -75,9 +76,12 @@ wakeup_code:
        jmp     1f
 1:
 
-       .byte 0x66, 0xea                        # prefix + jmpi-opcode
-       .long   wakeup_32 - __START_KERNEL_map
-       .word   __KERNEL_CS
+       ljmpl   *(wakeup_32_vector - wakeup_code)
+
+       .balign 4
+wakeup_32_vector:
+       .long   wakeup_32 - __START_KERNEL_map
+       .word   __KERNEL32_CS, 0
 
        .code32
 wakeup_32:
@@ -96,65 +100,50 @@ 

Re: [PATCH 17/20] x86_64: Remove CONFIG_PHYSICAL_START

2006-11-17 Thread Magnus Damm

Hi Vivek,

Sorry for not commenting on an earlier version.

On 11/18/06, Vivek Goyal <[EMAIL PROTECTED]> wrote:

I am about to add relocatable kernel support which has essentially
no cost so there is no point in retaining CONFIG_PHYSICAL_START
and retaining CONFIG_PHYSICAL_START makes implementation of and
testing of a relocatable kernel more difficult.

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]>
---

 arch/x86_64/Kconfig|   19 ---
 arch/x86_64/boot/compressed/head.S |6 +++---
 arch/x86_64/boot/compressed/misc.c |6 +++---
 arch/x86_64/defconfig  |1 -
 arch/x86_64/kernel/vmlinux.lds.S   |2 +-
 arch/x86_64/mm/fault.c |4 ++--
 include/asm-x86_64/page.h  |2 --
 7 files changed, 9 insertions(+), 31 deletions(-)


[snip]


diff -puN arch/x86_64/mm/fault.c~x86_64-Remove-CONFIG_PHYSICAL_START arch/x86_64/mm/fault.c
--- linux-2.6.19-rc6-reloc/arch/x86_64/mm/fault.c~x86_64-Remove-CONFIG_PHYSICAL_START   2006-11-17 00:12:50.0 -0500
+++ linux-2.6.19-rc6-reloc-root/arch/x86_64/mm/fault.c  2006-11-17 00:12:50.0 -0500
@@ -644,9 +644,9 @@ void vmalloc_sync_all(void)
start = address + PGDIR_SIZE;
}
/* Check that there is no need to do the same for the modules area. */
-   BUILD_BUG_ON(!(MODULES_VADDR > __START_KERNEL));
+   BUILD_BUG_ON(!(MODULES_VADDR > __START_KERNEL_map));
BUILD_BUG_ON(!(((MODULES_END - 1) & PGDIR_MASK) ==
-   (__START_KERNEL & PGDIR_MASK)));
+   (__START_KERNEL_map & PGDIR_MASK)));
 }


This code looks either like a bugfix or a bug. If it's a fix then
maybe it should be broken out and submitted separately for the
rc-kernels?


diff -puN include/asm-x86_64/page.h~x86_64-Remove-CONFIG_PHYSICAL_START include/asm-x86_64/page.h
--- linux-2.6.19-rc6-reloc/include/asm-x86_64/page.h~x86_64-Remove-CONFIG_PHYSICAL_START        2006-11-17 00:12:50.0 -0500
+++ linux-2.6.19-rc6-reloc-root/include/asm-x86_64/page.h   2006-11-17 00:12:50.0 -0500
@@ -75,8 +75,6 @@ typedef struct { unsigned long pgprot; }

 #endif /* !__ASSEMBLY__ */

-#define __PHYSICAL_START   _AC(CONFIG_PHYSICAL_START,UL)
-#define __START_KERNEL (__START_KERNEL_map + __PHYSICAL_START)
 #define __START_KERNEL_map _AC(0xffffffff80000000,UL)
 #define __PAGE_OFFSET   _AC(0xffff810000000000,UL)


I understand that you want to remove the Kconfig option
CONFIG_PHYSICAL_START and that is fine with me. I don't however like
the idea of replacing __PHYSICAL_START and __START_KERNEL with
hardcoded values. Is there any special reason behind this?

The code in page.h already has constants for __START_KERNEL_map and
__PAGE_OFFSET (thank god) and none of them are adjustable via Kconfig.
Why not change as little as possible and keep __PHYSICAL_START and
__START_KERNEL in page.h and the places that use them but remove
references to CONFIG_PHYSICAL_START in Kconfig, defconfig, and page.h?

/ magnus


Re: How to go about debuging a system lockup?

2006-11-17 Thread Krzysztof Halasa
"Jesper Juhl" <[EMAIL PROTECTED]> writes:

> Or just try a few random older 2.6 kernels like 2.6.14, 2.6.9,
> 2.6.whatever (of course it needs to be a version that git knows
> about).

One can also do "bisect" manually, works with all kernels.
-- 
Krzysztof Halasa
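
Manual bisection is just a binary search over an ordered list of kernels
between a known-good and a known-bad one. A minimal sketch, assuming a
hypothetical is_bad() test hook that boots version `idx` and reports whether
it locks up (bad_from_11() below is a stand-in for that manual test):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical test hook: returns nonzero when version `idx` locks up. */
typedef int (*is_bad_fn)(size_t idx);

/*
 * Versions are ordered oldest to newest. `good` is known to work and `bad`
 * is known to fail, with good < bad. Returns the first failing index.
 */
static size_t bisect_first_bad(size_t good, size_t bad, is_bad_fn is_bad)
{
    assert(good < bad);
    while (bad - good > 1) {
        size_t mid = good + (bad - good) / 2;
        if (is_bad(mid))
            bad = mid;    /* regression is at mid or earlier */
        else
            good = mid;   /* regression is after mid */
    }
    return bad;
}

/* Stand-in for the manual boot test: pretend version 11 broke things. */
static int bad_from_11(size_t idx)
{
    return idx >= 11;
}
```

Each step halves the range, so about log2(n) test boots are needed for n
candidate kernels.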


[PATCH] emit logging when a process receives a fatal signal

2006-11-17 Thread Folkert van Heusden
Hi,

I found that sometimes processes disappear on some heavily used system
of mine without any logging. So I've written a patch against 2.6.18.2
which emits logging when a process receives a fatal signal.

Signed-off-by: Folkert van Heusden <[EMAIL PROTECTED]>

--- linux-2.6.18.2/kernel/signal.c  2006-11-04 02:33:58.0 +0100
+++ linux-2.6.18.2.new/kernel/signal.c  2006-11-17 15:59:13.0 +0100
@@ -706,6 +706,15 @@
struct sigqueue * q = NULL;
int ret = 0;
 
+   if (sig == SIGQUIT || sig == SIGILL  || sig == SIGTRAP ||
+   sig == SIGABRT || sig == SIGBUS  || sig == SIGFPE  ||
+   sig == SIGSEGV || sig == SIGXCPU || sig == SIGXFSZ ||
+   sig == SIGSYS  || sig == SIGSTKFLT)
+   {
+   printk(KERN_WARNING "Sig %d send to %d owned by %d.%d (%s)\n",
+   sig, t -> pid, t -> uid, t -> gid, t -> comm);
+   }
+
/*
 * fast-pathed signals for kernel-internal things like SIGSTOP
 * or SIGKILL.
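
The selection logic of the hunk above can be tried out in userspace. This is
only a sketch: is_reported_signal() and format_signal_log() are hypothetical
names, snprintf() stands in for printk(), and SIGSTKFLT is guarded because
it is not defined on every platform.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* The signals the patch above reports. */
static bool is_reported_signal(int sig)
{
    switch (sig) {
    case SIGQUIT: case SIGILL:  case SIGTRAP:
    case SIGABRT: case SIGBUS:  case SIGFPE:
    case SIGSEGV: case SIGXCPU: case SIGXFSZ:
    case SIGSYS:
#ifdef SIGSTKFLT            /* Linux-specific; absent on some platforms */
    case SIGSTKFLT:
#endif
        return true;
    default:
        return false;
    }
}

/* Build the log line for a reported signal. Returns the snprintf() result,
 * or 0 (buffer untouched) when the signal is not in the reported set. */
static int format_signal_log(char *buf, size_t len, int sig, int pid,
                             unsigned uid, unsigned gid, const char *comm)
{
    assert(buf != NULL && comm != NULL);
    if (!is_reported_signal(sig))
        return 0;
    return snprintf(buf, len, "Sig %d sent to %d owned by %u.%u (%s)",
                    sig, pid, uid, gid, comm);
}
```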


Folkert van Heusden

www.vanheusden.com/multitail - multitail is tail on steroids. multiple
   windows, filtering, coloring, anything you can think of
--
Phone: +31-6-41278122, PGP-key: 1F28D8AE, www.vanheusden.com


Re: Basic support for siemens sx1

2006-11-17 Thread Tony Lindgren
* Pavel Machek <[EMAIL PROTECTED]> [061118 02:38]:
> Hi!
> 
> > * Pavel Machek <[EMAIL PROTECTED]> [061116 19:04]:
> > > From: Vladimir Ananiev <[EMAIL PROTECTED]>
> > > 
> > > This adds basic support for Siemens SX1. More patches are available,
> > > with video driver, mixer, and serial ports working. That is enough to
> > > do gsm calls with right userland.
> > 
> > Cool.
> 
> :-)
> 
> > > It would be nice to get basic patches merged to the -omap tree... do
> > > they look ok?
> > 
> > Yeah, looks good, except for the i2c part. Is Sofia really a TI PCF8574
> > i2c chip? In that case it could use the gpioexpander code.  
> > 
> > Anyways, let's plan on pushing this to linux-omap tree, then do the
> > changes for gpioexpander, and send that upstream too.
> 
> Works for me. I'll check with google to find out what sofia really is.

OK, I've pushed to linux-omap after separating USB and MMC code into
separate patches. Also did a bit more tabifying on the code.

BTW, PCF8574 .pdf is available on TI's site.

Regards,

Tony


Re: [PATCH 19/20] x86_64: Extend bzImage protocol for relocatable kernel

2006-11-17 Thread H. Peter Anvin

Vivek Goyal wrote:

On Fri, Nov 17, 2006 at 04:30:04PM -0800, H. Peter Anvin wrote:

Vivek Goyal wrote:

o Extend the bzImage protocol (same as i386) to allow bzImage loaders to
 load the protected mode kernel at non-1MB address. Now protected mode
 component is relocatable and can be loaded at non-1MB addresses.

o As of today kdump uses it to run a second kernel from a reserved memory
 area.

Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]>

Do you have a patch for Documentation/i386/boot.txt as well?



Yes. As documentation is shared between i386 and x86_64, It is already there
in Andi's tree and in -mm. I had pushed that with i386 relocatable bzImage
changes.

http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19-rc5/2.6.19-rc5-mm2/broken-out/x86_64-mm-extend-bzimage-protocol-for-relocatable-protected-mode-kernel.patch



Your documentation change is buggy.

The fields at 0230/4 and 0234/1 are 2.05+ not 2.04+

Please fix, also please update the last revision date.

-hpa
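
For loader authors, the practical consequence of the 2.05+ remark is that
the field at 0234 must not be read unless the header version says it exists.
A minimal sketch of such a check, assuming the setup header has been read
into a byte buffer; kernel_is_relocatable() and the HDR_* names are
hypothetical, the offsets for the "HdrS" magic and version word are the
standard boot protocol ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Offsets into the real-mode setup header. */
#define HDR_MAGIC_OFF        0x202  /* "HdrS" */
#define HDR_VERSION_OFF      0x206  /* 2 bytes, little endian, e.g. 0x0205 */
#define HDR_KERNEL_ALIGN_OFF 0x230  /* 4 bytes, protocol 2.05+ */
#define HDR_RELOCATABLE_OFF  0x234  /* 1 byte,  protocol 2.05+ */
#define HDR_MIN_SIZE         0x235

/* Read a little-endian 16-bit value. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* True only when the header is new enough to carry the field and the
 * field says the protected-mode kernel may be loaded elsewhere. */
static bool kernel_is_relocatable(const uint8_t *hdr, size_t len)
{
    assert(hdr != NULL);
    if (len < HDR_MIN_SIZE)
        return false;
    if (memcmp(hdr + HDR_MAGIC_OFF, "HdrS", 4) != 0)
        return false;
    if (rd16(hdr + HDR_VERSION_OFF) < 0x0205)
        return false;           /* field not defined before 2.05 */
    return hdr[HDR_RELOCATABLE_OFF] != 0;
}
```

A loader that checked for 2.04 instead would read a byte whose meaning is
undefined in 2.04 kernels, which is why the documented version matters.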


Re: [PATCH 9/20] x86_64: 64bit PIC SMP trampoline

2006-11-17 Thread Pavel Machek
On Fri 2006-11-17 19:33:52, Vivek Goyal wrote:
> On Sat, Nov 18, 2006 at 01:27:10AM +0100, Pavel Machek wrote:
> > Hi!
> > 
> > > that long mode is supported.  Asking if long mode is implemented is
> > > down right silly but we have traditionally had some of these checks,
> > > and they can't hurt anything.  So when the totally ludicrous happens
> > > we just might handle it correctly.
> > 
> > Well, it is silly, and it is 50 lines of dense assembly. can we get
> > rid of it or get it shared with bootup version?
> > 
> 
> Hi Pavel,
> 
> Last patch in the series (patch 20)  already does that. That patch just
> puts all the assembly at one place which everybody shares. 
> 
> I know it is bad to introduce and delete your own code, but I kept that
> patch as last patch as all the other patches have got fair bit of testing
> in RHEL kernels and I wanted to make sure that if last patch breaks something
> problem can be isolated relatively easily.

Ahha, okay. ACK, then.
Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: Basic support for siemens sx1

2006-11-17 Thread Pavel Machek
Hi!

> * Pavel Machek <[EMAIL PROTECTED]> [061116 19:04]:
> > From: Vladimir Ananiev <[EMAIL PROTECTED]>
> > 
> > This adds basic support for Siemens SX1. More patches are available,
> > with video driver, mixer, and serial ports working. That is enough to
> > do gsm calls with right userland.
> 
> Cool.

:-)

> > It would be nice to get basic patches merged to the -omap tree... do
> > they look ok?
> 
> Yeah, looks good, except for the i2c part. Is Sofia really a TI PCF8574
> i2c chip? In that case it could use the gpioexpander code.  
> 
> Anyways, let's plan on pushing this to linux-omap tree, then do the
> changes for gpioexpander, and send that upstream too.

Works for me. I'll check with google to find out what sofia really is.

Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: [PATCH 19/20] x86_64: Extend bzImage protocol for relocatable kernel

2006-11-17 Thread Vivek Goyal
On Fri, Nov 17, 2006 at 04:30:04PM -0800, H. Peter Anvin wrote:
> Vivek Goyal wrote:
> >
> >o Extend the bzImage protocol (same as i386) to allow bzImage loaders to
> >  load the protected mode kernel at non-1MB address. Now protected mode
> >  component is relocatable and can be loaded at non-1MB addresses.
> >
> >o As of today kdump uses it to run a second kernel from a reserved memory
> >  area.
> >
> >Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]>
> 
> Do you have a patch for Documentation/i386/boot.txt as well?
> 

Yes. As documentation is shared between i386 and x86_64, It is already there
in Andi's tree and in -mm. I had pushed that with i386 relocatable bzImage
changes.

http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19-rc5/2.6.19-rc5-mm2/broken-out/x86_64-mm-extend-bzimage-protocol-for-relocatable-protected-mode-kernel.patch

Thanks
Vivek


Re: [PATCH 19/20] x86_64: Extend bzImage protocol for relocatable kernel

2006-11-17 Thread H. Peter Anvin

Vivek Goyal wrote:


o Extend the bzImage protocol (same as i386) to allow bzImage loaders to
  load the protected mode kernel at non-1MB address. Now protected mode
  component is relocatable and can be loaded at non-1MB addresses.

o As of today kdump uses it to run a second kernel from a reserved memory
  area.

Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]>


Do you have a patch for Documentation/i386/boot.txt as well?

-hpa


Re: [patch] cpufreq: mark cpufreq_tsc() as core_initcall_sync

2006-11-17 Thread Paul E. McKenney
On Fri, Nov 17, 2006 at 02:27:15PM -0500, Alan Stern wrote:
> On Fri, 17 Nov 2006, Paul E. McKenney wrote:
> 
> > > It works for me, but the overhead is still large. Before it would take
> > > 8-12 jiffies for a synchronize_srcu() to complete without there actually
> > > being any reader locks active, now it takes 2-3 jiffies. So it's
> > > definitely faster, and as suspected the loss of two of three
> > > synchronize_sched() cut down the overhead to a third.
> > 
> > Good to hear, thank you for trying it out!
> > 
> > > It's still too heavy for me; by far most of the calls I make to
> > > synchronize_srcu() don't have any reader locks pending. I'm still a
> > > big advocate of the fastpath srcu_readers_active() check. I can
> > > understand the reluctance to make it the default, but for my case it's
> > > "safe enough", so if we could either export srcu_readers_active() or
> > > export a synchronize_srcu_fast() (or something like that), then SRCU
> > > would be a good fit for barrier vs plug rework.
> > 
> > OK, will export the interface.  Do your queues have associated locking?
> > 
> > > > Attached is a patch that compiles, but probably goes down in flames
> > > > otherwise.
> > > 
> > > Works here :-)
> > 
> > I have at least a couple bugs that would show up under low-memory
> > situations, will fix and post an update.
> 
> Perhaps a better approach to the initialization problem would be to assume 
> that either:
> 
> 1.  The srcu_struct will be initialized before it is used, or
> 
> 2.  When it is used before initialization, the system is running
>   only one thread.

Are these assumptions valid?  If so, they would indeed simplify things
a bit.

> In other words, statically allocated SRCU structures that get used during
> system startup must be initialized before the system starts multitasking.  
> That seems like a reasonable requirement.
> 
> This eliminates worries about readers holding mutexes.  It doesn't 
> solve the issues surrounding your hardluckref, but maybe it makes them 
> easier to think about.

For the moment, I cheaped out and used a mutex_trylock.  If this can block,
I will need to add a separate spinlock to guard per_cpu_ref allocation.

Hmmm...  How to test this?  Time for the wrapper around alloc_percpu()
that randomly fails, I guess.  ;-)
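
A minimal sketch of such a randomly failing allocation wrapper, written
as user-space C around malloc() rather than alloc_percpu().  The names
flaky_alloc, flaky_next and flaky_malloc are made up for illustration and
are not kernel interfaces; a kernel version would presumably wrap
alloc_percpu() the same way.  A seeded generator is used so that a
failing run can be replayed.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical test helper, not a kernel API. */
struct flaky_alloc {
	uint32_t state;          /* PRNG state, i.e. the seed */
	uint32_t fail_one_in;    /* 0 disables injected failures */
	unsigned long failures;  /* injected failures so far */
};

/* Small deterministic linear congruential generator. */
static uint32_t flaky_next(struct flaky_alloc *fa)
{
	fa->state = fa->state * 1664525u + 1013904223u;
	return fa->state >> 16;
}

/* Behaves like malloc(), but fails roughly once every fail_one_in calls. */
static void *flaky_malloc(struct flaky_alloc *fa, size_t size)
{
	assert(fa != NULL);
	if (fa->fail_one_in != 0 && flaky_next(fa) % fa->fail_one_in == 0) {
		fa->failures++;
		return NULL;     /* injected failure */
	}
	return malloc(size);
}
```

Setting fail_one_in to 1 forces every allocation to fail, which is a
quick way to walk the low-memory paths once before switching to a
sparser failure rate.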

Thanx, Paul


Re: [PATCH 9/20] x86_64: 64bit PIC SMP trampoline

2006-11-17 Thread Vivek Goyal
On Sat, Nov 18, 2006 at 01:27:10AM +0100, Pavel Machek wrote:
> Hi!
> 
> > that long mode is supported.  Asking if long mode is implemented is
> > downright silly but we have traditionally had some of these checks,
> > and they can't hurt anything.  So when the totally ludicrous happens
> > we just might handle it correctly.
> 
> Well, it is silly, and it is 50 lines of dense assembly. Can we get
> rid of it or get it shared with bootup version?
> 

Hi Pavel,

The last patch in the series (patch 20) already does that. It puts all the
assembly in one place that every entry point shares.

I know it is bad form to introduce code and then delete it in the same series,
but I kept that patch last because all the other patches have had a fair bit
of testing in RHEL kernels. If the last patch breaks something, the problem
can then be isolated relatively easily.

Thanks
Vivek


[PATCH 11/20] x86_64: wakeup.S Rename labels to reflect right register names

2006-11-17 Thread Vivek Goyal


o Use appropriate names for 64bit registers.

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]>
---

 arch/x86_64/kernel/acpi/wakeup.S |   36 ++--
 include/asm-x86_64/suspend.h |   12 ++--
 2 files changed, 24 insertions(+), 24 deletions(-)

diff -puN 
arch/x86_64/kernel/acpi/wakeup.S~x86_64-wakeup.S-rename-registers-to-reflect-right-names
 arch/x86_64/kernel/acpi/wakeup.S
--- 
linux-2.6.19-rc6-reloc/arch/x86_64/kernel/acpi/wakeup.S~x86_64-wakeup.S-rename-registers-to-reflect-right-names
 2006-11-17 00:09:29.0 -0500
+++ linux-2.6.19-rc6-reloc-root/arch/x86_64/kernel/acpi/wakeup.S
2006-11-17 00:09:29.0 -0500
@@ -211,16 +211,16 @@ wakeup_long64:
        movw    %ax, %es
        movw    %ax, %fs
        movw    %ax, %gs
-       movq    saved_esp, %rsp
+       movq    saved_rsp, %rsp
 
        movw    $0x0e00 + 'x', %ds:(0xb8018)
-       movq    saved_ebx, %rbx
-       movq    saved_edi, %rdi
-       movq    saved_esi, %rsi
-       movq    saved_ebp, %rbp
+       movq    saved_rbx, %rbx
+       movq    saved_rdi, %rdi
+       movq    saved_rsi, %rsi
+       movq    saved_rbp, %rbp
 
        movw    $0x0e00 + '!', %ds:(0xb801a)
-       movq    saved_eip, %rax
+       movq    saved_rip, %rax
        jmp     *%rax
 
 .code32
@@ -408,13 +408,13 @@ do_suspend_lowlevel:
        movq %r15, saved_context_r15(%rip)
        pushfq ; popq saved_context_eflags(%rip)
 
-       movq    $.L97, saved_eip(%rip)
+       movq    $.L97, saved_rip(%rip)
 
-       movq %rsp,saved_esp
-       movq %rbp,saved_ebp
-       movq %rbx,saved_ebx
-       movq %rdi,saved_edi
-       movq %rsi,saved_esi
+       movq %rsp,saved_rsp
+       movq %rbp,saved_rbp
+       movq %rbx,saved_rbx
+       movq %rdi,saved_rdi
+       movq %rsi,saved_rsi
 
        addq    $8, %rsp
        movl    $3, %edi
@@ -461,12 +461,12 @@ do_suspend_lowlevel:

 .data
 ALIGN
-ENTRY(saved_ebp)   .quad   0
-ENTRY(saved_esi)   .quad   0
-ENTRY(saved_edi)   .quad   0
-ENTRY(saved_ebx)   .quad   0
+ENTRY(saved_rbp)   .quad   0
+ENTRY(saved_rsi)   .quad   0
+ENTRY(saved_rdi)   .quad   0
+ENTRY(saved_rbx)   .quad   0
 
-ENTRY(saved_eip)   .quad   0
-ENTRY(saved_esp)   .quad   0
+ENTRY(saved_rip)   .quad   0
+ENTRY(saved_rsp)   .quad   0
 
 ENTRY(saved_magic) .quad   0
diff -puN 
include/asm-x86_64/suspend.h~x86_64-wakeup.S-rename-registers-to-reflect-right-names
 include/asm-x86_64/suspend.h
--- 
linux-2.6.19-rc6-reloc/include/asm-x86_64/suspend.h~x86_64-wakeup.S-rename-registers-to-reflect-right-names
 2006-11-17 00:09:29.0 -0500
+++ linux-2.6.19-rc6-reloc-root/include/asm-x86_64/suspend.h2006-11-17 
00:09:29.0 -0500
@@ -45,12 +45,12 @@ extern unsigned long saved_context_eflag
 extern void fix_processor_context(void);
 
 #ifdef CONFIG_ACPI_SLEEP
-extern unsigned long saved_eip;
-extern unsigned long saved_esp;
-extern unsigned long saved_ebp;
-extern unsigned long saved_ebx;
-extern unsigned long saved_esi;
-extern unsigned long saved_edi;
+extern unsigned long saved_rip;
+extern unsigned long saved_rsp;
+extern unsigned long saved_rbp;
+extern unsigned long saved_rbx;
+extern unsigned long saved_rsi;
+extern unsigned long saved_rdi;
 
 /* routines for saving/restoring kernel state */
 extern int acpi_save_state_mem(void);
_


Re: [PATCH 9/20] x86_64: 64bit PIC SMP trampoline

2006-11-17 Thread Pavel Machek
Hi!

> that long mode is supported.  Asking if long mode is implemented is
> downright silly but we have traditionally had some of these checks,
> and they can't hurt anything.  So when the totally ludicrous happens
> we just might handle it correctly.

Well, it is silly, and it is 50 lines of dense assembly. Can we get
rid of it or get it shared with the bootup version?

The REQUIRED_MASK1/2 look like something that could get out of sync,
for example.

The rest of patch looks okay.

(The traditional checks were unneeded, so it is okay to drop them...)

Pavel

> + .code16
> +verify_cpu:
> + pushl   $0                      # Kill any dangerous flags
> + popfl
> +
> + /* minimum CPUID flags for x86-64 */
> + /* see http://www.x86-64.org/lists/discuss/msg02971.html */
> +#define REQUIRED_MASK1 ((1<<0)|(1<<3)|(1<<4)|(1<<5)|(1<<6)|(1<<8)|\
> +                        (1<<13)|(1<<15)|(1<<24)|(1<<25)|(1<<26))
> +#define REQUIRED_MASK2 (1<<29)
> +
> + pushfl                          # check for cpuid
> + popl    %eax
> + movl    %eax, %ebx
> + xorl    $0x200000,%eax
> + pushl   %eax
> + popfl
> + pushfl
> + popl    %eax
> + pushl   %ebx
> + popfl
> + cmpl    %eax, %ebx
> + jz      no_longmode
> +
> + xorl    %eax, %eax              # See if cpuid 1 is implemented
> + cpuid
> + cmpl    $0x1, %eax
> + jb      no_longmode
> +
> + movl    $0x01, %eax             # Does the cpu have what it takes?
> + cpuid
> + andl    $REQUIRED_MASK1, %edx
> + xorl    $REQUIRED_MASK1, %edx
> + jnz     no_longmode
> +
> + movl    $0x80000000, %eax       # See if extended cpuid is implemented
> + cpuid
> + cmpl    $0x80000001, %eax
> + jb      no_longmode
> +
> + movl    $0x80000001, %eax       # Does the cpu have what it takes?
> + cpuid
> + andl    $REQUIRED_MASK2, %edx
> + xorl    $REQUIRED_MASK2, %edx
> + jnz     no_longmode
> +
> + ret                             # The cpu supports long mode
> +
> +no_longmode:
> + hlt
> + jmp     no_longmode
> +

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


[PATCH 5/20] x86_64: Fix early printk to use standard ISA mapping

2006-11-17 Thread Vivek Goyal


Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]>
---

 arch/x86_64/kernel/early_printk.c |3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff -puN 
arch/x86_64/kernel/early_printk.c~x86_64-fix-early_printk-to-use-the-standard-ISA-mapping
 arch/x86_64/kernel/early_printk.c
--- 
linux-2.6.19-rc6-reloc/arch/x86_64/kernel/early_printk.c~x86_64-fix-early_printk-to-use-the-standard-ISA-mapping
2006-11-17 00:06:43.0 -0500
+++ linux-2.6.19-rc6-reloc-root/arch/x86_64/kernel/early_printk.c   
2006-11-17 00:06:43.0 -0500
@@ -11,11 +11,10 @@
 
 #ifdef __i386__
 #include 
-#define VGABASE (__ISA_IO_base + 0xb8000)
 #else
 #include 
-#define VGABASE ((void __iomem *)0x800b8000UL)
 #endif
+#define VGABASE (__ISA_IO_base + 0xb8000)
 
 static int max_ypos = 25, max_xpos = 80;
 static int current_ypos = 25, current_xpos = 0;
_


Re: [PATCH 13/20] x86_64: 64bit PIC ACPI wakeup trampoline

2006-11-17 Thread Pavel Machek
On Fri 2006-11-17 17:51:03, Vivek Goyal wrote:
> 
> 
> o Moved wakeup_level4_pgt into the wakeup routine so we can
>   run the kernel above 4G.
> 
> o Now we first go to 64bit mode and continue to run from trampoline and
>   then start accessing kernel symbols and restore processor context.
>   This enables us to resume even in relocatable kernel context when 
>   kernel might not be loaded at physical addr it has been compiled for.
> 
> o Removed the need for modifying any existing kernel page table.
> 
> o Increased the size of the wakeup routine to 8K. This is required as
>   wake page tables are on trampoline itself and they got to be at 4K
>   boundary, hence one page is not sufficient.
> 
> Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
> Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]>

Looks okay to me, ACK.
Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: [PATCH 8/20] x86_64: Add EFER to the set registers saved by save_processor_state

2006-11-17 Thread Pavel Machek
Hi!

> EFER varies like %cr4 depending on the cpu capabilities, and which cpu
> capabilities we want to make use of.  So save/restore it to make certain
> we have the same EFER value when we are done.

I still think that comment is right: EFER is function(cpu
capabilities, kernel version, kernel cmdline); and that _should_ be
constant across suspend.

Anyway saving it does not hurt and code is probably easier to
understand.

ACK.
Pavel


>   /* XMM0..XMM15 should be handled by kernel_fpu_begin(). */
> - /* EFER should be constant for kernel version, no need to handle it. */
>   /*
>* segment registers
>*/
> @@ -50,6 +49,7 @@ void __save_processor_state(struct saved
>   /*
>* control registers 
>*/
> + rdmsrl(MSR_EFER, ctxt->efer);
>   asm volatile ("movq %%cr0, %0" : "=r" (ctxt->cr0));
>   asm volatile ("movq %%cr2, %0" : "=r" (ctxt->cr2));
>   asm volatile ("movq %%cr3, %0" : "=r" (ctxt->cr3));
> @@ -75,6 +75,7 @@ void __restore_processor_state(struct sa
>   /*
>* control registers
>*/
> + wrmsrl(MSR_EFER, ctxt->efer);
>   asm volatile ("movq %0, %%cr8" :: "r" (ctxt->cr8));
>   asm volatile ("movq %0, %%cr4" :: "r" (ctxt->cr4));
>   asm volatile ("movq %0, %%cr3" :: "r" (ctxt->cr3));
> --- 
> linux-2.6.19-rc6-reloc/include/asm-x86_64/suspend.h~x86_64-Add-EFER-to-the-set-registers-saved-by-save_processor_state
> 2006-11-17 00:08:16.0 -0500
> @@ -17,6 +17,7 @@ struct saved_context {
>   u16 ds, es, fs, gs, ss;
>   unsigned long gs_base, gs_kernel_base, fs_base;
>   unsigned long cr0, cr2, cr3, cr4, cr8;
> + unsigned long efer;
>   u16 gdt_pad;
>   u16 gdt_limit;
>   unsigned long gdt_base;
> _

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


[PATCH 9/20] x86_64: 64bit PIC SMP trampoline

2006-11-17 Thread Vivek Goyal


This modifies the SMP trampoline and all of the associated code so
it can jump to a 64bit kernel loaded at an arbitrary address.

The dependencies on having an identity mapped page in the kernel
page tables for SMP bootup have all been removed.

In addition the trampoline has been modified to verify
that long mode is supported.  Asking if long mode is implemented is
downright silly but we have traditionally had some of these checks,
and they can't hurt anything.  So when the totally ludicrous happens
we just might handle it correctly.
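
The feature test itself is small.  A C sketch of the same decision
follows; it takes the CPUID results as plain arguments so it can be
exercised without executing cpuid.  The masks are the ones from this
patch (leaf 1 EDX: FPU, PSE, TSC, MSR, PAE, CX8, PGE, CMOV, FXSR, SSE,
SSE2; leaf 0x80000001 EDX bit 29: long mode).  The function names
has_all and cpu_ok_for_long_mode are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define REQUIRED_MASK1 ((1u<<0)|(1u<<3)|(1u<<4)|(1u<<5)|(1u<<6)|(1u<<8)| \
			(1u<<13)|(1u<<15)|(1u<<24)|(1u<<25)|(1u<<26))
#define REQUIRED_MASK2 (1u<<29)

/* Same test as the andl/xorl/jnz sequence: every required bit is set. */
static bool has_all(uint32_t features, uint32_t required)
{
	assert(required != 0);  /* an empty mask would pass trivially */
	return ((features & required) ^ required) == 0;
}

/*
 * max_std_leaf: EAX from cpuid leaf 0
 * std_edx:      EDX from cpuid leaf 1
 * max_ext_leaf: EAX from cpuid leaf 0x80000000
 * ext_edx:      EDX from cpuid leaf 0x80000001
 */
static bool cpu_ok_for_long_mode(uint32_t max_std_leaf, uint32_t std_edx,
				 uint32_t max_ext_leaf, uint32_t ext_edx)
{
	if (max_std_leaf < 1)
		return false;
	if (!has_all(std_edx, REQUIRED_MASK1))
		return false;
	if (max_ext_leaf < 0x80000001u)
		return false;
	return has_all(ext_edx, REQUIRED_MASK2);
}
```

Keeping the masks next to a single function like this is one way to
avoid several copies drifting apart.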

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]>
---

 arch/x86_64/kernel/head.S   |1 
 arch/x86_64/kernel/setup.c  |9 --
 arch/x86_64/kernel/trampoline.S |  168 
 3 files changed, 156 insertions(+), 22 deletions(-)

diff -puN arch/x86_64/kernel/head.S~x86_64-64bit-PIC-SMP-trampoline 
arch/x86_64/kernel/head.S
--- 
linux-2.6.19-rc6-reloc/arch/x86_64/kernel/head.S~x86_64-64bit-PIC-SMP-trampoline
2006-11-17 00:08:38.0 -0500
+++ linux-2.6.19-rc6-reloc-root/arch/x86_64/kernel/head.S   2006-11-17 
00:08:38.0 -0500
@@ -101,6 +101,7 @@ startup_32:
.org 0x100  
.globl startup_64
 startup_64:
+ENTRY(secondary_startup_64)
/* We come here either from startup_32
 * or directly from a 64bit bootloader.
 * Since we may have come directly from a bootloader we
diff -puN arch/x86_64/kernel/setup.c~x86_64-64bit-PIC-SMP-trampoline 
arch/x86_64/kernel/setup.c
--- 
linux-2.6.19-rc6-reloc/arch/x86_64/kernel/setup.c~x86_64-64bit-PIC-SMP-trampoline
   2006-11-17 00:08:38.0 -0500
+++ linux-2.6.19-rc6-reloc-root/arch/x86_64/kernel/setup.c  2006-11-17 
00:08:38.0 -0500
@@ -446,15 +446,8 @@ void __init setup_arch(char **cmdline_p)
reserve_bootmem_generic(ebda_addr, ebda_size);
 
 #ifdef CONFIG_SMP
-   /*
-* But first pinch a few for the stack/trampoline stuff
-* FIXME: Don't need the extra page at 4K, but need to fix
-* trampoline before removing it. (see the GDT stuff)
-*/
-   reserve_bootmem_generic(PAGE_SIZE, PAGE_SIZE);
-
/* Reserve SMP trampoline */
-   reserve_bootmem_generic(SMP_TRAMPOLINE_BASE, PAGE_SIZE);
+   reserve_bootmem_generic(SMP_TRAMPOLINE_BASE, 2*PAGE_SIZE);
 #endif
 
 #ifdef CONFIG_ACPI_SLEEP
diff -puN arch/x86_64/kernel/trampoline.S~x86_64-64bit-PIC-SMP-trampoline 
arch/x86_64/kernel/trampoline.S
--- 
linux-2.6.19-rc6-reloc/arch/x86_64/kernel/trampoline.S~x86_64-64bit-PIC-SMP-trampoline
  2006-11-17 00:08:38.0 -0500
+++ linux-2.6.19-rc6-reloc-root/arch/x86_64/kernel/trampoline.S 2006-11-17 
00:08:38.0 -0500
@@ -3,6 +3,7 @@
  * Trampoline.SDerived from Setup.S by Linus Torvalds
  *
  * 4 Jan 1997 Michael Chastain: changed to gnu as.
+ * 15 Sept 2005 Eric Biederman: 64bit PIC support
  *
  * Entry: CS:IP point to the start of our code, we are 
  * in real mode with no stack, but the rest of the 
@@ -17,15 +18,20 @@
  * and IP is zero.  Thus, data addresses need to be absolute
  * (no relocation) and are taken with regard to r_base.
  *
+ * With the addition of trampoline_level4_pgt this code can
+ * now enter a 64bit kernel that lives at arbitrary 64bit
+ * physical addresses.
+ *
  * If you work on this file, check the object module with objdump
  * --full-contents --reloc to make sure there are no relocation
- * entries. For the GDT entry we do hand relocation in smpboot.c
- * because of 64bit linker limitations.
+ * entries.
  */
 
 #include 
-#include 
+#include 
 #include 
+#include 
+#include 
 
 .data
 
@@ -33,15 +39,31 @@
 
 ENTRY(trampoline_data)
 r_base = .
+       cli                     # We should be safe anyway
        wbinvd
        mov     %cs, %ax        # Code and data in the same place
        mov     %ax, %ds
+       mov     %ax, %es
+       mov     %ax, %ss
 
-       cli                     # We should be safe anyway
 
        movl    $0xA5A5A5A5, trampoline_data - r_base
                                # write marker for master knows we're running
 
+       # Setup stack
+       movw    $(trampoline_stack_end - r_base), %sp
+
+       call    verify_cpu      # Verify the cpu supports long mode
+
+       mov     %cs, %ax
+       movzx   %ax, %esi       # Find the 32bit trampoline location
+       shll    $4, %esi
+
+       # Fixup the vectors
+       addl    %esi, startup_32_vector - r_base
+       addl    %esi, startup_64_vector - r_base
+       addl    %esi, tgdt + 2 - r_base # Fixup the gdt pointer
+
+
/*
 * GDT tables in non default location kernel can be beyond 16MB and
 * lgdt will not be able to load the address as in real mode default
@@ -49,23 +71,141 @@ r_base = .
 * to 32 bit.
 */
 
-

[RFC][PATCH 0/20] x86_64: Relocatable bzImage (V3)

2006-11-17 Thread Vivek Goyal
Hi All,

Here is the third attempt at implementing relocatable bzImage for x86_64.

Following are the changes since V2.

- Broke suspend/resume code changes into smaller patches. Pavel, I hope
  it is now easier to review.

- Moved cpu long mode and SSE verification code into a single common
  file (arch/x86_64/kernel/verify_cpu.S). This file is now shared by all
  the entry points.

- Fixed a bug during resume operation on machines which support NX bit.

Your comments/suggestions are welcome.

Thanks
Vivek


[PATCH 2/20] x86_64: Assembly safe page.h and pgtable.h

2006-11-17 Thread Vivek Goyal


This patch makes pgtable.h and page.h safe to include
in assembly files like head.S, allowing us to use
symbolic constants instead of hard coded numbers when
referring to the page tables.

This patch copies asm-sparc64/const.h to asm-x86_64 to
get a definition of _AC(), a very convenient macro that
allows us to force the type when we are compiling the
code in C and to drop all of the type information when
we are using the constant in assembly.  Previously this
was done with multiple definitions of the same constant.
const.h was modified slightly so that it works when given
CONFIG options as arguments.

This patch adds #ifndef __ASSEMBLY__ ... #endif
and _AC(1,UL) where appropriate so the assembler won't
choke on the header files.  Otherwise nothing
should have changed.
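
As an illustration of how the macro behaves on the C side, here is a
small stand-alone sketch.  The macro is renamed EX_AC (and the other
names prefixed EX_ or EXAMPLE_) so the example does not use identifiers
reserved to the implementation; EXAMPLE_PHYSICAL_START is a made-up
stand-in for a CONFIG option.  The two-level definition is presumably
what lets a macro argument be expanded before the suffix is pasted on.

```c
#include <assert.h>

/* Same shape as the const.h added by this patch, under another name. */
#ifdef __ASSEMBLY__
#define EX_AC(X,Y)        X
#else
#define EX_AC_PASTE(X,Y)  (X##Y)
#define EX_AC(X,Y)        EX_AC_PASTE(X,Y)
#endif

/* Hypothetical stand-in for a CONFIG_ option. */
#define EXAMPLE_PHYSICAL_START 0x200000

#define EX_PAGE_SHIFT      12
#define EX_PAGE_SIZE       (EX_AC(1,UL) << EX_PAGE_SHIFT)
#define EX_PHYSICAL_START  EX_AC(EXAMPLE_PHYSICAL_START,UL)

static unsigned long ex_page_size(void)
{
	return EX_PAGE_SIZE;            /* expands to ((1UL) << 12) */
}

static unsigned long ex_physical_start(void)
{
	/* EXAMPLE_PHYSICAL_START is expanded first, then UL is pasted on. */
	return EX_PHYSICAL_START;       /* expands to (0x200000UL) */
}

static void ex_self_check(void)
{
	/* In C the constant carries the unsigned long type. */
	assert(sizeof(EX_PAGE_SIZE) == sizeof(unsigned long));
}
```

With __ASSEMBLY__ defined, EX_AC(1,UL) is just 1, so the same header
line can be fed to the assembler.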

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]>
---

 include/asm-x86_64/const.h   |   20 
 include/asm-x86_64/page.h|   34 +-
 include/asm-x86_64/pgtable.h |   33 +
 3 files changed, 54 insertions(+), 33 deletions(-)

diff -puN /dev/null include/asm-x86_64/const.h
--- /dev/null   2006-11-17 00:03:10.168280803 -0500
+++ linux-2.6.19-rc6-reloc-root/include/asm-x86_64/const.h  2006-11-17 
00:05:30.0 -0500
@@ -0,0 +1,20 @@
+/* const.h: Macros for dealing with constants.  */
+
+#ifndef _X86_64_CONST_H
+#define _X86_64_CONST_H
+
+/* Some constant macros are used in both assembler and
+ * C code.  Therefore we cannot annotate them always with
+ * 'UL' and other type specifiers unilaterally.  We
+ * use the following macros to deal with this.
+ */
+
+#ifdef __ASSEMBLY__
+#define _AC(X,Y)   X
+#else
+#define __AC(X,Y)  (X##Y)
+#define _AC(X,Y)   __AC(X,Y)
+#endif
+
+
+#endif /* !(_X86_64_CONST_H) */
diff -puN include/asm-x86_64/page.h~x86_64-Assembly-safe-page.h-and-pgtable.h 
include/asm-x86_64/page.h
--- 
linux-2.6.19-rc6-reloc/include/asm-x86_64/page.h~x86_64-Assembly-safe-page.h-and-pgtable.h
  2006-11-17 00:05:30.0 -0500
+++ linux-2.6.19-rc6-reloc-root/include/asm-x86_64/page.h   2006-11-17 
00:05:30.0 -0500
@@ -1,14 +1,11 @@
 #ifndef _X86_64_PAGE_H
 #define _X86_64_PAGE_H
 
+#include 
 
 /* PAGE_SHIFT determines the page size */
 #define PAGE_SHIFT 12
-#ifdef __ASSEMBLY__
-#define PAGE_SIZE  (0x1 << PAGE_SHIFT)
-#else
-#define PAGE_SIZE  (1UL << PAGE_SHIFT)
-#endif
+#define PAGE_SIZE  (_AC(1,UL) << PAGE_SHIFT)
 #define PAGE_MASK  (~(PAGE_SIZE-1))
 #define PHYSICAL_PAGE_MASK (~(PAGE_SIZE-1) & __PHYSICAL_MASK)
 
@@ -33,10 +30,10 @@
 #define N_EXCEPTION_STACKS 5  /* hw limit: 7 */
 
 #define LARGE_PAGE_MASK (~(LARGE_PAGE_SIZE-1))
-#define LARGE_PAGE_SIZE (1UL << PMD_SHIFT)
+#define LARGE_PAGE_SIZE (_AC(1,UL) << PMD_SHIFT)
 
 #define HPAGE_SHIFT PMD_SHIFT
-#define HPAGE_SIZE ((1UL) << HPAGE_SHIFT)
+#define HPAGE_SIZE (_AC(1,UL) << HPAGE_SHIFT)
 #define HPAGE_MASK (~(HPAGE_SIZE - 1))
 #define HUGETLB_PAGE_ORDER (HPAGE_SHIFT - PAGE_SHIFT)
 
@@ -76,29 +73,24 @@ typedef struct { unsigned long pgprot; }
 #define __pgd(x) ((pgd_t) { (x) } )
 #define __pgprot(x)    ((pgprot_t) { (x) } )
 
-#define __PHYSICAL_START   ((unsigned long)CONFIG_PHYSICAL_START)
-#define __START_KERNEL (__START_KERNEL_map + __PHYSICAL_START)
-#define __START_KERNEL_map 0x8000UL
-#define __PAGE_OFFSET   0x8100UL
+#endif /* !__ASSEMBLY__ */
 
-#else
-#define __PHYSICAL_START   CONFIG_PHYSICAL_START
+#define __PHYSICAL_START   _AC(CONFIG_PHYSICAL_START,UL)
 #define __START_KERNEL (__START_KERNEL_map + __PHYSICAL_START)
-#define __START_KERNEL_map 0x8000
-#define __PAGE_OFFSET   0x8100
-#endif /* !__ASSEMBLY__ */
+#define __START_KERNEL_map _AC(0x8000,UL)
+#define __PAGE_OFFSET   _AC(0x8100,UL)
 
 /* to align the pointer to the (next) page boundary */
 #define PAGE_ALIGN(addr)   (((addr)+PAGE_SIZE-1)&PAGE_MASK)
 
 /* See Documentation/x86_64/mm.txt for a description of the memory map. */
 #define __PHYSICAL_MASK_SHIFT  46
-#define __PHYSICAL_MASK ((1UL << __PHYSICAL_MASK_SHIFT) - 1)
+#define __PHYSICAL_MASK ((_AC(1,UL) << __PHYSICAL_MASK_SHIFT) - 1)
 #define __VIRTUAL_MASK_SHIFT   48
-#define __VIRTUAL_MASK ((1UL << __VIRTUAL_MASK_SHIFT) - 1)
+#define __VIRTUAL_MASK ((_AC(1,UL) << __VIRTUAL_MASK_SHIFT) - 1)
 
-#define KERNEL_TEXT_SIZE  (40UL*1024*1024)
-#define KERNEL_TEXT_START 0x8000UL 
+#define KERNEL_TEXT_SIZE  (_AC(40,UL)*1024*1024)
+#define KERNEL_TEXT_START _AC(0x8000,UL)
 
 #ifndef __ASSEMBLY__
 
@@ -106,7 +98,7 @@ typedef struct { unsigned long pgprot; }
 
 #endif /* __ASSEMBLY__ */
 
-#define PAGE_OFFSET ((unsigned long)__PAGE_OFFSET)
+#define PAGE_OFFSET __PAGE_OFFSET
 
 /* Note: __pa(_visible_to_c) should be always replaced 

[PATCH 13/20] x86_64: 64bit PIC ACPI wakeup trampoline

2006-11-17 Thread Vivek Goyal


o Moved wakeup_level4_pgt into the wakeup routine so we can
  run the kernel above 4G.

o Now we first go to 64bit mode and continue to run from the trampoline,
  and only then start accessing kernel symbols and restore processor context.
  This enables us to resume even in relocatable kernel context, when the
  kernel might not be loaded at the physical address it was compiled for.

o Removed the need for modifying any existing kernel page table.

o Increased the size of the wakeup routine to 8K. This is required because
  the wakeup page tables are on the trampoline itself and they have to be on
  a 4K boundary, hence one page is not sufficient.

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]>
---

 arch/x86_64/kernel/acpi/sleep.c  |   22 ++
 arch/x86_64/kernel/acpi/wakeup.S |   59 ---
 arch/x86_64/kernel/head.S|9 -
 3 files changed, 41 insertions(+), 49 deletions(-)

diff -puN arch/x86_64/kernel/acpi/sleep.c~x86_64-64bit-ACPI-wakeup-trampoline 
arch/x86_64/kernel/acpi/sleep.c
--- 
linux-2.6.19-rc6-reloc/arch/x86_64/kernel/acpi/sleep.c~x86_64-64bit-ACPI-wakeup-trampoline
  2006-11-17 00:10:48.0 -0500
+++ linux-2.6.19-rc6-reloc-root/arch/x86_64/kernel/acpi/sleep.c 2006-11-17 
00:10:48.0 -0500
@@ -60,17 +60,6 @@ extern char wakeup_start, wakeup_end;
 
 extern unsigned long FASTCALL(acpi_copy_wakeup_routine(unsigned long));
 
-static pgd_t low_ptr;
-
-static void init_low_mapping(void)
-{
-   pgd_t *slot0 = pgd_offset(current->mm, 0UL);
-   low_ptr = *slot0;
-   set_pgd(slot0, *pgd_offset(current->mm, PAGE_OFFSET));
-   WARN_ON(num_online_cpus() != 1);
-   local_flush_tlb();
-}
-
 /**
  * acpi_save_state_mem - save kernel state
  *
@@ -79,8 +68,6 @@ static void init_low_mapping(void)
  */
 int acpi_save_state_mem(void)
 {
-   init_low_mapping();
-
        memcpy((void *)acpi_wakeup_address, &wakeup_start,
               &wakeup_end - &wakeup_start);
acpi_copy_wakeup_routine(acpi_wakeup_address);
@@ -93,8 +80,6 @@ int acpi_save_state_mem(void)
  */
 void acpi_restore_state_mem(void)
 {
-   set_pgd(pgd_offset(current->mm, 0UL), low_ptr);
-   local_flush_tlb();
 }
 
 /**
@@ -107,10 +92,11 @@ void acpi_restore_state_mem(void)
  */
 void __init acpi_reserve_bootmem(void)
 {
-   acpi_wakeup_address = (unsigned long)alloc_bootmem_low(PAGE_SIZE);
-   if ((&wakeup_end - &wakeup_start) > PAGE_SIZE)
+   acpi_wakeup_address = (unsigned long)alloc_bootmem_low(PAGE_SIZE*2);
+   if ((&wakeup_end - &wakeup_start) > (PAGE_SIZE*2))
        printk(KERN_CRIT
-              "ACPI: Wakeup code way too big, will crash on attempt to suspend\n");
+              "ACPI: Wakeup code way too big, will crash on attempt"
+              " to suspend\n");
 }
 
 static int __init acpi_sleep_setup(char *str)
diff -puN arch/x86_64/kernel/acpi/wakeup.S~x86_64-64bit-ACPI-wakeup-trampoline 
arch/x86_64/kernel/acpi/wakeup.S
--- 
linux-2.6.19-rc6-reloc/arch/x86_64/kernel/acpi/wakeup.S~x86_64-64bit-ACPI-wakeup-trampoline
 2006-11-17 00:10:48.0 -0500
+++ linux-2.6.19-rc6-reloc-root/arch/x86_64/kernel/acpi/wakeup.S
2006-11-17 00:10:48.0 -0500
@@ -1,6 +1,7 @@
 .text
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -62,12 +63,15 @@ wakeup_code:
 
        movb    $0xa2, %al      ;  outb %al, $0x80
 
-       lidt    %ds:idt_48a - wakeup_code
-       xorl    %eax, %eax
-       movw    %ds, %ax        # (Convert %ds:gdt to a linear ptr)
-       shll    $4, %eax
-       addl    $(gdta - wakeup_code), %eax
-       movl    %eax, gdt_48a +2 - wakeup_code
+       mov     %ds, %ax        # Find 32bit wakeup_code addr
+       movzx   %ax, %esi       # (Convert %ds:gdt to a linear ptr)
+       shll    $4, %esi
+       # Fix up the vectors
+       addl    %esi, wakeup_32_vector - wakeup_code
+       addl    %esi, wakeup_long64_vector - wakeup_code
+       addl    %esi, gdt_48a + 2 - wakeup_code # Fixup the gdt pointer
+
+       lidtl   %ds:idt_48a - wakeup_code
        lgdtl   %ds:gdt_48a - wakeup_code   # load gdt with whatever is
                                            # appropriate
 
@@ -80,7 +84,7 @@ wakeup_code:
 
.balign 4
 wakeup_32_vector:
-   .long   wakeup_32 - __START_KERNEL_map
+   .long   wakeup_32 - wakeup_code
.word   __KERNEL32_CS, 0
 
.code32
@@ -103,10 +107,6 @@ wakeup_32:
movl$__KERNEL_DS, %eax
movl%eax, %ds
 
-   movlsaved_magic - __START_KERNEL_map, %eax
-   cmpl$0x9abcdef0, %eax
-   jne bogus_32_magic
-
movw$0x0e00 + 'i', %ds:(0xb8012)
movb$0xa8, %al  ;  outb %al, $0x80;
 
@@ -120,7 +120,7 @@ wakeup_32:
movl%eax, %cr4
 
/* Setup early boot stage 4 level pagetables */
-   movl$(wakeup_level4_pgt - __START_KERNEL_map), %eax
+

[PATCH 4/20] x86_64: Cleanup the early boot page table

2006-11-17 Thread Vivek Goyal


- Merge physmem_pgt and ident_pgt, removing physmem_pgt.  The merge
  is broken as soon as mm/init.c:init_memory_mapping is run.
- As physmem_pgt is gone don't export it in pgtable.h.
- Use defines from pgtable.h for page permissions.
- Fix the physical memory identity mapping so it is at the correct
  address.
- Remove the physical memory mapping from wakeup_level4_pgt; it
  is at the wrong address so we can't possibly be using it.
- Simplify NEXT_PAGE.  The work to calculate the phys_ alias
  of the labels was very cool.  Unfortunately it was a brittle
  special purpose hack that makes maintenance more difficult.
  Instead just use label - __START_KERNEL_map like we do
  everywhere else in assembly.
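
For readers who want to check the page table arithmetic, here is a
user-space C sketch of what the PMDS() macro in this patch generates
and of the table indices quoted in the head.S comments.  It assumes the
usual x86_64 layout (9 index bits per level, 2MB large pages) and a
kernel mapping base of 0xffffffff80000000, which is the top 2GB of the
address space; the 0x083 permission value is the literal used by the
code being removed.  Function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EX_PMD_SHIFT     21     /* 2MB large pages */
#define EX_PUD_SHIFT     30
#define EX_PGDIR_SHIFT   39
#define EX_PTRS_PER_PMD  512

/* Assumed kernel mapping base: the top 2GB of the address space. */
#define EX_START_KERNEL_MAP UINT64_C(0xffffffff80000000)

/* C equivalent of PMDS(START, PERM, COUNT): one entry per 2MB page. */
static void fill_pmds(uint64_t *table, uint64_t start, uint64_t perm,
		      size_t count)
{
	assert(table != NULL && count <= EX_PTRS_PER_PMD);
	for (size_t i = 0; i < count; i++)
		table[i] = start + ((uint64_t)i << EX_PMD_SHIFT) + perm;
}

/* Index into the top-level (level 4) table for a virtual address. */
static unsigned int pgd_index_of(uint64_t va)
{
	return (unsigned int)((va >> EX_PGDIR_SHIFT) & 511);
}

/* Index into the level 3 table for a virtual address. */
static unsigned int pud_index_of(uint64_t va)
{
	return (unsigned int)((va >> EX_PUD_SHIFT) & 511);
}
```

Filling all 512 entries from physical address 0 covers exactly 1G,
which matches the "map the first 1G" comment in level2_ident_pgt, and
the kernel mapping base lands in slot 511 of the level 4 table and slot
510 of level3_kernel_pgt, as the comments in the patch compute.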

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]>
---

 arch/x86_64/kernel/head.S|   61 +++
 include/asm-x86_64/pgtable.h |1 
 2 files changed, 28 insertions(+), 34 deletions(-)

diff -puN arch/x86_64/kernel/head.S~x86_64-Cleanup-the-early-boot-page-table 
arch/x86_64/kernel/head.S
--- 
linux-2.6.19-rc6-reloc/arch/x86_64/kernel/head.S~x86_64-Cleanup-the-early-boot-page-table
   2006-11-17 00:06:20.0 -0500
+++ linux-2.6.19-rc6-reloc-root/arch/x86_64/kernel/head.S   2006-11-17 
00:06:20.0 -0500
@@ -13,6 +13,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -252,52 +253,48 @@ ljumpvector:
 ENTRY(stext)
 ENTRY(_stext)
 
-   $page = 0
 #define NEXT_PAGE(name) \
-   $page = $page + 1; \
-   .org $page * 0x1000; \
-   phys_/**/name = $page * 0x1000 + __PHYSICAL_START; \
+   .balign PAGE_SIZE; \
 ENTRY(name)
 
+/* Automate the creation of 1 to 1 mapping pmd entries */
+#define PMDS(START, PERM, COUNT)   \
+   i = 0 ; \
+   .rept (COUNT) ; \
+   .quad   (START) + (i << 21) + (PERM) ;  \
+   i = i + 1 ; \
+   .endr
+
 NEXT_PAGE(init_level4_pgt)
/* This gets initialized in x86_64_start_kernel */
.fill   512,8,0
 
 NEXT_PAGE(level3_ident_pgt)
-   .quad   phys_level2_ident_pgt | 0x007
+   .quad   level2_ident_pgt - __START_KERNEL_map + _KERNPG_TABLE
.fill   511,8,0
 
 NEXT_PAGE(level3_kernel_pgt)
.fill   510,8,0
/* (2^48-(2*1024*1024*1024)-((2^39)*511))/(2^30) = 510 */
-   .quad   phys_level2_kernel_pgt | 0x007
+   .quad   level2_kernel_pgt - __START_KERNEL_map + _KERNPG_TABLE
.fill   1,8,0
 
 NEXT_PAGE(level2_ident_pgt)
-   /* 40MB for bootup. */
-   i = 0
-   .rept 20
-   .quad   i << 21 | 0x083
-   i = i + 1
-   .endr
-   .fill   492,8,0
+   /* Since I easily can, map the first 1G.
+* Don't set NX because code runs from these pages.
+*/
+   PMDS(0x, __PAGE_KERNEL_LARGE_EXEC, PTRS_PER_PMD)

 NEXT_PAGE(level2_kernel_pgt)
/* 40MB kernel mapping. The kernel code cannot be bigger than that.
   When you change this change KERNEL_TEXT_SIZE in page.h too. */
/* (2^48-(2*1024*1024*1024)-((2^39)*511)-((2^30)*510)) = 0 */
-   i = 0
-   .rept 20
-   .quad   i << 21 | 0x183
-   i = i + 1
-   .endr
+   PMDS(0x, __PAGE_KERNEL_LARGE_EXEC|_PAGE_GLOBAL,
+   KERNEL_TEXT_SIZE/PMD_SIZE)
/* Module mapping starts here */
-   .fill   492,8,0
-
-NEXT_PAGE(level3_physmem_pgt)
-   .quad   phys_level2_kernel_pgt | 0x007  /* so that __va works even before pagetable_init */
-   .fill   511,8,0
+   .fill   (PTRS_PER_PMD - (KERNEL_TEXT_SIZE/PMD_SIZE)),8,0
 
+#undef PMDS
 #undef NEXT_PAGE
 
.data
@@ -305,12 +302,10 @@ NEXT_PAGE(level3_physmem_pgt)
 #ifdef CONFIG_ACPI_SLEEP
.align PAGE_SIZE
 ENTRY(wakeup_level4_pgt)
-   .quad   phys_level3_ident_pgt | 0x007
-   .fill   255,8,0
-   .quad   phys_level3_physmem_pgt | 0x007
-   .fill   254,8,0
+   .quad   level3_ident_pgt - __START_KERNEL_map + _KERNPG_TABLE
+   .fill   510,8,0
/* (2^48-(2*1024*1024*1024))/(2^39) = 511 */
-   .quad   phys_level3_kernel_pgt | 0x007
+   .quad   level3_kernel_pgt - __START_KERNEL_map + _KERNPG_TABLE
 #endif
 
 #ifndef CONFIG_HOTPLUG_CPU
@@ -324,12 +319,12 @@ ENTRY(wakeup_level4_pgt)
 */
.align PAGE_SIZE
 ENTRY(boot_level4_pgt)
-   .quad   phys_level3_ident_pgt | 0x007
-   .fill   255,8,0
-   .quad   phys_level3_physmem_pgt | 0x007
-   .fill   254,8,0
+   .quad   level3_ident_pgt - __START_KERNEL_map + _KERNPG_TABLE
+   .fill   257,8,0
+   .quad   level3_ident_pgt - __START_KERNEL_map + _KERNPG_TABLE
+   .fill   252,8,0
/* (2^48-(2*1024*1024*1024))/(2^39) = 511 */
-   .quad   phys_level3_kernel_pgt | 0x007
+   .quad   level3_kernel_pgt - __START_KERNEL_map + _PAGE_TABLE
 
.data
 
diff -puN 

Re: [PATCH 11/20] x86_64: wakeup.S Rename labels to reflect right register names

2006-11-17 Thread Pavel Machek
On Fri 2006-11-17 17:48:22, Vivek Goyal wrote:
> 
> 
> o Use appropriate names for 64-bit registers.
> 
> Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
> Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]>

ACK.

> --- 
> linux-2.6.19-rc6-reloc/arch/x86_64/kernel/acpi/wakeup.S~x86_64-wakeup.S-rename-registers-to-reflect-right-names
>2006-11-17 00:09:29.0 -0500
> +++ linux-2.6.19-rc6-reloc-root/arch/x86_64/kernel/acpi/wakeup.S  
> 2006-11-17 00:09:29.0 -0500
> @@ -211,16 +211,16 @@ wakeup_long64:
>   movw%ax, %es
>   movw%ax, %fs
>   movw%ax, %gs
> - movqsaved_esp, %rsp
> + movqsaved_rsp, %rsp
>  
>   movw$0x0e00 + 'x', %ds:(0xb8018)
> - movqsaved_ebx, %rbx
> - movqsaved_edi, %rdi
> - movqsaved_esi, %rsi
> - movqsaved_ebp, %rbp
> + movqsaved_rbx, %rbx
> + movqsaved_rdi, %rdi
> + movqsaved_rsi, %rsi
> + movqsaved_rbp, %rbp
>  
>   movw$0x0e00 + '!', %ds:(0xb801a)
> - movqsaved_eip, %rax
> + movqsaved_rip, %rax
>   jmp *%rax
>  
>  .code32
> @@ -408,13 +408,13 @@ do_suspend_lowlevel:
>   movq %r15, saved_context_r15(%rip)
>   pushfq ; popq saved_context_eflags(%rip)
>  
> - movq$.L97, saved_eip(%rip)
> + movq$.L97, saved_rip(%rip)
>  
> - movq %rsp,saved_esp
> - movq %rbp,saved_ebp
> - movq %rbx,saved_ebx
> - movq %rdi,saved_edi
> - movq %rsi,saved_esi
> + movq %rsp,saved_rsp
> + movq %rbp,saved_rbp
> + movq %rbx,saved_rbx
> + movq %rdi,saved_rdi
> + movq %rsi,saved_rsi
>  
>   addq$8, %rsp
>   movl$3, %edi
> @@ -461,12 +461,12 @@ do_suspend_lowlevel:
>   
>  .data
>  ALIGN
> -ENTRY(saved_ebp) .quad   0
> -ENTRY(saved_esi) .quad   0
> -ENTRY(saved_edi) .quad   0
> -ENTRY(saved_ebx) .quad   0
> +ENTRY(saved_rbp) .quad   0
> +ENTRY(saved_rsi) .quad   0
> +ENTRY(saved_rdi) .quad   0
> +ENTRY(saved_rbx) .quad   0
>  
> -ENTRY(saved_eip) .quad   0
> -ENTRY(saved_esp) .quad   0
> +ENTRY(saved_rip) .quad   0
> +ENTRY(saved_rsp) .quad   0
>  
>  ENTRY(saved_magic)   .quad   0
> diff -puN 
> include/asm-x86_64/suspend.h~x86_64-wakeup.S-rename-registers-to-reflect-right-names
>  include/asm-x86_64/suspend.h
> --- 
> linux-2.6.19-rc6-reloc/include/asm-x86_64/suspend.h~x86_64-wakeup.S-rename-registers-to-reflect-right-names
>2006-11-17 00:09:29.0 -0500
> +++ linux-2.6.19-rc6-reloc-root/include/asm-x86_64/suspend.h  2006-11-17 
> 00:09:29.0 -0500
> @@ -45,12 +45,12 @@ extern unsigned long saved_context_eflag
>  extern void fix_processor_context(void);
>  
>  #ifdef CONFIG_ACPI_SLEEP
> -extern unsigned long saved_eip;
> -extern unsigned long saved_esp;
> -extern unsigned long saved_ebp;
> -extern unsigned long saved_ebx;
> -extern unsigned long saved_esi;
> -extern unsigned long saved_edi;
> +extern unsigned long saved_rip;
> +extern unsigned long saved_rsp;
> +extern unsigned long saved_rbp;
> +extern unsigned long saved_rbx;
> +extern unsigned long saved_rsi;
> +extern unsigned long saved_rdi;
>  
>  /* routines for saving/restoring kernel state */
>  extern int acpi_save_state_mem(void);
> _

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 10/20] x86_64: wakeup.S Remove dead code

2006-11-17 Thread Pavel Machek
On Fri 2006-11-17 17:47:02, Vivek Goyal wrote:
> 
> 
> o Get rid of dead code in wakeup.S
> 
> o We never restore from saved_gdt, saved_idt, saved_ldt, saved_tss, saved_cr3,
>   saved_cr4, saved_cr0, real_save_gdt, saved_efer, saved_efer2. Get rid
>   of the associated code.
> 
> o Get rid of bogus_magic, bogus_31_magic and bogus_magic2. They are no
>   longer used.
> 
> Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
> Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]>

ACK and thanks.

> diff -puN 
> arch/x86_64/kernel/acpi/wakeup.S~x86_64-get-rid-of-dead-code-in-suspend-resume
>  arch/x86_64/kernel/acpi/wakeup.S
> --- 
> linux-2.6.19-rc6-reloc/arch/x86_64/kernel/acpi/wakeup.S~x86_64-get-rid-of-dead-code-in-suspend-resume
>  2006-11-17 00:09:05.0 -0500
> +++ linux-2.6.19-rc6-reloc-root/arch/x86_64/kernel/acpi/wakeup.S  
> 2006-11-17 00:09:05.0 -0500
> @@ -258,8 +258,6 @@ gdt_48a:
>   .word   0, 0# gdt base (filled in later)
>   
>   
> -real_save_gdt:   .word 0
> - .quad 0
>  real_magic:  .quad 0
>  video_mode:  .quad 0
>  video_flags: .quad 0
> @@ -272,10 +270,6 @@ bogus_32_magic:
>   movb$0xb3,%al   ;  outb %al,$0x80
>   jmp bogus_32_magic
>  
> -bogus_31_magic:
> - movb$0xb1,%al   ;  outb %al,$0x80
> - jmp bogus_31_magic
> -
>  bogus_cpu:
>   movb$0xbc,%al   ;  outb %al,$0x80
>   jmp bogus_cpu
> @@ -346,16 +340,6 @@ check_vesaa:
>  
>  _setbada: jmp setbada
>  
> - .code64
> -bogus_magic:
> - movw$0x0e00 + 'B', %ds:(0xb8018)
> - jmp bogus_magic
> -
> -bogus_magic2:
> - movw$0x0e00 + '2', %ds:(0xb8018)
> - jmp bogus_magic2
> - 
> -
>  wakeup_stack_begin:  # Stack grows down
>  
>  .org 0xff0
> @@ -373,28 +357,11 @@ ENTRY(wakeup_end)
>  #
>  # Returned address is location of code in low memory (past data and stack)
>  #
> + .code64
>  ENTRY(acpi_copy_wakeup_routine)
>   pushq   %rax
> - pushq   %rcx
>   pushq   %rdx
>  
> - sgdtsaved_gdt
> - sidtsaved_idt
> - sldtsaved_ldt
> - str saved_tss
> -
> - movq%cr3, %rdx
> - movq%rdx, saved_cr3
> - movq%cr4, %rdx
> - movq%rdx, saved_cr4
> - movq%cr0, %rdx
> - movq%rdx, saved_cr0
> - sgdtreal_save_gdt - wakeup_start (,%rdi)
> - movl$MSR_EFER, %ecx
> - rdmsr
> - movl%eax, saved_efer
> - movl%edx, saved_efer2
> -
>   movlsaved_video_mode, %edx
>   movl%edx, video_mode - wakeup_start (,%rdi)
>   movlacpi_video_flags, %edx
> @@ -407,17 +374,8 @@ ENTRY(acpi_copy_wakeup_routine)
>   cmpl$0x9abcdef0, %eax
>   jne bogus_32_magic
>  
> - # make sure %cr4 is set correctly (features, etc)
> - movlsaved_cr4 - __START_KERNEL_map, %eax
> - movq%rax, %cr4
> -
> - movlsaved_cr0 - __START_KERNEL_map, %eax
> - movq%rax, %cr0
> - jmp 1f  # Flush pipelines
> -1:
>   # restore the regs we used
>   popq%rdx
> - popq%rcx
>   popq%rax
>  ENTRY(do_suspend_lowlevel_s4bios)
>   ret
> @@ -512,16 +470,3 @@ ENTRY(saved_eip) .quad   0
>  ENTRY(saved_esp) .quad   0
>  
>  ENTRY(saved_magic)   .quad   0
> -
> -ALIGN
> -# saved registers
> -saved_gdt:   .quad   0,0
> -saved_idt:   .quad   0,0
> -saved_ldt:   .quad   0
> -saved_tss:   .quad   0
> -
> -saved_cr0:   .quad 0
> -saved_cr3:   .quad 0
> -saved_cr4:   .quad 0
> -saved_efer:  .quad 0
> -saved_efer2: .quad 0
> _

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


[PATCH 15/20] x86_64: Remove the identity mapping as early as possible

2006-11-17 Thread Vivek Goyal


With the rewrite of the SMP trampoline and the early page
allocator, nothing needs identity-mapped pages once we start
executing C code.

So add zap_identity_mappings() to head64.c and remove
zap_low_mappings() from much later in the code.  The two functions
are subtly different, hence the name change.

This also kills boot_level4_pgt, which came from an earlier
attempt to move the identity mappings as early as possible
and is no longer needed.  Essentially I have replaced
boot_level4_pgt with trampoline_level4_pgt in trampoline.S.
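
To make the table arithmetic concrete, here is a small C sketch of how
a virtual address selects its slot at each level, and why clearing a
single top-level slot is enough to drop the identity mapping. The toy_
names are hypothetical stand-ins for pgd_offset_k()/pgd_clear(), the
shifts are the usual x86_64 4-level layout with 9 index bits per level,
and the kernel start address mentioned afterwards is an assumption, not
something taken from this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PTRS_PER_TABLE 512   /* 9 index bits per level */
#define TOY_PGDIR_SHIFT    39    /* top level: 512GB per slot */
#define TOY_PUD_SHIFT      30    /* next level: 1GB per slot */
#define TOY_PMD_SHIFT      21    /* 2MB large pages */

/* Slot selected by virtual address va in the table at the given level. */
static unsigned toy_index(uint64_t va, unsigned shift)
{
    return (unsigned)((va >> shift) & (TOY_PTRS_PER_TABLE - 1));
}

/* Toy version of the zap: clear the top-level slot that covers address 0. */
static void toy_zap_identity_mappings(uint64_t *pgd)
{
    assert(pgd != NULL);
    pgd[toy_index(0, TOY_PGDIR_SHIFT)] = 0;
    /* The real function also flushes the TLB afterwards. */
}
```

With the kernel mapped at 0xffffffff80000000 (assumed), the three
indices come out as 511, 510 and 0, which agrees with the arithmetic in
the head.S comments. Address 0 lands in slot 0, the only slot the zap
touches, so the kernel mapping in slot 511 is left alone.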

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]>
---

 arch/x86_64/kernel/head.S|   39 ++-
 arch/x86_64/kernel/head64.c  |   16 ++--
 arch/x86_64/kernel/setup.c   |2 --
 arch/x86_64/kernel/setup64.c |1 -
 arch/x86_64/mm/init.c|   24 
 include/asm-x86_64/pgtable.h |1 -
 include/asm-x86_64/proto.h   |2 --
 7 files changed, 24 insertions(+), 61 deletions(-)

diff -puN 
arch/x86_64/kernel/head64.c~x86_64-Remove-the-identity-mapping-as-early-as-possible
 arch/x86_64/kernel/head64.c
--- 
linux-2.6.19-rc6-reloc/arch/x86_64/kernel/head64.c~x86_64-Remove-the-identity-mapping-as-early-as-possible
  2006-11-17 00:11:42.0 -0500
+++ linux-2.6.19-rc6-reloc-root/arch/x86_64/kernel/head64.c 2006-11-17 
00:11:42.0 -0500
@@ -18,8 +18,16 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
+static void __init zap_identity_mappings(void)
+{
+   pgd_t *pgd = pgd_offset_k(0UL);
+   pgd_clear(pgd);
+   __flush_tlb();
+}
+
 /* Don't add a printk in there. printk relies on the PDA which is not 
initialized 
yet. */
 static void __init clear_bss(void)
@@ -56,6 +64,8 @@ void __init x86_64_start_kernel(char * r
 {
int i;
 
+   /* Make NULL pointers segfault */
+   zap_identity_mappings();
for (i = 0; i < 256; i++)
set_intr_gate(i, early_idt_handler);
asm volatile("lidt %0" :: "m" (idt_descr));
@@ -63,12 +73,6 @@ void __init x86_64_start_kernel(char * r
 
early_printk("Kernel alive\n");
 
-   /*
-* switch to init_level4_pgt from boot_level4_pgt
-*/
-   memcpy(init_level4_pgt, boot_level4_pgt, PTRS_PER_PGD*sizeof(pgd_t));
-   asm volatile("movq %0,%%cr3" :: "r" (__pa_symbol(&init_level4_pgt)));
-
for (i = 0; i < NR_CPUS; i++)
cpu_pda(i) = _cpu_pda[i];
 
diff -puN 
arch/x86_64/kernel/head.S~x86_64-Remove-the-identity-mapping-as-early-as-possible
 arch/x86_64/kernel/head.S
--- 
linux-2.6.19-rc6-reloc/arch/x86_64/kernel/head.S~x86_64-Remove-the-identity-mapping-as-early-as-possible
2006-11-17 00:11:42.0 -0500
+++ linux-2.6.19-rc6-reloc-root/arch/x86_64/kernel/head.S   2006-11-17 
00:11:42.0 -0500
@@ -71,7 +71,7 @@ startup_32:
movl%eax, %cr4
 
/* Setup early boot stage 4 level pagetables */
-   movl$(boot_level4_pgt - __START_KERNEL_map), %eax
+   movl$(init_level4_pgt - __START_KERNEL_map), %eax
movl%eax, %cr3
 
/* Setup EFER (Extended Feature Enable Register) */
@@ -115,7 +115,7 @@ ENTRY(secondary_startup_64)
movq%rax, %cr4
 
/* Setup early boot stage 4 level pagetables. */
-   movq$(boot_level4_pgt - __START_KERNEL_map), %rax
+   movq$(init_level4_pgt - __START_KERNEL_map), %rax
movq%rax, %cr3
 
/* Check if nx is implemented */
@@ -266,9 +266,19 @@ ENTRY(name)
i = i + 1 ; \
.endr
 
+   /*
+* This default setting generates an ident mapping at address 0x10
+* and a mapping for the kernel that precisely maps virtual address
+* 0x8000 to physical address 0x00. (always using
+* 2Mbyte large pages provided by PAE mode)
+*/
 NEXT_PAGE(init_level4_pgt)
-   /* This gets initialized in x86_64_start_kernel */
-   .fill   512,8,0
+   .quad   level3_ident_pgt - __START_KERNEL_map + _KERNPG_TABLE
+   .fill   257,8,0
+   .quad   level3_ident_pgt - __START_KERNEL_map + _KERNPG_TABLE
+   .fill   252,8,0
+   /* (2^48-(2*1024*1024*1024))/(2^39) = 511 */
+   .quad   level3_kernel_pgt - __START_KERNEL_map + _PAGE_TABLE
 
 NEXT_PAGE(level3_ident_pgt)
.quad   level2_ident_pgt - __START_KERNEL_map + _KERNPG_TABLE
@@ -299,27 +309,6 @@ NEXT_PAGE(level2_kernel_pgt)
 #undef NEXT_PAGE
 
.data
-
-#ifndef CONFIG_HOTPLUG_CPU
-   __INITDATA
-#endif
-   /*
-* This default setting generates an ident mapping at address 0x10
-* and a mapping for the kernel that precisely maps virtual address
-* 0x8000 to physical address 0x00. (always using
-* 2Mbyte large pages provided by PAE mode)
-*/
-   .align PAGE_SIZE
-ENTRY(boot_level4_pgt)
-   .quad   level3_ident_pgt - 

[PATCH 19/20] x86_64: Extend bzImage protocol for relocatable kernel

2006-11-17 Thread Vivek Goyal


o Extend the bzImage protocol (same as i386) to allow bzImage loaders to
  load the protected mode kernel at a non-1MB address. The protected mode
  component is now relocatable and can be loaded at non-1MB addresses.

o As of today, kdump uses this to run a second kernel from a reserved
  memory area.
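
  From the loader's side, the two new header fields might be used
  roughly as in the C sketch below. This is an illustration only: the
  struct is a cut-down, hypothetical view of the setup header, the toy_
  function names are invented, and real loaders have more constraints
  (memory map, initrd placement) than are shown here.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_LEGACY_LOAD_ADDR    0x100000ULL /* traditional 1MB address */
#define TOY_RELOC_PROTO_VERSION 0x0205      /* version set by this patch */

/* Hypothetical, cut-down view of the fields a loader would look at. */
struct toy_setup_header {
    uint16_t version;             /* header version number */
    uint32_t kernel_alignment;    /* required physical alignment */
    uint8_t  relocatable_kernel;  /* non-zero: may load away from 1MB */
};

/* Round addr up to a power-of-two alignment. */
static uint64_t toy_align_up(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/* Pick where to put the protected mode kernel, given a wanted address. */
static uint64_t toy_pick_load_addr(const struct toy_setup_header *h,
                                   uint64_t wanted)
{
    if (h->version < TOY_RELOC_PROTO_VERSION || !h->relocatable_kernel)
        return TOY_LEGACY_LOAD_ADDR;
    return toy_align_up(wanted, h->kernel_alignment);
}
```

  An old header, or one with relocatable_kernel clear, falls back to
  1MB; otherwise the wanted address (for kdump, somewhere inside the
  reserved area) is rounded up to kernel_alignment.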

Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]>
---

 arch/x86_64/boot/setup.S |9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff -puN 
arch/x86_64/boot/setup.S~x86_64-extend-bzImage-protocol-for-relocatable-bzImage 
arch/x86_64/boot/setup.S
--- 
linux-2.6.19-rc6-reloc/arch/x86_64/boot/setup.S~x86_64-extend-bzImage-protocol-for-relocatable-bzImage
  2006-11-17 00:13:38.0 -0500
+++ linux-2.6.19-rc6-reloc-root/arch/x86_64/boot/setup.S2006-11-17 
00:13:38.0 -0500
@@ -80,7 +80,7 @@ start:
 # This is the setup header, and it must start at %cs:2 (old 0x9020:2)
 
.ascii  "HdrS"  # header signature
-   .word   0x0204  # header version number (>= 0x0105)
+   .word   0x0205  # header version number (>= 0x0105)
# or else old loadlin-1.5 will fail)
 realmode_swtch:.word   0, 0# default_switch, SETUPSEG
 start_sys_seg: .word   SYSSEG
@@ -155,7 +155,12 @@ cmd_line_ptr:  .long 0 # (Header versio
# low memory 0x1 or higher.
 
 ramdisk_max:   .long 0x
-   
+kernel_alignment:  .long 0x20   # physical addr alignment required for
+   # protected mode relocatable kernel
+relocatable_kernel:.byte 1
+pad2:  .byte 0
+pad3:  .word 0
+
 trampoline:callstart_of_setup
.align 16
# The offset at this point is 0x240
_


[PATCH 10/20] x86_64: wakeup.S Remove dead code

2006-11-17 Thread Vivek Goyal


o Get rid of dead code in wakeup.S

o We never restore from saved_gdt, saved_idt, saved_ldt, saved_tss, saved_cr3,
  saved_cr4, saved_cr0, real_save_gdt, saved_efer, saved_efer2. Get rid
  of the associated code.

o Get rid of bogus_magic, bogus_31_magic and bogus_magic2. They are no
  longer used.

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]>
---

 arch/x86_64/kernel/acpi/wakeup.S |   57 ---
 1 file changed, 1 insertion(+), 56 deletions(-)

diff -puN 
arch/x86_64/kernel/acpi/wakeup.S~x86_64-get-rid-of-dead-code-in-suspend-resume 
arch/x86_64/kernel/acpi/wakeup.S
--- 
linux-2.6.19-rc6-reloc/arch/x86_64/kernel/acpi/wakeup.S~x86_64-get-rid-of-dead-code-in-suspend-resume
   2006-11-17 00:09:05.0 -0500
+++ linux-2.6.19-rc6-reloc-root/arch/x86_64/kernel/acpi/wakeup.S
2006-11-17 00:09:05.0 -0500
@@ -258,8 +258,6 @@ gdt_48a:
.word   0, 0# gdt base (filled in later)


-real_save_gdt: .word 0
-   .quad 0
 real_magic:.quad 0
 video_mode:.quad 0
 video_flags:   .quad 0
@@ -272,10 +270,6 @@ bogus_32_magic:
movb$0xb3,%al   ;  outb %al,$0x80
jmp bogus_32_magic
 
-bogus_31_magic:
-   movb$0xb1,%al   ;  outb %al,$0x80
-   jmp bogus_31_magic
-
 bogus_cpu:
movb$0xbc,%al   ;  outb %al,$0x80
jmp bogus_cpu
@@ -346,16 +340,6 @@ check_vesaa:
 
 _setbada: jmp setbada
 
-   .code64
-bogus_magic:
-   movw$0x0e00 + 'B', %ds:(0xb8018)
-   jmp bogus_magic
-
-bogus_magic2:
-   movw$0x0e00 + '2', %ds:(0xb8018)
-   jmp bogus_magic2
-   
-
 wakeup_stack_begin:# Stack grows down
 
 .org   0xff0
@@ -373,28 +357,11 @@ ENTRY(wakeup_end)
 #
 # Returned address is location of code in low memory (past data and stack)
 #
+   .code64
 ENTRY(acpi_copy_wakeup_routine)
pushq   %rax
-   pushq   %rcx
pushq   %rdx
 
-   sgdtsaved_gdt
-   sidtsaved_idt
-   sldtsaved_ldt
-   str saved_tss
-
-   movq%cr3, %rdx
-   movq%rdx, saved_cr3
-   movq%cr4, %rdx
-   movq%rdx, saved_cr4
-   movq%cr0, %rdx
-   movq%rdx, saved_cr0
-   sgdtreal_save_gdt - wakeup_start (,%rdi)
-   movl$MSR_EFER, %ecx
-   rdmsr
-   movl%eax, saved_efer
-   movl%edx, saved_efer2
-
movlsaved_video_mode, %edx
movl%edx, video_mode - wakeup_start (,%rdi)
movlacpi_video_flags, %edx
@@ -407,17 +374,8 @@ ENTRY(acpi_copy_wakeup_routine)
cmpl$0x9abcdef0, %eax
jne bogus_32_magic
 
-   # make sure %cr4 is set correctly (features, etc)
-   movlsaved_cr4 - __START_KERNEL_map, %eax
-   movq%rax, %cr4
-
-   movlsaved_cr0 - __START_KERNEL_map, %eax
-   movq%rax, %cr0
-   jmp 1f  # Flush pipelines
-1:
# restore the regs we used
popq%rdx
-   popq%rcx
popq%rax
 ENTRY(do_suspend_lowlevel_s4bios)
ret
@@ -512,16 +470,3 @@ ENTRY(saved_eip)   .quad   0
 ENTRY(saved_esp)   .quad   0
 
 ENTRY(saved_magic) .quad   0
-
-ALIGN
-# saved registers
-saved_gdt: .quad   0,0
-saved_idt: .quad   0,0
-saved_ldt: .quad   0
-saved_tss: .quad   0
-
-saved_cr0: .quad 0
-saved_cr3: .quad 0
-saved_cr4: .quad 0
-saved_efer:.quad 0
-saved_efer2:   .quad 0
_


RE: [rfc patch] Re: sched: incorrect argument used in task_hot()

2006-11-17 Thread Chen, Kenneth W
Mike Galbraith wrote on Friday, November 17, 2006 2:19 PM
> > And a changelog, then we're all set!
> > 
> > Oh.  And a patch, too.
> 
> Co-opt rq->timestamp_last_tick to maintain a cache_hot_time evaluation
> reference timestamp at both tick and sched times to prevent said
> reference, formerly rq->timestamp_last_tick, from being behind
> task->last_ran at evaluation time, and to move said reference closer to
> current time on the remote processor, intent being to improve cache hot
> evaluation and timestamp adjustment accuracy for task migration.
> 
> Fix minor sched_time double accounting error which occurs when a task
> passing through schedule() does not schedule off, and takes the next
> timer tick.
> 
> []
> 
> @@ -2206,7 +2207,7 @@ skip_queue:
>   }
>  
>  #ifdef CONFIG_SCHEDSTATS
> - if (task_hot(tmp, busiest->timestamp_last_tick, sd))
> + if (task_hot(tmp, busiest->most_recent_timestamp, sd))
>   schedstat_inc(sd, lb_hot_gained[idle]);
>  #endif


While we're at it, let's clean up this hunk.  task_hot() is evaluated twice
in the more common case of nr_balance_failed <= cache_nice_tries. We should
only test and increment the relevant stats for forced migration.
Patch on top of yours.


Signed-off-by: Ken Chen <[EMAIL PROTECTED]>


--- ./kernel/sched.c.orig   2006-11-17 13:37:46.0 -0800
+++ ./kernel/sched.c2006-11-17 14:20:34.0 -0800
@@ -2088,8 +2088,13 @@ int can_migrate_task(struct task_struct 
 * 2) too many balance attempts have failed.
 */
 
-   if (sd->nr_balance_failed > sd->cache_nice_tries)
+   if (sd->nr_balance_failed > sd->cache_nice_tries) {
+   #ifdef CONFIG_SCHEDSTATS
+   if (task_hot(p, rq->most_recent_timestamp, sd))
+   schedstat_inc(sd, lb_hot_gained[idle]);
+   #endif
return 1;
+   }
 
if (task_hot(p, rq->most_recent_timestamp, sd))
return 0;
@@ -2189,11 +2194,6 @@ skip_queue:
goto skip_bitmap;
}
 
-#ifdef CONFIG_SCHEDSTATS
-   if (task_hot(tmp, busiest->most_recent_timestamp, sd))
-   schedstat_inc(sd, lb_hot_gained[idle]);
-#endif
-
pull_task(busiest, array, tmp, this_rq, dst_array, this_cpu);
pulled++;
rem_load_move -= tmp->load_weight;


Re: [patch] cpufreq: mark cpufreq_tsc() as core_initcall_sync

2006-11-17 Thread Paul E. McKenney
On Fri, Nov 17, 2006 at 09:39:45PM +0300, Oleg Nesterov wrote:
> Paul E. McKenney wrote:
> >
> >  int srcu_read_lock(struct srcu_struct *sp)
> >  {
> > @@ -112,11 +126,24 @@ int srcu_read_lock(struct srcu_struct *s
> >  
> > preempt_disable();
> > idx = sp->completed & 0x1;
> > -   barrier();  /* ensure compiler looks -once- at sp->completed. */
> > -   per_cpu_ptr(sp->per_cpu_ref, smp_processor_id())->c[idx]++;
> > -   srcu_barrier();  /* ensure compiler won't misorder critical section. */
> > +   if (likely(sp->per_cpu_ref != NULL)) {
> > +   barrier();  /* ensure compiler looks -once- at sp->completed. */
> > +   per_cpu_ptr(rcu_dereference(sp->per_cpu_ref),
> > +   smp_processor_id())->c[idx]++;
> > +   smp_mb();
> > +   preempt_enable();
> > +   return idx;
> > +   }
> > preempt_enable();
> > -   return idx;
> > +   mutex_lock(&sp->mutex);
> > +   sp->per_cpu_ref = alloc_srcu_struct_percpu();
> 
> We should re-check sp->per_cpu_ref != NULL after taking sp->mutex,
> it was probably allocated by another thread.

Good catch!!!

> >  void srcu_read_unlock(struct srcu_struct *sp, int idx)
> >  {
> > -   preempt_disable();
> > -   srcu_barrier();  /* ensure compiler won't misorder critical section. */
> > -   per_cpu_ptr(sp->per_cpu_ref, smp_processor_id())->c[idx]--;
> > -   preempt_enable();
> > +   if (likely(idx != -1)) {
> > +   preempt_disable();
> > +   smp_mb();
> > +   per_cpu_ptr(sp->per_cpu_ref, smp_processor_id())->c[idx]--;
> > +   preempt_enable();
> > +   return;
> > +   }
> > +   mutex_lock(&sp->mutex);
> > +   sp->hardluckref--;
> > +   mutex_unlock(&sp->mutex);
> >  }
> 
> I think this is deadlockable, synchronize_srcu() does
> 
>   while (srcu_readers_active_idx(sp, idx))
>   schedule_timeout_interruptible(1);
> 
> under sp->mutex, so the loop above may spin forever while the reader
> waits for sp->mutex in srcu_read_unlock(sp, -1).

Indeed it is!  This requires a nested reader, so that the outer reader
blocks synchronize_srcu() and synchronize_srcu() blocks the inner
reader -- but that is legal.

So I made hardluckref an atomic_t, and changed the mutex_lock()
in srcu_read_lock() to a mutex_trylock() -- which cannot block, right?
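
For anyone following along, the shape of the idea can be sketched in
userspace C11 as below. This is emphatically not the patch: the toy_
names are invented, an atomic_flag stands in for sp->mutex used
trylock-style, a two-element atomic array stands in for the per-CPU
counters, and the index is fixed at 0 rather than derived from
sp->completed. It only shows the recheck after taking the lock and the
non-blocking fallback to hardluckref.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct toy_srcu {
    atomic_int *_Atomic ref;  /* lazily allocated reader counters */
    atomic_flag lock;         /* stand-in for sp->mutex, trylock only */
    atomic_int hardluckref;   /* readers that found no counters */
};

static void toy_init(struct toy_srcu *sp)
{
    atomic_init(&sp->ref, (atomic_int *)NULL);
    atomic_flag_clear(&sp->lock);
    atomic_init(&sp->hardluckref, 0);
}

/* Returns the counter index used, or -1 for a hard-luck reader. */
static int toy_read_lock(struct toy_srcu *sp)
{
    atomic_int *ref;

    assert(sp != NULL);
    ref = atomic_load(&sp->ref);
    if (ref == NULL && !atomic_flag_test_and_set(&sp->lock)) {
        /* Lock taken without blocking: recheck, since another
         * thread may have allocated the counters in the meantime. */
        ref = atomic_load(&sp->ref);
        if (ref == NULL) {
            ref = calloc(2, sizeof(*ref));
            if (ref != NULL)
                atomic_store(&sp->ref, ref);
        }
        atomic_flag_clear(&sp->lock);
    }
    if (ref != NULL) {
        atomic_fetch_add(&ref[0], 1);
        return 0;
    }
    /* Lock busy or allocation failed: count the reader, never wait. */
    atomic_fetch_add(&sp->hardluckref, 1);
    return -1;
}

static void toy_read_unlock(struct toy_srcu *sp, int idx)
{
    if (idx != -1) {
        atomic_int *ref = atomic_load(&sp->ref);
        atomic_fetch_sub(&ref[idx], 1);
        return;
    }
    atomic_fetch_sub(&sp->hardluckref, 1);
}
```

Neither path ever waits for the lock, so a nested reader cannot get
stuck behind an updater that holds it while waiting for readers.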

I also added the srcu_readers_active() declaration to srcu.h for Jens.
Oleg, any thoughts about Jens's optimization?  He would code something
like:

if (srcu_readers_active(_srcu))
synchronize_srcu();
else
smp_mb();

However, he is doing ordered I/O requests rather than protecting data
structures.

Changes:

o   Make hardluckref be an atomic_t.

o   Added the now-needed rcu_dereference()s for per_cpu_ref
(which used to be constant...).

o   Moved to mutex_trylock() in srcu_read_lock() to avoid Oleg's
deadlock scenario.

o   Added per_cpu_ref NULL rechecks to avoid Oleg's memory
leak (and worse).

o   Added srcu_readers_active() to srcu.h.

Still untested (aside from Jens's runs).

Signed-off-by: [EMAIL PROTECTED] (AKA [EMAIL PROTECTED])

---


 include/linux/srcu.h |8 ---
 kernel/srcu.c|  130 +++
 2 files changed, 73 insertions(+), 65 deletions(-)

diff -urpNa -X dontdiff linux-2.6.19-rc5/include/linux/srcu.h 
linux-2.6.19-rc5-dsrcu/include/linux/srcu.h
--- linux-2.6.19-rc5/include/linux/srcu.h   2006-11-17 13:54:15.0 
-0800
+++ linux-2.6.19-rc5-dsrcu/include/linux/srcu.h 2006-11-17 15:14:07.0 
-0800
@@ -35,19 +35,15 @@ struct srcu_struct {
int completed;
struct srcu_struct_array *per_cpu_ref;
struct mutex mutex;
+   atomic_t hardluckref;
 };
 
-#ifndef CONFIG_PREEMPT
-#define srcu_barrier() barrier()
-#else /* #ifndef CONFIG_PREEMPT */
-#define srcu_barrier()
-#endif /* #else #ifndef CONFIG_PREEMPT */
-
 int init_srcu_struct(struct srcu_struct *sp);
 void cleanup_srcu_struct(struct srcu_struct *sp);
 int srcu_read_lock(struct srcu_struct *sp) __acquires(sp);
 void srcu_read_unlock(struct srcu_struct *sp, int idx) __releases(sp);
 void synchronize_srcu(struct srcu_struct *sp);
 long srcu_batches_completed(struct srcu_struct *sp);
+int srcu_readers_active(struct srcu_struct *sp);
 
 #endif
diff -urpNa -X dontdiff linux-2.6.19-rc5/kernel/srcu.c 
linux-2.6.19-rc5-dsrcu/kernel/srcu.c
--- linux-2.6.19-rc5/kernel/srcu.c  2006-11-17 13:54:17.0 -0800
+++ linux-2.6.19-rc5-dsrcu/kernel/srcu.c2006-11-17 14:15:06.0 
-0800
@@ -34,6 +34,18 @@
 #include 
 #include 
 
+/*
+ * Initialize the per-CPU array, returning the pointer.
+ */
+static inline struct srcu_struct_array *alloc_srcu_struct_percpu(void)
+{
+   struct srcu_struct_array *sap;
+
+   sap = alloc_percpu(struct srcu_struct_array);
+   smp_wmb();
+   return (sap);
+}
+
 /**
  * init_srcu_struct - initialize a sleep-RCU structure
  * @sp: structure to initialize.
@@ 

Re: [PATCH 12/20] x86_64: wakeup.S Misc cleanup

2006-11-17 Thread Pavel Machek
Hi!

> o Various cleanups. One of the main purposes of the cleanups is to make
>   wakeup.S as close as possible to trampoline.S.
> 
> o Following are the changes
>   - Indentations for comments.
>   - Changed the gdt table to compact form and to resemble the
> one in trampoline.S
>   - Take the jump to 32bit from real mode using ljmpl. Makes code
> more readable.
>   - After enabling long mode, directly take a long jump for 64bit
> mode. No need to take an extra jump to "reach_compatibility_mode"
>   - Stack is not used after real mode. So don't load stack in
> 32 bit mode.
>   - No need to enable PGE here.
>   - No need to do extra EFER read, anyway we trash the read contents.
>   - No need to enable system call (EFER_SCE). Anyway it will be 
> enabled when original EFER is restored.
>   - No need to set MP, ET, NE, WP, AM bits in cr0. Very soon we will
> reload the original cr0 while restoring the processor state.
> 
> Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
> Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]>

ACK, minor nitpicks:

> + /* ??? Why I need the accessed bit set in order for this to work? */

Yes, I'd like to know :-).

> + .quad   0x00cf9b00  # __KERNEL32_CS
> + .quad   0x00af9b00  # __KERNEL_CS
> + .quad   0x00cf9300  # __KERNEL_DS

Can we get a comment telling us what to keep it in sync with?

Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


[2.6 patch] mark pci_find_device() as __deprecated

2006-11-17 Thread Adrian Bunk
On Fri, Nov 17, 2006 at 09:32:36AM -0500, Alan Cox wrote:
> On Fri, Nov 17, 2006 at 03:21:45PM +0100, Adrian Bunk wrote:
> > This patch removes the no longer used pci_find_device_reverse().
> > 
> > Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>
> 
> Acked-by: Alan Cox <[EMAIL PROTECTED]>
> 
> Soon we should deprecate pci_find_device as well

So let's mark it as __deprecated now, which also has the side effect 
that no one can later whine that removing it might break some shiny 
external modules.

Oh, and if anyone starts complaining "But this adds some warnings to 
my kernel build!", he should either first fix the 200 kB (sic) of 
warnings I'm getting in 2.6.19-rc5-mm2 starting at MODPOST or go to hell.

Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

--- linux-2.6.19-rc5-mm2/include/linux/pci.h.old2006-11-18 
01:03:27.0 +0100
+++ linux-2.6.19-rc5-mm2/include/linux/pci.h2006-11-18 01:05:46.0 
+0100
@@ -441,7 +441,7 @@
 
 /* Generic PCI functions exported to card drivers */
 
-struct pci_dev *pci_find_device (unsigned int vendor, unsigned int device, const struct pci_dev *from);
+struct pci_dev __deprecated *pci_find_device (unsigned int vendor, unsigned int device, const struct pci_dev *from);
 struct pci_dev *pci_find_slot (unsigned int bus, unsigned int devfn);
 int pci_find_capability (struct pci_dev *dev, int cap);
 int pci_find_next_capability (struct pci_dev *dev, u8 pos, int cap);


[PATCH 12/20] x86_64: wakeup.S Misc cleanup

2006-11-17 Thread Vivek Goyal


o Various cleanups. One of the main purposes of the cleanups is to make
  wakeup.S as close as possible to trampoline.S.

o Following are the changes
- Indentations for comments.
- Changed the gdt table to compact form and to resemble the
  one in trampoline.S
- Take the jump to 32bit from real mode using ljmpl. Makes code
  more readable.
- After enabling long mode, directly take a long jump for 64bit
  mode. No need to take an extra jump to "reach_compatibility_mode"
- Stack is not used after real mode. So don't load stack in
  32 bit mode.
- No need to enable PGE here.
- No need to do extra EFER read, anyway we trash the read contents.
- No need to enable system call (EFER_SCE). Anyway it will be 
  enabled when original EFER is restored.
- No need to set MP, ET, NE, WP, AM bits in cr0. Very soon we will
  reload the original cr0 while restoring the processor state.
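
  As a cross-check of the register values the cleaned-up 32-bit path
  ends up writing, here is a small C sketch. The toy_ names are made
  up; the bit positions (PAE = CR4 bit 5, LME = EFER bit 8, NX = EFER
  bit 11, PE/PG = CR0 bits 0/31) are the architectural ones, and the
  NX test mirrors the "btl $20,%edi" on the saved CPUID feature word.
  It models only what the bullet list above describes, not the whole
  wakeup sequence.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_CR4_PAE   (1u << 5)   /* "btsl $5, %eax" */
#define TOY_EFER_LME  (1u << 8)   /* long mode enable */
#define TOY_EFER_NX   (1u << 11)  /* no-execute enable */
#define TOY_CR0_PE    (1u << 0)   /* protected mode */
#define TOY_CR0_PG    (1u << 31)  /* paging, which activates long mode */
#define TOY_CPUID_NX  (1u << 20)  /* NX feature bit in the saved edx */

/* PAE only: PGE is no longer set here. */
static uint32_t toy_wakeup_cr4(void)
{
    return TOY_CR4_PAE;
}

/* Low word of EFER: LME always, NX only if the CPU reports it.
 * SCE is left clear; it returns when the original EFER is restored. */
static uint32_t toy_wakeup_efer(uint32_t cpuid_edx)
{
    uint32_t efer = TOY_EFER_LME;

    if (cpuid_edx & TOY_CPUID_NX)
        efer |= TOY_EFER_NX;
    assert(efer & TOY_EFER_LME);
    return efer;
}

/* Paging plus protected mode; MP, ET, NE, WP, AM are left clear. */
static uint32_t toy_wakeup_cr0(void)
{
    return TOY_CR0_PG | TOY_CR0_PE;
}
```

  With NX present this gives cr4 = 0x20, EFER low word = 0x900 and
  cr0 = 0x80000001.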

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]>
---

 arch/x86_64/kernel/acpi/wakeup.S |  111 +--
 1 file changed, 39 insertions(+), 72 deletions(-)

diff -puN arch/x86_64/kernel/acpi/wakeup.S~x86_64-wakeup.S-misc-cleanups 
arch/x86_64/kernel/acpi/wakeup.S
--- 
linux-2.6.19-rc6-reloc/arch/x86_64/kernel/acpi/wakeup.S~x86_64-wakeup.S-misc-cleanups
   2006-11-17 00:09:56.0 -0500
+++ linux-2.6.19-rc6-reloc-root/arch/x86_64/kernel/acpi/wakeup.S
2006-11-17 00:09:56.0 -0500
@@ -30,11 +30,12 @@ wakeup_code:
cld
# setup data segment
movw%cs, %ax
-   movw%ax, %ds# Make ds:0 point to wakeup_start
+   movw%ax, %ds# Make ds:0 point to wakeup_start
movw%ax, %ss
-   mov $(wakeup_stack - wakeup_code), %sp  # Private stack is needed for ASUS board
+   # Private stack is needed for ASUS board
+   mov $(wakeup_stack - wakeup_code), %sp
 
-   pushl   $0  # Kill any dangerous flags
+   pushl   $0  # Kill any dangerous flags
popfl
 
movlreal_magic - wakeup_code, %eax
@@ -45,7 +46,7 @@ wakeup_code:
jz  1f
lcall   $0xc000,$3
movw%cs, %ax
-   movw%ax, %ds# Bios might have played with that
+   movw%ax, %ds# Bios might have played with that
movw%ax, %ss
 1:
 
@@ -75,9 +76,12 @@ wakeup_code:
jmp 1f
 1:
 
-   .byte 0x66, 0xea# prefix + jmpi-opcode
-   .long   wakeup_32 - __START_KERNEL_map
-   .word   __KERNEL_CS
+   ljmpl   *(wakeup_32_vector - wakeup_code)
+
+   .balign 4
+wakeup_32_vector:
+   .long   wakeup_32 - __START_KERNEL_map
+   .word   __KERNEL32_CS, 0
 
.code32
 wakeup_32:
@@ -96,65 +100,50 @@ wakeup_32:
jnc bogus_cpu
movl%edx,%edi

-   movw$__KERNEL_DS, %ax
-   movw%ax, %ds
-   movw%ax, %es
-   movw%ax, %fs
-   movw%ax, %gs
+   movl$__KERNEL_DS, %eax
+   movl%eax, %ds
 
-   movw$__KERNEL_DS, %ax   
-   movw%ax, %ss
-
-   mov $(wakeup_stack - __START_KERNEL_map), %esp
movlsaved_magic - __START_KERNEL_map, %eax
cmpl$0x9abcdef0, %eax
jne bogus_32_magic
 
+   movw$0x0e00 + 'i', %ds:(0xb8012)
+   movb$0xa8, %al  ;  outb %al, $0x80;
+
/*
 * Prepare for entering 64bits mode
 */
 
-   /* Enable PAE mode and PGE */
+   /* Enable PAE */
xorl%eax, %eax
btsl$5, %eax
-   btsl$7, %eax
movl%eax, %cr4
 
/* Setup early boot stage 4 level pagetables */
movl$(wakeup_level4_pgt - __START_KERNEL_map), %eax
movl%eax, %cr3
 
-   /* Setup EFER (Extended Feature Enable Register) */
-   movl$MSR_EFER, %ecx
-   rdmsr
-   /* Fool rdmsr and reset %eax to avoid dependences */
-   xorl%eax, %eax
/* Enable Long Mode */
+   xorl%eax, %eax
btsl$_EFER_LME, %eax
-   /* Enable System Call */
-   btsl$_EFER_SCE, %eax
 
-   /* No Execute supported? */ 
+   /* No Execute supported? */
btl $20,%edi
jnc 1f
btsl$_EFER_NX, %eax
-1: 

/* Make changes effective */
+1: movl$MSR_EFER, %ecx
+   xorl%edx, %edx
wrmsr
-   wbinvd
 
xorl%eax, %eax
btsl$31, %eax   /* Enable paging and in turn activate Long Mode */
btsl$0, %eax/* Enable protected mode */
-   btsl$1, %eax 

[PATCH 17/20] x86_64: Remove CONFIG_PHYSICAL_START

2006-11-17 Thread Vivek Goyal


I am about to add relocatable kernel support, which has essentially
no cost, so there is no point in retaining CONFIG_PHYSICAL_START.
Retaining it would also make implementing and testing a relocatable
kernel more difficult.

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]>
---

 arch/x86_64/Kconfig|   19 ---
 arch/x86_64/boot/compressed/head.S |6 +++---
 arch/x86_64/boot/compressed/misc.c |6 +++---
 arch/x86_64/defconfig  |1 -
 arch/x86_64/kernel/vmlinux.lds.S   |2 +-
 arch/x86_64/mm/fault.c |4 ++--
 include/asm-x86_64/page.h  |2 --
 7 files changed, 9 insertions(+), 31 deletions(-)

diff -puN 
arch/x86_64/boot/compressed/head.S~x86_64-Remove-CONFIG_PHYSICAL_START 
arch/x86_64/boot/compressed/head.S
--- 
linux-2.6.19-rc6-reloc/arch/x86_64/boot/compressed/head.S~x86_64-Remove-CONFIG_PHYSICAL_START
   2006-11-17 00:12:50.0 -0500
+++ linux-2.6.19-rc6-reloc-root/arch/x86_64/boot/compressed/head.S  
2006-11-17 00:12:50.0 -0500
@@ -76,7 +76,7 @@ startup_32:
jnz  3f
addl $8,%esp
xorl %ebx,%ebx
-   ljmp $(__KERNEL_CS), $__PHYSICAL_START
+   ljmp $(__KERNEL_CS), $0x20
 
 /*
  * We come here, if we were loaded high.
@@ -102,7 +102,7 @@ startup_32:
popl %ecx   # lcount
popl %edx   # high_buffer_start
popl %eax   # hcount
-   movl $__PHYSICAL_START,%edi
+   movl $0x20,%edi
cli # make sure we don't get interrupted
ljmp $(__KERNEL_CS), $0x1000 # and jump to the move routine
 
@@ -127,7 +127,7 @@ move_routine_start:
movsl
movl %ebx,%esi  # Restore setup pointer
xorl %ebx,%ebx
-   ljmp $(__KERNEL_CS), $__PHYSICAL_START
+   ljmp $(__KERNEL_CS), $0x20
 move_routine_end:
 
 
diff -puN 
arch/x86_64/boot/compressed/misc.c~x86_64-Remove-CONFIG_PHYSICAL_START 
arch/x86_64/boot/compressed/misc.c
--- 
linux-2.6.19-rc6-reloc/arch/x86_64/boot/compressed/misc.c~x86_64-Remove-CONFIG_PHYSICAL_START
   2006-11-17 00:12:50.0 -0500
+++ linux-2.6.19-rc6-reloc-root/arch/x86_64/boot/compressed/misc.c  
2006-11-17 00:12:50.0 -0500
@@ -288,7 +288,7 @@ static void setup_normal_output_buffer(v
 #else
if ((RM_ALT_MEM_K > RM_EXT_MEM_K ? RM_ALT_MEM_K : RM_EXT_MEM_K) < 1024) 
error("Less than 2MB of memory");
 #endif
-   output_data = (unsigned char *)__PHYSICAL_START; /* Normally Points to 1M */
+   output_data = (unsigned char *)0x20;
free_mem_end_ptr = (long)real_mode;
 }
 
@@ -311,8 +311,8 @@ static void setup_output_buffer_if_we_ru
low_buffer_size = low_buffer_end - LOW_BUFFER_START;
high_loaded = 1;
free_mem_end_ptr = (long)high_buffer_start;
-   if ( (__PHYSICAL_START + low_buffer_size) > ((ulg)high_buffer_start)) {
-   high_buffer_start = (uch *)(__PHYSICAL_START + low_buffer_size);
+   if ( (0x20 + low_buffer_size) > ((ulg)high_buffer_start)) {
+   high_buffer_start = (uch *)(0x20 + low_buffer_size);
mv->hcount = 0; /* say: we need not to move high_buffer */
}
else mv->hcount = -1;
diff -puN arch/x86_64/defconfig~x86_64-Remove-CONFIG_PHYSICAL_START 
arch/x86_64/defconfig
--- 
linux-2.6.19-rc6-reloc/arch/x86_64/defconfig~x86_64-Remove-CONFIG_PHYSICAL_START
2006-11-17 00:12:50.0 -0500
+++ linux-2.6.19-rc6-reloc-root/arch/x86_64/defconfig   2006-11-17 
00:12:50.0 -0500
@@ -165,7 +165,6 @@ CONFIG_X86_MCE_INTEL=y
 CONFIG_X86_MCE_AMD=y
 # CONFIG_KEXEC is not set
 # CONFIG_CRASH_DUMP is not set
-CONFIG_PHYSICAL_START=0x20
 CONFIG_SECCOMP=y
 # CONFIG_CC_STACKPROTECTOR is not set
 # CONFIG_HZ_100 is not set
diff -puN arch/x86_64/Kconfig~x86_64-Remove-CONFIG_PHYSICAL_START 
arch/x86_64/Kconfig
--- 
linux-2.6.19-rc6-reloc/arch/x86_64/Kconfig~x86_64-Remove-CONFIG_PHYSICAL_START  
2006-11-17 00:12:50.0 -0500
+++ linux-2.6.19-rc6-reloc-root/arch/x86_64/Kconfig 2006-11-17 
00:12:50.0 -0500
@@ -513,25 +513,6 @@ config CRASH_DUMP
  PHYSICAL_START.
   For more details see Documentation/kdump/kdump.txt
 
-config PHYSICAL_START
-   hex "Physical address where the kernel is loaded" if (EMBEDDED || CRASH_DUMP)
-   default "0x100" if CRASH_DUMP
-   default "0x20"
-   help
- This gives the physical address where the kernel is loaded. Normally
- for regular kernels this value is 0x20 (2MB). But in the case
- of kexec on panic the fail safe kernel needs to run at a different
- address than the panic-ed kernel. This option is used to set the load
- address for kernels used to capture crash dump on being kexec'ed
- after panic. The default value for crash dump kernels is
- 0x100 (16MB). This can also be set based on the "X" value as
- 

[PATCH 18/20] x86_64: Relocatable kernel support

2006-11-17 Thread Vivek Goyal


This patch modifies the x86_64 kernel so that it can be loaded and run
at any 2M aligned address, below 512G.  The technique used is to
compile the decompressor with -fPIC and modify it so the decompressor
is fully relocatable.  For the main kernel the page tables are
modified so the kernel remains at the same virtual address.  In
addition a variable phys_base is kept that holds the physical address
the kernel is loaded at.  __pa_symbol is modified to add that when
we take the address of a kernel symbol.

When loaded with a normal bootloader the decompressor will decompress
the kernel to 2M and it will run there.  This both ensures the
relocation code is always working, and makes it easier to use 2M
pages for the kernel and the cpu.
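
As a rough illustration of the phys_base idea described above (this is
not the code from the patch), the sketch below models a
__pa_symbol-style translation in user-space C.  All sketch_ names are
hypothetical, the 0xffffffff80000000 mapping base is an assumed value,
and treating phys_base as an offset from the link-time physical address
is an assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration only; the real ones live in kernel headers. */
#define SKETCH_START_KERNEL_MAP 0xffffffff80000000ULL /* virtual base of the kernel image mapping */
#define SKETCH_LARGE_PAGE_SIZE  (2ULL << 20)          /* 2M */

/* Hypothetical stand-in for phys_base: how far the kernel sits from where it was linked. */
static uint64_t sketch_phys_base;

/* Record where the image was loaded; the load address must be 2M aligned. */
static void sketch_set_load_address(uint64_t load_phys, uint64_t linked_phys)
{
    assert(load_phys % SKETCH_LARGE_PAGE_SIZE == 0);
    sketch_phys_base = load_phys - linked_phys;
}

/* The virtual address of a kernel symbol stays fixed; only the physical result shifts. */
static uint64_t sketch_pa_symbol(uint64_t vaddr)
{
    return vaddr - SKETCH_START_KERNEL_MAP + sketch_phys_base;
}
```

With the offset left at zero, a symbol at the mapping base plus 2M
resolves to physical 2M.  After sketch_set_load_address(0x1200000,
0x200000), the same virtual address resolves to 0x1200000, which is the
extra variable read a relocated kernel pays for.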

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]>
---

 arch/x86_64/boot/compressed/Makefile|   12 -
 arch/x86_64/boot/compressed/head.S  |  311 ++--
 arch/x86_64/boot/compressed/misc.c  |  251 +
 arch/x86_64/boot/compressed/vmlinux.lds |   44 
 arch/x86_64/boot/compressed/vmlinux.scr |5 
 arch/x86_64/kernel/head.S   |  221 --
 include/asm-x86_64/page.h   |6 
 7 files changed, 532 insertions(+), 318 deletions(-)

diff -puN arch/x86_64/boot/compressed/head.S~x86_64-Relocatable-kernel-support 
arch/x86_64/boot/compressed/head.S
--- 
linux-2.6.19-rc6-reloc/arch/x86_64/boot/compressed/head.S~x86_64-Relocatable-kernel-support
 2006-11-17 00:13:18.0 -0500
+++ linux-2.6.19-rc6-reloc-root/arch/x86_64/boot/compressed/head.S  
2006-11-17 00:13:18.0 -0500
@@ -26,116 +26,245 @@
 
 #include 
 #include 
+#include 
 #include 
+#include 
 
+.section ".text.head"
.code32
.globl startup_32

 startup_32:
cld
cli
-   movl $(__KERNEL_DS),%eax
-   movl %eax,%ds
-   movl %eax,%es
-   movl %eax,%fs
-   movl %eax,%gs
-
-   lss stack_start,%esp
-   xorl %eax,%eax
-1: incl %eax   # check that A20 really IS enabled
-   movl %eax,0x00  # loop forever if it isn't
-   cmpl %eax,0x10
-   je 1b
-
-/*
- * Initialize eflags.  Some BIOS's leave bits like NT set.  This would
- * confuse the debugger if this code is traced.
- * XXX - best to initialize before switching to protected mode.
+   movl$(__KERNEL_DS), %eax
+   movl%eax, %ds
+   movl%eax, %es
+   movl%eax, %ss
+
+/* Calculate the delta between where we were compiled to run
+ * at and where we were actually loaded at.  This can only be done
+ * with a short local call on x86.  Nothing  else will tell us what
+ * address we are running at.  The reserved chunk of the real-mode
+ * data at 0x34-0x3f are used as the stack for this calculation.
+ * Only 4 bytes are needed.
  */
-   pushl $0
-   popfl
+   leal0x40(%esi), %esp
+   call1f
+1: popl%ebp
+   subl$1b, %ebp
+
+/* Compute the delta between where we were compiled to run at
+ * and where the code will actually run at.
+ */
+   movl%ebp, %ebx
+   addl$(LARGE_PAGE_SIZE -1), %ebx
+   andl$LARGE_PAGE_MASK, %ebx
+
+   /* Replace the compressed data size with the uncompressed size */
+   sublinput_len(%ebp), %ebx
+   movloutput_len(%ebp), %eax
+   addl%eax, %ebx
+   /* Add 8 bytes for every 32K input block */
+   shrl$12, %eax
+   addl%eax, %ebx
+   /* Add 32K + 18 bytes of extra slack and align on a 4K boundary */
+   addl$(32768 + 18 + 4095), %ebx
+   andl$~4095, %ebx
+
+/*
+ * Prepare for entering 64 bit mode
+ */
+
+   /* Load new GDT with the 64bit segments using 32bit descriptor */
+   lealgdt(%ebp), %eax
+   movl%eax, gdt+2(%ebp)
+   lgdtgdt(%ebp)
+
+   /* Enable PAE mode */
+   xorl%eax, %eax
+   orl $(1 << 5), %eax
+   movl%eax, %cr4
+
+/*
+ * Build early 4G boot pagetable
+ */
+   /* Initialize Page tables to 0*/
+   lealpgtable(%ebx), %edi
+   xorl%eax, %eax
+   movl$((4096*6)/4), %ecx
+   rep stosl
+
+   /* Build Level 4 */
+   lealpgtable + 0(%ebx), %edi
+   leal0x1007 (%edi), %eax
+   movl%eax, 0(%edi)
+
+   /* Build Level 3 */
+   lealpgtable + 0x1000(%ebx), %edi
+   leal0x1007(%edi), %eax
+   movl$4, %ecx
+1: movl%eax, 0x00(%edi)
+   addl$0x1000, %eax
+   addl$8, %edi
+   decl%ecx
+   jnz 1b
+
+   /* Build Level 2 */
+   lealpgtable + 0x2000(%ebx), %edi
+   movl$0x0183, %eax
+   movl$2048, %ecx
+1: movl%eax, 0(%edi)
+   addl$0x0020, %eax
+   addl$8, %edi
+   decl%ecx
+   jnz 1b
+
+   /* Enable the boot page tables */
+   lealpgtable(%ebx), %eax
+   movl

[PATCH 8/20] x86_64: Add EFER to the set registers saved by save_processor_state

2006-11-17 Thread Vivek Goyal


EFER, like %cr4, varies depending on the cpu capabilities and on which
cpu capabilities we want to make use of.  So save and restore it to make
certain we have the same EFER value when we are done.

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]>
---

 arch/x86_64/kernel/suspend.c |3 ++-
 include/asm-x86_64/suspend.h |1 +
 2 files changed, 3 insertions(+), 1 deletion(-)

diff -puN 
arch/x86_64/kernel/suspend.c~x86_64-Add-EFER-to-the-set-registers-saved-by-save_processor_state
 arch/x86_64/kernel/suspend.c
--- 
linux-2.6.19-rc6-reloc/arch/x86_64/kernel/suspend.c~x86_64-Add-EFER-to-the-set-registers-saved-by-save_processor_state
  2006-11-17 00:08:16.0 -0500
+++ linux-2.6.19-rc6-reloc-root/arch/x86_64/kernel/suspend.c2006-11-17 
00:08:16.0 -0500
@@ -33,7 +33,6 @@ void __save_processor_state(struct saved
asm volatile ("str %0"  : "=m" (ctxt->tr));
 
/* XMM0..XMM15 should be handled by kernel_fpu_begin(). */
-   /* EFER should be constant for kernel version, no need to handle it. */
/*
 * segment registers
 */
@@ -50,6 +49,7 @@ void __save_processor_state(struct saved
/*
 * control registers 
 */
+   rdmsrl(MSR_EFER, ctxt->efer);
asm volatile ("movq %%cr0, %0" : "=r" (ctxt->cr0));
asm volatile ("movq %%cr2, %0" : "=r" (ctxt->cr2));
asm volatile ("movq %%cr3, %0" : "=r" (ctxt->cr3));
@@ -75,6 +75,7 @@ void __restore_processor_state(struct sa
/*
 * control registers
 */
+   wrmsrl(MSR_EFER, ctxt->efer);
asm volatile ("movq %0, %%cr8" :: "r" (ctxt->cr8));
asm volatile ("movq %0, %%cr4" :: "r" (ctxt->cr4));
asm volatile ("movq %0, %%cr3" :: "r" (ctxt->cr3));
diff -puN 
include/asm-x86_64/suspend.h~x86_64-Add-EFER-to-the-set-registers-saved-by-save_processor_state
 include/asm-x86_64/suspend.h
--- 
linux-2.6.19-rc6-reloc/include/asm-x86_64/suspend.h~x86_64-Add-EFER-to-the-set-registers-saved-by-save_processor_state
  2006-11-17 00:08:16.0 -0500
+++ linux-2.6.19-rc6-reloc-root/include/asm-x86_64/suspend.h2006-11-17 
00:08:16.0 -0500
@@ -17,6 +17,7 @@ struct saved_context {
u16 ds, es, fs, gs, ss;
unsigned long gs_base, gs_kernel_base, fs_base;
unsigned long cr0, cr2, cr3, cr4, cr8;
+   unsigned long efer;
u16 gdt_pad;
u16 gdt_limit;
unsigned long gdt_base;
_


[RFC: -mm patch] make kernel/timer.c:__next_timer_interrupt() static

2006-11-17 Thread Adrian Bunk
This patch makes the needlessly global __next_timer_interrupt() static.

Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

--- linux-2.6.19-rc5-mm2/kernel/timer.c.old 2006-11-17 19:11:49.0 +0100
+++ linux-2.6.19-rc5-mm2/kernel/timer.c 2006-11-17 19:12:02.0 +0100
@@ -621,7 +621,8 @@
  * is used on S/390 to stop all activity when a cpus is idle.
  * This functions needs to be called disabled.
  */
-unsigned long __next_timer_interrupt(tvec_base_t *base, unsigned long now)
+static unsigned long __next_timer_interrupt(tvec_base_t *base,
+   unsigned long now)
 {
struct list_head *list;
struct timer_list *nte, *found = NULL;



[RFC: -mm patch] remove kernel/timer.c:wall_jiffies

2006-11-17 Thread Adrian Bunk
"wall_jiffies" was added, but it's completely unused...

Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

--- linux-2.6.19-rc5-mm2/kernel/timer.c.old 2006-11-17 19:09:54.0 +0100
+++ linux-2.6.19-rc5-mm2/kernel/timer.c 2006-11-17 19:10:01.0 +0100
@@ -42,9 +42,6 @@
 #include 
 #include 
 
-/* jiffies at the most recent update of wall time */
-unsigned long wall_jiffies = INITIAL_JIFFIES;
-
 u64 jiffies_64 __cacheline_aligned_in_smp = INITIAL_JIFFIES;
 
 EXPORT_SYMBOL(jiffies_64);



[RFC: 2.6 patch] remove the broken HISAX_AMD7930 option

2006-11-17 Thread Adrian Bunk
HISAX_AMD7930 was never anywhere near working, and this seems unlikely 
to change in the foreseeable future.

Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

---

 drivers/isdn/hisax/Kconfig  |7 ---
 drivers/isdn/hisax/config.c |   18 --
 drivers/isdn/hisax/hisax.h  |6 --
 3 files changed, 31 deletions(-)

--- linux-2.6.19-rc5-mm2/drivers/isdn/hisax/Kconfig.old 2006-11-17 
19:41:07.0 +0100
+++ linux-2.6.19-rc5-mm2/drivers/isdn/hisax/Kconfig 2006-11-17 
19:41:15.0 +0100
@@ -349,13 +349,6 @@ config HISAX_ENTERNOW_PCI
  This enables HiSax support for the Formula-n enter:now PCI
  ISDN card.
 
-config HISAX_AMD7930
-   bool "Am7930 (EXPERIMENTAL)"
-   depends on EXPERIMENTAL && SPARC && BROKEN
-   help
- This enables HiSax support for the AMD7930 chips on some SPARCs.
- This code is not finished yet.
-
 endif
 
 if ISDN_DRV_HISAX
--- linux-2.6.19-rc5-mm2/drivers/isdn/hisax/hisax.h.old 2006-11-17 
19:41:33.0 +0100
+++ linux-2.6.19-rc5-mm2/drivers/isdn/hisax/hisax.h 2006-11-17 
19:41:44.0 +0100
@@ -1139,12 +1139,6 @@ struct IsdnCardState {
 #define  CARD_HFC_SX 0
 #endif
 
-#ifdef  CONFIG_HISAX_AMD7930
-#define CARD_AMD7930 1
-#else
-#define CARD_AMD7930 0
-#endif
-
 #ifdef CONFIG_HISAX_NICCY
 #defineCARD_NICCY 1
 #ifndef ISDN_CHIP_ISAC
--- linux-2.6.19-rc5-mm2/drivers/isdn/hisax/config.c.old2006-11-17 
19:41:57.0 +0100
+++ linux-2.6.19-rc5-mm2/drivers/isdn/hisax/config.c2006-11-17 
19:43:03.0 +0100
@@ -227,14 +227,6 @@ const char *CardType[] = {
 #define DEFAULT_CFG {5,0x2E0,0,0}
 #endif
 
-
-#ifdef CONFIG_HISAX_AMD7930
-#undef DEFAULT_CARD
-#undef DEFAULT_CFG
-#define DEFAULT_CARD ISDN_CTYPE_AMD7930
-#define DEFAULT_CFG {12,0x3e0,0,0}
-#endif
-
 #ifdef CONFIG_HISAX_NICCY
 #undef DEFAULT_CARD
 #undef DEFAULT_CFG
@@ -545,10 +537,6 @@ extern int setup_hfcpci(struct IsdnCard 
 extern int setup_hfcsx(struct IsdnCard *card);
 #endif
 
-#if CARD_AMD7930
-extern int setup_amd7930(struct IsdnCard *card);
-#endif
-
 #if CARD_NICCY
 extern int setup_niccy(struct IsdnCard *card);
 #endif
@@ -1064,11 +1052,6 @@ static int checkcard(int cardnr, char *i
ret = setup_niccy(card);
break;
 #endif
-#if CARD_AMD7930
-   case ISDN_CTYPE_AMD7930:
-   ret = setup_amd7930(card);
-   break;
-#endif
 #if CARD_ISURF
case ISDN_CTYPE_ISURF:
ret = setup_isurf(card);
@@ -1438,7 +1421,6 @@ static int __init HiSax_init(void)
break;
case ISDN_CTYPE_ELSA_PCI:
case ISDN_CTYPE_NETJET_S:
-   case ISDN_CTYPE_AMD7930:
case ISDN_CTYPE_TELESPCI:
case ISDN_CTYPE_W6692:
case ISDN_CTYPE_NETJET_U:



[RFC: 2.6 patch] make kernel/signal.c:kill_proc_info() static

2006-11-17 Thread Adrian Bunk
This patch makes the needlessly global kill_proc_info() static.

Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

---

 include/linux/sched.h |1 -
 kernel/signal.c   |3 +--
 2 files changed, 1 insertion(+), 3 deletions(-)

--- linux-2.6.19-rc5-mm2/include/linux/sched.h.old  2006-11-17 19:05:35.0 +0100
+++ linux-2.6.19-rc5-mm2/include/linux/sched.h  2006-11-17 19:05:43.0 +0100
@@ -1343,7 +1343,6 @@
 extern int kill_pid(struct pid *pid, int sig, int priv);
 extern int __kill_pg_info(int sig, struct siginfo *info, pid_t pgrp);
 extern int kill_pg_info(int, struct siginfo *, pid_t);
-extern int kill_proc_info(int, struct siginfo *, pid_t);
 extern void do_notify_parent(struct task_struct *, int);
 extern void force_sig(int, struct task_struct *);
 extern void force_sig_specific(int, struct task_struct *);
--- linux-2.6.19-rc5-mm2/kernel/signal.c.old 2006-11-17 19:05:51.0 +0100
+++ linux-2.6.19-rc5-mm2/kernel/signal.c 2006-11-17 19:06:03.0 +0100
@@ -1261,8 +1261,7 @@
return error;
 }
 
-int
-kill_proc_info(int sig, struct siginfo *info, pid_t pid)
+static int kill_proc_info(int sig, struct siginfo *info, pid_t pid)
 {
int error;
rcu_read_lock();



[PATCH 16/20] x86_64: __pa and __pa_symbol address space separation

2006-11-17 Thread Vivek Goyal


Currently __pa_symbol is for use with symbols in the kernel address
map and __pa is for use with pointers into the physical memory map.
But the code is implemented so you can usually interchange the two.

__pa, which is much more common, can be implemented much more cheaply
if it doesn't have to worry about any other kernel address
spaces.  This is especially true with a relocatable kernel, as
__pa_symbol needs to perform an extra variable read to resolve
the address.
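
To make the cost argument concrete, here is a minimal user-space sketch
(not the kernel's macros) contrasting a translation that must accept
both address ranges with one that only handles the physical memory map.
The sketch_ names are hypothetical, and both base addresses are assumed
values chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bases for illustration only. */
#define SKETCH_PAGE_OFFSET      0xffff810000000000ULL /* start of the physical memory map */
#define SKETCH_START_KERNEL_MAP 0xffffffff80000000ULL /* start of the kernel image map */

/* Interchangeable form: every call has to test which range the address is in. */
static uint64_t sketch_pa_combined(uint64_t vaddr)
{
    if (vaddr >= SKETCH_START_KERNEL_MAP)
        return vaddr - SKETCH_START_KERNEL_MAP;
    return vaddr - SKETCH_PAGE_OFFSET;
}

/* Separated form: only physical-memory-map pointers are legal, so it is one subtraction. */
static uint64_t sketch_pa(uint64_t vaddr)
{
    assert(vaddr >= SKETCH_PAGE_OFFSET && vaddr < SKETCH_START_KERNEL_MAP);
    return vaddr - SKETCH_PAGE_OFFSET;
}
```

The assert in sketch_pa() only documents the contract in this sketch.
The point is that the common path loses its branch, while symbol
addresses go through a separate translation.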

A third macro, __pa_vsymbol, is added for the vsyscall data; it finds
the physical addresses of vsyscall pages.

Most of this patch is simply sorting through the references to
__pa or __pa_symbol and using the proper one.  A little of
it is continuing to use a physical address when we have it
instead of recalculating it several times.

swapper_pgd is now NULL.  leave_mm now uses init_mm.pgd
and init_mm.pgd is initialized at boot (instead of compile time)
to the physmem virtual mapping of init_level4_pgd.  The
physical address changed.

Except for the EMPTY_ZERO page, all of the remaining references
to __pa_symbol appear to be during kernel initialization.  So this
should reduce the cost of __pa in the common case, even on a relocated
kernel.

As this is technically a semantic change we need to be on the lookout
for anything I missed.  But it works for me (tm).

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]>
---

 arch/i386/kernel/alternative.c |8 
 arch/i386/mm/init.c|   15 ---
 arch/x86_64/kernel/machine_kexec.c |   14 +++---
 arch/x86_64/kernel/setup.c |9 +
 arch/x86_64/kernel/smp.c   |2 +-
 arch/x86_64/kernel/vsyscall.c  |   10 --
 arch/x86_64/mm/init.c  |   21 +++--
 arch/x86_64/mm/pageattr.c  |   17 ++---
 include/asm-x86_64/page.h  |6 ++
 include/asm-x86_64/pgtable.h   |4 ++--
 10 files changed, 58 insertions(+), 48 deletions(-)

diff -puN 
arch/i386/kernel/alternative.c~x86_64-__pa-and-__pa_symbol-address-space-separation
 arch/i386/kernel/alternative.c
--- 
linux-2.6.19-rc6-reloc/arch/i386/kernel/alternative.c~x86_64-__pa-and-__pa_symbol-address-space-separation
  2006-11-17 00:12:15.0 -0500
+++ linux-2.6.19-rc6-reloc-root/arch/i386/kernel/alternative.c  2006-11-17 
00:12:15.0 -0500
@@ -348,8 +348,8 @@ void __init alternative_instructions(voi
if (no_replacement) {
printk(KERN_INFO "(SMP-)alternatives turned off\n");
free_init_pages("SMP alternatives",
-   (unsigned long)__smp_alt_begin,
-   (unsigned long)__smp_alt_end);
+   __pa_symbol(&__smp_alt_begin),
+   __pa_symbol(&__smp_alt_end));
return;
}
 
@@ -378,8 +378,8 @@ void __init alternative_instructions(voi
_text, _etext);
}
free_init_pages("SMP alternatives",
-   (unsigned long)__smp_alt_begin,
-   (unsigned long)__smp_alt_end);
+   __pa_symbol(&__smp_alt_begin),
+   __pa_symbol(&__smp_alt_end));
} else {
alternatives_smp_save(__smp_alt_instructions,
  __smp_alt_instructions_end);
diff -puN 
arch/i386/mm/init.c~x86_64-__pa-and-__pa_symbol-address-space-separation 
arch/i386/mm/init.c
--- 
linux-2.6.19-rc6-reloc/arch/i386/mm/init.c~x86_64-__pa-and-__pa_symbol-address-space-separation
 2006-11-17 00:12:15.0 -0500
+++ linux-2.6.19-rc6-reloc-root/arch/i386/mm/init.c 2006-11-17 
00:12:15.0 -0500
@@ -778,10 +778,11 @@ void free_init_pages(char *what, unsigne
unsigned long addr;
 
for (addr = begin; addr < end; addr += PAGE_SIZE) {
-   ClearPageReserved(virt_to_page(addr));
-   init_page_count(virt_to_page(addr));
-   memset((void *)addr, POISON_FREE_INITMEM, PAGE_SIZE);
-   free_page(addr);
+   struct page *page = pfn_to_page(addr >> PAGE_SHIFT);
+   ClearPageReserved(page);
+   init_page_count(page);
+   memset(page_address(page), POISON_FREE_INITMEM, PAGE_SIZE);
+   __free_page(page);
totalram_pages++;
}
printk(KERN_INFO "Freeing %s: %ldk freed\n", what, (end - begin) >> 10);
@@ -790,14 +791,14 @@ void free_init_pages(char *what, unsigne
 void free_initmem(void)
 {
free_init_pages("unused kernel memory",
-   (unsigned long)(&__init_begin),
-   (unsigned long)(&__init_end));
+   __pa_symbol(&__init_begin),
+   __pa_symbol(&__init_end));
 }
 
 #ifdef 

[PATCH 3/20] x86_64: Kill temp_boot_pmds

2006-11-17 Thread Vivek Goyal


Early in the boot process we need the ability to set
up temporary mappings, before our normal mechanisms are
initialized.  Currently this is used to map pages that
are part of the page tables we are building and pages
during the dmi scan.

The core problem is that we are using the user portion of
the page tables to implement this.  Which means that while
this mechanism is active we cannot catch NULL pointer dereferences
and we deviate from the normal ways of handling things.

In this patch I modify early_ioremap to map pages into
the kernel portion of the address space, roughly where
we will later put modules.  I also make the discovery of
which addresses we can use dynamic, which removes all
kinds of static limits and removes the dependencies
on implementation details between different parts of the code.

Now alloc_low_page() and unmap_low_page() use
early_ioremap() and early_iounmap() to allocate/map and
unmap a page.
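
The dynamic search starts by working out how many 2M pmd entries a
mapping needs.  The user-space sketch below reproduces the arithmetic
that the new early_ioremap() uses for its pmds count.  The sketch_ names
and the hard-coded 2M PMD size are assumptions for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed 2M PMD geometry for illustration. */
#define SKETCH_PMD_SHIFT 21
#define SKETCH_PMD_SIZE  (1ULL << SKETCH_PMD_SHIFT)
#define SKETCH_PMD_MASK  (~(SKETCH_PMD_SIZE - 1))

/*
 * Number of 2M entries needed to cover [addr, addr + size):
 * the offset of addr within its 2M page, plus the size, rounded up to 2M.
 */
static uint64_t sketch_pmds_needed(uint64_t addr, uint64_t size)
{
    assert(size > 0);
    return ((addr & ~SKETCH_PMD_MASK) + size + ~SKETCH_PMD_MASK) / SKETCH_PMD_SIZE;
}
```

A single page well inside a 2M region needs one entry, while a request
that straddles a 2M boundary needs two, even if it is only two pages
long.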

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]>
---

 arch/x86_64/kernel/head.S |3 -
 arch/x86_64/mm/init.c |  100 --
 2 files changed, 45 insertions(+), 58 deletions(-)

diff -puN arch/x86_64/kernel/head.S~x86_64-Kill-temp_boot_pmds 
arch/x86_64/kernel/head.S
--- linux-2.6.19-rc6-reloc/arch/x86_64/kernel/head.S~x86_64-Kill-temp_boot_pmds 
2006-11-17 00:05:55.0 -0500
+++ linux-2.6.19-rc6-reloc-root/arch/x86_64/kernel/head.S   2006-11-17 
00:05:55.0 -0500
@@ -280,9 +280,6 @@ NEXT_PAGE(level2_ident_pgt)
.quad   i << 21 | 0x083
i = i + 1
.endr
-   /* Temporary mappings for the super early allocator in arch/x86_64/mm/init.c */
-   .globl temp_boot_pmds
-temp_boot_pmds:
.fill   492,8,0

 NEXT_PAGE(level2_kernel_pgt)
diff -puN arch/x86_64/mm/init.c~x86_64-Kill-temp_boot_pmds arch/x86_64/mm/init.c
--- linux-2.6.19-rc6-reloc/arch/x86_64/mm/init.c~x86_64-Kill-temp_boot_pmds 
2006-11-17 00:05:55.0 -0500
+++ linux-2.6.19-rc6-reloc-root/arch/x86_64/mm/init.c   2006-11-17 
00:05:55.0 -0500
@@ -167,23 +167,9 @@ __set_fixmap (enum fixed_addresses idx, 
 
 unsigned long __initdata table_start, table_end; 
 
-extern pmd_t temp_boot_pmds[]; 
-
-static  struct temp_map { 
-   pmd_t *pmd;
-   void  *address; 
-   intallocated; 
-} temp_mappings[] __initdata = { 
-   { _boot_pmds[0], (void *)(40UL * 1024 * 1024) },
-   { _boot_pmds[1], (void *)(42UL * 1024 * 1024) }, 
-   {}
-}; 
-
-static __meminit void *alloc_low_page(int *index, unsigned long *phys)
+static __meminit void *alloc_low_page(unsigned long *phys)
 { 
-   struct temp_map *ti;
-   int i; 
-   unsigned long pfn = table_end++, paddr; 
+   unsigned long pfn = table_end++;
void *adr;
 
if (after_bootmem) {
@@ -194,57 +180,63 @@ static __meminit void *alloc_low_page(in
 
if (pfn >= end_pfn) 
panic("alloc_low_page: ran out of memory"); 
-   for (i = 0; temp_mappings[i].allocated; i++) {
-   if (!temp_mappings[i].pmd) 
-   panic("alloc_low_page: ran out of temp mappings"); 
-   } 
-   ti = _mappings[i];
-   paddr = (pfn << PAGE_SHIFT) & PMD_MASK; 
-   set_pmd(ti->pmd, __pmd(paddr | _KERNPG_TABLE | _PAGE_PSE)); 
-   ti->allocated = 1; 
-   __flush_tlb(); 
-   adr = ti->address + ((pfn << PAGE_SHIFT) & ~PMD_MASK); 
+
+   adr = early_ioremap(pfn * PAGE_SIZE, PAGE_SIZE);
memset(adr, 0, PAGE_SIZE);
-   *index = i; 
-   *phys  = pfn * PAGE_SIZE;  
-   return adr; 
-} 
+   *phys  = pfn * PAGE_SIZE;
+   return adr;
+}
 
-static __meminit void unmap_low_page(int i)
+static __meminit void unmap_low_page(void *adr)
 { 
-   struct temp_map *ti;
 
if (after_bootmem)
return;
 
-   ti = _mappings[i];
-   set_pmd(ti->pmd, __pmd(0));
-   ti->allocated = 0; 
+   early_iounmap(adr, PAGE_SIZE);
 } 
 
 /* Must run before zap_low_mappings */
 __init void *early_ioremap(unsigned long addr, unsigned long size)
 {
-   unsigned long map = round_down(addr, LARGE_PAGE_SIZE); 
-
-   /* actually usually some more */
-   if (size >= LARGE_PAGE_SIZE) { 
-   return NULL;
+   unsigned long vaddr;
+   pmd_t *pmd, *last_pmd;
+   int i, pmds;
+
+   pmds = ((addr & ~PMD_MASK) + size + ~PMD_MASK) / PMD_SIZE;
+   vaddr = __START_KERNEL_map;
+   pmd = level2_kernel_pgt;
+   last_pmd = level2_kernel_pgt + PTRS_PER_PMD - 1;
+   for (; pmd <= last_pmd; pmd++, vaddr += PMD_SIZE) {
+   for (i = 0; i < pmds; i++) {
+   if (pmd_present(pmd[i]))
+   goto next;
+   }
+   vaddr += addr & ~PMD_MASK;
+   addr &= PMD_MASK;
+   for (i = 0; i < pmds; i++, addr += PMD_SIZE)
+   set_pmd(pmd + i,__pmd(addr | 

[PATCH 14/20] x86_64: Modify discover_ebda to use virtual address

2006-11-17 Thread Vivek Goyal


Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]>
---

 arch/x86_64/kernel/setup.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff -puN 
arch/x86_64/kernel/setup.c~x86_64-Modify-discover_ebda-to-use-virtual-addresses 
arch/x86_64/kernel/setup.c
--- 
linux-2.6.19-rc6-reloc/arch/x86_64/kernel/setup.c~x86_64-Modify-discover_ebda-to-use-virtual-addresses
  2006-11-17 00:11:14.0 -0500
+++ linux-2.6.19-rc6-reloc-root/arch/x86_64/kernel/setup.c  2006-11-17 
00:11:14.0 -0500
@@ -327,10 +327,10 @@ static void discover_ebda(void)
 * there is a real-mode segmented pointer pointing to the 
 * 4K EBDA area at 0x40E
 */
-   ebda_addr = *(unsigned short *)EBDA_ADDR_POINTER;
+   ebda_addr = *(unsigned short *)__va(EBDA_ADDR_POINTER);
ebda_addr <<= 4;
 
-   ebda_size = *(unsigned short *)(unsigned long)ebda_addr;
+   ebda_size = *(unsigned short *)__va(ebda_addr);
 
/* Round EBDA up to pages */
if (ebda_size == 0)
_


[RFC: 2.6 patch] remove broken video drivers

2006-11-17 Thread Adrian Bunk
This patch removes video drivers that:
- had already been marked as BROKEN in 2.6.0 three years ago and
- are still marked as BROKEN.

These are the following drivers:
- FB_CYBER
- FB_VIRGE
- FB_RETINAZ3
- FB_ATARI
- FB_SUN3
- FB_PM3

Drivers that have been marked as BROKEN for such a long time seem 
unlikely to be revived in the foreseeable future.

But if anyone wants to ever revive any of these drivers, the code is 
still present in the older kernel releases.

Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

---

Due to its size, the patch is available at
  http://ftp.kernel.org/pub/linux/kernel/people/bunk/misc/patch-remove-broken-video.gz

 drivers/video/Kconfig   |   62 
 drivers/video/Makefile  |6 
 drivers/video/atafb.c   | 3107 --
 drivers/video/cyberfb.c | 2297 -
 drivers/video/cyberfb.h |  415 
 drivers/video/pm3fb.c   | 3641 
 drivers/video/retz3fb.c | 1588 -
 drivers/video/retz3fb.h |  286 ---
 drivers/video/sun3fb.c  |  702 ---
 drivers/video/virgefb.c | 2526 ---
 drivers/video/virgefb.h |  288 ---
 include/video/pm3fb.h   | 1235 -
 12 files changed, 3 insertions(+), 16150 deletions(-)



[PATCH 7/20] x86_64: cleanup segments

2006-11-17 Thread Vivek Goyal


Move __KERNEL32_CS up into the unused gdt entry.  __KERNEL32_CS is
used when entering the kernel so putting it first is useful when
trying to keep boot gdt sizes to a minimum.

Set the accessed bit on all gdt entries.  We don't care about the
accessed bit, so there is no need for the cpu to burn the extra cycles
setting it, and it potentially allows the pages to be immutable.  Plus
it is confusing when you are debugging and your gdt entries mysteriously
change.
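
For readers unfamiliar with the descriptor layout, the sketch below
shows what pre-setting the accessed bit means: the access byte sits in
bits 40-47 of a descriptor, and its lowest bit is the accessed flag, so
0x9a becomes 0x9b and 0x92 becomes 0x93.  The sketch_ names are
hypothetical, and the full-width descriptor values in the usage note
are assumed examples rather than values taken from this patch.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_DESC_ACCESS_SHIFT 40
#define SKETCH_DESC_ACCESSED     (1ULL << 40) /* lowest bit of the type field */
#define SKETCH_DESC_S_BIT        (1ULL << 44) /* set for code/data descriptors */

/* Extract the access byte (type, S, DPL, P) from a 64-bit descriptor. */
static uint8_t sketch_access_byte(uint64_t desc)
{
    return (uint8_t)((desc >> SKETCH_DESC_ACCESS_SHIFT) & 0xff);
}

/* Pre-set the accessed bit so the cpu never has to write it back itself. */
static uint64_t sketch_preset_accessed(uint64_t desc)
{
    assert(desc & SKETCH_DESC_S_BIT); /* only meaningful for code/data descriptors */
    return desc | SKETCH_DESC_ACCESSED;
}
```

For example, sketch_preset_accessed(0x00cf9a000000ffffULL) yields
0x00cf9b000000ffffULL.  Since the bit is already set, the cpu has no
reason to write to the table when the segment is loaded.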

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]>
---

 arch/x86_64/kernel/head.S|   12 ++--
 include/asm-x86_64/segment.h |2 +-
 2 files changed, 7 insertions(+), 7 deletions(-)

diff -puN arch/x86_64/kernel/head.S~x86_64-cleanup-segments 
arch/x86_64/kernel/head.S
--- linux-2.6.19-rc6-reloc/arch/x86_64/kernel/head.S~x86_64-cleanup-segments
2006-11-17 00:07:57.0 -0500
+++ linux-2.6.19-rc6-reloc-root/arch/x86_64/kernel/head.S   2006-11-17 
00:07:57.0 -0500
@@ -354,13 +354,13 @@ gdt:

 ENTRY(cpu_gdt_table)
.quad   0x  /* NULL descriptor */
+   .quad   0x00cf9b00  /* __KERNEL32_CS */
+   .quad   0x00af9b00  /* __KERNEL_CS */
+   .quad   0x00cf9300  /* __KERNEL_DS */
+   .quad   0x00cffb00  /* __USER32_CS */
+   .quad   0x00cff300  /* __USER_DS, __USER32_DS  */
+   .quad   0x00affb00  /* __USER_CS */
.quad   0x0 /* unused */
-   .quad   0x00af9a00  /* __KERNEL_CS */
-   .quad   0x00cf9200  /* __KERNEL_DS */
-   .quad   0x00cffa00  /* __USER32_CS */
-   .quad   0x00cff200  /* __USER_DS, __USER32_DS  */   
-   .quad   0x00affa00  /* __USER_CS */
-   .quad   0x00cf9a00  /* __KERNEL32_CS */
.quad   0,0 /* TSS */
.quad   0,0 /* LDT */
.quad   0,0,0   /* three TLS descriptors */ 
diff -puN include/asm-x86_64/segment.h~x86_64-cleanup-segments 
include/asm-x86_64/segment.h
--- linux-2.6.19-rc6-reloc/include/asm-x86_64/segment.h~x86_64-cleanup-segments 
2006-11-17 00:07:57.0 -0500
+++ linux-2.6.19-rc6-reloc-root/include/asm-x86_64/segment.h2006-11-17 
00:07:57.0 -0500
@@ -6,7 +6,7 @@
 #define __KERNEL_CS0x10
 #define __KERNEL_DS0x18
 
-#define __KERNEL32_CS   0x38
+#define __KERNEL32_CS   0x08
 
 /* 
  * we cannot use the same code segment descriptor for user and kernel
_


[PATCH 1/20] x86_64: Align data segment to PAGE_SIZE boundary

2006-11-17 Thread Vivek Goyal


o Explicitly align the data segment to a PAGE_SIZE boundary.  Otherwise,
  depending on config options and the tool chain, it might be placed on a
  non-PAGE_SIZE-aligned boundary, and vmlinux loaders like kexec fail when
  they encounter a PT_LOAD type segment that is not aligned to a PAGE_SIZE
  boundary.
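
  As a minimal sketch of the arithmetic involved (not the loader's or the
  linker's actual code), the helpers below round an address up to the next
  boundary, which is what ALIGN(PAGE_SIZE) does in the linker script, and
  check whether an address is already aligned.  The names and the 4K
  EXAMPLE_PAGE_SIZE value are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed 4K page size, used only for this illustration. */
#define EXAMPLE_PAGE_SIZE 4096ull

/*
 * Round addr up to the next multiple of align.
 * align must be a power of two.  Hypothetical helper.
 */
static inline uint64_t align_up(uint64_t addr, uint64_t align)
{
	return (addr + align - 1) & ~(align - 1);
}

/* True if addr is a multiple of align (align must be a power of two). */
static inline bool is_aligned(uint64_t addr, uint64_t align)
{
	return (addr & (align - 1)) == 0;
}
```

  A loader that insists on page-aligned PT_LOAD segments would reject a
  segment starting at 0x1234, since is_aligned(0x1234, EXAMPLE_PAGE_SIZE)
  is false, whereas align_up(0x1234, EXAMPLE_PAGE_SIZE) yields 0x2000,
  which passes the same check.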

Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]>
---

 arch/x86_64/kernel/vmlinux.lds.S |    1 +
 1 file changed, 1 insertion(+)

diff -puN arch/x86_64/kernel/vmlinux.lds.S~x86_64-align-data-segment-to-4K-boundary arch/x86_64/kernel/vmlinux.lds.S
--- linux-2.6.19-rc6-reloc/arch/x86_64/kernel/vmlinux.lds.S~x86_64-align-data-segment-to-4K-boundary 2006-11-17 00:05:06.0 -0500
+++ linux-2.6.19-rc6-reloc-root/arch/x86_64/kernel/vmlinux.lds.S 2006-11-17 00:05:06.0 -0500
@@ -60,6 +60,7 @@ SECTIONS
   }
 #endif
 
+  . = ALIGN(PAGE_SIZE);/* Align data segment to page size boundary */
/* Data */
   .data : AT(ADDR(.data) - LOAD_OFFSET) {
*(.data)
_


[RFC: 2.6 patch] remove the broken VIDEO_ZR36120 driver

2006-11-17 Thread Adrian Bunk
The VIDEO_ZR36120 driver:
- was already marked as BROKEN in 2.6.0 three years ago and
- is still marked as BROKEN.

Drivers that have been marked as BROKEN for such a long time are
unlikely to be revived in the foreseeable future.

If anyone ever wants to revive this driver, the code is still
present in the older kernel releases.

Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

---

Due to its size, the patch is attached compressed.

 Documentation/video4linux/zr36120.txt |  162 --
 drivers/media/video/Kconfig   |   12 
 drivers/media/video/Makefile  |2 
 drivers/media/video/zr36120.c | 2079 --
 drivers/media/video/zr36120.h |  279 ---
 drivers/media/video/zr36120_i2c.c |  132 -
 drivers/media/video/zr36120_mem.c |   78 
 drivers/media/video/zr36120_mem.h |3 
 8 files changed, 2747 deletions(-)



patch-remove-broken-VIDEO_ZR36120.gz
Description: Binary data

