Re: 2.6.11-mm2 + Radeon crash

2005-03-09 Thread Benjamin Herrenschmidt
On Wed, 2005-03-09 at 21:12 +0100, Christian Henz wrote:
> Hi, 
> 
> I wanted to try 2.6.11-mm2 for the low latency/realtime lsm stuff and
> I've run into a severe
> problem.
> 
> When I try to start X, my machine reboots. The screen goes dark as
> usual when setting the video mode, but then I get a beep and I'm
> greeted with the BIOS boot messages. This happened 4/5 times i've
> tried, and once the video mode was actually set (at least I saw the
> usual X b/w pattern with some random framebuffer garbage), the machine
> didn't reboot but after that nothing happened. My keyboard was still
> responsive (ie NumLock LED would still go on/off), but i could neither
> kill X with CTRL-ALT-BACKSPACE nor could i switch back to console, so
> I ended up pressing reset.
> 
> After the crashes I booted with a rescue CD to examine the logs, but I
> could not find any obvious errors.
> 
> Everything works nicely on 2.6.10 and earlier kernels. I'm in the
> process of building 2.6.11.2 to see if the crash occurs there.

So ?

Ben.


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fastboot] [RFC] Kdump: Dump Capture Mechanism

2005-03-09 Thread Eric W. Biederman
Vivek Goyal <[EMAIL PROTECTED]> writes:

> Hi,
> 
> Well this discussion has been going on for quite sometime now that
> what's the best way to capture the dump? There seems to be two lines of
> arguments.
> 
> Export ELF view through /proc/vmcore
> 
> This basically involves retrieving saved core image headers and
> exporting those through /proc/vmcore interface. Further user space
> applications can be built on top of it to do advanced processing.
> 
> Do Everything in user space
> ---
> The whole idea is that do all the processing from user space (preferably
> from ramdisk or so).
> 
> When it comes to requirements, Distros and developers seem to be having
> somewhat different requirements.
> 
> Distros:
> ---
> - Fully automate the dump generation/capture process.
> - Configure everything in advance (like, dump storage location).
> - Upon crash, store dump image at pre-configured location and reboot
>   into production kernel ASAP.
> 
> Developers:
> --
> - Keyword is simple and easy to use solution.
> - Should work well in a development environment where, not necessarily
>   all the components (user space, kernel space) are in perfect harmony
>   and things are yet to be stabilized.
> 
> IMO, exporting /proc/vmcore is a good idea. It offers wide variety of
> choices to both developers and distros.
> 
> - It provides the basic dump capturing mechanism in kernel.
> 
> - Developers can store the dump image locally (cp) or transfer it over
>   network (scp, ftp) using standard utilities and don't have to deal
>   with additional user space utilites specifically designed for this
>   purpose.

No, just a magic kernel interface.

And with /sbin/kdump as written this already works.
You just need to use a pipe.
kdump |ftp - [EMAIL PROTECTED]
 
> - Developers can directly run gdb on /proc/vmcore generated image and
>   do the limited debugging without need of any other dump
>   capture/analysis utility.

There are a lot of disharmonies that can happen here.  32bit dump
kernel 64bit dumped kernel etc.  Or vice versa.  Those kinds of things
can cause problems if you are not careful.

And even worse you can't directly use gdb unmodified on 32bit systems
because it will be a 64bit core dump, with unreliable virtual addresses.
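
A minimal sketch of the class mismatch described above, assuming only the e_ident[] layout from the ELF specification. The helper names core_elf_bits() and core_needs_conversion() are hypothetical and are not part of kdump or gdb; the sketch only shows how a tool could tell a 64bit core from a 32bit one before handing it to a debugger.

```c
#include <stddef.h>

/* Offsets and values taken from the ELF specification's e_ident[] layout. */
enum { SK_EI_CLASS = 4, SK_ELFCLASS32 = 1, SK_ELFCLASS64 = 2 };

/*
 * Return 32 or 64 for a valid ELF header prefix, or 0 if the buffer is
 * too short, lacks the ELF magic, or carries an unknown class.
 * Hypothetical helper for illustration only.
 */
static int core_elf_bits(const unsigned char *ident, size_t len)
{
    if (len <= SK_EI_CLASS)
        return 0;
    if (ident[0] != 0x7f || ident[1] != 'E' ||
        ident[2] != 'L' || ident[3] != 'F')
        return 0;
    if (ident[SK_EI_CLASS] == SK_ELFCLASS32)
        return 32;
    if (ident[SK_EI_CLASS] == SK_ELFCLASS64)
        return 64;
    return 0;
}

/* Nonzero when a debugger built for native_bits cannot read the core as-is. */
static int core_needs_conversion(const unsigned char *ident, size_t len,
                                 int native_bits)
{
    int bits = core_elf_bits(ident, len);
    return bits != 0 && bits != native_bits;
}
```

A capture tool could call core_needs_conversion() with native_bits set to 32 to decide whether a conversion step is needed before running an unmodified 32bit gdb.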
 
> - Distros can build additional fully automated dump saving solutions on
>   top of /proc/vmcore.  Be it a init script or a custom initial ramdisk
>   or something else...
> 
> So the whole idea is, that /proc/vmcore and user space solutions can co-
> exist. And let the user/distros choose between these based on their
> requirements.
> 
> I was planning to implement /proc/vmcore. Do you have any comments or
> suggestions?

My gut reaction is that a server that talks the gdb-stubs protocol
to export the core dump in read-only mode has about the
same advantages, and doesn't require a kernel patch.  Plus
it does not require a thorough audit.  I freely admit that
having to copy the entire core dump can be overkill.

Anything in the kernel that we use for crash dump capture
must be hardened, against all manner of insanity.  Placing something
in user space hugely limits the worst that can happen, facilitates
debugging and facilitates auditing for reliability, enormously.

Every line of code that does not go into the kernel now,
saves days of debugging later.  Every kernel abomination
I have had to look at has been because too much code
was in the kernel. lustre/openib gen1/drbd.  I like running
my slow path code in the virtual machine that is a process.
So many problems are ruled out immediately.

Without /proc/vmcore a crash capture kernel will simply be a robust
kernel that can run at a non-default address.  With /proc/vmcore
we have transformed the kernel into a special purpose tool,
that really can't be used for other things.

So if we can do it in user space let's do it there.

For simplicity I do support putting it all in one user space
package so only one set of tools needs to be learned.

If you are going to do something, I recommend making it its
own in-kernel filesystem.  That is the only semi-sane way I can see
to wrap magic /proc/vmcore.

Eric

p.s. I still remember several of the kludges in the current
crashdump-* patches.  They don't set expectations for 
a robust, clean, and minimal kernel implementation.


Re: current linus bk, error mounting root

2005-03-09 Thread Jens Axboe
On Wed, Mar 09 2005, Jon Smirl wrote:
> On Wed, 9 Mar 2005 22:09:26 +0100, Jens Axboe <[EMAIL PROTECTED]> wrote:
> > probably not worth the bother, looks like barrier problems. get the
> > serial console running instead and send the full output, I'll take a
> > look in the morning.
> 
> serial console boot output attached.

Hmm ok, nothing of interest there. What does the mount error 6 and 2
from  your original mail mean? I need some more info on what fails
specifically. What mount options are used? What partition is mounted (is
it md or hdaX)?

I'm not sure -bk5 had the follow up fix patch for the barrier rework,
you should probably just retry with -bk6 first.

-- 
Jens Axboe



64-bit resources?

2005-03-09 Thread Kumar Gala
Greg,
I was wondering what the state of the change to 64-bit resources was?
thanks
- kumar


Re: [PATCH 2/2] No-exec support for ppc64

2005-03-09 Thread Benjamin Herrenschmidt
On Wed, 2005-03-09 at 21:25 -0600, Olof Johansson wrote:
> Hi,
> 
> On Tue, Mar 08, 2005 at 05:13:26PM -0600, Jake Moilanen wrote:
> > diff -puN arch/ppc64/mm/hash_utils.c~nx-kernel-ppc64 arch/ppc64/mm/hash_utils.c
> > --- linux-2.6-bk/arch/ppc64/mm/hash_utils.c~nx-kernel-ppc64 2005-03-08 16:08:57 -06:00
> > +++ linux-2.6-bk-moilanen/arch/ppc64/mm/hash_utils.c 2005-03-08 16:08:57 -06:00
> > @@ -89,12 +90,23 @@ static inline void loop_forever(void)
> > ;
> >  }
> >  
> > +int is_kernel_text(unsigned long addr)
> > +{
> > +   if (addr >= (unsigned long)_stext && addr < (unsigned long)__init_end)
> > +   return 1;
> > +
> > +   return 0;
> > +}
> 
> This is used in two files, but never declared extern in the second file
> (iSeries_setup.c). Should it go in a header file as a static inline
> instead?

Yes, I think it should.
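
A rough userspace sketch of the static inline variant being suggested. The section boundaries here are stand-ins (sketch_stext and sketch_init_end are hypothetical names pointing into a dummy buffer); in the kernel the bounds would be the linker-provided _stext and __init_end symbols, and the function would live in a shared header rather than in hash_utils.c.

```c
/*
 * Stand-ins for the linker-provided section symbols. Assumption: in the
 * kernel these come from the linker script, not from C definitions.
 */
static char sketch_image[4096];
#define sketch_stext     (sketch_image + 256)
#define sketch_init_end  (sketch_image + 2048)

/*
 * Header-style definition: every file that includes it gets its own
 * inlined copy, so no extern declaration is needed in the second user.
 */
static inline int is_kernel_text(unsigned long addr)
{
    return addr >= (unsigned long)sketch_stext &&
           addr < (unsigned long)sketch_init_end;
}
```

The lower bound is inclusive and the upper bound exclusive, matching the comparison in the quoted patch.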

> There also seems to be a local static is_kernel_text() in kallsyms that
> overlaps (but it's not identical). Removing that redundancy can be taken
> care of as a janitorial patch outside of the noexec stuff.
> 
> 
> 
> -Olof
> ___
> Linuxppc64-dev mailing list
> [EMAIL PROTECTED]
> https://ozlabs.org/cgi-bin/mailman/listinfo/linuxppc64-dev
-- 
Benjamin Herrenschmidt <[EMAIL PROTECTED]>



Re: [Fastboot] Re: Query: Kdump: Core Image ELF Format

2005-03-09 Thread Eric W. Biederman
Dipankar Sarma <[EMAIL PROTECTED]> writes:

> On Wed, Mar 09, 2005 at 07:17:49AM -0700, Eric W. Biederman wrote:

> > Beyond that I prefer a little command line tool that will do the
> > ELF64 to ELF32 conversion and possibly add in the kva mapping to
> > make the core dump usable with gdb.  Doing it in a separate tool
> > means it is the developer who is doing the analysis who cares
> > not the user who is capturing the system core dump.
> 
> Well, as a kernel developer, I am both :) For me, having to install
> half-a-dozen different command line tools to get and analyze a crash dump
> is a PITA, not to mention potential version mismatches. As someone
> who would like very much to use crash dump for debugging, I would
> much rather be able to force a dump and then use gdb for
> a quick debug. I agree that a customer would see a different
> situation. It would be nice if we can cater to both the kinds.

Crash dumps seem to be a when-all-else-fails kind of solution.  Or
something to make a detailed record of what happened so information
can be logged.

I think our differences are largely a matter of emphasis.  I worry
about the end user and the whole cycle.  For that we need a fixed
simple crash dump format with no knobs, bells or whistles, that can
be given to developers and eventually supported natively by all of
the tools.

I doubt tweaking gdb so it can handle a 64bit ELF core dump even
on 32bit architectures would be very hard.  Once that is in place
the 64->32bit conversion doesn't matter.  The virtual addresses
matter a little more although I am more inclined to teach gdb
about the physical and virtual address differences of whole machine
crash dumps.

I do agree that the ability to tweak things in the short term
to work like a process that does not have the virtual/physical address
distinction is useful.  

The issue of tool versioning problems is bogus.  That is solved
by simply picking a good interface between tools and sticking
with it.  Occasionally there will be paradigm shifts (like threading)
that will cause change, but generally everything will stay the same.
In addition if tools are distributed together it does not matter,
if there are several of them because there is only one update.

Eric


Re: [Fastboot] Re: Query: Kdump: Core Image ELF Format

2005-03-09 Thread Eric W. Biederman
Vivek Goyal <[EMAIL PROTECTED]> writes:

> I want to fill the virtual addresses of linearly mapped region. That is
> physical addresses from 0 to MAXMEM (896 MB) are mapped by kernel at
> virtual addresses PAGE_OFFSET to (PAGE_OFFSET + MAXMEM). Values of
> PAGE_OFFSET and MAXMEM are already known and hard-coded.

PAGE_OFFSET has a common value of 0xc0000000 on x86.  However
that value is by no means fixed.  The 4G/4G split changes it
as do some other patches floating around at the time.
On x86-64 I don't know how stable those kinds of offsets are.
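
To make the point concrete, here is a small sketch of the linear mapping with the offset passed in as data rather than hard-coded. struct linear_map and linear_phys_to_virt() are hypothetical names for illustration; the 0xc0000000 and 896 MB figures are the common x86 defaults discussed in this thread, and the second split in the usage note is just one example of a kernel built differently.

```c
#include <stdint.h>

/* One linear ("lowmem") mapping; the values are supplied by the caller. */
struct linear_map {
    uint64_t page_offset; /* virtual address where physical 0 is mapped */
    uint64_t maxmem;      /* bytes of physical memory covered by the mapping */
};

/*
 * Translate a physical address inside the linear region.
 * Returns 0 and stores the result in *virt, or -1 if phys is out of range.
 */
static int linear_phys_to_virt(const struct linear_map *m, uint64_t phys,
                               uint64_t *virt)
{
    if (phys >= m->maxmem)
        return -1;
    *virt = m->page_offset + phys;
    return 0;
}
```

With page_offset 0xc0000000, physical 0x100000 maps to 0xc0100000; with a 2G/2G style offset of 0x80000000 the same physical address maps to 0x80100000, which is why a dump tool that assumes one constant can silently produce wrong virtual addresses.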
 
> I think I used the terminology kernel virtual address and that is adding
> to the confusion. Kernel virtual addresses are not necessarily linearly
> mapped. What I meant was kernel logical addresses whose associated
> physical addresses differ only by a constant offset.

I know what you meant.  I simply meant that things don't look that
constant to me.  Especially in Linux where there are enough people
to try most of the reasonable possibilities.

I don't even think it is a bad idea.  But I do think we have a different
idea of what is constant.

Eric


Re: [patch -mm series] ia64 specific /dev/mem handlers

2005-03-09 Thread Andrew Morton
Jes Sorensen <[EMAIL PROTECTED]> wrote:
>
> Convert /dev/mem read/write calls to use arch_translate_mem_ptr if
>  available. Needed on ia64 for pages converted to uncached mappings to
>  avoid it being accessed in cached mode after the conversion which can
>  lead to memory corruption. Introduces PG_uncached page flag for
>  marking pages uncached.

For some reason this patch still gives me the creeps.  Maybe it's because
we lose a page flag for something so obscure.

Nothing ever clears PG_uncached.  We'll end up with every page in the
machine marked as being uncached.

But then, nothing ever sets PG_uncached, either.  Is there some patch which
you're hiding from me?

If a page is marked uncached then it'll remain marked as uncached even
after it's unmapped.  Or will it?  Would like to see the other patch, please.

We should add PG_uncached checks to the page allocator.  Is this OK?


--- 25/mm/page_alloc.c~ia64-specific-dev-mem-handlers-checks 2005-03-09 22:53:12.0 -0800
+++ 25-akpm/mm/page_alloc.c 2005-03-09 22:53:44.0 -0800
@@ -108,6 +108,7 @@ static void bad_page(const char *functio
1 << PG_active  |
1 << PG_dirty   |
1 << PG_swapcache |
+   1 << PG_uncached |
1 << PG_writeback);
set_page_count(page, 0);
reset_page_mapcount(page);
@@ -321,6 +322,7 @@ static inline void free_pages_check(cons
1 << PG_reclaim |
1 << PG_slab|
1 << PG_swapcache |
+   1 << PG_uncached |
1 << PG_writeback )))
bad_page(function, page);
if (PageDirty(page))
@@ -446,6 +448,7 @@ static void prep_new_page(struct page *p
1 << PG_dirty   |
1 << PG_reclaim |
1 << PG_swapcache |
+   1 << PG_uncached |
1 << PG_writeback )))
bad_page(__FUNCTION__, page);
 
_
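
For readers unfamiliar with these checks, a userspace sketch of the idea: a page being freed must not carry any flag from a forbidden set, and the patch above adds the uncached flag to that set. The bit positions and the SK_ names below are illustrative assumptions only; the real values live in include/linux/page-flags.h and the real check calls bad_page().

```c
/* Illustrative bit positions; NOT the kernel's actual numbering. */
enum {
    SK_PG_dirty = 0,
    SK_PG_reclaim,
    SK_PG_slab,
    SK_PG_swapcache,
    SK_PG_writeback,
    SK_PG_uncached
};

/* Flags that must be clear when a page is handed back to the allocator. */
#define SK_BAD_ON_FREE ( \
    1UL << SK_PG_reclaim   | \
    1UL << SK_PG_slab      | \
    1UL << SK_PG_swapcache | \
    1UL << SK_PG_uncached  | \
    1UL << SK_PG_writeback)

/* Nonzero if the page would be reported as bad on free. */
static int sketch_free_pages_check(unsigned long flags)
{
    return (flags & SK_BAD_ON_FREE) != 0;
}
```

With such a check in place, a page that is freed while still marked uncached is reported immediately instead of being reused with a stale flag.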



Re: huge filesystems

2005-03-09 Thread Chris Wedgwood
On Wed, Mar 09, 2005 at 10:53:48AM -0800, Dan Stromberg wrote:

> My question is, what is the current status of huge filesystems - IE,
> filesystems that exceed 2 terabytes, and hopefully also exceeding 16
> terabytes?

people can and do have >2T filesystems now.  some people on x86 have
hit the 16TB limit and others are larger still with 64-bit CPUs

> Am I correct in assuming that the usual linux buffer cache only goes
> to 16 terabytes?

for 32-bit CPUs

> What about the "LBD" patches - what limits are involved there, and
> have they been rolled into a Linus kernel, or one or more vendor
> kernels?

LBD is in 2.6.x and is required for >2TB but sometimes that means >1TB
or even smaller depending on the drivers

many drivers simply won't go above 2T even with CONFIG_LBD so you need
to poke about and see what works for you (or use md/raid to glue
together multiple 2TB volumes)
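
The limits above fall out of simple index arithmetic, sketched here under the usual assumptions of 512-byte sectors and 4096-byte pages. max_bytes() is a hypothetical helper; the 31-bit case is only one plausible reading of the ">1TB" remark (a driver doing signed 32-bit sector arithmetic), not something stated in this thread.

```c
#include <stdint.h>

/*
 * Largest addressable size in bytes for an index of index_bits bits
 * counting units of unit_bytes each. Callers must keep the product
 * below 2^64 (true for all the cases discussed here).
 */
static uint64_t max_bytes(unsigned index_bits, uint64_t unit_bytes)
{
    return ((uint64_t)1 << index_bits) * unit_bytes;
}
```

max_bytes(32, 512) gives 2 TiB (a 32-bit sector number without CONFIG_LBD), max_bytes(32, 4096) gives 16 TiB (a 32-bit page index into the page cache), and max_bytes(31, 512) gives 1 TiB.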


Re: [RFC] -stable, how it's going to work.

2005-03-09 Thread Andi Kleen
On Wed, Mar 09, 2005 at 10:34:08AM -0800, Greg KH wrote:
> On Wed, Mar 09, 2005 at 10:56:33AM +0100, Andi Kleen wrote:
> > Greg KH <[EMAIL PROTECTED]> writes:
> > >
> > > Rules on what kind of patches are accepted, and what ones are not, into
> > > the "-stable" tree:
> > >  - It must be obviously correct and tested.
> > >  - It cannot be bigger than 100 lines, with context.
> > 
> > This rule seems silly. What happens when a security fix needs 150 lines? 
> 
> Then we bend the rules and accept it :)
> 
> We'll take these as a case-by-case basis...
> 
> > >  - Security patches will be accepted into the -stable tree directly from
> > >the security kernel team, and not go through the normal review cycle.
> > >Contact the kernel security team for more details on this procedure.
> > 
> > This also sounds like a bad rule. How come the security team has more
> > competence to review patches than the subsystem maintainers?  I can
> > see the point of overruling maintainers on security issues when they
> > > are not responsive, but if they are I think they should still be the
> > main point of contact.
> 
> Security fixes go from the security team to Linus's tree directly, and
> usually the subsystem maintainer has already been notified and has
> reviewed it.  At that point in time, they are public and accepted into

What guarantees that?

Basically what I would like to avoid is that the security team
merges something through the backdoor that the maintainer considers crap.

If anything you should have a rule like

"Send to maintainer. If he doesn't ACK in 24h send it directly"


> mainline, and need to be made available to the -stable users as soon as
> possible.
> 
> That is why the "fast track" is going to happen, the patch really was
> reviewed properly, just not in public :)

Well, when you really want to have such formal rules (which is a novelty in 
Linux space BTW, for many years we did fine with unwritten rules)  then you
should spell it out completely. Or alternatively drop all the formal
rules and do it informally like it was always done.


-Andi


Re: bk commits and dates

2005-03-09 Thread Jeff Garzik
Benjamin Herrenschmidt wrote:
> On Wed, 2005-03-09 at 19:47 -0800, David S. Miller wrote:
> > On Thu, 10 Mar 2005 13:41:59 +1100
> > Benjamin Herrenschmidt <[EMAIL PROTECTED]> wrote:
> >
> > > I don't know if I'm the only one to have a problem with that, but it
> > > would be nice if it was possible, when you pull a bk tree, to have the
> > > commit messages for the csets in that tree be dated from the day you
> > > pulled, and not the day when they went in the source tree.
> >
> > When I'm working, I just do "bk csets" after I pull from Linus's
> > tree to review what went in since the last time I pulled.
>
> Yes, but the commit list archive is handy. I have quite good search
> capabilities in my mailer for example, and sometimes, when doing
> regression, it's quite useful to browse what went in between two
> releases with it (it's just more handy than bk csets).

Speaking strictly in terms of implementation, David Woodhouse's 
bk-commits mailer scripts could probably easily be tweaked to -not- set 
an explicit Date header on the outgoing emails.

It then becomes a matter of deciding whether this is a good idea or not :)

Jeff



RE: Bug: ll_rw_blk.c, elevator.c and displaying "default" IO Scheduler at boot-time (Cosmetic only)

2005-03-09 Thread Roberts-Thomson, James
Jens/Hari,

The patch which you both supplied does solve the problem.

I'd imagine this patch is probably not "critical" enough for a
2.6.11.x-series patch, but it would be nice to see this included in 2.6.12.

Thanks!

James Roberts-Thomson
--
A synonym is a word you use if you can't spell the other one.
 


Re: [BUG] 2.6.11- sym53c8xx Broken on pp64

2005-03-09 Thread Benjamin Herrenschmidt
On Wed, 2005-03-09 at 19:47 -0800, Linus Torvalds wrote:
> 
> On Wed, 9 Mar 2005, Omkhar Arasaratnam wrote:
> > 
> > I confirmed that this occurs with the 2.6.11 code straight from 
> > kernel.org Here is an error from the bringup:
> 
> So if 2.6.9 works, and 2.6.11 does not, can you check 2.6.10? And perhaps 
> hunt it down even more, to a -rc release?
> 
> > sym0: No NVRAM, ID 7, Fast-80 LVD, parity checking
> > CACHE TEST FAILED: DMA error (dstat=0xa0) .sym0: CACHE INCORRECTLY 
> > CONFIGURED
> > sym0: giving up ...
> 
> There are certainly sym changes in there too since 2.6.9, let's see if 
> James or Willy have any suggestions. It might not be ppc64-specific.

Ok, we have it working here on a similar machine with 2.6.11 and failing
in a similar way with bk which is why I asked ;)

The bk problem is found & fixed here tho. I'll send a patch later, it's
a bug with ppc64 iounmap() not properly flushing the hash table after
the set_pte_at() patch due to some crap in our custom implementation of
that guy.

Here's the patch, but I want to get rid of that stuff anyway (at least
make unmap_vm_area take the "mm", or rather make an unmap_vm_area_mm()
and make unmap_vm_area() just call it and then use that instead of our
own implementation, but I'm waiting for Hugh's cleanup to get in before
touching any of this).

--

This patch fixes a bug in ppc64 local implementation of iounmap() that
would cause it to incorrectly flush the hash table since the changes to
set_pte have been applied.

Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>

Index: working-2.6/arch/ppc64/mm/init.c
===
--- working-2.6.orig/arch/ppc64/mm/init.c   2005-03-07 13:06:23.0 +1100
+++ working-2.6/arch/ppc64/mm/init.c2005-03-10 12:59:50.0 +1100
@@ -288,7 +288,7 @@
 static void unmap_im_area_pte(pmd_t *pmd, unsigned long address,
  unsigned long size)
 {
-   unsigned long end;
+   unsigned long base, end;
pte_t *pte;
 
if (pmd_none(*pmd))
@@ -300,6 +300,7 @@
}
 
pte = pte_offset_kernel(pmd, address);
+   base = address & PMD_MASK;
address &= ~PMD_MASK;
end = address + size;
if (end > PMD_SIZE)
@@ -307,7 +308,7 @@
 
do {
pte_t page;
-   page = ptep_get_and_clear(&ioremap_mm, address, pte);
+   page = ptep_get_and_clear(&ioremap_mm, base + address, pte);
address += PAGE_SIZE;
pte++;
if (pte_none(page))
@@ -321,7 +322,7 @@
 static void unmap_im_area_pmd(pgd_t *dir, unsigned long address,
  unsigned long size)
 {
-   unsigned long end;
+   unsigned long base, end;
pmd_t *pmd;
 
if (pgd_none(*dir))
@@ -333,13 +334,14 @@
}
 
pmd = pmd_offset(dir, address);
+   base = address & PGDIR_MASK;
address &= ~PGDIR_MASK;
end = address + size;
if (end > PGDIR_SIZE)
end = PGDIR_SIZE;
 
do {
-   unmap_im_area_pte(pmd, address, end - address);
+   unmap_im_area_pte(pmd, base + address, end - address);
address = (address + PMD_SIZE) & PMD_MASK;
pmd++;
} while (address < end);
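
A small userspace sketch of why the base is needed. Inside unmap_im_area_pte() the address is reduced to an offset within the PMD, so without the fix the value handed to ptep_get_and_clear() is just that offset rather than the full virtual address. The SK_ constants (including the 21-bit shift) and the function name are hypothetical and chosen only for illustration; they are not the ppc64 values.

```c
#include <stdint.h>

#define SK_PMD_SHIFT 21                          /* hypothetical shift */
#define SK_PMD_SIZE  (UINT64_C(1) << SK_PMD_SHIFT)
#define SK_PMD_MASK  (~(SK_PMD_SIZE - 1))

/*
 * Return the address that would be passed down for the first page.
 * keep_base == 0 models the old code, keep_base != 0 models the fix.
 */
static uint64_t sketch_flush_addr(uint64_t address, int keep_base)
{
    uint64_t base = address & SK_PMD_MASK; /* PMD-aligned part */

    address &= ~SK_PMD_MASK;               /* offset within the PMD */
    return keep_base ? base + address : address;
}
```

For an input of 0x40203000 the old behaviour yields 0x3000 while the fixed one yields 0x40203000, so a hash flush keyed on the address only targets the right entry in the second case.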


 

Ben.



2.6.11-mm2 + Radeon crash

2005-03-09 Thread Christian Henz
Hi, 

I wanted to try 2.6.11-mm2 for the low latency/realtime lsm stuff and
I've run into a severe
problem.

When I try to start X, my machine reboots. The screen goes dark as
usual when setting the video mode, but then I get a beep and I'm
greeted with the BIOS boot messages. This happened 4/5 times i've
tried, and once the video mode was actually set (at least I saw the
usual X b/w pattern with some random framebuffer garbage), the machine
didn't reboot but after that nothing happened. My keyboard was still
responsive (ie NumLock LED would still go on/off), but i could neither
kill X with CTRL-ALT-BACKSPACE nor could i switch back to console, so
I ended up pressing reset.

After the crashes I booted with a rescue CD to examine the logs, but I
could not find any obvious errors.

Everything works nicely on 2.6.10 and earlier kernels. I'm in the
process of building 2.6.11.2 to see if the crash occurs there.

Here is some info on my system:

I've got an Athlon 1000C on a VIA KT133 chipset and a Radeon 7200 (the
original Radeon with 32MB SDR RAM). I'm running Debian/sid.


homer:/home/chrissi# X -version
X: warning; /dev/dri has unusual mode (not 755) or is not a directory.
XFree86 Version 4.3.0.1 (Debian 4.3.0.dfsg.1-12.0.1 20050223080930
[EMAIL PROTECTED])
Release Date: 15 August 2003
X Protocol Version 11, Revision 0, Release 6.6
Build Operating System: Linux 2.6.9 i686 [ELF]
Build Date: 23 February 2005


lspci -vv says about the Radeon:


0000:01:00.0 VGA compatible controller: ATI Technologies Inc Radeon
R100 QD [Radeon 7200] (prog-if 00 [VGA])
Subsystem: ATI Technologies Inc Radeon 7000/Radeon
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping+ SERR- FastB2B-
Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium
>TAbort- SERR- 
Capabilities: [50] Power Management version 2
Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA
PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-



Please let me know what more data is needed to figure this one out.

Thanks,
Christian Henz

PS: If you reply, please CC me as I'm not subscribed.


Re: [patch 1/1] unified spinlock initialization arch/um/drivers/port_kern.c

2005-03-09 Thread Blaisorblade
On Wednesday 09 March 2005 18:12, Russell King wrote:
> On Wed, Mar 09, 2005 at 10:42:33AM +0100, [EMAIL PROTECTED] wrote:
> > From: <[EMAIL PROTECTED]>
> > Cc: , <[EMAIL PROTECTED]>,
> > <[EMAIL PROTECTED]>, <[EMAIL PROTECTED]>
> >
> > Unify the spinlock initialization as far as possible.

> Are you sure this is really the best option in this instance?
> Sometimes, static data initialisation is more efficient than
> code-based manual initialisation, especially when the memory
> is written to anyway.
Agreed, theoretically, but this was done globally for multiple reasons, for 
instance as preparation for Ingo Molnar's preemption patches. This was 
mentioned on lwn.net:

http://lwn.net/Articles/108719/

Ok?
-- 
Paolo Giarrusso, aka Blaisorblade
Linux registered user n. 292729
http://www.user-mode-linux.org/~blaisorblade




Re: bk commits and dates

2005-03-09 Thread Benjamin Herrenschmidt
On Wed, 2005-03-09 at 19:28 -0800, Linus Torvalds wrote:
> 
> On Thu, 10 Mar 2005, Benjamin Herrenschmidt wrote:
> > 
> > I don't know if I'm the only one to have a problem with that, but it
> > would be nice if it was possible, when you pull a bk tree, to have the
> > commit messages for the csets in that tree be dated from the day you
> > pulled, and not the day when they went in the source tree.
> 
> Nope, that's against how BK works. It's really distributed, so "my" tree 
> has no special meaning, and as such the fact that I pull has no meaning 
> either - it doesn't trigger as anything special.

Yes, but it would be easy to have the messages dated from the day they
are sent :) Even if you put the real commit date in the message itself.
It's really disturbing to receive mails dated a long time in the past
don't you think ?

> The only thing that ends up being special is when it hits the public tree
> which has the trigger to send out the emails. IOW, the date of the _email_
> is special (in that it says when a commit hit the public tree), not
> the commit changesets themselves.

Yes, but the email gets the old date.

> Now, if James trigger scripts set the date of the email by the date of the 
> commit, that sounds like a misfeature, but you'd better talk to James, not 
> me, since he's the one doing that part..

Hah ok. Which James ?

Ben.




Page Fault Scalability patch V19 [0/4]: Overview

2005-03-09 Thread Christoph Lameter

Changelog:

V18->V19 Fall back to obtaining the page table lock before calling
do_wp_page. Keep mark_page_accessed in do_swap_page and
add SetPageReferenced in do_anonymous_page
Diff against 2.6.11.
V17->V18 Rediff against 2.6.11-rc5-bk4
V16->V17 Do not increment page_count in do_wp_page. Performance data
V15->V16 of this patch: Redesign to allow full fallback
for architectures that do not support atomic operations.

Note that this is a release against 2.6.11 and not against the latest
bk tree. Some changes have made it into Linus tree that will require some
rework of the patch for post 2.6.11 (look for V20).

An introduction to what this patch does and a patch archive can be found on
http://oss.sgi.com/projects/page_fault_performance. The archive also has the
result of various performance tests (LMBench, Microbenchmark and
kernel compiles).

The basic approach in this patchset is the same as used in SGI's 2.4.X
based kernels which have been in production use in ProPack 3 for a long time.

The patchset is composed of 4 patches:

1/4: ptep_cmpxchg and ptep_xchg to avoid intermittent zeroing of ptes

The current way of synchronizing with the CPU or arch specific
interrupts updating page table entries is to first set a pte
to zero before writing a new value. This patch uses ptep_xchg
and ptep_cmpxchg to avoid writing the zero for certain
configurations.

The patch introduces CONFIG_ATOMIC_TABLE_OPS that may be
enabled as an experimental feature during kernel configuration
if the hardware is able to support atomic operations and if
an SMP kernel is being configured. A Kconfig update for i386,
x86_64 and ia64 has been provided. On i386 this option is
restricted to CPUs better than a 486 and non PAE mode (that
way all the cmpxchg issues on old i386 CPUs and the problems
with 64bit atomic operations on recent i386 CPUs are avoided). 

If CONFIG_ATOMIC_TABLE_OPS is not set then ptep_xchg and
ptep_cmpxchg are realized by falling back to clearing a pte
before updating it.

The patch does not change the use of mm->page_table_lock and
the only performance improvement is the replacement of
xchg-with-zero-and-then-write-new-pte-value with an xchg with
the new value for SMP on some architectures if
CONFIG_ATOMIC_TABLE_OPS is configured. It should not do anything
major to VM operations.
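
The difference between the two update styles can be sketched in userspace with C11 atomics. This is only an illustration under the assumption that a pte fits in one 64-bit word; sketch_pte_t and the sketch_ function names are hypothetical and the real ptep_xchg and ptep_cmpxchg are arch-specific kernel primitives with different signatures.

```c
#include <stdatomic.h>
#include <stdint.h>

typedef _Atomic uint64_t sketch_pte_t;

/*
 * Fallback style: clear first, then write. Between the two steps any
 * other observer sees a zero entry.
 */
static uint64_t sketch_pte_clear_then_set(sketch_pte_t *ptep, uint64_t newval)
{
    uint64_t old = atomic_exchange(ptep, 0);

    atomic_store(ptep, newval);
    return old;
}

/* Atomic style: one exchange, the entry is never observed as zero. */
static uint64_t sketch_ptep_xchg(sketch_pte_t *ptep, uint64_t newval)
{
    return atomic_exchange(ptep, newval);
}

/* Install newval only if the entry still holds oldval; nonzero on success. */
static int sketch_ptep_cmpxchg(sketch_pte_t *ptep, uint64_t oldval,
                               uint64_t newval)
{
    return atomic_compare_exchange_strong(ptep, &oldval, newval);
}
```

Both styles return the previous value, so callers can be written once against either; the compare-and-exchange variant is what lets a fault handler detect that another CPU changed the entry in the meantime.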

2/4: Macros for mm counter manipulation

There are various approaches to handling mm counters if the
page_table_lock is no longer acquired. This patch defines
macros in include/linux/sched.h to handle these counters and
makes sure that these macros are used throughout the kernel
to access and manipulate rss and anon_rss. There should be
no change to the generated code as a result of this patch.

3/4: Drop the first use of the page_table_lock in handle_mm_fault

The patch introduces two new functions:

page_table_atomic_start(mm), page_table_atomic_stop(mm)

that fall back to the use of the page_table_lock if
CONFIG_ATOMIC_TABLE_OPS is not defined.

If CONFIG_ATOMIC_TABLE_OPS is defined those functions may
be used to prep the CPU for atomic table ops (i386 in PAE mode
may f.e. get the MMX register ready for 64bit atomic ops) but
are simply empty by default.

Two operations may then be performed on the page table without
acquiring the page table lock:

a) updating access bits in pte
b) installing a mapping to the zero page on anonymous read faults.

All counters are still protected with the page_table_lock thus
avoiding any issues there.

Some additional statistics are added to /proc/meminfo,
including a count of spurious faults that have no effect. There
is a surprisingly high number of those on ia64 (used to populate
the cpu caches with the pte??)

4/4: Drop the use of the page_table_lock in do_anonymous_page

The second acquisition of the page_table_lock is removed
from do_anonymous_page, which allows anonymous write faults
to be handled without the page_table_lock.

The macros for manipulating rss and anon_rss in include/linux/sched.h
are changed if CONFIG_ATOMIC_TABLE_OPS is set to use atomic
operations for rss and anon_rss (safest solution for now, other
solutions may easily be implemented by changing those macros).
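
To give an idea of the shape of such macros, here is a user-space
sketch. The names (mm_counter_t, mm_counter_inc and friends, struct
fake_mm) are invented for illustration and are not claimed to match
what the patch puts into include/linux/sched.h; C11 atomics stand in
for the kernel's atomic types.

```c
#include <assert.h>
#include <stdatomic.h>

/*
 * Defined here by hand to mimic a kernel configured with
 * CONFIG_ATOMIC_TABLE_OPS; in the kernel this would come from Kconfig.
 */
#define CONFIG_ATOMIC_TABLE_OPS 1

#ifdef CONFIG_ATOMIC_TABLE_OPS
/* No page_table_lock protects the counters, so update them atomically. */
typedef atomic_long mm_counter_t;
#define mm_counter_get(mm, member)	atomic_load(&(mm)->member)
#define mm_counter_add(mm, member, v)	atomic_fetch_add(&(mm)->member, (v))
#else
/* Counters are protected by a lock, so plain arithmetic is enough. */
typedef long mm_counter_t;
#define mm_counter_get(mm, member)	((mm)->member)
#define mm_counter_add(mm, member, v)	((mm)->member += (v))
#endif

#define mm_counter_inc(mm, member)	mm_counter_add(mm, member, 1)
#define mm_counter_dec(mm, member)	mm_counter_add(mm, member, -1)

/* Hypothetical, heavily reduced stand-in for struct mm_struct. */
struct fake_mm {
	mm_counter_t rss;
	mm_counter_t anon_rss;
};
```

Callers only ever use the macros, so switching the counter
implementation later means touching this one place.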

This patch typically yields significant increases in page fault
performance for threaded applications on SMP systems.



netdev-2.6 queue updated

2005-03-09 Thread Jeff Garzik
All the stuff that was waiting for Linus has been sent.  Here's what's 
sitting around in the netdev-2.6 to "stew" for a bit:

* wireless-2.6 tree (bk://gkernel.bkbits.net/wireless-2.6)
* 8139cp updates
* starfire update
* 8139too iomap conversion
* natsemi long/short cable option (no longer needed?)
* new skge driver from Stephen Hemminger

BK URL, Patch URL, and changelog attached.
Note that the patch is diff'd against 2.6.11-bk6, which will not exist 
until four hours after this email is sent.

Jeff

BK users:

bk pull bk://gkernel.bkbits.net/netdev-2.6

Patch:
http://www.kernel.org/pub/linux/kernel/people/jgarzik/patchkits/2.6/2.6.11-bk6-netdev1.patch.bz2

This will update the following files:

 drivers/net/wireless/ieee802_11.h   |   78 
 MAINTAINERS |7 
 drivers/net/8139cp.c|  100 
 drivers/net/8139too.c   |  194 -
 drivers/net/Kconfig |   12 
 drivers/net/Makefile|1 
 drivers/net/natsemi.c   |   11 
 drivers/net/skge.c  | 3385 ++
 drivers/net/skge.h  | 3005 +++
 drivers/net/starfire.c  |  142 
 drivers/net/starfire_firmware.h |  346 ++
 drivers/net/wireless/Kconfig|2 
 drivers/net/wireless/Makefile   |2 
 drivers/net/wireless/atmel.c|   62 
 drivers/net/wireless/hostap/Kconfig |  104 
 drivers/net/wireless/hostap/Makefile|8 
 drivers/net/wireless/hostap/hostap.c| 1205 +++
 drivers/net/wireless/hostap/hostap.h|   57 
 drivers/net/wireless/hostap/hostap_80211.h  |  107 
 drivers/net/wireless/hostap/hostap_80211_rx.c   | 1080 +++
 drivers/net/wireless/hostap/hostap_80211_tx.c   |  522 +++
 drivers/net/wireless/hostap/hostap_ap.c | 3259 +
 drivers/net/wireless/hostap/hostap_ap.h |  272 +
 drivers/net/wireless/hostap/hostap_common.h |  556 +++
 drivers/net/wireless/hostap/hostap_config.h |   86 
 drivers/net/wireless/hostap/hostap_crypt.c  |  167 +
 drivers/net/wireless/hostap/hostap_crypt.h  |   50 
 drivers/net/wireless/hostap/hostap_crypt_ccmp.c |  486 +++
 drivers/net/wireless/hostap/hostap_crypt_tkip.c |  696 
 drivers/net/wireless/hostap/hostap_crypt_wep.c  |  281 +
 drivers/net/wireless/hostap/hostap_cs.c |  785 +
 drivers/net/wireless/hostap/hostap_download.c   |  761 +
 drivers/net/wireless/hostap/hostap_hw.c | 3607 +++
 drivers/net/wireless/hostap/hostap_info.c   |  469 +++
 drivers/net/wireless/hostap/hostap_ioctl.c  | 3624 
 drivers/net/wireless/hostap/hostap_pci.c|  452 ++
 drivers/net/wireless/hostap/hostap_plx.c|  611 
 drivers/net/wireless/hostap/hostap_proc.c   |  466 +++
 drivers/net/wireless/hostap/hostap_wlan.h   | 1071 +++
 drivers/net/wireless/orinoco.c  |   11 
 drivers/net/wireless/wl3501.h   |4 
 include/linux/pci_ids.h |3 
 include/net/ieee80211.h |  887 +
 include/net/ieee80211_crypt.h   |   86 
 net/Kconfig |2 
 net/Makefile|1 
 net/ieee80211/Kconfig   |   67 
 net/ieee80211/Makefile  |   11 
 net/ieee80211/ieee80211_crypt.c |  259 +
 net/ieee80211/ieee80211_crypt_ccmp.c|  470 +++
 net/ieee80211/ieee80211_crypt_tkip.c|  708 
 net/ieee80211/ieee80211_crypt_wep.c |  272 +
 net/ieee80211/ieee80211_module.c|  268 +
 net/ieee80211/ieee80211_rx.c| 1206 +++
 net/ieee80211/ieee80211_tx.c|  448 ++
 net/ieee80211/ieee80211_wx.c|  471 +++
 56 files changed, 32947 insertions(+), 356 deletions(-)

through these ChangeSets:

:
  o ieee80211 subsystem

:
  o wireless-2.6 cleanup

:
  o [netdrvr 8139cp] add PCI ID

Adrian Bunk:
  o net/ieee80211/Kconfig: don't describe what gets selected

Alexander Viro:
  o hostap __user annotations

Felipe Damasio:
  o 8139cp net driver: add MODULE_VERSION

François Romieu:
  o ieee80211: offset_in_page() removal
  o ieee80211: C99 initialization for eap_types
  o ieee80211: failure of ieee80211_crypto_init()
  o 8139cp: SG support fixes

Greg Kroah-Hartman:
  o net drivers: convert pci_dev->slot_name usage to pci_name()

Jeff Garzik:
  o [netdrvr starfire] Add GPL'd firmware, remove compat code
  o [wireless hostap] update for new pci_save_state()
  o [netdrvr 8139cp] TSO support

Joshua Kwan:
  o hostap: fix Kconfig typos and missing select CRYPTO

Jouni Malinen:
  o Host 

[PATCH 2/15] ptwalk: change_protection

2005-03-09 Thread Hugh Dickins
Begin the pagetable walker cleanup with a straightforward example,
mprotect's change_protection.  Started out from Nick Piggin's for_each
proposal, but I prefer less hidden; and these are all do while loops,
which degrade slightly when converted to for loops.

Firmly agree with Andi and Nick that addr,end is the way to go: size is
good at the user interface level, but unhelpful down in the loops.  And
the habit of an "address" which is actually an offset from some base has
bitten us several times: use proper address at each level, whyever not?

Don't apply each mask at two levels: all we need is a set of macros
pgd_addr_end, pud_addr_end, pmd_addr_end to give the address of the end
of each range.  Which need to take the min of two addresses, with 0 as
the greatest.  Started out with a different macro, assumed end never 0;
but clear_page_range (alone) might be passed end 0 by some out-of-tree
memory layouts: could special case it, but this macro compiles smaller.
Check "addr != end" instead of "addr < end" to work on that end 0 case.
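
For anyone who wants to poke at the wrap-around behaviour outside the
kernel, here is a small user-space sketch of the same comparison.
RANGE_SIZE, range_addr_end and count_ranges are names made up for the
sketch, and the 4MB range size is only an example value.

```c
#include <assert.h>

/* Illustrative 4MB range; the real PGDIR/PUD/PMD sizes are per-arch. */
#define RANGE_SIZE	(1UL << 22)
#define RANGE_MASK	(~(RANGE_SIZE - 1))

/*
 * Return the next range boundary after addr, or end if that comes first.
 * Subtracting 1 from both sides makes a value of 0 wrap to ULONG_MAX,
 * so 0 ("top of address space") compares as the greatest address.
 */
static unsigned long range_addr_end(unsigned long addr, unsigned long end)
{
	unsigned long boundary = (addr + RANGE_SIZE) & RANGE_MASK;

	return (boundary - 1 < end - 1) ? boundary : end;
}

/* Walk [addr, end) range by range, counting how many ranges are touched. */
static unsigned long count_ranges(unsigned long addr, unsigned long end)
{
	unsigned long next, n = 0;

	do {
		next = range_addr_end(addr, end);
		n++;
	} while (addr = next, addr != end);	/* "!=" so that end == 0 works */

	return n;
}
```

With "addr < end" as the loop test, the walk over the last two ranges
of the address space (end == 0) would stop after the first iteration.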

Signed-off-by: Hugh Dickins <[EMAIL PROTECTED]>
---

 include/asm-generic/pgtable.h |   21 
 mm/mprotect.c |  104 +++---
 2 files changed, 59 insertions(+), 66 deletions(-)

--- ptwalk1/include/asm-generic/pgtable.h   2005-03-09 01:35:49.0 +
+++ ptwalk2/include/asm-generic/pgtable.h   2005-03-09 01:36:01.0 +
@@ -135,6 +135,27 @@ static inline void ptep_set_wrprotect(st
 #define pgd_offset_gate(mm, addr)  pgd_offset(mm, addr)
 #endif
 
+/*
+ * When walking page tables, get the address of the next boundary, or
+ * the end address of the range if that comes earlier.  Although end might
+ * wrap to 0 only in clear_page_range, __boundary may wrap to 0 throughout.
+ */
+
+#define pgd_addr_end(addr, end)                                        \
+({ unsigned long __boundary = ((addr) + PGDIR_SIZE) & PGDIR_MASK;  \
+   (__boundary - 1 < (end) - 1)? __boundary: (end);                \
+})
+
+#define pud_addr_end(addr, end)                                        \
+({ unsigned long __boundary = ((addr) + PUD_SIZE) & PUD_MASK;      \
+   (__boundary - 1 < (end) - 1)? __boundary: (end);                \
+})
+
+#define pmd_addr_end(addr, end)                                        \
+({ unsigned long __boundary = ((addr) + PMD_SIZE) & PMD_MASK;      \
+   (__boundary - 1 < (end) - 1)? __boundary: (end);                \
+})
+
 #ifndef __ASSEMBLY__
 /*
  * When walking page tables, we usually want to skip any p?d_none entries;
--- ptwalk1/mm/mprotect.c   2005-03-09 01:35:49.0 +
+++ ptwalk2/mm/mprotect.c   2005-03-09 01:36:01.0 +
@@ -25,104 +25,76 @@
 #include 
 #include 
 
-static inline void
-change_pte_range(struct mm_struct *mm, pmd_t *pmd, unsigned long address,
-   unsigned long size, pgprot_t newprot)
+static inline void change_pte_range(struct mm_struct *mm, pmd_t *pmd,
+   unsigned long addr, unsigned long end, pgprot_t newprot)
 {
-   pte_t * pte;
-   unsigned long base, end;
+   pte_t *pte;
 
if (pmd_none_or_clear_bad(pmd))
return;
-   pte = pte_offset_map(pmd, address);
-   base = address & PMD_MASK;
-   address &= ~PMD_MASK;
-   end = address + size;
-   if (end > PMD_SIZE)
-   end = PMD_SIZE;
+   pte = pte_offset_map(pmd, addr);
do {
if (pte_present(*pte)) {
-   pte_t entry;
+   pte_t ptent;
 
/* Avoid an SMP race with hardware updated dirty/clean
 * bits by wiping the pte and then setting the new pte
 * into place.
 */
-   entry = ptep_get_and_clear(mm, base + address, pte);
-   set_pte_at(mm, base + address, pte, pte_modify(entry, newprot));
+   ptent = ptep_get_and_clear(mm, addr, pte);
+   set_pte_at(mm, addr, pte, pte_modify(ptent, newprot));
}
-   address += PAGE_SIZE;
-   pte++;
-   } while (address && (address < end));
+   } while (pte++, addr += PAGE_SIZE, addr != end);
pte_unmap(pte - 1);
 }
 
-static inline void
-change_pmd_range(struct mm_struct *mm, pud_t *pud, unsigned long address,
-unsigned long size, pgprot_t newprot)
+static inline void change_pmd_range(struct mm_struct *mm, pud_t *pud,
+   unsigned long addr, unsigned long end, pgprot_t newprot)
 {
-   pmd_t * pmd;
-   unsigned long base, end;
+   pmd_t *pmd;
+   unsigned long next;
 
if (pud_none_or_clear_bad(pud))
return;
-   pmd = pmd_offset(pud, address);
-   base = address & PUD_MASK;
-   address &= ~PUD_MASK;
-   end = address + size;
-   if (end > 

Re: make -j4 gets stuck w/ ccache over NFS - solved!

2005-03-09 Thread Mark M. Hoffman
Hi Tridge, Greg, et al.:

I wrote, some months ago:
>  > I'm using ccache version 2.4 [1].  I just changed ~/.ccache to a symbolic
>  > link to a directory which is NFS mounted [2].  The kernel source itself is
>  > on a local FS.  With the ccache suitably primed, when I do a kernel compile
>  > using 'make -j4' it seems to get stuck for seconds at a time.  When it gets
>  > unstuck, it blows through a handful of files and then gets stuck again.

* [EMAIL PROTECTED] <[EMAIL PROTECTED]> [2004-12-08 18:25:59 +1100]:
> I'd suggest you first narrow down the problem to either being a
> locking problem or a file IO problem. To do that, change lock_fd() in
> util.c in ccache to just "return 0;". That will mean the ccache stats
> file could become corrupted, but if it runs fast then you know that it
> is a locking problem. I have noticed severe speed problems with NFS
> locking on Linux previously, which is why I mention this as a
> possibility.
> 
> Note that removing this locking will not cause ccache to produce
> incorrect object files, it will just mean the stats printed with
> "ccache -s" may be inaccurate.

Thanks for the suggestions.  It wasn't very important to me so I didn't
make time to follow up on it.  I was just playing w/ ccache at the time.

Finally I noticed this patch from -mm1... and it solves the problem.

nfsd--lockd-dont-try-to-match-callback-requests-against-export-table.patch

How I tested: I applied the first 12 patches in 2.6.11-mm1; the above
mentioned was last - couldn't reproduce the bug.  When I unapplied just
that one, I saw it again.

original bug report:
http://marc.theaimsgroup.com/?l=linux-kernel=110238645132535=3

Greg: have you considered this one for 2.6.11.x?

Thanks,

-- 
Mark M. Hoffman
[EMAIL PROTECTED]



Re: [PATCH libata-2.6] AHCI: fix fatal error int handling

2005-03-09 Thread Jeff Garzik
Brett Russ wrote:
> I noticed that the AHCI CI (cmd issue) reg wasn't getting cleared
> after error ints resulting in no further commands being successfully
> issued to the port.  This patch fixes.  All that's really needed is
> the 1's complement but I also removed the disabling/enabling of the
> FIS_RX b/c this isn't spec'd as necessary when handling error ints.
>
> Signed-off-by: Brett Russ <[EMAIL PROTECTED]>

Thanks, applied.


Re: bk commits and dates

2005-03-09 Thread Benjamin Herrenschmidt
On Wed, 2005-03-09 at 19:47 -0800, David S. Miller wrote:
> On Thu, 10 Mar 2005 13:41:59 +1100
> Benjamin Herrenschmidt <[EMAIL PROTECTED]> wrote:
> 
> > I don't know if I'm the only one to have a problem with that, but it
> > would be nice if it was possible, when you pull a bk tree, to have the
> > commit messages for the csets in that tree be dated from the day you
> > pulled, and not the day when they went in the source tree.
> 
> When I'm working, I just do "bk csets" after I pull from Linus's
> tree to review what went in since the last time I pulled.

Yes, but the commit list archive is handy. I have quite good search
capabilities in my mailer for example, and sometimes, when doing
regression, it's quite useful to browse what went in between two
releases with it (it's just more handy than bk csets).

Ben.




Re: [BUG] 2.6.11- sym53c8xx Broken on pp64

2005-03-09 Thread Sam Ravnborg
On Wed, Mar 09, 2005 at 06:25:56PM -0800, Linus Torvalds wrote:
> 
> 
> On Thu, 10 Mar 2005, Benjamin Herrenschmidt wrote:
> > 
> > BTW, Linus: Any chance you ever change something to version or
> > extraversion in bk just after a release ? I know I already asked and it
> > degenerated into a flamefest, and I don't know if that is specifically
> > the case now, but I keep getting reports of people saying "I have a bug
> > in 2.6.xx" while in fact, they have some kind of bk clone of sometime
> > after 2.6.xx...
> 
> The answer is the same: I'd still like to have somebody (preferably Sam)  
> who is comfortable with all the build scripts get a revision-control-
> specific version at build-time, so that BK users would get the top-of-tree 
> key value, and other people could get some CVS revision or something.

I have a patch somewhere in my inbox, and got one from Ryan yesterday
also. I will see if I can find some time during the weekend to look at it.

Sam


Re: [RFC] -stable, how it's going to work.

2005-03-09 Thread Bill Davidsen
Andi Kleen wrote:
> Greg KH <[EMAIL PROTECTED]> writes:
>
> > Rules on what kind of patches are accepted, and what ones are not, into
> > the "-stable" tree:
> >  - It must be obviously correct and tested.
> >  - It can not be bigger than 100 lines, with context.
>
> This rule seems silly. What happens when a security fix needs 150 lines?
>
> Better maybe a rule like "The patch should be the minimal and safest
> change to fix an issue". But see below for an exception.

I was willing to accept that there might be an exploitable security
issue taking more than 100 lines to plug the hole (remember, check your
elegance at the door), and I like your suggestion regardless of length.
But I don't think any more exceptions are desirable. As you said, "see
below."

> >  - It must fix only one thing.
> >  - It must fix a real bug that bothers people (not a, "This could be a
> >    problem..." type thing.)
> >  - It must fix a problem that causes a build error (but not for things
> >    marked CONFIG_BROKEN), an oops, a hang, data corruption, a real
> >    security issue, or some "oh, that's not good" issue.  In short,
> >    something critical.
> >  - No "theoretical race condition" issues, unless an explanation of how
> >    the race can be exploited.
> >  - It can not contain any "trivial" fixes in it (spelling changes,
> >    whitespace cleanups, etc.)
> >  - It must be accepted by the relevant subsystem maintainer.

I have the feeling that drivers will need some special consideration,
like allowing additions to an existing blacklist to prevent "almost
working" with similar but not identical chips and the like.

> One rule I'm missing:
>  - It must be accepted to mainline.
>
> That is what big enterprise distributions often require and I think
> it's a good rule. Otherwise you risk code and feature set drift
> and we don't want to repeat the 2.4 mistakes again where some
> subsystems had more fixes in 2.4 than 2.6.

Right problem, wrong solution. What you want is that SOME solution go in
mainline. My example above of adding a chipset to a blacklist is a
perfect case, where you want the chipset to become supported, not long
term "officially unsupported." And band-aids like using the BKL or long
linear searches just to prevent something which causes a hard hang or
oops should not be blessed and included in mainline.

I know you understand this, I just think it's worth saying more clearly.

> Also your rules encourage to do different patches for -stable
> (e.g. with less comment changes etc.) than for mainline. I don't
> think that's a very good thing. Sometimes it is unavoidable
> and sometimes the mainline patches are just too big and intrusive,
> but in general it's imho best to apply the same patches
> to mainline and backport trees.  This has also the advantage
> that the patch is best tested as possible; slimmed down patches
> usually have a risk of malfunction.

Different in function as well as style. Testing complex patches can be
done in -mm, reliable is often easier than optimal.

> So in general there should be a preference to apply the same
> patch as mainline, unless it is very big.

I can't imagine anyone not using a good patch in mainline, I can imagine
some really brute force patches in stable, because they are easier to
vet than a really correct solution.

> >  - Security patches will be accepted into the -stable tree directly from
> >    the security kernel team, and not go through the normal review cycle.
> >    Contact the kernel security team for more details on this procedure.
>
> This also sounds like a bad rule. How come the security team has more
> competence to review patches than the subsystem maintainers?  I can
> see the point of overruling maintainers on security issues when they
> are not responsive, but if they are I think they should still be the
> main point of contact.

I was actually thinking that the normal patch might want a fast path in
practice. Some things are clearly wrong, like off by one errors in array
handling and the like, using pointers to freed data or which could be
null, that kind of thing is pretty unsubtle, and I would think would
need fewer eyes-on. But until we see how this works in practice, let's
not fix it. The stable kernel should have a stable process, which should
not be changed unless there's a demonstrated problem.

--
   -bill davidsen ([EMAIL PROTECTED])
"The secret to procrastination is to put things off until the
 last possible moment - but no longer"  -me


[PATCH 2/3] cciss: support for more than 8 controllers

2005-03-09 Thread mike . miller
This patch adds support for more than 8 controllers. If we run out of 
preallocated major numbers we dynamically allocate more.
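
The selection logic can be tried out in user space with the sketch
below. It is only a model: fake_register_blkdev() is a hypothetical
stand-in that imitates the register_blkdev() convention of returning 0
for a requested major and the allocated number when 0 is passed, and
BASE_MAJOR and the starting dynamic major are example values. The
sketch treats any negative return as failure, which is a choice made
here and not something taken from the patch.

```c
#include <assert.h>

#define STATIC_MAJORS	8	/* plays the role of MAX_CTLR_ORIG */
#define BASE_MAJOR	104	/* example value for the first static major */

/* Next number the fake registry hands out for dynamic requests. */
static int next_dynamic_major = 254;

/*
 * Hypothetical stand-in for register_blkdev(): returns 0 when a specific
 * major is requested, or a freshly allocated major when 0 is passed.
 */
static int fake_register_blkdev(int major)
{
	if (major < 0)
		return -1;
	if (major == 0)
		return next_dynamic_major--;
	return 0;
}

/* Pick the major for controller 'ctlr'; negative return means failure. */
static int pick_major(int ctlr)
{
	int major = 0;	/* 0 asks for a dynamic major */
	int rc;

	if (ctlr < STATIC_MAJORS)
		major = BASE_MAJOR + ctlr;

	rc = fake_register_blkdev(major);
	if (rc < 0)
		return rc;

	if (ctlr >= STATIC_MAJORS)
		major = rc;	/* remember what was handed out */

	return major;
}
```

The point of storing the result per controller is that later
unregister calls can no longer compute the major from the index alone.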

Please consider this for inclusion.

Signed-off-by: Mike Miller <[EMAIL PROTECTED]>

 cciss.c |   36 +++-
 cciss.h |3 +++
 2 files changed, 30 insertions(+), 9 deletions(-)
---
diff -burNp lx2611-p001/drivers/block/cciss.c lx2611-p002/drivers/block/cciss.c
--- lx2611-p001/drivers/block/cciss.c   2005-03-08 16:39:01.650427392 -0600
+++ lx2611-p002/drivers/block/cciss.c   2005-03-08 16:50:47.149175280 -0600
@@ -120,7 +120,11 @@ static struct board_type products[] = {
 
 #define READ_AHEAD  1024
 #define NR_CMDS 384 /* #commands that can be outstanding */
-#define MAX_CTLR 8
+#define MAX_CTLR   32 
+
+/* Originally cciss driver only supports 8 major numbers */
+#define MAX_CTLR_ORIG  8
+
 
 #define CCISS_DMA_MASK 0x  /* 32 bit DMA */
 
@@ -2650,7 +2654,7 @@ static int alloc_cciss_hba(void)
}
}
printk(KERN_WARNING "cciss: This driver supports a maximum"
-   " of 8 controllers.\n");
+   " of %d controllers.\n", MAX_CTLR);
goto out;
 Enomem:
printk(KERN_ERR "cciss: out of memory.\n");
@@ -2682,13 +2686,14 @@ static int __devinit cciss_init_one(stru
request_queue_t *q;
int i;
int j;
+   int rc;
 
printk(KERN_DEBUG "cciss: Device 0x%x has been found at"
" bus %d dev %d func %d\n",
pdev->device, pdev->bus->number, PCI_SLOT(pdev->devfn),
PCI_FUNC(pdev->devfn));
i = alloc_cciss_hba();
-   if( i < 0 ) 
+   if(i < 0) 
return (-1);
if (cciss_pci_init(hba[i], pdev) != 0)
goto clean1;
@@ -2707,11 +2712,24 @@ static int __devinit cciss_init_one(stru
goto clean1;
}
 
-   if (register_blkdev(COMPAQ_CISS_MAJOR+i, hba[i]->devname)) {
-   printk(KERN_ERR "cciss: Unable to register device %s\n",
-   hba[i]->devname);
+   /* 
+* register with the major number, or get a dynamic major number
+* by passing 0 as argument.  This is done for greater than
+* 8 controller support.
+*/
+   if (i < MAX_CTLR_ORIG)
+   hba[i]->major = MAJOR_NR + i;
+   rc = register_blkdev(hba[i]->major, hba[i]->devname);
+   if(rc == -EBUSY || rc == -EINVAL) {
+   printk(KERN_ERR
+   "cciss:  Unable to get major number %d for %s "
+   "on hba %d\n", hba[i]->major, hba[i]->devname, i);
goto clean1;
}
+   else {
+   if (i >= MAX_CTLR_ORIG)
+   hba[i]->major = rc;
+   }
 
/* make sure the board interrupts are off */
hba[i]->access.set_intr_mask(hba[i], CCISS_INTR_OFF);
@@ -2782,7 +2800,7 @@ static int __devinit cciss_init_one(stru
 
sprintf(disk->disk_name, "cciss/c%dd%d", i, j);
sprintf(disk->devfs_name, "cciss/host%d/target%d", i, j);
-   disk->major = COMPAQ_CISS_MAJOR + i;
+   disk->major = hba[i]->major;
disk->first_minor = j << NWD_SHIFT;
disk->fops = _fops;
disk->queue = hba[i]->queue;
@@ -2811,7 +2829,7 @@ clean4:
hba[i]->errinfo_pool_dhandle);
free_irq(hba[i]->intr, hba[i]);
 clean2:
-   unregister_blkdev(COMPAQ_CISS_MAJOR+i, hba[i]->devname);
+   unregister_blkdev(hba[i]->major, hba[i]->devname);
 clean1:
release_io_mem(hba[i]);
free_hba(i);
@@ -2853,7 +2871,7 @@ static void __devexit cciss_remove_one (
pci_set_drvdata(pdev, NULL);
iounmap(hba[i]->vaddr);
cciss_unregister_scsi(i);  /* unhook from SCSI subsystem */
-   unregister_blkdev(COMPAQ_CISS_MAJOR+i, hba[i]->devname);
+   unregister_blkdev(hba[i]->major, hba[i]->devname);
remove_proc_entry(hba[i]->devname, proc_cciss); 

/* remove it from the disk list */
diff -burNp lx2611-p001/drivers/block/cciss.h lx2611-p002/drivers/block/cciss.h
--- lx2611-p001/drivers/block/cciss.h   2004-12-24 15:33:48.0 -0600
+++ lx2611-p002/drivers/block/cciss.h   2005-03-08 16:50:47.150175128 -0600
@@ -13,6 +13,8 @@
 #define IO_OK  0
 #define IO_ERROR   1
 
+#define MAJOR_NR COMPAQ_CISS_MAJOR
+
 struct ctlr_info;
 typedef struct ctlr_info ctlr_info_t;
 
@@ -50,6 +52,7 @@ struct ctlr_info 
CfgTable_struct __iomem *cfgtable;
unsigned int intr;
int interrupts_enabled;
+   int major;
int max_commands;
int commands_outstanding;
int max_outstanding; /* Debug */ 

[SATA] libata-dev queue updated

2005-03-09 Thread Jeff Garzik
Merged recent upstream changes into libata-dev queue.  No new patches 
have found their way into libata-dev since last email.

BK URL, Patch URL, and changelog attached.
Note that the patch is diff'd against 2.6.11-bk6, which won't exist 
until four hours after this email is sent.

Jeff

BK users:

bk pull bk://gkernel.bkbits.net/libata-dev-2.6

Patch:
http://www.kernel.org/pub/linux/kernel/people/jgarzik/libata/2.6.11-bk6-libata-dev1.patch.bz2

This will update the following files:

 drivers/scsi/Kconfig |   18 
 drivers/scsi/Makefile|2 
 drivers/scsi/ahci.c  |   95 +++--
 drivers/scsi/ata_adma.c  |  778 +++
 drivers/scsi/libata-core.c   |  296 +---
 drivers/scsi/libata-scsi.c   |  701 --
 drivers/scsi/libata.h|6 
 drivers/scsi/pata_pdc2027x.c |  771 ++
 drivers/scsi/sata_promise.c  |   84 
 include/linux/ata.h  |   15 
 include/linux/libata.h   |   10 
 include/scsi/scsi.h  |3 
 12 files changed, 2507 insertions(+), 272 deletions(-)

through these ChangeSets:

:
  o [libata scsi] support 12-byte passthru CDB
  o [libata scsi] passthru CDB check condition processing
  o T10/04-262 ATA pass thru - patch

:
  o [libata sata_promise] support PATA ports on SATA controllers

Albert Lee:
  o [libata] use init-device-params ATA command where needed
  o [libata] ata_scsi_verify_xlat() fix
  o pdc2027x timing register fix for 100MHz
  o [libata] CHS support: add CHS support to ata_scsi_verify_xlat(),
    ata_scsi_rw_xlat() and ata_scsiop_read_cap().
  o [libata] CHS support: reorganize read/write translation in
    ata_scsi_rw_xlat()
  o [libata] CHS support: rename vars (s/sector/block/) in
    ata_scsi_verify_xlat()
  o [libata] CHS support: detect C/H/S at IDENTIFY DEVICE time
  o [libata] CHS support: add definitions to headers
  o pdc2027x timing register bug fix
  o [libata pdc2027x] fix incorrect pio and mwdma masks
  o [libata pdc2027x] remove quirks and ROM enable
  o [libata] add driver for Promise PATA 2027x

Brad Campbell:
  o libata basic detection and errata for PATA->SATA bridges

Jeff Garzik:
  o [libata ahci] support PCI MSI interrupt vector
  o [libata adma] Add init code, fix CPB submission code
  o [libata ahci] finish ATAPI support
  o [libata adma] trivial whitespace cleanup
  o [libata dma] fix DMA mode config; add some more initialization code
  o [libata adma] add support for configuring PIO/DMA modes
  o [libata] turn on ATAPI support
  o [libata sata_promise] merge Tobias Lorenz' pdc20619 patch, part 2
  o [libata] small cleanups
  o [libata] remove unused execute-device-diagnostic reset method
  o [libata] add new driver ata_adma
  o [libata pdc2027x] update for upstream struct device conversion
  o [libata sata_promise] fix merge bugs
  o [libata] fix build breakage
  o [libata] fix SATA->PATA bridge detect compile breakage
  o [libata] fix printk warning

John W. Linville:
  o libata: update ATA pass thru opcodes
  o libata: minor style changes in ata_scsi_pass_thru
  o libata: filter SET_FEATURES - XFER MODE from ATA pass thru
  o libata: sync SMART ioctls with ATA pass thru spec (T10/04-262r7)
  o libata: fix command queue leak when xlat_func fails
  o libata: SMART support via ATA pass-thru

Tobias Lorenz:
  o [libata sata_promise] pdc20619 (PATA) support
  o libata-scsi: get-identity ioctl support



[patch] drm missing memset can crash X server..

2005-03-09 Thread Dave Airlie

Egbert Eich reported bug 2673 on bugs.freedesktop.org and tracked it
down to a missing memset in the setversion ioctl; this causes X server
crashes...
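
A user-space sketch of the general pattern follows. It assumes the
failure mode is a callback that fills in only some fields of an
uninitialized stack structure; that is an assumption for illustration
and not a statement about what any particular drm driver does. The
fake_version struct and both functions are invented names, not the
drm types.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, simplified version record; not the real drm structure. */
struct fake_version {
	int major;
	int minor;
	char *name;	/* optional field a callback may leave untouched */
};

/* A callback that only fills in part of the structure. */
static void partial_driver_version(struct fake_version *v)
{
	v->major = 1;
	v->minor = 2;
}

static struct fake_version query_version(void (*cb)(struct fake_version *))
{
	struct fake_version v;

	/* Without this, any field the callback skips holds stack garbage. */
	memset(&v, 0, sizeof(v));

	cb(&v);
	return v;
}
```

After the memset, a consumer can test v.name against NULL instead of
dereferencing or freeing whatever happened to be on the stack.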

From: Egbert Eich <[EMAIL PROTECTED]>
Signed-off-by: Dave Airlie <[EMAIL PROTECTED]>

diff -Nru a/drivers/char/drm/drm_ioctl.c b/drivers/char/drm/drm_ioctl.c
--- a/drivers/char/drm/drm_ioctl.c  2005-03-09 10:53:42 +11:00
+++ b/drivers/char/drm/drm_ioctl.c  2005-03-09 10:53:43 +11:00
@@ -326,6 +326,8 @@

DRM_COPY_FROM_USER_IOCTL(sv, argp, sizeof(sv));

+   memset(&version, 0, sizeof(version));
+
dev->driver->version(&version);
retv.drm_di_major = DRM_IF_MAJOR;
retv.drm_di_minor = DRM_IF_MINOR;


Re: BUG: Slowdown on 3000 socket-machines tracked down

2005-03-09 Thread Andrew Morton
Christian Schmid <[EMAIL PROTECTED]> wrote:
>
>  > So, maybe a VM problem?  That would be a good place to focus since
>  > I think we can be fairly certain it isn't a problem in just the
>  > networking code.  Otherwise, my tests would show lower bandwidth.
> 
>  Thanks to your tests I am really sure that it's no network-code problem
>  anymore. But what I THINK it is: The network is allocating buffers
>  dynamically and if the vm doesn't provide those buffers fast enough,
>  it locks as well.

Did anyone have a 100-liner which demonstrates this problem?

The output of `vmstat 1' when the thing starts happening would be interesting.


[PATCH 6/15] ptwalk: ioremap_page_range

2005-03-09 Thread Hugh Dickins
Convert i386 ioremap pagetable walkers to loops using p?d_addr_end.
Rename internal levels ioremap_p??_range.  Don't cheat, give it a real
(but inlined) ioremap_pud_range; uninline lowest level to help debug.
Replace "page already exists" printk and BUG by BUG_ON.
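
The loop shape, including the "phys_addr -= addr" bias that lets each
level pass "phys_addr + addr" down, can be modeled in user space as
below. This is a toy with two levels only; PAGE_SZ, BLOCK_SZ and all
function and variable names are invented for the sketch.

```c
#include <assert.h>

#define PAGE_SZ		4096UL
#define BLOCK_SZ	(4 * PAGE_SZ)		/* toy coverage of one "pmd" */
#define BLOCK_MSK	(~(BLOCK_SZ - 1))

static unsigned long pages_mapped;	/* how many pages were "mapped" */
static unsigned long last_phys;		/* physical address of the last one */

/* Same comparison as the p?d_addr_end macros, for the toy block size. */
static unsigned long block_addr_end(unsigned long addr, unsigned long end)
{
	unsigned long boundary = (addr + BLOCK_SZ) & BLOCK_MSK;

	return (boundary - 1 < end - 1) ? boundary : end;
}

/* Lowest level: one iteration per page in [addr, end). */
static void map_pages(unsigned long addr, unsigned long end,
		      unsigned long phys)
{
	do {
		last_phys = phys;
		pages_mapped++;
		phys += PAGE_SZ;
	} while (addr += PAGE_SZ, addr != end);
}

/* Upper level: split [addr, end) at block boundaries. */
static void map_blocks(unsigned long addr, unsigned long end,
		       unsigned long phys)
{
	unsigned long next;

	phys -= addr;	/* bias once, so phys + addr is valid for any addr */
	do {
		next = block_addr_end(addr, end);
		map_pages(addr, next, phys + addr);
	} while (addr = next, addr != end);
}
```

Mapping virtual 0x3000..0x9000 to physical 0x100000 walks three blocks
and six pages, the last page landing at physical 0x105000.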

Signed-off-by: Hugh Dickins <[EMAIL PROTECTED]>
---

 arch/i386/mm/ioremap.c |  112 +++--
 1 files changed, 53 insertions(+), 59 deletions(-)

--- ptwalk5/arch/i386/mm/ioremap.c  2005-03-09 01:12:39.0 +
+++ ptwalk6/arch/i386/mm/ioremap.c  2005-03-09 01:36:51.0 +
@@ -20,89 +20,82 @@
 #define ISA_START_ADDRESS  0xa0000
 #define ISA_END_ADDRESS    0x100000
 
-static inline void remap_area_pte(pte_t * pte, unsigned long address, unsigned long size,
-   unsigned long phys_addr, unsigned long flags)
+static int ioremap_pte_range(pmd_t *pmd, unsigned long addr,
+   unsigned long end, unsigned long phys_addr, unsigned long flags)
 {
-   unsigned long end;
+   pte_t *pte;
unsigned long pfn;
 
-   address &= ~PMD_MASK;
-   end = address + size;
-   if (end > PMD_SIZE)
-   end = PMD_SIZE;
-   if (address >= end)
-   BUG();
pfn = phys_addr >> PAGE_SHIFT;
+   pte = pte_alloc_kernel(_mm, pmd, addr);
+   if (!pte)
+   return -ENOMEM;
do {
-   if (!pte_none(*pte)) {
-   printk("remap_area_pte: page already exists\n");
-   BUG();
-   }
+   BUG_ON(!pte_none(*pte));
set_pte(pte, pfn_pte(pfn, __pgprot(_PAGE_PRESENT | _PAGE_RW | 
_PAGE_DIRTY | _PAGE_ACCESSED | flags)));
-   address += PAGE_SIZE;
pfn++;
-   pte++;
-   } while (address && (address < end));
+   } while (pte++, addr += PAGE_SIZE, addr != end);
+   return 0;
 }
 
-static inline int remap_area_pmd(pmd_t * pmd, unsigned long address, unsigned long size,
-   unsigned long phys_addr, unsigned long flags)
+static inline int ioremap_pmd_range(pud_t *pud, unsigned long addr,
+   unsigned long end, unsigned long phys_addr, unsigned long flags)
 {
-   unsigned long end;
+   pmd_t *pmd;
+   unsigned long next;
 
-   address &= ~PGDIR_MASK;
-   end = address + size;
-   if (end > PGDIR_SIZE)
-   end = PGDIR_SIZE;
-   phys_addr -= address;
-   if (address >= end)
-   BUG();
+   phys_addr -= addr;
+   pmd = pmd_alloc(_mm, pud, addr);
+   if (!pmd)
+   return -ENOMEM;
do {
-   pte_t * pte = pte_alloc_kernel(&init_mm, pmd, address);
-   if (!pte)
+   next = pmd_addr_end(addr, end);
+   if (ioremap_pte_range(pmd, addr, next, phys_addr + addr, flags))
return -ENOMEM;
-   remap_area_pte(pte, address, end - address, address + phys_addr, flags);
-   address = (address + PMD_SIZE) & PMD_MASK;
-   pmd++;
-   } while (address && (address < end));
+   } while (pmd++, addr = next, addr != end);
return 0;
 }
 
-static int remap_area_pages(unsigned long address, unsigned long phys_addr,
-unsigned long size, unsigned long flags)
+static inline int ioremap_pud_range(pgd_t *pgd, unsigned long addr,
+   unsigned long end, unsigned long phys_addr, unsigned long flags)
 {
-   int error;
-   pgd_t * dir;
-   unsigned long end = address + size;
+   pud_t *pud;
+   unsigned long next;
 
-   phys_addr -= address;
-   dir = pgd_offset(&init_mm, address);
+   phys_addr -= addr;
+   pud = pud_alloc(&init_mm, pgd, addr);
+   if (!pud)
+   return -ENOMEM;
+   do {
+   next = pud_addr_end(addr, end);
+   if (ioremap_pmd_range(pud, addr, next, phys_addr + addr, flags))
+   return -ENOMEM;
+   } while (pud++, addr = next, addr != end);
+   return 0;
+}
+
+static int ioremap_page_range(unsigned long addr,
+   unsigned long end, unsigned long phys_addr, unsigned long flags)
+{
+   pgd_t *pgd;
+   unsigned long next;
+   int err;
+
+   BUG_ON(addr >= end);
flush_cache_all();
-   if (address >= end)
-   BUG();
+   phys_addr -= addr;
+   pgd = pgd_offset_k(addr);
spin_lock(&init_mm.page_table_lock);
do {
-   pud_t *pud;
-   pmd_t *pmd;
-   
-   error = -ENOMEM;
-   pud = pud_alloc(&init_mm, dir, address);
-   if (!pud)
-   break;
-   pmd = pmd_alloc(&init_mm, pud, address);
-   if (!pmd)
-   break;
-   if (remap_area_pmd(pmd, address, end - address,
-phys_addr + address, 

Re: 2.6.x.y gatekeeper discipline

2005-03-09 Thread DHollenbeck

> Where do you see that patch as being applied in the new .y stable series?
> Chris

I got that patch description from here:

When you go to http://kernel.org and click on the stand-alone "C" to
the right of 2.6.11.2, it is a hyperlink to:
http://kernel.org/pub/linux/kernel/v2.6/testing/cset/

Have I misunderstood something, or is the website misleading?  Or both :)

Dick
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Linux 2.6.11.2

2005-03-09 Thread Willy Tarreau
On Wed, Mar 09, 2005 at 03:57:16PM -0800, Matt Mackall wrote:
 
> Imagine we want to go from 2.6.11.3 to 2.6.12

The easiest way would be to keep a local fresh copy of 2.6.11 before
applying 2.6.11.3 anyway.  That would solve a) and b) even more easily.
And yes, I find a) more logical. This is the way all private trees have
been working for ages. When you download 2.6.11-ac2, it's not a patch
against -ac1, but against 2.6.11. If you want to start from -ac1, you
get the 2.6.11-ac1-ac2 patch.

And last, since these patches are mostly bugfixes for the reference kernel
(eg: 2.6.11), it seems logical to be able to patch that kernel with the
latest bug fix.
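
To make the trade-off concrete, here is a small sketch (not from the thread) that counts the patch operations needed to go from 2.6.11.<cur> to 2.6.12 under each of the three schemes Matt lists in the quoted text below. The function names and the cur/last parameters are made up for illustration.

```c
/* case a): every 2.6.11.y patch is a diff against plain 2.6.11 */
static int steps_case_a(int cur)
{
	/* revert the single .cur patch (if any), then apply 2.6.12 */
	return (cur > 0 ? 1 : 0) + 1;
}

/* case b): 2.6.11.y patches are incremental, 2.6.12 is against 2.6.11 */
static int steps_case_b(int cur)
{
	/* revert .cur, .cur-1, ... .1, then apply 2.6.12 */
	return cur + 1;
}

/* case c): incremental patches, 2.6.12 is against the last 2.6.11.<last> */
static int steps_case_c(int cur, int last)
{
	/* apply .cur+1 ... .last, then apply 2.6.12 */
	return (last - cur) + 1;
}
```

For the example in the quote (cur = 3, last = 5) this gives 2, 4 and 3 operations respectively; only case a) stays constant as the number of post-releases grows.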

cheers,
willy

> case a)
> revert patch 2.6.11.3
> get and apply 2.6.12
> 
> case b)
> revert patch 2.6.11.3
> revert patch 2.6.11.2
> revert patch 2.6.11.1
> get and apply 2.6.12
> 
> case c)
> poke around on kernel.org and figure out that the last kernel in .11 is .11.5
> get and apply 2.6.11.4
> get and apply 2.6.11.5
> get and apply 2.6.12
> 
> Note this gets increasingly more painful in cases b and c when there
> are a large number of post-releases. And case c) is really stupid when
> you want to go from 2.6.12 to 2.6.11.
> 
> Also note that -pre, -rc, -bk, -mm, -ac, and every other branch off a
> release has worked the a) way.
> 
> -- 
> Mathematics is the supreme nostalgia of our time.


Re: Page Fault Scalability patch V19 [4/4]: Drop use of page_table_lock in do_anonymous_page

2005-03-09 Thread Christoph Lameter
On Wed, 9 Mar 2005, Andi Kleen wrote:

> I still think it's a bad idea to add arbitrary process size limits like this:

The limit is pretty high: 2^31*PAGE_SIZE bytes. For the standard 4k
pagesize this will be >8TB.
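
As a quick check of that figure (a sketch added for illustration, assuming the counter is a signed 32-bit atomic_t, which is what the 2^31 implies): 2^31 pages * 2^12 bytes per page = 2^43 bytes, i.e. 8 TiB or about 8.8 * 10^12 bytes. The helper name below is hypothetical.

```c
#include <stdint.h>

/* Largest size, in bytes, that a 31-bit page count can describe. */
static uint64_t max_counted_bytes(uint64_t page_size)
{
	return (UINT64_C(1) << 31) * page_size;
}
```

With larger page sizes the limit scales up accordingly, e.g. 16k pages give 2^45 bytes.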

> >
> > +#ifdef CONFIG_ATOMIC_TABLE_OPS
> > +/*
> > + * Atomic page table operations require that the counters are also
> > + * incremented atomically
> > +*/
> > +#define set_mm_counter(mm, member, value) atomic_set(&(mm)->member, value)
> > +#define get_mm_counter(mm, member) ((unsigned long)atomic_read(&(mm)->member))
> > +#define update_mm_counter(mm, member, value) atomic_add(value, &(mm)->member)
> > +#define MM_COUNTER_T atomic_t
>
> Can you use atomic64_t on 64bit systems at least?

If atomic64_t is available on all 64-bit systems then it's no problem.


Re: [PATCH] VGA arbitration: draft of kernel side

2005-03-09 Thread Benjamin Herrenschmidt
On Wed, 2005-03-09 at 12:45 +0200, Pekka Enberg wrote:
> Hi Benjamin,
> 
> Few coding style nitpicks follow.
> 
> On Tue, 08 Mar 2005 18:11:59 +1100, Benjamin Herrenschmidt
> <[EMAIL PROTECTED]> wrote:
> > Index: linux-work/include/linux/pci.h
> > ===
> > --- linux-work.orig/include/linux/pci.h 2005-01-24 17:09:57.0 +1100
> > +++ linux-work/include/linux/pci.h  2005-03-08 15:26:25.0 +1100
> > @@ -1064,5 +1064,6 @@
> >  #define PCIPCI_VSFX 16
> >  #define PCIPCI_ALIMAGIK 32
> >  
> > +
> >  #endif /* __KERNEL__ */
> >  #endif /* LINUX_PCI_H */
> 
> Please drop whitespace noise from the patch.

Oh sure, will do. I'm not about to submit anything yet anyway, and it
will go through a cleanup phase. The above is just residual of quilt
picking up a file where I added something, then removed it.

> > Index: linux-work/drivers/pci/vga.c
> > ===
> > --- /dev/null   1970-01-01 00:00:00.0 +
> > +++ linux-work/drivers/pci/vga.c 2005-03-08 18:04:57.0 +1100
> > @@ -0,0 +1,403 @@
> > +static LIST_HEAD(  vga_list);
> 
> Please remove whitespace damage.
> 
> > +static spinlock_t  vga_lock;
> > +static DECLARE_WAIT_QUEUE_HEAD(vga_wait_queue);

The above isn't whitespace damage, it's aligning the 3 variable
names properly in a column :) I dislike those DECLARE_*() macros because
of that, btw. That one is a matter of style; I'm experimenting a bit with
this, but it's definitely intentional.

> 
> Please consolidate both while loops into one function. One possible way would
> be to do:
> 
> static void vga_update_bus(struct pci_bus *bus, unsigned int enable)
> {
>   while (bus) {
>   bridge = bus->self;
>   if (bridge) {
>   pci_read_config_word(bridge, PCI_BRIDGE_CONTROL, &cmd);
>   if (cmd & PCI_BRIDGE_CTL_VGA)
>   continue;
>   if (enable)
>   cmd |= PCI_BRIDGE_CTL_VGA;
>   else
>   cmd &= ~PCI_BRIDGE_CTL_VGA;
>   pci_write_config_word(bridge, PCI_BRIDGE_CONTROL, cmd);
>   }
>   bus = bus->parent;
>   }
> }

I think you are being anal here, but I'll think about it ;)
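
For illustration only, here is a user-space sketch of a consolidated bridge walk along the lines Pekka suggests. The struct and function names are hypothetical mock-ups, not the PCI core types, and BRIDGE_CTL_VGA merely mirrors the value of PCI_BRIDGE_CTL_VGA. The sketch uses a for loop so that a "continue" still advances to the parent bus; in a while loop, a "continue" placed before "bus = bus->parent" would re-test the same bus forever.

```c
#include <stddef.h>
#include <stdint.h>

#define BRIDGE_CTL_VGA 0x08	/* same bit value as PCI_BRIDGE_CTL_VGA */

/* Mock of the bridge device: only the bridge control register is modelled. */
struct bridge_sketch {
	uint16_t bridge_control;
};

/* Mock of a bus: 'self' is the bridge leading to it, NULL on a root bus. */
struct bus_sketch {
	struct bus_sketch *parent;
	struct bridge_sketch *self;
};

/* Set or clear VGA forwarding on every bridge between 'bus' and the root. */
static void vga_update_bus_sketch(struct bus_sketch *bus, int enable)
{
	for (; bus; bus = bus->parent) {
		struct bridge_sketch *bridge = bus->self;

		if (!bridge)
			continue;	/* the for-step still moves to the parent */
		if (enable)
			bridge->bridge_control |= BRIDGE_CTL_VGA;
		else
			bridge->bridge_control &= (uint16_t)~BRIDGE_CTL_VGA;
	}
}
```

Other bits in the control word are left untouched, so enabling and then disabling restores the original register value.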

> > +/*
> > + * Currently, we assume that the "initial" setup of the system is
> > + * sane, that is we don't come up with conflicting devices, which
> > + * would be annoying. We could double check and be better at
> > + * deciding who is the default here, but we don't. 
> > + */
> > +void vga_arbiter_add_pci_device(struct pci_dev *pdev)
> > +{
> > +   struct vga_device *vgadev;
> > +   unsigned long flags;
> > +   struct pci_bus *bus;
> > +   struct pci_dev *bridge;
> > +   u16 cmd;
> > +
> > +   /* Only deal with VGA class devices */
> > +   if ((pdev->class >> 8) != PCI_CLASS_DISPLAY_VGA)
> > +   return;
> > +
> > +   /* Allocate structure */
> > +   vgadev = kmalloc(sizeof(struct vga_device), GFP_KERNEL);
> > +   memset(vgadev, 0, sizeof(*vgadev));
> 
> Please consider using kcalloc() here.

Will do.
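
A user-space sketch of the suggested change, added for illustration: calloc() stands in for the kernel's kcalloc(n, size, flags), and the struct and function names are hypothetical. Besides folding the memset() into the allocation, the sketch checks for failure before the memory is used, which the quoted kmalloc()/memset() pair does not.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct vga_device; the fields are made up. */
struct vga_device_sketch {
	int id;
	int locks;
};

/* Allocate a zeroed device structure; returns 0 or -ENOMEM. */
static int vga_device_alloc_sketch(struct vga_device_sketch **out)
{
	/* In the kernel this would presumably be
	 * kcalloc(1, sizeof(*vgadev), GFP_KERNEL). */
	struct vga_device_sketch *vgadev = calloc(1, sizeof(*vgadev));

	if (!vgadev)
		return -ENOMEM;
	*out = vgadev;
	return 0;
}
```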

Ben.




Re: [PATCH] Support for GEODE CPUs

2005-03-09 Thread Lennart Sorensen
On Wed, Mar 09, 2005 at 09:59:26PM +, Alan Cox wrote:
> If you build 486 it will still use the TSC because it is available (The
> PIT is buggy but the kernel knows about that anyway and handles it). 

Hmm, I thought that was the whole point of the different cpu type
choices in the kernel.  Then again the MTRR is still available with a
386 kernel on a newer cpu as far as I remember, so I guess I will try a
486 optimized kernel next.

> There are a few Geode tricks to know for performance
> 
> - Turn off the video

That is the plan long term, although the BIOS's serial console doesn't
seem to work well with grub at least on minicom.  I think switching to
lilo may help that.

> - If you can't turn it off use solid areas of colour to speed the system
> up (The hardware uses RLE encoding to reduce ram fetch bandwidth)
> - Remember the cache is only 16K (12K when running X11 as 4K is borrowed
> for the blitter)

Even more reason to keep the video off (I think it steals some system
ram when on as well).

> - The onboard audio is a software SB emulation on older GX. It burns
> CPU.

I have audio disabled since I have no need for it.

> Also avoid touching various legacy registers as much as possible, many
> cause BIOS traps in SMM emulation code. The list I have is NDA but you
> can use rdtsc/inb or outb/rdtsc to work out which 8)

Only PCI and LPC devices in use, so I don't think I will be poking any
legacy registers directly, although I have no idea if the kernel would
be poking any of them as part of running the drivers.  Hopefully not.

Thanks for the information.

Len Sorensen


Re: [ANNOUNCE 0/6] Open-iSCSI High-Performance Initiator for Linux

2005-03-09 Thread Lars Marowsky-Bree
On 2005-03-08T22:25:29, Alex Aizman <[EMAIL PROTECTED]> wrote:

> There's (or at least was up until today) an ongoing discussion on our 
> mailing list at http://groups-beta.google.com/group/open-iscsi. The 
> short and long of it: the problem can be solved, and it will. Couple 
> simple things we already do: mlockall() to keep the daemon un-swapped, 
> and also looking into potential dependency created by syslog (there's 
> one for 2.4 kernel, not sure if this is an issue for 2.6).

BTW, to get around the very same issues, heartbeat does much the same:
lock itself into memory, reserve a couple of pages more to spare on
stack & heap, run at soft-realtime priority.
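
The three steps just mentioned can be sketched roughly as follows. This is an illustration using standard POSIX calls, not heartbeat's or open-iscsi's actual code; RESERVE_BYTES and the function names are assumptions.

```c
#include <errno.h>
#include <sched.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>

#define RESERVE_BYTES (64 * 1024)	/* hypothetical reserve size */

/* Touch a chunk of stack so those pages are already mapped (and locked). */
static void prefault_stack(void)
{
	volatile char buf[RESERVE_BYTES];
	size_t i;

	for (i = 0; i < sizeof(buf); i += 1024)
		buf[i] = 0;
}

/* Touch a chunk of heap; returns 0 or -ENOMEM. */
static int prefault_heap(void)
{
	char *p = malloc(RESERVE_BYTES);

	if (!p)
		return -ENOMEM;
	memset(p, 0, RESERVE_BYTES);
	free(p);	/* the allocator may keep the pages for reuse; not guaranteed */
	return 0;
}

/* Returns 0 on success, or -errno of the first step that failed. */
static int lock_down_daemon(void)
{
	struct sched_param sp;

	/* 1. lock current and future mappings into RAM */
	if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
		return -errno;

	/* 2. reserve some spare stack and heap while we still can */
	prefault_stack();
	if (prefault_heap() != 0)
		return -ENOMEM;

	/* 3. switch to a soft-realtime scheduling class */
	memset(&sp, 0, sizeof(sp));
	sp.sched_priority = sched_get_priority_min(SCHED_RR);
	if (sched_setscheduler(0, SCHED_RR, &sp) != 0)
		return -errno;

	return 0;
}
```

Both mlockall() and SCHED_RR normally need root or suitably raised resource limits, so a caller should expect a negative return when running unprivileged.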

syslog(), however, sucks.

We went down the path of using our non-blocking IPC library to have all
our various components log to ha_logd, which then logs to syslog() or
writes to disk or wherever.

That works well in our current development series, and if you want to
share code, you can either rip it off (Open Source, we love ya ;) or we
can spin off these parts into a sub-package for you to depend on...

> The sfnet is a learning experience; it is by no means a proof that it 
> cannot be done.

I'd also argue that it MUST be done, because the current way of "Oh,
it's somehow related to block stuff, must be in kernel" leads down to
hell. We better figure out good ways around it ;-)


Sincerely,
Lars Marowsky-Brée <[EMAIL PROTECTED]>

-- 
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business



Re: [BUG] 2.6.11- sym53c8xx Broken on pp64

2005-03-09 Thread Ryan Anderson
On Wed, Mar 09, 2005 at 06:25:56PM -0800, Linus Torvalds wrote:
> On Thu, 10 Mar 2005, Benjamin Herrenschmidt wrote:
> > 
> > BTW, Linus: Any chance you ever change something to version or
> > extraversion in bk just after a release ? I know I already ask and it
> > degenerated into a flamefest, and I don't know if that is specifically
> > the case now, but I keep getting report of people saying "I have a bug
> > in 2.6.xx" while in fact, they have some kind of bk clone of sometime
> > after 2.6.xx...
> 
> The answer is the same: I'd still like to have somebody (preferably Sam)  
> who is comfortable with all the build scripts get a revision-control-
> specific version at build-time, so that BK users would get the top-of-tree 
> key value, and other people could get some CVS revision or something.

I've got something that fixes up the version by adding -BK and then 8
hex characters from the md5 hash of the top of tree changeset key.

I was starting to work on stuffing that same value into a /proc file so
that you can figure out what the tree looked like, but at the moment,
you at least get a semi-random string appended to the version.
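
As an illustration of the resulting string (the patch itself does this in Perl), here is a C sketch that formats such a suffix from an already computed 16-byte MD5 digest. Computing the digest of the changeset key is not shown, and taking the first 8 hex characters is an assumption, since the message only says "8 hex characters". The function name is hypothetical.

```c
#include <stdio.h>

/*
 * Write "-BK" followed by the first 8 hex digits of a 16-byte MD5 digest.
 * 'out' must have room for 12 bytes: 3 + 8 characters plus the NUL.
 * Returns the number of characters written, excluding the NUL.
 */
static int format_bk_suffix(const unsigned char digest[16], char out[12])
{
	return snprintf(out, 12, "-BK%02x%02x%02x%02x",
			digest[0], digest[1], digest[2], digest[3]);
}
```

Appended to the kernel version, a tree would then identify itself as something like 2.6.11-BKdeadbeef rather than plain 2.6.11.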

I re-sent the patch yesterday, but I'll put it here, too:
 
> I have this dim memory that Sam might even have had some early trials, but 
> maybe thats just wishful thinking.. Sam?

I think that was my patch - Sam was going to look at it, but I suspect
it got lost in more interesting things. :)

(I sent a better described version to Andrew yesterday, if you want to
grab that description and use it instead.)
 
Signed-Off-By: Ryan Anderson <[EMAIL PROTECTED]>

diff -Nru a/Makefile b/Makefile
--- a/Makefile  2005-03-09 02:51:15 -05:00
+++ b/Makefile  2005-03-09 02:51:15 -05:00
@@ -550,6 +550,24 @@
 
 #export INSTALL_PATH=/boot
 
+# If CONFIG_LOCALVERSION_AUTO is set, we automatically perform some tests
+# and try to determine if the current source tree is a release tree, of any sort,
+# or if it is a pure development tree.
+# A 'release tree' is any tree with a BitKeeper TAG associated with it.
+# The primary goal of this is to make it safe for a native BitKeeper user to
+# build a release tree (i.e, 2.6.9) and also to continue developing against the
+# current Linus tree, without having the Linus tree overwrite the 2.6.9 tree 
+# when installed.
+#
+# (In the future, CVS and SVN support will be added as well.)
+
+ifeq ($(CONFIG_LOCALVERSION_AUTO),y)
+   ifeq ($(shell ls -d $(srctree)/BitKeeper 2>/dev/null),$(srctree)/BitKeeper)
+   localversion-bk := $(shell $(srctree)/scripts/setlocalversion.sh $(srctree) $(objtree))
+   LOCALVERSION := $(LOCALVERSION)$(localversion-bk)
+   endif
+endif
+
 #
 # INSTALL_MOD_PATH specifies a prefix to MODLIB for module directory
 # relocations required by build roots.  This is not defined in the
diff -Nru a/init/Kconfig b/init/Kconfig
--- a/init/Kconfig  2005-03-09 02:51:15 -05:00
+++ b/init/Kconfig  2005-03-09 02:51:15 -05:00
@@ -69,6 +69,18 @@
  object and source tree, in that order.  Your total string can
  be a maximum of 64 characters.
 
+config LOCALVERSION_AUTO
+   bool "Automatically append version information to the version string"
+   default y
+   help
+ This will try to automatically determine if the current tree is a
+ release tree by looking for BitKeeper tags that belong to the
+ current top of tree revision.
+ A string of the format -BK will be added to the
+ localversion.  The string generated by this will be appended 
+ after any matching localversion* files, and after the 
+ value set in CONFIG_LOCALVERSION
+
 config SWAP
bool "Support for paging of anonymous memory (swap)"
depends on MMU
diff -Nru a/scripts/setlocalversion b/scripts/setlocalversion
--- /dev/null   Wed Dec 31 16:00:00 196900
+++ b/scripts/setlocalversion   2005-03-09 02:51:15 -05:00
@@ -0,0 +1,62 @@
+#!/usr/bin/perl
+# Copyright 2004 - Ryan Anderson <[EMAIL PROTECTED]>  GPL v2
+
+use strict;
+use warnings;
+use Digest::MD5;
+require 5.006;
+
+if (@ARGV != 2) {
+   print < 
+EOT
+   exit(1);
+}
+
+my $debug = 0;
+
+my ($srctree,$objtree) = @ARGV;
+
+my @LOCALVERSIONS = ();
+
+# BitKeeper Version Checks
+
+# We are going to use the following commands to try and determine if
+# this repository is at a Version boundary (i.e, 2.6.10 vs 2.6.10 + some patches)
+# We currently assume that all meaningful version boundaries are marked by a tag.
+# We don't care what the tag is, just that something exists.
+
[EMAIL PROTECTED] ~/dev/linux/local$ T=`bk changes -r+ -k`
[EMAIL PROTECTED] ~/dev/linux/local$ bk prs -h -d':TAG:\n' -r$T
+
+sub do_bk_checks {
+   chdir($srctree);
+   my $changeset = `bk changes -r+ -k`;
+   chomp $changeset;
+   my $tag = `bk prs -h -d':TAG:' -r'$changeset'`;
+
+   printf("ChangeSet Key = '%s'\nTAG = '%s'\n", $changeset, $tag) if ($debug > 0);
+
+   if 

Re: [PATCH] make st seekable again

2005-03-09 Thread Alan Cox
On Mer, 2005-03-09 at 21:58, Kai Makisara wrote:
> While waiting for the application to be fixed, it was decided to restore 
> the old behaviour of the tape drivers.

Which means tar won't get fixed 8(

> I don't think implementing proper read-only lseek for tapes is worth the 
> trouble (reliable tracking of the current location is tricky). Purist 
> kernels can refuse lseeks. Pragmatic kernels can allow lseeks until 
> refusing those won't break common applications.

The problem is that the existing behaviour isn't just 'not useful', it's
badly broken: no locking, no overflow checks, it updates the wrong variable,
etc. It is asking for nasty accidents with critical user data.

Alan


[PATCH 1/3] cciss: new controller support

2005-03-09 Thread mike . miller
This patch adds support for 2 new SAS controllers due out this summer.
It also bumps the version to 2.6.6.

Please consider this for inclusion.

Signed-off-by: Mike Miller <[EMAIL PROTECTED]>

 Documentation/cciss.txt |2 ++
 drivers/block/cciss.c   |   14 ++
 include/linux/pci_ids.h |1 +
 3 files changed, 13 insertions(+), 4 deletions(-)
---
diff -burNp lx2611.orig/Documentation/cciss.txt lx2611-266/Documentation/cciss.txt
--- lx2611.orig/Documentation/cciss.txt 2005-03-03 13:48:36.0 -0600
+++ lx2611-266/Documentation/cciss.txt  2005-03-08 16:39:12.097839144 -0600
@@ -15,6 +15,8 @@ This driver is known to work with the fo
* SA 6400 U320 Expansion Module
* SA 6i
* SA P600
+   * SA P800
+   * SA E400
 
 If nodes are not already created in the /dev/cciss directory, run as root:
 
diff -burNp lx2611.orig/drivers/block/cciss.c lx2611-266/drivers/block/cciss.c
--- lx2611.orig/drivers/block/cciss.c   2005-03-03 13:48:46.0 -0600
+++ lx2611-266/drivers/block/cciss.c 2005-03-08 16:39:01.650427392 -0600
@@ -46,14 +46,14 @@
 #include 
 
 #define CCISS_DRIVER_VERSION(maj,min,submin) ((maj<<16)|(min<<8)|(submin))
-#define DRIVER_NAME "HP CISS Driver (v 2.6.4)"
-#define DRIVER_VERSION CCISS_DRIVER_VERSION(2,6,4)
+#define DRIVER_NAME "HP CISS Driver (v 2.6.6)"
+#define DRIVER_VERSION CCISS_DRIVER_VERSION(2,6,6)
 
 /* Embedded module documentation macros - see modules.h */
 MODULE_AUTHOR("Hewlett-Packard Company");
-MODULE_DESCRIPTION("Driver for HP Controller SA5xxx SA6xxx version 2.6.4");
+MODULE_DESCRIPTION("Driver for HP Controller SA5xxx SA6xxx version 2.6.6");
 MODULE_SUPPORTED_DEVICE("HP SA5i SA5i+ SA532 SA5300 SA5312 SA641 SA642 SA6400"
-   " SA6i P600");
+   " SA6i P600 P800 E400");
 MODULE_LICENSE("GPL");
 
 #include "cciss_cmd.h"
@@ -82,6 +82,10 @@ const struct pci_device_id cciss_pci_dev
0x0E11, 0x4091, 0, 0, 0},
{ PCI_VENDOR_ID_HP, PCI_DEVICE_ID_HP_CISSA,
0x103C, 0x3225, 0, 0, 0},
+   { PCI_VENDOR_ID_HP, PCI_DEVICE_ID_HP_CISSB,
+   0x103c, 0x3223, 0, 0, 0},
+   { PCI_VENDOR_ID_HP, PCI_DEVICE_ID_HP_CISSB,
+   0x103c, 0x3231, 0, 0, 0},
{0,}
 };
 MODULE_DEVICE_TABLE(pci, cciss_pci_device_id);
@@ -103,6 +107,8 @@ static struct board_type products[] = {
{ 0x409D0E11, "Smart Array 6400 EM", &SA5_access},
{ 0x40910E11, "Smart Array 6i", &SA5_access},
{ 0x3225103C, "Smart Array P600", &SA5_access},
+   { 0x3223103C, "Smart Array P800", &SA5_access},
+   { 0x3231103C, "Smart Array E400", &SA5_access},
 };
 
 /* How long to wait (in millesconds) for board to go into simple mode */
diff -burNp lx2611.orig/include/linux/pci_ids.h lx2611-266/include/linux/pci_ids.h
--- lx2611.orig/include/linux/pci_ids.h 2005-03-03 13:49:02.0 -0600
+++ lx2611-266/include/linux/pci_ids.h  2005-03-08 14:16:59.050059584 -0600
@@ -696,6 +696,7 @@
 #define PCI_DEVICE_ID_HP_DIVA_EVEREST  0x1282
 #define PCI_DEVICE_ID_HP_DIVA_AUX  0x1290
 #define PCI_DEVICE_ID_HP_CISSA 0x3220
+#define PCI_DEVICE_ID_HP_CISSB 0x3230
 
 #define PCI_VENDOR_ID_PCTECH   0x1042
 #define PCI_DEVICE_ID_PCTECH_RZ1000 0x1000


[PATCH 3/3] cciss: per disk queue support

2005-03-09 Thread mike . miller
This patch adds per disk queue functionality. It seems that the 2.6 kernel 
expects a queue per disk. If you have multiple logical drives on a controller 
all of the queues actually point back to the same queue. If a drive is deleted 
it blows us out of the water.
We hold the lock during any queue operations and have added what we call a
"fair-enough" algorithm to prevent starving out any drive.

Please consider this for inclusion.
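
To illustrate the "fair-enough" idea described above, here is a simplified user-space model (not the driver code): several per-disk queues share one command pool, and next_to_run rotates which queue gets the first chance at free slots. The struct, the pending[] counters and the small NWD_SKETCH value are hypothetical; draining pending[] stands in for blk_start_queue().

```c
#define NWD_SKETCH 4	/* hypothetical number of logical drives */

struct ctlr_sketch {
	int pending[NWD_SKETCH];	/* requests waiting on each per-disk queue */
	int present[NWD_SKETCH];	/* nonzero if the disk has been added */
	int free_cmds;			/* free slots in the shared command pool */
	int next_to_run;		/* queue that gets the first chance next time */
};

/* Start the queues in round-robin order until the command pool is exhausted. */
static void start_queues_sketch(struct ctlr_sketch *h)
{
	int start_queue = h->next_to_run;
	int j;

	if (h->free_cmds == 0)
		return;		/* nothing was freed, e.g. an ioctl completed */

	for (j = 0; j < NWD_SKETCH; j++) {
		int curr = (start_queue + j) % NWD_SKETCH;

		if (!h->present[curr])
			continue;

		/* stand-in for blk_start_queue(): submit what the pool allows */
		while (h->pending[curr] > 0 && h->free_cmds > 0) {
			h->pending[curr]--;
			h->free_cmds--;
		}

		if (h->free_cmds == 0) {
			/* if the first queue used everything, let the next one lead */
			if (curr == start_queue)
				h->next_to_run = (start_queue + 1) % NWD_SKETCH;
			else
				h->next_to_run = curr;
			return;
		}
	}
}
```

The point of the rotation is that a busy queue which exhausts the pool cannot also be first in line on the following interrupt, so no drive is starved indefinitely.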

Signed-off-by: Mike Miller <[EMAIL PROTECTED]>

 cciss.c |   52 
 cciss.h |5 +
 2 files changed, 53 insertions(+), 4 deletions(-)

diff -burNp lx2611-p002/drivers/block/cciss.c lx2611-p003/drivers/block/cciss.c
--- lx2611-p002/drivers/block/cciss.c   2005-03-08 16:50:47.149175280 -0600
+++ lx2611-p003/drivers/block/cciss.c   2005-03-08 17:17:50.148441888 -0600
@@ -2090,6 +2090,9 @@ static void do_cciss_request(request_que
drive_info_struct *drv;
int i, dir;
 
+   /* We call start_io here in case there is a command waiting on the
+* queue that has not been sent.
+   */
if (blk_queue_plugged(q))
goto startio;
 
@@ -2178,6 +2181,9 @@ queue:
 full:
blk_stop_queue(q);
 startio:
+   /* We will already have the driver lock here so not need
+* to lock it.
+   */
start_io(h);
 }
 
@@ -2187,7 +2193,8 @@ static irqreturn_t do_cciss_intr(int irq
CommandList_struct *c;
unsigned long flags;
__u32 a, a1;
-
+   int j;
+   int start_queue = h->next_to_run;
 
/* Is this interrupt for us? */
if (( h->access.intr_pending(h) == 0) || (h->interrupts_enabled == 0))
@@ -2234,13 +2241,50 @@ static irqreturn_t do_cciss_intr(int irq
}
}
 
-   /*
-* See if we can queue up some more IO
+   /* check to see if we have maxed out the number of commands that can
+* be placed on the queue.  If so then exit.  We do this check here
+* in case the interrupt we serviced was from an ioctl and did not
+* free any new commands.
 */
-   blk_start_queue(h->queue);
+   if ((find_first_zero_bit(h->cmd_pool_bits, NR_CMDS)) == NR_CMDS)
+   goto cleanup;
+ 
+   /* We have room on the queue for more commands.  Now we need to queue
+* them up.  We will also keep track of the next queue to run so
+* that every queue gets a chance to be started first.
+   */
+   for (j=0; j < NWD; j++){
+   int curr_queue = (start_queue + j) % NWD;
+   /* make sure the disk has been added and the drive is real
+* because this can be called from the middle of init_one.
+   */
+   if(!(h->gendisk[curr_queue]->queue) || 
+  !(h->drv[curr_queue].heads))
+   continue;
+   blk_start_queue(h->gendisk[curr_queue]->queue);
+ 
+   /* check to see if we have maxed out the number of commands 
+* that can be placed on the queue.  
+   */
+   if ((find_first_zero_bit(h->cmd_pool_bits, NR_CMDS)) == NR_CMDS)
+   {
+   if (curr_queue == start_queue){
+   h->next_to_run = (start_queue + 1) % NWD;
+   goto cleanup;
+   } else {
+   h->next_to_run = curr_queue;
+   goto cleanup;
+   }
+   } else {
+   curr_queue = (curr_queue + 1) % NWD;
+   }
+   }
+ 
+cleanup:
spin_unlock_irqrestore(CCISS_LOCK(h->ctlr), flags);
return IRQ_HANDLED;
 }
+
 /* 
  *  We cannot read the structure directly, for portablity we must use 
  *   the io functions.
Files lx2611-p002/drivers/block/.cciss.c.rej.swp and 
lx2611-p003/drivers/block/.cciss.c.rej.swp differ
diff -burNp lx2611-p002/drivers/block/cciss.h lx2611-p003/drivers/block/cciss.h
--- lx2611-p002/drivers/block/cciss.h   2005-03-08 16:50:47.150175128 -0600
+++ lx2611-p003/drivers/block/cciss.h   2005-03-08 17:03:04.279114496 -0600
@@ -84,6 +84,11 @@ struct ctlr_info 
int nr_frees; 
int busy_configuring;
 
+   /* This element holds the zero based queue number of the last
+* queue to be started.  It is used for fairness.
+   */
+   int next_to_run;
+
// Disk structures we need to pass back
struct gendisk   *gendisk[NWD];
 #ifdef CONFIG_CISS_SCSI_TAPE


Re: [patch] drm missing memset can crash X server..

2005-03-09 Thread Chris Wright
* Dave Airlie ([EMAIL PROTECTED]) wrote:
> 
> Egbert Eich reported a bug 2673 on bugs.freedesktop.org and tracked it
> down to a missing memset in the setversion ioctl, this causes X server
> crashes...
> 
> From: Egbert Eich <[EMAIL PROTECTED]>
> Signed-off-by: Dave Airlie <[EMAIL PROTECTED]>

Thanks, queued to -stable.
-chris
-- 
Linux Security Modules http://lsm.immunix.org http://lsm.bkbits.net


1/3 swsusp: use non-contiguous memory on resume

2005-03-09 Thread Pavel Machek
Hi!

The following patch is designed to fix a problem in the current
implementation of swsusp in mainline kernels.  Namely, swsusp uses
an array of page backup entries (aka pagedir) to store pointers to memory
pages that must be saved during suspend and restored during resume.

Unfortunately, the pagedir has to be located in a contiguous chunk of memory,
and it sometimes turns out that an order-8 or even order-9 allocation is needed
for this purpose.  Such an allocation is sometimes impossible to get, so
swsusp may fail during either suspend or resume due to the lack of memory,
although theoretically there is enough free memory for it to succeed.

Moreover, swsusp is more likely to fail for this reason during resume, which
means that it may fail during resume after a successful suspend
(this actually has happened for some people, including me :-)) and this,
potentially, may lead to the loss of data.

The problem is fixed by replacing the pagedir with a linked list so that
high-order memory allocations are avoided (the patches make swsusp use only
order-0 allocations).  Unfortunately this means that it's necessary to change
the assembly routines used to restore the image after it's been loaded from
swap so that they walk the list instead of walking the array.

This patch makes swsusp allocate only individual pages during resume.
It contains the necessary changes to the assembly routines etc. for i386
and x86-64.
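
As a reading aid (not part of the patch), the list walk that the new assembly performs can be sketched in C as below. The struct mirrors only the three fields whose offsets the patch exports (address, orig_address, next); the _sketch names and the use of void pointers are assumptions made for the illustration.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096	/* 1024 dwords, as in the i386 "rep movsl" */

/* Hypothetical stand-in for struct pbe, one entry per saved page. */
struct pbe_sketch {
	void *address;			/* where the saved copy sits now */
	void *orig_address;		/* where the page must be copied back to */
	struct pbe_sketch *next;	/* NULL terminates the list */
};

/* Copy every saved page back to its original location; returns the count. */
static unsigned int restore_pages_sketch(const struct pbe_sketch *p)
{
	unsigned int copied = 0;

	for (; p; p = p->next) {	/* testl/jz done ... movl pbe_next */
		memcpy(p->orig_address, p->address, SKETCH_PAGE_SIZE);
		copied++;
	}
	return copied;
}
```

Because each entry only needs a pointer to the next one, the entries can live in separately allocated pages, which is what removes the need for one large contiguous array.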

Please apply,
Pavel

Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]>
Signed-off-by: Pavel Machek <[EMAIL PROTECTED]>


diff -Nru linux-2.6.11-a/arch/i386/kernel/asm-offsets.c linux-2.6.11-b/arch/i386/kernel/asm-offsets.c
--- linux-2.6.11-a/arch/i386/kernel/asm-offsets.c   2005-03-02 08:38:00.0 +0100
+++ linux-2.6.11-b/arch/i386/kernel/asm-offsets.c   2005-03-04 20:14:01.0 +0100
@@ -7,6 +7,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include "sigframe.h"
 #include 
@@ -56,6 +57,11 @@
 
OFFSET(EXEC_DOMAIN_handler, exec_domain, handler);
OFFSET(RT_SIGFRAME_sigcontext, rt_sigframe, uc.uc_mcontext);
+   BLANK();
+
+   OFFSET(pbe_address, pbe, address);
+   OFFSET(pbe_orig_address, pbe, orig_address);
+   OFFSET(pbe_next, pbe, next);
 
/* Offset from the sysenter stack to tss.esp0 */
DEFINE(TSS_sysenter_esp0, offsetof(struct tss_struct, esp0) -
diff -Nru linux-2.6.11-a/arch/i386/power/swsusp.S linux-2.6.11-b/arch/i386/power/swsusp.S
--- linux-2.6.11-a/arch/i386/power/swsusp.S 2005-03-02 08:38:37.0 +0100
+++ linux-2.6.11-b/arch/i386/power/swsusp.S 2005-03-04 20:14:01.0 +0100
@@ -12,6 +12,7 @@
 #include 
 #include 
 #include 
+#include 
 
.text
 
@@ -28,28 +29,28 @@
ret
 
 ENTRY(swsusp_arch_resume)
-   movl $swsusp_pg_dir-__PAGE_OFFSET,%ecx
-   movl %ecx,%cr3
+   movl $swsusp_pg_dir-__PAGE_OFFSET, %ecx
+   movl %ecx, %cr3
 
-   movl pagedir_nosave, %ebx
-   xorl %eax, %eax
-   xorl %edx, %edx
+   movl pagedir_nosave, %edx
.p2align 4,,7
 
 copy_loop:
-   movl 4(%ebx,%edx),%edi
-   movl (%ebx,%edx),%esi
+   testl   %edx, %edx
+   jz  done
+
+   movl pbe_address(%edx), %esi
+   movl pbe_orig_address(%edx), %edi
 
movl $1024, %ecx
rep
movsl
 
-   incl %eax
-   addl $16, %edx
-   cmpl nr_copy_pages,%eax
-   jb copy_loop
+   movl pbe_next(%edx), %edx
+   jmp copy_loop
.p2align 4,,7
 
+done:
movl saved_context_esp, %esp
movl saved_context_ebp, %ebp
movl saved_context_ebx, %ebx
diff -Nru linux-2.6.11-a/arch/x86_64/kernel/asm-offsets.c linux-2.6.11-b/arch/x86_64/kernel/asm-offsets.c
--- linux-2.6.11-a/arch/x86_64/kernel/asm-offsets.c 2005-03-02 08:38:10.0 +0100
+++ linux-2.6.11-b/arch/x86_64/kernel/asm-offsets.c 2005-03-04 20:14:01.0 +0100
@@ -62,8 +62,8 @@
   offsetof (struct rt_sigframe32, uc.uc_mcontext));
BLANK();
 #endif
-   DEFINE(SIZEOF_PBE, sizeof(struct pbe));
DEFINE(pbe_address, offsetof(struct pbe, address));
DEFINE(pbe_orig_address, offsetof(struct pbe, orig_address));
+   DEFINE(pbe_next, offsetof(struct pbe, next));
return 0;
 }
diff -Nru linux-2.6.11-a/arch/x86_64/kernel/suspend_asm.S linux-2.6.11-b/arch/x86_64/kernel/suspend_asm.S
--- linux-2.6.11-a/arch/x86_64/kernel/suspend_asm.S 2005-03-02 08:38:26.0 +0100
+++ linux-2.6.11-b/arch/x86_64/kernel/suspend_asm.S 2005-03-04 20:14:01.0 +0100
@@ -54,16 +54,10 @@
movq %rax, %cr4;  # turn PGE back on
 
movq pagedir_nosave(%rip), %rdx
-   /* compute the limit */
-   movl nr_copy_pages(%rip), %eax
-   testl   %eax, %eax
-   jz  done
-   movq %rdx,%r8
-   movl $SIZEOF_PBE,%r9d
-   mul %r9  # 

Re: [BUG] 2.6.11- sym53c8xx Broken on pp64

2005-03-09 Thread Omkhar Arasaratnam
Linus Torvalds wrote:
> There are certainly sym changes in there too since 2.6.9, let's see if
> James or Willy have any suggestions. It might not be ppc64-specific.
>
> Linus

I have tried with 2.6.10, and this appears to fail as well. Unfortunately I
don't have console access right now, so I will have to confirm the message
in the morning. I'll start bisecting patches once we confirm.

Omkhar


[Patch] resume PIT

2005-03-09 Thread Luming Yu


[PATCH] resume PIT for x86_64


Signed-off-by: Luming Yu <[EMAIL PROTECTED]>




diff -BruN 0/arch/x86_64/kernel/i8259.c 1/arch/x86_64/kernel/i8259.c
--- 0/arch/x86_64/kernel/i8259.c 2005-03-07 23:29:42.0 +0800
+++ 1/arch/x86_64/kernel/i8259.c 2005-03-09 12:53:10.0 +0800
@@ -477,6 +477,7 @@
 void call_function_interrupt(void);
 void invalidate_interrupt(void);
 void thermal_interrupt(void);
+void i8254_timer_resume(void);
 
 static void setup_timer(void)
 {
@@ -493,6 +494,11 @@
return 0;
 }
 
+void i8254_timer_resume(void)
+{
+   setup_timer();
+}
+
 static struct sysdev_class timer_sysclass = {
set_kset_name("timer"),
.resume = timer_resume,
diff -BruN 0/arch/x86_64/kernel/time.c 1/arch/x86_64/kernel/time.c
--- 0/arch/x86_64/kernel/time.c 2005-03-07 23:32:23.0 +0800
+++ 1/arch/x86_64/kernel/time.c 2005-03-09 12:53:10.0 +0800
@@ -46,7 +46,7 @@
 #ifdef CONFIG_CPU_FREQ
 static void cpufreq_delayed_get(void);
 #endif
-
+extern void i8254_timer_resume(void);
 extern int using_apic_timer;
 
 DEFINE_SPINLOCK(rtc_lock);
@@ -980,6 +980,8 @@
 
if (vxtime.hpet_address)
hpet_reenable();
+   else
+   i8254_timer_resume();
 
sec = ctime + clock_cmos_diff;
write_seqlock_irqsave(&xtime_lock,flags);




Re: [PATCH] Add 2.4.x cpufreq /proc and sysctl interface removal feature-removal-schedule

2005-03-09 Thread Dominik Brodowski
On Wed, Mar 09, 2005 at 04:34:38PM -0800, Greg KH wrote:
> ChangeSet 1.2036, 2005/03/09 09:31:40-08:00, [EMAIL PROTECTED]
> 
> [PATCH] Add 2.4.x cpufreq /proc and sysctl interface removal 
> feature-removal-schedule
> 
> Add 2.4.x cpufreq /proc and sysctl interface removal
> to the feature-removal-schedule.
> 
> [PATCH] cpufreq 2.4 interface removal schedule
> 
> Even though these 2.4. interfaces are already gone in Dave Jones' cpufreq
> bitkeeper tree, here's a patch which properly announces it in
> Documentation/feature-removal-schedule.txt:


Both already _were_ in Linus' tree; the entry got removed along with the
cpufreq 2.4. interface. So please do not re-add this entry.

Dominik


Re: [Fastboot] Re: Query: Kdump: Core Image ELF Format

2005-03-09 Thread Vivek Goyal
On Wed, 2005-03-09 at 07:17 -0700, Eric W. Biederman wrote:
> Vivek Goyal <[EMAIL PROTECTED]> writes:
> 
> > On Tue, 2005-03-08 at 11:00 -0700, Eric W. Biederman wrote: 
> > That sounds good. But we lose the advantage of doing limited debugging
> > with gdb. Crash (or other analysis tools) will still take a considerable
> > amount of time before they are fully ready and tested.
> > 
> > How about giving user the flexibility to choose. What I mean is
> > introducing a command line option in kexec-tools to choose between ELF32
> > and ELF64 headers. For the users who are not using PAE systems, they can
> > very well go with ELF32 headers and do the debugging using gdb.
> > 
> > This also requires, setting the kernel virtual addresses while preparing
> > the headers. KVA for linearly mapped region is known in advance and can
> > be filled at header creation time and gdb can directly operate upon this
> > region.
> 
> I have no problems decorating the ELF header you are generating
> in user space with virtual addresses assuming we can reliably
> get that information.  And before a kernel crashes looks like a reasonable
> time to ask that question.  I don't currently see where you could
> derive that information.

I want to fill in the virtual addresses of the linearly mapped region. That
is, physical addresses from 0 to MAXMEM (896 MB) are mapped by the kernel at
virtual addresses PAGE_OFFSET to (PAGE_OFFSET + MAXMEM). The values of
PAGE_OFFSET and MAXMEM are already known and hard-coded.

I think I used the term "kernel virtual address", and that is adding
to the confusion. Kernel virtual addresses are not necessarily linearly
mapped. What I meant was kernel logical addresses, whose associated
physical addresses differ only by a constant offset.
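
The constant-offset relationship described here can be sketched in plain C.
This is an illustration only: the constants assume the common i386 3G/1G
split (PAGE_OFFSET = 0xC0000000) together with the 896 MB MAXMEM mentioned
above, and the names are hypothetical, not taken from kexec-tools or the
kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values: default i386 3G/1G split and the 896 MB lowmem limit. */
#define SKETCH_PAGE_OFFSET 0xC0000000ULL
#define SKETCH_MAXMEM      (896ULL * 1024 * 1024)

/* True if a physical address lies inside the linearly mapped region. */
bool phys_is_linear(uint64_t phys)
{
    return phys < SKETCH_MAXMEM;
}

/* Logical address = physical address + constant offset. */
uint64_t phys_to_logical(uint64_t phys)
{
    assert(phys_is_linear(phys)); /* only valid inside the linear mapping */
    return phys + SKETCH_PAGE_OFFSET;
}

/* Inverse translation for an address inside the linear mapping. */
uint64_t logical_to_phys(uint64_t logical)
{
    assert(logical >= SKETCH_PAGE_OFFSET);
    return logical - SKETCH_PAGE_OFFSET;
}
```

Because both constants are fixed at build time, a header generator could fill
in these addresses without asking the running kernel anything; addresses above
the assumed MAXMEM have no such fixed translation.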

> 
> Beyond that I prefer a little command line tool that will do the
> ELF64 to ELF32 conversion and possibly add in the kva mapping to
> make the core dump usable with gdb.  Doing it in a separate tool
> means it is the developer who is doing the analysis who cares
> not the user who is capturing the system core dump.
> 
> But I do agree that it is a use case worth solving.
> 
> Eric
> 



sata_sil & Seagate HD solution

2005-03-09 Thread John Yau
Hi all,

I recently bought a computer with a Silicon Image 3512 SATA chipset
and a 200GB Seagate ST320082 hard drive without knowledge that these
two pieces of hardware don't play nicely.  However, I called Seagate
tech support and they told me that upgrading my bios would fix the
problem.  Fortunately my motherboard's manufacturer posted an upgrade
2-3 days after I learned of the fix.

I upgraded my motherboard's bios which updated the Silicon Image RAID
bios to 4.3.53.  That seems to have solved the incompatibility
problem.  I have yet to have a crash during intense drive usage
while running with the MOD15 bug blacklist off.

Those poor souls that have a hard disk in the sata_sil blacklist, if
you're willing to risk it, try upgrading your bios, comment out your
hard drive from the black list and see if you're able to run at full
speed without the drive hanging.


John Yau


Re: [PATCH] resync ATI PCI idents into base kernel

2005-03-09 Thread Alan Cox
On Mer, 2005-03-09 at 22:00, Christoph Hellwig wrote:
> Which is?  That you're so special you don't need to care about the
> workflow the ordinary humans have created?

I don't see the connection between your comment and the thread, sorry.

If I send it all to Andrew, what will happen? Andrew can either break it
into zillions of pieces, and everyone will say "But why do we need this",
or apply it. You might want to ask why so many new drivers don't bother
using pci_ids.h; I'd venture to say it's a defensive mechanism against
broken process.

Alan



Re: Linux 2.6.11-ac1

2005-03-09 Thread Alan Cox
On Mer, 2005-03-09 at 22:22, CaT wrote:
> Argh! Ok. I guess I shouldn't've just bought the card based on this
> driver then so that I could better debug my problems with my promise
> cards. 8(

It's good hardware. It does lots of neat things, providing you run -ac
anyway. The raid1 performance is very good and it can do hotplug IDE
transparently in hardware raid modes. It's a good solid little
controller.

Alan



Re: [PATCH] 2.6.10 - direct-io async short read bug

2005-03-09 Thread Andrew Morton
Badari Pulavarty <[EMAIL PROTECTED]> wrote:
>
> On Wed, 2005-03-09 at 11:53, Andrew Morton wrote:
> > Suparna Bhattacharya <[EMAIL PROTECTED]> wrote:
> > >
> > >  >Solaris, which does forcedirectio as a mount option, actually
> > >  > will do buffered I/O on the trailing part.  Consider it like a bounce
> > >  > buffer.  That way they don't DMA the trailing data and succeed the I/O.
> > >  > The I/O returns actual bytes till EOF, just like read(2) is supposed to.
> > >  >Either this or a fully DMA'd number 4 is really what we should
> > >  > do.  If security can only be solved via a bounce buffer, who cares?  If
> > >  > the user created themselves a non-aligned file to open O_DIRECT, that's
> > >  > their problem if the last part-sector is negligably slower.
> > > 
> > >  If writes/truncates take care of zeroing out the rest of the sector
> > >  on disk, might we still be OK without having to do the bounce buffer
> > >  thing ?
> > 
> > We can probably rely on the rest of the sector outside i_size being zeroed
> > anyway.  Because if it contains non-zero gunk then the fs already has a
> > problem, and the user can get at that gunk with an expanding truncate and
> > mmap() anyway.
> > 
> 
> Rest of the sector or rest of the block ?

The filesystem-sized block (1<

> Are you implying that, we
> already do this, so there is no problem reading beyond EOF to user
> buffer ? Or we need to zero out the userbuffer beyond EOF ?

It should be acceptable to assume that the final (1

Re: Direct io on block device has performance regression on 2.6.x kernel

2005-03-09 Thread Andrew Morton
"Chen, Kenneth W" <[EMAIL PROTECTED]> wrote:
>
> > Did you generate a kernel profile?
> 
>  Top 40 kernel hot functions, percentage is normalized to kernel utilization.
> 
>  _spin_unlock_irqrestore  23.54%
>  _spin_unlock_irq 19.27%

Cripes.

Is that with CONFIG_PREEMPT?  If so, and if you disable CONFIG_PREEMPT,
this cost should be accounted to the spin_unlock() caller and we can see
who the culprit is.   Perhaps dio->bio_lock.



Re: [patch 1/1] unified spinlock initialization arch/um/drivers/port_kern.c

2005-03-09 Thread Russell King
On Wed, Mar 09, 2005 at 08:52:24PM +0100, Blaisorblade wrote:
> On Wednesday 09 March 2005 18:12, Russell King wrote:
> > On Wed, Mar 09, 2005 at 10:42:33AM +0100, [EMAIL PROTECTED] wrote:
> > > From: <[EMAIL PROTECTED]>
> > > Cc: , <[EMAIL PROTECTED]>,
> > > <[EMAIL PROTECTED]>, <[EMAIL PROTECTED]>
> > >
> > > Unify the spinlock initialization as far as possible.
> 
> > Are you sure this is really the best option in this instance?
> > Sometimes, static data initialisation is more efficient than
> > code-based manual initialisation, especially when the memory
> > is written to anyway.
> Agreed, theoretically, but this was done for multiple reasons globally, for 
> instance as a preparation to Ingo Molnar's preemption patches. There was 
> mention of this on lwn.net about this:
> 
> http://lwn.net/Articles/108719/

Was this announced on linux-kernel as well?  I don't remember seeing it.

I'm not convinced about the practicality of converting all static
initialisations to code-based initialisations though - I can see
that the number of initialisation functions scattered throughout
the kernel is going to increase dramatically to achieve this.

With a 2.4 to 2.6 kernel bloat already on the order of 140% for
similar functionality, I can only see the kernel getting more and
more bloated _for the same feature level_ because we're trying to
add more features to the kernel.

I'm not entirely convinced that is an all round sane approach.

-- 
Russell King
 Linux kernel2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 Serial core


Re: [2.6 patch] drivers/net/sunhme.c: make a struct static

2005-03-09 Thread David S. Miller
On Sat, 19 Feb 2005 09:36:18 +0100
Adrian Bunk <[EMAIL PROTECTED]> wrote:

> This patch makes a needlessly global struct static.
> 
> Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

Applied, thanks Adrian.


[PATCH 3/9] UML - "Hardware" random number generator

2005-03-09 Thread Jeff Dike
This implements a hardware random number generator for UML which attaches
itself to the host's /dev/random.

Signed-off-by: Jeff Dike <[EMAIL PROTECTED]>

Index: linux-2.6.11/arch/um/Kconfig_char
===
--- linux-2.6.11.orig/arch/um/Kconfig_char  2005-03-08 20:13:55.0 -0500
+++ linux-2.6.11/arch/um/Kconfig_char   2005-03-08 20:22:24.0 -0500
@@ -190,5 +190,19 @@
tristate
default UML_SOUND
 
+config UML_RANDOM
+   tristate "Hardware random number generator"
+   help
+   This option enables UML's "hardware" random number generator.  It
+   attaches itself to the host's /dev/random, supplying as much entropy
+   as the host has, rather than the small amount the UML gets from its
+   own drivers.  It registers itself as a standard hardware random number
+   generator, major 10, minor 183, and the canonical device name is 
+   /dev/hwrng.
+   The way to make use of this is to install the rng-tools package
+   (check your distro, or download from 
+   http://sourceforge.net/projects/gkernel/).  rngd periodically reads 
+   /dev/hwrng and injects the entropy into /dev/random.
+
 endmenu
 
Index: linux-2.6.11/arch/um/defconfig
===
--- linux-2.6.11.orig/arch/um/defconfig 2005-03-08 20:13:55.0 -0500
+++ linux-2.6.11/arch/um/defconfig  2005-03-08 20:22:24.0 -0500
@@ -111,6 +111,7 @@
 CONFIG_UML_SOUND=m
 CONFIG_SOUND=m
 CONFIG_HOSTAUDIO=m
+CONFIG_UML_RANDOM=y
 
 #
 # Block devices
Index: linux-2.6.11/arch/um/drivers/Makefile
===
--- linux-2.6.11.orig/arch/um/drivers/Makefile  2005-03-08 20:17:34.0 -0500
+++ linux-2.6.11/arch/um/drivers/Makefile   2005-03-08 20:22:24.0 -0500
@@ -3,7 +3,7 @@
 # Licensed under the GPL
 #
 
-CHAN_OBJS := chan_kern.o chan_user.o line.o 
+CHAN_OBJS := chan_kern.o chan_user.o line.o
 
 # pcap is broken in 2.5 because kbuild doesn't allow pcap.a to be linked
 # in to pcap.o
@@ -41,7 +41,7 @@
 obj-$(CONFIG_XTERM_CHAN) += xterm.o xterm_kern.o
 obj-$(CONFIG_UML_WATCHDOG) += harddog.o
 obj-$(CONFIG_BLK_DEV_COW_COMMON) += cow_user.o
-
+obj-$(CONFIG_UML_RANDOM) += random.o
 
 USER_SINGLE_OBJS = $(foreach f,$(patsubst %.o,%,$(obj-y) $(obj-m)),$($(f)-objs))
 
Index: linux-2.6.11/arch/um/drivers/random.c
===
--- linux-2.6.11.orig/arch/um/drivers/random.c  2003-09-15 09:40:47.0 -0400
+++ linux-2.6.11/arch/um/drivers/random.c   2005-03-08 20:22:24.0 -0500
@@ -0,0 +1,122 @@
+/* Much of this ripped from hw_random.c */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "os.h"
+
+/*
+ * core module and version information
+ */
+#define RNG_VERSION "1.0.0"
+#define RNG_MODULE_NAME "random"
+#define RNG_DRIVER_NAME   RNG_MODULE_NAME " virtual driver " RNG_VERSION
+#define PFX RNG_MODULE_NAME ": "
+
+#define RNG_MISCDEV_MINOR  183 /* official */
+
+static int random_fd = -1;
+
+static int rng_dev_open (struct inode *inode, struct file *filp)
+{
+   /* enforce read-only access to this chrdev */
+   if ((filp->f_mode & FMODE_READ) == 0)
+   return -EINVAL;
+   if (filp->f_mode & FMODE_WRITE)
+   return -EINVAL;
+
+   return 0;
+}
+
+static ssize_t rng_dev_read (struct file *filp, char __user *buf, size_t size,
+ loff_t * offp)
+{
+u32 data;
+int n, ret = 0, have_data;
+
+while(size){
+n = os_read_file(random_fd, &data, sizeof(data));
+if(n > 0){
+have_data = n;
+while (have_data && size) {
+if (put_user((u8)data, buf++)) {
+ret = ret ? : -EFAULT;
+break;
+}
+size--;
+ret++;
+have_data--;
+data>>=8;
+}
+}
+else if(n == -EAGAIN){
+if (filp->f_flags & O_NONBLOCK)
+return ret ? : -EAGAIN;
+
+if(need_resched()){
+current->state = TASK_INTERRUPTIBLE;
+schedule_timeout(1);
+}
+}
+else return n;
+   if (signal_pending (current))
+   return ret ? : -ERESTARTSYS;
+   }
+   return ret;
+}
+
+static struct file_operations rng_chrdev_ops = {
+   .owner  = THIS_MODULE,
+   .open   = rng_dev_open,
+   .read   = rng_dev_read,
+};
+
+static struct 

Re: Direct io on block device has performance regression on 2.6.x kernel

2005-03-09 Thread Jesse Barnes
On Wednesday, March 9, 2005 3:23 pm, Andi Kleen wrote:
> "Chen, Kenneth W" <[EMAIL PROTECTED]> writes:
> > Just to clarify here, these data need to be taken at grain of salt. A
> > high count in _spin_unlock_* functions do not automatically points to
> > lock contention.  It's one of the blind spot syndrome with timer based
> > profile on ia64.  There are some lock contentions in 2.6 kernel that
> > we are staring at.  Please do not misinterpret the number here.
>
> Why don't you use oprofile? It uses NMIs and can profile "inside"
> interrupt disabled sections.

That was oprofile output, but on ia64, 'NMI's are maskable due to the way irq 
disabling works.  Here's a very hackish patch that changes the kernel to use 
cr.tpr instead of psr.i for interrupt control.  Making oprofile use real ia64 
NMIs is left as an exercise for the reader :)

Jesse
= arch/ia64/Kconfig.debug 1.2 vs edited =
--- 1.2/arch/ia64/Kconfig.debug 2005-01-07 16:15:52 -08:00
+++ edited/arch/ia64/Kconfig.debug  2005-02-28 10:07:27 -08:00
@@ -56,6 +56,15 @@
  and restore instructions.  It's useful for tracking down spinlock
  problems, but slow!  If you're unsure, select N.
 
+config IA64_ALLOW_NMI
+   bool "Allow non-maskable interrupts"
+   help
+ The normal ia64 irq enable/disable code prevents even non-maskable
+ interrupts from occuring, which can be a problem for kernel
+ debuggers, watchdogs, and profilers.  Say Y here if you're interested
+ in NMIs and don't mind the small performance penalty this option
+ imposes.
+
 config SYSVIPC_COMPAT
bool
depends on COMPAT && SYSVIPC
= arch/ia64/kernel/head.S 1.31 vs edited =
--- 1.31/arch/ia64/kernel/head.S2005-01-28 15:50:13 -08:00
+++ edited/arch/ia64/kernel/head.S  2005-03-01 13:17:51 -08:00
@@ -59,6 +59,14 @@
.save rp, r0// terminate unwind chain with a NULL rp
.body
 
+#ifdef CONFIG_IA64_ALLOW_NMI   // disable interrupts initially (re-enabled in start_kernel())
+   mov r16=1<<16
+   ;;
+   mov cr.tpr=r16
+   ;;
+   srlz.d
+   ;;
+#endif
rsm psr.i | psr.ic
;;
srlz.i
@@ -129,8 +137,8 @@
/*
 * Switch into virtual mode:
 */
-   movl r16=(IA64_PSR_IT|IA64_PSR_IC|IA64_PSR_DT|IA64_PSR_RT|IA64_PSR_DFH|IA64_PSR_BN \
- |IA64_PSR_DI)
+   movl r16=(IA64_PSR_IT|IA64_PSR_IC|IA64_PSR_I|IA64_PSR_DT|IA64_PSR_RT|IA64_PSR_DFH|\
+ IA64_PSR_BN|IA64_PSR_DI)
;;
mov cr.ipsr=r16
movl r17=1f
= arch/ia64/kernel/irq_ia64.c 1.25 vs edited =
--- 1.25/arch/ia64/kernel/irq_ia64.c2005-01-22 15:54:49 -08:00
+++ edited/arch/ia64/kernel/irq_ia64.c  2005-03-01 12:50:18 -08:00
@@ -103,8 +103,6 @@
 void
 ia64_handle_irq (ia64_vector vector, struct pt_regs *regs)
 {
-   unsigned long saved_tpr;
-
 #if IRQ_DEBUG
{
unsigned long bsp, sp;
@@ -135,17 +133,9 @@
}
 #endif /* IRQ_DEBUG */
 
-   /*
-* Always set TPR to limit maximum interrupt nesting depth to
-* 16 (without this, it would be ~240, which could easily lead
-* to kernel stack overflows).
-*/
irq_enter();
-   saved_tpr = ia64_getreg(_IA64_REG_CR_TPR);
-   ia64_srlz_d();
while (vector != IA64_SPURIOUS_INT_VECTOR) {
if (!IS_RESCHEDULE(vector)) {
-   ia64_setreg(_IA64_REG_CR_TPR, vector);
ia64_srlz_d();
 
__do_IRQ(local_vector_to_irq(vector), regs);
@@ -154,7 +144,6 @@
 * Disable interrupts and send EOI:
 */
local_irq_disable();
-   ia64_setreg(_IA64_REG_CR_TPR, saved_tpr);
}
ia64_eoi();
vector = ia64_get_ivr();
@@ -165,6 +154,7 @@
 * come through until ia64_eoi() has been done.
 */
irq_exit();
+   local_irq_enable();
 }
 
 #ifdef CONFIG_HOTPLUG_CPU
= include/asm-ia64/hw_irq.h 1.15 vs edited =
--- 1.15/include/asm-ia64/hw_irq.h  2005-01-22 15:54:52 -08:00
+++ edited/include/asm-ia64/hw_irq.h2005-03-01 13:01:03 -08:00
@@ -36,6 +36,10 @@
 
 #define AUTO_ASSIGN-1
 
+#define IA64_NMI_VECTOR 0x02 /* NMI (note that this can be
+  masked if psr.i or psr.ic
+  are cleared) */
+
 #define IA64_SPURIOUS_INT_VECTOR   0x0f
 
 /*
= include/asm-ia64/system.h 1.48 vs edited =
--- 1.48/include/asm-ia64/system.h  2005-01-04 18:48:18 -08:00
+++ edited/include/asm-ia64/system.h2005-03-01 15:28:23 -08:00
@@ -107,12 +107,61 @@
 
 #define safe_halt() ia64_pal_halt_light()/* PAL_HALT_LIGHT */
 
+/* For spinlocks etc */
+#ifdef CONFIG_IA64_ALLOW_NMI
+
+#define IA64_TPR_MMI_BIT 

Re: BUG: Slowdown on 3000 socket-machines tracked down

2005-03-09 Thread Ben Greear
Christian Schmid wrote:
> H can you try the following just to exclude some theories:
> Run it with 4000 sockets and then do the following on the server-machine:
>
> dd if=/dev/zero of=file1 bs=1M count=1024
> dd if=/dev/zero of=file2 bs=1M count=1024
> dd if=/dev/zero of=file3 bs=1M count=1024
> cat file1 > /dev/zero & cat file2 > /dev/zero & cat file3 > /dev/zero &
>
> I THINK it might have something to do with caching-pressure or so. See
> if there is a slow-down on the sending if the page-cache gets full and
> has to be cleared again.
>
> You are running 2.6.11?

Yes, 2.6.11.  I have tuned max_backlog and some other TCP and networking
related settings to give more buffers etc to networking tasks.  I have not
tried any significant disk-IO while doing these tests.

I finally got my systems set up so I can run my WAN emulator at full 1Gbps:
I am getting right at 986Mbps throughput with 30ms round-trip latency
(15ms in both directions).

So, latency does not seem to be the problem either.

I think the problem can be narrowed down to:

1)  Non-optimal kernel network tunings on your server.
2)  Disk-IO (my disk is small and slow compared to a 'real' server, not sure
    I can really test this side of things, and I have not tried as of yet.)
3)  Your clients have much more latency and/or don't have enough bandwidth
    to fully load your server.  Since you didn't answer before:  I assume you
    do not have a reliable test bed and are just hoping that enough clients
    connect to do your benchmarking.
4)  There is something strange with sendfile and/or your application's coding.

My suggestion would be to eliminate these variables by coming up with a
repeatable test bed, alternative traffic generators, WAN/Network emulators
for latency, etc.

Thanks,
Ben
--
Ben Greear <[EMAIL PROTECTED]>
Candela Technologies Inc  http://www.candelatech.com


Re: Direct io on block device has performance regression on 2.6.x kernel

2005-03-09 Thread Andrew Vasquez
On Wed, 09 Mar 2005, Chen, Kenneth W wrote:

> Andrew Morton wrote Wednesday, March 09, 2005 6:26 PM
> > What does "1/3 of the total benchmark performance regression" mean?  One
> > third of 0.1% isn't very impressive.  You haven't told us anything at all
> > about the magnitude of this regression.
> 
> 2.6.9 kernel is 6% slower compare to distributor's 2.4 kernel (RHEL3).
> Roughly 2% came from storage driver (I'm not allowed to say anything
> beyond that, there is a fix though).
> 

Ok now, that statement piqued my interest -- since looking through a
previous email it seems you are using the qla2xxx driver.  Care to
elaborate?

Regards,
Andrew Vasquez


Re: sched_setscheduler and pids/threads

2005-03-09 Thread Robert Love
On Thu, 2005-03-10 at 15:12 +1100, Dave Airlie wrote:

> In 2.6 all my threads appear as a single PID,if I use chrt -p 
> will it set the scheduling priority for my main thread or for all
> threads in the application?

For just the main thread (or the thread of whatever PID you give).  You
need to set the policy on the PID of each thread individually.  The
"everything appears as a single PID" is just an elaborate parlor trick.
Wool pulled over your eyes.

> Can I use the thread IDs from /proc/<pid>/task/ to chrt the other
> threads in my app to different priorities?

You can use the PIDs in /proc/<pid>/task/, yes.

Or you can just set the policy on the main thread before it starts other
threads, or use chrt to launch the program, or use chrt on the PID
of a shell script that starts the application:  Scheduler properties are
inherited.
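
A minimal sketch of the per-thread approach follows, assuming a Linux system
with /proc mounted. The helper names are hypothetical; the only real
interfaces used are the task directory under /proc and sched_setscheduler(2),
which on Linux acts on the single thread whose ID is passed. Changing to
SCHED_FIFO normally needs root or CAP_SYS_NICE.

```c
#include <assert.h>
#include <dirent.h>
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/*
 * Collect the thread IDs listed in the task directory of a process.
 * pid == 0 means the calling process (/proc/self/task).
 * Returns the number of IDs stored, or -1 if the directory cannot be opened.
 */
int list_thread_ids(pid_t pid, pid_t *tids, int max)
{
    char path[64];
    DIR *dir;
    struct dirent *ent;
    int n = 0;

    assert(tids != NULL && max > 0);

    if (pid == 0)
        snprintf(path, sizeof(path), "/proc/self/task");
    else
        snprintf(path, sizeof(path), "/proc/%d/task", (int)pid);

    dir = opendir(path);
    if (!dir)
        return -1;

    while (n < max && (ent = readdir(dir)) != NULL) {
        if (ent->d_name[0] == '.')
            continue; /* skip "." and ".." */
        tids[n++] = (pid_t)atoi(ent->d_name);
    }
    closedir(dir);
    return n;
}

/*
 * Apply SCHED_FIFO at the given priority to every thread of a process.
 * Returns the number of threads changed, or -1 on the first failure
 * (typically EPERM when run without sufficient privilege).
 */
int set_all_threads_fifo(pid_t pid, int priority)
{
    pid_t tids[256];
    struct sched_param param = { .sched_priority = priority };
    int i, n;

    n = list_thread_ids(pid, tids, 256);
    if (n < 0)
        return -1;

    for (i = 0; i < n; i++) {
        if (sched_setscheduler(tids[i], SCHED_FIFO, &param) != 0)
            return -1;
    }
    return n;
}
```

Threads created after the call inherit the policy of the thread that creates
them, which is why setting it once before any threads start is the simpler
route when that is possible.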

Best,

Robert Love




Re: Direct io on block device has performance regression on 2.6.x kernel

2005-03-09 Thread Jesse Barnes
On Wednesday, March 9, 2005 3:23 pm, Andi Kleen wrote:
> "Chen, Kenneth W" <[EMAIL PROTECTED]> writes:
> > Just to clarify here, these data need to be taken at grain of salt. A
> > high count in _spin_unlock_* functions do not automatically points to
> > lock contention.  It's one of the blind spot syndrome with timer based
> > profile on ia64.  There are some lock contentions in 2.6 kernel that
> > we are staring at.  Please do not misinterpret the number here.
>
> Why don't you use oprofile? It uses NMIs and can profile "inside"
> interrupt disabled sections.

Oh, and there are other ways of doing interrupt off profiling by using the 
PMU.  q-tools can do this I think.

Jesse


Re: Linux 2.6.11.2

2005-03-09 Thread Greg KH
On Wed, Mar 09, 2005 at 01:06:31PM -0800, Matt Mackall wrote:
> On Wed, Mar 09, 2005 at 12:39:23AM -0800, Greg KH wrote:
> > And to further test this whole -stable system, I've released 2.6.11.2.
> > It contains one patch, which is already in the -bk tree, and came from
> > the security team (hence the lack of the longer review cycle).
> > 
> > It's available now in the normal kernel.org places:
> > kernel.org/pub/linux/kernel/v2.6/patch-2.6.11.2.gz
> > which is a patch against the 2.6.11.1 release.
> 
> Argh! @*#$&!!&! 
> 
> > If consensus arrives
> > that this patch should be against the 2.6.11 tree, it will be done that
> > way in the future.
> 
> Consensus arrived back when 2.6.8.1 came out.

It did?  So, what was it agreed to be?  Any pointers to that agreement?

> Please, folks, there are automated tools that "know" about kernel
> release numbering and so on. Said tools broke with 2.6.11.1 because it
> wasn't in the same place that 2.6.8.1 was and now this breaks with all
> precedent by being an interdiff along a branch.

2.6.11.1 is now in the proper place, sorry for any inconvenience that
caused.  This happened yesterday.

> Fixing it in the future is too #*$%* late because you've now turned it
> into a special case.

No, I can always respin the patch, and re-release it if it's a problem.

thanks,

greg k-h


3/3 swsusp: enable resume from initrd

2005-03-09 Thread Pavel Machek
Hi!

From: [EMAIL PROTECTED]

When using a fully modularized kernel it is necessary to activate
resume manually as the device node might not be available during
kernel init.

This patch implements a new sysfs attribute '/sys/power/resume' which
allows for manual activation of software resume.  When read from it
prints the configured resume device in 'major:minor' format.  When
written to it expects a device in 'major:minor' format.  This device
is then checked for a suspended image and resume is started if a valid
image is found.  The original functionality is left in place.
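
As a user-space illustration of the 'major:minor' format described above,
here is a small parser. It is only a sketch with a hypothetical function
name; it is not the parsing code from this patch, which is not shown in full
here.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Parse a device given as "major:minor", optionally followed by
 * whitespace such as a trailing newline.
 * Returns 0 on success and -1 on malformed input.
 */
int parse_resume_device(const char *buf, unsigned int *major,
                        unsigned int *minor)
{
    char trailing;
    int fields;

    assert(buf != NULL && major != NULL && minor != NULL);

    /* The final %c only converts if non-whitespace junk follows. */
    fields = sscanf(buf, "%u:%u %c", major, minor, &trailing);
    return (fields == 2) ? 0 : -1;
}
```

An initramfs script would write a string in this form to /sys/power/resume
once the device node for the swap partition exists.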

It should be used from initramfs, or with care.

Please apply,
Pavel
Signed-off-by: Hannes Reinecke <[EMAIL PROTECTED]>
Signed-off-by: Pavel Machek <[EMAIL PROTECTED]>

--- linux.middle/include/linux/suspend.h 2005-02-14 14:14:21.0 +0100
+++ linux/include/linux/suspend.h   2005-03-03 13:23:17.0 +0100
@@ -35,6 +35,8 @@
 
 
 #define SUSPEND_PD_PAGES(x) (((x)*sizeof(struct pbe))/PAGE_SIZE+1)
+
+extern dev_t swsusp_resume_device;

 /* mm/vmscan.c */
 extern int shrink_mem(void);
--- linux.middle/init/do_mounts.c   2005-02-03 22:28:15.0 +0100
+++ linux/init/do_mounts.c  2005-03-03 13:23:17.0 +0100
@@ -53,7 +53,7 @@
 __setup("ro", readonly);
 __setup("rw", readwrite);
 
-static dev_t __init try_name(char *name, int part)
+static dev_t try_name(char *name, int part)
 {
char path[64];
char buf[32];
@@ -135,7 +135,7 @@
  * is mounted on rootfs /sys.
  */
 
-dev_t __init name_to_dev_t(char *name)
+dev_t name_to_dev_t(char *name)
 {
char s[32];
char *p;
--- linux.middle/kernel/power/disk.c2005-03-02 00:22:49.0 +0100
+++ linux/kernel/power/disk.c   2005-03-04 10:15:46.0 +0100
@@ -16,7 +18,6 @@
 #include 
 #include 
 #include 
-#include 
 #include "power.h"
 
 
@@ -25,13 +26,16 @@
 
 extern int swsusp_suspend(void);
 extern int swsusp_write(void);
+extern int swsusp_check(void);
 extern int swsusp_read(void);
+extern void swsusp_close(void);
 extern int swsusp_resume(void);
 extern int swsusp_free(void);
 
 
 static int noresume = 0;
 char resume_file[256] = CONFIG_PM_STD_PARTITION;
+dev_t swsusp_resume_device;
 
 /**
  * power_down - Shut machine down for hibernate.
@@ -123,45 +127,54 @@
 }
 
 
-static int prepare(void)
+static int prepare_processes(void)
 {
int error;
 
pm_prepare_console();
 
sys_sync();
+
if (freeze_processes()) {
error = -EBUSY;
-   goto Thaw;
+   return error;
}
 
if (pm_disk_mode == PM_DISK_PLATFORM) {
if (pm_ops && pm_ops->prepare) {
if ((error = pm_ops->prepare(PM_SUSPEND_DISK)))
-   goto Thaw;
+   return error;
}
}
 
/* Free memory before shutting down devices. */
free_some_memory();
 
+   return 0;
+}
+
+static void unprepare_processes(void)
+{
+   enable_nonboot_cpus();
+   thaw_processes();
+   pm_restore_console();
+}
+
+static int prepare_devices(void)
+{
+   int error;
+
disable_nonboot_cpus();
if ((error = device_suspend(PMSG_FREEZE))) {
printk("Some devices failed to suspend\n");
-   goto Finish;
+   platform_finish();
+   enable_nonboot_cpus();
+   return error;
}
 
return 0;
- Finish:
-   platform_finish();
- Thaw:
-   enable_nonboot_cpus();
-   thaw_processes();
-   pm_restore_console();
-   return error;
 }
 
-
 /**
  * pm_suspend_disk - The granpappy of power management.
  *
@@ -175,8 +188,15 @@
 {
int error;
 
-   if ((error = prepare()))
+   error = prepare_processes();
+   if (!error) {
+   error = prepare_devices();
+   }
+
+   if (error) {
+   unprepare_processes();
return error;
+   }
 
pr_debug("PM: Attempting to suspend to disk.\n");
if (pm_disk_mode == PM_DISK_FIRMWARE)
@@ -225,14 +245,26 @@
return 0;
}
 
+   pr_debug("PM: Checking swsusp image.\n");
+
+   if ((error = swsusp_check()))
+   goto Done;
+
+   pr_debug("PM: Preparing processes for restore.\n");
+
+   if ((error = prepare_processes())) {
+   swsusp_close();
+   goto Cleanup;
+   }
+
pr_debug("PM: Reading swsusp image.\n");
 
if ((error = swsusp_read()))
-   goto Done;
+   goto Cleanup;
 
-   pr_debug("PM: Preparing system for restore.\n");
+   pr_debug("PM: Preparing devices for restore.\n");
 
-   if ((error = prepare()))
+   if ((error = prepare_devices()))
goto Free;
 
barrier();
@@ -244,6 +276,8 @@
finish();
  Free:
swsusp_free();
+ 

RE: [ANNOUNCE][PATCH 2.6.11 2/3] megaraid_sas: Announcing new module for LSI Logic's SAS based MegaRAID controllers

2005-03-09 Thread Bagalkote, Sreenivas
>
>Even for kernels with a 64bit dma_addr_t you can get 32bit dma 
>addresses
>only.  As a start check whether the pci_set_dma_mask for the 64bit mask
>failed - in that case you can always use 32bit SGLs.
>

Please help me understand: If dma_addr_t is 64 bit, I will get 64bit
addresses in the scatterlist regardless of the outcome of pci_set_dma_mask,
won't I? These addresses may have valid or null high addresses. My idea
was to have 32(64) bit SGLs for 32(64) bit dma_addr_t.

Thanks,
Sreenivas
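
The two positions in this exchange can be put side by side in a small userspace sketch. It is not driver code: `choose_sgl_width` and `fits_32bit_sgl` are hypothetical names, and the result of setting the 64-bit DMA mask is passed in as a plain flag rather than obtained from the kernel. It only illustrates the suggestion quoted above (fall back to 32-bit SGLs when the 64-bit mask is refused) and the observation that a 64-bit address type may still carry a value whose high half is zero.

```c
#include <stdint.h>

enum sgl_width { SGL_32BIT, SGL_64BIT };

/*
 * mask64_ok: nonzero if setting the 64-bit DMA mask succeeded.
 * The sketch takes this as an input; a real driver would learn it
 * from the return value of its DMA-mask call.
 */
enum sgl_width choose_sgl_width(int mask64_ok)
{
    return mask64_ok ? SGL_64BIT : SGL_32BIT;
}

/* True if a bus address has no bits set above bit 31. */
int fits_32bit_sgl(uint64_t bus_addr)
{
    return (bus_addr >> 32) == 0;
}
```

With this split, the SGL format follows what the device was told it may address, not the width of the type that happens to hold the address.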


[PATCH 7/9] UML - Speed up tlb flushing

2005-03-09 Thread Jeff Dike
This patch optimizes tlb flushing in a couple of ways to reduce the number
of system calls made to the host in order to update an address space.

Operations are collected, and adjacent ones which can be merged, are.  This
includes consecutive munmaps, mprotects with the same permissions, and mmaps
with the same backing file and permissions and linear in the file.

Second, the munmaps that always preceded mmaps are now done instead of mmap if
necessary.

Signed-off-by: Jeff Dike <[EMAIL PROTECTED]>
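
The merging of adjacent operations described above can be sketched in userspace C for the munmap case. This is a simplified illustration, not the patch's code: `struct unmap_op` and `queue_munmap` are hypothetical stand-ins for `struct host_vm_op` and `add_munmap`, and where the patch hands a full array to a `do_ops` callback, the sketch simply reports that the array is full.

```c
/* Hypothetical, simplified stand-in for the patch's struct host_vm_op:
 * only the munmap case is modelled. */
struct unmap_op {
    unsigned long addr;
    unsigned long len;
};

/*
 * Queue an munmap of [addr, addr + len).  If it starts exactly where the
 * last queued op ends, extend that op instead of adding a new one.
 * Returns the new number of queued ops, or -1 if the array is full.
 */
int queue_munmap(struct unmap_op *ops, int count, int max,
                 unsigned long addr, unsigned long len)
{
    if (count > 0) {
        struct unmap_op *last = &ops[count - 1];

        if (last->addr + last->len == addr) {
            last->len += len;   /* merge: one host call instead of two */
            return count;
        }
    }
    if (count >= max)
        return -1;              /* caller would flush the queue here */
    ops[count].addr = addr;
    ops[count].len = len;
    return count + 1;
}
```

Queuing 0x1000..0x2000 and then 0x2000..0x3000 leaves a single op covering both pages, which is where the saving in host system calls comes from.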

Index: linux-2.6.11/arch/um/include/tlb.h
===
--- linux-2.6.11.orig/arch/um/include/tlb.h 2005-03-08 20:17:35.0 -0500
+++ linux-2.6.11/arch/um/include/tlb.h  2005-03-08 22:22:23.0 -0500
@@ -6,9 +6,48 @@
 #ifndef __TLB_H__
 #define __TLB_H__
 
+#include "um_mmu.h"
+
+struct host_vm_op {
+   enum { MMAP, MUNMAP, MPROTECT } type;
+   union {
+   struct {
+   unsigned long addr;
+   unsigned long len;
+   unsigned int r:1;
+   unsigned int w:1;
+   unsigned int x:1;
+   int fd;
+   __u64 offset;
+   } mmap;
+   struct {
+   unsigned long addr;
+   unsigned long len;
+   } munmap;
+   struct {
+   unsigned long addr;
+   unsigned long len;
+   unsigned int r:1;
+   unsigned int w:1;
+   unsigned int x:1;
+   } mprotect;
+   } u;
+};
+
 extern void mprotect_kernel_vm(int w);
 extern void force_flush_all(void);
 
+extern int add_mmap(unsigned long virt, unsigned long phys, unsigned long len,
+   int r, int w, int x, struct host_vm_op *ops, int index, 
+   int last_filled, int data, 
+   void (*do_ops)(int, struct host_vm_op *, int));
+extern int add_munmap(unsigned long addr, unsigned long len, 
+ struct host_vm_op *ops, int index, int last_filled, 
+ int data, void (*do_ops)(int, struct host_vm_op *, int));
+extern int add_mprotect(unsigned long addr, unsigned long len, int r, int w, 
+   int x, struct host_vm_op *ops, int index, 
+   int last_filled, int data,
+   void (*do_ops)(int, struct host_vm_op *, int));
 #endif
 
 /*
Index: linux-2.6.11/arch/um/kernel/skas/include/skas.h
===
--- linux-2.6.11.orig/arch/um/kernel/skas/include/skas.h 2005-03-08 20:17:35.0 -0500
+++ linux-2.6.11/arch/um/kernel/skas/include/skas.h 2005-03-08 22:22:23.0 -0500
@@ -22,11 +22,11 @@
 extern void remove_sigstack(void);
 extern void new_thread_handler(int sig);
 extern void handle_syscall(union uml_pt_regs *regs);
-extern void map(int fd, unsigned long virt, unsigned long phys, 
-   unsigned long len, int r, int w, int x);
-extern int unmap(int fd, void *addr, int len);
+extern void map(int fd, unsigned long virt, unsigned long len, int r, int w, 
+   int x, int phys_fd, unsigned long long offset);
+extern int unmap(int fd, void *addr, unsigned long len);
 extern int protect(int fd, unsigned long addr, unsigned long len, 
-  int r, int w, int x, int must_succeed);
+  int r, int w, int x);
 extern void user_signal(int sig, union uml_pt_regs *regs);
 extern int new_mm(int from);
 extern void start_userspace(int cpu);
Index: linux-2.6.11/arch/um/kernel/skas/mem_user.c
===
--- linux-2.6.11.orig/arch/um/kernel/skas/mem_user.c 2005-03-08 21:56:38.0 -0500
+++ linux-2.6.11/arch/um/kernel/skas/mem_user.c 2005-03-08 22:22:23.0 -0500
@@ -11,16 +11,14 @@
 #include "os.h"
 #include "proc_mm.h"
 
-void map(int fd, unsigned long virt, unsigned long phys, unsigned long len, 
-int r, int w, int x)
+void map(int fd, unsigned long virt, unsigned long len, int r, int w, 
+int x, int phys_fd, unsigned long long offset)
 {
struct proc_mm_op map;
-   __u64 offset;
-   int prot, n, phys_fd;
+   int prot, n;
 
prot = (r ? PROT_READ : 0) | (w ? PROT_WRITE : 0) | 
(x ? PROT_EXEC : 0);
-   phys_fd = phys_mapping(phys, );
 
map = ((struct proc_mm_op) { .op= MM_MMAP,
 .u = 
@@ -38,7 +36,7 @@
printk("map : /proc/mm map failed, err = %d\n", -n);
 }
 
-int unmap(int fd, void *addr, int len)
+int unmap(int fd, void *addr, unsigned long len)
 {
struct proc_mm_op unmap;
int n;
Index: linux-2.6.11/arch/um/kernel/skas/tlb.c
===
--- 

[PATCH 5/9] UML - change semaphores to completions

2005-03-09 Thread Jeff Dike
From: Esben Nielsen 

One of the problems was use of direct architecture specific semaphores
(which doesn't work under PREEMPT_REALTIME) and in places where a quick
(maybe too quick) look at the code told me that completions ought to be
used. Therefore I changed two semaphores to completions which compiled
fine. I have tried the change on 2.6.11-rc2, and it seemed to work, but I
have not tested it heavily.

Signed-off-by: Jeff Dike <[EMAIL PROTECTED]>
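
For readers unfamiliar with the two primitives, here is a userspace analogue of a completion built on pthreads. It is a sketch only: the `*_sketch` names are hypothetical, and the kernel's implementation differs. The point is the shape of the API that the patch switches to: one side waits for an event, the other side announces it.

```c
#include <pthread.h>

/* Userspace analogue of a kernel completion (hypothetical names). */
struct completion_sketch {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    unsigned int    done;   /* number of events signalled but not yet consumed */
};

void init_completion_sketch(struct completion_sketch *c)
{
    pthread_mutex_init(&c->lock, NULL);
    pthread_cond_init(&c->cond, NULL);
    c->done = 0;
}

/* Announce one event: the role up() played before the patch. */
void complete_sketch(struct completion_sketch *c)
{
    pthread_mutex_lock(&c->lock);
    c->done++;
    pthread_cond_signal(&c->cond);
    pthread_mutex_unlock(&c->lock);
}

/* Block until an event has been announced: the role down() played. */
void wait_for_completion_sketch(struct completion_sketch *c)
{
    pthread_mutex_lock(&c->lock);
    while (c->done == 0)
        pthread_cond_wait(&c->cond, &c->lock);
    c->done--;
    pthread_mutex_unlock(&c->lock);
}
```

An event announced before anyone waits is not lost, because the counter remembers it; that matches how the semaphores initialised to 0 were being used in the code the patch touches.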

Index: linux-2.6.11/arch/um/drivers/port_kern.c
===
--- linux-2.6.11.orig/arch/um/drivers/port_kern.c 2005-03-08 20:17:34.0 -0500
+++ linux-2.6.11/arch/um/drivers/port_kern.c 2005-03-08 22:16:48.0 -0500
@@ -25,7 +25,7 @@
struct list_head list;
atomic_t wait_count;
int has_connection;
-   struct semaphore sem;
+   struct completion done;
int port;
int fd;
spinlock_t lock;
@@ -68,7 +68,7 @@
conn->fd = fd;
list_add(>list, >port->connections);
 
-   up(>port->sem);
+   complete(>port->done);
return(IRQ_HANDLED);
 }
 
@@ -197,13 +197,14 @@
{ .list = LIST_HEAD_INIT(port->list),
  .wait_count   = ATOMIC_INIT(0),
  .has_connection   = 0,
- .sem  = __SEMAPHORE_INITIALIZER(port->sem, 
- 0),
  .lock = SPIN_LOCK_UNLOCKED,
  .port = port_num,
  .fd   = fd,
  .pending  = LIST_HEAD_INIT(port->pending),
  .connections  = LIST_HEAD_INIT(port->connections) });
+
+   init_completion(>done), 
+
list_add(>list, );
 
  found:
@@ -237,7 +238,7 @@
 atomic_inc(>wait_count);
while(1){
fd = -ERESTARTSYS;
-   if(down_interruptible(>sem))
+if(wait_for_completion_interruptible(>done))
 goto out;
 
spin_lock(>lock);
@@ -308,14 +309,3 @@
 }
 
 __uml_exitcall(free_port);
-
-/*
- * Overrides for Emacs so that we follow Linus's tabbing style.
- * Emacs will notice this stuff at the end of the file and automatically
- * adjust the settings for this buffer only.  This must remain at the end
- * of the file.
- * ---
- * Local variables:
- * c-file-style: "linux"
- * End:
- */
Index: linux-2.6.11/arch/um/drivers/xterm_kern.c
===
--- linux-2.6.11.orig/arch/um/drivers/xterm_kern.c 2005-03-08 20:17:34.0 -0500
+++ linux-2.6.11/arch/um/drivers/xterm_kern.c 2005-03-08 22:16:48.0 -0500
@@ -16,7 +16,7 @@
 #include "xterm.h"
 
 struct xterm_wait {
-   struct semaphore sem;
+   struct completion ready;
int fd;
int pid;
int new_fd;
@@ -32,7 +32,7 @@
return(IRQ_NONE);
 
xterm->new_fd = fd;
-   up(>sem);
+   complete(>ready);
return(IRQ_HANDLED);
 }
 
@@ -49,10 +49,10 @@
 
/* This is a locked semaphore... */
*data = ((struct xterm_wait) 
-   { .sem  = __SEMAPHORE_INITIALIZER(data->sem, 0),
- .fd   = socket,
+   { .fd   = socket,
  .pid  = -1,
  .new_fd   = -1 });
+   init_completion(>ready);
 
err = um_request_irq(XTERM_IRQ, socket, IRQ_READ, xterm_interrupt, 
 SA_INTERRUPT | SA_SHIRQ | SA_SAMPLE_RANDOM, 
@@ -68,7 +68,7 @@
 *
 * XXX Note, if the xterm doesn't work for some reason (eg. DISPLAY
 * isn't set) this will hang... */
-   down(>sem);
+   wait_for_completion(>ready);
 
free_irq_by_irq_and_dev(XTERM_IRQ, data);
free_irq(XTERM_IRQ, data);



RE: Direct io on block device has performance regression on 2.6.x kernel

2005-03-09 Thread David Lang
On Wed, 9 Mar 2005, Chen, Kenneth W wrote:

>> Also, I'm rather peeved that we're hearing about this regression now rather
>> than two years ago.  And mystified as to why yours is the only group which
>> has reported it.
>
> 2.6.X kernel has never been faster than the 2.4 kernel (RHEL3).  At one point
> of time, around 2.6.2, the gap is pretty close, at around 1%, but still slower.
> Around 2.6.5, we found global plug list is causing huge lock contention on
> 32-way numa box.  That got fixed in 2.6.7.  Then comes 2.6.8 which took a big
> dip at close to 20% regression.  Then we fixed 17% regression in the scheduler
> (fixed with cache_decay_tick).  2.6.9 is the last one we measured and it is 6%
> slower.  It's a constant moving target, a wild goose to chase.
>
> I don't know why other people have not reported the problem, perhaps they
> haven't got a chance to run transaction processing db workload on 2.6 kernel.
> Perhaps they have not compared, perhaps they are working on the same problem.
> I just don't know.

Also the 2.6 kernel is so much better in the case where you have many
threads (even if they are all completely idle) that that improvement may
be masking the regression that Ken is reporting (I've seen a 50%
performance hit on 2.4 with just a thousand or two threads compared to
2.6). Let's face it, a typical linux box today starts up a LOT of stuff
that will never get used, but will count as an idle thread.

David Lang
--
There are two ways of constructing a software design. One way is to make it so 
simple that there are obviously no deficiencies. And the other way is to make 
it so complicated that there are no obvious deficiencies.
 -- C.A.R. Hoare


Re: Direct io on block device has performance regression on 2.6.x kernel

2005-03-09 Thread David Lang
On Wed, 9 Mar 2005, Andrew Morton wrote:

> David Lang <[EMAIL PROTECTED]> wrote:
>> (I've seen a 50%
>>  performance hit on 2.4 with just a thousand or two threads compared to
>>  2.6)
>
> Was that 2.4 kernel a vendor kernel with the O(1) scheduler?

No, a kernel.org kernel. The 2.6 kernel is so much faster for this
workload that I switched for this app and never looked back. I found that
if I had 5000 or so idle tasks 2.4 performance would drop to about a
quarter of 2.6 (with the CPU system time being the limiting factor)

David Lang
--
There are two ways of constructing a software design. One way is to make it so 
simple that there are obviously no deficiencies. And the other way is to make 
it so complicated that there are no obvious deficiencies.
 -- C.A.R. Hoare


sched_setscheduler and pids/threads

2005-03-09 Thread Dave Airlie
Hi all,

I'm a bit confused over 2.6 threading with respect to real time
scheduling settings...

In 2.6 all my threads appear as a single PID. If I use chrt -p <pid>,
will it set the scheduling priority for my main thread or for all
threads in the application?

Can I use the thread IDs from /proc/<pid>/task/ to chrt the other
threads in my app to different priorities?

Thanks,
Dave.
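
One way to explore this from C, offered as a sketch and not as an answer from the list: on Linux the scheduling policy is a per-thread attribute, and the `sched_*` system calls accept a thread ID where they take a pid (passing the process ID addresses the main thread). The helper below, whose name `list_thread_policies` is made up for this example, walks `/proc/<pid>/task/` and queries each thread's policy with `sched_getscheduler()`. It only reads state, so it needs no special privileges.

```c
#include <dirent.h>
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Walk /proc/<pid>/task and print each thread's scheduling policy.
 * Returns the number of threads seen, or -1 if the directory
 * cannot be opened.
 */
int list_thread_policies(pid_t pid)
{
    char path[64];
    struct dirent *ent;
    DIR *dir;
    int count = 0;

    snprintf(path, sizeof(path), "/proc/%d/task", (int)pid);
    dir = opendir(path);
    if (!dir)
        return -1;

    while ((ent = readdir(dir)) != NULL) {
        pid_t tid;

        if (ent->d_name[0] == '.')      /* skip "." and ".." */
            continue;
        tid = (pid_t)atoi(ent->d_name);
        /* The thread ID is passed where the call takes a pid. */
        printf("tid %d: policy %d\n", (int)tid, sched_getscheduler(tid));
        count++;
    }
    closedir(dir);
    return count;
}
```

Changing a single thread's priority would follow the same pattern with `sched_setscheduler(tid, policy, &param)`, which for real-time policies typically requires elevated privileges.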


Re: Direct io on block device has performance regression on 2.6.x kernel

2005-03-09 Thread Andrew Morton
"Chen, Kenneth W" <[EMAIL PROTECTED]> wrote:
>
> Andrew Morton wrote Wednesday, March 09, 2005 6:26 PM
> > What does "1/3 of the total benchmark performance regression" mean?  One
> > third of 0.1% isn't very impressive.  You haven't told us anything at all
> > about the magnitude of this regression.
> 
> 2.6.9 kernel is 6% slower compare to distributor's 2.4 kernel (RHEL3).  Roughly
> 2% came from storage driver (I'm not allowed to say anything beyond that, there
> is a fix though).

The codepaths are indeed longer in 2.6.

> 2% came from DIO.

hm, that's not a lot.

Once you redo that patch to use aops and to work with O_DIRECT, the paths
will get a little deeper, but not much.  We really should do this so that
O_DIRECT works, and in case someone has gone and mmapped the blockdev.

Fine-grained alignment is probably too hard, and it should fall back to
__blockdev_direct_IO().

Does it do the right thing with a request which is non-page-aligned, but
512-byte aligned?
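
To make the case being asked about concrete, here is a small alignment check. It is only a sketch with a hypothetical name, `io_is_aligned`, and is not taken from the patch under discussion; it assumes the alignment is a power of two.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * True if both the file offset and the length are multiples of align.
 * align must be a power of two (e.g. 512 or 4096).
 */
int io_is_aligned(uint64_t offset, size_t len, unsigned int align)
{
    return ((offset | (uint64_t)len) & (uint64_t)(align - 1)) == 0;
}
```

A request at offset 512 with length 1024 passes the 512-byte check but fails the 4096-byte one, which is the situation where a page-granular fast path would need to fall back to the general direct-IO code.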

readv and writev?

2% is pretty thin :(

> The rest of 2% is still unaccounted for.  We don't know where.

General cache replacement, perhaps.  9MB is a big cache though.

> ...
> Around 2.6.5, we found global plug list is causing huge lock contention on
> 32-way numa box.  That got fixed in 2.6.7.  Then comes 2.6.8 which took a big
> dip at close to 20% regression.  Then we fixed 17% regression in the scheduler
> (fixed with cache_decay_tick).  2.6.9 is the last one we measured and it is 6%
> slower.  It's a constant moving target, a wild goose to chase.
> 

OK.  Seems that the 2.4 O(1) scheduler got it right for that machine.

> haven't got a chance to run transaction processing db workload on 2.6 kernel.
> Perhaps they have not compared, perhaps they are working on the same problem.
> I just don't know.

Maybe there are other factors which drown these little things out:
architecture improvements, choice of architecture, driver changes, etc.



Re: [PATCH] Add TPM hardware enablement driver

2005-03-09 Thread Jeff Garzik
Greg KH wrote:
diff -Nru a/drivers/char/tpm/tpm.c b/drivers/char/tpm/tpm.c
--- /dev/null	Wed Dec 31 16:00:00 196900
+++ b/drivers/char/tpm/tpm.c	2005-03-09 16:40:26 -08:00
@@ -0,0 +1,697 @@
+/*
+ * Copyright (C) 2004 IBM Corporation
+ *
+ * Authors:
+ * Leendert van Doorn <[EMAIL PROTECTED]>
+ * Dave Safford <[EMAIL PROTECTED]>
+ * Reiner Sailer <[EMAIL PROTECTED]>
+ * Kylene Hall <[EMAIL PROTECTED]>
+ *
+ * Maintained by: <[EMAIL PROTECTED]>
+ *
+ * Device driver for TCG/TCPA TPM (trusted platform module).
+ * Specifications at www.trustedcomputinggroup.org	 
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation, version 2 of the
+ * License.
+ * 
+ * Note, the TPM chip is not interrupt driven (only polling)
+ * and can have very long timeouts (minutes!). Hence the unusual
+ * calls to schedule_timeout.
+ *
+ */
+
+#include 
+#include 
+#include 
+#include "tpm.h"
+
+#define	TPM_MINOR			224	/* officially assigned */
+
+#define	TPM_BUFSIZE			2048
+
+/* PCI configuration addresses */
+#define	PCI_GEN_PMCON_1			0xA0
+#define	PCI_GEN1_DEC			0xE4
+#define	PCI_LPC_EN			0xE6
+#define	PCI_GEN2_DEC			0xEC
enums preferred to #defines, as these provide more type information, and 
are more visible in a debugger.
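
A sketch of what that suggestion could look like for the four addresses quoted above. The values are copied from the patch; the enum tag `tpm_pci_config_addr` is a made-up name for illustration.

```c
/* PCI configuration addresses, values as in the quoted #defines. */
enum tpm_pci_config_addr {
    PCI_GEN_PMCON_1 = 0xA0,
    PCI_GEN1_DEC    = 0xE4,
    PCI_LPC_EN      = 0xE6,
    PCI_GEN2_DEC    = 0xEC
};
```

Unlike macros, the enumerator names survive preprocessing, so a debugger can show `PCI_GEN1_DEC` where it would otherwise show a bare 0xE4.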


+static LIST_HEAD(tpm_chip_list);
+static spinlock_t driver_lock = SPIN_LOCK_UNLOCKED;
+static int dev_mask[32];
don't use '32', create a constant and use that constant instead.

+static void user_reader_timeout(unsigned long ptr)
+{
+   struct tpm_chip *chip = (struct tpm_chip *) ptr;
+
+   down(>buffer_mutex);
+   atomic_set(>data_pending, 0);
+   memset(chip->data_buffer, 0, TPM_BUFSIZE);
+   up(>buffer_mutex);
+}
+
+void tpm_time_expired(unsigned long ptr)
+{
+   int *exp = (int *) ptr;
+   *exp = 1;
+}
+
+EXPORT_SYMBOL_GPL(tpm_time_expired);
+
+/*
+ * Initialize the LPC bus and enable the TPM ports
+ */
+int tpm_lpc_bus_init(struct pci_dev *pci_dev, u16 base)
+{
+   u32 lpcenable, tmp;
+   int is_lpcm = 0;
+
+   switch (pci_dev->vendor) {
+   case PCI_VENDOR_ID_INTEL:
+   switch (pci_dev->device) {
+   case PCI_DEVICE_ID_INTEL_82801CA_12:
+   case PCI_DEVICE_ID_INTEL_82801DB_12:
+   is_lpcm = 1;
+   break;
+   }
+   /* init ICH (enable LPC) */
+   pci_read_config_dword(pci_dev, PCI_GEN1_DEC, );
+   lpcenable |= 0x2000;
+   pci_write_config_dword(pci_dev, PCI_GEN1_DEC, lpcenable);
+
+   if (is_lpcm) {
+   pci_read_config_dword(pci_dev, PCI_GEN1_DEC,
+ );
+   if ((lpcenable & 0x2000) == 0) {
+   dev_err(_dev->dev,
+   "cannot enable LPC\n");
+   return -ENODEV;
+   }
+   }
+
+   /* initialize TPM registers */
+   pci_read_config_dword(pci_dev, PCI_GEN2_DEC, );
+
+   if (!is_lpcm)
+   tmp = (tmp & 0x) | (base & 0xFFF0);
+   else
+   tmp =
+   (tmp & 0x) | (base & 0xFFF0) |
+   0x0001;
+
+   pci_write_config_dword(pci_dev, PCI_GEN2_DEC, tmp);
+
+   if (is_lpcm) {
+   pci_read_config_dword(pci_dev, PCI_GEN_PMCON_1,
+ );
+   tmp |= 0x0004;  /* enable CLKRUN */
+   pci_write_config_dword(pci_dev, PCI_GEN_PMCON_1,
+  tmp);
+   }
+   tpm_write_index(0x0D, 0x55);/* unlock 4F */
+   tpm_write_index(0x0A, 0x00);/* int disable */
+   tpm_write_index(0x08, base);/* base addr lo */
+   tpm_write_index(0x09, (base & 0xFF00) >> 8);  /* base addr hi */
+   tpm_write_index(0x0D, 0xAA);/* lock 4F */
please define symbol names for the TPM registers

+   break;
+   case PCI_VENDOR_ID_AMD:
+   /* nothing yet */
+   break;
+   }
+
+   return 0;
+}
+
+EXPORT_SYMBOL_GPL(tpm_lpc_bus_init);
+
+/*
+ * Internal kernel interface to transmit TPM commands
+ */
+static ssize_t tpm_transmit(struct tpm_chip *chip, const char *buf,
+   size_t bufsiz)
+{
+   ssize_t len;
+   u32 count;
+   __be32 *native_size;
+
+   native_size = (__force __be32 *) (buf + 2);
+   count = be32_to_cpu(*native_size);
+
+   if (count == 0)
+   return -ENODATA;
+   if (count > bufsiz) {
+   dev_err(>pci_dev->dev,
+   "invalid count value %x %x \n", count, bufsiz);
+

Re: Direct io on block device has performance regression on 2.6.x kernel

2005-03-09 Thread Andrew Morton
David Lang <[EMAIL PROTECTED]> wrote:
>
> (I've seen a 50% 
>  performance hit on 2.4 with just a thousand or two threads compared to 
>  2.6)

Was that 2.4 kernel a vendor kernel with the O(1) scheduler?


[PATCH 6/9] UML - Remove build dependency on perl

2005-03-09 Thread Jeff Dike
To quote .config into config.c for building the result into the code, use sed
instead of perl, as requested by one "embedded" UML user (who notes that
perl is a big requirement, while busybox provides sed, which is used in this
patch).

I've tested that there are only cosmetic differences in the produced
config.c file, which don't change the result at all (i.e. "a" is replaced by
"" "a" at the beginning, which is non-significant).

Reported by, and initial patch provided by, Rob Landley.
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <[EMAIL PROTECTED]>
Signed-off-by: Jeff Dike <[EMAIL PROTECTED]>

Index: linux-2.6.11/arch/um/kernel/Makefile
===
--- linux-2.6.11.orig/arch/um/kernel/Makefile 2005-03-08 20:17:35.0 -0500
+++ linux-2.6.11/arch/um/kernel/Makefile 2005-03-08 22:19:21.0 -0500
@@ -4,7 +4,7 @@
 #
 
 extra-y := vmlinux.lds
-clean-files := vmlinux.lds.S
+clean-files := vmlinux.lds.S config.tmp
 
 obj-y = checksum.o config.o exec_kern.o exitcode.o \
helper.o init_task.o irq.o irq_user.o ksyms.o main.o mem.o mem_user.o \
@@ -34,11 +34,25 @@
 $(USER_OBJS) : %.o: %.c
$(CC) $(USER_CFLAGS) $(CFLAGS_$(notdir $@)) -c -o $@ $<
 
-QUOTE = 'my $$config=`cat $(TOPDIR)/.config`; $$config =~ s/"/\\"/g ; $$config =~ s/\n/\\n"\n"/g ; while() { $$_ =~ s/CONFIG/$$config/; print $$_ }'
+targets += config.c
 
-quiet_cmd_quote = QUOTE   $@
-cmd_quote = $(PERL) -e $(QUOTE) < $< > $@
+# Be careful with the below Sed code - sed is pitfall-rich!
+# We use sed to lower build requirements, for "embedded" builders for instance.
 
-targets += config.c
-$(obj)/config.c : $(src)/config.c.in $(TOPDIR)/.config FORCE
-   $(call if_changed,quote)
+$(obj)/config.tmp: $(objtree)/.config FORCE
+   $(call if_changed,quote1)
+
+quiet_cmd_quote1 = QUOTE   $@
+  cmd_quote1 = sed -e 's/"/\\"/g' -e 's/^/"/' -e 's/$$/\\n"/' \
+  $< > $@
+
+$(obj)/config.c: $(src)/config.c.in $(obj)/config.tmp FORCE
+   $(call if_changed,quote2)
+
+quiet_cmd_quote2 = QUOTE   $@
+  cmd_quote2 = sed -e '/CONFIG/{'  \
+ -e 's/"CONFIG"\;/""/'\
+ -e 'r $(obj)/config.tmp' \
+ -e 'a""\;'   \
+ -e '}'   \
+ $< > $@



Re: [PATCH 0/15] ptwalk: pagetable walker cleanup

2005-03-09 Thread Benjamin Herrenschmidt
On Wed, 2005-03-09 at 22:05 +, Hugh Dickins wrote:
> Here's a cleanup of the pagetable walkers, in common and i386 code,
> based on 2.6.11-bk5.  Mainly to make them all go the same simpler way,
> so they're easier to follow with less room for error; but also to reduce
> the code size and speed it up a little.  These are janitorial changes,
> other arches may follow whenever it suits them.
>
> .../...

Do you have them on HTTP somewhere ? Apparently, a few of the 15 patches
didn't make it to me.

There are some other bugs introduced by set_pte_at() caused by latent
bugs in the PTE walkers that 'drop' part of the address along the way,
notably the vmalloc.c ones are bogus, thus breaking ppc/ppc64 in subtle
ways. Before I send patches, I'd rather check if it's not all fixed by
your patches first :)

Ben.




[PATCH] USB: move usb core to use class_simple instead of it's own class functions.

2005-03-09 Thread Greg KH
ChangeSet 1.2051, 2005/03/09 12:17:18-08:00, [EMAIL PROTECTED]

[PATCH] USB: move usb core to use class_simple instead of it's own class functions.

This is needed if the class code is going to be made easier to use, and it
makes the code smaller and easier to understand.

Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>


 drivers/usb/core/file.c |   55 
 1 files changed, 19 insertions(+), 36 deletions(-)


diff -Nru a/drivers/usb/core/file.c b/drivers/usb/core/file.c
--- a/drivers/usb/core/file.c   2005-03-09 16:28:31 -08:00
+++ b/drivers/usb/core/file.c   2005-03-09 16:28:31 -08:00
@@ -66,16 +66,7 @@
.open = usb_open,
 };
 
-static void release_usb_class_dev(struct class_device *class_dev)
-{
-   dbg("%s - %s", __FUNCTION__, class_dev->class_id);
-   kfree(class_dev);
-}
-
-static struct class usb_class = {
-   .name   = "usb",
-   .release= _usb_class_dev,
-};
+static struct class_simple *usb_class;
 
 int usb_major_init(void)
 {
@@ -87,9 +78,9 @@
goto out;
}
 
-   error = class_register(_class);
-   if (error) {
-   err("class_register failed for usb devices");
+   usb_class = class_simple_create(THIS_MODULE, "usb");
+   if (IS_ERR(usb_class)) {
+   err("class_simple_create failed for usb devices");
unregister_chrdev(USB_MAJOR, "usb");
goto out;
}
@@ -102,7 +93,7 @@
 
 void usb_major_cleanup(void)
 {
-   class_unregister(_class);
+   class_simple_destroy(usb_class);
devfs_remove("usb");
unregister_chrdev(USB_MAJOR, "usb");
 }
@@ -134,7 +125,6 @@
int minor_base = class_driver->minor_base;
int minor = 0;
char name[BUS_ID_SIZE];
-   struct class_device *class_dev;
char *temp;
 
 #ifdef CONFIG_USB_DYNAMIC_MINORS
@@ -174,22 +164,18 @@
devfs_mk_cdev(MKDEV(USB_MAJOR, minor), class_driver->mode, name);
 
/* create a usb class device for this usb interface */
-   class_dev = kmalloc(sizeof(*class_dev), GFP_KERNEL);
-   if (class_dev) {
-   memset(class_dev, 0x00, sizeof(struct class_device));
-   class_dev->devt = MKDEV(USB_MAJOR, minor);
-   class_dev->class = _class;
-   class_dev->dev = >dev;
-
-   temp = strrchr(name, '/');
-   if (temp && (temp[1] != 0x00))
-   ++temp;
-   else
-   temp = name;
-   snprintf(class_dev->class_id, BUS_ID_SIZE, "%s", temp);
-   class_set_devdata(class_dev, (void *)(long)intf->minor);
-   class_device_register(class_dev);
-   intf->class_dev = class_dev;
+   temp = strrchr(name, '/');
+   if (temp && (temp[1] != 0x00))
+   ++temp;
+   else
+   temp = name;
+   intf->class_dev = class_simple_device_add(usb_class, MKDEV(USB_MAJOR, minor), >dev, "%s", temp);
+   if (IS_ERR(intf->class_dev)) {
+   spin_lock (_lock);
+   usb_minors[intf->minor] = NULL;
+   spin_unlock (_lock);
+   devfs_remove (name);
+   retval = PTR_ERR(intf->class_dev);
}
 exit:
return retval;
@@ -232,11 +218,8 @@
 
snprintf(name, BUS_ID_SIZE, class_driver->name, intf->minor - 
minor_base);
devfs_remove (name);
-
-   if (intf->class_dev) {
-   class_device_unregister(intf->class_dev);
-   intf->class_dev = NULL;
-   }
+   class_simple_device_remove(MKDEV(USB_MAJOR, intf->minor));
+   intf->class_dev = NULL;
intf->minor = -1;
 }
 EXPORT_SYMBOL(usb_deregister_dev);



Re: BUG: Slowdown on 3000 socket-machines tracked down

2005-03-09 Thread Christian Schmid
> So, maybe a VM problem?  That would be a good place to focus since
> I think we can be fairly certain it isn't a problem in just the
> networking code.  Otherwise, my tests would show lower bandwidth.

Thanks to your tests I am really sure that it's no network-code problem anymore. But what I THINK it
is: The network is allocating buffers dynamically and if the VM doesn't provide those buffers fast
enough, it locks as well. Addendum: If I throttle to 100 MBit it doesn't slow down even with 5000
sockets. What do you think? I think it's about having to free cache more quickly than possible. But
then, why is CPU still at 30%? Might there be some limit per cycle? For example if that "cleaner"
wakes up every 10 ms and cleans max X pages, it would explain an artificial limit.
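
The arithmetic behind that guess is easy to write down. The sketch below is purely illustrative: the function name is made up, the numbers are hypothetical inputs, and nothing here claims that the kernel actually reclaims memory this way.

```c
#include <stdint.h>

/*
 * Upper bound on bytes freed per second if a periodic cleaner frees at
 * most pages_per_wakeup pages of page_size bytes every interval_ms.
 */
uint64_t reclaim_ceiling_bytes_per_sec(uint64_t pages_per_wakeup,
                                       uint64_t page_size,
                                       uint64_t interval_ms)
{
    if (interval_ms == 0)
        return 0;
    return pages_per_wakeup * page_size * 1000 / interval_ms;
}
```

For instance, 256 pages of 4096 bytes every 10 ms would cap reclaim at 104857600 bytes per second (100 MiB/s), regardless of how idle the CPU is. A cap of that shape would look like a throughput plateau with CPU to spare.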

Chris


Re: [PATCH] 2.6.10 - direct-io async short read bug

2005-03-09 Thread Badari Pulavarty
On Wed, 2005-03-09 at 14:39, Andrew Morton wrote:
> Badari Pulavarty <[EMAIL PROTECTED]> wrote:
> >
> > On Wed, 2005-03-09 at 11:53, Andrew Morton wrote:
> > > Suparna Bhattacharya <[EMAIL PROTECTED]> wrote:
> > > >
> > > >  >  Solaris, which does forcedirectio as a mount option, actually
> > > >  > will do buffered I/O on the trailing part.  Consider it like a bounce
> > > >  > buffer.  That way they don't DMA the trailing data and succeed the I/O.
> > > >  > The I/O returns actual bytes till EOF, just like read(2) is supposed to.
> > > >  >  Either this or a fully DMA'd number 4 is really what we should
> > > >  > do.  If security can only be solved via a bounce buffer, who cares?  If
> > > >  > the user created themselves a non-aligned file to open O_DIRECT, that's
> > > >  > their problem if the last part-sector is negligibly slower.
> > > > 
> > > >  If writes/truncates take care of zeroing out the rest of the sector
> > > >  on disk, might we still be OK without having to do the bounce buffer
> > > >  thing ?
> > > 
> > > We can probably rely on the rest of the sector outside i_size being zeroed
> > > anyway.  Because if it contains non-zero gunk then the fs already has a
> > > problem, and the user can get at that gunk with an expanding truncate and
> > > mmap() anyway.
> > > 
> > 
> > Rest of the sector or rest of the block ?
> 
> The filesystem-sized block (1< have zeroes outside i_size.
> 
> There's one situation where it might not be zeroed out, and that's when the
> final page is mapped MAP_SHARED and the application modifies that page
> outside i_size while writeout is actually in flight.  We can't do much about
> that.
> 
> > Are you implying that, we
> > already do this, so there is no problem reading beyond EOF to user
> > buffer ? Or we need to zero out the userbuffer beyond EOF ?
> 
> It should be acceptable to assume that the final (1< the file contains zeroes outside i_size.
> 
> And if it doesn't contain those zeroes, well, applications are able to read
> that data already.  Although I wouldn't count that as a security hole: that
> data is something which an application wrote there while writing the file,
> rather than being left-over uncontrolled stuff.


Well, in that case - the original patch sent out is good enough to fix
the problem. All the original patch did was, after completing the IO,
truncate the returned size to the file size. The problem is only with the last
block in the file. If the file ends in the middle of the block, we
go ahead and read till the end of the block. I was trying to address
that issue. But, if the block is already zeroed, just truncating the
size after the IO is complete should be good enough.
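
The "truncate the size after the IO" step amounts to a clamp against the file size. The function below is a sketch of that idea only; `clamp_read_to_eof` is a hypothetical name and this is not the code from the patch.

```c
#include <stdint.h>

/*
 * bytes_done: bytes transferred by a block-granular direct read that
 * started at offset.  Returns the byte count to report to the caller so
 * that the read never appears to extend past i_size.
 */
int64_t clamp_read_to_eof(int64_t offset, int64_t bytes_done, int64_t i_size)
{
    if (offset >= i_size)
        return 0;                   /* read started at or past EOF */
    if (offset + bytes_done > i_size)
        return i_size - offset;     /* last block straddles EOF */
    return bytes_done;
}
```

With a 5000-byte file, a 4096-byte read at offset 4096 transfers a whole block but reports 904 bytes, which is what read(2) semantics call for.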


Thanks,
Badari



Re: BUG: Slowdown on 3000 socket-machines tracked down

2005-03-09 Thread Ben Greear
Christian Schmid wrote:

>> Yes, 2.6.11.  I have tuned max_backlog and some other TCP and networking
>> related settings to give more buffers etc to networking tasks.  I have not
>> tried any significant disk-IO while doing these tests.
>>
>> I finally got my systems set up so I can run my WAN emulator at full 1Gbps:
>>
>> I am getting right at 986Mbps throughput with 30ms round-trip latency
>> (15ms in both directions).
>>
>> So, latency does not seem to be the problem either.
>>
>> I think the problem can be narrowed down to:
>>
>> 1)  Non-optimal kernel network tunings on your server.
>
> I used all the default-settings on 2.6.11

Here are my settings.  Hopefully it will be clear what I'm
talking about..yell if you need details.  Please note that I explicitly
set the send buffers to 128k and the rcv to 16k in my test so the min and max
socket queue lengths do not matter here.

my $dflt_tx_queue_len = 2000;   # Ethernet driver transmit-queue length.  Might be worth making
                                # it bigger for GigE nics.
my $netdev_max_backlog = 5000;  # Maximum number of packets, queued on the INPUT side, when
                                # the interface receives pkts faster than it can process them.
my $wmem_max = 4096000;     # Write memory buffer.  This is probably fine for any setup,
                            # and could be smaller (256000) for < 5Mbps connections.
my $wmem_default = 128000;  # Write memory buffer.  This is probably fine for any setup,
                            # and could be smaller (256000) for < 5Mbps connections.
my $rmem_max = 8096000;     # Receive memory (packet) buffer.  If you are running
                            # lots of very fast traffic,
                            # you may want to make this larger if you are running over
                            # fast, high-latency networks.
                            # For < 5Mbps of traffic, 512000 should be fine.
my $rmem_default = 128000;  # Receive memory (packet) buffer.

# If this is not 1, then the tcp_* settings below will not be applied.
my $modify_tcp_settings = 1;

# See the kernel documentation: Documentation/networking/ip-sysctl.txt
my $tcp_rmem_min = 4096;
my $tcp_rmem_default = 256000;  # TCP specific receive memory pool size.
my $tcp_rmem_max = 3000;  # TCP specific receive memory pool size.
my $tcp_wmem_min = 4096;
my $tcp_wmem_default = 256000;  # TCP specific write memory pool size.
my $tcp_wmem_max = 3000;  # TCP specific write memory pool size.
my $tcp_mem_lo   = 2000; # Below here there is no memory pressure.
my $tcp_mem_pressure = 3000; # Can use up to 30MB for TCP buffers.
my $tcp_mem_high = 6000; # Can use up to 60MB for TCP buffers.


2)  Disk-IO (my disk is small and slow compared to a 'real' server, not sure I can
 really test this side of things, and I have not tried as of yet.)

This doesn't explain the speed-up when I change lower_zone_protection
from 0 to 1024. It also doesn't explain the slowdown on 2.6.11 compared
to 2.6.10.
Disk-IO uses buffers, so a change here could easily starve the rest
of your system.  I'm just saying I can't reliably test this.  To be honest,
my machines are already throwing allocation failures in the ethernet drivers
and I've had the OOM killer kill my main process several times.  So, my machines
are running right at their memory limit, even w/out any disk IO.
3)  Your clients have much more latency and/or don't have enough bandwidth
 to fully load your server.  Since you didn't answer before:  I assume you
 do not have a reliable test bed and are just hoping that enough clients connect
 to do your benchmarking.

Yes I just wait until they connect. On the graph it only takes about 2 
minutes until 3000 sockets are created again.
But, you could get unlucky and have 3000 people on a shitty dialup
connection connect to you.  That does not make it easy to reliably
test the system.

4)  There is something strange with sendfile and/or your application's coding.

I am not doing more than calling sendfile. There is nothing one can do wrong.

My suggestion would be to eliminate these variables by coming up with a repeatable
test bed, alternative traffic generators, WAN/Network emulators for latency, etc.

The problem still is that, first, it speeds up immediately when
lower_zone_protection is raised to 1024. This proves it is NOT a
disk-bottleneck. And second, it got much worse with 2.6.11, and
lower_zone_protection disappeared in 2.6.11.
So, maybe a VM problem?  That would be a good place to focus since
I think we can be fairly certain it isn't a problem in just the
networking code.  Otherwise, my tests would show lower bandwidth.
Ben
--
Ben Greear <[EMAIL PROTECTED]>
Candela Technologies Inc  http://www.candelatech.com
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: bk commits and dates

2005-03-09 Thread David S. Miller
On Thu, 10 Mar 2005 13:41:59 +1100
Benjamin Herrenschmidt <[EMAIL PROTECTED]> wrote:

> I don't know if I'm the only one to have a problem with that, but it
> would be nice if it was possible, when you pull a bk tree, to have the
> commit messages for the csets in that tree be dated from the day you
> pulled, and not the day when they went in the source tree.

When I'm working, I just do "bk csets" after I pull from Linus's
tree to review what went in since the last time I pulled.


Re: [BUG] 2.6.11- sym53c8xx Broken on pp64

2005-03-09 Thread Linus Torvalds


On Wed, 9 Mar 2005, Omkhar Arasaratnam wrote:
> 
> I confirmed that this occurs with the 2.6.11 code straight from 
> kernel.org Here is an error from the bringup:

So if 2.6.9 works, and 2.6.11 does not, can you check 2.6.10? And perhaps 
hunt it down even more, to a -rc release?

> sym0: No NVRAM, ID 7, Fast-80 LVD, parity checking
> CACHE TEST FAILED: DMA error (dstat=0xa0) .sym0: CACHE INCORRECTLY CONFIGURED
> sym0: giving up ...

There are certainly sym changes in there too since 2.6.9, let's see if 
James or Willy have any suggestions. It might not be ppc64-specific.

Linus


RE: Direct io on block device has performance regression on 2.6.x kernel

2005-03-09 Thread Chen, Kenneth W
Andrew Morton wrote Wednesday, March 09, 2005 6:26 PM
> What does "1/3 of the total benchmark performance regression" mean?  One
> third of 0.1% isn't very impressive.  You haven't told us anything at all
> about the magnitude of this regression.

The 2.6.9 kernel is 6% slower compared to the distributor's 2.4 kernel (RHEL3).  Roughly
2% came from the storage driver (I'm not allowed to say anything beyond that, there
is a fix though).

2% came from DIO.

The remaining 2% is still unaccounted for.  We don't know where it goes.

> How much system time?  User time?  All that stuff.
20.5% in the kernel, 79.5% in user space.


> But the first thing to do is to work out where the cycles are going to.
You've seen the profile.  That's where all the cycles went.


> Also, I'm rather peeved that we're hearing about this regression now rather
> than two years ago.  And mystified as to why yours is the only group which
> has reported it.

The 2.6.X kernel has never been faster than the 2.4 kernel (RHEL3).  At one point,
around 2.6.2, the gap was pretty close, at around 1%, but still slower.
Around 2.6.5, we found the global plug list was causing huge lock contention on a
32-way numa box.  That got fixed in 2.6.7.  Then came 2.6.8, which took a big
dip at close to 20% regression.  Then we fixed a 17% regression in the scheduler
(fixed with cache_decay_tick).  2.6.9 is the last one we measured and it is 6%
slower.  It's a constant moving target, a wild goose to chase.

I don't know why other people have not reported the problem, perhaps they
haven't had a chance to run a transaction processing db workload on a 2.6 kernel.
Perhaps they have not compared, perhaps they are working on the same problem.
I just don't know.

- Ken




[PATCH] tpm_atmel build fix

2005-03-09 Thread Greg KH
ChangeSet 1.2038, 2005/03/09 10:13:15-08:00, [EMAIL PROTECTED]

[PATCH] tpm_atmel build fix

drivers/char/tpm/tpm_atmel.c:131: unknown field `fops' specified in initializer
drivers/char/tpm/tpm_atmel.c:131: warning: missing braces around initializer


Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>


 drivers/char/tpm/tpm_atmel.c |2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)


diff -Nru a/drivers/char/tpm/tpm_atmel.c b/drivers/char/tpm/tpm_atmel.c
--- a/drivers/char/tpm/tpm_atmel.c  2005-03-09 16:40:05 -08:00
+++ b/drivers/char/tpm/tpm_atmel.c  2005-03-09 16:40:05 -08:00
@@ -128,7 +128,7 @@
.req_complete_mask = ATML_STATUS_BUSY | ATML_STATUS_DATA_AVAIL,
.req_complete_val = ATML_STATUS_DATA_AVAIL,
.base = TPM_ATML_BASE,
-   .miscdev.fops = _ops,
+   .miscdev = { .fops = _ops, },
 };
 
 static int __devinit tpm_atml_init(struct pci_dev *pci_dev,
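
The build errors quoted in this changeset come from the `.miscdev.fops = ...` form of designated initializer, which the compiler that produced them evidently did not accept; the fix spells the nested member out with its own brace list. A minimal standalone sketch of that brace form, using made-up struct and variable names rather than the real TPM driver types:

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures named in the patch. */
struct file_ops {
	int (*open)(void);
};

struct misc_dev {
	const char *name;
	const struct file_ops *fops;
};

struct vendor_specific {
	unsigned long base;
	struct misc_dev miscdev;
};

static const struct file_ops example_ops = { .open = NULL };

/*
 * The nested member is initialised through an inner brace list instead of
 * the ".miscdev.fops = ..." shorthand.  Members not named (here
 * miscdev.name) are zero-initialised.
 */
static struct vendor_specific example_chip = {
	.base = 0x4e,			/* arbitrary illustrative value */
	.miscdev = { .fops = &example_ops, },
};
```

Both spellings mean the same thing to a compiler that supports them; the brace form is simply the more widely accepted one.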



[PATCH] aoe: update documentation for udev users

2005-03-09 Thread Greg KH
ChangeSet 1.2037, 2005/03/09 10:21:15-08:00, [EMAIL PROTECTED]

[PATCH] aoe: update documentation for udev users

Bodo Eggert <[EMAIL PROTECTED]> writes:

> Ed L Cashin <[EMAIL PROTECTED]> wrote:
>
>> +if test -z "$conf"; then
>> +        conf="`find /etc -type f -name udev.conf 2> /dev/null`"
>> +fi
>> +if test -z "$conf" || test ! -r $conf; then
>> +        echo "$me Error: could not find readable udev.conf in /etc" 1>&2
>> +        exit 1
>> +fi
>
> This will fail and print
> ---
> bash: test: etc/udev.conf: binary operator expected
> ---
> if there is more than one udev.conf.
>
> Fix: Always put quotes around variables.

Thanks.  With the changes below, it still will complain if it finds
more than one udev.conf, but only if /etc/udev/udev.conf doesn't
exist.

Quote all shell variables, and use /etc/udev/udev.conf if available.

Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>


 Documentation/aoe/udev-install.sh |   14 +-
 1 files changed, 9 insertions(+), 5 deletions(-)


diff -Nru a/Documentation/aoe/udev-install.sh b/Documentation/aoe/udev-install.sh
--- a/Documentation/aoe/udev-install.sh 2005-03-09 16:16:00 -08:00
+++ b/Documentation/aoe/udev-install.sh 2005-03-09 16:16:00 -08:00
@@ -8,11 +8,15 @@
 # (or environment can specify where to find udev.conf)
 #
 if test -z "$conf"; then
-   conf="`find /etc -type f -name udev.conf 2> /dev/null`"
-fi
-if test -z "$conf" || test ! -r $conf; then
-   echo "$me Error: could not find readable udev.conf in /etc" 1>&2
-   exit 1
+   if test -r /etc/udev/udev.conf; then
+   conf=/etc/udev/udev.conf
+   else
+   conf="`find /etc -type f -name udev.conf 2> /dev/null`"
+   if test -z "$conf" || test ! -r "$conf"; then
+   echo "$me Error: no udev.conf found" 1>&2
+   exit 1
+   fi
+   fi
 fi
 
 # find the directory where udev rules are stored, often



Re: bk commits and dates

2005-03-09 Thread Michael Ellerman
Two's company ...

On Thu, 10 Mar 2005 13:41, Benjamin Herrenschmidt wrote:
> While we are at such requests ...
>
> When you pull from one of the trees, like netdev, the commit messages
> are sent to the bk commit list with the original date stamp of the patch
> in the netdev tree.
>
> For example, if Jeff committed a patch from somebody in his netdev tree 3
> weeks ago, and you pull Jeff's tree today, we'll get all the commit
> messages today, but dated from 3 weeks ago.
>
> That means that in my mailing list archive, where my mailer sorts them
> by date, I can't say, for example, everything that is before the 2.6.11
> tag release was in 2.6.11. It's also difficult to spot "new" stuffs as
> they can arrive with dates weeks ago, and thus show up in places I will
> not look for.
>
> I don't know if I'm the only one to have a problem with that, but it
> would be nice if it was possible, when you pull a bk tree, to have the
> commit messages for the csets in that tree be dated from the day you
> pulled, and not the day when they went in the source tree.
>
> Ben.






Re: [BUG] 2.6.11- sym53c8xx Broken on pp64

2005-03-09 Thread Omkhar Arasaratnam
Omkhar Arasaratnam wrote:
> Benjamin Herrenschmidt wrote:
>> Are you sure it's plain 2.6.11 and not some bk clone of after 2.6.11 was
>> released ?
>
> Ben - I am in the process of downloading a clean tarball from
> kernel.org to be 100% certain.

I confirmed that this occurs with the 2.6.11 code straight from
kernel.org. Here is an error from the bringup:

sym0: No NVRAM, ID 7, Fast-80 LVD, parity checking
CACHE TEST FAILED: DMA error (dstat=0xa0) .sym0: CACHE INCORRECTLY CONFIGURED
sym0: giving up ...

ideas?
Omkhar


Re: [Linux-fbdev-devel] [announce 0/7] fbsplash - The Framebuffer Splash

2005-03-09 Thread Sebastian Kügler
Christoph Hellwig wrote:
> On Wed, Mar 09, 2005 at 12:38:42PM +0100, Pavel Machek wrote:
>> > > Fbsplash - The Framebuffer Splash - is a feature that allows
>> > > displaying images in the background of consoles that use fbcon. The
>> > > project is partially descended from bootsplash.
>> > 
>> > What are you trying to do exactly? I really don't see the point of this
>> > patch.
>> 
>> At least some Debians,
> 
> While there might be a kernel-patch-bootsplash package in Debian none of
> the shipped binary kernels use this.

The standard Ubuntu kernel does.

Regards,

sebas
-- 
  http://vizZzion.org   |   GPG Key ID: 9119 0EF9 
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
One is not superior merely because one sees the world as odious. -
Chateaubriand (1768-1848)




[PATCH] videodev: pass dev_t to the class core

2005-03-09 Thread Greg KH
ChangeSet 1.2044, 2005/03/09 09:52:10-08:00, [EMAIL PROTECTED]

[PATCH] videodev: pass dev_t to the class core

Signed-off-by: Kay Sievers <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>


 drivers/media/video/videodev.c |   11 +--
 1 files changed, 1 insertion(+), 10 deletions(-)


diff -Nru a/drivers/media/video/videodev.c b/drivers/media/video/videodev.c
--- a/drivers/media/video/videodev.c    2005-03-09 16:29:20 -08:00
+++ b/drivers/media/video/videodev.c    2005-03-09 16:29:20 -08:00
@@ -46,15 +46,7 @@
return sprintf(buf,"%.*s\n",(int)sizeof(vfd->name),vfd->name);
 }
 
-static ssize_t show_dev(struct class_device *cd, char *buf)
-{
-   struct video_device *vfd = container_of(cd, struct video_device, class_dev);
-   dev_t dev = MKDEV(VIDEO_MAJOR, vfd->minor);
-   return print_dev_t(buf,dev);
-}
-
 static CLASS_DEVICE_ATTR(name, S_IRUGO, show_name, NULL);
-static CLASS_DEVICE_ATTR(dev,  S_IRUGO, show_dev, NULL);
 
 struct video_device *video_device_alloc(void)
 {
@@ -347,12 +339,11 @@
if (vfd->dev)
vfd->class_dev.dev = vfd->dev;
vfd->class_dev.class   = _class;
+   vfd->class_dev.devt   = MKDEV(VIDEO_MAJOR, vfd->minor);
strlcpy(vfd->class_dev.class_id, vfd->devfs_name + 4, BUS_ID_SIZE);
class_device_register(>class_dev);
class_device_create_file(>class_dev,
 _device_attr_name);
-   class_device_create_file(>class_dev,
-_device_attr_dev);
 
 #if 1 /* needed until all drivers are fixed */
if (!vfd->release)
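
As background for the change above: the removed show_dev() only turned a major/minor pair into the "major:minor" text exposed through the dev attribute, which the class core can produce itself once devt is filled in. A userspace sketch of that encoding, using glibc's makedev() in place of the kernel's MKDEV() and assuming the video4linux major number is 81; the function name is made up for illustration:

```c
#include <stdio.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

#define EXAMPLE_VIDEO_MAJOR 81	/* assumed value of VIDEO_MAJOR */

/*
 * Userspace analogue of the removed helper: combine major and minor into
 * a dev_t, then write it out as "major:minor\n".
 * Returns the number of characters written (as snprintf does).
 */
int format_video_dev(char *buf, size_t len, unsigned int minor_nr)
{
	dev_t dev = makedev(EXAMPLE_VIDEO_MAJOR, minor_nr);

	return snprintf(buf, len, "%u:%u\n", major(dev), minor(dev));
}
```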



Re: bk commits and dates

2005-03-09 Thread Linus Torvalds


On Thu, 10 Mar 2005, Benjamin Herrenschmidt wrote:
> 
> I don't know if I'm the only one to have a problem with that, but it
> would be nice if it was possible, when you pull a bk tree, to have the
> commit messages for the csets in that tree be dated from the day you
> pulled, and not the day when they went in the source tree.

Nope, that's against how BK works. It's really distributed, so "my" tree 
has no special meaning, and as such the fact that I pull has no meaning 
either - it doesn't trigger as anything special.

The only thing that ends up being special is when it hits the public tree
which has the trigger to send out the emails. IOW, the date of the _email_
is special (in that it says when a commit hit the public tree), not
the commit changesets themselves.

Now, if James' trigger scripts set the date of the email by the date of the 
commit, that sounds like a misfeature, but you'd better talk to James, not 
me, since he's the one doing that part..

Linus


Re: [PATCH 2/2] No-exec support for ppc64

2005-03-09 Thread Olof Johansson
Hi,

On Tue, Mar 08, 2005 at 05:13:26PM -0600, Jake Moilanen wrote:
> diff -puN arch/ppc64/mm/hash_utils.c~nx-kernel-ppc64 arch/ppc64/mm/hash_utils.c
> --- linux-2.6-bk/arch/ppc64/mm/hash_utils.c~nx-kernel-ppc64   2005-03-08 16:08:57 -06:00
> +++ linux-2.6-bk-moilanen/arch/ppc64/mm/hash_utils.c  2005-03-08 16:08:57 -06:00
> @@ -89,12 +90,23 @@ static inline void loop_forever(void)
>   ;
>  }
>  
> +int is_kernel_text(unsigned long addr)
> +{
> + if (addr >= (unsigned long)_stext && addr < (unsigned long)__init_end)
> + return 1;
> +
> + return 0;
> +}

This is used in two files, but never declared extern in the second file
(iSeries_setup.c). Should it go in a header file as a static inline
instead?

There also seems to be a local static is_kernel_text() in kallsyms that
overlaps (but it's not identical). Removing that redundancy can be taken
care of as a janitorial patch outside of the noexec stuff.
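
As a rough sketch of the header-file variant: a static inline defined once in a shared header is visible to every file that includes it, so no extern declaration is needed. The version below is a userspace stand-in that takes the bounds as arguments (start and end are placeholders for _stext and __init_end, which only exist as linker symbols inside the kernel), and the function name is made up:

```c
/*
 * Hypothetical header sketch.  In the kernel the bounds would be the
 * linker symbols used in the quoted patch rather than parameters.
 * Returns 1 if addr lies in [start, end), 0 otherwise.
 */
static inline int addr_in_text_range(unsigned long addr,
				     unsigned long start,
				     unsigned long end)
{
	return addr >= start && addr < end;
}
```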



-Olof


Re: [PATCH 1/2] No-exec support for ppc64

2005-03-09 Thread Olof Johansson
On Tue, Mar 08, 2005 at 05:08:26PM -0600, Jake Moilanen wrote:
> No-exec base and user space support for PPC64.  

Hi, a couple of comments below.


-Olof

> @@ -786,6 +786,7 @@ int hash_huge_page(struct mm_struct *mm,
>   pte_t old_pte, new_pte;
>   unsigned long hpteflags, prpn;
>   long slot;
> + int is_exec;
>   int err = 1;
>  
>   spin_lock(>page_table_lock);
> @@ -796,6 +797,10 @@ int hash_huge_page(struct mm_struct *mm,
>   va = (vsid << 28) | (ea & 0x0fff);
>   vpn = va >> HPAGE_SHIFT;
>  
> + is_exec = access & _PAGE_EXEC;
> + if (unlikely(is_exec && !(pte_val(*ptep) & _PAGE_EXEC)))
> + goto out;

You only use is_exec this one time, you can probably skip it and just
add the mask in the if statement.

> @@ -898,6 +908,7 @@ repeat:
>   err = 0;
>  
>   out:
> +
>   spin_unlock(>page_table_lock);

Whitespace change

> diff -puN include/asm-ppc64/pgtable.h~nx-user-ppc64 include/asm-ppc64/pgtable.h
> --- linux-2.6-bk/include/asm-ppc64/pgtable.h~nx-user-ppc64 2005-03-08 16:08:54 -06:00
> +++ linux-2.6-bk-moilanen/include/asm-ppc64/pgtable.h 2005-03-08 16:08:54 -06:00
> @@ -82,14 +82,14 @@
>  #define _PAGE_PRESENT 0x0001 /* software: pte contains a translation */
>  #define _PAGE_USER   0x0002 /* matches one of the PP bits */
>  #define _PAGE_FILE   0x0002 /* (!present only) software: pte holds file offset */
> -#define _PAGE_RW 0x0004 /* software: user write access allowed */
> +#define _PAGE_EXEC   0x0004 /* No execute on POWER4 and newer (we invert) */

Good to see the comment there, I remember we talked about that earlier.
It can be somewhat confusing. :-)

>  #define _PAGE_GUARDED 0x0008
>  #define _PAGE_COHERENT   0x0010 /* M: enforce memory coherence (SMP systems) */
>  #define _PAGE_NO_CACHE   0x0020 /* I: cache inhibit */
>  #define _PAGE_WRITETHRU  0x0040 /* W: cache write-through */
>  #define _PAGE_DIRTY  0x0080 /* C: page changed */
>  #define _PAGE_ACCESSED   0x0100 /* R: page referenced */
> -#define _PAGE_EXEC   0x0200 /* software: i-cache coherence required */
> +#define _PAGE_RW 0x0200 /* software: user write access allowed */
>  #define _PAGE_HASHPTE 0x0400 /* software: pte has an associated HPTE */
>  #define _PAGE_BUSY   0x0800 /* software: PTE & hash are busy */
>  #define _PAGE_SECONDARY 0x8000 /* software: HPTE is in secondary group */
> @@ -100,7 +100,7 @@
>  /* PAGE_MASK gives the right answer below, but only by accident */
>  /* It should be preserving the high 48 bits and then specifically */
>  /* preserving _PAGE_SECONDARY | _PAGE_GROUP_IX */
> -#define _PAGE_CHG_MASK   (PAGE_MASK | _PAGE_ACCESSED | _PAGE_DIRTY | _PAGE_HPTEFLAGS)
> +#define _PAGE_CHG_MASK (_PAGE_GUARDED | _PAGE_COHERENT | _PAGE_NO_CACHE | _PAGE_WRITETHRU | _PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_HPTEFLAGS | PAGE_MASK)

Can you break it into 80 columns with \ ?
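
Something along these lines, as a standalone sketch. The EX_ prefix keeps the names out of the way of real headers; the first six flag values are taken from the quoted hunk, while EX_PAGE_HPTEFLAGS and EX_PAGE_MASK are placeholders, since their real definitions are not shown in the hunk:

```c
/* Flag values as they appear in the quoted hunk. */
#define EX_PAGE_GUARDED		0x0008UL
#define EX_PAGE_COHERENT	0x0010UL
#define EX_PAGE_NO_CACHE	0x0020UL
#define EX_PAGE_WRITETHRU	0x0040UL
#define EX_PAGE_DIRTY		0x0080UL
#define EX_PAGE_ACCESSED	0x0100UL

/* Placeholders: assumed for illustration, not the kernel's definitions. */
#define EX_PAGE_HPTEFLAGS	(0x0400UL | 0x0800UL | 0x8000UL)
#define EX_PAGE_MASK		(~0xfffUL)	/* assumes 4K pages */

/* Same expression as the patch, wrapped to stay under 80 columns. */
#define EX_PAGE_CHG_MASK	(EX_PAGE_GUARDED | EX_PAGE_COHERENT | \
				 EX_PAGE_NO_CACHE | EX_PAGE_WRITETHRU | \
				 EX_PAGE_DIRTY | EX_PAGE_ACCESSED | \
				 EX_PAGE_HPTEFLAGS | EX_PAGE_MASK)
```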



[PATCH] Add SuperHyway bus subsystem

2005-03-09 Thread Greg KH
ChangeSet 1.2027.3.1, 2005/03/09 12:14:18-08:00, [EMAIL PROTECTED]

[PATCH] Add SuperHyway bus subsystem

Signed-off-by: Paul Mundt <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>


 drivers/sh/Makefile  |6 
 drivers/sh/superhyway/Makefile   |7 +
 drivers/sh/superhyway/superhyway-sysfs.c |   45 ++
 drivers/sh/superhyway/superhyway.c   |  201 +++
 include/linux/superhyway.h   |   79 
 5 files changed, 338 insertions(+)


diff -Nru a/drivers/sh/Makefile b/drivers/sh/Makefile
--- /dev/null   Wed Dec 31 16:00:00 196900
+++ b/drivers/sh/Makefile   2005-03-09 16:36:31 -08:00
@@ -0,0 +1,6 @@
+#
+# Makefile for the SuperH specific drivers.
+#
+
+obj-$(CONFIG_SUPERHYWAY) += superhyway/
+
diff -Nru a/drivers/sh/superhyway/Makefile b/drivers/sh/superhyway/Makefile
--- /dev/null   Wed Dec 31 16:00:00 196900
+++ b/drivers/sh/superhyway/Makefile    2005-03-09 16:36:31 -08:00
@@ -0,0 +1,7 @@
+#
+# Makefile for the SuperHyway bus drivers.
+#
+
+obj-$(CONFIG_SUPERHYWAY)   += superhyway.o
+obj-$(CONFIG_SYSFS)+= superhyway-sysfs.o
+
diff -Nru a/drivers/sh/superhyway/superhyway-sysfs.c b/drivers/sh/superhyway/superhyway-sysfs.c
--- /dev/null   Wed Dec 31 16:00:00 196900
+++ b/drivers/sh/superhyway/superhyway-sysfs.c  2005-03-09 16:36:31 -08:00
@@ -0,0 +1,45 @@
+/*
+ * drivers/sh/superhyway/superhyway-sysfs.c
+ *
+ * SuperHyway Bus sysfs interface
+ *
+ * Copyright (C) 2004, 2005  Paul Mundt <[EMAIL PROTECTED]>
+ *
+ * This file is subject to the terms and conditions of the GNU General Public
+ * License.  See the file "COPYING" in the main directory of this archive
+ * for more details.
+ */
+#include 
+#include 
+#include 
+#include 
+
+#define superhyway_ro_attr(name, fmt, field)   \
+static ssize_t name##_show(struct device *dev, char *buf)  \
+{  \
+   struct superhyway_device *s = to_superhyway_device(dev);\
+   return sprintf(buf, fmt, s->field); \
+}
+
+/* VCR flags */
+superhyway_ro_attr(perr_flags, "0x%02x\n", vcr.perr_flags);
+superhyway_ro_attr(merr_flags, "0x%02x\n", vcr.merr_flags);
+superhyway_ro_attr(mod_vers, "0x%04x\n", vcr.mod_vers);
+superhyway_ro_attr(mod_id, "0x%04x\n", vcr.mod_id);
+superhyway_ro_attr(bot_mb, "0x%02x\n", vcr.bot_mb);
+superhyway_ro_attr(top_mb, "0x%02x\n", vcr.top_mb);
+
+/* Misc */
+superhyway_ro_attr(resource, "0x%08lx\n", resource.start);
+
+struct device_attribute superhyway_dev_attrs[] = {
+   __ATTR_RO(perr_flags),
+   __ATTR_RO(merr_flags),
+   __ATTR_RO(mod_vers),
+   __ATTR_RO(mod_id),
+   __ATTR_RO(bot_mb),
+   __ATTR_RO(top_mb),
+   __ATTR_RO(resource),
+   __ATTR_NULL,
+};
+
diff -Nru a/drivers/sh/superhyway/superhyway.c b/drivers/sh/superhyway/superhyway.c
--- /dev/null   Wed Dec 31 16:00:00 196900
+++ b/drivers/sh/superhyway/superhyway.c        2005-03-09 16:36:31 -08:00
@@ -0,0 +1,201 @@
+/*
+ * drivers/sh/superhyway/superhyway.c
+ *
+ * SuperHyway Bus Driver
+ *
+ * Copyright (C) 2004, 2005  Paul Mundt <[EMAIL PROTECTED]>
+ *
+ * This file is subject to the terms and conditions of the GNU General Public
+ * License.  See the file "COPYING" in the main directory of this archive
+ * for more details.
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static int superhyway_devices;
+
+static struct device superhyway_bus_device = {
+   .bus_id = "superhyway",
+};
+
+static void superhyway_device_release(struct device *dev)
+{
+   kfree(to_superhyway_device(dev));
+}
+
+/**
+ * superhyway_add_device - Add a SuperHyway module
+ * @mod_id: Module ID (taken from MODULE.VCR.MOD_ID).
+ * @base: Physical address where module is mapped.
+ * @vcr: VCR value.
+ *
+ * This is responsible for adding a new SuperHyway module. This sets up a new
+ * struct superhyway_device for the module being added. Each one of @mod_id,
+ * @base, and @vcr are registered with the new device for further use
+ * elsewhere.
+ *
+ * Devices are initially added in the order that they are scanned (from the
+ * top-down of the memory map), and are assigned an ID based on the order that
+ * they are added. Any manual addition of a module will thus get the ID after
+ * the devices already discovered regardless of where it resides in memory.
+ *
+ * Further work can and should be done in superhyway_scan_bus(), to be sure
+ * that any new modules are properly discovered and subsequently registered.
+ */
+int superhyway_add_device(unsigned int mod_id, unsigned long base,
+ unsigned long long vcr)
+{
+   struct superhyway_device *dev;
+
+   dev = kmalloc(sizeof(struct superhyway_device), GFP_KERNEL);
+   if (!dev)
+   return -ENOMEM;
+
+   memset(dev, 0, sizeof(struct superhyway_device));
+
+   

[PATCH] class: add a semaphore to struct class, and use that instead of the subsystem rwsem.

2005-03-09 Thread Greg KH
ChangeSet 1.2055, 2005/03/09 15:41:29-08:00, [EMAIL PROTECTED]

[PATCH] class: add a semaphore to struct class, and use that instead of the subsystem rwsem.

This moves us away from using the rwsem, although recursive adds and removes of
class devices is not yet possible (nor is it really known if it even is needed.)
So this simple change is done instead.

Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>


 drivers/base/class.c   |   23 +++
 include/linux/device.h |2 +-
 2 files changed, 12 insertions(+), 13 deletions(-)


diff -Nru a/drivers/base/class.c b/drivers/base/class.c
--- a/drivers/base/class.c  2005-03-09 16:28:04 -08:00
+++ b/drivers/base/class.c  2005-03-09 16:28:04 -08:00
@@ -140,6 +140,7 @@
 
INIT_LIST_HEAD(>children);
INIT_LIST_HEAD(>interfaces);
+   init_MUTEX(>sem);
error = kobject_set_name(>subsys.kset.kobj, "%s", cls->name);
if (error)
return error;
@@ -413,12 +414,12 @@
 
/* now take care of our own registration */
if (parent) {
-   down_write(>subsys.rwsem);
+   down(>sem);
list_add_tail(_dev->node, >children);
list_for_each_entry(class_intf, >interfaces, node)
if (class_intf->add)
class_intf->add(class_dev);
-   up_write(>subsys.rwsem);
+   up(>sem);
}
 
if (MAJOR(class_dev->devt))
@@ -448,12 +449,12 @@
struct class_interface * class_intf;
 
if (parent) {
-   down_write(>subsys.rwsem);
+   down(>sem);
list_del_init(_dev->node);
list_for_each_entry(class_intf, >interfaces, node)
if (class_intf->remove)
class_intf->remove(class_dev);
-   up_write(>subsys.rwsem);
+   up(>sem);
}
 
if (class_dev->dev)
@@ -509,8 +510,8 @@
 
 int class_interface_register(struct class_interface *class_intf)
 {
-   struct class * parent;
-   struct class_device * class_dev;
+   struct class *parent;
+   struct class_device *class_dev;
 
if (!class_intf || !class_intf->class)
return -ENODEV;
@@ -519,14 +520,13 @@
if (!parent)
return -EINVAL;
 
-   down_write(>subsys.rwsem);
+   down(>sem);
list_add_tail(_intf->node, >interfaces);
-
if (class_intf->add) {
list_for_each_entry(class_dev, >children, node)
class_intf->add(class_dev);
}
-   up_write(>subsys.rwsem);
+   up(>sem);
 
return 0;
 }
@@ -539,14 +539,13 @@
if (!parent)
return;
 
-   down_write(>subsys.rwsem);
+   down(>sem);
list_del_init(_intf->node);
-
if (class_intf->remove) {
list_for_each_entry(class_dev, >children, node)
class_intf->remove(class_dev);
}
-   up_write(>subsys.rwsem);
+   up(>sem);
 
class_put(parent);
 }
diff -Nru a/include/linux/device.h b/include/linux/device.h
--- a/include/linux/device.h    2005-03-09 16:28:04 -08:00
+++ b/include/linux/device.h    2005-03-09 16:28:04 -08:00
@@ -15,7 +15,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -148,6 +147,7 @@
        struct subsystem        subsys;
        struct list_head        children;
        struct list_head        interfaces;
+       struct semaphore        sem;    /* locks both the children and interfaces lists */
 
struct class_attribute  * class_attrs;
struct class_device_attribute   * class_dev_attrs;



[PATCH] aoe: fail IO on disk errors

2005-03-09 Thread Greg KH
ChangeSet 1.2038, 2005/03/09 10:21:33-08:00, [EMAIL PROTECTED]

[PATCH] aoe: fail IO on disk errors

This patch makes disk errors fail the IO instead of getting logged and
ignored.


Fail IO on disk errors

Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>


 drivers/block/aoe/aoecmd.c |8 +---
 1 files changed, 5 insertions(+), 3 deletions(-)


diff -Nru a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c
--- a/drivers/block/aoe/aoecmd.c        2005-03-09 16:15:53 -08:00
+++ b/drivers/block/aoe/aoecmd.c        2005-03-09 16:15:53 -08:00
@@ -416,7 +416,9 @@
 
if (ahin->cmdstat & 0xa9) { /* these bits cleared on success */
printk(KERN_CRIT "aoe: aoecmd_ata_rsp: ata error cmd=%2.2Xh "
-   "stat=%2.2Xh\n", ahout->cmdstat, ahin->cmdstat);
+   "stat=%2.2Xh from e%ld.%ld\n", 
+   ahout->cmdstat, ahin->cmdstat,
+   d->aoemajor, d->aoeminor);
if (buf)
buf->flags |= BUFFL_FAIL;
} else {
@@ -458,8 +460,8 @@
if (buf) {
buf->nframesout -= 1;
if (buf->nframesout == 0 && buf->resid == 0) {
-   n = !(buf->flags & BUFFL_FAIL);
-   bio_endio(buf->bio, buf->bio->bi_size, 0);
+   n = (buf->flags & BUFFL_FAIL) ? -EIO : 0;
+   bio_endio(buf->bio, buf->bio->bi_size, n);
mempool_free(buf, d->bufpool);
}
}
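
The second hunk boils down to mapping the buffer's failure flag onto the status passed to bio_endio(): -EIO if any frame failed, 0 otherwise, where previously 0 was passed unconditionally. A standalone sketch of that mapping; the flag value and the function name are placeholders, since the real BUFFL_FAIL definition is not part of this diff:

```c
#include <errno.h>

#define EX_BUFFL_FAIL 1UL	/* placeholder bit, real value not shown */

/*
 * Translate a buffer's flag word into a completion status:
 * -EIO when the failure bit is set, 0 on success.
 */
static inline int buf_completion_status(unsigned long flags)
{
	return (flags & EX_BUFFL_FAIL) ? -EIO : 0;
}
```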



[PATCH] block core: export MAJOR/MINOR to the hotplug env

2005-03-09 Thread Greg KH
ChangeSet 1.2040, 2005/03/09 09:32:58-08:00, [EMAIL PROTECTED]

[PATCH] block core: export MAJOR/MINOR to the hotplug env

Signed-off-by: Kay Sievers <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>


 drivers/block/genhd.c |   53 --
 1 files changed, 34 insertions(+), 19 deletions(-)


diff -Nru a/drivers/block/genhd.c b/drivers/block/genhd.c
--- a/drivers/block/genhd.c 2005-03-09 16:29:48 -08:00
+++ b/drivers/block/genhd.c 2005-03-09 16:29:48 -08:00
@@ -430,42 +430,57 @@
 static int block_hotplug(struct kset *kset, struct kobject *kobj, char **envp,
 int num_envp, char *buffer, int buffer_size)
 {
-   struct device *dev = NULL;
struct kobj_type *ktype = get_ktype(kobj);
+   struct device *physdev;
+   struct gendisk *disk;
+   struct hd_struct *part;
int length = 0;
int i = 0;
 
-   /* get physical device backing disk or partition */
if (ktype == _block) {
-   struct gendisk *disk = container_of(kobj, struct gendisk, kobj);
-   dev = disk->driverfs_dev;
+   disk = container_of(kobj, struct gendisk, kobj);
+   add_hotplug_env_var(envp, num_envp, , buffer, buffer_size,
+   , "MINOR=%u", disk->first_minor);
} else if (ktype == _part) {
-   struct gendisk *disk = container_of(kobj->parent, struct gendisk, kobj);
-   dev = disk->driverfs_dev;
-   }
-
-   if (dev) {
-   /* add physical device, backing this device  */
-   char *path = kobject_get_path(>kobj, GFP_KERNEL);
+   disk = container_of(kobj->parent, struct gendisk, kobj);
+   part = container_of(kobj, struct hd_struct, kobj);
+   add_hotplug_env_var(envp, num_envp, , buffer, buffer_size,
+   , "MINOR=%u",
+   disk->first_minor + part->partno);
+   } else
+   return 0;
+
+   add_hotplug_env_var(envp, num_envp, , buffer, buffer_size, ,
+   "MAJOR=%u", disk->major);
+
+   /* add physical device, backing this device  */
+   physdev = disk->driverfs_dev;
+   if (physdev) {
+   char *path = kobject_get_path(>kobj, GFP_KERNEL);
 
add_hotplug_env_var(envp, num_envp, , buffer, buffer_size,
, "PHYSDEVPATH=%s", path);
kfree(path);
 
-   /* add bus name of physical device */
-   if (dev->bus)
+   if (physdev->bus)
add_hotplug_env_var(envp, num_envp, &i,
buffer, buffer_size, &length,
-   "PHYSDEVBUS=%s", dev->bus->name);
+   "PHYSDEVBUS=%s",
+   physdev->bus->name);
 
-   /* add driver name of physical device */
-   if (dev->driver)
+   if (physdev->driver)
add_hotplug_env_var(envp, num_envp, &i,
buffer, buffer_size, &length,
-   "PHYSDEVDRIVER=%s", dev->driver->name);
-
-   envp[i] = NULL;
+   "PHYSDEVDRIVER=%s",
+   physdev->driver->name);
}
+
+   /* terminate, set to next free slot, shrink available space */
+   envp[i] = NULL;
+   envp = &envp[i];
+   num_envp -= i;
+   buffer = &buffer[length];
+   buffer_size -= length;
 
return 0;
 }
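
For readers following the calling convention used in the hunk above, here is a minimal userspace sketch of how such a helper can append "KEY=value" strings to a shared buffer while advancing an index and a length cursor. The function name add_env_var and its error handling are assumptions made for illustration; this is not the kernel's add_hotplug_env_var implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical userspace stand-in for the helper called in the patch.
 * Formats one "KEY=value" string into buffer at offset *cur_len, stores a
 * pointer to it in envp[*cur_index], and advances both cursors.
 * Returns 0 on success, -ENOMEM if the pointer array or buffer is full.
 */
int add_env_var(char **envp, int num_envp, int *cur_index,
                char *buffer, int buffer_size, int *cur_len,
                const char *format, ...)
{
    va_list args;
    int space, n;

    /* keep one slot free for the terminating NULL pointer */
    if (*cur_index >= num_envp - 1)
        return -ENOMEM;

    space = buffer_size - *cur_len;
    if (space <= 0)
        return -ENOMEM;

    va_start(args, format);
    n = vsnprintf(buffer + *cur_len, (size_t)space, format, args);
    va_end(args);

    /* vsnprintf returns the untruncated length; treat truncation as failure */
    if (n < 0 || n >= space)
        return -ENOMEM;

    envp[(*cur_index)++] = buffer + *cur_len;
    *cur_len += n + 1;      /* step past the trailing NUL */
    return 0;
}
```

A caller would start with i = 0 and length = 0, add MAJOR and MINOR, and then set envp[i] = NULL, much as the last lines of the hunk do before shrinking the remaining space.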

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[BK PATCH] Add SuperHyway bus support for 2.6.11

2005-03-09 Thread Greg KH
Hi,

Here is one changeset that adds SuperHyway bus support to the 2.6.11
kernel.  It has been in the -mm releases for a while.

Please pull from:  bk://kernel.bkbits.net/gregkh/linux/2.6.11/sh

Individual patches will follow, sent to the linux-kernel list.

thanks,

greg k-h

 drivers/sh/Makefile  |6 
 drivers/sh/superhyway/Makefile   |7 +
 drivers/sh/superhyway/superhyway-sysfs.c |   45 ++
 drivers/sh/superhyway/superhyway.c   |  201 +++
 include/linux/superhyway.h   |   79 
 5 files changed, 338 insertions(+)
-


Paul Mundt:
  o Add SuperHyway bus subsystem



[PATCH] debugfs: make built in types add a \n to their output

2005-03-09 Thread Greg KH
ChangeSet 1.2033, 2005/03/09 15:24:07-08:00, [EMAIL PROTECTED]

[PATCH] debugfs: make built in types add a \n to their output

Thanks to Alessandro Rubini <[EMAIL PROTECTED]> for pointing this out.

Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>


 fs/debugfs/file.c |2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)


diff -Nru a/fs/debugfs/file.c b/fs/debugfs/file.c
--- a/fs/debugfs/file.c 2005-03-09 16:23:09 -08:00
+++ b/fs/debugfs/file.c 2005-03-09 16:23:09 -08:00
@@ -52,7 +52,7 @@
char buf[32];                                                   \
type *val = file->private_data;                                 \
                                                                \
-   snprintf(buf, sizeof(buf), format, *val);                   \
+   snprintf(buf, sizeof(buf), format "\n", *val);              \
return simple_read_from_buffer(user_buf, count, ppos, buf, strlen(buf));\
 }                                                              \
 static ssize_t write_file_##type(struct file *file, const char __user *user_buf,\

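The one-line change above relies on adjacent string literals being concatenated at compile time, so format "\n" only works when the macro argument is itself a literal. A small userspace sketch of the same idea follows; the macro name DEFINE_FORMATTER and the generated function names are hypothetical and are not part of debugfs.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical macro, loosely modelled on the one touched by the patch.
 * It generates format_<name>(), which renders *val using `format` and
 * appends a newline.  `format "\n"` is two adjacent string literals, which
 * the compiler joins into one; passing a char pointer would not compile.
 */
#define DEFINE_FORMATTER(name, type, format)                        \
int format_##name(char *buf, size_t size, const type *val)          \
{                                                                   \
    return snprintf(buf, size, format "\n", *val);                  \
}

DEFINE_FORMATTER(u32, unsigned int, "%u")
DEFINE_FORMATTER(u64, unsigned long long, "%llu")
```

With the newline in place, reading such a value from a shell leaves the prompt on its own line instead of glued to the number.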


[PATCH] floppy.c: pass physical device to device registration

2005-03-09 Thread Greg KH
ChangeSet 1.2047, 2005/03/09 09:53:08-08:00, [EMAIL PROTECTED]

[PATCH] floppy.c: pass physical device to device registration

With this patch the floppy driver creates the usual symlink in sysfs to
the physical device backing the block device:

  $tree /sys/block/
  /sys/block/
  |-- fd0
  |   |-- dev
  |   |-- device -> ../../devices/platform/floppy0
  ...

Signed-off-by: Kay Sievers <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>


 drivers/block/floppy.c |   19 ++-
 1 files changed, 6 insertions(+), 13 deletions(-)


diff -Nru a/drivers/block/floppy.c b/drivers/block/floppy.c
--- a/drivers/block/floppy.c2005-03-09 16:28:59 -08:00
+++ b/drivers/block/floppy.c2005-03-09 16:28:59 -08:00
@@ -4370,6 +4370,10 @@
goto out_flush_work;
}
 
+   err = platform_device_register(&floppy_device);
+   if (err)
+   goto out_flush_work;
+
for (drive = 0; drive < N_DRIVE; drive++) {
if (!(allowed_drive_mask & (1 << drive)))
continue;
@@ -4379,23 +4383,12 @@
disks[drive]->private_data = (void *)(long)drive;
disks[drive]->queue = floppy_queue;
disks[drive]->flags |= GENHD_FL_REMOVABLE;
+   disks[drive]->driverfs_dev = &floppy_device.dev;
add_disk(disks[drive]);
}
 
-   err = platform_device_register(&floppy_device);
-   if (err)
-   goto out_del_disk;
-
return 0;
 
-out_del_disk:
-   for (drive = 0; drive < N_DRIVE; drive++) {
-   if (!(allowed_drive_mask & (1 << drive)))
-   continue;
-   if (fdc_state[FDC(drive)].version == FDC_NONE)
-   continue;
-   del_gendisk(disks[drive]);
-   }
 out_flush_work:
flush_scheduled_work();
if (usage_count)
@@ -4600,7 +4593,6 @@
int drive;
 
init_completion(&device_release);
-   platform_device_unregister(&floppy_device);
blk_unregister_region(MKDEV(FLOPPY_MAJOR, 0), 256);
unregister_blkdev(FLOPPY_MAJOR, "fd");
 
@@ -4614,6 +4606,7 @@
}
put_disk(disks[drive]);
}
+   platform_device_unregister(&floppy_device);
devfs_remove("floppy");
 
del_timer_sync(&fd_timeout);

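The reordering in this diff follows a common rule: register the parent before any child that points at it, and tear down in the reverse order. The sketch below models only that ordering in plain C. All names in it (register_parent, add_child, demo_init and so on) are hypothetical stand-ins that record events in a string; it is not floppy driver code.

```c
#include <assert.h>
#include <string.h>

/* Event log: each step appends one letter so the order can be checked. */
static char event_log[16];

static void note(char c)
{
    size_t n = strlen(event_log);

    if (n + 1 < sizeof(event_log)) {
        event_log[n] = c;
        event_log[n + 1] = '\0';
    }
}

/* Hypothetical stand-ins for the platform device and the disks. */
static int register_parent(int fail)
{
    if (fail)
        return -1;
    note('P');
    return 0;
}
static void unregister_parent(void) { note('p'); }
static void add_child(void)         { note('C'); }
static void del_child(void)         { note('c'); }

/* Parent first: if it fails there are no children to unwind. */
int demo_init(int fail_parent)
{
    int err = register_parent(fail_parent);

    if (err)
        goto out;
    add_child();        /* the child can now reference a live parent */
    return 0;
out:
    return err;
}

/* Teardown mirrors init: children go before the parent they refer to. */
void demo_exit(void)
{
    del_child();
    unregister_parent();
}
```

Because the parent is registered first, the failure path needs no loop over already-added children, which matches the removal of the out_del_disk label in the diff.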


Re: Linux 2.6.11.2

2005-03-09 Thread Matt Mackall
On Thu, Mar 10, 2005 at 02:46:29AM +0100, Bodo Eggert wrote:
> Bill Davidsen <[EMAIL PROTECTED]> wrote:
> 
> > I think you need both x.y.z=>x.y.z.N and x.y.z.N-1=>x.y.z.N patches. My
> > systems which are following the -stable will just need the most recent,
> > but doing x.y.z-1=>x.y.z.N gets really ugly for higher values of N.
> 
> bzcat ../patch-2.6.nn.[0-9].*|patch -p1

You left out the steps where you fetch them and verify their signatures.

-- 
Mathematics is the supreme nostalgia of our time.


Re: [BUG] 2.6.11- sym53c8xx Broken on pp64

2005-03-09 Thread Omkhar Arasaratnam
Benjamin Herrenschmidt wrote:
> Are you sure it's plain 2.6.11 and not some bk clone of after 2.6.11 was
> released ?
 

Ben - I am in the process of downloading a clean tarball from kernel.org 
to be 100% certain.



[PATCH] tpm-build-fix

2005-03-09 Thread Greg KH
ChangeSet 1.2039, 2005/03/09 10:13:34-08:00, [EMAIL PROTECTED]

[PATCH] tpm-build-fix

drivers/char/tpm/tpm.c: In function `show_pcrs':
drivers/char/tpm/tpm.c:228: warning: passing arg 1 of `tpm_transmit' from incompatible pointer type
drivers/char/tpm/tpm.c:238: warning: passing arg 1 of `tpm_transmit' from incompatible pointer type


Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>


 drivers/char/tpm/tpm.c |2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)


diff -Nru a/drivers/char/tpm/tpm.c b/drivers/char/tpm/tpm.c
--- a/drivers/char/tpm/tpm.c2005-03-09 16:39:58 -08:00
+++ b/drivers/char/tpm/tpm.c2005-03-09 16:39:58 -08:00
@@ -219,7 +219,7 @@
int i, j, index, num_pcrs;
char *str = buf;
 
-   struct tpm_chp *chip =
+   struct tpm_chip *chip =
pci_get_drvdata(container_of(dev, struct pci_dev, dev));
if (chip == NULL)
return -ENODEV;



[BK PATCH] Add TPM driver support for 2.6.11

2005-03-09 Thread Greg KH
Hi,

Here are a few changesets that add support for TPM drivers.  These
patches have all been in the -mm releases for a while now.

Please pull from:  bk://kernel.bkbits.net/gregkh/linux/2.6.11/tpm

Individual patches will follow, sent to the linux-kernel list.

thanks,

greg k-h

 drivers/char/Kconfig |2 
 drivers/char/Makefile|2 
 drivers/char/tpm/Kconfig |   39 ++
 drivers/char/tpm/Makefile|7 
 drivers/char/tpm/tpm.c   |  715 ++-
 drivers/char/tpm/tpm.h   |   92 +
 drivers/char/tpm/tpm_atmel.c |  218 +
 drivers/char/tpm/tpm_nsc.c   |  375 ++
 include/linux/pci_ids.h  |1 
 9 files changed, 1439 insertions(+), 12 deletions(-)
-


:
  o tpm: fix cause of SMP stack traces
  o Add TPM hardware enablement driver

Andrew Morton:
  o tpm-build-fix
  o tpm_atmel build fix
  o tpm_msc-build-fix



[PATCH] Reduce cacheline bouncing in cpu_idle_wait

2005-03-09 Thread Zwane Mwaikambo
Andi noted that during normal runtime cpu_idle_map is bounced around a
lot, occasionally at a higher frequency than the timer interrupt wakeup
from which we normally exit pm_idle. So switch to a percpu variable.
Andi, I didn't move things to the slow path because that would involve
adding scheduler code to wake up the idle thread on the cpus we're
waiting for.

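As a rough userspace model of the handshake described above, the sketch below gives each CPU its own flag, padded to a cache line, so that a CPU acknowledging the request writes only to its own line. NR_CPUS, the 64-byte line size and every function name here are assumptions for illustration; the kernel patch uses DEFINE_PER_CPU and explicit barriers rather than C11 atomics.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS   4     /* assumption: small fixed CPU count */
#define CACHELINE 64    /* assumption: 64-byte cache lines */

/* One flag per CPU, aligned so each flag occupies its own cache line. */
struct idle_flag {
    _Alignas(CACHELINE) atomic_uint state;
};

static struct idle_flag idle_state[NR_CPUS];

/* Waiter side: ask every CPU except the caller to pass through idle. */
void request_idle_pass(unsigned int this_cpu)
{
    for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
        atomic_store(&idle_state[cpu].state, cpu != this_cpu ? 1u : 0u);
}

/* Idle-loop side: test before writing so the common case stays read-only. */
void idle_loop_iteration(unsigned int cpu)
{
    if (atomic_load(&idle_state[cpu].state))
        atomic_store(&idle_state[cpu].state, 0u);
}

/* Waiter side: true once every flagged CPU has cleared its own flag. */
bool all_cpus_passed_idle(void)
{
    for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
        if (atomic_load(&idle_state[cpu].state))
            return false;
    return true;
}
```

The test-then-clear in idle_loop_iteration mirrors the patch: in the steady state each CPU only reads its own flag, so no line has to move between caches until a waiter actually sets it.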
 arch/i386/kernel/process.c   |   28 
 arch/ia64/kernel/process.c   |   39 +--
 arch/x86_64/kernel/process.c |   38 +-
 3 files changed, 70 insertions(+), 35 deletions(-)

Signed-off-by: Zwane Mwaikambo <[EMAIL PROTECTED]>

Index: linux-2.6.11-mm2/arch/i386/kernel/process.c
===
RCS file: /home/cvsroot/linux-2.6.11-mm2/arch/i386/kernel/process.c,v
retrieving revision 1.1.1.1
diff -u -p -B -r1.1.1.1 process.c
--- linux-2.6.11-mm2/arch/i386/kernel/process.c 8 Mar 2005 13:53:27 -   1.1.1.1
+++ linux-2.6.11-mm2/arch/i386/kernel/process.c 10 Mar 2005 02:02:33 -
@@ -78,7 +78,7 @@ unsigned long thread_saved_pc(struct tas
  * Powermanagement idle function, if any..
  */
 void (*pm_idle)(void);
-static cpumask_t cpu_idle_map;
+static DEFINE_PER_CPU(unsigned int, cpu_idle_state);
 
 void disable_hlt(void)
 {
@@ -185,9 +185,10 @@ void cpu_idle (void)
while (1) {
while (!need_resched()) {
void (*idle)(void);
-
-   if (cpu_isset(cpu, cpu_idle_map))
-   cpu_clear(cpu, cpu_idle_map);
+   
+   if (__get_cpu_var(cpu_idle_state))
+   __get_cpu_var(cpu_idle_state) = 0;
+   
rmb();
idle = pm_idle;
 
@@ -206,16 +207,27 @@ void cpu_idle (void)
 
 void cpu_idle_wait(void)
 {
-   int cpu;
+   unsigned int cpu, this_cpu = get_cpu();
cpumask_t map;
 
-   for_each_online_cpu(cpu)
-   cpu_set(cpu, cpu_idle_map);
+   set_cpus_allowed(current, cpumask_of_cpu(this_cpu));
+   put_cpu();
+
+   for_each_online_cpu(cpu) {
+   per_cpu(cpu_idle_state, cpu) = 1;
+   cpu_set(cpu, map);
+   }
+
+   __get_cpu_var(cpu_idle_state) = 0;
 
wmb();
do {
ssleep(1);
-   cpus_and(map, cpu_idle_map, cpu_online_map);
+   for_each_online_cpu(cpu) {
+   if (cpu_isset(cpu, map) && !per_cpu(cpu_idle_state, cpu))
+   cpu_clear(cpu, map);
+   }
+   cpus_and(map, map, cpu_online_map);
} while (!cpus_empty(map));
 }
 EXPORT_SYMBOL_GPL(cpu_idle_wait);
Index: linux-2.6.11-mm2/arch/ia64/kernel/process.c
===
RCS file: /home/cvsroot/linux-2.6.11-mm2/arch/ia64/kernel/process.c,v
retrieving revision 1.1.1.1
diff -u -p -B -r1.1.1.1 process.c
--- linux-2.6.11-mm2/arch/ia64/kernel/process.c 8 Mar 2005 13:53:34 -   1.1.1.1
+++ linux-2.6.11-mm2/arch/ia64/kernel/process.c 10 Mar 2005 01:22:49 -
@@ -49,7 +49,7 @@
 #include "sigframe.h"
 
 void (*ia64_mark_idle)(int);
-static cpumask_t cpu_idle_map;
+static DEFINE_PER_CPU(unsigned int, cpu_idle_state);
 
 unsigned long boot_option_idle_override = 0;
 EXPORT_SYMBOL(boot_option_idle_override);
@@ -229,20 +229,30 @@ static inline void play_dead(void)
 }
 #endif /* CONFIG_HOTPLUG_CPU */
 
-
 void cpu_idle_wait(void)
 {
-int cpu;
-cpumask_t map;
+   unsigned int cpu, this_cpu = get_cpu();
+   cpumask_t map;
+
+   set_cpus_allowed(current, cpumask_of_cpu(this_cpu));
+   put_cpu();
 
-for_each_online_cpu(cpu)
-cpu_set(cpu, cpu_idle_map);
+   for_each_online_cpu(cpu) {
+   per_cpu(cpu_idle_state, cpu) = 1;
+   cpu_set(cpu, map);
+   }
 
-wmb();
-do {
-ssleep(1);
-cpus_and(map, cpu_idle_map, cpu_online_map);
-} while (!cpus_empty(map));
+   __get_cpu_var(cpu_idle_state) = 0;
+
+   wmb();
+   do {
+   ssleep(1);
+   for_each_online_cpu(cpu) {
+   if (cpu_isset(cpu, map) && !per_cpu(cpu_idle_state, cpu))
+   cpu_clear(cpu, map);
+   }
+   cpus_and(map, map, cpu_online_map);
+   } while (!cpus_empty(map));
 }
 EXPORT_SYMBOL_GPL(cpu_idle_wait);
 
@@ -261,12 +271,13 @@ cpu_idle (void)
while (!need_resched()) {
void (*idle)(void);
 
+   if (__get_cpu_var(cpu_idle_state))
+   __get_cpu_var(cpu_idle_state) = 0;
+   
+   rmb();
if (mark_idle)
(*mark_idle)(1);
 
-   

[PATCH] Add TPM hardware enablement driver

2005-03-09 Thread Greg KH
ChangeSet 1.2035, 2005/03/09 10:12:19-08:00, [EMAIL PROTECTED]

[PATCH] Add TPM hardware enablement driver

This patch is a device driver to enable new hardware.  The new hardware is
the TPM chip as described by specifications at
.  The TPM chip will enable you to
use hardware to securely store and protect your keys and personal data.
To use the chip according to the specification, you will need the Trusted
Software Stack (TSS) of which an implementation for Linux is available at:
.


Signed-off-by: Leendert van Doorn <[EMAIL PROTECTED]>
Signed-off-by: Reiner Sailer <[EMAIL PROTECTED]>
Signed-off-by: Dave Safford <[EMAIL PROTECTED]>
Signed-off-by: Kylene Hall <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>


 drivers/char/Kconfig |2 
 drivers/char/Makefile|2 
 drivers/char/tpm/Kconfig |   39 ++
 drivers/char/tpm/Makefile|7 
 drivers/char/tpm/tpm.c   |  697 +++
 drivers/char/tpm/tpm.h   |   92 +
 drivers/char/tpm/tpm_atmel.c |  216 +
 drivers/char/tpm/tpm_nsc.c   |  372 ++
 include/linux/pci_ids.h  |1 
 9 files changed, 1427 insertions(+), 1 deletion(-)


diff -Nru a/drivers/char/Kconfig b/drivers/char/Kconfig
--- a/drivers/char/Kconfig  2005-03-09 16:40:26 -08:00
+++ b/drivers/char/Kconfig  2005-03-09 16:40:26 -08:00
@@ -995,5 +995,7 @@
  The mmtimer device allows direct userspace access to the
  Altix system timer.
 
+source "drivers/char/tpm/Kconfig"
+
 endmenu
 
diff -Nru a/drivers/char/Makefile b/drivers/char/Makefile
--- a/drivers/char/Makefile 2005-03-09 16:40:26 -08:00
+++ b/drivers/char/Makefile 2005-03-09 16:40:26 -08:00
@@ -89,7 +89,7 @@
 obj-$(CONFIG_IPMI_HANDLER) += ipmi/
 
 obj-$(CONFIG_HANGCHECK_TIMER) += hangcheck-timer.o
-
+obj-$(CONFIG_TCG_TPM) += tpm/
 # Files generated that shall be removed upon make clean
 clean-files := consolemap_deftbl.c defkeymap.c qtronixmap.c
 
diff -Nru a/drivers/char/tpm/Kconfig b/drivers/char/tpm/Kconfig
--- /dev/null   Wed Dec 31 16:00:00 196900
+++ b/drivers/char/tpm/Kconfig  2005-03-09 16:40:26 -08:00
@@ -0,0 +1,39 @@
+#
+# TPM device configuration
+#
+
+menu "TPM devices"
+
+config TCG_TPM
+   tristate "TPM Hardware Support"
+   depends on EXPERIMENTAL
+   ---help---
+ If you have a TPM security chip in your system, which
+ implements the Trusted Computing Group's specification,
+ say Yes and it will be accessible from within Linux.  For
+ more information see . 
+ An implementation of the Trusted Software Stack (TSS), the 
+ userspace enablement piece of the specification, can be 
+ obtained at: .  To 
+ compile this driver as a module, choose M here; the module 
+ will be called tpm. If unsure, say N.
+
+config TCG_NSC
+   tristate "National Semiconductor TPM Interface"
+   depends on TCG_TPM
+   ---help---
+ If you have a TPM security chip from National Semiconductor 
+ say Yes and it will be accessible from within Linux.  To 
+ compile this driver as a module, choose M here; the module 
+ will be called tpm_nsc.
+
+config TCG_ATMEL
+   tristate "Atmel TPM Interface"
+   depends on TCG_TPM
+   ---help---
+ If you have a TPM security chip from Atmel say Yes and it 
+ will be accessible from within Linux.  To compile this driver 
+ as a module, choose M here; the module will be called tpm_atmel.
+
+endmenu
+
diff -Nru a/drivers/char/tpm/Makefile b/drivers/char/tpm/Makefile
--- /dev/null   Wed Dec 31 16:00:00 196900
+++ b/drivers/char/tpm/Makefile 2005-03-09 16:40:26 -08:00
@@ -0,0 +1,7 @@
+#
+# Makefile for the kernel tpm device drivers.
+#
+obj-$(CONFIG_TCG_TPM) += tpm.o
+obj-$(CONFIG_TCG_NSC) += tpm_nsc.o
+obj-$(CONFIG_TCG_ATMEL) += tpm_atmel.o
+
diff -Nru a/drivers/char/tpm/tpm.c b/drivers/char/tpm/tpm.c
--- /dev/null   Wed Dec 31 16:00:00 196900
+++ b/drivers/char/tpm/tpm.c2005-03-09 16:40:26 -08:00
@@ -0,0 +1,697 @@
+/*
+ * Copyright (C) 2004 IBM Corporation
+ *
+ * Authors:
+ * Leendert van Doorn <[EMAIL PROTECTED]>
+ * Dave Safford <[EMAIL PROTECTED]>
+ * Reiner Sailer <[EMAIL PROTECTED]>
+ * Kylene Hall <[EMAIL PROTECTED]>
+ *
+ * Maintained by: <[EMAIL PROTECTED]>
+ *
+ * Device driver for TCG/TCPA TPM (trusted platform module).
+ * Specifications at www.trustedcomputinggroup.org  
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation, version 2 of the
+ * License.
+ * 
+ * Note, the TPM chip is not interrupt driven (only polling)
+ * and can have very long timeouts (minutes!). Hence the unusual
+ * calls to 
