date:20070130

Google Earth displays improperly - 2.6.20-rc6

2007-01-30 Thread Jerry Jiang

After updating my kernel to 2.6.20-rc6, I find that I cannot move the map
smoothly in Google Earth (V4).

I have not run any tests on this issue yet; I am just reporting it to the
mailing list.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: No mptable found (Tyan h1000E)

2007-01-30 Thread Len Brown

On Tuesday 30 January 2007 18:16, Jan Kasprzak wrote:
>   Hello,
> 
>   I have a Tyan h1000E (S3970) dual-socket board, with two
> dual-core AMD Athlon 2210 CPUs (4 cores total). The problem is
> that the kernel apparently cannot detect the SMP configuration
> (after boot, /proc/cpuinfo lists only one processor).
> The full dmesg output is available at
> 
> http://www.fi.muni.cz/~kas/tmp/dmesg-h1000E.txt

ACPI: RSDP (v002 ACPIAM) @ 0x000f7700
ACPI: XSDT (v001 A M I  OEMXSDT  0x1605 MSFT 0x0097) @ 0xbfff0100
ACPI: FADT (v003 A M I  OEMFACP  0x1605 MSFT 0x0097) @ 0xbfff0290
ACPI: OEMB (v001 A M I  AMI_OEM  0x1605 MSFT 0x0097) @ 0xbfffe040
ACPI: SRAT (v001 AMDHAMMER   0x0001 AMD  0x0001) @ 0xbfff3420
ACPI: SSDT (v001 A M I  POWERNOW 0x0001 AMD  0x0001) @ 0xbfff34e0
ACPI: DSDT (v001  0 0000 0x INTL 0x02002026) @ 0x

This board appears to have no APIC (MADT) table -- which is what Linux uses to
enumerate processors in ACPI mode.  (It doesn't have MPS either, but in ACPI
mode you wouldn't use it anyway.)

Are there any BIOS SETUP settings for enabling SMP or ACPI features?

I guess that /proc/interrupts shows this system running in PIC mode?

Please open a bugzilla here and attach the output from acpidump.
http://bugzilla.kernel.org/enter_bug.cgi?product=ACPI
Component: Config-Processors

Any chance you can see if Windows finds multiple processors on this board with
this version of the BIOS?

This system may be proof that Linux needs to parse the DSDT to properly
enumerate processors.  Though it is somewhat strange to have an SMP system
without an IOAPIC...

thanks,
-Len

ps. after you get the acpidump for the failing system, check for a BIOS update.

> The most interesting parts of it are probably these (with my comments
> inline marked by "---"):
> 
> [...]
> SRAT: PXM 0 -> APIC 0 -> Node 0
> SRAT: PXM 0 -> APIC 1 -> Node 0
> SRAT: PXM 1 -> APIC 2 -> Node 1
> SRAT: PXM 1 -> APIC 3 -> Node 1
> --- So the kernel can see all four APICs.
> [...]
> Bootmem setup node 0 -bfff
> No mptable found.
> --- the above does not depend on MPS 1.1 or 1.4 settings in the BIOS
> [...]
> Kernel command line: ro root=/dev/md0 console=ttyS0,38400n8
> Initializing CPU#0
> PID hash table entries: 4096 (order: 12, 32768 bytes)
> Console: colour VGA+ 80x25
> Dentry cache hash table entries: 524288 (order: 10, 4194304 bytes)
> Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes)
> Checking aperture...
> CPU 0: aperture @ e800 size 128 MB
> CPU 1: aperture @ e800 size 128 MB
> --- the kernel knows something about CPU1 (presumably the second core of 
> CPU0).
> [...]
> CPU 0/0 -> Node 0
> CPU: Physical Processor ID: 0
> CPU: Processor Core ID: 0
> SMP alternatives: switching to UP code
> Freeing SMP alternatives: 32k freed
> ACPI: Core revision 20060707
> ACPI: setting ELCR to 0200 (from 0e20)
> weird, boot CPU (#0) not listed by the BIOS.
> SMP motherboard not detected.
> Using local APIC timer interrupts.
> result 12469270
> Detected 12.469 MHz APIC timer.
> testing NMI watchdog ... OK.
> SMP disabled
> --- hmm, no SMP configuration detected after all.
> Brought up 1 CPUs
> testing NMI watchdog ... OK.
>   ^^^ here it waits for few seconds before printing "OK."
> 
>   I have tested it with vanilla 2.6.19.2, 2.6.20-rc6, and
> the latest Fedora kernel (2.6.19-1.2895.fc6). I have the latest BIOS
> available for this board, and the BIOS can see all four cores.
> 
>   How can I make all four cores visible by the Linux kernel?
> Thanks,
> 
> -Yenya
> 

Re: [PATCH] Add "is_power_of_2" checking to log2.h.

2007-01-30 Thread Robert P. J. Day

On Wed, 31 Jan 2007, Nick Piggin wrote:

> Robert P. J. Day wrote:
> > On Tue, 30 Jan 2007, Nick Piggin wrote:
> >
> >
> > > Robert P. J. Day wrote:
> > >
> > > >  Add the inline function "is_power_of_2()" to log2.h, where the value
> > > > zero is *not* considered to be a power of two.
> > >
> > > > Signed-off-by: Robert P. J. Day <[EMAIL PROTECTED]>
> > > >
> > > > /*
> > > > + *  Determine whether some value is a power of two, where zero is
> > > > + * *not* considered a power of two.
> > > > + */
> > >
> > > Why the qualifier? Zero *is* not a power of 2, is it?
> >
> >
> > no, but it bears repeating since some developers might think it *is*.
> > if you peruse the current kernel code, you'll find some tests of the
> > simpler form:
> >
> > ((n & (n - 1)) == 0)
> >
> > which is clearly testing for "power of twoness" but which will
> > return true for a value of zero.  that's wrong, and it's why it's
> > emphasized in the comment.
>
> I would have thought you'd comment the broken ones, but that's just
> me.

  good point, so let's just sum up here.  (man, it's hard to believe
that something this simple could drag on so long.  i feel like i'm
discussing free device driver development or something. :-)

  the new is_power_of_2() inline function is defined as:

  (n != 0 && ((n & (n - 1)) == 0))

which (correctly, IMHO) does *not* identify zero as a power of two.
if someone truly wants *that* test, they can write it themselves:

  if (x == 0 || is_power_of_2(x))

  this means that, if someone wants to start rewriting those tests in
the source tree, every time they run across an apparent "power of two"
test of the simpler form:

  (n & (n - 1))

they have to ask themselves, "ok, did this coder mean to include zero
or not?"  in some cases, it's probably not going to be obvious.
(maybe the maintainers could do a quick check themselves and make the
substitution 'cuz, once the kernel janitors get ahold of this, you
never know what hilarity will ensue. :-)
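
To make the difference concrete, here is a minimal user-space sketch of the two tests discussed above. The first function follows the definition given in this message; the second is the simpler open-coded form, and its name (pow2_or_zero) is made up for illustration and does not come from the kernel tree.

```c
#include <stdbool.h>

/* Test as described in this thread: zero is *not* a power of two. */
static inline bool is_power_of_2(unsigned long n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * The simpler open-coded form seen in existing code.  The name is
 * hypothetical.  Note that it also returns true for zero, because
 * 0 & (0 - 1) == 0.
 */
static inline bool pow2_or_zero(unsigned long n)
{
	return (n & (n - 1)) == 0;
}
```

Anyone converting an open-coded test has to decide which of these two behaviours the original author relied on; only the input zero distinguishes them.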

  as far as the patch itself i submitted is concerned, the *only*
place that changed the existing semantics was here:

=
--- a/arch/ppc/syslib/ppc85xx_rio.c
+++ b/arch/ppc/syslib/ppc85xx_rio.c
@@ -59,8 +59,6 @@
 #define DBELL_TID(x)   (*(u8 *)(x + DOORBELL_TID_OFFSET))
 #define DBELL_INF(x)   (*(u16 *)(x + DOORBELL_INFO_OFFSET))

-#define is_power_of_2(x)   (((x) & ((x) - 1)) == 0)
-
 struct rio_atmu_regs {
 	u32 rowtar;
 	u32 pad1;
=

so if the powerpc people are ok with that, then the patch itself
should be fine, and it's only the upcoming substitutions in the source
tree that will have to be checked carefully, one by one.

rday

--

Robert P. J. Day
Linux Consulting, Training and Annoying Kernel Pedantry
Waterloo, Ontario, CANADA

http://www.fsdev.dreamhosters.com/wiki/index.php?title=Main_Page


Re: [PATCH 2/2] PM: fast power off - driver

2007-01-30 Thread Andi Kleen

akuster <[EMAIL PROTECTED]> writes:

> My apologies, I cc'd the wrong list the first time around.
> 
> - Armin
> ---
> 
> Fastpoweroff default profile driver

Why don't you just call the existing reboot(2)? I've had a simple 
C program that does a very fast poweroff (or reboot) forever
and it works just fine. It won't kill all processes,
but if you really want to do that you can use killall -r .* first.
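
A minimal sketch of what such a program might look like, assuming glibc's reboot() wrapper from <sys/reboot.h>. This is not the program mentioned above; the helper name fast_shutdown and its dry_run switch are invented for illustration so that the code can be exercised without actually taking the machine down.

```c
#include <stdio.h>
#include <unistd.h>
#include <sys/reboot.h>

/*
 * Hypothetical helper: power off (or reboot) immediately via reboot(2).
 * With dry_run set it only reports what it would do.  A real call
 * requires CAP_SYS_BOOT, does not terminate running processes, and
 * only returns if it fails.
 */
static int fast_shutdown(int power_off, int dry_run)
{
	int cmd = power_off ? RB_POWER_OFF : RB_AUTOBOOT;

	if (dry_run) {
		printf("would call reboot(%s)\n",
		       power_off ? "RB_POWER_OFF" : "RB_AUTOBOOT");
		return 0;
	}

	sync();			/* flush dirty data; reboot(2) does not sync */
	return reboot(cmd);	/* returns -1 with errno set on failure */
}
```

Calling fast_shutdown(1, 0) as root would power the machine off at once, so the sync() beforehand matters.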

-Andi


Re: [patch] suspend debugging: simulate suspend-to-RAM

2007-01-30 Thread Andi Kleen

Andrey Borzenkov <[EMAIL PROTECTED]> writes:

> Will it work with netconsole too? COM port is often missing in notebooks
> today.

It will work with some luck with firescope (in the worst case you
might need to disable suspend in ohci1394). Many laptops have firewire
ports.

ftp://ftp.firstfloor.org/pub/ak/firescope/

-Andi


[patch] kbuild: correctly skip tilded backups in localversion files

2007-01-30 Thread Oleg Verych

kbuild: correctly skip tilded backups in localversion files

 Tildes in paths as well as in filenames are now handled correctly.

 Definition of `space' was removed, scripts/Kbuild.include has one.
 This definition was taken right from GNU make manual, while Kbuild's
 version is original.

Cc: Roman Zippel <[EMAIL PROTECTED]>
Cc: Bastian Blank <[EMAIL PROTECTED]>
Cc: Sam Ravnborg <[EMAIL PROTECTED]>
Signed-off-by: Oleg Verych <[EMAIL PROTECTED]>
---
Another try.

Original report and fix by Bastian Blank:

 The following patch fixes the problem that localversion files were
 ignored if the tree lives in a path which contains a ~. It changes the
 test to apply to the filename only.
 
 Debian allows versions which contain a ~. The upstream part of the
 version is in the directory name of the build tree and we got weird
 results because the localversion files were simply ignored in this
 case.
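
 The idea of testing only the filename, not the whole path, can be
 sketched in C as follows. This illustrates the rule only and is not
 part of the patch; the function name is_backup_name is made up.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: treat a file as an editor backup only if the
 * last path component contains '~'.  Tildes in directory names are
 * deliberately ignored.
 */
static bool is_backup_name(const char *path)
{
	const char *slash = strrchr(path, '/');
	const char *name = slash ? slash + 1 : path;

	return strchr(name, '~') != NULL;
}
```

 With this rule a tree unpacked under a directory whose name contains a
 tilde still has its localversion files picked up, while a backup such
 as localversion-foo~ is still skipped.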

 Makefile |   15 +++++----------
 1 file changed, 5 insertions(+), 10 deletions(-)

--- linux-2.6.20-rc6/Makefile~4tilde-backups~   2007-01-30 23:33:45.781462750 +0100
+++ linux-2.6.20-rc6/Makefile   2007-01-31 07:46:18.696404500 +0100
@@ -777,5 +777,5 @@ $(vmlinux-dirs): prepare scripts
 #  $(localver-full)
 #$(localver)
-#  localversion*   (all localversion* files)
+#  localversion*   (files, without backups containing '~')
 #  $(CONFIG_LOCALVERSION)  (from kernel config setting)
 #$(localver-auto)  (only if CONFIG_LOCALVERSION_AUTO is set)
@@ -788,14 +788,9 @@ $(vmlinux-dirs): prepare scripts
 # scripts/setlocalversion and add the appropriate checks as needed.
 
-nullstring :=
-space  := $(nullstring) # end of line
-
-___localver = $(objtree)/localversion* $(srctree)/localversion*
-__localver  = $(sort $(wildcard $(___localver)))
-# skip backup files (containing '~')
-_localver = $(foreach f, $(__localver), $(if $(findstring ~, $(f)),,$(f)))
-
+localversion = $(objtree)/localversion $(srctree)/localversion
+ext_versions = $(objtree)/localversion[^~]* $(srctree)/localversion[!~]*
+versions = $(localversion) $(ext_versions)
 localver = $(subst $(space),, \
-  $(shell cat /dev/null $(_localver)) \
+  $(shell cat /dev/null $(sort $(wildcard $(versions)))) \
   $(patsubst "%",%,$(CONFIG_LOCALVERSION)))
 

Re: How many people are using 2.6.16?

2007-01-30 Thread Adrian Bunk

On Tue, Jan 30, 2007 at 06:36:48PM -0800, Linus Torvalds wrote:
> 
> 
> On Tue, 30 Jan 2007, Mark Lord wrote:
> > 
> > I believe our featherless leader said he thought it was an ancient bug,
> > exacerbated by something that went into 2.6.19.
> > 
> > If Linus's opinion is correct (still?), then the bug exists in all
> > kernels since somewhere back in the 2.4.xx days.
> 
> The issue was somewhat confused by people certainly *reporting* it for 
> older kernels. Also, as part of the dirty bit cleanups and sanity 
> checking we did actually seem to fix a long-standing CIFS corruption (and 
> apparently reiserfs/XFS problems too).
> 
> But the *common* case was actually introduced with 2.6.19, and 2.6.16 
> wouldn't be affected. 

Thanks for the clarifications.

Regarding the longstanding CIFS/reiserfs/XFS problems, it seems the 
status is:

CIFS:
commit cb876f451455b6187a7d69de2c112c45ec4b7f99
  Fix up CIFS for "test_clear_page_dirty()" removal
queued for 2.6.19.3
applies and compiles against 2.6.16

reiserfs:
commit de14569f94513279e3d44d9571a421e9da1759ae
  [PATCH] resierfs: avoid tail packing if an inode was ever mmapped
backport to 2.6.16 required

XFS:
fix not yet in your tree

>   Linus

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed


Re: [Cbe-oss-dev] [RFC, PATCH 4/4] Add support to OProfile for profiling Cell BE SPUs -- update

2007-01-30 Thread Arnd Bergmann

On Tuesday 30 January 2007 23:54, Maynard Johnson wrote:

> >>Why do you store them per spu in the first place? The physical spu
> >>doesn't have any relevance to this at all, the only data that is
> >>per spu is the sample data collected on a profiling interrupt,
> >>which you can then copy in the per-context data on a context switch.
> > 
> > The sample data is written out to the event buffer on every profiling 
> > interrupt.  But we don't write out the SPU program counter samples 
> > directly to the event buffer.  First, we have to find the cached_info 
> > for the appropriate SPU context to retrieve the cached vma-to-fileoffset 
> > map.  Then we do the vma_map_lookup to find the fileoffset corresponding 
> > to the SPU PC sample, which we then write out to the event buffer.  This 
> > is one of the most time-critical pieces of the SPU profiling code, so I 
> > used an array to hold the cached_info for fast random access.  But as I 
> > stated in a code comment above, the negative implication of this current 
> > implementation is that the array can only hold the cached_info for 
> > currently running SPU tasks.  I need to give this some more thought.
> 
> I've given this some more thought, and I'm coming to the conclusion that 
> a pure array-based implementation for holding cached_info (getting rid 
> of the lists) would work well for the vast majority of cases in which 
> OProfile will be used.  Yes, it is true that the mapping of an SPU 
> context to a physical spu-numbered array location cannot be guaranteed 
> to stay valid, and that's why I discard the cached_info at that array 
> location when the SPU task is switched out.  Yes, it would be terribly 
> inefficient if the same SPU task gets switched back in later and we 
> would have to recreate the cached_info.  However, I contend that 
> OProfile users are interested in profiling one application at a time. 
> They are not going to want to muddy the waters with multiple SPU apps 
> running at the same time.  I can't think of any reason why someone would 
> consciously choose to do that.
> 
> Any thoughts from the general community, especially OProfile users?
> 
Please assume that in the near future we will be scheduling SPU contexts
in and out multiple times a second. Even in a single application, you
can easily have more contexts than you have physical SPUs.

The event buffer by definition needs to be per context. If you for some
reason want to collect the samples per physical SPU during an event
interrupt, you should at least make sure that they are copied into the
per-context event buffer on a context switch.

At the context switch point, you probably also want to drain the
hw event counters, so that you account all events correctly.

We also want to be able to profile the context switch code itself, which
means that we also need one event buffer associated with the kernel to
collect events for a zero context_id.

Of course, the recording of raw samples in the per-context buffer does
not need to have the dcookies along with it, you can still resolve
the pointers when the SPU context gets destroyed (or an object gets
unmapped).

Arnd <><

Re: Free Linux Driver Development!

2007-01-30 Thread Kumar Gala



On Jan 31, 2007, at 12:26 AM, Greg KH wrote:

> On Tue, Jan 30, 2007 at 06:27:29PM -0800, Michael K. Edwards wrote:
> > On 1/29/07, Greg KH <[EMAIL PROTECTED]> wrote:
> > > Free Linux Driver Development!
> > >
> > > Yes, that's right, the Linux kernel community is offering all companies
> > > free Linux driver development.  ...
> > [snip]
> > > [1] for the CPUs that support the bus types that your device works on.
> >
> > Bravo!  Now, is there a message in the same spirit that can be
> > tailored to embedded space, especially to SoC vendors and (even more
> > importantly) their customers?  Something along the lines of:
> >
> > "We understand that embedded hardware is frequently buggy and that SoC
> > vendors are doing well if their own internal software people can get
> > enough help from the chip guys to bring up enough customer-driven use
> > cases to win a few design-ins.
> >
> > We sympathize with embedded developers who stay up nights with an
> > O-scope and a JTAG emulator reverse-engineering the hardware behavior,
> > trying to figure out why this order of operations works and this
> > other one doesn't.
> >
> > We have the software tools and the competence to quantify the
> > potential gains from current toolchains and kernels, aggressive
> > compilation options, and in-tree power/latency management strategies,
> > so that you can build a business case against "fire and forget" SDKs
> > based on ancient compilers, obsolete kernels, and unmaintained
> > out-of-tree patches.
> >
> > We will help platform integrators bridge the gap between the chip
> > architects' claims about device performance and the condition in which
> > the BSP guys toss drivers over the fence.
> >
> > You can hang onto the hardware and profit from coaching and code
> > review, or you can send us a board and whatever doco you've got, and
> > we'll figure it out.
> >
> > All we ask is that 1) SoC vendors authorize customers to do an NDA
> > with OSDL and pass vendor NDA material along to us; 2) when the
> > product ships, all participants are free to exercise GPL rights with
> > respect to the chip support and driver code produced; and 3) platform
> > integrators cooperate with the rework usually needed as code merges
> > towards Linus's tree."
> >
> > Or is this a pipe dream?
>
> Oh, I would love to see something like that happen :)
>
> As I come from an embedded background, I love to see Linux running in
> tiny systems.  So anything I can do to help out with that I'd love to
> offer.
>
> But being able to read the minds of SOC hardware engineers and decode all
> of the documentation errors they produce is enough to drive one crazy,
> my condolences go out to everyone in that situation...
>
> good luck,
>
> greg k-h


Thanks.  It gets even better when they change things between revisions of the
same HW block.

Out of interest, was this geared to any particular SoCs/architectures?


- k

Re: Free Linux Driver Development!

2007-01-30 Thread Greg KH

On Tue, Jan 30, 2007 at 06:27:29PM -0800, Michael K. Edwards wrote:
> On 1/29/07, Greg KH <[EMAIL PROTECTED]> wrote:
> >Free Linux Driver Development!
> >
> >Yes, that's right, the Linux kernel community is offering all companies
> >free Linux driver development.  ...
> [snip]
> >[1] for the CPUs that support the bus types that your device works on.
> 
> Bravo!  Now, is there a message in the same spirit that can be
> tailored to embedded space, especially to SoC vendors and (even more
> importantly) their customers?  Something along the lines of:
> 
> "We understand that embedded hardware is frequently buggy and that SoC
> vendors are doing well if their own internal software people can get
> enough help from the chip guys to bring up enough customer-driven use
> cases to win a few design-ins.
> 
> We sympathize with embedded developers who stay up nights with an
> O-scope and a JTAG emulator reverse-engineering the hardware behavior,
> trying to figure out why this order of operations works and this
> other one doesn't.
> 
> We have the software tools and the competence to quantify the
> potential gains from current toolchains and kernels, aggressive
> compilation options, and in-tree power/latency management strategies,
> so that you can build a business case against "fire and forget" SDKs
> based on ancient compilers, obsolete kernels, and unmaintained
> out-of-tree patches.
> 
> We will help platform integrators bridge the gap between the chip
> architects' claims about device performance and the condition in which
> the BSP guys toss drivers over the fence.
> 
> You can hang onto the hardware and profit from coaching and code
> review, or you can send us a board and whatever doco you've got, and
> we'll figure it out.
> 
> All we ask is that 1) SoC vendors authorize customers to do an NDA
> with OSDL and pass vendor NDA material along to us; 2) when the
> product ships, all participants are free to exercise GPL rights with
> respect to the chip support and driver code produced; and 3) platform
> integrators cooperate with the rework usually needed as code merges
> towards Linus's tree."
> 
> Or is this a pipe dream?

Oh, I would love to see something like that happen :)

As I come from an embedded background, I love to see Linux running in
tiny systems.  So anything I can do to help out with that I'd love to
offer.

But being able to read the minds of SOC hardware engineers and decode all
of the documentation errors they produce is enough to drive one crazy,
my condolences go out to everyone in that situation...

good luck,

greg k-h

Re: Reducing warning output from pci_get_subsys()

2007-01-30 Thread Greg KH

On Tue, Jan 30, 2007 at 11:18:21PM -0600, Kumar Gala wrote:
> 
> On Jan 30, 2007, at 11:11 PM, Andrew Morton wrote:
> 
> >On Tue, 30 Jan 2007 22:55:38 -0600 (CST)
> >Kumar Gala <[EMAIL PROTECTED]> wrote:
> >
> >>Greg,
> >>
> >>There was some code added to warn if pci_get_subsys() is called  
> >>and the
> >>pci_devices is empty.
> >>
> >>I'm wondering if there is some point at which we know its ok for the
> >>pci_devices list be empty if there are no devices on the bus so we  
> >>can
> >>stop printing the message.
> >>
> >>On an embedded PPC reference system I see this message 6 times  
> >>when I've
> >>got no cards in the PCI slots.
> >>
> >
> >I'd suggest we just remove the warning.  Also the one in  
> >pci_find_subsys().
> >
> >I let them go through because I was curious to know what would  
> >cause it to
> >trigger - it might be an indication of other bugs.  But nothing very
> >interesting happened and they're of no use.
> 
> Works for me.

No objection from me either.

thanks,

greg k-h

Re: [PATCH 0 of 4] Generic AIO by scheduling stacks

2007-01-30 Thread Linus Torvalds

On Wed, 31 Jan 2007, Nick Piggin wrote:
> 
> I always thought that the AIO people didn't do this because they wanted
> to avoid context switch overhead.

I don't think that scheduling overhead was ever really the reason, at 
least not the primary one, and at least not on Linux. Sure, we can 
probably make cooperative thread switching a bit faster than even 
VM-sharing thread switching (maybe), but it's not going to be *that* big 
an issue.

Afaik, the bigger issues were about setup costs (but also purely semantic 
- it was hard to do AIO semantics with threads).

And memory costs. The "one stack page per outstanding AIO" may end up 
still being too expensive, but threads were even more so.

[ Of course, that used to also be the claim by all the people who thought 
  we couldn't do native kernel threads for "normal" threading either, and 
  should go with the n*m setup. Shows how much they knew ;^]

But I've certainly _personally_ always wanted to do AIO with threads. I 
wanted to do it with regular threads (ie the "clone()" kind). It didn't 
fly. But I think we can possibly lower both the setup costs and the memory 
costs with the cooperative approach, to the point where maybe this one is 
more palatable and workable.

And maybe it also solves some of the scalability worries (threads have ID 
space and scheduling setup things that essentially go away by just not 
doing them - which is what the fibrils simply wouldn't have).

(Sadly, some of the people who really _use_ AIO are the database people, 
and they really only care about a particularly stupid and trivial case: 
pure reads and writes. A lot of other loads care about much more complex 
things: filename lookups etc, that traditional AIO cannot do at all, and 
that you really want something more thread-like for. But those other loads 
get kind of swamped by the DB needs, which are much tighter and trivial 
enough that you don't "need" a real thread for them).

Linus

Re: [Cbe-oss-dev] [RFC, PATCH 4/4] Add support to OProfile for profiling Cell BE SPUs -- update

2007-01-30 Thread Arnd Bergmann

On Wednesday 31 January 2007 00:31, Carl Love wrote:
> Unfortunately, the only way we know how to
> figure out what the LFSR value that corresponds to the number in the
> sequence that is N before the last value (0xFF) is to calculate the
> previous value N times.  It is like trying to ask what is the pseudo
> random number that is N before this pseudo random number?

Well, you can at least implement the lfsr both ways, and choose the one
that is faster to get at, like

u32 get_lfsr(u32 v)
{
	int i;
	u32 r = 0xff;

	if (v < 0x7f) {
		for (i = 0; i < v; i++)
			r = lfsr_forwards(r);
	} else {
		for (i = 0; i < (0x100 - v); i++)
			r = lfsr_backwards(r);
	}
	return r;
}
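
For readers who want to try the idea, here is a self-contained sketch with a concrete register. It assumes an 8-bit Galois LFSR with tap mask 0xB8 (x^8 + x^6 + x^5 + x^4 + 1, period 255), picked only because it is small and maximal-length; it is not the polynomial used by the Cell hardware, and the step counting (n steps after the seed) is likewise an assumption for illustration.

```c
#include <stdint.h>

/* Illustrative parameters only, not the Cell BE hardware LFSR. */
#define LFSR_TAPS	0xB8u	/* x^8 + x^6 + x^5 + x^4 + 1 */
#define LFSR_PERIOD	255u	/* number of non-zero 8-bit states */
#define LFSR_SEED	0xFFu

/* One step forwards: shift right, fold the taps in if a 1 fell out. */
static uint8_t lfsr_forwards(uint8_t s)
{
	uint8_t lsb = s & 1u;

	s >>= 1;
	if (lsb)
		s ^= LFSR_TAPS;
	return s;
}

/*
 * One step backwards.  After a forward step the top bit equals the
 * bit that was shifted out, so it tells us whether to undo the XOR.
 */
static uint8_t lfsr_backwards(uint8_t s)
{
	uint8_t msb = s >> 7;

	if (msb)
		s ^= LFSR_TAPS;
	return (uint8_t)((s << 1) | msb);
}

/* State n steps after the seed, walking whichever direction is shorter. */
static uint8_t get_lfsr(unsigned int n)
{
	uint8_t r = LFSR_SEED;
	unsigned int i;

	n %= LFSR_PERIOD;
	if (n <= LFSR_PERIOD / 2) {
		for (i = 0; i < n; i++)
			r = lfsr_forwards(r);
	} else {
		for (i = 0; i < LFSR_PERIOD - n; i++)
			r = lfsr_backwards(r);
	}
	return r;
}
```

Because the sequence is periodic, stepping backwards (period - n) times lands on the same state as stepping forwards n times, which halves the worst-case loop count.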

Also, if the value doesn't have to be really exact, you could have
a small lookup table with precomputed values, like:

u32 get_lfsr(u32 v)
{
	/* table of precomputed LFSR values, indexed by the top bits of v */
	static const u32 lookup[256] = {
		0xab3492, 0x3e3f34, 0xc47610c, ... /* insert actual values */
	};

	return lookup[v >> 16];
}

Arnd <><

Re: [Cbe-oss-dev] [RFC, PATCH 4/4] Add support to OProfile for profiling Cell BE SPUs -- update

2007-01-30 Thread Arnd Bergmann

On Tuesday 30 January 2007 22:41, Maynard Johnson wrote:
> Arnd Bergmann wrote:

> >>+   kt = ktime_set(0, profiling_interval);
> >>+   if (!spu_prof_running)
> >>+   goto STOP;
> >>+   hrtimer_forward(timer, timer->base->get_time(), kt);
> >>+   return HRTIMER_RESTART;
> > 
> > 
> > is hrtimer_forward really the right interface here? You are ignoring
> > the number of overruns anyway, so hrtimer_start(,,) sounds more
> > correct to me.
> According to Tom Gleixner, "hrtimer_forward is a convenience function to 
> move the expiry time of a timer forward in multiples of the interval, so 
> it is in the future.  After setting the expiry time you restart the 
> timer either with [sic] a return HRTIMER_RESTART (if you are in the 
> timer callback function)."
> > 

Ok, I see. Have you seen the timer actually coming in late, resulting
in hrtimer_forward returning non-zero? I guess it's not a big problem
for statistic data collection if that happens, but you might still want
to be able to see it.

> >>+   /* Since cpufreq_quick_get returns frequency in kHz, we use
> >>+* USEC_PER_SEC here vs NSEC_PER_SEC.
> >>+*/
> >>+   unsigned long nsPerCyc = (USEC_PER_SEC << SCALE_SHIFT)/khzfreq;
> >>+   profiling_interval = (nsPerCyc * cycles_reset) >> SCALE_SHIFT;
> >>+   
> >>+   pr_debug("timer resolution: %lu\n", 
> >>+TICK_NSEC);
> > 
> > 
> > Don't you need to adapt the profiling_interval at run time, when cpufreq
> > changes the core frequency? You should probably use
> > cpufreq_register_notifier() to update this.
> Since OProfile is a statistical profiler, the exact frequency is not 
> critical.  The user is going to be looking for hot spots in their code, 
> so it's all relative.  With that said,  I don't imagine using the 
> cpufreq notification would be a big deal.  We'll look at it.
>
> >>@@ -480,7 +491,22 @@
> >>   struct op_system_config *sys, int num_ctrs)
> >> {
> >>int i, j, cpu;
> >>+   spu_cycle_reset = 0;
> >> 
> >>+   /* The cpufreq_quick_get function requires that cbe_cpufreq module
> >>+* be loaded.  This function is not actually provided and exported
> >>+* by cbe_cpufreq, but it relies on cbe_cpufreq initialize kernel
> >>+* data structures.  Since there's no way for depmod to realize
> >>+* that our OProfile module depends on cbe_cpufreq, we currently
> >>+* are letting the userspace tool, opcontrol, ensure that the
> >>+* cbe_cpufreq module is loaded.
> >>+*/
> >>+   khzfreq = cpufreq_quick_get(smp_processor_id());
> > 
> > 
> > You should probably have a fallback in here in case the cpufreq module
> > is not loaded. There is a global variable ppc_proc_freq (in Hz) that
> > you can access.
>
> Our userspace tool ensures the cpufreq module is loaded.

You should not rely on user space tools to do the right thing in the kernel.

Moreover, if the exact frequency is not that important, as you mentioned
above, you can probably just hardcode a compile-time constant here.
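
As a rough user-space illustration of the interval arithmetic in the quoted patch, combined with a compile-time fallback, consider the sketch below. The SCALE_SHIFT value and the FALLBACK_KHZ constant are assumptions (the quoted hunks show neither), as is treating a frequency of zero as "no value available".

```c
#include <stdint.h>

#define USEC_PER_SEC	1000000ULL
#define SCALE_SHIFT	20		/* assumed; not shown in the quoted patch */
#define FALLBACK_KHZ	3200000ULL	/* hypothetical constant: 3.2 GHz */

/*
 * Convert a cycle count into a timer interval in nanoseconds.
 * With the frequency in kHz, USEC_PER_SEC / khzfreq is the number of
 * nanoseconds per cycle; the shift keeps fractional precision.
 */
static uint64_t profiling_interval_ns(uint64_t khzfreq, uint64_t cycles_reset)
{
	uint64_t ns_per_cyc;

	if (khzfreq == 0)		/* assumed to mean "frequency unknown" */
		khzfreq = FALLBACK_KHZ;

	ns_per_cyc = (USEC_PER_SEC << SCALE_SHIFT) / khzfreq;
	return (ns_per_cyc * cycles_reset) >> SCALE_SHIFT;
}
```

At 3.2 GHz one cycle takes 0.3125 ns, so a reset count of 100000 cycles corresponds to an interval of 31250 ns.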

> >>+ * 
> >>+ * Ideally, we would like to be able to create the cached_info for
> >>+ * an SPU task just one time -- when libspe first loads the SPU 
> >>+ * binary file.  We would store the cached_info in a list.  Then, as
> >>+ * SPU tasks are switched out and new ones switched in, the cached_info
> >>+ * for inactive tasks would be kept, and the active one would be placed
> >>+ * at the head of the list.  But this technique may not work with
> >>+ * current spufs functionality since the spu used in bind_context may
> >>+ * be a different spu than was used in a previous bind_context for a
> >>+ * reactivated SPU task.  Additionally, a reactivated SPU task may be
> >>+ * assigned to run on a different physical SPE.  We will investigate
> >>+ * further if this can be done.
> >>+ *
> >>+ */
> > 
> > 
> > You should stuff a pointer to cached_info into struct spu_context,
> > e.g. 'void *profile_private'.
> > 
> > 
> >>+struct cached_info {
> >>+   vma_map_t * map;
> >>+   struct spu * the_spu;
> >>+   struct kref cache_ref;
> >>+   struct list_head list;
> >>+};
> > 
> > 
> > And replace the 'the_spu' member with a back pointer to the
> > spu_context if you need it.
> > 
> > 
> >>+
> >>+/* A data structure for cached information about active SPU tasks.
> >>+ * Storage is dynamically allocated, sized as
> >>+ * "number of active nodes multplied by 8". 
> >>+ * The info_list[n] member holds 0 or more 
> >>+ * 'struct cached_info' objects for SPU#=n. 
> >>+ *
> >>+ * As currently implemented, there will only ever be one cached_info 
> >>+ * in the list for a given SPU.  If we can devise a way to maintain
> >>+ * multiple cached_infos in our list, then it would make sense
> >>+ * to also cache the dcookie representing the PPU task application.
> >>+ * See above description of struct cached_info for more details.
> >>+ */
> >>+struct spu_info_stacks {
> >>+

Re: [PATCH 0 of 4] Generic AIO by scheduling stacks

2007-01-30 Thread Nick Piggin


Nick Piggin wrote:

Linus Torvalds wrote:



On Wed, 31 Jan 2007, Benjamin Herrenschmidt wrote:


- We would now have some measure of task_struct concurrency.  Read that twice,
it's scary.  As two fibrils execute and block in turn they'll each be
referencing current->.  It means that we need to audit task_struct to make sure
that paths can handle racing as its scheduled away.  The current implementation
*does not* let preemption trigger a fibril switch.  So one only has to worry
about racing with voluntary scheduling of the fibril paths.  This can mean
moving some task_struct members under an accessor that hides them in a struct
in task_struct so they're switched along with the fibril.  I think this is a
manageable burden.



That's the one scaring me in fact ... Maybe it will end up being an easy
one but I don't feel too comfortable...




We actually have almost zero "interesting" data in the task-struct.

All the real meat of a task has long since been split up into 
structures that can be shared for threading anyway (ie 
signal/files/mm/etc).


Which is why I'm personally very comfy with just re-using task_struct 
as-is.


NOTE! This is with the understanding that we *never* do any 
preemption. The whole point of the microthreading as far as I'm 
concerned is exactly that it is cooperative. It's not preemptive, and 
it's emphatically *not* concurrent (ie you'd never have two fibrils 
running at the same time on separate CPU's).
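
The cooperative model described above can be tried out in userspace. The following is only a sketch using the glibc `getcontext`/`makecontext`/`swapcontext` calls, not the fibril patch itself; `struct shared_task`, `run_fibrils()` and the other names are made up for illustration. Two execution stacks share one piece of state (standing in for task_struct) and switch only at explicit yield points, so they never race.

```c
#include <assert.h>
#include <stddef.h>
#include <ucontext.h>

#define FIBRIL_STACK_SIZE (64 * 1024)

/* State shared by every fibril, standing in for task_struct. */
struct shared_task {
    char   trace[32];
    size_t len;
};

static struct shared_task task;
static ucontext_t main_ctx, fibril_ctx[2];
static char stacks[2][FIBRIL_STACK_SIZE];

/* Append one character to the shared trace. */
static void record(char c)
{
    assert(task.len + 1 < sizeof(task.trace));
    task.trace[task.len++] = c;
    task.trace[task.len] = '\0';
}

/* Each fibril touches the shared state, then yields voluntarily. */
static void fibril_a(void)
{
    record('A');
    swapcontext(&fibril_ctx[0], &fibril_ctx[1]);   /* "block": let B run */
    record('a');
    /* returning resumes uc_link, which is fibril B */
}

static void fibril_b(void)
{
    record('B');
    swapcontext(&fibril_ctx[1], &fibril_ctx[0]);   /* "block": let A run */
    record('b');
    /* returning resumes uc_link, which is the main context */
}

static void fibril_init(int i, void (*fn)(void), ucontext_t *next)
{
    getcontext(&fibril_ctx[i]);
    fibril_ctx[i].uc_stack.ss_sp   = stacks[i];
    fibril_ctx[i].uc_stack.ss_size = sizeof(stacks[i]);
    fibril_ctx[i].uc_link          = next;
    makecontext(&fibril_ctx[i], fn, 0);
}

/* Run both fibrils to completion and return the order of their steps. */
const char *run_fibrils(void)
{
    task.len = 0;
    task.trace[0] = '\0';

    fibril_init(0, fibril_a, &fibril_ctx[1]);
    fibril_init(1, fibril_b, &main_ctx);

    swapcontext(&main_ctx, &fibril_ctx[0]);
    return task.trace;
}
```

Because a switch can only happen inside `swapcontext()`, `record()` needs no locking even though both stacks write to the same structure. That is the property being relied on when task_struct is reused as-is.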



So using stacks to hold state is (IMO) the logical choice to do async
syscalls, especially once you have a look at some of the other AIO
stuff going around.

I always thought that the AIO people didn't do this because they wanted
to avoid context switch overhead.

So now if we introduce the context switch overhead back, why do we need
just another scheduling primitive? What's so bad about using threads? The
upside is that almost everything is already there and working, and also
they don't have any of these preemption or concurrency restrictions.


In other words, while I share the appreciation for this clever trick of
using cooperative switching between these little thriblets, I don't
actually feel it is very elegant to then have to change the kernel so
much in order to handle them.

I would be fascinated to see where such a big advantage comes from using
these rather than threads. Maybe we can then improve threads not to suck
so much and everybody wins.

--
SUSE Labs, Novell Inc.

Re: [PATCH] AGPGART compat ioctl

2007-01-30 Thread Zwane Mwaikambo

On Tue, 30 Jan 2007, Kyle McMartin wrote:

> On Sat, Jan 27, 2007 at 07:28:07PM -0800, Zwane Mwaikambo wrote:
> > Hi Dave,
> > The following video card requires the agpgart driver ioctl 
> > interface in order to detect video memory.
> > 
> 
> Tested with testgart.c on parisc64, seems to work alright. Thanks for
> doing this work, Zwane. I've been meaning to do compat_ioctl for
> agpgart for months.

Kyle,
Thanks for testing! Hopefully Dave can just queue it up for me.

Cheers,
Zwane

Re: How many people are using 2.6.16?

2007-01-30 Thread Bron Gondwana

On Tue, Jan 30, 2007 at 06:36:48PM -0800, Linus Torvalds wrote:
> 
> 
> On Tue, 30 Jan 2007, Mark Lord wrote:
> > 
> > I believe our featherless leader said he thought it was an ancient bug,
> > exacerbated by something that went into 2.6.19.
> > 
> > If Linus's opinion is correct (still?), then the bug exists in all
> > kernels since somewhere back in the 2.4.xx days.
> 
> The issue was somewhat confused by people certainly *reporting* it for 
> older kernels. Also, as part of the dirty bit cleanups and sanity 
> checking we did actually seem to fix a long-standing CIFS corruption (and 
> apparently reiserfs/XFS problems too).
> 
> But the *common* case was actually introduced with 2.6.19, and 2.6.16 
> wouldn't be affected. 

We run on reiserfs.  I did try ext3 for a little while on a couple of
servers but performance was really awful compared to reiser, and we
heaved a sigh of relief when we finally migrated all the users off
those filesystems.  There were many complaints about the speed of our
service for a while.

I'm really hoping this is the cause, because we do still see occasional
corruption of MMAPed files under heavy load, though less often now
that we've balanced our servers to the point where load spikes are
much less common.

The servers are using either internal Areca cards or LSI SCSI adaptors
connected to external SATA raid boxes.  Either way, there's a few
terabytes of SATA attached to each box, with 10kRPM drives in RAID1
for Cyrus's metadata and 7.2k bigger drives in RAID5 for the actual
emails.  According to iostat these drives are being utilised at over
50% of available bandwidth even now during the "quiet time" - there
are many tens of thousands of users per machine - so we tend to
stress the IO subsystem quite a lot.

Cyrus is also very liberal in its use of MMAP, so we get to push
all sorts of exciting edge cases.  We were still applying patches
to reiserfs until recently, and I'm not sure what the status of that
is (Hans Reiser said to keep harassing him about it - but he's
hardly in a position to be dealing with our issues right now).

Thankfully, now that we're using 300GB maximum rather than 2TB
partitions (running multiple instances of Cyrus instead) with
the associated smaller mailboxes.db (the biggest MMAPed and
frequently updated file) things seem less edgy.  I don't like
edgy (cue Ubuntu jokes).

Anyway, I'm hoping to update one of our boxes to 2.6.19.2 soon.
We do have one box running a 2.6.18 series kernel which has been
fine as well.  I'll give feedback if we see any issues with MMAP
on there.

Bron.

[patch] kbuild: improving gcc option checking

2007-01-30 Thread Oleg Verych

kbuild: improving option checking, Kbuild.include cleanup

 GNU binutils, root users, tmpfiles, external modules ro builds must
 be fixed to do the right thing now.

Cc: Roman Zippel <[EMAIL PROTECTED]>
Cc: Sam Ravnborg <[EMAIL PROTECTED]>
Cc: Horst Schirmeier <[EMAIL PROTECTED]>
Cc: Jan Beulich <[EMAIL PROTECTED]>
Cc: Daniel Drake <[EMAIL PROTECTED]>
Cc: Andi Kleen <[EMAIL PROTECTED]>
Cc: Randy Dunlap <[EMAIL PROTECTED]>
Signed-off-by: Oleg Verych <[EMAIL PROTECTED]>
---

-- all checks by shell united in one macro -- checker-shell;
-- one disposable output sym. link to /dev/null per shell,
   thus no racing, `-Z' is removed;
-- modules' build output directory is used, if supplied;
-- every option checking function calls shell wrapper, acquires probe;
-- `echo -e' bashism substituted (people with sh != bash have distinct
   CC options!);
-- some spelling and sense added to the comments;
-- small shuffle of whitespace.
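
For readers who do not read make syntax fluently, the core idea of the checker-shell macro (the probe's exit code chooses between two strings, and all probe output is thrown away) can be sketched in C. This is not part of the patch; `check_option()` is a made-up name, and the temporary symlink to /dev/null that the macro creates is left out.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Run probe_cmd through the shell with its output discarded.
 * Return `ok` if it exits with status 0, `otherwise` in every other case.
 */
const char *check_option(const char *probe_cmd,
                         const char *ok, const char *otherwise)
{
    char cmd[1024];
    int n;

    assert(probe_cmd != NULL);

    n = snprintf(cmd, sizeof(cmd), "%s >/dev/null 2>&1", probe_cmd);
    if (n < 0 || (size_t)n >= sizeof(cmd))
        return otherwise;            /* command too long: treat as failure */

    return system(cmd) == 0 ? ok : otherwise;
}
```

A call such as `check_option("cc -march=foo -S -xc /dev/null -o /dev/null", "-march=foo", "")` would then play the role of cc-option; the compiler name and flags here are placeholders.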

Almost everyone who discussed this back in October is on the CC list.
Sam Ravnborg has not had much time to reply. I've added Roman Zippel, as
a very kind reviewer and committer of the previous fix. Comments and
testing are appreciated. Thanks.

Note. `Mail-Followup-To' header is used to supply reply-to list.

-o--=O`C
 #oo'L O
<___=E M

--- linux-2.6.20-rc6/scripts/Kbuild.include~4-gcc-option-check  2007-01-12 19:54:26.0 +0100
+++ linux-2.6.20-rc6/scripts/Kbuild.include 2007-01-31 05:56:58.942445500 +0100
@@ -2,5 +2,5 @@
 # kbuild: Generic definitions
 
-# Convinient variables
+# Convenient constants
 comma   := ,
 squote  := '
@@ -57,38 +57,44 @@ endef
 # See documentation in Documentation/kbuild/makefiles.txt
 
-# output directory for tests below
-TMPOUT := $(if $(KBUILD_EXTMOD),$(firstword $(KBUILD_EXTMOD))/)
+# checker-shell
+# Usage: option = $(call checker-shell, $(CC)...-o $$OUT, option-ok, otherwise)
+# Exit code chooses option. $$OUT is safe location for needless output.
+define checker-shell
+  $(shell set -e; \
+DIR=$(KBUILD_EXTMOD); \
+cd $${DIR:-$(objtree)}; \
+OUT=$$PWD/..null; \
+\
+ln -s /dev/null $$OUT; \
+if $(1) >/dev/null 2>&1; \
+  then echo "$(2)"; \
+  else echo "$(3)"; \
+fi; \
+rm -f $$OUT)
+endef
 
 # as-option
 # Usage: cflags-y += $(call as-option, -Wa$(comma)-isa=foo,)
-
-as-option = $(shell if $(CC) $(CFLAGS) $(1) -Wa,-Z -c -o /dev/null \
--xassembler /dev/null > /dev/null 2>&1; then echo "$(1)"; \
-else echo "$(2)"; fi ;)
+as-option = $(call checker-shell, \
+   $(CC) $(CFLAGS) $(1) -c -xassembler /dev/null -o $$OUT, $(1), $(2))
 
 # as-instr
 # Usage: cflags-y += $(call as-instr, instr, option1, option2)
-
-as-instr = $(shell if echo -e "$(1)" | \
- $(CC) $(AFLAGS) -c -xassembler - \
-   -o $(TMPOUT)astest.out > /dev/null 2>&1; \
-  then rm $(TMPOUT)astest.out; echo "$(2)"; \
-  else echo "$(3)"; fi)
+as-instr = $(call checker-shell, \
+   printf "$(1)" | $(CC) $(AFLAGS) -c -xassembler -o $$OUT -, $(2), $(3))
 
 # cc-option
 # Usage: cflags-y += $(call cc-option, -march=winchip-c6, -march=i586)
-
-cc-option = $(shell if $(CC) $(CFLAGS) $(1) -S -o /dev/null -xc /dev/null \
- > /dev/null 2>&1; then echo "$(1)"; else echo "$(2)"; fi ;)
+cc-option = $(call checker-shell, \
+   $(CC) $(CFLAGS) $(if $(3),$(3),$(1)) -S -xc /dev/null -o $$OUT, $(1), $(2))
 
 # cc-option-yn
 # Usage: flag := $(call cc-option-yn, -march=winchip-c6)
-cc-option-yn = $(shell if $(CC) $(CFLAGS) $(1) -S -o /dev/null -xc /dev/null \
-> /dev/null 2>&1; then echo "y"; else echo "n"; fi;)
+cc-option-yn = $(call cc-option, "y", "n", $(1))
 
 # cc-option-align
 # Prefix align with either -falign or -malign
 cc-option-align = $(subst -functions=0,,\
-   $(call cc-option,-falign-functions=0,-malign-functions=0))
+   $(call cc-option,-falign-functions=0,-malign-functions=0))
 
 # cc-version
@@ -98,15 +104,13 @@ cc-version = $(shell $(CONFIG_SHELL) $(s
 # cc-ifversion
 # Usage:  EXTRA_CFLAGS += $(call cc-ifversion, -lt, 0402, -O1)
-cc-ifversion = $(shell if [ $(call cc-version, $(CC)) $(1) $(2) ]; then \
-   echo $(3); fi;)
+cc-ifversion = $(shell [ $(call cc-version, $(CC)) $(1) $(2) ] && echo $(3))
 
 # ld-option
 # Usage: ldflags += $(call ld-option, -Wl$(comma)--hash-style=both)
-ld-option = $(shell if $(CC) $(1) -nostdlib -xc /dev/null \
--o $(TMPOUT)ldtest.out > /dev/null 2>&1; \
- then rm $(TMPOUT)ldtest.out; echo "$(1)"; \
- else echo "$(2)"; fi)
+ld-option = $(call checker-shell, \
+   $(CC) $(1) -nostdlib -xc /dev/null -o $$OUT, $(1), $(2))
+
+##
 
-###
 # Shorthand for $(Q)$(MAKE) -f scripts/Makefile.build obj=
 # Usage:
@@ -114,17 +118,26 @@ ld-option = $(shell if $(CC) $(1) -nostd
 build := -f $(if $(KBUILD_SRC),$(srctree)/)scripts/Makefile.build obj
 
-# Prefix -I with $(srctree) if it is not an absolute path

Re: [PATCH 0 of 4] Generic AIO by scheduling stacks

2007-01-30 Thread Nick Piggin


Linus Torvalds wrote:


On Wed, 31 Jan 2007, Benjamin Herrenschmidt wrote:



- We would now have some measure of task_struct concurrency.  Read that twice,
it's scary.  As two fibrils execute and block in turn they'll each be
referencing current->.  It means that we need to audit task_struct to make sure
that paths can handle racing as its scheduled away.  The current implementation
*does not* let preemption trigger a fibril switch.  So one only has to worry
about racing with voluntary scheduling of the fibril paths.  This can mean
moving some task_struct members under an accessor that hides them in a struct
in task_struct so they're switched along with the fibril.  I think this is a
manageable burden.


That's the one scaring me in fact ... Maybe it will end up being an easy
one but I don't feel too comfortable...



We actually have almost zero "interesting" data in the task-struct.

All the real meat of a task has long since been split up into structures 
that can be shared for threading anyway (ie signal/files/mm/etc).


Which is why I'm personally very comfy with just re-using task_struct 
as-is.


NOTE! This is with the understanding that we *never* do any preemption. 
The whole point of the microthreading as far as I'm concerned is exactly 
that it is cooperative. It's not preemptive, and it's emphatically *not* 
concurrent (ie you'd never have two fibrils running at the same time on 
separate CPU's).


So using stacks to hold state is (IMO) the logical choice to do async
syscalls, especially once you have a look at some of the other AIO
stuff going around.

I always thought that the AIO people didn't do this because they wanted
to avoid context switch overhead.

So now if we introduce the context switch overhead back, why do we need
just another scheduling primitive? What's so bad about using threads? The
upside is that almost everything is already there and working, and also
they don't have any of these preemption or concurrency restrictions.

The only thing I saw in Zach's post against the use of threads is that
some kernel API would change. But surely if this is the showstopper then
there must be some better argument than sys_getpid()?!

Aside from that, I'm glad that someone is looking at this way for AIO,
because I really don't like some aspects in the other approach.

--
SUSE Labs, Novell Inc.

Re: Reducing warning output from pci_get_subsys()

2007-01-30 Thread Kumar Gala



On Jan 30, 2007, at 11:11 PM, Andrew Morton wrote:


On Tue, 30 Jan 2007 22:55:38 -0600 (CST)
Kumar Gala <[EMAIL PROTECTED]> wrote:


Greg,

There was some code added to warn if pci_get_subsys() is called and the
pci_devices is empty.

I'm wondering if there is some point at which we know it's OK for the
pci_devices list to be empty if there are no devices on the bus so we can
stop printing the message.

On an embedded PPC reference system I see this message 6 times when I've
got no cards in the PCI slots.



I'd suggest we just remove the warning.  Also the one in pci_find_subsys().

I let them go through because I was curious to know what would cause it to
trigger - it might be an indication of other bugs.  But nothing very
interesting happened and they're of no use.


Works for me.

- k

Re: [PATCH 0 of 4] Generic AIO by scheduling stacks

2007-01-30 Thread Benjamin Herrenschmidt


> NOTE! This is with the understanding that we *never* do any preemption. 
> The whole point of the microthreading as far as I'm concerned is exactly 
> that it is cooperative. It's not preemptive, and it's emphatically *not* 
> concurrent (ie you'd never have two fibrils running at the same time on 
> separate CPU's).

That makes it indeed much less worrisome...

> If you want preemptive or concurrent CPU usage, you use separate threads. 
> The point of AIO scheduling is very much inherent in its name: it's for 
> filling up CPU's when there's IO.

Ok, I see, that's in fact pretty similar to some task switching hack I
did about 10 years ago on MacOS to have "asynchronous" IO code be
implemented linearly :-)

Makes lots of sense imho.

Ben.



Re: Reducing warning output from pci_get_subsys()

2007-01-30 Thread Andrew Morton

On Tue, 30 Jan 2007 22:55:38 -0600 (CST)
Kumar Gala <[EMAIL PROTECTED]> wrote:

> Greg,
> 
> There was some code added to warn if pci_get_subsys() is called and the 
> pci_devices is empty.
> 
> I'm wondering if there is some point at which we know it's OK for the 
> pci_devices list to be empty if there are no devices on the bus so we can 
> stop printing the message.
> 
> On an embedded PPC reference system I see this message 6 times when I've 
> got no cards in the PCI slots.
> 

I'd suggest we just remove the warning.  Also the one in pci_find_subsys().

I let them go through because I was curious to know what would cause it to
trigger - it might be an indication of other bugs.  But nothing very
interesting happened and they're of no use.


Re: [PATCH] scsi_lib.c: continue after MEDIUM_ERROR

2007-01-30 Thread Douglas Gilbert

Ric Wheeler wrote:
> 
> 
> Mark Lord wrote:
> 
>> Eric D. Mudama wrote:
>>
>>>
>>> Actually, it's possibly worse, since each failure in libata will
>>> generate 3-4 retries.  With existing ATA error recovery in the
>>> drives, that's about 3 seconds per retry on average, or 12 seconds
>>> per failure.  Multiply that by the number of blocks past the error to
>>> complete the request..
>>
>>
>> It really beats the alternative of a forced reboot
>> due to, say, superblock I/O failing because it happened
>> to get merged with an unrelated I/O which then failed..
>> Etc..
>>
>> Definitely an improvement.
>>
>> The number of retries is an entirely separate issue.
>> If we really care about it, then we should fix SD_MAX_RETRIES.
>>
>> The current value of 5 is *way* too high.  It should be zero or one.
>>
>> Cheers
>>
> I think that drives retry enough, we should leave retry at zero for
> normal (non-removable) drives. Should this  be a policy we can set like
> we do with NCQ queue depth via /sys ?
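
Eric's estimate quoted above is easy to put into numbers. The helper below is only back-of-the-envelope arithmetic with a made-up name; it assumes every block past the error fails in the same way and costs the same number of retries.

```c
#include <assert.h>

/*
 * Time spent in retries (seconds) before a request with `failing_blocks`
 * bad blocks finally completes.
 */
unsigned long worst_case_stall_secs(unsigned long failing_blocks,
                                    unsigned int retries_per_failure,
                                    unsigned int secs_per_retry)
{
    assert(secs_per_retry > 0);
    return failing_blocks * retries_per_failure * secs_per_retry;
}
```

With 4 retries at about 3 seconds each, one failure costs 12 seconds, and a request with 100 unreadable blocks stalls for 1200 seconds (20 minutes). Cutting retries to 1 brings the same request down to 300 seconds.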

The transport might also want a say. I see ABORTED COMMAND
errors often enough with SAS (e.g. due to expander congestion)
to warrant at least one retry (which works in my testing).
SATA disks behind SAS infrastructure would also be
susceptible to the same "random" failures.

Transport Layer Retries (TLR) in SAS should remove this class
of transport errors but only SAS tape drives support TLR as
far as I know.

Doug Gilbert

> We need to be able to layer things like MD on top of normal drive errors
> in a way that will produce a system that provides reasonable response
> time despite any possible IO error on a single component.  Another case
> that we end up doing on a regular basis is drive recovery. Errors need
> to be limited in scope to just the impacted area and dispatched up to
> the application layer as quickly as we can so that you don't spend days
> watching a copy of a huge drive (think 750GB or more) ;-)
> 
> ric



Reducing warning output from pci_get_subsys()

2007-01-30 Thread Kumar Gala


Greg,

There was some code added to warn if pci_get_subsys() is called and the 
pci_devices is empty.


I'm wondering if there is some point at which we know it's OK for the 
pci_devices list to be empty if there are no devices on the bus so we can 
stop printing the message.


On an embedded PPC reference system I see this message 6 times when I've 
got no cards in the PCI slots.


- k

Re: [Need Help] Cpuhotplug operations on 32-bit mode of xeon-64bit processor crashes the system.

2007-01-30 Thread Srinivasa DS


Siddha, Suresh B wrote:

Sorry for my delayed response. I was away on vacation.

What platform is this? what do you mean by crashing? Do you see a
system freeze or oops?
  
It's a 64-bit Xeon processor running in 32-bit compatibility mode
(i386 code). We have not seen this problem in an x86_64 environment;
it happens only in 32-bit compatibility mode.


The problem is in the calculation of APIC IDs and the delivery of IPIs.

I saw an oops when I did cpuhotplug operations on it.

If you want any further information, please feel free to ask.

Thanks
Srinivasa Ds


thanks,
suresh

On Mon, Jan 22, 2007 at 01:42:48PM +0530, Srinivasa Ds wrote:
  
I saw cpuhotplug operations crash the system in the 32-bit mode of 64-bit
Xeon processors. This also happens on the latest 2.6.20-rc5 kernel. The same
i386 cpuhotplug code runs fine on 32-bit Xeon processors.

Steps to reproduce.

echo 0 > /sys/devices/system/cpu/cpu6/online
echo 1 > /sys/devices/system/cpu/cpu6/online

dmesg shows.
==
Breaking affinity for irq 4
cpu_mask_to_apicid: Not a valid mask!
CPU 6 is now offline
===

On debugging the problem, I found that the problem is not in the cpuhotplug
code but in the APIC part. Execution of "stale" IPIs by onlined CPUs (which we
offlined earlier) is causing the crash. Now we need to debug why IPIs
are reaching the offlined CPUs too.


1) During the calculation of APIC IDs, if the CPU to which the IPI has to
be delivered is not in the same APIC cluster, it prints the "Not a valid
mask" error and returns 0xFF, which means the IPI is broadcast to all CPUs
(including the offlined ones), and hence the problem.


2) I booted the system with the maxcpus=2 boot parameter and tried CPU
hotplugging on it, but the problem still recurs (I think there is no
concept of APIC clusters if there are only 2 CPUs). Hence I conclude that
the problem is in the delivery of IPIs.
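
The behaviour described in point 1 can be illustrated with a small userspace model. It is only a sketch of the logic as reported, not the kernel's cpu_mask_to_apicid(); the ID table, `mask_to_apicid()` and the cluster layout (high nibble is the cluster, low nibble a one-hot CPU bit) are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define NCPUS          8
#define APIC_BROADCAST 0xFFu

/* Hypothetical logical APIC IDs: high nibble = cluster, low nibble = CPU bit. */
static const uint8_t logical_apicid[NCPUS] = {
    0x01, 0x02, 0x04, 0x08,   /* cluster 0 */
    0x11, 0x12, 0x14, 0x18,   /* cluster 1 */
};

static unsigned int apicid_cluster(unsigned int apicid)
{
    return apicid & 0xF0u;
}

/*
 * Combine the logical IDs of all CPUs in cpumask into one destination.
 * If the CPUs span more than one cluster there is no single valid
 * destination, so fall back to broadcast (the "Not a valid mask!" case).
 */
unsigned int mask_to_apicid(unsigned int cpumask)
{
    unsigned int apicid = 0;
    int seen = 0;

    assert((cpumask >> NCPUS) == 0);

    for (int cpu = 0; cpu < NCPUS; cpu++) {
        if (!(cpumask & (1u << cpu)))
            continue;

        unsigned int id = logical_apicid[cpu];
        if (seen && apicid_cluster(id) != apicid_cluster(apicid))
            return APIC_BROADCAST;
        apicid |= id;
        seen = 1;
    }

    /* Treating an empty mask as broadcast is an assumption of this sketch. */
    return seen ? apicid : APIC_BROADCAST;
}
```

In this model a mask of CPUs 0 and 1 yields 0x03, while a mask of CPUs 0 and 4 crosses clusters and yields 0xFF. A broadcast destination reaches every CPU, which is how an interrupt can end up at a CPU that was supposed to be offline.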


So I am completely stuck here and unable to move forward in debugging.
Could someone (maybe the Intel folks) please throw some light on this?


Thanks in advance
  Srinivasa DS
  LTC-IBM




Need all the patches from kernel-2.6.7 to kernel-2.6.19 related to VM

2007-01-30 Thread Seetharam Dharmosoth

Hi,

Can you please suggest where I can get the VM
patches from kernel-2.6.7 to kernel-2.6.19?

Thanks
Ram

 




Linux 2.6.20-rc7

2007-01-30 Thread Linus Torvalds


Yes, I know I said I would only do -rc6 and then the final 2.6.20, but the 
thing is, the known regressions list didn't get whittled down as quickly 
as I hoped, and as a result we now have a -rc7.

It's in good enough shape that I'd probably have been happy to just 
release it as 2.6.20, but since I want 2.6.20 to be a stability release, I 
didn't want to risk any stupid bugs while the regressions got fixed, so 
here's a final -rc7.

In other words, please do give it a good testing. We should have fixed the 
nasty stuff on Adrian's list (and here's another thanks to Adrian for 
keeping me on my toes!) and it's all good. But please give it a quick 
shake-down to make sure that nothing silly happened while fixing the bad 
stuff.

The shortlog really does say most of it - this is just various fixes for a 
number of mostly fairly inconsequential things, but the SG_IO timeout bug 
that hit any NeroLinux user would quite possibly impact other DVD/CD 
reader/writer programs too that used raw commands with timeouts.

The diffstat just looks like line-noise: 244 files changed with an average 
of less than 10 lines per file changed in 179 commits. In other words, 
really no big diffs: it's just a lot of really small stuff.

Linus

---
Adam Litke (1):
  Don't allow the stack to grow into hugetlb reserved regions

Adrian Bunk (1):
  fs/lockd/clntlock.c: add missing newlines to dprintk's

Ahmed S. Darwish (1):
  [CPUFREQ] check sysfs_create_link return value

Al Viro (9):
  b44: src_desc->addr is little-endian
  missing exports of pm_power_off() on alpha and sparc32
  mtd/nand/cafe.c missing include of dma-mapping.h
  sym53c500_cs: remove bogus call fo free_dma()
  pata_platform: fallout from set_mode() change
  missing dma_sync_single_range_for{cpu,device} on alpha
  dma-mapping.h stubs fix
  b44: src_desc->addr is little-endian
  fix indentation-related breakage in Kconfig.i386

Alan Cox (5):
  ide/generic: Jmicron has its own drivers now
  libata cmd64x: whack into a shape that looks like the documentation
  libata hpt3xn: Hopefully sort out the DPLL logic versus the vendor code
  libata: set_mode, Fix the FIXME
  libata-sff: Don't call bmdma_stop on non DMA capable controllers

Alexey Dobriyan (2):
  Fix NULL ->nsproxy dereference in /proc/*/mounts
  core-dumping unreadable binaries via PT_INTERP

Andrew Morton (5):
  jmicron: fix warning
  pata_platform: set_mode fix
  82596 warning fixes
  m68k: uaccess.h needs sched.h
  ntfs: kmap_atomic() atomicity fix

Andrew Victor (6):
  [ARM] 4084/1: Remove CONFIG_DEBUG_WAITQ
  [ARM] 4085/1: AT91: Header fixes.
  [ARM] 4086/1: AT91: Whitespace cleanup
  [ARM] 4087/1: AT91: CPU reset for SAM9x processors
  [ARM] 4088/1: AT91: Unbalanced IRQ in serial driver suspend/resume
  [ARM] 4089/1: AT91: GPIO wake IRQ cleanup

Andy Gospodarek (1):
  bonding: ARP monitoring broken on x86_64

Atsushi Nemoto (1):
  SPI: alternative fix for spi_busnum_to_master

Auke Kok (1):
  e100: fix irq leak on suspend/resume

Avi Kivity (3):
  KVM: Emulate IA32_MISC_ENABLE msr
  KVM: MMU: Perform access checks in walk_addr()
  KVM: MMU: Report nx faults to the guest

Bartlomiej Zolnierkiewicz (3):
  ide: update MAINTAINERS entry
  ia64: add pci_get_legacy_ide_irq()
  ide: add missing __init tags to IDE PCI host drivers

Baruch Even (1):
  [TCP]: Fix sorting of SACK blocks.

Ben Dooks (4):
  [ARM] 4095/1: S3C24XX: Fix GPIO set for Bank A
  [ARM] 4096/1: S3C24XX: change return code form s3c2410_gpio_getcfg()
  S3C24XX: fix passing spi chipselect to select routine
  [ARM] 4117/1: S3C2412: Fix writel() usage in selection code

Benjamin Herrenschmidt (1):
  [POWERPC] Fix sys_pciconfig_iobase bus matching

Catalin Marinas (2):
  [ARM] 4112/1: Only ioremap to supersections if DOMAIN_IO is zero
  [ARM] 4111/1: Allow VFP to work with thread migration on SMP

Conke Hu (3):
  atiixp.c: remove unused code
  atiixp.c: sb600 ide only has one channel
  atiixp.c: add cable detection support for ATI IDE

Dan Williams (1):
  [ARM] 4100/1: iop3xx: fix cpu mask for iop333

Dave Jones (5):
  [AGPGART] Prevent (unlikely) memory leak in amd_create_gatt_pages()
  [AGPGART] Remove pointless typedef in ati-agp
  [AGPGART] Remove pointless assignment.
  [AGPGART] Add new IDs to VIA AGP.
  [CPUFREQ] Remove unneeded errata workaround from p4-clockmod.

David Barksdale (1):
  IPMI: fix timeout list handling

David Milburn (1):
  libata-scsi: ata_task_ioctl should return ATA registers from sense data

David S. Miller (4):
  [AF_PACKET]: Fix BPF handling.
  [AF_PACKET]: Check device down state before hard header callbacks.
  [TCP]: Restore SKB socket owner setting in tcp_transmit_skb().
  [SPARC64]: Set g4/g5 properly in sun4v dtlb-prot handling.

David Woodhouse (1):

Re: [PATCH] scsi_lib.c: continue after MEDIUM_ERROR

2007-01-30 Thread James Bottomley

On Tue, 2007-01-30 at 22:20 -0500, Ric Wheeler wrote:
> Mark Lord wrote:
> > The number of retries is an entirely separate issue.
> > If we really care about it, then we should fix SD_MAX_RETRIES.
> >
> > The current value of 5 is *way* too high.  It should be zero or one.
> >
> > Cheers
> >
> I think that drives retry enough, we should leave retry at zero for 
> normal (non-removable) drives. Should this  be a policy we can set like 
> we do with NCQ queue depth via /sys ?

I don't disagree that it should be settable.  However, retries occur for
other reasons than failures inside the device.  The most standard ones
are unit attentions generated because of other activity (target reset
etc).  The key to the problem is retrying only operations that are
genuinely retryable, which the mid-layer doesn't do such a good job on.
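
As a rough illustration of "retry only what is genuinely retryable", here is a sketch of a policy keyed on the SCSI sense key. It is not the mid-layer's actual logic; `worth_retrying()` and the choice of which keys to retry are assumptions based on the cases mentioned in this thread (unit attentions, and the ABORTED COMMAND errors Doug sees on SAS). The sense key values are the standard SCSI ones.

```c
#include <assert.h>
#include <stdbool.h>

/* Standard SCSI sense key values. */
enum sense_key {
    SK_NO_SENSE        = 0x0,
    SK_RECOVERED_ERROR = 0x1,
    SK_NOT_READY       = 0x2,
    SK_MEDIUM_ERROR    = 0x3,
    SK_HARDWARE_ERROR  = 0x4,
    SK_ILLEGAL_REQUEST = 0x5,
    SK_UNIT_ATTENTION  = 0x6,
    SK_ABORTED_COMMAND = 0xb,
};

/*
 * Decide whether a failed command should be reissued.
 * attempts:    retries already made for this command
 * max_retries: tunable limit (the knob discussed above as a /sys setting)
 */
bool worth_retrying(enum sense_key key, unsigned int attempts,
                    unsigned int max_retries)
{
    assert(key <= 0xf);              /* sense keys are 4 bits wide */

    if (attempts >= max_retries)
        return false;

    switch (key) {
    case SK_UNIT_ATTENTION:          /* e.g. after a target reset */
    case SK_ABORTED_COMMAND:         /* e.g. transport congestion */
        return true;
    default:
        /* Medium errors were already retried inside the drive. */
        return false;
    }
}
```

With `max_retries` set to 1, a unit attention still gets its single retry, while a medium error is reported upward immediately.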

> We need to be able to layer things like MD on top of normal drive errors 
> in a way that will produce a system that provides reasonable response 
> time despite any possible IO error on a single component.  Another case 
> that we end up doing on a regular basis is drive recovery. Errors need 
> to be limited in scope to just the impacted area and dispatched up to 
> the application layer as quickly as we can so that you don't spend days 
> watching a copy of a huge drive (think 750GB or more) ;-)

For the MD case, this is what REQ_FAILFAST is for.

James



Re: Free Linux Driver Development!

2007-01-30 Thread Andrey Borzenkov

Jeff Garzik wrote:

>> When we switch to PATA and drop old ide stack, what will happen ?
>> Will all driver be ported and full-feature, or some will be obsoleted ?
> 
> All drivers for which we can find users will be ported.  If any features
> disappear that's a bug.
> 

Well, I have a long standing issue with pata_ali not detecting CD-ROM in DMA
mode. When I rarely watch DVD I rather boot into legacy IDE kernel ...

-andrey


[PATCH 23/23] clocksource tsc: add verify routine

2007-01-30 Thread Daniel Walker

I've included this as another user of the clocksource interface. I don't see a
use for this across all architectures, so a fully generic version isn't needed.

I modified this from the pre 2.6.20-rc6-mm2 release to make the routine smaller
and to make it select a clocksource inside the timer. It should work with an 
HPET.
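
The comparison the routine makes can be shown in isolation. The following is a standalone userspace sketch of the idea, not the kernel code: `struct clock`, `cyc_delta()` and `tsc_runs_slow()` are illustrative names, and the mult/shift conversion is a simplified stand-in for cyc2ns(). The threshold is the same NSEC_PER_SEC >> 4 (62.5 ms) used in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NSEC_PER_SEC       1000000000LL
#define WATCHDOG_THRESHOLD (NSEC_PER_SEC >> 4)   /* 62.5 ms */

/* Hypothetical clock description: (cycles * mult) >> shift = nanoseconds. */
struct clock {
    uint32_t mult;
    uint32_t shift;
    uint64_t mask;      /* counter width, so wrap-around subtracts cleanly */
};

static int64_t cyc2ns(const struct clock *c, uint64_t cycles)
{
    /* Fine for short intervals; a large interval could overflow 64 bits. */
    return (int64_t)((cycles * c->mult) >> c->shift);
}

static uint64_t cyc_delta(const struct clock *c, uint64_t now, uint64_t last)
{
    return (now - last) & c->mask;
}

/*
 * True if, over the same interval, the TSC accounted for noticeably less
 * time than the reference clock did.
 */
bool tsc_runs_slow(const struct clock *ref, uint64_t ref_now, uint64_t ref_last,
                   const struct clock *tsc, uint64_t tsc_now, uint64_t tsc_last)
{
    int64_t nsecs;

    assert(ref != NULL && tsc != NULL);

    nsecs  = cyc2ns(ref, cyc_delta(ref, ref_now, ref_last));
    nsecs -= cyc2ns(tsc, cyc_delta(tsc, tsc_now, tsc_last));

    return nsecs > WATCHDOG_THRESHOLD;
}
```

Note that, like the check in the patch, this only catches a TSC that falls behind the reference; a TSC running fast gives a negative difference and passes.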

Signed-Off-By: Daniel Walker <[EMAIL PROTECTED]>

---
 arch/i386/kernel/tsc.c  |   55 
 include/linux/clocksource.h |   16 
 2 files changed, 51 insertions(+), 20 deletions(-)

Index: linux-2.6.19/arch/i386/kernel/tsc.c
===
--- linux-2.6.19.orig/arch/i386/kernel/tsc.c
+++ linux-2.6.19/arch/i386/kernel/tsc.c
@@ -343,47 +343,59 @@ static struct dmi_system_id __initdata b
 {}
 };
 
-#define TSC_FREQ_CHECK_INTERVAL (10*MSEC_PER_SEC) /* 10sec in MS */
+#define WATCHDOG_TRESHOLD (NSEC_PER_SEC >> 4)
+#define TSC_FREQ_CHECK_INTERVAL (MSEC_PER_SEC/2)
 static struct timer_list verify_tsc_freq_timer;
+struct clocksource *verify_clock;
 
 /* XXX - Probably should add locking */
 static void verify_tsc_freq(unsigned long unused)
 {
-   static u64 last_tsc;
-   static unsigned long last_jiffies;
-
-   u64 now_tsc, interval_tsc;
-   unsigned long now_jiffies, interval_jiffies;
+   static cycle_t last_tsc, last_verify;
 
+   cycle_t now_tsc, verify_clock_now, interval;
+   s64 nsecs;
 
if (check_tsc_unstable())
return;
 
-   rdtscll(now_tsc);
-   now_jiffies = jiffies;
+   if (unlikely(system_state != SYSTEM_RUNNING))
+   goto out_timer;
 
-   if (!last_jiffies) {
-   goto out;
+   if (unlikely(!verify_clock)) {
+   verify_clock =
+   clocksource_get_masked_clock(CLOCKSOURCE_PM_AFFECTED);
+   printk("TSC: selected %s clocksource for TSC verification.\n",
+  verify_clock->name);
}
 
-   interval_jiffies = now_jiffies - last_jiffies;
-   interval_tsc = now_tsc - last_tsc;
-   interval_tsc *= HZ;
-   do_div(interval_tsc, cpu_khz*1000);
+   now_tsc = clocksource_read(&clocksource_tsc);
+   verify_clock_now = clocksource_read(verify_clock);
+
+   if (!last_tsc)
+   goto out;
 
-   if (interval_tsc < (interval_jiffies * 3 / 4)) {
+   interval = clocksource_subtract(verify_clock, verify_clock_now,
+   last_verify);
+   nsecs = cyc2ns(verify_clock, interval);
+
+   interval = clocksource_subtract(&clocksource_tsc, now_tsc, last_tsc);
+   nsecs -= cyc2ns(&clocksource_tsc, interval);
+
+   if (nsecs > WATCHDOG_TRESHOLD) {
printk("TSC appears to be running slowly. "
-   "Marking it as unstable\n");
+  "Marking it as unstable");
mark_tsc_unstable();
return;
}
 
 out:
-   last_tsc = now_tsc;
-   last_jiffies = now_jiffies;
+   last_tsc = now_tsc;
+   last_verify = verify_clock_now;
+out_timer:
/* set us up to go off on the next interval: */
mod_timer(&verify_tsc_freq_timer,
-   jiffies + msecs_to_jiffies(TSC_FREQ_CHECK_INTERVAL));
+ jiffies + msecs_to_jiffies(TSC_FREQ_CHECK_INTERVAL));
 }
 
 /*
@@ -444,7 +456,10 @@ static int __init init_tsc_clocksource(v
if (check_tsc_unstable())
clocksource_tsc.flags |= CLOCKSOURCE_UNSTABLE |
 CLOCKSOURCE_NOT_CONTINUOUS;
-
+   /*
+* The verify routine will select the right clock after the
+* system boots fully.
+*/
init_timer(&verify_tsc_freq_timer);
verify_tsc_freq_timer.function = verify_tsc_freq;
verify_tsc_freq_timer.expires =
Index: linux-2.6.19/include/linux/clocksource.h
===
--- linux-2.6.19.orig/include/linux/clocksource.h
+++ linux-2.6.19/include/linux/clocksource.h
@@ -207,6 +207,22 @@ static inline s64 cyc2ns(struct clocksou
 }
 
 /**
+ * clocksource_subtract - Subtract two cycle_t timestamps
+ * @cs:Clocksource the timestamp came from
+ * @stop:  Stop timestamp
+ * @start: Start timestamp
+ *
+ * This subtract accounts for rollover by using the clocksource mask.
+ * The clock can roll over only once, after that this subtract will not
+ * work properly.
+ */
+static inline s64 clocksource_subtract(struct clocksource* cs, cycle_t stop,
+  cycle_t start)
+{
+   return ((stop - start) & cs->mask);
+}
+
+/**
  * clocksource_calculate_interval - Calculates a clocksource interval struct
  *
  * @c: Pointer to clocksource.

-- 

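The two ideas in patch 23/23 (a rollover-safe masked subtraction, and comparing the elapsed nanoseconds of two clocks against a threshold) can be sketched in user space as below. This is only an illustration under stated assumptions: struct clock_sketch, masked_subtract(), cycles_to_ns() and runs_slow() are hypothetical stand-ins, not the kernel's definitions; only the (stop - start) & mask expression and the NSEC_PER_SEC >> 4 threshold are taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct clocksource. */
struct clock_sketch {
	uint64_t mask;   /* valid counter bits, e.g. 24 bits for a narrow timer */
	uint32_t mult;   /* cycles-to-nanoseconds multiplier */
	uint32_t shift;  /* cycles-to-nanoseconds divisor, as a power of two */
};

#define NSEC_PER_SEC_SKETCH 1000000000LL
#define DRIFT_LIMIT_NS (NSEC_PER_SEC_SKETCH >> 4)  /* 62.5 ms, as in the patch */

/* Elapsed cycles between two reads; correct across a single rollover. */
uint64_t masked_subtract(const struct clock_sketch *cs,
			 uint64_t stop, uint64_t start)
{
	return (stop - start) & cs->mask;
}

/* Convert a cycle count to nanoseconds using the mult/shift pair. */
int64_t cycles_to_ns(const struct clock_sketch *cs, uint64_t cycles)
{
	assert(cs->shift < 64);
	return (int64_t)((cycles * cs->mult) >> cs->shift);
}

/* Returns 1 if the checked clock lags the reference by more than the limit. */
int runs_slow(int64_t reference_ns, int64_t checked_ns)
{
	return (reference_ns - checked_ns) > DRIFT_LIMIT_NS;
}
```

With a 24-bit mask, a read of 0x000010 taken after a read of 0xFFFFF0 yields 0x20 elapsed cycles even though the raw counter wrapped. As the patch comment notes, this only holds if the counter wraps at most once between reads.
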
[PATCH 01/23] clocksource: drop clocksource-add-verification-watchdog-helper-fix.patch

2007-01-30 Thread Daniel Walker

Drop.

Signed-Off-By: Daniel Walker <[EMAIL PROTECTED]>

---
 kernel/time/clocksource.c |   11 +--
 1 file changed, 1 insertion(+), 10 deletions(-)

Index: linux-2.6.19/kernel/time/clocksource.c
===
--- linux-2.6.19.orig/kernel/time/clocksource.c
+++ linux-2.6.19/kernel/time/clocksource.c
@@ -28,7 +28,6 @@
 #include 
 #include 
 #include 
-#include 
 
 /* XXX - Would like a better way for initializing curr_clocksource */
 extern struct clocksource clocksource_jiffies;
@@ -107,16 +106,8 @@ static void clocksource_watchdog(unsigne
/* Initialized ? */
if (!(cs->flags & CLOCK_SOURCE_WATCHDOG)) {
if ((cs->flags & CLOCK_SOURCE_IS_CONTINUOUS) &&
-   (watchdog->flags & CLOCK_SOURCE_IS_CONTINUOUS)) {
+   (watchdog->flags & CLOCK_SOURCE_IS_CONTINUOUS))
cs->flags |= CLOCK_SOURCE_VALID_FOR_HRES;
-   /*
-* We just marked the clocksource as
-* highres-capable, notify the rest of the
-* system as well so that we transition
-* into high-res mode:
-*/
-   tick_clock_notify();
-   }
cs->flags |= CLOCK_SOURCE_WATCHDOG;
cs->wd_last = csnow;
} else {

-- 

Re: [PATCH] scsi_lib.c: continue after MEDIUM_ERROR

2007-01-30 Thread Ric Wheeler




Mark Lord wrote:


Eric D. Mudama wrote:



Actually, it's possibly worse, since each failure in libata will 
generate 3-4 retries.  With existing ATA error recovery in the 
drives, that's about 3 seconds per retry on average, or 12 seconds 
per failure.  Multiply that by the number of blocks past the error to 
complete the request..



It really beats the alternative of a forced reboot
due to, say, superblock I/O failing because it happened
to get merged with an unrelated I/O which then failed..
Etc..

Definitely an improvement.

The number of retries is an entirely separate issue.
If we really care about it, then we should fix SD_MAX_RETRIES.

The current value of 5 is *way* too high.  It should be zero or one.

Cheers

I think that drives retry enough; we should leave retries at zero for
normal (non-removable) drives. Should this be a policy we can set via /sys,
like we do with NCQ queue depth?


We need to be able to layer things like MD on top of normal drive errors
in a way that produces a system with reasonable response
time despite any possible IO error on a single component.  Another case
that we end up handling on a regular basis is drive recovery. Errors need
to be limited in scope to just the impacted area and dispatched up to
the application layer as quickly as we can, so that you don't spend days
watching a copy of a huge drive (think 750GB or more) ;-)


ric

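The arithmetic quoted above (3-4 retries per failure at roughly 3 seconds each, multiplied by the number of bad blocks) can be written out as a small sketch. The function name and the simple multiplicative model are assumptions made for illustration; real recovery times depend on the drive and on how the retry layers stack.

```c
#include <assert.h>

/*
 * Rough worst-case seconds spent retrying, using the figures quoted in
 * the thread. Counts retry time only, not the initial attempt.
 */
unsigned long worst_case_seconds(unsigned int bad_blocks,
				 unsigned int retries_per_failure,
				 unsigned int seconds_per_retry)
{
	return (unsigned long)bad_blocks * retries_per_failure * seconds_per_retry;
}

/* The thread's own example: 4 retries at about 3 s each is 12 s per failure. */
void thread_example_holds(void)
{
	assert(worst_case_seconds(1, 4, 3) == 12);
}
```

Under this model, 100 bad blocks cost about 20 minutes of retrying, and a retry count of zero removes the retry cost entirely, which is the motivation for making it tunable.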

Re: Linux 2.6.19.2: Freeze with CIFS mount

2007-01-30 Thread Steven French

The cifs entries in the dmesg log do not indicate any errors, much less
show the cause of this particular problem.

The repeated entry:
CIFS VFS: Send error in SETFSUnixInfo = -5
is expected on connection to certain older versions of Samba servers (or
other servers that only partially support the current CIFS Unix
Extensions).  It is harmless.

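For readers decoding the log line: kernel code conventionally reports failures as negative errno values, and on Linux errno 5 is EIO. A minimal user-space sketch of that mapping follows; describe_kernel_rc() is a hypothetical helper, and the exact message text returned by strerror() depends on the C library.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Map a kernel-style negative return code to its errno description. */
const char *describe_kernel_rc(int rc)
{
	assert(rc <= 0);
	return strerror(-rc);
}
```

Calling describe_kernel_rc(-5) returns the C library's description of EIO, a generic I/O error.
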
It would be useful to know (e.g. if it is possible to trace the network
traffic on the server side on your NAS box) whether any network traffic
from the client is being sent when (or just before) the hang occurs.

It is possible that restarting the NAS box allows reconnection of the
smb/cifs session to proceed, which presumably could be hanging or looping
in the network adapter driver, the tcp stack or cifs on the client, but it
is hard to tell without more information.   I don't know enough about either
of the GigE drivers loaded on your system to determine if there is an easy
way to tell their state.

There are various ways to analyze system hangs including (at least in some
cases) getting a system dump which can be used to isolate the failing
location - hopefully



[EMAIL PROTECTED] wrote on 01/30/2007 06:37:48 AM:

> Hello,
> 
> I report a problem that occurs on a Core2 system (x86_64 used) with
> Linux 2.6.19.2, when I use a NAS: Maxtor Shared Storage II 320GB (Linux
> 2.6.12 inside).
> 
> In fact, this NAS can be web-configured to sleep after 30 min. Also, I
> mount a partition of this device through this kind of entry inside the
> fstab:
> 
> 
> //192.168.1.60/Archive   /home/tuxico/NAS/Archive   cifs 
> noauto,users,iocharset=iso8859-1,noperm,nosetuids,noacl,sfu,
> file_mode=0600,dir_mode=0755,uid=tuxico,gid=users,
> credentials=/root/.credentials 
> 0 0
> 
> 
> Under those circumstances, the Core2 system which is connected to it
> sometimes freezes completely (mouse and keyboard are frozen, no
> connection is possible from an external system - sshd does not respond).
> 
> This occurs regularly (within 3-4 days) and it seems that the problem
> results from the awakening of the NAS device.
> 
> Indeed, I have disabled the sleep feature of the device (via its web
> control panel), and then I was unable to trigger the problem for at
> least 7 days of uptime of the Core2 system.
> 
> I attach the .config, the result of lspci, and the CIFS logs that have
> been written to /var/log/messages (I don't know if they are relevant or
> not, but just in case...).
> 
> Thanks in advance for the help.
> 
> Best regards,
> 
>Eric
> [attachment "lspci" deleted by Steven French/Austin/IBM] [attachment
> "dotconfig" deleted by Steven French/Austin/IBM] [attachment 
> "cifslog" deleted by Steven French/Austin/IBM] 

Steve French
Senior Software Engineer
Linux Technology Center - IBM Austin
phone: 512-838-2294
email: sfrench at-sign us dot ibm dot com


[PATCH 02/23] clocksource: drop clocksource-add-verification-watchdog-helper.patch

2007-01-30 Thread Daniel Walker

Drop.

Signed-Off-By: Daniel Walker <[EMAIL PROTECTED]>

---
 arch/i386/Kconfig   |4 -
 arch/i386/kernel/tsc.c  |   49 +
 include/linux/clocksource.h |   15 -
 kernel/time/clocksource.c   |  125 +---
 kernel/timer.c  |   48 
 5 files changed, 79 insertions(+), 162 deletions(-)

Index: linux-2.6.19/arch/i386/Kconfig
===
--- linux-2.6.19.orig/arch/i386/Kconfig
+++ linux-2.6.19/arch/i386/Kconfig
@@ -18,10 +18,6 @@ config GENERIC_TIME
bool
default y
 
-config CLOCKSOURCE_WATCHDOG
-   bool
-   default y
-
 config GENERIC_CLOCKEVENTS
bool
default y
Index: linux-2.6.19/arch/i386/kernel/tsc.c
===
--- linux-2.6.19.orig/arch/i386/kernel/tsc.c
+++ linux-2.6.19/arch/i386/kernel/tsc.c
@@ -344,6 +344,49 @@ static struct dmi_system_id __initdata b
 {}
 };
 
+#define TSC_FREQ_CHECK_INTERVAL (10*MSEC_PER_SEC) /* 10sec in MS */
+static struct timer_list verify_tsc_freq_timer;
+
+/* XXX - Probably should add locking */
+static void verify_tsc_freq(unsigned long unused)
+{
+   static u64 last_tsc;
+   static unsigned long last_jiffies;
+
+   u64 now_tsc, interval_tsc;
+   unsigned long now_jiffies, interval_jiffies;
+
+
+   if (check_tsc_unstable())
+   return;
+
+   rdtscll(now_tsc);
+   now_jiffies = jiffies;
+
+   if (!last_jiffies) {
+   goto out;
+   }
+
+   interval_jiffies = now_jiffies - last_jiffies;
+   interval_tsc = now_tsc - last_tsc;
+   interval_tsc *= HZ;
+   do_div(interval_tsc, cpu_khz*1000);
+
+   if (interval_tsc < (interval_jiffies * 3 / 4)) {
+   printk("TSC appears to be running slowly. "
+   "Marking it as unstable\n");
+   mark_tsc_unstable();
+   return;
+   }
+
+out:
+   last_tsc = now_tsc;
+   last_jiffies = now_jiffies;
+   /* set us up to go off on the next interval: */
+   mod_timer(&verify_tsc_freq_timer,
+   jiffies + msecs_to_jiffies(TSC_FREQ_CHECK_INTERVAL));
+}
+
 /*
  * Make an educated guess if the TSC is trustworthy and synchronized
  * over all CPUs.
@@ -401,6 +444,12 @@ static int __init init_tsc_clocksource(v
clocksource_tsc.flags &= ~CLOCK_SOURCE_IS_CONTINUOUS;
}
 
+   init_timer(&verify_tsc_freq_timer);
+   verify_tsc_freq_timer.function = verify_tsc_freq;
+   verify_tsc_freq_timer.expires =
+   jiffies + msecs_to_jiffies(TSC_FREQ_CHECK_INTERVAL);
+   add_timer(&verify_tsc_freq_timer);
+
return clocksource_register(&clocksource_tsc);
}
 
Index: linux-2.6.19/include/linux/clocksource.h
===
--- linux-2.6.19.orig/include/linux/clocksource.h
+++ linux-2.6.19/include/linux/clocksource.h
@@ -12,13 +12,11 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 
 /* clocksource cycle base type */
 typedef u64 cycle_t;
-struct clocksource;
 
 /**
  * struct clocksource - hardware abstraction for a free running counter
@@ -66,22 +64,13 @@ struct clocksource {
cycle_t cycle_last, cycle_interval;
u64 xtime_nsec, xtime_interval;
s64 error;
-
-#ifdef CONFIG_CLOCKSOURCE_WATCHDOG
-   /* Watchdog related data, used by the framework */
-   struct list_head wd_list;
-   cycle_t wd_last;
-#endif
 };
 
 /*
  * Clock source flags bits::
  */
-#define CLOCK_SOURCE_IS_CONTINUOUS 0x01
-#define CLOCK_SOURCE_MUST_VERIFY   0x02
-
-#define CLOCK_SOURCE_WATCHDOG  0x10
-#define CLOCK_SOURCE_VALID_FOR_HRES0x20
+#define CLOCK_SOURCE_IS_CONTINUOUS 0x01
+#define CLOCK_SOURCE_MUST_VERIFY   0x02
 
 /* simplify initialization of mask field */
 #define CLOCKSOURCE_MASK(bits) (cycle_t)(bits<64 ? ((1ULL<<bits)-1) : -1)
[...] >> 1)
-#define WATCHDOG_TRESHOLD (NSEC_PER_SEC >> 4)
-
-static void clocksource_ratewd(struct clocksource *cs, int64_t delta)
-{
-   if (delta > -WATCHDOG_TRESHOLD && delta < WATCHDOG_TRESHOLD)
-   return;
-
-   printk(KERN_WARNING "Clocksource %s unstable (delta = %Ld ns)\n",
-  cs->name, delta);
-   cs->flags &= ~(CLOCK_SOURCE_VALID_FOR_HRES | CLOCK_SOURCE_WATCHDOG);
-   clocksource_change_rating(cs, 0);
-   cs->flags &= ~CLOCK_SOURCE_WATCHDOG;
-   list_del(&cs->wd_list);
-}
-
-static void clocksource_watchdog(unsigned long data)
-{
-   struct clocksource *cs, *tmp;
-   cycle_t csnow, wdnow;
-   int64_t wd_nsec, cs_nsec;
-
-   spin_lock(&watchdog_lock);
-
-   wdnow = watchdog->read();
-   wd_nsec = cyc2ns(watchdog, (wdnow - watchdog_last) & watchdog->mask);
-   watchdog_last = wdnow;
-
-   list_for_each_entry_safe(cs, tmp, &watchdog_list, wd_list) {
-

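The jiffies-based check that verify_tsc_freq() performs in the diff above can be restated in user space as follows. This is a sketch, not the kernel routine: hz and cpu_khz are ordinary parameters here (in the patch they are the kernel's HZ and cpu_khz), and tsc_runs_slow() is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Returns 1 when the TSC advanced by less than 3/4 of the time the tick
 * counter says has passed, which is the condition the patch uses to mark
 * the TSC unstable.
 */
int tsc_runs_slow(uint64_t tsc_delta, unsigned long jiffies_delta,
		  unsigned int hz, unsigned int cpu_khz)
{
	uint64_t tsc_as_jiffies;

	assert(cpu_khz != 0);
	/* cycles * (ticks per second) / (cycles per second) = ticks */
	tsc_as_jiffies = tsc_delta * hz / ((uint64_t)cpu_khz * 1000);
	return tsc_as_jiffies < (uint64_t)jiffies_delta * 3 / 4;
}
```

For a 2 GHz CPU with HZ=250, a 10 second interval is 2500 jiffies; a TSC delta of 20e9 cycles converts to 2500 ticks and passes, while a delta of 10e9 cycles converts to 1250 ticks, below the 1875-tick cutoff.
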
[PATCH 03/23] clocksource: drop clocksource-remove-the-update-callback.patch

2007-01-30 Thread Daniel Walker

Drop.

Signed-Off-By: Daniel Walker <[EMAIL PROTECTED]>

---
 arch/i386/kernel/tsc.c  |   51 +---
 include/linux/clocksource.h |   11 ++---
 kernel/timer.c  |3 ++
 3 files changed, 45 insertions(+), 20 deletions(-)

Index: linux-2.6.19/arch/i386/kernel/tsc.c
===
--- linux-2.6.19.orig/arch/i386/kernel/tsc.c
+++ linux-2.6.19/arch/i386/kernel/tsc.c
@@ -60,6 +60,12 @@ static inline int check_tsc_unstable(voi
return tsc_unstable;
 }
 
+void mark_tsc_unstable(void)
+{
+   tsc_unstable = 1;
+}
+EXPORT_SYMBOL_GPL(mark_tsc_unstable);
+
 /* Accellerators for sched_clock()
  * convert from cycles(64bits) => nanoseconds (64bits)
  *  basic equation:
@@ -289,6 +295,7 @@ core_initcall(cpufreq_tsc);
 /* clock source code */
 
 static unsigned long current_tsc_khz = 0;
+static int tsc_update_callback(void);
 
 static cycle_t read_tsc(void)
 {
@@ -306,28 +313,38 @@ static struct clocksource clocksource_ts
.mask   = CLOCKSOURCE_MASK(64),
.mult   = 0, /* to be set */
.shift  = 22,
+   .update_callback= tsc_update_callback,
.flags  = CLOCK_SOURCE_IS_CONTINUOUS |
  CLOCK_SOURCE_MUST_VERIFY,
 };
 
-void mark_tsc_unstable(void)
+static int tsc_update_callback(void)
 {
-   if (!tsc_unstable) {
-   tsc_unstable = 1;
-   /* Can be called before registration */
-   if (clocksource_tsc.mult)
-   clocksource_change_rating(&clocksource_tsc, 0);
-   else
-   clocksource_tsc.rating = 0;
+   int change = 0;
+
+   /* check to see if we should switch to the safe clocksource: */
+   if (clocksource_tsc.rating != 0 && check_tsc_unstable()) {
+   clocksource_tsc.rating = 0;
+   clocksource_reselect();
+   change = 1;
+   }
+
+   /* only update if tsc_khz has changed: */
+   if (current_tsc_khz != tsc_khz) {
+   current_tsc_khz = tsc_khz;
+   clocksource_tsc.mult = clocksource_khz2mult(current_tsc_khz,
+   clocksource_tsc.shift);
+   change = 1;
}
+
+   return change;
 }
-EXPORT_SYMBOL_GPL(mark_tsc_unstable);
 
 static int __init dmi_mark_tsc_unstable(struct dmi_system_id *d)
 {
printk(KERN_NOTICE "%s detected: marking TSC unstable.\n",
   d->ident);
-   tsc_unstable = 1;
+   mark_tsc_unstable();
return 0;
 }
 
@@ -399,12 +416,11 @@ __cpuinit int unsynchronized_tsc(void)
 * Intel systems are normally all synchronized.
 * Exceptions must mark TSC as unstable:
 */
-   if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL) {
-   /* assume multi socket systems are not synchronized: */
-   if (num_possible_cpus() > 1)
-   tsc_unstable = 1;
-   }
-   return tsc_unstable;
+   if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)
+   return 0;
+
+   /* assume multi socket systems are not synchronized: */
+   return num_possible_cpus() > 1;
 }
 
 /*
@@ -433,7 +449,8 @@ static int __init init_tsc_clocksource(v
/* check blacklist */
dmi_check_system(bad_tsc_dmi_table);
 
-   unsynchronized_tsc();
+   if (unsynchronized_tsc()) /* mark unstable if unsynced */
+   mark_tsc_unstable();
check_geode_tsc_reliable();
current_tsc_khz = tsc_khz;
clocksource_tsc.mult = clocksource_khz2mult(current_tsc_khz,
Index: linux-2.6.19/include/linux/clocksource.h
===
--- linux-2.6.19.orig/include/linux/clocksource.h
+++ linux-2.6.19/include/linux/clocksource.h
@@ -44,6 +44,7 @@ typedef u64 cycle_t;
  * subtraction of non 64 bit counters
  * @mult:  cycle to nanosecond multiplier
  * @shift: cycle to nanosecond divisor (power of two)
+ * @update_callback:   called when safe to alter clocksource values
  * @flags: flags describing special properties
  * @vread: vsyscall based read
  * @cycle_interval:Used internally by timekeeping core, please ignore.
@@ -57,6 +58,7 @@ struct clocksource {
cycle_t mask;
u32 mult;
u32 shift;
+   int (*update_callback)(void);
unsigned long flags;
cycle_t (*vread)(void);
 
@@ -184,9 +186,9 @@ static inline void clocksource_calculate
 
 
 /* used to install a new clocksource */
-extern int clocksource_register(struct clocksource*);
-extern struct clocksource* clocksource_get_next(void);
-extern void clocksource_change_rating(struct clocksource *cs, int rating);
+int clocksource_register(struct

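tsc_update_callback() in the diff above recomputes clocksource_tsc.mult whenever tsc_khz changes. The relationship between frequency, mult and shift can be sketched in user space as below. The function names are hypothetical and the rounding step is an assumption about how a helper such as clocksource_khz2mult() behaves; the shift of 22 matches the .shift value shown in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Multiplier such that ns = (cycles * mult) >> shift for a clock at khz. */
uint32_t khz_to_mult(uint32_t khz, uint32_t shift)
{
	uint64_t tmp;

	assert(khz != 0 && shift < 32);
	tmp = (uint64_t)1000000 << shift;  /* nanoseconds per millisecond, scaled */
	tmp += khz / 2;                    /* round to nearest */
	return (uint32_t)(tmp / khz);      /* may truncate for very slow clocks */
}

/* Apply the mult/shift pair to a cycle count. */
uint64_t scaled_cycles_to_ns(uint64_t cycles, uint32_t mult, uint32_t shift)
{
	return (cycles * mult) >> shift;
}
```

At 2 GHz (khz = 2000000) and shift 22 the multiplier is 2097152, so 2e9 cycles convert to exactly 1e9 ns. If the frequency changes and mult is not refreshed, every later conversion is off by the frequency ratio, which is why the callback updates it.
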
[PATCH 14/23] clocksource: increase initcall priority

2007-01-30 Thread Daniel Walker


Normal systems often have almost everything registered in
device_initcall(). Most drivers are registered there, and usually if
people add code that needs an initcall they will use either
device_initcall() or module_init(), which both result in the same
initcall.

When John created the clocksource interface he did what most people
would do, and made the clocksource registration happen in
device_initcall() with almost everything else. The effect of doing this
was the addition of the following code:

/* clocksource_done_booting - Called near the end of bootup
 *
 * Hack to avoid lots of clocksource churn at boot time
 */
static int __init clocksource_done_booting(void)
{
finished_booting = 1;
return 0;
}

late_initcall(clocksource_done_booting);

This is one of two initcalls in the clocksource interface; the other
one is device_initcall(init_clocksource_sysfs);.

If I leave the clocksource initcall alone, then anything that uses a
clocksource in the future would need at least one late_initcall(),
since the clocksources aren't all fully registered until after
device_initcall().

The reason for changing that is that it doesn't fit the usual
development flow of initialization functions which, as I said earlier,
almost always end up in device_initcall().

This change certainly isn't mandatory. I feel it would reduce the
likelihood of developers who use the clocksource interface adding
multiple initcalls (one late_initcall, and one device_initcall). It also
better fits developers' tendency to put almost everything into
device_initcall().

In addition, this patch removes a small amount of code in timekeeping which
existed to detect the end of the initcall sequence and then select a clock.

As a note, arm and mips both register their clocksources during time_init()
instead of using initcalls.
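
The ordering argument above can be modeled in user space. The sketch below is purely illustrative: the level names, run_initcalls() and the helper functions are hypothetical, and placing a clocksource level ahead of the device level is an assumption based on the patch subject, since the definition of clocksource_initcall() is not visible in this excerpt.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model of ordered initcall levels. */
enum init_level { LEVEL_CLOCKSOURCE, LEVEL_DEVICE, LEVEL_LATE, LEVEL_COUNT };

typedef int (*initcall_fn)(void);

struct initcall_entry {
	enum init_level level;
	initcall_fn fn;
};

/* Run every registered call, lowest level first; returns how many ran. */
size_t run_initcalls(const struct initcall_entry *calls, size_t n)
{
	size_t ran = 0;

	for (int level = 0; level < LEVEL_COUNT; level++) {
		for (size_t i = 0; i < n; i++) {
			if (calls[i].level == (enum init_level)level) {
				calls[i].fn();
				ran++;
			}
		}
	}
	return ran;
}

static int clocks_registered;
static int user_saw_clocks = -1;

/* Stands in for a clocksource driver registering itself. */
int register_clock(void) { clocks_registered = 1; return 0; }

/* Stands in for a clocksource consumer initialised at device level. */
int clock_user_init(void) { user_saw_clocks = clocks_registered; return 0; }

int clock_user_saw_clocks(void) { return user_saw_clocks; }
```

Because registration runs at an earlier level, a consumer initialised at device level already sees the registered clock, regardless of the order in which the two entries appear in the table. If both ran at the same level, the consumer would need a second, later initcall to be sure.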

Signed-Off-By: Daniel Walker <[EMAIL PROTECTED]>

---
 arch/i386/kernel/hpet.c  |2 +-
 arch/i386/kernel/i8253.c |2 +-
 arch/i386/kernel/tsc.c   |2 +-
 arch/x86_64/kernel/hpet.c|2 +-
 arch/x86_64/kernel/tsc.c |3 +--
 drivers/clocksource/acpi_pm.c|8 +++-
 drivers/clocksource/cyclone.c|2 +-
 drivers/clocksource/scx200_hrt.c |2 +-
 include/linux/clocksource.h  |6 ++
 kernel/time/clocksource.c|   13 -
 kernel/time/jiffies.c|2 +-
 kernel/time/tick-sched.c |8 
 kernel/time/timekeeping.c|   15 +++
 13 files changed, 36 insertions(+), 31 deletions(-)

Index: linux-2.6.19/arch/i386/kernel/hpet.c
===
--- linux-2.6.19.orig/arch/i386/kernel/hpet.c
+++ linux-2.6.19/arch/i386/kernel/hpet.c
@@ -315,7 +315,7 @@ static int __init init_hpet_clocksource(
return clocksource_register(&clocksource_hpet);
 }
 
-module_init(init_hpet_clocksource);
+clocksource_initcall(init_hpet_clocksource);
 
 #ifdef CONFIG_HPET_EMULATE_RTC
 
Index: linux-2.6.19/arch/i386/kernel/i8253.c
===
--- linux-2.6.19.orig/arch/i386/kernel/i8253.c
+++ linux-2.6.19/arch/i386/kernel/i8253.c
@@ -195,4 +195,4 @@ static int __init init_pit_clocksource(v
clocksource_pit.mult = clocksource_hz2mult(CLOCK_TICK_RATE, 20);
return clocksource_register(&clocksource_pit);
 }
-module_init(init_pit_clocksource);
+clocksource_initcall(init_pit_clocksource);
Index: linux-2.6.19/arch/i386/kernel/tsc.c
===
--- linux-2.6.19.orig/arch/i386/kernel/tsc.c
+++ linux-2.6.19/arch/i386/kernel/tsc.c
@@ -458,4 +458,4 @@ static int __init init_tsc_clocksource(v
return 0;
 }
 
-module_init(init_tsc_clocksource);
+clocksource_initcall(init_tsc_clocksource);
Index: linux-2.6.19/arch/x86_64/kernel/hpet.c
===
--- linux-2.6.19.orig/arch/x86_64/kernel/hpet.c
+++ linux-2.6.19/arch/x86_64/kernel/hpet.c
@@ -508,4 +508,4 @@ static int __init init_hpet_clocksource(
return clocksource_register(&clocksource_hpet);
 }
 
-module_init(init_hpet_clocksource);
+clocksource_initcall(init_hpet_clocksource);
Index: linux-2.6.19/arch/x86_64/kernel/tsc.c
===
--- linux-2.6.19.orig/arch/x86_64/kernel/tsc.c
+++ linux-2.6.19/arch/x86_64/kernel/tsc.c
@@ -224,5 +224,4 @@ static int __init init_tsc_clocksource(v
}
return 0;
 }
-
-module_init(init_tsc_clocksource);
+clocksource_initcall(init_tsc_clocksource);
Index: linux-2.6.19/drivers/clocksource/acpi_pm.c
===
--- linux-2.6.19.orig/drivers/clocksource/acpi_pm.c
+++ linux-2.6.19/drivers/clocksource/acpi_pm.c
@@ -214,4 +214,10 @@ pm_good:
return clocksource_register(&clocksource_acpi_pm);
 }
 
-module_init(init_acpi_pm_clocksource);
+/*
+ * This clocksource is removed from the

[PATCH 13/23] timekeeping: move sysfs layer/drop API calls

2007-01-30 Thread Daniel Walker

This moves the timekeeping sysfs override layer into timekeeping.c and
removes the get_next_clocksource and select_clocksource functions, and 
their component variables, since they are no longer used.

Signed-Off-By: Daniel Walker <[EMAIL PROTECTED]>

---
 include/linux/clocksource.h |5 -
 kernel/time/clocksource.c   |  169 
 kernel/time/timekeeping.c   |  131 +-
 3 files changed, 134 insertions(+), 171 deletions(-)

Index: linux-2.6.19/include/linux/clocksource.h
===
--- linux-2.6.19.orig/include/linux/clocksource.h
+++ linux-2.6.19/include/linux/clocksource.h
@@ -25,9 +25,9 @@ typedef u64 cycle_t;
 extern struct clocksource clocksource_jiffies;
 
 /*
- * Atomic signal that is specific to timekeeping.
+ * Sysfs device extern, for registering clocksources under the same sysfs dir.
  */
-extern atomic_t clock_check;
+extern struct sys_device clocksource_sys_device;
 
 /*
  * Allows inlined calling for notifier routines.
@@ -231,7 +231,6 @@ static inline void clocksource_calculate
 
 
 /* used to install a new clocksource */
-extern struct clocksource *clocksource_get_next(void);
 extern int clocksource_register(struct clocksource*);
 extern void clocksource_rating_change(struct clocksource*);
 extern struct clocksource * clocksource_get_clock(char*);
Index: linux-2.6.19/kernel/time/clocksource.c
===
--- linux-2.6.19.orig/kernel/time/clocksource.c
+++ linux-2.6.19/kernel/time/clocksource.c
@@ -30,27 +30,17 @@
 #include 
 
 /*[Clocksource internal variables]-
- * curr_clocksource:
- * currently selected clocksource. Initialized to clocksource_jiffies.
- * next_clocksource:
- * pending next selected clocksource.
  * clocksource_list:
  * rating sorted linked list with the registered clocksources
  * clocksource_lock:
  * protects manipulations to curr_clocksource and next_clocksource
  * and the clocksource_list
- * override_name:
- * Name of the user-specified clocksource.
  */
-static struct clocksource *curr_clocksource = &clocksource_jiffies;
-static struct clocksource *next_clocksource;
 static LIST_HEAD(clocksource_list);
 static DEFINE_SPINLOCK(clocksource_lock);
-static char override_name[32];
 static int finished_booting;
 
 ATOMIC_NOTIFIER_HEAD(clocksource_list_notifier);
-atomic_t clock_check = ATOMIC_INIT(0);
 
 /* clocksource_done_booting - Called near the end of bootup
  *
@@ -59,32 +49,12 @@ atomic_t clock_check = ATOMIC_INIT(0);
 static int __init clocksource_done_booting(void)
 {
finished_booting = 1;
-   /* Check for a new clock now */
-   atomic_inc(&clock_check);
return 0;
 }
 
 late_initcall(clocksource_done_booting);
 
 /**
- * clocksource_get_next - Returns the selected clocksource
- *
- */
-struct clocksource *clocksource_get_next(void)
-{
-   unsigned long flags;
-
-   spin_lock_irqsave(&clocksource_lock, flags);
-   if (next_clocksource && finished_booting) {
-   curr_clocksource = next_clocksource;
-   next_clocksource = NULL;
-   }
-   spin_unlock_irqrestore(&clocksource_lock, flags);
-
-   return curr_clocksource;
-}
-
-/**
  * __is_registered - Returns a clocksource if it's registered
  * @name:  name of the clocksource to return
  *
@@ -148,25 +118,6 @@ struct clocksource * clocksource_get_clo
return ret;
 }
 
-
-/**
- * select_clocksource - Finds the best registered clocksource.
- *
- * Private function. Must hold clocksource_lock when called.
- *
- * Looks through the list of registered clocksources, returning
- * the one with the highest rating value. If there is a clocksource
- * name that matches the override string, it returns that clocksource.
- */
-static struct clocksource *select_clocksource(void)
-{
-   if (!*override_name)
-   return list_entry(clocksource_list.next,
- struct clocksource, list);
-
-   return __get_clock(override_name);
-}
-
 /*
  * __sorted_list_add - Sorted clocksource add
  * @c: clocksource to add
@@ -210,11 +161,6 @@ int clocksource_register(struct clocksou
 
spin_lock_irqsave(&clocksource_lock, flags);
__sorted_list_add(c);
-
-   /*
-* scan the registered clocksources, and pick the best one
-*/
-   next_clocksource = select_clocksource();
spin_unlock_irqrestore(&clocksource_lock, flags);
 
atomic_notifier_call_chain(&clocksource_list_notifier,
@@ -243,7 +189,6 @@ void clocksource_rating_change(struct cl
list_del_init(&c->list);
__sorted_list_add(c);
 
-   next_clocksource = select_clocksource();
spin_unlock_irqrestore(&clocksource_lock, flags);
 
atomic_notifier_call_chain(&clocksource_list_notifier,
@@ -254,67 +199,6 @@ EXPORT_SYMBOL(clocksource_rating_change)
 
 #ifdef CONFIG_SYSFS
 /**
- * sysfs_show_current_clocksources - sysfs interface for current clocksource

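The "rating sorted linked list" mentioned in the comments above means the best clocksource is always at the head, so selection needs no scan. A user-space sketch of such a sorted insert follows; struct clock_node and sorted_add() are hypothetical, use a plain singly linked list rather than the kernel's list_head, and the ratings in the usage note are illustrative values only.

```c
#include <assert.h>
#include <stddef.h>

struct clock_node {
	const char *name;
	int rating;
	struct clock_node *next;
};

/* Insert so the list stays sorted by descending rating; returns the head. */
struct clock_node *sorted_add(struct clock_node *head, struct clock_node *c)
{
	struct clock_node **link = &head;

	assert(c != NULL);
	/* Skip past every node rated at least as high as the new one. */
	while (*link && (*link)->rating >= c->rating)
		link = &(*link)->next;
	c->next = *link;
	*link = c;
	return head;
}
```

Adding nodes rated 110, 300 and 250 in that order leaves the 300-rated node at the head, followed by 250 and then 110, so the highest-rated clock is always the first element.
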
[PATCH 00/23] clocksource update v12

2007-01-30 Thread Daniel Walker

This tree is mostly cleanups. I move the timekeeping code into its own
file, and I modify the clocksource interface to provide a more robust
API.

I've dropped some duplication from the hrt/dynamic tick patch set, which
is all new to that tree and new to -mm. This is an -mm patch set; it's
not meant to go on 2.6.20-rc6 at all. I'm modifying code only
included in -mm.

I boot tested x86_64 smp, and i386 smp, and compile tested for ARM
(!GENERIC_TIME). This set should be bisect-able for at least i386 and
likely x86_64.

It's also available at,

ftp://source.mvista.com/pub/dwalker/clocksource/clocksource-v12/

Daniel
-- 

[PATCH 4/10] cxgb3 - bogus status error string

2007-01-30 Thread Divy Le Ray

From: Divy Le Ray <[EMAIL PROTECTED]>

Remove a status error string from the pci-x context 
and add it where it belongs - the pci-e context. 

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---

 drivers/net/cxgb3/t3_hw.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/cxgb3/t3_hw.c b/drivers/net/cxgb3/t3_hw.c
index 4545acb..2215400 100644
--- a/drivers/net/cxgb3/t3_hw.c
+++ b/drivers/net/cxgb3/t3_hw.c
@@ -1181,7 +1181,6 @@ static int t3_handle_intr_status(struct
 static void pci_intr_handler(struct adapter *adapter)
 {
static const struct intr_info pcix1_intr_info[] = {
-   { F_PEXERR, "PCI PEX error", -1, 1 },
{F_MSTDETPARERR, "PCI master detected parity error", -1, 1},
{F_SIGTARABT, "PCI signaled target abort", -1, 1},
{F_RCVTARABT, "PCI received target abort", -1, 1},
@@ -1218,6 +1217,7 @@ static void pci_intr_handler(struct adap
 static void pcie_intr_handler(struct adapter *adapter)
 {
static const struct intr_info pcie_intr_info[] = {
+   {F_PEXERR, "PCI PEX error", -1, 1},
{F_UNXSPLCPLERRR,
 "PCI unexpected split completion DMA read error", -1, 1},
{F_UNXSPLCPLERRC,

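The cxgb3 fix above moves one entry between two interrupt description tables. Why the table an entry lives in matters can be shown with a sketch of a table-driven status decoder. Everything here is hypothetical: the struct layout only mirrors the four-field initializers visible in the diff, the mask-terminated table is an assumption, and the bit values in the usage note are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout modeled on the four-field initializers in the diff. */
struct intr_info_sketch {
	uint32_t mask;     /* status bit(s) this entry describes; 0 ends the table */
	const char *msg;   /* message to report */
	int stat_idx;      /* statistics slot, or -1 for none */
	int fatal;         /* nonzero if the condition is fatal */
};

/*
 * Count fatal conditions present in status. If first_msg is non-NULL it
 * receives the message of the first matching entry, or NULL if none match.
 */
int count_fatal(const struct intr_info_sketch *table, uint32_t status,
		const char **first_msg)
{
	int fatal = 0;

	assert(table != NULL);
	if (first_msg)
		*first_msg = NULL;
	for (; table->mask; table++) {
		if (!(status & table->mask))
			continue;
		if (first_msg && !*first_msg)
			*first_msg = table->msg;
		if (table->fatal)
			fatal++;
	}
	return fatal;
}
```

A status bit is only decoded if its entry is in the table being walked. With the PEX error entry listed only in the PCIe table, the same status word yields a fatal count of 1 when decoded against that table and 0 against a table that lacks the entry.
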
[PATCH 05/23] clocksource: drop simplify-the-registration-of-clocksources.patch

2007-01-30 Thread Daniel Walker

Drop.

Signed-Off-By: Daniel Walker <[EMAIL PROTECTED]>

---
 include/linux/clocksource.h |3 -
 kernel/time/clocksource.c   |  118 ++--
 2 files changed, 61 insertions(+), 60 deletions(-)

Index: linux-2.6.19/include/linux/clocksource.h
===
--- linux-2.6.19.orig/include/linux/clocksource.h
+++ linux-2.6.19/include/linux/clocksource.h
@@ -198,7 +198,4 @@ static inline void update_vsyscall(struc
 }
 #endif
 
-#define clocksource_reselect() clocksource_change_rating(&clocksource_tsc, clocksource_tsc.rating)
-extern void clocksource_change_rating(struct clocksource *cs, int rating);
-
 #endif /* _LINUX_CLOCKSOURCE_H */
Index: linux-2.6.19/kernel/time/clocksource.c
===
--- linux-2.6.19.orig/kernel/time/clocksource.c
+++ linux-2.6.19/kernel/time/clocksource.c
@@ -47,7 +47,6 @@ extern struct clocksource clocksource_ji
  */
 static struct clocksource *curr_clocksource = &clocksource_jiffies;
 static struct clocksource *next_clocksource;
-static struct clocksource *clocksource_override;
 static LIST_HEAD(clocksource_list);
 static DEFINE_SPINLOCK(clocksource_lock);
 static char override_name[32];
@@ -84,46 +83,60 @@ struct clocksource *clocksource_get_next
 }
 
 /**
- * select_clocksource - Selects the best registered clocksource.
+ * select_clocksource - Finds the best registered clocksource.
  *
  * Private function. Must hold clocksource_lock when called.
  *
- * Select the clocksource with the best rating, or the clocksource,
- * which is selected by userspace override.
+ * Looks through the list of registered clocksources, returning
+ * the one with the highest rating value. If there is a clocksource
+ * name that matches the override string, it returns that clocksource.
  */
 static struct clocksource *select_clocksource(void)
 {
-   if (list_empty(&clocksource_list))
-   return NULL;
+   struct clocksource *best = NULL;
+   struct list_head *tmp;
+
+   list_for_each(tmp, &clocksource_list) {
+   struct clocksource *src;
 
-   if (clocksource_override)
-   return clocksource_override;
+   src = list_entry(tmp, struct clocksource, list);
+   if (!best)
+   best = src;
+
+   /* check for override: */
+   if (strlen(src->name) == strlen(override_name) &&
+   !strcmp(src->name, override_name)) {
+   best = src;
+   break;
+   }
+   /* pick the highest rating: */
+   if (src->rating > best->rating)
+   best = src;
+   }
 
-   return list_entry(clocksource_list.next, struct clocksource, list);
+   return best;
 }
 
-/*
- * Enqueue the clocksource sorted by rating
+/**
+ * is_registered_source - Checks if clocksource is registered
+ * @c: pointer to a clocksource
+ *
+ * Private helper function. Must hold clocksource_lock when called.
+ *
+ * Returns one if the clocksource is already registered, zero otherwise.
  */
-static int clocksource_enqueue(struct clocksource *c)
+static int is_registered_source(struct clocksource *c)
 {
-   struct list_head *tmp, *entry = &clocksource_list;
+   int len = strlen(c->name);
+   struct list_head *tmp;
 
list_for_each(tmp, &clocksource_list) {
-   struct clocksource *cs;
+   struct clocksource *src;
 
-   cs = list_entry(tmp, struct clocksource, list);
-   if (cs == c)
-   return -EBUSY;
-   /* Keep track of the place, where to insert */
-   if (cs->rating >= c->rating)
-   entry = tmp;
+   src = list_entry(tmp, struct clocksource, list);
+   if (strlen(src->name) == len && !strcmp(src->name, c->name))
+   return 1;
}
-   list_add(&c->list, entry);
-
-   if (strlen(c->name) == strlen(override_name) &&
-   !strcmp(c->name, override_name))
-   clocksource_override = c;
 
return 0;
 }
@@ -136,32 +149,42 @@ static int clocksource_enqueue(struct cl
  */
 int clocksource_register(struct clocksource *c)
 {
-   unsigned long flags;
int ret = 0;
+   unsigned long flags;
 
spin_lock_irqsave(&clocksource_lock, flags);
-   ret = clocksource_enqueue(c);
-   if (!ret)
+   /* check if clocksource is already registered */
+   if (is_registered_source(c)) {
+   printk("register_clocksource: Cannot register %s. "
+  "Already registered!", c->name);
+   ret = -EBUSY;
+   } else {
+   /* register it */
+   list_add(&c->list, &clocksource_list);
+   /* scan the registered clocksources, and pick the best one */
next_clocksource = select_clocksource();
+   }
spin_unlock_irqrestore(&clocksource_lock, flags);
return ret;
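
A note for readers following the selection rule in the hunk above: the loop returns the clocksource whose name matches the override string if one exists, and otherwise the one with the highest rating. Below is a minimal userspace sketch of that rule, not the kernel code. `struct clock_src` and `pick_source()` are hypothetical names made up for illustration, and an array stands in for the kernel's linked list.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's struct clocksource. */
struct clock_src {
	const char *name;
	int rating;
};

/*
 * Return the entry whose name equals override_name if there is one,
 * otherwise the entry with the highest rating. Returns NULL when the
 * array is empty.
 */
static const struct clock_src *pick_source(const struct clock_src *srcs,
					   size_t n, const char *override_name)
{
	const struct clock_src *best = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		const struct clock_src *src = &srcs[i];

		if (!best)
			best = src;
		/* an exact name match wins regardless of rating */
		if (!strcmp(src->name, override_name))
			return src;
		/* otherwise remember the highest rating seen so far */
		if (src->rating > best->rating)
			best = src;
	}
	return best;
}
```

The sketch drops the strlen() comparison that the patch performs before strcmp(); strcmp() returning zero already implies equal lengths, so the result should be the same.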

[PATCH 2/10] cxgb3 - bind qsets on multiport adapter

2007-01-30 Thread Divy Le Ray

From: Divy Le Ray <[EMAIL PROTECTED]>

Inform FW about the queue set->interface mapping.

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---

 drivers/net/cxgb3/adapter.h|2 +
 drivers/net/cxgb3/cxgb3_main.c |   68 ++--
 drivers/net/cxgb3/sge.c|8 +
 3 files changed, 54 insertions(+), 24 deletions(-)

diff --git a/drivers/net/cxgb3/adapter.h b/drivers/net/cxgb3/adapter.h
index 16643f6..8902007 100644
--- a/drivers/net/cxgb3/adapter.h
+++ b/drivers/net/cxgb3/adapter.h
@@ -46,6 +46,7 @@ enum {/* adapter flags */
FULL_INIT_DONE = (1 << 0),
USING_MSI = (1 << 1),
USING_MSIX = (1 << 2),
+   QUEUES_BOUND = (1 << 3),
 };
 
 struct rx_desc;
@@ -244,6 +245,7 @@ void t3_free_sge_resources(struct adapte
 void t3_sge_err_intr_handler(struct adapter *adapter);
 intr_handler_t t3_intr_handler(struct adapter *adap, int polling);
 int t3_eth_xmit(struct sk_buff *skb, struct net_device *dev);
+int t3_mgmt_tx(struct adapter *adap, struct sk_buff *skb);
 void t3_update_qset_coalesce(struct sge_qset *qs, const struct qset_params *p);
 int t3_sge_alloc_qset(struct adapter *adapter, unsigned int id, int nports,
  int irq_vec_idx, const struct qset_params *p,
diff --git a/drivers/net/cxgb3/cxgb3_main.c b/drivers/net/cxgb3/cxgb3_main.c
index 8044146..7e7ee7a 100644
--- a/drivers/net/cxgb3/cxgb3_main.c
+++ b/drivers/net/cxgb3/cxgb3_main.c
@@ -649,6 +649,37 @@ static void init_port_mtus(struct adapte
t3_write_reg(adapter, A_TP_MTU_PORT_TABLE, mtus);
 }
 
+static void send_pktsched_cmd(struct adapter *adap, int sched, int qidx, int lo,
+ int hi, int port)
+{
+   struct sk_buff *skb;
+   struct mngt_pktsched_wr *req;
+
+   skb = alloc_skb(sizeof(*req), GFP_KERNEL | __GFP_NOFAIL);
+   req = (struct mngt_pktsched_wr *)skb_put(skb, sizeof(*req));
+   req->wr_hi = htonl(V_WR_OP(FW_WROPCODE_MNGT));
+   req->mngt_opcode = FW_MNGTOPCODE_PKTSCHED_SET;
+   req->sched = sched;
+   req->idx = qidx;
+   req->min = lo;
+   req->max = hi;
+   req->binding = port;
+   t3_mgmt_tx(adap, skb);
+}
+
+static void bind_qsets(struct adapter *adap)
+{
+   int i, j;
+
+   for_each_port(adap, i) {
+   const struct port_info *pi = adap2pinfo(adap, i);
+
+   for (j = 0; j < pi->nqsets; ++j)
+   send_pktsched_cmd(adap, 1, pi->first_qset + j, -1,
+ -1, i);
+   }
+}
+
 /**
  * cxgb_up - enable the adapter
  * @adapter: adapter being enabled
@@ -708,6 +739,11 @@ static int cxgb_up(struct adapter *adap)
 
t3_sge_start(adap);
t3_intr_enable(adap);
+
+   if ((adap->flags & (USING_MSIX | QUEUES_BOUND)) == USING_MSIX)
+   bind_qsets(adap);
+   adap->flags |= QUEUES_BOUND;
+
 out:
return err;
 irq_err:
@@ -1830,34 +1866,18 @@ static int cxgb_extension_ioctl(struct n
break;
}
case CHELSIO_SET_PKTSCHED:{
-   struct sk_buff *skb;
struct ch_pktsched_params p;
-   struct mngt_pktsched_wr *req;
 
-   if (!(adapter->flags & FULL_INIT_DONE))
-   return -EIO;/* uP must be up and running */
+   if (!capable(CAP_NET_ADMIN))
+   return -EPERM;
+   if (!adapter->open_device_map)
+   return -EAGAIN; /* uP and SGE must be running */
if (copy_from_user(&p, useraddr, sizeof(p)))
-   return -EFAULT;
-   skb = alloc_skb(sizeof(*req), GFP_KERNEL);
-   if (!skb)
-   return -ENOMEM;
-   req =
-   (struct mngt_pktsched_wr *)skb_put(skb,
-   sizeof(*req));
-   req->wr_hi = htonl(V_WR_OP(FW_WROPCODE_MNGT));
-   req->mngt_opcode = FW_MNGTOPCODE_PKTSCHED_SET;
-   req->sched = p.sched;
-   req->idx = p.idx;
-   req->min = p.min;
-   req->max = p.max;
-   req->binding = p.binding;
-   printk(KERN_INFO
-   "pktsched: sched %u idx %u min %u max %u binding %u\n",
-   req->sched, req->idx, req->min, req->max,
-   req->binding);
-   skb->priority = 1;
-   offload_tx(&adapter->tdev, skb);
+   return -EFAULT;
+   send_pktsched_cmd(adapter, p.sched, p.idx, p.min, p.max,
+ p.binding);
break;
+   
}
default:
return -EOPNOTSUPP;
diff --git a/drivers/net/cxgb3/sge.c b/drivers/net/cxgb3/sge.c
index 6c77f4b..ccea06a 100644
--- a/drivers/net/cxgb3/sge.c
+++ b/drivers/net/cxgb3/sge.c
@@ -1198,6
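
A note on the flag test added to cxgb_up() in this patch: comparing the masked flags against USING_MSIX is true only when MSI-X is enabled and QUEUES_BOUND is still clear, so the binding commands are sent at most once. A minimal sketch of the idiom follows. The enum values are copied from the adapter.h hunk; `needs_binding()` is a hypothetical helper name used only for illustration.

```c
/* Adapter flag bits, as listed in the adapter.h hunk of this patch. */
enum {
	FULL_INIT_DONE = (1 << 0),
	USING_MSI = (1 << 1),
	USING_MSIX = (1 << 2),
	QUEUES_BOUND = (1 << 3),
};

/*
 * Nonzero only when USING_MSIX is set and QUEUES_BOUND is clear.
 * Masking with both bits and comparing against one of them tests
 * "this bit set AND that bit clear" in a single expression.
 */
static int needs_binding(unsigned long flags)
{
	return (flags & (USING_MSIX | QUEUES_BOUND)) == USING_MSIX;
}
```

Bits outside the mask, such as FULL_INIT_DONE, do not affect the result.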

[PATCH 1/10] cxgb3 - FW versioning

2007-01-30 Thread Divy Le Ray

From: Divy Le Ray <[EMAIL PROTECTED]>

Clean up FW version checking.
The supported FW version is now 3.1.

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---

 drivers/net/cxgb3/cxgb3_main.c   |   15 ---
 drivers/net/cxgb3/firmware_exports.h |   27 +++
 drivers/net/cxgb3/t3_hw.c|   17 -
 3 files changed, 47 insertions(+), 12 deletions(-)

diff --git a/drivers/net/cxgb3/cxgb3_main.c b/drivers/net/cxgb3/cxgb3_main.c
index 54c49ac..8044146 100644
--- a/drivers/net/cxgb3/cxgb3_main.c
+++ b/drivers/net/cxgb3/cxgb3_main.c
@@ -665,11 +665,8 @@ static int cxgb_up(struct adapter *adap)
 
if (!(adap->flags & FULL_INIT_DONE)) {
err = t3_check_fw_version(adap);
-   if (err) {
-   dev_err(&adap->pdev->dev,
-   "adapter FW is not compatible with driver\n");
+   if (err)
goto out;
-   }
 
err = init_dummy_netdevs(adap);
if (err)
@@ -1002,10 +999,14 @@ static void get_drvinfo(struct net_devic
strcpy(info->bus_info, pci_name(adapter->pdev));
if (!fw_vers)
strcpy(info->fw_version, "N/A");
-   else
+   else {
snprintf(info->fw_version, sizeof(info->fw_version),
-"%s %u.%u", (fw_vers >> 24) ? "T" : "N",
-(fw_vers >> 12) & 0xfff, fw_vers & 0xfff);
+"%s %u.%u.%u",
+G_FW_VERSION_TYPE(fw_vers) ? "T" : "N",
+G_FW_VERSION_MAJOR(fw_vers),
+G_FW_VERSION_MINOR(fw_vers),
+G_FW_VERSION_MICRO(fw_vers));
+   }
 }
 
 static void get_strings(struct net_device *dev, u32 stringset, u8 * data)
diff --git a/drivers/net/cxgb3/firmware_exports.h b/drivers/net/cxgb3/firmware_exports.h
index 3565f48..eea7d89 100644
--- a/drivers/net/cxgb3/firmware_exports.h
+++ b/drivers/net/cxgb3/firmware_exports.h
@@ -141,4 +141,31 @@
 #define FW_WRC_NUM \
 (65536 + FW_TUNNEL_NUM + FW_CTRL_NUM + FW_RI_NUM + FW_RX_PKT_NUM)
 
+/*
+ * FW type and version.
+ */
+#define S_FW_VERSION_TYPE  28
+#define M_FW_VERSION_TYPE  0xF
+#define V_FW_VERSION_TYPE(x)   ((x) << S_FW_VERSION_TYPE)
+#define G_FW_VERSION_TYPE(x)   \
+(((x) >> S_FW_VERSION_TYPE) & M_FW_VERSION_TYPE)
+
+#define S_FW_VERSION_MAJOR 16
+#define M_FW_VERSION_MAJOR 0xFFF
+#define V_FW_VERSION_MAJOR(x)  ((x) << S_FW_VERSION_MAJOR)
+#define G_FW_VERSION_MAJOR(x)  \
+(((x) >> S_FW_VERSION_MAJOR) & M_FW_VERSION_MAJOR)
+
+#define S_FW_VERSION_MINOR 8
+#define M_FW_VERSION_MINOR 0xFF
+#define V_FW_VERSION_MINOR(x)  ((x) << S_FW_VERSION_MINOR)
+#define G_FW_VERSION_MINOR(x)  \
+(((x) >> S_FW_VERSION_MINOR) & M_FW_VERSION_MINOR)
+
+#define S_FW_VERSION_MICRO 0
+#define M_FW_VERSION_MICRO 0xFF
+#define V_FW_VERSION_MICRO(x)  ((x) << S_FW_VERSION_MICRO)
+#define G_FW_VERSION_MICRO(x)  \
+(((x) >> S_FW_VERSION_MICRO) & M_FW_VERSION_MICRO)
+
 #endif /* _FIRMWARE_EXPORTS_H_ */
diff --git a/drivers/net/cxgb3/t3_hw.c b/drivers/net/cxgb3/t3_hw.c
index a4e2e57..4545acb 100644
--- a/drivers/net/cxgb3/t3_hw.c
+++ b/drivers/net/cxgb3/t3_hw.c
@@ -826,6 +826,11 @@ static int t3_write_flash(struct adapter
return 0;
 }
 
+enum fw_version_type {
+   FW_VERSION_N3,
+   FW_VERSION_T3
+};
+
 /**
  * t3_get_fw_version - read the firmware version
  * @adapter: the adapter
@@ -849,19 +854,21 @@ int t3_check_fw_version(struct adapter *
 {
int ret;
u32 vers;
+   unsigned int type, major, minor;
 
ret = t3_get_fw_version(adapter, &vers);
if (ret)
return ret;
 
-   /* Minor 0xfff means the FW is an internal development-only version. */
-   if ((vers & 0xfff) == 0xfff)
-   return 0;
+   type = G_FW_VERSION_TYPE(vers);
+   major = G_FW_VERSION_MAJOR(vers);
+   minor = G_FW_VERSION_MINOR(vers);
 
-   if (vers == 0x1002009)
+   if (type == FW_VERSION_T3 && major == 3 && minor == 1)
return 0;
 
-   CH_ERR(adapter, "found wrong FW version, driver needs version 2.9\n");
+   CH_ERR(adapter, "found wrong FW version(%u.%u), "
+  "driver needs version 3.1\n", major, minor);
return -EINVAL;
 }
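
To illustrate the version word layout introduced by this patch (type in bits 28-31, major in bits 16-27, minor in bits 8-15, micro in bits 0-7), here is a small userspace sketch. The S_/M_/G_ macros are copied from the firmware_exports.h hunk. `fw_version_pack()` and `fw_version_supported()` are hypothetical helpers written for this example, and the check mirrors the condition in t3_check_fw_version() under the assumption that FW_VERSION_T3 is 1, as the enum order suggests.

```c
#include <stdint.h>

#define S_FW_VERSION_TYPE	28
#define M_FW_VERSION_TYPE	0xF
#define G_FW_VERSION_TYPE(x)	(((x) >> S_FW_VERSION_TYPE) & M_FW_VERSION_TYPE)

#define S_FW_VERSION_MAJOR	16
#define M_FW_VERSION_MAJOR	0xFFF
#define G_FW_VERSION_MAJOR(x)	(((x) >> S_FW_VERSION_MAJOR) & M_FW_VERSION_MAJOR)

#define S_FW_VERSION_MINOR	8
#define M_FW_VERSION_MINOR	0xFF
#define G_FW_VERSION_MINOR(x)	(((x) >> S_FW_VERSION_MINOR) & M_FW_VERSION_MINOR)

#define S_FW_VERSION_MICRO	0
#define M_FW_VERSION_MICRO	0xFF
#define G_FW_VERSION_MICRO(x)	(((x) >> S_FW_VERSION_MICRO) & M_FW_VERSION_MICRO)

enum fw_version_type {
	FW_VERSION_N3,
	FW_VERSION_T3
};

/* Hypothetical helper: build a version word from its four fields. */
static uint32_t fw_version_pack(uint32_t type, uint32_t major,
				uint32_t minor, uint32_t micro)
{
	return ((type & M_FW_VERSION_TYPE) << S_FW_VERSION_TYPE) |
	       ((major & M_FW_VERSION_MAJOR) << S_FW_VERSION_MAJOR) |
	       ((minor & M_FW_VERSION_MINOR) << S_FW_VERSION_MINOR) |
	       ((micro & M_FW_VERSION_MICRO) << S_FW_VERSION_MICRO);
}

/* Hypothetical helper: same condition as the patch (T3 firmware, 3.1). */
static int fw_version_supported(uint32_t vers)
{
	return G_FW_VERSION_TYPE(vers) == FW_VERSION_T3 &&
	       G_FW_VERSION_MAJOR(vers) == 3 &&
	       G_FW_VERSION_MINOR(vers) == 1;
}
```

With this layout a T3 3.1.0 image would report 0x10030100. The micro field is not part of the compatibility check.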
 

[PATCH 6/10] cxgb3 - white space to tabs

2007-01-30 Thread Divy Le Ray

From: Divy Le Ray <[EMAIL PROTECTED]>

Use tabs in comments.

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---

 drivers/net/cxgb3/t3_hw.c |   30 +++---
 1 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/drivers/net/cxgb3/t3_hw.c b/drivers/net/cxgb3/t3_hw.c
index 7112bac..35a7fab 100644
--- a/drivers/net/cxgb3/t3_hw.c
+++ b/drivers/net/cxgb3/t3_hw.c
@@ -14,21 +14,21 @@
 #include "sge_defs.h"
 #include "firmware_exports.h"
 
- /**
-  *t3_wait_op_done_val - wait until an operation is completed
-  *@adapter: the adapter performing the operation
-  *@reg: the register to check for completion
-  *@mask: a single-bit field within @reg that indicates completion
-  *@polarity: the value of the field when the operation is completed
-  *@attempts: number of check iterations
-  *@delay: delay in usecs between iterations
-  *@valp: where to store the value of the register at completion time
-  *
-  *Wait until an operation is completed by checking a bit in a register
-  *up to @attempts times.  If @valp is not NULL the value of the register
-  *at the time it indicated completion is stored there.  Returns 0 if the
-  *operation completes and -EAGAIN otherwise.
-  */
+/**
+ * t3_wait_op_done_val - wait until an operation is completed
+ * @adapter: the adapter performing the operation
+ * @reg: the register to check for completion
+ * @mask: a single-bit field within @reg that indicates completion
+ * @polarity: the value of the field when the operation is completed
+ * @attempts: number of check iterations
+ * @delay: delay in usecs between iterations
+ * @valp: where to store the value of the register at completion time
+ *
+ * Wait until an operation is completed by checking a bit in a register
+ * up to @attempts times.  If @valp is not NULL the value of the register
+ * at the time it indicated completion is stored there.  Returns 0 if the
+ * operation completes and -EAGAIN otherwise.
+ */
 
 int t3_wait_op_done_val(struct adapter *adapter, int reg, u32 mask,
int polarity, int attempts, int delay, u32 *valp)

[PATCH 9/10] cxgb3 - Add

2007-01-30 Thread Divy Le Ray

From: Divy Le Ray <[EMAIL PROTECTED]>

Include  in adapter.h

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---

 drivers/net/cxgb3/adapter.h |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/net/cxgb3/adapter.h b/drivers/net/cxgb3/adapter.h
index 8902007..97e35c8 100644
--- a/drivers/net/cxgb3/adapter.h
+++ b/drivers/net/cxgb3/adapter.h
@@ -19,6 +19,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "t3cdev.h"
 #include 
 #include 

[PATCH 8/10] cxgb3 - Unmap offload packets when they are freed.

2007-01-30 Thread Divy Le Ray

From: Divy Le Ray <[EMAIL PROTECTED]>

Offload packets may be DMAed long after their SGE Tx descriptors are done
so they must remain mapped until they are freed rather than until their
descriptors are freed.  Unmap such packets through an skb destructor.

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---

 drivers/net/cxgb3/sge.c |   63 ++-
 1 files changed, 61 insertions(+), 2 deletions(-)

diff --git a/drivers/net/cxgb3/sge.c b/drivers/net/cxgb3/sge.c
index daef7fd..d563f7a 100644
--- a/drivers/net/cxgb3/sge.c
+++ b/drivers/net/cxgb3/sge.c
@@ -85,6 +85,15 @@ struct unmap_info {  /* packet unmapping
 };
 
 /*
+ * Holds unmapping information for Tx packets that need deferred unmapping.
+ * This structure lives at skb->head and must be allocated by callers.
+ */
+struct deferred_unmap_info {
+   struct pci_dev *pdev;
+   dma_addr_t addr[MAX_SKB_FRAGS + 1];
+};
+
+/*
  * Maps a number of flits to the number of Tx descriptors that can hold them.
  * The formula is
  *
@@ -232,10 +241,13 @@ static void free_tx_desc(struct adapter
struct pci_dev *pdev = adapter->pdev;
unsigned int cidx = q->cidx;
 
+   const int need_unmap = need_skb_unmap() &&
+  q->cntxt_id >= FW_TUNNEL_SGEEC_START;
+
d = &q->sdesc[cidx];
while (n--) {
if (d->skb) {   /* an SGL is present */
-   if (need_skb_unmap())
+   if (need_unmap)
unmap_skb(d->skb, q, cidx, pdev);
if (d->skb->priority == cidx)
kfree_skb(d->skb);
@@ -1207,6 +1219,50 @@ int t3_mgmt_tx(struct adapter *adap, str
 }
 
 /**
+ * deferred_unmap_destructor - unmap a packet when it is freed
+ * @skb: the packet
+ *
+ * This is the packet destructor used for Tx packets that need to remain
+ * mapped until they are freed rather than until their Tx descriptors are
+ * freed.
+ */
+static void deferred_unmap_destructor(struct sk_buff *skb)
+{
+   int i;
+   const dma_addr_t *p;
+   const struct skb_shared_info *si;
+   const struct deferred_unmap_info *dui;
+   const struct unmap_info *ui = (struct unmap_info *)skb->cb;
+
+   dui = (struct deferred_unmap_info *)skb->head;
+   p = dui->addr;
+
+   if (ui->len)
+   pci_unmap_single(dui->pdev, *p++, ui->len, PCI_DMA_TODEVICE);
+
+   si = skb_shinfo(skb);
+   for (i = 0; i < si->nr_frags; i++)
+   pci_unmap_page(dui->pdev, *p++, si->frags[i].size,
+  PCI_DMA_TODEVICE);
+}
+
+static void setup_deferred_unmapping(struct sk_buff *skb, struct pci_dev *pdev,
+const struct sg_ent *sgl, int sgl_flits)
+{
+   dma_addr_t *p;
+   struct deferred_unmap_info *dui;
+
+   dui = (struct deferred_unmap_info *)skb->head;
+   dui->pdev = pdev;
+   for (p = dui->addr; sgl_flits >= 3; sgl++, sgl_flits -= 3) {
+   *p++ = be64_to_cpu(sgl->addr[0]);
+   *p++ = be64_to_cpu(sgl->addr[1]);
+   }
+   if (sgl_flits)
+   *p = be64_to_cpu(sgl->addr[0]);
+}
+
+/**
  * write_ofld_wr - write an offload work request
  * @adap: the adapter
  * @skb: the packet to send
@@ -1242,8 +1298,11 @@ static void write_ofld_wr(struct adapter
sgp = ndesc == 1 ? (struct sg_ent *)&d->flit[flits] : sgl;
sgl_flits = make_sgl(skb, sgp, skb->h.raw, skb->tail - skb->h.raw,
 adap->pdev);
-   if (need_skb_unmap())
+   if (need_skb_unmap()) {
+   setup_deferred_unmapping(skb, adap->pdev, sgp, sgl_flits);
+   skb->destructor = deferred_unmap_destructor;
((struct unmap_info *)skb->cb)->len = skb->tail - skb->h.raw;
+   }
 
write_wr_hdr_sgl(ndesc, skb, d, pidx, q, sgl, flits, sgl_flits,
 gen, from->wr_hi, from->wr_lo);
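
A note on the loop in setup_deferred_unmapping(): each full scatter/gather entry occupies three flits and carries two DMA addresses, and a trailing partial entry carries one more. The userspace sketch below reproduces only that address walk. `struct sg_ent_model` is an assumed layout (two 32-bit lengths followed by two 64-bit addresses), `collect_addrs()` is a hypothetical name, and the be64_to_cpu() conversion is left out for brevity.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: two lengths plus two addresses = 24 bytes = 3 flits. */
struct sg_ent_model {
	uint32_t len[2];
	uint64_t addr[2];
};

/*
 * Copy the DMA addresses out of an SGL that occupies sgl_flits flits.
 * Returns the number of addresses written to out.
 */
static size_t collect_addrs(const struct sg_ent_model *sgl, int sgl_flits,
			    uint64_t *out)
{
	uint64_t *p = out;

	/* full entries: two addresses per three flits */
	for (; sgl_flits >= 3; sgl++, sgl_flits -= 3) {
		*p++ = sgl->addr[0];
		*p++ = sgl->addr[1];
	}
	/* leftover flits mean a final entry holding a single address */
	if (sgl_flits)
		*p++ = sgl->addr[0];

	return (size_t)(p - out);
}
```

The destructor in the patch then walks the saved array in the same order: the linear part first if its length is nonzero, followed by one address per page fragment.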

[PATCH 10/10] cxgb3 - Add dual licensing

2007-01-30 Thread Divy Le Ray

From: Divy Le Ray <[EMAIL PROTECTED]>

Dual licensing, needed for OFED 1.2

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---

 drivers/net/cxgb3/adapter.h  |   33 +++
 drivers/net/cxgb3/ael1002.c  |   34 +++-
 drivers/net/cxgb3/common.h   |   34 +++-
 drivers/net/cxgb3/cxgb3_ctl_defs.h   |   34 
 drivers/net/cxgb3/cxgb3_defs.h   |4 +--
 drivers/net/cxgb3/cxgb3_ioctl.h  |   34 +++-
 drivers/net/cxgb3/cxgb3_main.c   |   36 --
 drivers/net/cxgb3/cxgb3_offload.c|4 +--
 drivers/net/cxgb3/cxgb3_offload.h|4 +--
 drivers/net/cxgb3/firmware_exports.h |   48 +++---
 drivers/net/cxgb3/l2t.c  |4 +--
 drivers/net/cxgb3/l2t.h  |4 +--
 drivers/net/cxgb3/mc5.c  |   34 +++-
 drivers/net/cxgb3/sge.c  |   38 +--
 drivers/net/cxgb3/t3_cpl.h   |   34 ++--
 drivers/net/cxgb3/t3_hw.c|   38 +--
 drivers/net/cxgb3/t3cdev.h   |3 +-
 drivers/net/cxgb3/version.h  |   47 ++---
 drivers/net/cxgb3/vsc8211.c  |   34 +++-
 drivers/net/cxgb3/xgmac.c|   34 +++-
 20 files changed, 399 insertions(+), 136 deletions(-)

diff --git a/drivers/net/cxgb3/adapter.h b/drivers/net/cxgb3/adapter.h
index 97e35c8..5c97a64 100644
--- a/drivers/net/cxgb3/adapter.h
+++ b/drivers/net/cxgb3/adapter.h
@@ -1,12 +1,33 @@
 /*
- * This file is part of the Chelsio T3 Ethernet driver for Linux.
+ * Copyright (c) 2003-2007 Chelsio, Inc. All rights reserved.
  *
- * Copyright (C) 2003-2006 Chelsio Communications.  All rights reserved.
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
  *
- * This program is distributed in the hope that it will be useful, but WITHOUT
- * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
- * FITNESS FOR A PARTICULAR PURPOSE.  See the LICENSE file included in this
- * release for licensing terms and conditions.
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
  */
 
 /* This file should not be included directly.  Include common.h instead. */
diff --git a/drivers/net/cxgb3/ael1002.c b/drivers/net/cxgb3/ael1002.c
index 93a90d8..73a41e6 100644
--- a/drivers/net/cxgb3/ael1002.c
+++ b/drivers/net/cxgb3/ael1002.c
@@ -1,14 +1,34 @@
 /*
- * This file is part of the Chelsio T3 Ethernet driver.
+ * Copyright (c) 2005-2007 Chelsio, Inc. All rights reserved.
  *
- * Copyright (C) 2005-2006 Chelsio Communications.  All rights reserved.
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
  *
- * This program is distributed in the hope that it will be useful, but WITHOUT
- * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
- * FITNESS FOR A PARTICULAR PURPOSE.  See the LICENSE file included in this
- * release for licensing terms and conditions.
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the

[PATCH 7/10] cxgb3 - Remove BUG_ON from t3b_intr_napi

2007-01-30 Thread Divy Le Ray

From: Divy Le Ray <[EMAIL PROTECTED]>

In some cases, SG_DATA_INTR won't clear on read, and the following
interrupt may cause us to assert because NAPI is already scheduled.
Remove the assertion; NAPI can handle attempts to rearm it while
it's already scheduled.

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---

 drivers/net/cxgb3/sge.c |2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/drivers/net/cxgb3/sge.c b/drivers/net/cxgb3/sge.c
index 8b3c824..daef7fd 100644
--- a/drivers/net/cxgb3/sge.c
+++ b/drivers/net/cxgb3/sge.c
@@ -2199,14 +2199,12 @@ static irqreturn_t t3b_intr_napi(int irq
if (likely(map & 1)) {
dev = adap->sge.qs[0].netdev;
 
-   BUG_ON(napi_is_scheduled(dev));
if (likely(__netif_rx_schedule_prep(dev)))
__netif_rx_schedule(dev);
}
if (map & 2) {
dev = adap->sge.qs[1].netdev;
 
-   BUG_ON(napi_is_scheduled(dev));
if (likely(__netif_rx_schedule_prep(dev)))
__netif_rx_schedule(dev);
}
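
Why dropping the assertion should be harmless: the schedule-prep step behaves like a test-and-set, so a second interrupt that arrives while polling is already scheduled simply fails the prep check and schedules nothing. The sketch below models that with C11 atomics in userspace. `struct poller` and its functions are hypothetical names for illustration only, not the netdev API.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of a device with a "poll scheduled" bit. */
struct poller {
	atomic_bool scheduled;
	int runs_queued;
};

/* True only for the caller that flips the bit from clear to set. */
static bool schedule_prep(struct poller *p)
{
	return !atomic_exchange(&p->scheduled, true);
}

/* Interrupt handler body: queue a poll run unless one is pending. */
static void on_interrupt(struct poller *p)
{
	if (schedule_prep(p))
		p->runs_queued++;
}

/* Called when polling finishes, allowing the next interrupt to rearm. */
static void poll_done(struct poller *p)
{
	atomic_store(&p->scheduled, false);
}
```

A duplicate interrupt therefore costs one failed exchange rather than a crash.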

[PATCH 3/10] cxgb3 - remove SW Tx credits coalescing

2007-01-30 Thread Divy Le Ray

From: Divy Le Ray <[EMAIL PROTECTED]>

Remove Tx credit coalescing done in SW.
The HW is taking care of it already.

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---

 drivers/net/cxgb3/sge.c |   75 +--
 1 files changed, 14 insertions(+), 61 deletions(-)

diff --git a/drivers/net/cxgb3/sge.c b/drivers/net/cxgb3/sge.c
index ccea06a..8b3c824 100644
--- a/drivers/net/cxgb3/sge.c
+++ b/drivers/net/cxgb3/sge.c
@@ -1550,33 +1550,6 @@ static inline int rx_offload(struct t3cd
 }
 
 /**
- * update_tx_completed - update the number of processed Tx descriptors
- * @qs: the queue set to update
- * @idx: which Tx queue within the set to update
- * @credits: number of new processed descriptors
- * @tx_completed: accumulates credits for the queues
- *
- * Updates the number of completed Tx descriptors for a queue set's Tx
- * queue.  On UP systems we updated the information immediately but on
- * MP we accumulate the credits locally and update the Tx queue when we
- * reach a threshold to avoid cache-line bouncing.
- */
-static inline void update_tx_completed(struct sge_qset *qs, int idx,
-  unsigned int credits,
-  unsigned int tx_completed[])
-{
-#ifdef CONFIG_SMP
-   tx_completed[idx] += credits;
-   if (tx_completed[idx] > 32) {
-   qs->txq[idx].processed += tx_completed[idx];
-   tx_completed[idx] = 0;
-   }
-#else
-   qs->txq[idx].processed += credits;
-#endif
-}
-
-/**
  * restart_tx - check whether to restart suspended Tx queues
  * @qs: the queue set to resume
  *
@@ -1656,13 +1629,12 @@ static void rx_eth(struct adapter *adap,
  * handle_rsp_cntrl_info - handles control information in a response
  * @qs: the queue set corresponding to the response
  * @flags: the response control flags
- * @tx_completed: accumulates completion credits for the Tx queues
  *
  * Handles the control information of an SGE response, such as GTS
  * indications and completion credits for the queue set's Tx queues.
+ * HW coalesces credits, we don't do any extra SW coalescing.
  */
-static inline void handle_rsp_cntrl_info(struct sge_qset *qs, u32 flags,
-unsigned int tx_completed[])
+static inline void handle_rsp_cntrl_info(struct sge_qset *qs, u32 flags)
 {
unsigned int credits;
 
@@ -1671,37 +1643,21 @@ static inline void handle_rsp_cntrl_info
clear_bit(TXQ_RUNNING, &qs->txq[TXQ_ETH].flags);
 #endif
 
-   /* ETH credits are already coalesced, return them immediately. */
credits = G_RSPD_TXQ0_CR(flags);
if (credits)
qs->txq[TXQ_ETH].processed += credits;
 
+   credits = G_RSPD_TXQ2_CR(flags);
+   if (credits)
+   qs->txq[TXQ_CTRL].processed += credits;
+
 # if USE_GTS
if (flags & F_RSPD_TXQ1_GTS)
clear_bit(TXQ_RUNNING, &qs->txq[TXQ_OFLD].flags);
 # endif
-   update_tx_completed(qs, TXQ_OFLD, G_RSPD_TXQ1_CR(flags), tx_completed);
-   update_tx_completed(qs, TXQ_CTRL, G_RSPD_TXQ2_CR(flags), tx_completed);
-}
-
-/**
- * flush_tx_completed - returns accumulated Tx completions to Tx queues
- * @qs: the queue set to update
- * @tx_completed: pending completion credits to return to Tx queues
- *
- * Updates the number of completed Tx descriptors for a queue set's Tx
- * queues with the credits pending in @tx_completed.  This does something
- * only on MP systems as on UP systems we return the credits immediately.
- */
-static inline void flush_tx_completed(struct sge_qset *qs,
- unsigned int tx_completed[])
-{
-#if defined(CONFIG_SMP)
-   if (tx_completed[TXQ_OFLD])
-   qs->txq[TXQ_OFLD].processed += tx_completed[TXQ_OFLD];
-   if (tx_completed[TXQ_CTRL])
-   qs->txq[TXQ_CTRL].processed += tx_completed[TXQ_CTRL];
-#endif
+   credits = G_RSPD_TXQ1_CR(flags);
+   if (credits)
+   qs->txq[TXQ_OFLD].processed += credits;
 }
 
 /**
@@ -1784,7 +1740,7 @@ static int process_responses(struct adap
struct sge_rspq *q = &qs->rspq;
struct rsp_desc *r = &q->desc[q->cidx];
int budget_left = budget;
-   unsigned int sleeping = 0, tx_completed[3] = { 0, 0, 0 };
+   unsigned int sleeping = 0;
struct sk_buff *offload_skbs[RX_BUNDLE_SIZE];
int ngathered = 0;
 
@@ -1837,7 +1793,7 @@ static int process_responses(struct adap
 
if (flags & RSPD_CTRL_MASK) {
sleeping |= flags & RSPD_GTS_MASK;
-   handle_rsp_cntrl_info(qs, flags, tx_completed);
+   handle_rsp_cntrl_info(qs, flags);
}
 
r++;
@@ -1868,7 +1824,6 @@ static int process_responses(struct adap
--budget_left;
}
 
-   flush_tx_completed(qs,
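
For comparison, here is a userspace sketch of the two crediting schemes this patch swaps. The first function follows the removed SMP branch of update_tx_completed(): credits pile up locally and reach the queue only once more than 32 are pending. The second follows the new code, which adds credits to the queue as they arrive. `struct txq_model` and both function names are hypothetical.

```c
#define COALESCE_THRESHOLD 32

/* Hypothetical model of a Tx queue's completion counter. */
struct txq_model {
	unsigned int processed;
};

/* Old scheme: accumulate in *pending, publish past the threshold. */
static void credit_coalesced(struct txq_model *q, unsigned int *pending,
			     unsigned int credits)
{
	*pending += credits;
	if (*pending > COALESCE_THRESHOLD) {
		q->processed += *pending;
		*pending = 0;
	}
}

/* New scheme: publish right away, relying on HW having coalesced. */
static void credit_immediate(struct txq_model *q, unsigned int credits)
{
	if (credits)
		q->processed += credits;
}
```

With the old scheme any credits still pending at the end of a pass needed a separate flush step, which is why flush_tx_completed() can go away along with the accumulator array.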

[PATCH 5/10] cxgb3 - Clean up HW init routine

2007-01-30 Thread Divy Le Ray

From: Divy Le Ray <[EMAIL PROTECTED]>

Clean up the tp_config() routine.

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---

 drivers/net/cxgb3/t3_hw.c |   16 +---
 1 files changed, 5 insertions(+), 11 deletions(-)

diff --git a/drivers/net/cxgb3/t3_hw.c b/drivers/net/cxgb3/t3_hw.c
index 2215400..7112bac 100644
--- a/drivers/net/cxgb3/t3_hw.c
+++ b/drivers/net/cxgb3/t3_hw.c
@@ -2328,8 +2328,6 @@ static inline void tp_wr_indirect(struct
 
 static void tp_config(struct adapter *adap, const struct tp_params *p)
 {
-   unsigned int v;
-
t3_write_reg(adap, A_TP_GLOBAL_CONFIG, F_TXPACINGENABLE | F_PATHMTU |
 F_IPCHECKSUMOFFLOAD | F_UDPCHECKSUMOFFLOAD |
 F_TCPCHECKSUMOFFLOAD | V_IPTTL(64));
@@ -2348,15 +2346,11 @@ static void tp_config(struct adapter *ad
 adap->params.rev > 0 ? F_ENABLEESND : F_T3A_ENABLEESND,
 0);
 
-   v = t3_read_reg(adap, A_TP_PC_CONFIG);
-   v &= ~(F_ENABLEEPCMDAFULL | F_ENABLEOCSPIFULL);
-   t3_write_reg(adap, A_TP_PC_CONFIG, v | F_TXDEFERENABLE |
-F_MODULATEUNIONMODE | F_HEARBEATDACK |
-F_TXCONGESTIONMODE | F_RXCONGESTIONMODE);
-
-   v = t3_read_reg(adap, A_TP_PC_CONFIG2);
-   v &= ~F_CHDRAFULL;
-   t3_write_reg(adap, A_TP_PC_CONFIG2, v);
+   t3_set_reg_field(adap, A_TP_PC_CONFIG,
+F_ENABLEEPCMDAFULL | F_ENABLEOCSPIFULL,
+F_TXDEFERENABLE | F_HEARBEATDACK | F_TXCONGESTIONMODE |
+F_RXCONGESTIONMODE);
+   t3_set_reg_field(adap, A_TP_PC_CONFIG2, F_CHDRAFULL, 0);
 
if (adap->params.rev > 0) {
tp_wr_indirect(adap, A_TP_EGRESS_CONFIG, F_REWRITEFORCETOSIZE);
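
The removed lines in this hunk spell out what the replacement call is expected to do: read the register, clear the bits in the mask, OR in the new value, write it back. A minimal sketch of that read-modify-write on a plain value follows, assuming t3_set_reg_field() has exactly those semantics. `set_field()` is a hypothetical name and no hardware access is modelled.

```c
#include <stdint.h>

/*
 * Clear the bits selected by mask, then set the bits in val.
 * Bits outside mask and val keep their previous state.
 */
static uint32_t set_field(uint32_t reg, uint32_t mask, uint32_t val)
{
	return (reg & ~mask) | val;
}
```

One thing worth checking when reviewing: the old A_TP_PC_CONFIG write also ORed in F_MODULATEUNIONMODE, which does not appear in the new call's value argument.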

[PATCH 0/10] cxgb3 - Chelsio T3 1G/10G driver updates

2007-01-30 Thread Divy Le Ray


Jeff,

I'm sending a series of incremental patches updating
the cxgb3 driver. These patches are built against
netdev#upstream.

Cheers,
Divy

[PATCH 11/23] clocksource: atomic signals

2007-01-30 Thread Daniel Walker

Modifies how the timekeeping code switches clocksources. The original code
constantly polled the clocksource list, checking for newly added
clocksources. This patch instead uses an atomic counter to signal when a new
clock is added, so the switch operation runs only when it's needed.

The fast path is also reduced to checking a single atomic value.

Signed-Off-By: Daniel Walker <[EMAIL PROTECTED]>

---
 include/linux/clocksource.h |5 
 include/linux/timekeeping.h |4 ---
 kernel/time/clocksource.c   |6 +
 kernel/time/timekeeping.c   |   49 
 4 files changed, 43 insertions(+), 21 deletions(-)

Index: linux-2.6.19/include/linux/clocksource.h
===
--- linux-2.6.19.orig/include/linux/clocksource.h
+++ linux-2.6.19/include/linux/clocksource.h
@@ -25,6 +25,11 @@ typedef u64 cycle_t;
 extern struct clocksource clocksource_jiffies;
 
 /*
+ * Atomic signal that is specific to timekeeping.
+ */
+extern atomic_t clock_check;
+
+/*
  * Allows inlined calling for notifier routines.
  */
 extern struct atomic_notifier_head clocksource_list_notifier;
Index: linux-2.6.19/include/linux/timekeeping.h
===
--- linux-2.6.19.orig/include/linux/timekeeping.h
+++ linux-2.6.19/include/linux/timekeeping.h
@@ -4,10 +4,6 @@
 #include 
 
 #ifndef CONFIG_GENERIC_TIME
-static inline int change_clocksource(void)
-{
-   return 0;
-}
 
 static inline void change_clocksource(void) { }
 static inline void timekeeping_init_notifier(void) { }
Index: linux-2.6.19/kernel/time/clocksource.c
===
--- linux-2.6.19.orig/kernel/time/clocksource.c
+++ linux-2.6.19/kernel/time/clocksource.c
@@ -50,6 +50,7 @@ static char override_name[32];
 static int finished_booting;
 
 ATOMIC_NOTIFIER_HEAD(clocksource_list_notifier);
+atomic_t clock_check = ATOMIC_INIT(0);
 
 /* clocksource_done_booting - Called near the end of bootup
  *
@@ -58,6 +59,8 @@ ATOMIC_NOTIFIER_HEAD(clocksource_list_no
 static int __init clocksource_done_booting(void)
 {
finished_booting = 1;
+   /* Check for a new clock now */
+   atomic_inc(&clock_check);
return 0;
 }
 
@@ -285,6 +288,9 @@ static ssize_t sysfs_override_clocksourc
/* try to select it: */
next_clocksource = select_clocksource();
 
+   /* Signal that there is a new clocksource */
+   atomic_inc(&clock_check);
+
spin_unlock_irq(&clocksource_lock);
 
return ret;
Index: linux-2.6.19/kernel/time/timekeeping.c
===
--- linux-2.6.19.orig/kernel/time/timekeeping.c
+++ linux-2.6.19/kernel/time/timekeeping.c
@@ -9,6 +9,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -39,7 +40,6 @@ static unsigned long timekeeping_suspend
  * Clock used for timekeeping
  */
 struct clocksource *clock = &clocksource_jiffies;
-atomic_t clock_recalc_interval = ATOMIC_INIT(0);
 
 #ifdef CONFIG_GENERIC_TIME
 /**
@@ -157,11 +157,12 @@ int do_settimeofday(struct timespec *tv)
 EXPORT_SYMBOL(do_settimeofday);
 
 /**
- * change_clocksource - Swaps clocksources if a new one is available
+ * timekeeping_change_clocksource - Swaps clocksources if a new one is available
  *
  * Accumulates current time interval and initializes new clocksource
+ * Needs to be called with the xtime_lock held.
  */
-static int change_clocksource(void)
+static void timekeeping_change_clocksource(void)
 {
struct clocksource *new;
cycle_t now;
@@ -176,12 +177,15 @@ static int change_clocksource(void)
clock->cycle_last = now;
printk(KERN_INFO "Time: %s clocksource has been installed.\n",
   clock->name);
-   return 1;
-   } else if (unlikely(atomic_read(&clock_recalc_interval))) {
-   atomic_set(&clock_recalc_interval, 0);
-   return 1;
+   tick_clock_notify();
+   clock->error = 0;
+   clock->xtime_nsec = 0;
+   clocksource_calculate_interval(clock, NTP_INTERVAL_LENGTH);
+   } else {
+   clock->error = 0;
+   clock->xtime_nsec = 0;
+   clocksource_calculate_interval(clock, NTP_INTERVAL_LENGTH);
}
-   return 0;
 }
 
 /**
@@ -205,9 +209,14 @@ int timekeeping_is_continuous(void)
 static int
 clocksource_callback(struct notifier_block *nb, unsigned long op, void *c)
 {
-   if (c == clock && op == CLOCKSOURCE_NOTIFY_FREQ &&
-   !atomic_read(&clock_recalc_interval))
-   atomic_inc(&clock_recalc_interval);
+   if (likely(c != clock))
+   return 0;
+
+   switch (op) {
+   case CLOCKSOURCE_NOTIFY_FREQ:
+   case CLOCKSOURCE_NOTIFY_RATING:
+   atomic_inc(&clock_check);
+   }
 
return 0;
 }
@@ -334,6 +343,7 @@ static int __init

[PATCH 17/23] clocksource: avr32 update for new flags

2007-01-30 Thread Daniel Walker

Update avr32 for new flags.

Signed-Off-By: Daniel Walker <[EMAIL PROTECTED]>

---
 arch/avr32/kernel/time.c |1 -
 1 file changed, 1 deletion(-)

Index: linux-2.6.19/arch/avr32/kernel/time.c
===
--- linux-2.6.19.orig/arch/avr32/kernel/time.c
+++ linux-2.6.19/arch/avr32/kernel/time.c
@@ -37,7 +37,6 @@ static struct clocksource clocksource_av
.read   = read_cycle_count,
.mask   = CLOCKSOURCE_MASK(32),
.shift  = 16,
-   .flags  = CLOCK_SOURCE_IS_CONTINUOUS,
 };
 
 /*


Re: [RFC] rfkill - Add support for input key to control wireless radio

2007-01-30 Thread Stephen Hemminger

Hope you will be resubmitting this.

> +/*
> + * rfkill key structure.
> + */
> +struct rfkill_key {
> + /*
> +  * For sysfs representation.
> +  */
> + struct class_device *cdev;
> +
> + /*
> +  * Pointer to rfkill structure
> +  * that was filled in by key driver.
> +  */
> + struct rfkill *rfkill;

Since rfkill is basically a function pointer table,
can it be made const?


> + /*
> +  * Pointer to type structure that this key belongs to.
> +  */
> + struct rfkill_type *type;
> +
> + /*
> +  * Once key status change has been detected, the toggled
> +  * field should be set to indicate a notification to
> +  * user or driver should be performed.
> +  */
> + int toggled;
> +
> + /*
> +  * Current state of the device radio, this state will
> +  * change after the radio has actually been toggled since
> +  * receiving the radio key event.
> +  */
> + int radio_status;
> +
> + /*
> +  * Current status of the key which controls the radio,
> +  * this value will change after the key state has changed
> +  * after polling, or the key driver has send the new state
> +  * manually.
> +  */
> + int key_status;


Maybe turn these into bit values (set_bit/clear_bit) in an unsigned long.
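
As a rough illustration of the bit-value suggestion, here is a minimal userspace sketch. The enum names and the `sketch_*` helpers are hypothetical and only stand in for the kernel's `set_bit()`/`clear_bit()`/`test_bit()`; unlike the kernel helpers, these are not atomic.

```c
#include <stdbool.h>

/* Hypothetical bit positions replacing the three int fields
 * (toggled, radio_status, key_status) in struct rfkill_key. */
enum rfkill_key_bit {
	RFKILL_KEY_TOGGLED = 0,	/* a notification is pending */
	RFKILL_KEY_RADIO_ON,	/* current state of the radio */
	RFKILL_KEY_PRESSED,	/* current state of the key */
};

/* Userspace stand-ins for the kernel bit helpers; NOT atomic. */
static inline void sketch_set_bit(unsigned int nr, unsigned long *flags)
{
	*flags |= 1UL << nr;
}

static inline void sketch_clear_bit(unsigned int nr, unsigned long *flags)
{
	*flags &= ~(1UL << nr);
}

static inline bool sketch_test_bit(unsigned int nr, const unsigned long *flags)
{
	return (*flags >> nr) & 1UL;
}
```

All three states then fit in a single `unsigned long` member instead of three separate ints.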

> + /*
> +  * Input device for this key,
> +  * we also keep track of the number of
> +  * times this input device is open. This
> +  * is important for determining to whom we
> +  * should report key events.
> +  */
> + struct input_dev *input;
> + unsigned int open_count;

atomic on open_count?

> + /*
> +  * Key index number.
> +  */
> + unsigned int key_index;
> +
> + /*
> +  * List head structure to be used
> +  * to add this structure to the list.
> +  */
> + struct list_head entry;
> +};
> +
> +/*
> + * rfkill key type structure.
> + */
> +struct rfkill_type {
> + /*
> +  * For sysfs representation.
> +  */
> + struct class_device *cdev;
> +
> + /*
> +  * Name of this radio type.
> +  */
> + char *name;

const?

> + /*
> +  * Key type identification. Value must be any
> +  * in the key_type enum.
> +  */
> + unsigned int key_type;
> +
> + /*
> +  * Number of registered keys of this type.
> +  */
> + unsigned int key_count;
> +};
> +
> +/*
> + * rfkill master structure.
> + */
> +struct rfkill_master {
> + /*
> +  * For sysfs representation.
> +  */
> + struct class *class;
> +
> + /*
> +  * All access to the master structure
> +  * and its children (the keys) are protected
> +  * by this key lock.
> +  */
> + struct semaphore key_sem;

mutex instead of semaphore?

> + /*
> +  * List of available key types.
> +  */
> + struct rfkill_type type[KEY_TYPE_MAX];
> +
> + /*
> +  * Total number of registered keys.
> +  */
> + unsigned int key_count;
> +
> + /*
> +  * Number of keys that require polling
> +  */
> + unsigned int poll_required;
> +
> + /*
> +  * List of rfkill_key structures.
> +  */
> + struct list_head key_list;
> +
> + /*
> +  * Work structures for periodic polling,
> +  * as well as the scheduled radio toggling.
> +  */
> + struct work_struct toggle_work;
> + struct work_struct poll_work;

delayed_rearming_work instead?

> +};

[PATCH 09/23] clocksource: add block notifier

2007-01-30 Thread Daniel Walker

Adds a callback interface for register/rating change events. This is also used
later in this series to signal other interesting events.
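
To illustrate the idea outside the kernel, here is a minimal userspace sketch of a callback chain. Everything prefixed `sketch_` is hypothetical; the real patch uses the kernel's atomic notifier chain, which also handles priorities and concurrent callers. Only the three `CLOCKSOURCE_NOTIFY_*` values are taken from the patch.

```c
#include <stddef.h>

/* Event values as defined in the patch. */
#define CLOCKSOURCE_NOTIFY_REGISTER 1
#define CLOCKSOURCE_NOTIFY_RATING   2
#define CLOCKSOURCE_NOTIFY_FREQ     4

/* Hypothetical, simplified stand-in for struct notifier_block. */
struct sketch_notifier {
	int (*call)(struct sketch_notifier *nb, unsigned long op, void *data);
	struct sketch_notifier *next;
};

static struct sketch_notifier *sketch_chain;

/* Add a listener to the head of the chain; always returns zero. */
static int sketch_notifier_register(struct sketch_notifier *nb)
{
	nb->next = sketch_chain;
	sketch_chain = nb;
	return 0;
}

/* Deliver one event to every registered listener. */
static void sketch_notifier_call_chain(unsigned long op, void *data)
{
	struct sketch_notifier *nb;

	for (nb = sketch_chain; nb; nb = nb->next)
		nb->call(nb, op, data);
}

/* Example listener: counts events that concern the clock in use. */
static void *sketch_current_clock;
static int sketch_clock_check;

static int sketch_callback(struct sketch_notifier *nb, unsigned long op,
			   void *c)
{
	(void)nb;
	if (c != sketch_current_clock)
		return 0;

	switch (op) {
	case CLOCKSOURCE_NOTIFY_FREQ:
	case CLOCKSOURCE_NOTIFY_RATING:
		sketch_clock_check++;
		break;
	}
	return 0;
}
```

A listener registered this way is called for every event and filters on both the event type and the clock it cares about.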

Signed-Off-By: Daniel Walker <[EMAIL PROTECTED]>

---
 include/linux/clocksource.h |   37 +
 include/linux/timekeeping.h |3 +++
 kernel/time/clocksource.c   |   10 ++
 3 files changed, 50 insertions(+)

Index: linux-2.6.19/include/linux/clocksource.h
===
--- linux-2.6.19.orig/include/linux/clocksource.h
+++ linux-2.6.19/include/linux/clocksource.h
@@ -13,6 +13,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -23,6 +24,42 @@ typedef u64 cycle_t;
 /* XXX - Would like a better way for initializing curr_clocksource */
 extern struct clocksource clocksource_jiffies;
 
+/*
+ * Allows inlined calling for notifier routines.
+ */
+extern struct atomic_notifier_head clocksource_list_notifier;
+
+/*
+ * Block notifier flags.
+ */
+#define CLOCKSOURCE_NOTIFY_REGISTER 1
+#define CLOCKSOURCE_NOTIFY_RATING   2
+#define CLOCKSOURCE_NOTIFY_FREQ     4
+
+/**
+ * clocksource_notifier_register - Registers a list change notifier
+ * @nb:   pointer to a notifier block
+ *
+ * Returns zero always.
+ */
+static inline int clocksource_notifier_register(struct notifier_block *nb)
+{
+   return atomic_notifier_chain_register(&clocksource_list_notifier, nb);
+}
+
+/**
+ * clocksource_freq_change - Allows notification of dynamic frequency changes.
+ *
+ * Signals that a clocksource is dynamically changing it's frequency.
+ * This could happen if a clocksource becomes more/less stable.
+ */
+static inline void clocksource_freq_change(struct clocksource *c)
+{
+   atomic_notifier_call_chain(&clocksource_list_notifier,
+  CLOCKSOURCE_NOTIFY_FREQ, c);
+}
+
+
 /**
  * struct clocksource - hardware abstraction for a free running counter
  * Provides mostly state-free accessors to the underlying hardware.
Index: linux-2.6.19/include/linux/timekeeping.h
===
--- linux-2.6.19.orig/include/linux/timekeeping.h
+++ linux-2.6.19/include/linux/timekeeping.h
@@ -8,6 +8,9 @@ static inline int change_clocksource(voi
 {
return 0;
 }
+
+static inline void change_clocksource(void) { }
+
 #endif /* !CONFIG_GENERIC_TIME */
 
 #endif /* _LINUX_TIMEKEEPING_H */
Index: linux-2.6.19/kernel/time/clocksource.c
===
--- linux-2.6.19.orig/kernel/time/clocksource.c
+++ linux-2.6.19/kernel/time/clocksource.c
@@ -49,6 +49,8 @@ static DEFINE_SPINLOCK(clocksource_lock)
 static char override_name[32];
 static int finished_booting;
 
+ATOMIC_NOTIFIER_HEAD(clocksource_list_notifier);
+
 /* clocksource_done_booting - Called near the end of bootup
  *
  * Hack to avoid lots of clocksource churn at boot time
@@ -193,6 +195,10 @@ int clocksource_register(struct clocksou
 */
next_clocksource = select_clocksource();
spin_unlock_irqrestore(&clocksource_lock, flags);
+
+   atomic_notifier_call_chain(&clocksource_list_notifier,
+  CLOCKSOURCE_NOTIFY_REGISTER, c);
+
return ret;
 }
 EXPORT_SYMBOL(clocksource_register);
@@ -218,6 +224,10 @@ void clocksource_rating_change(struct cl
 
next_clocksource = select_clocksource();
spin_unlock_irqrestore(&clocksource_lock, flags);
+
+   atomic_notifier_call_chain(&clocksource_list_notifier,
+  CLOCKSOURCE_NOTIFY_RATING, c);
+
 }
 EXPORT_SYMBOL(clocksource_rating_change);
 


[PATCH 16/23] clocksource: arm update for new flags

2007-01-30 Thread Daniel Walker

Update ARM for new flags.

Signed-Off-By: Daniel Walker <[EMAIL PROTECTED]>

---
 arch/arm/mach-imx/time.c  |1 -
 arch/arm/mach-ixp4xx/common.c |1 -
 arch/arm/mach-netx/time.c |1 -
 arch/arm/mach-pxa/time.c  |1 -
 4 files changed, 4 deletions(-)

Index: linux-2.6.19/arch/arm/mach-imx/time.c
===
--- linux-2.6.19.orig/arch/arm/mach-imx/time.c
+++ linux-2.6.19/arch/arm/mach-imx/time.c
@@ -87,7 +87,6 @@ static struct clocksource clocksource_im
.read   = imx_get_cycles,
.mask   = 0x,
.shift  = 20,
-   .flags  = CLOCK_SOURCE_IS_CONTINUOUS,
 };
 
 static int __init imx_clocksource_init(void)
Index: linux-2.6.19/arch/arm/mach-ixp4xx/common.c
===
--- linux-2.6.19.orig/arch/arm/mach-ixp4xx/common.c
+++ linux-2.6.19/arch/arm/mach-ixp4xx/common.c
@@ -395,7 +395,6 @@ static struct clocksource clocksource_ix
.read   = ixp4xx_get_cycles,
.mask   = CLOCKSOURCE_MASK(32),
.shift  = 20,
-   .flags  = CLOCK_SOURCE_IS_CONTINUOUS,
 };
 
 unsigned long ixp4xx_timer_freq = FREQ;
Index: linux-2.6.19/arch/arm/mach-netx/time.c
===
--- linux-2.6.19.orig/arch/arm/mach-netx/time.c
+++ linux-2.6.19/arch/arm/mach-netx/time.c
@@ -62,7 +62,6 @@ static struct clocksource clocksource_ne
.read   = netx_get_cycles,
.mask   = CLOCKSOURCE_MASK(32),
.shift  = 20,
-   .flags  = CLOCK_SOURCE_IS_CONTINUOUS,
 };
 
 /*
Index: linux-2.6.19/arch/arm/mach-pxa/time.c
===
--- linux-2.6.19.orig/arch/arm/mach-pxa/time.c
+++ linux-2.6.19/arch/arm/mach-pxa/time.c
@@ -112,7 +112,6 @@ static struct clocksource clocksource_px
.read   = pxa_get_cycles,
.mask   = CLOCKSOURCE_MASK(32),
.shift  = 20,
-   .flags  = CLOCK_SOURCE_IS_CONTINUOUS,
 };
 
 static void __init pxa_timer_init(void)


[PATCH 22/23] clocksource: new clock lookup method

2007-01-30 Thread Daniel Walker

This patch modifies the current clocksource API so that clocks
can be masked if they have specific negative qualities. For 
instance, if a clock is not atomic you can choose not to include 
it in the list you select from, atomic in this case being lockless. 

The following qualities can be masked off:

#define CLOCKSOURCE_NOT_CONTINUOUS  1
#define CLOCKSOURCE_UNSTABLE        2
#define CLOCKSOURCE_NOT_ATOMIC      4
#define CLOCKSOURCE_UNDER_32BITS    8
#define CLOCKSOURCE_64BITS          16
#define CLOCKSOURCE_PM_AFFECTED     32

I modify the .flags to accomplish this.

The reasoning behind this is that it's not interesting to have both a
positive "rating" value and positive flags.

For instance, if the clock has a high rating you assume it's continuous.
The selection process isn't helped by stating "continuous" in the flags.

The point is to list the negative properties of some clocks that a
programmer knows in advance their code cannot tolerate.
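
A minimal userspace sketch of the masked selection described above follows. The `sketch_*` names are hypothetical, and the array walk is a simplification of the registered clocksource list. Unlike the patch, an empty list yields NULL here rather than the jiffies clock.

```c
#include <stddef.h>

/* Flag values as listed in this patch description. */
#define CLOCKSOURCE_NOT_CONTINUOUS  1
#define CLOCKSOURCE_UNSTABLE        2
#define CLOCKSOURCE_NOT_ATOMIC      4
#define CLOCKSOURCE_UNDER_32BITS    8
#define CLOCKSOURCE_64BITS          16
#define CLOCKSOURCE_PM_AFFECTED     32

/* Hypothetical, trimmed-down stand-in for struct clocksource. */
struct sketch_clock {
	const char *name;
	int rating;
	unsigned long flags;
};

/*
 * Return the highest rated clock that has none of the bits in @mask
 * set, or NULL if every clock is excluded by the mask.
 */
static const struct sketch_clock *
sketch_get_masked_clock(const struct sketch_clock *clocks, size_t n,
			unsigned long mask)
{
	const struct sketch_clock *best = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		if (clocks[i].flags & mask)
			continue;	/* has a quality the caller rejects */
		if (!best || clocks[i].rating > best->rating)
			best = &clocks[i];
	}
	return best;
}
```

A caller that cannot tolerate a locked read would pass `CLOCKSOURCE_NOT_ATOMIC` as the mask; a mask of 0 simply returns the highest rated clock.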

Signed-Off-By: Daniel Walker <[EMAIL PROTECTED]>

---
 include/linux/clocksource.h |   35 +---
 kernel/time/clocksource.c   |   47 ++--
 kernel/time/jiffies.c   |1 
 kernel/time/timekeeping.c   |   11 ++
 4 files changed, 55 insertions(+), 39 deletions(-)

Index: linux-2.6.19/include/linux/clocksource.h
===
--- linux-2.6.19.orig/include/linux/clocksource.h
+++ linux-2.6.19/include/linux/clocksource.h
@@ -30,6 +30,11 @@ extern struct clocksource clocksource_ji
 extern struct sys_device clocksource_sys_device;
 
 /*
+ * Read function type.
+ */
+typedef cycle_t (*clock_read_t)(void);
+
+/*
  * Allows inlined calling for notifier routines.
  */
 extern struct atomic_notifier_head clocksource_list_notifier;
@@ -40,6 +45,7 @@ extern struct atomic_notifier_head clock
 #define CLOCKSOURCE_NOTIFY_REGISTER 1
 #define CLOCKSOURCE_NOTIFY_RATING   2
 #define CLOCKSOURCE_NOTIFY_FREQ     4
+#define CLOCKSOURCE_NOTIFY_UNSTABLE 8
 
 /*
  * Defined so the initcall can be changes without touching
@@ -119,12 +125,6 @@ struct clocksource {
s64 error;
 };
 
-/*
- * Clock source flags bits::
- */
-#define CLOCK_SOURCE_IS_CONTINUOUS 0x01
-#define CLOCK_SOURCE_MUST_VERIFY   0x02
-
 /* simplify initialization of mask field */
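 #define CLOCKSOURCE_MASK(bits) (cycle_t)(bits<64 ? ((1ULL<<bits)-1) : -1)
 
@@ -235,6 +235,9 @@ static inline void clocksource_calculate
    c->xtime_interval = (u64)c->cycle_interval * c->mult;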
 }
 
+/*
+ * Clocksource flags
+ */
 #define CLOCKSOURCE_NOT_CONTINUOUS 1
 #define CLOCKSOURCE_UNSTABLE   2
 #define CLOCKSOURCE_NOT_ATOMIC 4
@@ -243,9 +246,25 @@ static inline void clocksource_calculate
 #define CLOCKSOURCE_PM_AFFECTED 32
 
 /* used to install a new clocksource */
+extern void clocksource_mark_unstable(struct clocksource *);
+extern struct clocksource *clocksource_get_unstable(void);
 extern int clocksource_register(struct clocksource*);
 extern void clocksource_rating_change(struct clocksource*);
-extern struct clocksource * clocksource_get_clock(char*);
+extern struct clocksource * clocksource_get_clock(char*, unsigned long);
+
+
+/**
+ * clocksource_get_masked_clock - Finds highest rated clocksource w/o mask
+ * @mask:  Clocksource features that are not wanted.
+ *
+ * Returns the highest rated clocksource that doesn't have any of the bits set
+ * in mask. If none are register the jiffies clock is returned. If all the clocks
+ * have the mask bits set, then NULL is returned.
+ */
+static inline struct clocksource * clocksource_get_masked_clock(unsigned long mask)
+{
+   return clocksource_get_clock(NULL, mask);
+}
 
 /**
  * clocksource_get_best_clock - Finds highest rated clocksource
@@ -255,7 +274,7 @@ extern struct clocksource * clocksource_
  */
 static inline struct clocksource * clocksource_get_best_clock(void)
 {
-   return clocksource_get_clock(NULL);
+   return clocksource_get_clock(NULL, 0);
 }
 
 #ifdef CONFIG_GENERIC_TIME_VSYSCALL
Index: linux-2.6.19/kernel/time/clocksource.c
===
--- linux-2.6.19.orig/kernel/time/clocksource.c
+++ linux-2.6.19/kernel/time/clocksource.c
@@ -51,7 +51,8 @@ ATOMIC_NOTIFIER_HEAD(clocksource_list_no
  * If no clocksources are registered the jiffies clock is
  * returned.
  */
-static struct clocksource * __is_registered(char * name)
+static inline
+struct clocksource * __is_registered(char * name, unsigned long mask)
 {
struct list_head *tmp;
 
@@ -59,7 +60,11 @@ static struct clocksource * __is_registe
struct clocksource *src;
 
src = list_entry(tmp, struct clocksource, list);
-   if (!strcmp(src->name, name))
+   if (name) {
+   if (!strcmp(src->name, name))
+   return src;
+
+   } else if (!(src->flags & mask))

[PATCH 15/23] clocksource: add new flags

2007-01-30 Thread Daniel Walker

Compile patch. This just adds some code so the next few patches will compile.

Signed-Off-By: Daniel Walker <[EMAIL PROTECTED]>

---
 include/linux/clocksource.h |6 ++
 1 file changed, 6 insertions(+)

Index: linux-2.6.19/include/linux/clocksource.h
===
--- linux-2.6.19.orig/include/linux/clocksource.h
+++ linux-2.6.19/include/linux/clocksource.h
@@ -235,6 +235,12 @@ static inline void clocksource_calculate
c->xtime_interval = (u64)c->cycle_interval * c->mult;
 }
 
+#define CLOCKSOURCE_NOT_CONTINUOUS 1
+#define CLOCKSOURCE_UNSTABLE   2
+#define CLOCKSOURCE_NOT_ATOMIC 4
+#define CLOCKSOURCE_UNDER_32BITS   8
+#define CLOCKSOURCE_64BITS 16
+#define CLOCKSOURCE_PM_AFFECTED    32
 
 /* used to install a new clocksource */
 extern int clocksource_register(struct clocksource*);


[PATCH 20/23] clocksource: x86_64 update for new flags

2007-01-30 Thread Daniel Walker

Update x86_64 for new flags.

Signed-Off-By: Daniel Walker <[EMAIL PROTECTED]>

---
 arch/x86_64/kernel/hpet.c |1 -
 arch/x86_64/kernel/tsc.c  |   11 ---
 2 files changed, 4 insertions(+), 8 deletions(-)

Index: linux-2.6.19/arch/x86_64/kernel/hpet.c
===
--- linux-2.6.19.orig/arch/x86_64/kernel/hpet.c
+++ linux-2.6.19/arch/x86_64/kernel/hpet.c
@@ -470,7 +470,6 @@ struct clocksource clocksource_hpet = {
.mask   = (cycle_t)HPET_MASK,
.mult   = 0, /* set below */
.shift  = HPET_SHIFT,
-   .flags  = CLOCK_SOURCE_IS_CONTINUOUS,
.vread  = vread_hpet,
 };
 
Index: linux-2.6.19/arch/x86_64/kernel/tsc.c
===
--- linux-2.6.19.orig/arch/x86_64/kernel/tsc.c
+++ linux-2.6.19/arch/x86_64/kernel/tsc.c
@@ -195,18 +195,15 @@ static struct clocksource clocksource_ts
.read   = read_tsc,
.mask   = CLOCKSOURCE_MASK(64),
.shift  = 22,
-   .flags  = CLOCK_SOURCE_IS_CONTINUOUS |
- CLOCK_SOURCE_MUST_VERIFY,
+   .flags  = CLOCKSOURCE_64BITS | CLOCKSOURCE_PM_AFFECTED,
.vread  = vread_tsc,
 };
 
 void mark_tsc_unstable(void)
 {
/* check to see if we should switch to the safe clocksource: */
-   if (unlikely(!tsc_unstable && clocksource_tsc.rating != 50)) {
-   clocksource_tsc.rating = 50;
-   clocksource_rating_change(&clocksource_tsc);
-   }
+   if (unlikely(!tsc_unstable && clocksource_tsc.rating != 50))
+   clocksource_tsc.flags |= CLOCKSOURCE_UNSTABLE;
 
tsc_unstable = 1;
 }
@@ -218,7 +215,7 @@ static int __init init_tsc_clocksource(v
clocksource_tsc.mult = clocksource_khz2mult(cpu_khz,
clocksource_tsc.shift);
if (check_tsc_unstable())
-   clocksource_tsc.rating = 50;
+   clocksource_tsc.flags |= CLOCKSOURCE_UNSTABLE;
 
return clocksource_register(&clocksource_tsc);
}


[PATCH 10/23] clocksource: remove update_callback

2007-01-30 Thread Daniel Walker

Uses the block notifier to replace the functionality of update_callback().
update_callback() was a special case specifically for the tsc, but including
it in the clocksource structure duplicated it needlessly for other clocks.

Signed-Off-By: Daniel Walker <[EMAIL PROTECTED]>

---
 arch/i386/kernel/tsc.c  |   33 +
 arch/x86_64/kernel/tsc.c|   19 +--
 include/linux/clocksource.h |2 --
 include/linux/timekeeping.h |1 +
 kernel/time/timekeeping.c   |   32 ++--
 5 files changed, 45 insertions(+), 42 deletions(-)

Index: linux-2.6.19/arch/i386/kernel/tsc.c
===
--- linux-2.6.19.orig/arch/i386/kernel/tsc.c
+++ linux-2.6.19/arch/i386/kernel/tsc.c
@@ -50,8 +50,7 @@ static int __init tsc_setup(char *str)
 __setup("notsc", tsc_setup);
 
 /*
- * code to mark and check if the TSC is unstable
- * due to cpufreq or due to unsynced TSCs
+ * Flag that denotes an unstable tsc and check function.
  */
 static int tsc_unstable;
 
@@ -60,12 +59,6 @@ static inline int check_tsc_unstable(voi
return tsc_unstable;
 }
 
-void mark_tsc_unstable(void)
-{
-   tsc_unstable = 1;
-}
-EXPORT_SYMBOL_GPL(mark_tsc_unstable);
-
 /* Accellerators for sched_clock()
  * convert from cycles(64bits) => nanoseconds (64bits)
  *  basic equation:
@@ -179,6 +172,7 @@ int recalibrate_cpu_khz(void)
if (cpu_has_tsc) {
cpu_khz = calculate_cpu_khz();
tsc_khz = cpu_khz;
+   mark_tsc_unstable();
cpu_data[0].loops_per_jiffy =
cpufreq_scale(cpu_data[0].loops_per_jiffy,
cpu_khz_old, cpu_khz);
@@ -295,7 +289,6 @@ core_initcall(cpufreq_tsc);
 /* clock source code */
 
 static unsigned long current_tsc_khz = 0;
-static int tsc_update_callback(void);
 
 static cycle_t read_tsc(void)
 {
@@ -313,32 +306,24 @@ static struct clocksource clocksource_ts
.mask   = CLOCKSOURCE_MASK(64),
.mult   = 0, /* to be set */
.shift  = 22,
-   .update_callback= tsc_update_callback,
.flags  = CLOCK_SOURCE_IS_CONTINUOUS |
  CLOCK_SOURCE_MUST_VERIFY,
 };
 
-static int tsc_update_callback(void)
+/*
+ * Code to mark if the TSC is unstable due to cpufreq or due to unsynced TSCs
+ */
+void mark_tsc_unstable(void)
 {
-   int change = 0;
-
/* check to see if we should switch to the safe clocksource: */
-   if (clocksource_tsc.rating != 0 && check_tsc_unstable()) {
+   if (unlikely(!tsc_unstable && clocksource_tsc.rating != 0)) {
clocksource_tsc.rating = 0;
clocksource_rating_change(&clocksource_tsc);
-   change = 1;
}
 
-   /* only update if tsc_khz has changed: */
-   if (current_tsc_khz != tsc_khz) {
-   current_tsc_khz = tsc_khz;
-   clocksource_tsc.mult = clocksource_khz2mult(current_tsc_khz,
-   clocksource_tsc.shift);
-   change = 1;
-   }
-
-   return change;
+   tsc_unstable = 1;
 }
+EXPORT_SYMBOL_GPL(mark_tsc_unstable);
 
 static int __init dmi_mark_tsc_unstable(struct dmi_system_id *d)
 {
Index: linux-2.6.19/arch/x86_64/kernel/tsc.c
===
--- linux-2.6.19.orig/arch/x86_64/kernel/tsc.c
+++ linux-2.6.19/arch/x86_64/kernel/tsc.c
@@ -47,11 +47,6 @@ static inline int check_tsc_unstable(voi
return tsc_unstable;
 }
 
-void mark_tsc_unstable(void)
-{
-   tsc_unstable = 1;
-}
-EXPORT_SYMBOL_GPL(mark_tsc_unstable);
 
 #ifdef CONFIG_CPU_FREQ
 
@@ -182,8 +177,6 @@ __setup("notsc", notsc_setup);
 
 /* clock source code: */
 
-static int tsc_update_callback(void);
-
 static cycle_t read_tsc(void)
 {
cycle_t ret = (cycle_t)get_cycles_sync();
@@ -202,24 +195,22 @@ static struct clocksource clocksource_ts
.read   = read_tsc,
.mask   = CLOCKSOURCE_MASK(64),
.shift  = 22,
-   .update_callback= tsc_update_callback,
.flags  = CLOCK_SOURCE_IS_CONTINUOUS |
  CLOCK_SOURCE_MUST_VERIFY,
.vread  = vread_tsc,
 };
 
-static int tsc_update_callback(void)
+void mark_tsc_unstable(void)
 {
-   int change = 0;
-
/* check to see if we should switch to the safe clocksource: */
-   if (clocksource_tsc.rating != 50 && check_tsc_unstable()) {
+   if (unlikely(!tsc_unstable && clocksource_tsc.rating != 50)) {
clocksource_tsc.rating = 50;
clocksource_rating_change(&clocksource_tsc);
-   change = 1;
}
-   return change;
+
+   tsc_unstable = 1;
 }
+EXPORT_SYMBOL_GPL(mark_tsc_unstable);
 
 static int __init

[PATCH 12/23] clocksource: add clocksource_get_clock()

2007-01-30 Thread Daniel Walker

Adds one new API call, clocksource_get_clock(), which selects a clock by
name; if the name is NULL, the highest rated clock is returned.
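
The lookup rule can be sketched in userspace as follows. The `sketch_*` names are hypothetical, and an array stands in for the locked clocksource list that the real function walks while holding clocksource_lock.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for struct clocksource. */
struct sketch_named_clock {
	const char *name;
	int rating;
};

/*
 * With a name: return the matching clock, or NULL if it is not
 * registered. With a NULL name: return the highest rated clock.
 */
static const struct sketch_named_clock *
sketch_get_clock(const struct sketch_named_clock *clocks, size_t n,
		 const char *name)
{
	const struct sketch_named_clock *best = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		if (name) {
			if (!strcmp(clocks[i].name, name))
				return &clocks[i];
		} else if (!best || clocks[i].rating > best->rating) {
			best = &clocks[i];
		}
	}
	return best;
}
```

Passing NULL as the name corresponds to what the patch exposes as clocksource_get_best_clock().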

Signed-Off-By: Daniel Walker <[EMAIL PROTECTED]>

---
 include/linux/clocksource.h |   12 
 kernel/time/clocksource.c   |   18 ++
 2 files changed, 30 insertions(+)

Index: linux-2.6.19/include/linux/clocksource.h
===
--- linux-2.6.19.orig/include/linux/clocksource.h
+++ linux-2.6.19/include/linux/clocksource.h
@@ -234,6 +234,18 @@ static inline void clocksource_calculate
 extern struct clocksource *clocksource_get_next(void);
 extern int clocksource_register(struct clocksource*);
 extern void clocksource_rating_change(struct clocksource*);
+extern struct clocksource * clocksource_get_clock(char*);
+
+/**
+ * clocksource_get_best_clock - Finds highest rated clocksource
+ *
+ * Returns the highest rated clocksource. If none are register the
+ * jiffies clock is returned.
+ */
+static inline struct clocksource * clocksource_get_best_clock(void)
+{
+   return clocksource_get_clock(NULL);
+}
 
 #ifdef CONFIG_GENERIC_TIME_VSYSCALL
 extern void update_vsyscall(struct timespec *ts, struct clocksource *c);
Index: linux-2.6.19/kernel/time/clocksource.c
===
--- linux-2.6.19.orig/kernel/time/clocksource.c
+++ linux-2.6.19/kernel/time/clocksource.c
@@ -132,6 +132,24 @@ static inline struct clocksource * __get
 }
 
 /**
+ * clocksource_get_clock - Finds a specific clocksource
+ * @name:  name of the clocksource to return
+ *
+ * Returns the clocksource if registered, zero otherwise.
+ */
+struct clocksource * clocksource_get_clock(char * name)
+{
+   struct clocksource * ret;
+   unsigned long flags;
+
+   spin_lock_irqsave(&clocksource_lock, flags);
+   ret = __get_clock(name);
+   spin_unlock_irqrestore(&clocksource_lock, flags);
+   return ret;
+}
+
+
+/**
  * select_clocksource - Finds the best registered clocksource.
  *
  * Private function. Must hold clocksource_lock when called.


[PATCH 06/23] timekeeping: create kernel/time/timekeeping.c

2007-01-30 Thread Daniel Walker

Move the generic timekeeping code from kernel/timer.c to
kernel/time/timekeeping.c. This requires some glue code, which is
added to the include/linux/timekeeping.h header.

I tried to be as careful as possible in picking up recent changes to
the timekeeping code. This patch is on top of -mm, and moves all
the changes included in -mm.

This also moves do_timer and the load calculation code, which was
connected to the timekeeping code. Moving it provided for slightly better
compiler optimization.

Signed-Off-By: Daniel Walker <[EMAIL PROTECTED]> 

---
 include/linux/clocksource.h |3 
 include/linux/timekeeping.h |   13 
 kernel/time/Makefile|2 
 kernel/time/clocksource.c   |3 
 kernel/time/timekeeping.c   |  711 
 kernel/timer.c  |  694 --
 6 files changed, 729 insertions(+), 697 deletions(-)

Index: linux-2.6.19/include/linux/clocksource.h
===
--- linux-2.6.19.orig/include/linux/clocksource.h
+++ linux-2.6.19/include/linux/clocksource.h
@@ -18,6 +18,9 @@
 /* clocksource cycle base type */
 typedef u64 cycle_t;
 
+/* XXX - Would like a better way for initializing curr_clocksource */
+extern struct clocksource clocksource_jiffies;
+
 /**
  * struct clocksource - hardware abstraction for a free running counter
  * Provides mostly state-free accessors to the underlying hardware.
Index: linux-2.6.19/include/linux/timekeeping.h
===
--- /dev/null
+++ linux-2.6.19/include/linux/timekeeping.h
@@ -0,0 +1,13 @@
+#ifndef _LINUX_TIMEKEEPING_H
+#define _LINUX_TIMEKEEPING_H
+
+#include 
+
+#ifndef CONFIG_GENERIC_TIME
+static inline int change_clocksource(void)
+{
+   return 0;
+}
+#endif /* !CONFIG_GENERIC_TIME */
+
+#endif /* _LINUX_TIMEKEEPING_H */
Index: linux-2.6.19/kernel/time/Makefile
===
--- linux-2.6.19.orig/kernel/time/Makefile
+++ linux-2.6.19/kernel/time/Makefile
@@ -1,4 +1,4 @@
-obj-y += ntp.o clocksource.o jiffies.o timer_list.o
+obj-y += ntp.o clocksource.o jiffies.o timer_list.o timekeeping.o
 
 obj-$(CONFIG_GENERIC_CLOCKEVENTS)  += clockevents.o
 obj-$(CONFIG_GENERIC_CLOCKEVENTS)  += tick-common.o
Index: linux-2.6.19/kernel/time/clocksource.c
===
--- linux-2.6.19.orig/kernel/time/clocksource.c
+++ linux-2.6.19/kernel/time/clocksource.c
@@ -29,9 +29,6 @@
 #include 
 #include 
 
-/* XXX - Would like a better way for initializing curr_clocksource */
-extern struct clocksource clocksource_jiffies;
-
 /*[Clocksource internal variables]-
  * curr_clocksource:
  * currently selected clocksource. Initialized to clocksource_jiffies.
Index: linux-2.6.19/kernel/time/timekeeping.c
===
--- /dev/null
+++ linux-2.6.19/kernel/time/timekeeping.c
@@ -0,0 +1,711 @@
+/*
+ *  linux/kernel/time/timekeeping.c
+ *
+ *  timekeeping functions
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/*
+ * The current time
+ * wall_to_monotonic is what we need to add to xtime (or xtime corrected
+ * for sub jiffie times) to get to monotonic time.  Monotonic is pegged
+ * at zero at system boot time, so wall_to_monotonic will be negative,
+ * however, we will ALWAYS keep the tv_nsec part positive so we can use
+ * the usual normalization.
+ */
+struct timespec xtime __attribute__ ((aligned (16)));
+struct timespec wall_to_monotonic __attribute__ ((aligned (16)));
+
+EXPORT_SYMBOL(xtime);
+
+/*
+ * flag for if timekeeping is suspended
+ */
+static int timekeeping_suspended;
+
+/*
+ * time in seconds when suspend began
+ */
+static unsigned long timekeeping_suspend_time;
+
+/*
+ * Clock used for timekeeping
+ */
+struct clocksource *clock = &clocksource_jiffies;
+
+#ifdef CONFIG_GENERIC_TIME
+/**
+ * __get_nsec_offset - Returns nanoseconds since last call to periodic_hook
+ *
+ * private function, must hold xtime_lock lock when being
+ * called. Returns the number of nanoseconds since the
+ * last call to update_wall_time() (adjusted by NTP scaling)
+ */
+static inline s64 __get_nsec_offset(void)
+{
+   cycle_t cycle_now, cycle_delta;
+   s64 ns_offset;
+
+   /* read clocksource: */
+   cycle_now = clocksource_read(clock);
+
+   /* calculate the delta since the last update_wall_time: */
+   cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
+
+   /* convert to nanoseconds: */
+   ns_offset = cyc2ns(clock, cycle_delta);
+
+   return ns_offset;
+}
+
+/**
+ * __get_realtime_clock_ts - Returns the time of day in a timespec
+ * @ts:   pointer to the timespec to be set
+ *
+ * Returns the time of day in a timespec. Used by
+ * do_gettimeofday() and get_realtime_clock_ts().
+ */
+static inline void

[PATCH 18/23] clocksource: i386 update for new flags

2007-01-30 Thread Daniel Walker

Update i386 for new flags.

Signed-Off-By: Daniel Walker <[EMAIL PROTECTED]>

---
 arch/i386/kernel/hpet.c|1 -
 arch/i386/kernel/i8253.c   |1 +
 arch/i386/kernel/tsc.c |   23 +++
 arch/i386/kernel/vmitime.c |2 +-
 4 files changed, 13 insertions(+), 14 deletions(-)

Index: linux-2.6.19/arch/i386/kernel/hpet.c
===
--- linux-2.6.19.orig/arch/i386/kernel/hpet.c
+++ linux-2.6.19/arch/i386/kernel/hpet.c
@@ -287,7 +287,6 @@ static struct clocksource clocksource_hp
.read   = read_hpet,
.mask   = HPET_MASK,
.shift  = HPET_SHIFT,
-   .flags  = CLOCK_SOURCE_IS_CONTINUOUS,
 };
 
 static int __init init_hpet_clocksource(void)
Index: linux-2.6.19/arch/i386/kernel/i8253.c
===
--- linux-2.6.19.orig/arch/i386/kernel/i8253.c
+++ linux-2.6.19/arch/i386/kernel/i8253.c
@@ -183,6 +183,7 @@ static struct clocksource clocksource_pi
.rating = 110,
.read   = pit_read,
.mask   = CLOCKSOURCE_MASK(32),
+   .flags  = CLOCKSOURCE_NOT_ATOMIC,
.mult   = 0,
.shift  = 20,
 };
Index: linux-2.6.19/arch/i386/kernel/tsc.c
===
--- linux-2.6.19.orig/arch/i386/kernel/tsc.c
+++ linux-2.6.19/arch/i386/kernel/tsc.c
@@ -306,8 +306,7 @@ static struct clocksource clocksource_ts
.mask   = CLOCKSOURCE_MASK(64),
.mult   = 0, /* to be set */
.shift  = 22,
-   .flags  = CLOCK_SOURCE_IS_CONTINUOUS |
- CLOCK_SOURCE_MUST_VERIFY,
+   .flags  = CLOCKSOURCE_64BITS | CLOCKSOURCE_PM_AFFECTED,
 };
 
 /*
@@ -316,10 +315,8 @@ static struct clocksource clocksource_ts
 void mark_tsc_unstable(void)
 {
/* check to see if we should switch to the safe clocksource: */
-   if (unlikely(!tsc_unstable && clocksource_tsc.rating != 0)) {
-   clocksource_tsc.rating = 0;
-   clocksource_rating_change(&clocksource_tsc);
-   }
+   if (unlikely(!tsc_unstable))
+   clocksource_mark_unstable(&clocksource_tsc);
 
tsc_unstable = 1;
 }
@@ -434,17 +431,19 @@ static int __init init_tsc_clocksource(v
/* check blacklist */
dmi_check_system(bad_tsc_dmi_table);
 
-   if (unsynchronized_tsc()) /* mark unstable if unsynced */
+   if (unsynchronized_tsc()) { /* mark unstable if unsynced */
mark_tsc_unstable();
+   clocksource_tsc.flags |= CLOCKSOURCE_UNSTABLE;
+   }
check_geode_tsc_reliable();
current_tsc_khz = tsc_khz;
clocksource_tsc.mult = clocksource_khz2mult(current_tsc_khz,
clocksource_tsc.shift);
-   /* lower the rating if we already know its unstable: */
-   if (check_tsc_unstable()) {
-   clocksource_tsc.rating = 0;
-   clocksource_tsc.flags &= ~CLOCK_SOURCE_IS_CONTINUOUS;
-   }
+
+   /* flags as unstable if we already know its unstable: */
+   if (check_tsc_unstable())
+   clocksource_tsc.flags |= CLOCKSOURCE_UNSTABLE |
+CLOCKSOURCE_NOT_CONTINUOUS;
 
init_timer(&verify_tsc_freq_timer);
verify_tsc_freq_timer.function = verify_tsc_freq;
Index: linux-2.6.19/arch/i386/kernel/vmitime.c
===
--- linux-2.6.19.orig/arch/i386/kernel/vmitime.c
+++ linux-2.6.19/arch/i386/kernel/vmitime.c
@@ -115,7 +115,7 @@ static struct clocksource clocksource_vm
.mask   = CLOCKSOURCE_MASK(64),
.mult   = 0, /* to be set */
.shift  = 22,
-   .flags  = CLOCK_SOURCE_IS_CONTINUOUS,
+   .flags  = CLOCKSOURCE_64BITS,
 };
 
 


[PATCH 04/23] clocksource: drop time-x86_64-tsc-fixup-clocksource-changes.patch

2007-01-30 Thread Daniel Walker

Drop.

Signed-Off-By: Daniel Walker <[EMAIL PROTECTED]>

---
 arch/x86_64/kernel/tsc.c |   31 +--
 1 file changed, 21 insertions(+), 10 deletions(-)

Index: linux-2.6.19/arch/x86_64/kernel/tsc.c
===
--- linux-2.6.19.orig/arch/x86_64/kernel/tsc.c
+++ linux-2.6.19/arch/x86_64/kernel/tsc.c
@@ -46,6 +46,13 @@ static inline int check_tsc_unstable(voi
 {
return tsc_unstable;
 }
+
+void mark_tsc_unstable(void)
+{
+   tsc_unstable = 1;
+}
+EXPORT_SYMBOL_GPL(mark_tsc_unstable);
+
 #ifdef CONFIG_CPU_FREQ
 
 /* Frequency scaling support. Adjust the TSC based timer when the cpu frequency
@@ -174,6 +181,9 @@ __setup("notsc", notsc_setup);
 
 
 /* clock source code: */
+
+static int tsc_update_callback(void);
+
 static cycle_t read_tsc(void)
 {
cycle_t ret = (cycle_t)get_cycles_sync();
@@ -192,23 +202,24 @@ static struct clocksource clocksource_ts
.read   = read_tsc,
.mask   = CLOCKSOURCE_MASK(64),
.shift  = 22,
+   .update_callback= tsc_update_callback,
.flags  = CLOCK_SOURCE_IS_CONTINUOUS |
  CLOCK_SOURCE_MUST_VERIFY,
.vread  = vread_tsc,
 };
 
-void mark_tsc_unstable(void)
+static int tsc_update_callback(void)
 {
-   if (!tsc_unstable) {
-   tsc_unstable = 1;
-   /* Change only the rating, when not registered */
-   if (clocksource_tsc.mult)
-   clocksource_change_rating(&clocksource_tsc, 0);
-   else
-   clocksource_tsc.rating = 0;
+   int change = 0;
+
+   /* check to see if we should switch to the safe clocksource: */
+   if (clocksource_tsc.rating != 50 && check_tsc_unstable()) {
+   clocksource_tsc.rating = 50;
+   clocksource_reselect();
+   change = 1;
}
+   return change;
 }
-EXPORT_SYMBOL_GPL(mark_tsc_unstable);
 
 static int __init init_tsc_clocksource(void)
 {
@@ -216,7 +227,7 @@ static int __init init_tsc_clocksource(v
clocksource_tsc.mult = clocksource_khz2mult(cpu_khz,
clocksource_tsc.shift);
if (check_tsc_unstable())
-   clocksource_tsc.rating = 0;
+   clocksource_tsc.rating = 50;
 
return clocksource_register(&clocksource_tsc);
}


[PATCH 08/23] clocksource: drop duplicate register checking

2007-01-30 Thread Daniel Walker

This is something Thomas already dropped, and I'm just sticking
with that .. If you register your clocksource _twice_ your kernel will
likely not work correctly (and might crash).

Signed-Off-By: Daniel Walker <[EMAIL PROTECTED]>

---
 kernel/time/clocksource.c |   19 ++-
 1 file changed, 6 insertions(+), 13 deletions(-)

Index: linux-2.6.19/kernel/time/clocksource.c
===
--- linux-2.6.19.orig/kernel/time/clocksource.c
+++ linux-2.6.19/kernel/time/clocksource.c
@@ -186,16 +186,12 @@ int clocksource_register(struct clocksou
unsigned long flags;
 
spin_lock_irqsave(&clocksource_lock, flags);
-   if (unlikely(!list_empty(&c->list) && __is_registered(c->name))) {
-   printk("register_clocksource: Cannot register %s clocksource. "
-  "Already registered!", c->name);
-   ret = -EBUSY;
-   } else {
-   INIT_LIST_HEAD(&c->list);
-   __sorted_list_add(c);
-   /* scan the registered clocksources, and pick the best one */
-   next_clocksource = select_clocksource();
-   }
+   __sorted_list_add(c);
+
+   /*
+* scan the registered clocksources, and pick the best one
+*/
+   next_clocksource = select_clocksource();
spin_unlock_irqrestore(&clocksource_lock, flags);
return ret;
 }
@@ -212,9 +208,6 @@ void clocksource_rating_change(struct cl
 {
unsigned long flags;
 
-   if (unlikely(list_empty(>list)))
-   return;
-
spin_lock_irqsave(&clocksource_lock, flags);
 
/*


[PATCH 21/23] clocksource: drivers/ update for new flags

2007-01-30 Thread Daniel Walker

Update drivers/ for new flags.

Signed-Off-By: Daniel Walker <[EMAIL PROTECTED]>

---
 drivers/clocksource/acpi_pm.c|2 +-
 drivers/clocksource/cyclone.c|1 -
 drivers/clocksource/scx200_hrt.c |1 -
 3 files changed, 1 insertion(+), 3 deletions(-)

Index: linux-2.6.19/drivers/clocksource/acpi_pm.c
===
--- linux-2.6.19.orig/drivers/clocksource/acpi_pm.c
+++ linux-2.6.19/drivers/clocksource/acpi_pm.c
@@ -73,7 +73,7 @@ static struct clocksource clocksource_ac
.mask   = (cycle_t)ACPI_PM_MASK,
.mult   = 0, /*to be caluclated*/
.shift  = 22,
-   .flags  = CLOCK_SOURCE_IS_CONTINUOUS,
+   .flags  = CLOCKSOURCE_UNDER_32BITS,
 
 };
 
Index: linux-2.6.19/drivers/clocksource/cyclone.c
===
--- linux-2.6.19.orig/drivers/clocksource/cyclone.c
+++ linux-2.6.19/drivers/clocksource/cyclone.c
@@ -31,7 +31,6 @@ static struct clocksource clocksource_cy
.mask   = CYCLONE_TIMER_MASK,
.mult   = 10,
.shift  = 0,
-   .flags  = CLOCK_SOURCE_IS_CONTINUOUS,
 };
 
 static int __init init_cyclone_clocksource(void)
Index: linux-2.6.19/drivers/clocksource/scx200_hrt.c
===
--- linux-2.6.19.orig/drivers/clocksource/scx200_hrt.c
+++ linux-2.6.19/drivers/clocksource/scx200_hrt.c
@@ -57,7 +57,6 @@ static struct clocksource cs_hrt = {
.rating = 250,
.read   = read_hrt,
.mask   = CLOCKSOURCE_MASK(32),
-   .flags  = CLOCK_SOURCE_IS_CONTINUOUS,
/* mult, shift are set based on mhz27 flag */
 };
 


[PATCH 19/23] clocksource: mips update for new flags

2007-01-30 Thread Daniel Walker

Update mips for new flags.

Signed-Off-By: Daniel Walker <[EMAIL PROTECTED]>

---
 arch/mips/kernel/time.c |1 -
 1 file changed, 1 deletion(-)

Index: linux-2.6.19/arch/mips/kernel/time.c
===
--- linux-2.6.19.orig/arch/mips/kernel/time.c
+++ linux-2.6.19/arch/mips/kernel/time.c
@@ -307,7 +307,6 @@ static unsigned int __init calibrate_hpt
 struct clocksource clocksource_mips = {
.name   = "MIPS",
.mask   = 0x,
-   .flags  = CLOCK_SOURCE_IS_CONTINUOUS,
 };
 
 static void __init init_mips_clocksource(void)


[PATCH 07/23] clocksource: rating sorted list

2007-01-30 Thread Daniel Walker

Converts the original plain list into a sorted list based on the clock rating.
Later in my tree this allows some of the variables to be dropped since the
highest rated clock is always at the front of the list. This also does some
other nice things like allow the sysfs files to print the clocks in a more
interesting order. It's forward looking.
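
The idea is easy to model outside the kernel. A toy sketch in shell (not the
kernel's list code): clocks are name:rating lines, the ratings are the ones
visible in these patches (pit 110, scx200_hrt 250, an unstable i386 TSC 0),
and the helper names sorted_clocks and best_clock are made up for
illustration:

```sh
#!/bin/sh
# Keep the list ordered by rating, highest first.
sorted_clocks() {
	sort -t: -k2,2nr
}

# With a sorted list, selecting the best clock is just taking the head.
best_clock() {
	sorted_clocks | head -n 1 | cut -d: -f1
}

printf 'pit:110\ntsc:0\nscx200_hrt:250\n' | sorted_clocks
printf 'pit:110\ntsc:0\nscx200_hrt:250\n' | best_clock    # prints: scx200_hrt
```

Lowering a clock's rating then only means re-sorting; no separate "best"
variable has to be kept in step with the list.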

Signed-Off-By: Daniel Walker <[EMAIL PROTECTED]>

---
 arch/i386/kernel/tsc.c  |2 
 arch/x86_64/kernel/tsc.c|2 
 include/linux/clocksource.h |8 +-
 kernel/time/clocksource.c   |  132 +---
 4 files changed, 96 insertions(+), 48 deletions(-)

Index: linux-2.6.19/arch/i386/kernel/tsc.c
===
--- linux-2.6.19.orig/arch/i386/kernel/tsc.c
+++ linux-2.6.19/arch/i386/kernel/tsc.c
@@ -325,7 +325,7 @@ static int tsc_update_callback(void)
/* check to see if we should switch to the safe clocksource: */
if (clocksource_tsc.rating != 0 && check_tsc_unstable()) {
clocksource_tsc.rating = 0;
-   clocksource_reselect();
+   clocksource_rating_change(&clocksource_tsc);
change = 1;
}
 
Index: linux-2.6.19/arch/x86_64/kernel/tsc.c
===
--- linux-2.6.19.orig/arch/x86_64/kernel/tsc.c
+++ linux-2.6.19/arch/x86_64/kernel/tsc.c
@@ -229,7 +229,7 @@ static int tsc_update_callback(void)
/* check to see if we should switch to the safe clocksource: */
if (clocksource_tsc.rating != 50 && check_tsc_unstable()) {
clocksource_tsc.rating = 50;
-   clocksource_reselect();
+   clocksource_rating_change(&clocksource_tsc);
change = 1;
}
return change;
Index: linux-2.6.19/include/linux/clocksource.h
===
--- linux-2.6.19.orig/include/linux/clocksource.h
+++ linux-2.6.19/include/linux/clocksource.h
@@ -12,6 +12,8 @@
 #include 
 #include 
 #include 
+#include 
+
 #include 
 #include 
 
@@ -189,9 +191,9 @@ static inline void clocksource_calculate
 
 
 /* used to install a new clocksource */
-int clocksource_register(struct clocksource*);
-void clocksource_reselect(void);
-struct clocksource* clocksource_get_next(void);
+extern struct clocksource *clocksource_get_next(void);
+extern int clocksource_register(struct clocksource*);
+extern void clocksource_rating_change(struct clocksource*);
 
 #ifdef CONFIG_GENERIC_TIME_VSYSCALL
 extern void update_vsyscall(struct timespec *ts, struct clocksource *c);
Index: linux-2.6.19/kernel/time/clocksource.c
===
--- linux-2.6.19.orig/kernel/time/clocksource.c
+++ linux-2.6.19/kernel/time/clocksource.c
@@ -35,7 +35,7 @@
  * next_clocksource:
  * pending next selected clocksource.
  * clocksource_list:
- * linked list with the registered clocksources
+ * rating sorted linked list with the registered clocksources
  * clocksource_lock:
  * protects manipulations to curr_clocksource and next_clocksource
  * and the clocksource_list
@@ -80,69 +80,105 @@ struct clocksource *clocksource_get_next
 }
 
 /**
- * select_clocksource - Finds the best registered clocksource.
+ * __is_registered - Returns a clocksource if it's registered
+ * @name:  name of the clocksource to return
  *
  * Private function. Must hold clocksource_lock when called.
  *
- * Looks through the list of registered clocksources, returning
- * the one with the highest rating value. If there is a clocksource
- * name that matches the override string, it returns that clocksource.
+ * Returns the clocksource if registered, zero otherwise.
+ * If no clocksources are registered the jiffies clock is
+ * returned.
  */
-static struct clocksource *select_clocksource(void)
+static struct clocksource * __is_registered(char * name)
 {
-   struct clocksource *best = NULL;
struct list_head *tmp;
 
list_for_each(tmp, &clocksource_list) {
struct clocksource *src;
 
src = list_entry(tmp, struct clocksource, list);
-   if (!best)
-   best = src;
-
-   /* check for override: */
-   if (strlen(src->name) == strlen(override_name) &&
-   !strcmp(src->name, override_name)) {
-   best = src;
-   break;
-   }
-   /* pick the highest rating: */
-   if (src->rating > best->rating)
-   best = src;
+   if (!strcmp(src->name, name))
+   return src;
}
 
-   return best;
+   return 0;
 }
 
 /**
- * is_registered_source - Checks if clocksource is registered
- * @c: pointer to a clocksource
+ * __get_clock - Finds a specific clocksource
+ * @name:  name of the clocksource to return
  *
- *

Re: [PATCH] mm: remove global locks from mm/highmem.c

2007-01-30 Thread David Chinner

On Tue, Jan 30, 2007 at 05:11:32PM -0800, Andrew Morton wrote:
> On Wed, 31 Jan 2007 11:44:36 +1100
> David Chinner <[EMAIL PROTECTED]> wrote:
> 
> > On Mon, Jan 29, 2007 at 06:15:57PM -0800, Andrew Morton wrote:
> > > We still don't know what is the source of kmap() activity which
> > > necessitated this patch btw.  AFAIK the busiest source is ext2
> > > directories, but perhaps NFS under certain conditions?
> > > 
> > > 
> > > 
> > > ->prepare_write no longer requires that the caller kmap the page.
> > 
> > Agreed, but don't we (xfs_iozero) have to map it first to zero it?
> > 
> > I think what you are saying here, Andrew, is that we can
> > do something like:
> > 
> > page = grab_cache_page
> > ->prepare_write(page)
> > kaddr = kmap_atomic(page, KM_USER0)
> > memset(kaddr+offset, 0, bytes)
> > flush_dcache_page(page)
> > kunmap_atomic(kaddr, KM_USER0)
> > ->commit_write(page)
> > 
> > to avoid using kmap() altogether?
> 
> Yup.  Even better, use clear_highpage().

For even more goodness, memclear_highpage_flush() does exactly
the right thing for partial page zeroing ;)

Thanks, Andrew, I've added a patch to my QA tree with this mod.

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

Re: [PATCH] AGPGART compat ioctl

2007-01-30 Thread Kyle McMartin

On Sat, Jan 27, 2007 at 07:28:07PM -0800, Zwane Mwaikambo wrote:
> Hi Dave,
>   The following video card requires the agpgart driver ioctl 
> interface in order to detect video memory.
> 

Tested with testgart.c on parisc64, seems to work alright. Thanks for
doing this work, Zwane. I've been meaning to do compat_ioctl for
agpgart for months.

Cheers,
Kyle M.

Re: [PATCH 0 of 4] Generic AIO by scheduling stacks

2007-01-30 Thread Linus Torvalds

On Tue, 30 Jan 2007, Linus Torvalds wrote:
> 
> Does that mean that we might not have some cases where we'd need to make 
> sure we do things differently? Of course not. Something might show up. But 
> this actually makes it very clear what the difference between "struct 
> thread_struct" and "struct task_struct" is. One is shared between 
> fibrils, the other isn't.

Btw, this is also something where we should just disallow certain system 
calls from being done through the asynchronous method. 

Notably, clone/fork(), execve() and exit() are all things that we probably 
simply shouldn't allow as "AIO" events.

The process handling ones are obvious: they are very much about the shared 
"struct task_struct", so they rather clearly should only be done "natively".

More interesting is the question about "close()", though. Currently we 
have an optimization (fget/fput_light) that basically boils down to "we 
know we are the only owners". That optimization becomes more "interesting" 
with AIO - we need to disable it when fibrils are active (because other 
fibrils or the main thread can do it), but we can still keep it for the 
non-fibril case.

So we can certainly allow close() to happen in a fibril, but at the same 
time, this is an area where just the *existence* of fibrils means that 
certain other decisions that were thread-related may be modified to be 
aware of the micro-threads too.

I suspect there are rather few of those, though. The only one I can think 
of is literally the fget/fput_light() case, but there could be others.

Linus

[patch] some scripts: replace gawk, head, bc with shell, update

2007-01-30 Thread Oleg Verych

scripts: replace gawk, head, bc with shell, update

  Remove the overhead of calling some (external) programs
  by using good old `sh' instead.

Cc: Roman Zippel <[EMAIL PROTECTED]>
Cc: Sam Ravnborg <[EMAIL PROTECTED]>
Cc: William Stearns <[EMAIL PROTECTED]>
Cc: Martin Schlemmer <[EMAIL PROTECTED]>
Signed-off-by: Oleg Verych <[EMAIL PROTECTED]>
---
-o--=O`C
 #oo'L O
<___=E M

 scripts/gen_initramfs_list.sh | 44 +-
 scripts/makelst   | 34 
 2 files changed, 40 insertions(+), 38 deletions(-)

--- linux-2.6.20-rc6/scripts/makelst~4-update-gawk-bc-head-rip  2007-01-12 19:54:26.0 +0100
+++ linux-2.6.20-rc6/scripts/makelst 2007-01-31 03:02:53.433642000 +0100
@@ -1,31 +1,31 @@
-#!/bin/bash
+#!/bin/sh
 # A script to dump mixed source code & assembly
 # with correct relocations from System.map
-# Requires the following lines in Rules.make.
-# Author(s): DJ Barrow ([EMAIL PROTECTED],[EMAIL PROTECTED]) 
-#William Stearns <[EMAIL PROTECTED]>
+# Requires the following lines in makefile:
 #%.lst: %.c
 #  $(CC) $(CFLAGS) $(EXTRA_CFLAGS) $(CFLAGS_$@) -g -c -o $*.o $<
-#  $(TOPDIR)/scripts/makelst $*.o $(TOPDIR)/System.map $(OBJDUMP)
+#  $(srctree)/scripts/makelst $*.o $(objtree)/System.map $(OBJDUMP)
 #
-#Copyright (C) 2000 IBM Corporation
-#Author(s): DJ Barrow ([EMAIL PROTECTED],[EMAIL PROTECTED]) 
+# Copyright (C) 2000 IBM Corporation
+# Author(s): DJ Barrow ([EMAIL PROTECTED],[EMAIL PROTECTED])
+#William Stearns <[EMAIL PROTECTED]>
 #
 
-t1=`$3 --syms $1 | grep .text | grep " F " | head -n 1`
+# awk style field access
+field() {
+  shift $1 ; echo $1
+}
+
+t1=`$3 --syms $1 | grep .text | grep -m1 " F "`
 if [ -n "$t1" ]; then
-  t2=`echo $t1 | gawk '{ print $6 }'`
+  t2=`field 6 $t1`
   if [ ! -r $2 ]; then
 echo "No System.map" >&2
-t7=0
   else
 t3=`grep $t2 $2`
-t4=`echo $t3 | gawk '{ print $1 }'`
-t5=`echo $t1 | gawk '{ print $1 }'`
-t6=`echo $t4 - $t5 | tr a-f A-F`
-t7=`( echo  ibase=16 ; echo $t6 ) | bc`
+t4=`field 1 $t3`
+t5=`field 1 $t1`
+t6=`printf "%lu" $((0x$t4 - 0x$t5))`
   fi
-else
-  t7=0
 fi
-$3 -r --source --adjust-vma=$t7 $1
+$3 -r --source --adjust-vma=${t6:-0} $1
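
The two idioms this hunk relies on can be tried in isolation. A minimal
sketch, assuming a POSIX sh; the sample objdump and System.map lines and the
helper name vma_offset are made up for illustration:

```sh
#!/bin/sh
# awk style field access, as in the patch: the first argument is the
# field number, and "shift N" drops it together with the N-1 fields
# before the wanted one, so the wanted field ends up in $1.
field() {
	shift $1
	echo $1
}

# Hypothetical helper: difference of two hex addresses, printed in
# decimal, standing in for the old "tr a-f A-F | bc" pipeline.
vma_offset() {
	printf "%lu\n" $((0x$1 - 0x$2))
}

# Made-up sample lines in "objdump --syms" and System.map layout.
sym="00000010 g F .text 0000002a my_func"
map="00100010 T my_func"

field 6 $sym                                 # prints: my_func
vma_offset `field 1 $map` `field 1 $sym`     # prints: 1048576
```

The unquoted $sym and $map are deliberate: word splitting is what turns one
line into separate fields.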


--- linux-2.6.20-rc6/scripts/gen_initramfs_list.sh~4gawk-rip 2007-01-12 19:54:26.0 +0100
+++ linux-2.6.20-rc6/scripts/gen_initramfs_list.sh  2007-01-31 03:02:25.847918000 +0100
@@ -1,5 +1,5 @@
 #!/bin/bash
 # Copyright (C) Martin Schlemmer <[EMAIL PROTECTED]>
-# Copyright (c) 2006   Sam Ravnborg <[EMAIL PROTECTED]>
+# Copyright (C) 2006 Sam Ravnborg <[EMAIL PROTECTED]>
 #
 # Released under the terms of the GNU GPL
@@ -18,13 +18,13 @@ Usage:
 $0 [-o ] [-u ] [-g ] {-d | } ...
-o   Create gzipped initramfs file named  using
-  gen_init_cpio and gzip
+  gen_init_cpio and gzip
-uUser ID to map to user ID 0 (root).
-   is only meaningful if 
-  is a directory.
+   is only meaningful if 
+  is a directory.
-gGroup ID to map to group ID 0 (root).
-   is only meaningful if 
-  is a directory.
+   is only meaningful if 
+  is a directory.
  File list or directory for cpio archive.
-  If  is a .cpio file it will be used
+  If  is a .cpio file it will be used
   as direct input to initramfs.
-d Output the default cpio list.
@@ -37,4 +37,11 @@ EOF
 }
 
+# awk style field access
+# $1 - field number; rest is argument string
+field() {
+   shift $1
+   echo $1
+}
+
 list_default_initramfs() {
# echo usr/kinit/kinit
@@ -120,23 +127,18 @@ parse() {
;;
"nod")
-   local dev_type=
-   local maj=$(LC_ALL=C ls -l "${location}" | \
-   gawk '{sub(/,/, "", $5); print $5}')
-   local min=$(LC_ALL=C ls -l "${location}" | \
-   gawk '{print $6}')
-
-   if [ -b "${location}" ]; then
-   dev_type="b"
-   else
-   dev_type="c"
-   fi
-   str="${ftype} ${name} ${str} ${dev_type} ${maj} ${min}"
+   local dev=`LC_ALL=C ls -l "${location}"`
+   local maj=`field 5 ${dev}`
+   local min=`field 6 ${dev}`
+   maj=${maj%,}
+
+   [ -b "${location}" ] && dev="b" || dev="c"
+
+   str="${ftype} ${name} ${str} ${dev} ${maj} ${min}"
;;
"slink")
-   local
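
The new "nod" branch can be exercised without a real device node. A sketch,
assuming the usual `ls -l` layout for device files; the sample line and the
helper name dev_numbers are illustrative only:

```sh
#!/bin/sh
# awk style field access, as defined in the patch.
field() {
	shift $1
	echo $1
}

# Hypothetical helper: extract "major minor" from one "ls -l" line of a
# device node. Field 5 is the major number with a trailing comma,
# field 6 is the minor number.
dev_numbers() {
	maj=`field 5 $1`
	min=`field 6 $1`
	maj=${maj%,}        # strip the trailing comma, as the patch does
	echo "$maj $min"
}

# Sample line in the layout "LC_ALL=C ls -l /dev/null" typically gives.
dev_numbers "crw-rw-rw- 1 root root 1, 3 Jan 30 2007 /dev/null"   # prints: 1 3
```

The ${maj%,} expansion does the job of the old gawk sub(/,/, "", $5) call
without starting another process.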

Re: [PATCH 0 of 4] Generic AIO by scheduling stacks

2007-01-30 Thread Linus Torvalds

On Wed, 31 Jan 2007, Benjamin Herrenschmidt wrote:

> > - We would now have some measure of task_struct concurrency.  Read that
> > twice, it's scary.  As two fibrils execute and block in turn they'll each
> > be referencing current->.  It means that we need to audit task_struct to
> > make sure that paths can handle racing as its scheduled away.  The
> > current implementation *does not* let preemption trigger a fibril
> > switch.  So one only has to worry about racing with voluntary scheduling
> > of the fibril paths.  This can mean moving some task_struct members
> > under an accessor that hides them in a struct in task_struct so they're
> > switched along with the fibril.  I think this is a manageable burden.
> 
> That's the one scaring me in fact ... Maybe it will end up being an easy
> one but I don't feel too comfortable...

We actually have almost zero "interesting" data in the task-struct.

All the real meat of a task has long since been split up into structures 
that can be shared for threading anyway (ie signal/files/mm/etc).

Which is why I'm personally very comfy with just re-using task_struct 
as-is.

NOTE! This is with the understanding that we *never* do any preemption. 
The whole point of the microthreading as far as I'm concerned is exactly 
that it is cooperative. It's not preemptive, and it's emphatically *not* 
concurrent (ie you'd never have two fibrils running at the same time on 
separate CPU's).

If you want preemptive or concurrent CPU usage, you use separate threads. 
The point of AIO scheduling is very much inherent in its name: it's for 
filling up CPU's when there's IO.

So the theory (and largely practice) is that you want to use real threads 
to fill your CPU's, but then *within* those threads you use AIO to make 
sure that each thread actually uses the CPU efficiently and doesn't just 
block with nothing to do.

So with the understanding that this is neither CPU-concurrent nor 
preemptive (*within* a fibril group - obviously the thread itself gets 
both preempted and concurrently run with other threads), I don't worry at 
all about sharing "struct task_struct".

Does that mean that we might not have some cases where we'd need to make 
sure we do things differently? Of course not. Something might show up. But 
this actually makes it very clear what the difference between "struct 
thread_struct" and "struct task_struct" is. One is shared between 
fibrils, the other isn't.

Linus

Re: How many people are using 2.6.16?

2007-01-30 Thread Linus Torvalds

On Tue, 30 Jan 2007, Mark Lord wrote:
> 
> I believe our featherless leader said he thought it was an ancient bug,
> exacerbated by something that went into 2.6.19.
> 
> If Linus's opinion is correct (still?), then the bug exists in all
> kernels since somewhere back in the 2.4.xx days.

The issue was somewhat confused by people certainly *reporting* it for 
older kernels. Also, as part of the dirty bit cleanups and sanity 
checking we did actually seem to fix a long-standing CIFS corruption (and 
apparently reiserfs/XFS problems too).

But the *common* case was actually introduced with 2.6.19, and 2.6.16 
wouldn't be affected. 

Linus

Re: [PATCH] VIA IRQ quirk breakage fix

2007-01-30 Thread Andrew Morton

On Tue, 30 Jan 2007 13:25:58 +0100
Jean Delvare <[EMAIL PROTECTED]> wrote:

> So here comes the third
> (and hopefully last) iteration of the patch:

argh, it looks like I sent v2 to Linus.

Here's the missing bit.  Please confirm that we want it?


From: Jean Delvare <[EMAIL PROTECTED]>

Add special handling for the VT82C686.

Signed-off-by: Jean Delvare <[EMAIL PROTECTED]>
Cc: Alan Cox <[EMAIL PROTECTED]>
Cc: Nick Piggin <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
---

 drivers/pci/quirks.c |8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff -puN drivers/pci/quirks.c~via-quirk-fix-update drivers/pci/quirks.c
--- a/drivers/pci/quirks.c~via-quirk-fix-update
+++ a/drivers/pci/quirks.c
@@ -661,9 +661,11 @@ static void quirk_via_bridge(struct pci_
/* See what bridge we have and find the device ranges */
switch (dev->device) {
case PCI_DEVICE_ID_VIA_82C686:
-   /* 82C686 is special */
-   via_vlink_dev_lo = 7;
-   via_vlink_dev_hi = 7;
+   /* The VT82C686 is special, it attaches to PCI and can have
+  any device number. All its subdevices are functions of
+  that single device. */
+   via_vlink_dev_lo = PCI_SLOT(dev->devfn);
+   via_vlink_dev_hi = PCI_SLOT(dev->devfn);
break;
case PCI_DEVICE_ID_VIA_8237:
case PCI_DEVICE_ID_VIA_8237A:
_
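
For reference, the device number the patch now reads comes from the upper
bits of devfn. A sketch in shell arithmetic that mirrors the kernel's
PCI_SLOT() and PCI_FUNC() macros; the function names pci_slot and pci_func
and the sample devfn values are illustrative:

```sh
#!/bin/sh
# devfn packs a 5-bit device (slot) number and a 3-bit function number.
pci_slot() { echo $(( ($1 >> 3) & 0x1f )); }
pci_func() { echo $(( $1 & 0x07 )); }

# A bridge at device 7, function 0 (the value the old code hard-coded):
pci_slot 0x38    # prints: 7
pci_func 0x38    # prints: 0

# A device 4, function 2 address lies outside the old fixed range 7..7:
pci_slot 0x22    # prints: 4
pci_func 0x22    # prints: 2
```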


Re: Hidden SSID's

2007-01-30 Thread Larry Finger

Jouni Malinen wrote:
> On Tue, Jan 30, 2007 at 01:08:29AM -0600, Larry Finger wrote:
> 
>> Any AP with a hidden SSID will only respond to probe requests that specify
>> its SSID, and will ignore any other probes. In addition, the response will
>> have an empty SSID field. These responses are the only ones in which a
>> substitution would occur. These are the same responses where the current
>> code sends back the "" pseudo-SSID. My change would put the correct one
>> there.
> 
> Is the SSID from the probe response really used here? Your patch did not
> look like that.. The SSID from the last scan request command may not be
> the one that triggered the last scan (e.g., one could request a new scan
> without specifying an SSID).

If one does the equivalent of 'iwlist eth1 scan essid myssid', then a probe
response with NETWORK_EMPTY_ESSID set in the network flags will have 'myssid'
returned in the SSID field of the returned buffer. If the input command were
'iwlist eth1 scan', then an empty SSID would be returned under the same
circumstances. My code saves the SSID that is in the extra argument of the
SIOCSIWSCAN call, and uses that in the SIOCGIWSCAN call.
> 
>> We aren't guessing. The response frame with the empty SSID field must have
>> come from the AP with the SSID we want. Filling in the expected value is
>> just making it easier for the user-space tools.
> 
> I don't see how the proposed patch would be using the correct SSID value
> in all cases. Especially cases where there are multiple APs using hidden
> SSIDs, but with different real SSID values and cases where multiple scan
> requests are being processed would be likely to leave windows open for
> reporting incorrect SSID.

I can think of one instance where the wrong value could be reported. That is
if some other STA probes a different hidden AP just when we have sent a probe
request. For WPA this should not cause a problem as wpa_supplicant will sort
that out while authenticating.

What is the method that should be used to associate with a given hidden AP?

Larry





Re: Free Linux Driver Development!

2007-01-30 Thread Michael K. Edwards


On 1/29/07, Greg KH <[EMAIL PROTECTED]> wrote:

Free Linux Driver Development!

Yes, that's right, the Linux kernel community is offering all companies
free Linux driver development.  ...

[snip]

[1] for the CPUs that support the bus types that your device works on.


Bravo!  Now, is there a message in the same spirit that can be
tailored to embedded space, especially to SoC vendors and (even more
importantly) their customers?  Something along the lines of:

"We understand that embedded hardware is frequently buggy and that SoC
vendors are doing well if their own internal software people can get
enough help from the chip guys to bring up enough customer-driven use
cases to win a few design-ins.

We sympathize with embedded developers who stay up nights with an
O-scope and a JTAG emulator reverse-engineering the hardware behavior,
trying to figure out why this order of operations works and this
other one doesn't.

We have the software tools and the competence to quantify the
potential gains from current toolchains and kernels, aggressive
compilation options, and in-tree power/latency management strategies,
so that you can build a business case against "fire and forget" SDKs
based on ancient compilers, obsolete kernels, and unmaintained
out-of-tree patches.

We will help platform integrators bridge the gap between the chip
architects' claims about device performance and the condition in which
the BSP guys toss drivers over the fence.

You can hang onto the hardware and profit from coaching and code
review, or you can send us a board and whatever doco you've got, and
we'll figure it out.

All we ask is that 1) SoC vendors authorize customers to do an NDA
with OSDL and pass vendor NDA material along to us; 2) when the
product ships, all participants are free to exercise GPL rights with
respect to the chip support and driver code produced; and 3) platform
integrators cooperate with the rework usually needed as code merges
towards Linus's tree."

Or is this a pipe dream?

Cheers,
- Michael

Re: O_DIRECT question

2007-01-30 Thread Andrea Arcangeli

On Tue, Jan 30, 2007 at 06:07:14PM -0500, Phillip Susi wrote:
> It most certainly matters where the error happened because "you are 
> screwd" is not an acceptable outcome in a mission critical application. 

An I/O error is not an acceptable outcome in a mission critical app,
all mission critical setups should be fault tolerant, so if raid
cannot recover at the first sign of error the whole system should
instantly go down and let the secondary take over from it. See slony
etc...

Trying to recover the recoverable by mucking with the data, making even
_more_ writes on a failing disk before doing a physical mirror image of
the disk (the readable part), isn't a good idea IMHO. At best you could
retry writing on the same sector hoping somebody disconnected the scsi
cable by mistake.

>  A well engineered solution will deal with errors as best as possible, 
> not simply give up and tell the user they are screwed because the 
> designer was lazy.  There is a reason that read and write return the 
> number of bytes _actually_ transferred, and the application is supposed 
> to check that result to verify proper operation.

You can track the range where it happened with fsync too, as I said in
my previous email, and you can take the big database lock and then
read-write read-write every single block in that range until you find
the failing place if you really want to. read-write in place should be
safe.

> No, there is a slight difference.  An fsync() flushes all dirty buffers 
> in an undefined order.  Using O_DIRECT or O_SYNC, you can control the 
> flush order because you can simply wait for one set of writes to 
> complete before starting another set that must not be written until 
> after the first are on the disk.  You can emulate that by placing an 
> fsync between both sets of writes, but that will flush any other dirty 

Doing fsync after every write will provide the same ordering
guarantee as O_SYNC; I thought it was obvious what I meant here.

The whole point is that most of the time you don't need it, you need
an fsync after a couple of writes. All smtp servers use fsync for the
same reason, they also have to journal their writes to avoid losing
email when there is a power loss.
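
The "fsync after a couple of writes" pattern can be sketched roughly as
below. This is only an illustration under stated assumptions: the helper
names write_all() and ordered_commit() are made up for the example and do
not come from any mail server or database. It also checks the byte count
returned by write(), as the quoted text says an application should.

```c
#include <errno.h>
#include <unistd.h>

/* Write the whole buffer, retrying on short writes and EINTR.
 * Returns 0 on success, -1 on error (errno is set by write()). */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted, try again */
            return -1;
        }
        p += n;                 /* advance past the bytes actually written */
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper: flush the "journal" group before the "data"
 * group is written.  The fsync() between the groups is the ordering
 * barrier; note that it flushes every dirty buffer of the file. */
int ordered_commit(int fd, const void *journal, size_t jlen,
                   const void *data, size_t dlen)
{
    if (write_all(fd, journal, jlen) < 0)
        return -1;
    if (fsync(fd) < 0)          /* first group flushed before continuing */
        return -1;
    if (write_all(fd, data, dlen) < 0)
        return -1;
    return fsync(fd);           /* flush the second group */
}
```

One fsync() per group, rather than O_SYNC on every write, is what keeps
the number of synchronous flushes down in this sketch.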

If you use writev or aio pwrite you can do well with O_SYNC too though.

> buffers whose ordering you do not care about.  Also there is no aio 
> version of fsync.

please have a second look at aio_abi.h:

IOCB_CMD_FSYNC = 2,
IOCB_CMD_FDSYNC = 3,

there must be a reason why they exist, right?

> sync has no effect on reading, so that test is pointless.  direct saves 
> the cpu overhead of the buffer copy, but isn't good if the cache isn't 
> entirely cold.  The large buffer size really has little to do with it, 

direct bypasses the cache so the cache is freezing not just cold.

> rather it is the fact that the writes to null do not block dd from 
> making the next read for any length of time.  If dd were blocking on an 
> actual output device, that would leave the input device idle for the 
> portion of the time that dd were blocked.

The objective was to measure the pipeline stall; if you stall it for
another reason anyway, what's the point?

> In any case, this is a totally different example than your previous one 
> which had dd _writing_ to a disk, where it would block for long periods 
> of time due to O_SYNC, thereby preventing it from reading from the input 
> buffer in a timely manner.  By not reading the input pipe frequently, it 
> becomes full and thus, tar blocks.  In that case the large buffer size 
> is actually a detriment because with a smaller buffer size, dd would not 
> be blocked as long and so it could empty the pipe more frequently 
> allowing tar to block less.

It would run slower with a smaller buffer size because it would block
too, and it would read and write slower as well. For my backup usage
keeping tar blocked is actually a feature, since it decreases the load
of the backup. What matters to me is the MB/sec of the writes and the
MB/sec of the reads (to lower the load); I don't care too much about
how long it takes as long as things run as efficiently as possible
when they run. The rate limiting effect of the blocking isn't a
problem to me.

> You seem to have missed the point of this thread.  Denis Vlasenko's 
> message that you replied to simply pointed out that they are 
> semantically equivalent, so O_DIRECT can be dropped provided that O_SYNC 
> + madvise could be fixed to perform as well.  Several people including 
> Linus seem to like this idea and think it is quite possible.

I replied to that email to point out the fundamental differences
between O_SYNC and O_DIRECT. If you don't like what I said I'm sorry,
but that's how things work today and I don't see that as quite
possible to change (unless of course we remove performance from the
equation, then indeed they'll be much the same).

Perhaps an IOCB_CMD_PREADAHEAD plus MAP_SHARED backed by largepages
loaded with a new syscall that reads a piece at

Re: [Ksummit-2007-discuss] Re: [Ksummit-2006-discuss] 2007 Linux Kernel Summit

2007-01-30 Thread Jes Sorensen


Matt Domsch wrote:

As one who regularly fills a sponsor slot (though I have also gotten
an invitation on merit in the past), I don't believe the sponsor slot
people detract from the sessions.  Most of the time we keep quiet,
occasionally offering our insights or challenges.  Jonathan's writeups
are fantastic, but it doesn't really compare with being there and
participating in discussions, either hallway or main room.  Besides
consuming oxygen, what's the real concern here?


Hi Matt,

I don't think sponsor slots per se are damaging; the problem is that
they take up a seat, combined with this fanatical 'we must only allow
our favorite 80 elite people into the room' idea. In this situation
sponsor slots are costly and often a waste at the technical level. The
same goes for having 12 committee members for an 80 seat summit, but
nobody seems to like to talk about that issue :)

Cheers,
Jes

Re: [patch] use __u64 rather than u64 in parisc statfs structs

2007-01-30 Thread Kyle McMartin

On Sun, Jan 28, 2007 at 06:48:26PM -0500, Mike Frysinger wrote:
> the statfs header exports some structs to userspace ... the parisc statfs64 
> struct currently uses u64 so the trivial attached patch fixes it to use __u64
> -mike

ack'd and merged. can you please not attach patches but properly send them
inline so i don't have to edit them before applying them to my tree?

Re: Free Linux Driver Development!

2007-01-30 Thread Adrian Bunk

On Tue, Jan 30, 2007 at 05:24:28PM -0800, Greg KH wrote:
> On Wed, Jan 31, 2007 at 02:13:40AM +0100, Adrian Bunk wrote:
> > On Tue, Jan 30, 2007 at 11:10:20AM -0800, Greg KH wrote:
> > > On Tue, Jan 30, 2007 at 09:45:50AM -0800, Roland Dreier wrote:
> > >...
> > > > And there are plenty of documented devices that no one cares enough
> > > > about to submit a driver for.
> > > 
> > > Any specific examples?  I have a long list of people who wish to write
> > > new drivers but just don't know which hardware is not yet supported.
> > >...
> > 
> > Writing a driver for shiny new hardware is cool.
> > 
> > But understanding and maintaining an already existing driver and working 
> > on bug reports for this driver is something not-so-cool that would be 
> > required in many areas of the kernel.
> > 
> > Would someone from your long list of people e.g. be willing to maintain 
> > drivers/block/floppy.c ?
> 
> What?  Throw a fresh-faced newbie instantly into the tar-pit of despair
> that floppy.c is?  Do you want everyone just to run screaming from
> kernel development never to be seen again?
> 
> :)

Unlike the ISA drivers example, at least everyone has the 
hardware... ;-)

> Seriously, if you need help with something like this, bring it up on the
> kernel-janitors list, there are lots of people there that are willing to
> help out with stuff like long-term maintenance and bug fixing but don't
> know where to start.
> 
> That's also where the majority of the people who have volunteered to
> help are also hanging out at.

The idea of some kind of task list already appeared in this thread - 
this might be the best way to publish and track such issues.

> thanks,
> 
> greg k-h

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed


Re: Free Linux Driver Development!

2007-01-30 Thread Adrian Bunk

On Wed, Jan 31, 2007 at 11:19:15AM +1000, Trent Waddington wrote:
> On 1/31/07, Adrian Bunk <[EMAIL PROTECTED]> wrote:
> >Would someone from your long list of people e.g. be willing to maintain
> >drivers/block/floppy.c ?
> 
> I have a floppy drive!  Will have to go buy some disks though.  What's
> wrong with it?

There isn't anything specifically wrong.

It's simply that the last time someone completely understood this 120 kB 
driver was in the last millennium.

That comes up every few months when some bug report arrives or in the 
cases when a patch breaks the floppy driver.

> Trent

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed


Re: Free Linux Driver Development!

2007-01-30 Thread Bartlomiej Zolnierkiewicz


On 1/31/07, Greg KH <[EMAIL PROTECTED]> wrote:

On Wed, Jan 31, 2007 at 02:13:40AM +0100, Adrian Bunk wrote:
> On Tue, Jan 30, 2007 at 11:10:20AM -0800, Greg KH wrote:
> > On Tue, Jan 30, 2007 at 09:45:50AM -0800, Roland Dreier wrote:
> >...
> > > And there are plenty of documented devices that no one cares enough
> > > about to submit a driver for.
> >
> > Any specific examples?  I have a long list of people who wish to write
> > new drivers but just don't know which hardware is not yet supported.
> >...
>
> Writing a driver for shiny new hardware is cool.
>
> But understanding and maintaining an already existing driver and working
> on bug reports for this driver is something not-so-cool that would be
> required in many areas of the kernel.
>
> Would someone from your long list of people e.g. be willing to maintain
> drivers/block/floppy.c ?

What?  Throw a fresh-faced newbie instantly into the tar-pit of despair
that floppy.c is?  Do you want everyone just to run screaming from
kernel development never to be seen again?

:)

Seriously, if you need help with something like this, bring it up on the
kernel-janitors list, there are lots of people there that are willing to
help out with stuff like long-term maintenance and bug fixing but don't
know where to start.


http://bugzilla.kernel.org

:)

Bart

Re: [PATCH 0 of 4] Generic AIO by scheduling stacks

2007-01-30 Thread Benjamin Herrenschmidt

On Tue, 2007-01-30 at 15:45 -0800, Zach Brown wrote:
> > Btw, I noticed that you didn't Cc Ingo. Definitely worth doing. Not just
> > because he's basically the normal scheduler maintainer, but also because
> > he's historically been involved in things like the async filename lookup
> > that the in-kernel web server thing used.
> 
> Yeah, that was dumb.  I had him in the cc: initially, then thought it  
> was too large and lobbed a bunch off.  My mistake.
> 
> Ingo, I'm interested in your reaction to the i386-specific mechanics  
> here (the thread_info copies terrify me) and the general notion of  
> how to tie this cleanly into the scheduler.

Thread info copies aren't such a big deal, we do that for irq stacks
already afaik

Ben.



Re: [PATCH 0 of 4] Generic AIO by scheduling stacks

2007-01-30 Thread Benjamin Herrenschmidt

> - We would now have some measure of task_struct concurrency.  Read that twice,
> it's scary.  As two fibrils execute and block in turn they'll each be
> referencing current->.  It means that we need to audit task_struct to make sure
> that paths can handle racing as its scheduled away.  The current implementation
> *does not* let preemption trigger a fibril switch.  So one only has to worry
> about racing with voluntary scheduling of the fibril paths.  This can mean
> moving some task_struct members under an accessor that hides them in a struct
> in task_struct so they're switched along with the fibril.  I think this is a
> manageable burden.

That's the one scaring me in fact ... Maybe it will end up being an easy
one but I don't feel too comfortable... we didn't create fibril-like
things for threads; instead, we share PIDs between tasks. I wonder if
the sane approach would be to actually create task structs (or have a
pool of them pre-created sitting there for performance) and add a way
to share the necessary bits so that syscalls can be run on those
spin-offs.

Ben.



Re: [PATCH] pipefs unique inode numbers

2007-01-30 Thread Jeff Layton

Linus Torvalds wrote:
>
> On Tue, 30 Jan 2007, Jeff Layton wrote:
>> Also, that patch would break many 32-bit programs not compiled with large
>> offsets when run in compatibility mode on a 64-bit kernel. If they were to
>> do a stat on this inode, it would likely generate an EOVERFLOW error since
>> the pointer address would probably not fit in a 32 bit field.
>>
>> That problem was the whole impetus for this set of patches.
>
> Well, we have that problem with the slowly incrementing "last_ino" too.
>
> Should we make "last_ino" be "static unsigned int" instead of "long"?
>
> Does anybody actually even use the old stat() with 32-bit interfaces? We
> warn for it, and we've done so for a long time.. I don't remember people
> even complaining about the warning, so..
>
>Linus

I've actually sent Andrew a patch that does that and the same thing to
the counter in iunique as well. It's in -mm now, but I think it's pretty
safe and can probably go into your tree any time you're ready for it.

It's been quite a while since I looked at the original problem, but I
believe glibc actually uses stat64 to make the call, so it doesn't
throw the warning. The EOVERFLOW comes from glibc when it gets back an
st_ino value that won't fit in the 32 bit buffer provided by the program.

Obviously, we can't do anything for filesystems with permanent inodes larger
than 32 bits, but when generating them on the fly via new_inode or iunique,
we ought to try and have them fit in 32 bits if possible (at least as long
as we can).
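
The overflow described above can be sketched in a few lines of user-space
C. This is an illustration only, not glibc or kernel code: fit_ino32() and
next_ino32() are hypothetical names, and the second function merely mirrors
the idea of keeping the on-the-fly counter in an unsigned int.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the narrowing a 32-bit stat buffer forces:
 * copy a 64-bit inode number into a 32-bit field, or fail with
 * -EOVERFLOW when the value would be truncated. */
int fit_ino32(uint64_t ino, uint32_t *out)
{
    if (ino > UINT32_MAX)
        return -EOVERFLOW;
    *out = (uint32_t)ino;
    return 0;
}

/* Sketch of a 32-bit counter for generated inode numbers: every value
 * fits in a 32-bit st_ino, at the price of eventually wrapping around
 * and repeating. */
uint32_t next_ino32(uint32_t *counter)
{
    return ++*counter;
}
```

With a 32-bit counter the first function can never fail for generated
numbers; the trade-off is that uniqueness is no longer guaranteed after
the counter wraps.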

-- Jeff

Re: How many people are using 2.6.16?

2007-01-30 Thread Mike Houston

On Wed, 31 Jan 2007 00:52:15 +0100
Adrian Bunk <[EMAIL PROTECTED]> wrote:

> On Mon, Jan 29, 2007 at 04:04:48PM -0500, Mike Houston wrote:

> > I've been using Adrian's 2.6.16 kernel releases on two internet
> > servers that I look after remotely. One of them is RHEL 4 the
> > other is Fedora Core 2 (Ensim Webppliance). I'm especially wary
> > of breaking RHEL 4, and the 2.6.16.xx kernels work perfectly
> > except for the hald not starting (but that doesn't matter on that
> > server).
> >...
> 
> I haven't heard of this before, and in a quick test hald from
> HAL 0.5.8.1 starts fine here.
> 
> Are there any error messages?
>

I think I recall hearing about hald breaking in rhel4 with modern
kernels here on this list, some time before I built a custom kernel
for that rig (Athlon 64 3200+ on Asus A8V w' VIA K8T800Pro chipset
running 32 bit RHEL 4 ES). I was expecting it to happen. I think at
the time, the current kernel was 2.6.15.2 or thereabouts.

However, I haven't upgraded any software that's in the distro packages
beyond what up2date provides, so:

$ rpm -qa | grep hal
hal-0.4.2-4.EL4

So sorry about that, I didn't mean to take up any of your time. I
only mentioned it incidentally and wasn't expecting it to be
addressed. (I was more happily stating that nothing of significance
to me is broken in the distro). If that was something I needed, I
would have looked into upgrading it.

Mike Houston

Re: [Cbe-oss-dev] [RFC, PATCH 4/4] Add support to OProfile for profiling Cell BE SPUs -- update

2007-01-30 Thread Christian Krafft

On Tue, 30 Jan 2007 15:31:09 -0800
Carl Love <[EMAIL PROTECTED]> wrote:

> An LFSR sequence is similar to a pseudo random number sequence. For a 24
> bit LFSR sequence each number between 0 and 2^24 will occur once in the
> sequence but not in a normal counting order.  The hardware uses the LFSR
> sequence to count since it is much simpler to implement in hardware
> than a normal counter.  Unfortunately, the only way we know how to
> figure out what the LFSR value that corresponds to the number in the
> sequence that is N before the last value (0xFF) is to calculate the
> previous value N times.  It is like trying to ask what is the pseudo
> random number that is N before this pseudo random number?

That should be no problem. 
You can just reverse your algorithm and let it run x times instead of 0xFF-x.
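
A rough sketch of what reversing the algorithm could look like is given
below. It assumes a 24-bit Fibonacci LFSR with taps 24,23,22,17, which is
a commonly tabulated choice; the polynomial used by the Cell hardware is
not stated in this thread and may differ. All function names are made up
for the example.

```c
#include <stdint.h>

#define LFSR_MASK 0xFFFFFFu   /* 24-bit state */

/* One forward step.  ASSUMPTION: taps 24,23,22,17; the real hardware
 * polynomial may be different. */
uint32_t lfsr_next(uint32_t s)
{
    uint32_t bit = (s ^ (s >> 1) ^ (s >> 2) ^ (s >> 7)) & 1u;
    return ((s >> 1) | (bit << 23)) & LFSR_MASK;
}

/* Inverse of lfsr_next(): recover the bit that was shifted out of
 * position 0 and shift the state back up by one. */
uint32_t lfsr_prev(uint32_t s)
{
    uint32_t bit = ((s >> 23) ^ s ^ (s >> 1) ^ (s >> 6)) & 1u;
    return ((s << 1) & LFSR_MASK) | bit;
}

/* Value that lies n steps before s in the sequence. */
uint32_t lfsr_back(uint32_t s, uint32_t n)
{
    while (n--)
        s = lfsr_prev(s);
    return s;
}
```

Stepping backwards still costs one iteration per position, so this only
pays off when the wanted value is close to the known end of the sequence.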

> 
> I will add a short comment to the code that will summarize the above
> paragraph.
> [snip]
> 
> ___
> cbe-oss-dev mailing list
> [EMAIL PROTECTED]
> https://ozlabs.org/mailman/listinfo/cbe-oss-dev


-- 
Mit freundlichen Grüssen,
kind regards,

Christian Krafft
IBM Systems & Technology Group, 
Linux Kernel Development
IT Specialist

Re: [PATCH] scsi_lib.c: continue after MEDIUM_ERROR

2007-01-30 Thread Mark Lord


Eric D. Mudama wrote:


Actually, it's possibly worse, since each failure in libata will 
generate 3-4 retries.  With existing ATA error recovery in the drives, 
that's about 3 seconds per retry on average, or 12 seconds per failure.  
Multiply that by the number of blocks past the error to complete the 
request..


It really beats the alternative of a forced reboot
due to, say, superblock I/O failing because it happened
to get merged with an unrelated I/O which then failed..
Etc..

Definitely an improvement.

The number of retries is an entirely separate issue.
If we really care about it, then we should fix SD_MAX_RETRIES.

The current value of 5 is *way* too high.  It should be zero or one.

Cheers

Re: [PATCH] pipefs unique inode numbers

2007-01-30 Thread Jeff Layton

Jeff Layton wrote:

Bodo Eggert wrote:
 > change pipefs to use a unique inode number equal to the memory
 > address unless it would be truncated.
 >
 > Signed-Off-By: Bodo Eggert <[EMAIL PROTECTED]>
 > ---
 > Tested on i386.
 >
 > --- 2.6.19/fs/pipe.c.ori2007-01-30 22:02:46.0 +0100
 > +++ 2.6.19/fs/pipe.c2007-01-30 23:22:27.0 +0100
 > @@ -864,6 +864,10 @@ static struct inode * get_pipe_inode(voi
 >  inode->i_uid = current->fsuid;
 >  inode->i_gid = current->fsgid;
 >  inode->i_atime = inode->i_mtime = inode->i_ctime = CURRENT_TIME;
 > +/* The address of *inode is unique, so we'll get a unique inode number.
 > + * Of course this will not work for 32 bit inodes on 64 bit systems. */
 > +if (sizeof(inode->i_ino) >= sizeof(struct inode*))
 > +inode->i_ino = (unsigned int) inode;
 >
 >  return inode;
 >

Also, that patch would break many 32-bit programs not compiled with large
offsets when run in compatibility mode on a 64-bit kernel. If they were to
do a stat on this inode, it would likely generate an EOVERFLOW error since
the pointer address would probably not fit in a 32 bit field.

That problem was the whole impetus for this set of patches.

Actually, sorry...I misread the patch. It wouldn't have that problem. My
mistake.

Still though, I considered an approach somewhat similar to this early on.
I was thinking of taking a bit-shifted inode address and hashing it to
give a unique value. If you do the math, you can discard the lower 9 bits
of the pointer, so you end up being able to use the lower 41 bits of the
pointer. So a scheme like that could work if you could guarantee that
all inode addresses wouldn't be > 2^41 apart.

The problem is, you can't guarantee that, especially in a NUMA situation.
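
The arithmetic behind the 9-bit shift can be sketched as follows. This is
a hypothetical illustration, not code from the patch set: addr_to_ino32()
is a made-up name, and the shift of 9 assumes that distinct inode objects
are at least 512 bytes apart.

```c
#include <stdint.h>

#define INO_SHIFT 9   /* ASSUMPTION: inode objects are >= 512 bytes apart */

/* Fold a 64-bit object address into a 32-bit inode number by dropping
 * the low 9 bits and keeping the next 32, i.e. bits 9..40. */
uint32_t addr_to_ino32(uint64_t addr)
{
    return (uint32_t)(addr >> INO_SHIFT);
}
```

Two addresses 512 bytes apart map to different numbers, but two addresses
exactly 2^41 bytes apart map to the same one, which is the collision that
cannot be ruled out.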

See the thread entitled:

[RFC][PATCH] ensure i_ino uniqueness in filesystems without
permanent inode numbers (via pointer conversion)

in linux-fsdevel, ~Nov 17th for more info.

-- Jeff

Re: [PATCH] scsi_lib.c: continue after MEDIUM_ERROR

2007-01-30 Thread Mark Lord


James Bottomley wrote:

First off, please send SCSI patches to the SCSI list:



Fixed already, thanks!


This patch fixes the behaviour to be similar to what we had originally.

When a bad sector is encountered, SCSI will now work around it again,
failing *only* the bad sector itself.


Erm, but the corollary is that if we get a large read failure because of
a bad track, you're going to try and chunk it up a sector at a time


That's better than the huge data-loss scenario that we currently
have for single-sector errors.  MUCH better.


forcing an individual error for each sector is going to annoy some
people ... particularly removable medium ones which return this error if
the medium isn't present ... Are you sure this is really what we want to
do?


No, for removed-medium everything just fails right away.
This patch is *only* for media errors, not any other failures.

Cheers

Re: How many people are using 2.6.16?

2007-01-30 Thread Mark Lord


Adrian Bunk wrote:

On Tue, Jan 30, 2007 at 09:13:00AM +1100, Bron Gondwana wrote:
We do a lot of Cyrus which does a lot of MMAP - and we also use the 
Areca driver - which are both strong reasons to move to 2.6.19.2, but

if the MMAP fix was ported back to 2.6.16 we might consider staying
there instead.


Please correct me if I'm wrong, but as far as I understand the problem
the mmap bug was introduced in 2.6.19.


I believe our featherless leader said he thought it was an ancient bug,
exacerbated by something that went into 2.6.19.

If Linus's opinion is correct (still?), then the bug exists in all
kernels since somewhere back in the 2.4.xx days.

Linus?

Re: [patch 0/9] buffered write deadlock fix

2007-01-30 Thread Nick Piggin

On Tue, Jan 30, 2007 at 03:21:19PM -0800, Andrew Morton wrote:
> On Tue, 30 Jan 2007 12:55:58 -0800
> Andrew Morton <[EMAIL PROTECTED]> wrote:
> 
> > y'know, four or five years back I fixed this bug by doing
> > 
> > current->locked_page = page;
> > 
> > in the write() code, and then teaching the pagefault code to avoid locking
> > the same page.  Patch below.
> > 
> > But then evil mean Hugh pointed out that the patch is still vulnerable to
> > ab/ba deadlocking so I dropped it.
> 
> And he was right, of course.  Task A holds file a's i_mutex and takes a
> fault against file b's page.  Task B holds file b's i_mutex and takes a
> fault against file a's page.  Drat.
> 
> I wonder if there's a sane way of preventing that.

If you want to go down the path of carrying state around in task_struct,
you can take the mmap_sem and set a flag, then get_user_pages the source
page and lock both source and destination in ascending order, then your
page fault handler checks the flag and skips mmap_sem, and the rest of
your fault path checks both the page locks you're holding.
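
The "lock both in ascending order" part of this can be illustrated with a
user-space analogy. The sketch below uses pthread mutexes rather than
kernel page locks, and lock_pair() and unlock_pair() are hypothetical
names; it shows only the ordering rule, not the mmap_sem handling.

```c
#include <pthread.h>
#include <stdint.h>

/* Take two mutexes in ascending address order so that every caller
 * agrees on the order, whichever way round the arguments arrive.
 * Returns the mutex that was locked first. */
pthread_mutex_t *lock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
    pthread_mutex_t *first = ((uintptr_t)a < (uintptr_t)b) ? a : b;
    pthread_mutex_t *second = (first == a) ? b : a;

    pthread_mutex_lock(first);
    if (second != first)        /* same object passed twice: lock once */
        pthread_mutex_lock(second);
    return first;
}

/* Release both mutexes; the unlock order does not matter for deadlocks. */
void unlock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
    pthread_mutex_unlock(a);
    if (b != a)
        pthread_mutex_unlock(b);
}
```

Because both tasks acquire the lower address first, the ab/ba pattern
cannot arise between these two locks.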

At which point you arrive at a horrible mess :)


Re: Free Linux Driver Development!

2007-01-30 Thread Greg KH

On Tue, Jan 30, 2007 at 05:07:57PM -0800, Roland Dreier wrote:
>  > > To me, it's clear that historically the community hasn't delivered on
>  > 
>  > How is that clear?  As noted in the specific examples I provided, that
>  > is how a large number of popular drivers and subsystems have been
>  > developed.
> 
> Yes, I agree that it often works.  What I'm arguing is that it doesn't
> ALWAYS work.  And Greg is promising (in effect, on my behalf) that "If
> you give us specs, then we WILL have drivers."  As I've said several
> times, I'm all for encouraging vendors to open specs.  The only thing
> I don't like is marketing open specs by making promises that we may
> not be able to keep.

I really think we can keep these promises.  A number of us have been
doing just that for many years now, and I don't see any reason why we
would stop doing that now.

I would _love_ to be inundated with specs, so many that we run out of
developers to work on the devices.  But I really don't see that
happening any time soon, as there's not that many devices that Linux
doesn't already support.

And if such a situation does happen, perhaps I will be able to convince
some distro companies to pony up the development man-power to help keep
us from going back on our promise...  I know quite a few companies would
love to help out with just such a "problem", as it is in their best
interest to have Linux support as many devices as possible.

So please, don't be so down on the offer, you don't have to do any work
if you don't want to :)

thanks,

greg k-h

Re: [PATCH] pipefs unique inode numbers

2007-01-30 Thread Linus Torvalds

On Tue, 30 Jan 2007, Jeff Layton wrote:
> 
> Also, that patch would break many 32-bit programs not compiled with large
> offsets when run in compatibility mode on a 64-bit kernel. If they were to
> do a stat on this inode, it would likely generate an EOVERFLOW error since
> the pointer address would probably not fit in a 32 bit field.
> 
> That problem was the whole impetus for this set of patches.

Well, we have that problem with the slowly incrementing "last_ino" too.

Should we make "last_ino" be "static unsigned int" instead of "long"?

Does anybody actually even use the old stat() with 32-bit interfaces? We 
warn for it, and we've done so for a long time.. I don't remember people 
even complaining about the warning, so..

Linus

Re: 2.6.20-rc6-mm3

2007-01-30 Thread Andrew Morton

On Wed, 31 Jan 2007 02:16:43 +0100
Tilman Schmidt <[EMAIL PROTECTED]> wrote:

> Am 30.01.2007 23:18 schrieb Maciej Rutecki:
> > Second problem, power button doesn't work. When I pressed it, I got this
> > error:
> > 
> > ACPI Error (evevent-0305): No installed handler for fixed event
> > [0002] [20070126]
> 
> Same here, minus the message. (Or perhaps I just don't know where to look.)
> Problem also exists in 2.6.20-rc6-mm2. With 2.6.20-rc6-git1 the power
> button of this machine works fine.
> 

That's significant - in your case at least the 2.6.20-rc6-mm3 ACPI update
isn't the cause.


Re: Free Linux Driver Development!

2007-01-30 Thread Greg KH

On Wed, Jan 31, 2007 at 02:13:40AM +0100, Adrian Bunk wrote:
> On Tue, Jan 30, 2007 at 11:10:20AM -0800, Greg KH wrote:
> > On Tue, Jan 30, 2007 at 09:45:50AM -0800, Roland Dreier wrote:
> >...
> > > And there are plenty of documented devices that no one cares enough
> > > about to submit a driver for.
> > 
> > Any specific examples?  I have a long list of people who wish to write
> > new drivers but just don't know which hardware is not yet supported.
> >...
> 
> Writing a driver for shiny new hardware is cool.
> 
> But understanding and maintaining an already existing driver and working 
> on bug reports for this driver is something not-so-cool that would be 
> required in many areas of the kernel.
> 
> Would someone from your long list of people e.g. be willing to maintain 
> drivers/block/floppy.c ?

What?  Throw a fresh-faced newbie instantly into the tar-pit of despair
that floppy.c is?  Do you want everyone just to run screaming from
kernel development never to be seen again?

:)

Seriously, if you need help with something like this, bring it up on the
kernel-janitors list, there are lots of people there that are willing to
help out with stuff like long-term maintenance and bug fixing but don't
know where to start.

That's also where the majority of the people who have volunteered to
help are also hanging out at.

thanks,

greg k-h

Re: [PATCH] Add "is_power_of_2" checking to log2.h.

2007-01-30 Thread Nick Piggin


Robert P. J. Day wrote:

On Tue, 30 Jan 2007, Nick Piggin wrote:



Robert P. J. Day wrote:


 Add the inline function "is_power_of_2()" to log2.h, where the value
zero is *not* considered to be a power of two.



Signed-off-by: Robert P. J. Day <[EMAIL PROTECTED]>

/*
+ *  Determine whether some value is a power of two, where zero is
+ * *not* considered a power of two.
+ */


Why the qualifier? Zero *is* not a power of 2, is it?



no, but it bears repeating since some developers might think it *is*.
if you peruse the current kernel code, you'll find some tests of the
simpler form:

((n & (n - 1)) == 0)

which is clearly testing for "power of twoness" but which will return
true for a value of zero.  that's wrong, and it's why it's emphasized
in the comment.


I would have thought you'd comment the broken ones, but that's just me.
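
The difference between the two tests discussed here can be shown with a
small stand-alone sketch. The function bodies below are written for
illustration and are not copied from the patch; naive_pow2() is a made-up
name for the simpler form.

```c
#include <stdbool.h>

/* The simpler test: true for powers of two, but also true for zero,
 * because 0 & (0 - 1) is 0. */
bool naive_pow2(unsigned long n)
{
    return (n & (n - 1)) == 0;
}

/* Same bit trick with zero excluded, matching the comment in the patch. */
bool is_power_of_2(unsigned long n)
{
    return n != 0 && (n & (n - 1)) == 0;
}
```

For n = 0 the first function returns true and the second returns false;
for every other value the two agree.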

--
SUSE Labs, Novell Inc.



Re: MMC: tifm_7xx1 driver detects card but does not make it mountable

2007-01-30 Thread Alex Dubov

Memorystick support is not yet implemented (work in progress).

--- [EMAIL PROTECTED] wrote:

> Running recent kernels the insertion of a MemoryStick into the card
> reader of a Sony Vaio VGNSZ3XWP is detected but the card does not seem
> to be probed for partitions and hence is not made available to
> userspace.
> 
> Here is the message upon card insertion:
> 
> tifm_7xx1: ms card detected in socket 0
> 
> lspci:
> 
> 00:00.0 Host bridge: Intel Corporation Mobile 945GM/PM/GMS/940GML and
> 945GT Express Memory Controller Hub (rev 03)
> 00:02.0 VGA compatible controller: Intel Corporation Mobile
> 945GM/GMS/940GML Express Integrated Graphics Controller (rev 03)
> 00:02.1 Display controller: Intel Corporation Mobile 945GM/GMS/940GML
> Express Integrated Graphics Controller (rev 03)
> 00:1b.0 Audio device: Intel Corporation 82801G (ICH7 Family) High
> Definition Audio Controller (rev 02)
> 00:1c.0 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express
> Port 1 (rev 02)
> 00:1c.1 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express
> Port 2 (rev 02)
> 00:1c.2 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express
> Port 3 (rev 02)
> 00:1c.3 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express
> Port 4 (rev 02)
> 00:1d.0 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI
> #1 (rev 02)
> 00:1d.1 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI
> #2 (rev 02)
> 00:1d.2 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI
> #3 (rev 02)
> 00:1d.3 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI
> #4 (rev 02)
> 00:1d.7 USB Controller: Intel Corporation 82801G (ICH7 Family) USB2 EHCI
> Controller (rev 02)
> 00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev e2)
> 00:1f.0 ISA bridge: Intel Corporation 82801GBM (ICH7-M) LPC Interface
> Bridge (rev 02)
> 00:1f.1 IDE interface: Intel Corporation 82801G (ICH7 Family) IDE
> Controller (rev 02)
> 00:1f.2 IDE interface: Intel Corporation 82801GBM/GHM (ICH7 Family)
> Serial ATA Storage Controller IDE (rev 02)
> 00:1f.3 SMBus: Intel Corporation 82801G (ICH7 Family) SMBus Controller
> (rev 02)
> 06:00.0 Network controller: Intel Corporation PRO/Wireless 3945ABG
> Network Connection (rev 02)
> 07:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8036 PCI-E
> Fast Ethernet Controller (rev 15)
> 09:04.0 CardBus bridge: Texas Instruments PCIxx12 Cardbus Controller
> 09:04.1 FireWire (IEEE 1394): Texas Instruments PCIxx12 OHCI Compliant
> IEEE 1394 Host Controller
> 09:04.2 Mass storage controller: Texas Instruments 5-in-1 Multimedia
> Card Reader (SD/MMC/MS/MS PRO/xD)
> 
> lsusb:
> 
> Bus 004 Device 001: ID :  
> Bus 004 Device 002: ID 044e:300c Alps Electric Co., Ltd 
> Bus 003 Device 002: ID 0483:2016 SGS Thomson Microelectronics
> Fingerprint Reader
> Bus 003 Device 001: ID :  
> Bus 001 Device 001: ID :  
> Bus 005 Device 001: ID :  
> Bus 005 Device 004: ID 05ca:1830 Ricoh Co., Ltd 
> Bus 005 Device 002: ID 054c:0281 Sony Corp. 
> Bus 002 Device 001: ID :  
> 
> 
> Adam
> 



 


Re: ACPI C and P states on Conroe

2007-01-30 Thread Len Brown

On Tuesday 30 January 2007 12:35, Joe Harvell wrote:
> I am trying to enable all the power saving features I can on my Conroe
> E6600.  After much searching on the web, I am a little confused about
> the Linux kernel support for ACPI on the Conroe.
> 
> Here is my setup:
> Intel Core2 Duo E6600
> Asus P5-B Deluxe board (Intel P965).
> I am running a Gentoo kernel based on 2.6.19.4.
> 
> I have managed to enable EIST using cpufreq with the speedstep-centrino
> driver.  But my understanding from browsing the ACPI spec is that this
> is still within C0, i.e. not much power savings.

Right, P-states are effective only when code is executing,
and on this processor (with C1E) will have no effect on idle power.

> Here are my questions:
> 
> 1) For P states, which cpufreq driver should I be using?  I've heard
> speedstep-centrino is deprecated (but only some aspects of it) that are
> being moved into acpi-cpufreq.  But I can't get acpi-cpufreq to load in
> my kernel version.

In 2.6.19 I believe that speedstep-centrino is the one to use.
The transition to acpi-cpufreq happens in 2.6.20.

> Also, I would have thought speedstep-ich would be 
> the driver, just based on the name.

Don't use speedstep-ich.

> How do I know (other than trying 
> all modules to see which one loads) which one I should be using?
> 2) What kind of support for C1-C3 does the Conroe have?  The ACPI spec
> says C2 and C3 require chipset support on the motherboard.  Does P965
> have that?  Does it matter between boards (e.g. P5B)?

I believe that Conroe currently supports just C1 --
this is true for the ones I have.

Internally it is an "Enhanced C1" state called C1E, in which the voltage is
reduced while in C1, but this is transparent to software, which sees plain C1.

You can observe this in
/proc/acpi/processor/*/power
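
A small sketch of reading those files from a program rather than with cat. The function name dump_matching_files is hypothetical, and the code assumes nothing about the file contents: it just prints whatever each matching file holds.

```c
#include <glob.h>
#include <stdio.h>

/*
 * Print every readable file matching `pattern` to `out`.
 * Returns the number of files printed, 0 if nothing matched,
 * or -1 if glob() failed for another reason.
 * (Hypothetical helper, for illustration only.)
 */
static int dump_matching_files(const char *pattern, FILE *out)
{
	glob_t g;
	int printed = 0;
	int rc = glob(pattern, 0, NULL, &g);

	if (rc != 0) {
		globfree(&g);
		return rc == GLOB_NOMATCH ? 0 : -1;
	}

	for (size_t i = 0; i < g.gl_pathc; i++) {
		FILE *f = fopen(g.gl_pathv[i], "r");
		char line[256];

		if (!f)
			continue;	/* skip files we cannot open */
		fprintf(out, "== %s ==\n", g.gl_pathv[i]);
		while (fgets(line, sizeof(line), f))
			fputs(line, out);
		fclose(f);
		printed++;
	}
	globfree(&g);
	return printed;
}
```

Calling dump_matching_files("/proc/acpi/processor/*/power", stdout) would print one block per processor; on a system that does not expose these files it simply returns 0.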

> 3) What versions of the kernel support C1-C3 states?  What kernel
> options are germane to this?  What libraries/tools are involved?

C1-C3 have been supported for a long time.
2.6.20 adds a few tweaks to use a more efficient implementation,
but you will not notice a difference on today's desktop processors.

cheers,
-Len

Re: [PATCH] pipefs unique inode numbers

2007-01-30 Thread Jeff Layton


Bodo Eggert wrote:
> change pipefs to use a unique inode number equal to the memory
> address unless it would be truncated.
>
> Signed-Off-By: Bodo Eggert <[EMAIL PROTECTED]>
> ---
> Tested on i386.
>
> --- 2.6.19/fs/pipe.c.ori   2007-01-30 22:02:46.0 +0100
> +++ 2.6.19/fs/pipe.c   2007-01-30 23:22:27.0 +0100
> @@ -864,6 +864,10 @@ static struct inode * get_pipe_inode(voi
>inode->i_uid = current->fsuid;
>inode->i_gid = current->fsgid;
>inode->i_atime = inode->i_mtime = inode->i_ctime = CURRENT_TIME;
> +  /* The address of *inode is unique, so we'll get a unique inode number.
> +   * Of course this will not work for 32-bit inodes on 64-bit systems. */
> +  if (sizeof(inode->i_ino) >= sizeof(struct inode *))
> +      inode->i_ino = (unsigned long) inode;
>
>return inode;
>

Also, that patch would break many 32-bit programs not compiled with large
offsets when run in compatibility mode on a 64-bit kernel. If they were to
do a stat on this inode, it would likely generate an EOVERFLOW error since
the pointer address would probably not fit in a 32-bit field.

That problem was the whole impetus for this set of patches.
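
To make the overflow concrete, here is a rough sketch of the kind of check involved. It is not the kernel's actual stat code; store_ino32 is a hypothetical stand-in, and the pointer value used in the note below is made up.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for what a 32-bit stat path has to do:
 * copy a wide inode number into a 32-bit field, or fail with
 * -EOVERFLOW if the value would be truncated.
 */
static int store_ino32(uint64_t ino, uint32_t *out)
{
	if (ino > UINT32_MAX)
		return -EOVERFLOW;
	*out = (uint32_t)ino;
	return 0;
}
```

A small value such as 0x1234 is stored unchanged, while a 64-bit kernel address such as 0xffff810012345678 (an invented example) is rejected with -EOVERFLOW, which is the failure a 32-bit program without large-file support would see.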

-- Jeff
