Re: [rfc patch 2/2] direct-io: remove address alignment check

2005-07-14 Thread Badari Pulavarty

Daniel McNeil wrote:
> On Thu, 2005-07-14 at 16:16, Badari Pulavarty wrote:
> > How does your patch ensure that we meet the driver alignment
> > restrictions? Like you said, you need at least "even" byte alignment
> > for IDE etc.
> >
> > And also, are there any restrictions on how much the "minimum" IO
> > size has to be? I mean, can I read "1" byte? I guess you are
> > not relaxing it (yet).
>
> This patch does not change the i/o size requirements -- they
> must be a multiple of device block size (usually 512).
>
> It only relaxes the address alignment restriction.  I do not
> know what the driver alignment restrictions are.  Without the
> 1st patch, it was impossible to relax the address space
> check and have direct-io generate the correct i/o's to submit.
>
> This 2nd patch is just for testing and generating feedback
> to find out what the address alignment issues are.  Then
> we can decide how to proceed.
>
> Did you look over the 1st patch?  Comments?

Yes. I did look at the first patch and my questions were basically
towards the first patch. I don't see any enforcement of alignment
with your patch at all. So, we let the driver fail if it can't
handle it?

BTW, I don't think the first patch is really doing the right thing.
You got a little carried away while cleaning up.
You are trying to relax "user buffer" alignment only. If your
"offset" is in the middle of a filesystem block (say 4k), you still
need to zero out the first portion to be able to write into the
middle. That "evil" code is still needed. :(
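
The zeroing Badari describes can be illustrated with a small user-space
sketch. It is not taken from fs/direct-io.c: the struct and function names
are hypothetical, and it assumes the write lands in a freshly allocated
filesystem block whose size is a power of two. It only computes which byte
ranges around the written data would have to be zeroed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper types, for illustration only. */
struct zero_ranges {
    uint64_t head_start; /* file offset where head zeroing begins */
    uint64_t head_len;   /* bytes to zero before the written data */
    uint64_t tail_start; /* file offset where tail zeroing begins */
    uint64_t tail_len;   /* bytes to zero after the written data  */
};

/*
 * For a write of `len` bytes at file offset `offset`, work out the parts of
 * the first and last filesystem blocks that the write does not cover.
 */
struct zero_ranges zero_ranges_for_write(uint64_t offset, uint64_t len,
                                         uint64_t fs_block_size)
{
    struct zero_ranges z;
    uint64_t mask = fs_block_size - 1;
    uint64_t end = offset + len;

    /* Assumption: the block size is a non-zero power of two. */
    assert(fs_block_size != 0 && (fs_block_size & mask) == 0);
    assert(len > 0);

    z.head_start = offset & ~mask;           /* start of the first block */
    z.head_len = offset - z.head_start;      /* gap before the data      */
    z.tail_start = end;                      /* first byte after data    */
    z.tail_len = (fs_block_size - (end & mask)) & mask; /* gap after     */
    return z;
}
```

With a 4k filesystem block, a 512-byte write at offset 512 leaves 512 bytes
to zero in front of the data and 3072 bytes behind it. A write that starts
and ends on a block boundary needs no zeroing at all.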

Thanks,
Badari

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: RFC: Hugepage COW

2005-07-14 Thread David Gibson
On Thu, Jul 14, 2005 at 07:00:11PM -0700, Christoph Lameter wrote:
> On Fri, 15 Jul 2005, David Gibson wrote:
> 
> > Well, the COW patch implements a fault handler, obviously.  What
> > specifically were you thinking about?
> 
> About a fault handler of course and about surrounding scalability issues.
> I worked on some hugepage related patches last fall. Have you had a look 
> at the work of Ken, Ray and me on the subject?

I'm still not at all sure what you're getting at.  Do you mean the
demand-allocation patches which were floating around at some point - I
gather they're important for doing sensible NUMA allocation of
hugepages.  They have a small overlap with the COW code, in the fault
handler, but not much.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/people/dgibson


Re: PS/2 Keyboard is dead after resume.

2005-07-14 Thread Dmitry Torokhov
On Thursday 14 July 2005 21:35, Andrew Haninger wrote:
> Hello.
> 
> I'm using Linux Kernel 2.6.12.2 plus suspend 2.1.9.9 and acpi-20050408
> with the hibernate-1.10 script. My machine is a Shuttle SK43G which
> has a VIA KM400 chipset with an Athlon XP CPU.
> 
> Suspension seems to work well. However, when I resume, the keyboard is
> dead and there is a warning in dmesg before and after suspension:
> 
> atkbd.c: Spurious ACK on isa0060/serio0. Some program, like XFree86,
> might be trying access hardware directly.

Could you try doing:

echo 1 > /sys/module/i8042/parameters/debug

before suspending and then post your dmesg, please? Maybe we will see
something there.

-- 
Dmitry


Re: Sandisk Compact Flash

2005-07-14 Thread David Hinds
On Wed, Jul 13, 2005 at 07:04:38PM +0530, [EMAIL PROTECTED] wrote:
> 
> I am a newbie to the CompactFlash driver. I am using an MPC862 PPC
> processor on my custom board with 64 MB RAM, running the
> linuxppc-2.4.18 kernel. I am using a Sandisk Extreme CF 1GB card,
> which is 133x high speed, but found that its performance is the same
> as that of a slower Kingston 1GB CF card. CF is implemented on the
> PCMCIA port, and I am not sure which transfer mode is set. The Set
> Features command sets the PIO mode or Multiword DMA transfer mode by
> specifying its value in the Sector Count register, but I cannot work
> out where the Linux kernel IDE driver sets this. Is it set by
> default, or do we need to set it? I think we should assign this
> value, but right now I am not able to trace this in my code.

All Compact Flash cards, in 16-bit PCMCIA card readers, operate in PIO
mode 1 (polled IO, no DMA), which means you will get only about 1
MB/sec regardless of the card's claimed transfer speed.  Some cameras
also only support this mode; others will run CF cards in "TrueIDE"
mode, which is required to use the DMA transfer modes.

There are high performance CF card readers that can use TrueIDE mode:
both CardBus ones and Firewire ones.  For example:

http://www.dpreview.com/news/0310/03102103delkincardbustest.asp

It sounds like your card reader is one of the slow 16-bit ones.

-- Dave


Re: [rfc patch 2/2] direct-io: remove address alignment check

2005-07-14 Thread Badari Pulavarty

Tejun Heo wrote:
> Daniel McNeil wrote:
> > This patch relaxes the direct i/o alignment check so that user addresses
> > do not have to be a multiple of the device block size.
> >
> > I've done some preliminary testing and it mostly works on an ext3
> > file system on an IDE disk.  I have seen trouble when the user address
> > is on an odd byte boundary.  Sometimes the data is read back incorrectly
> > on read and sometimes I get these kernel error messages:
> > hda: dma_timer_expiry: dma status == 0x60
> > hda: DMA timeout retry
> > hda: timeout waiting for DMA
> > hda: status error: status=0x58 { DriveReady SeekComplete DataRequest }
> >
> > ide: failed opcode was: unknown
> > hda: drive not ready for command
> >
> > Doing direct-io with user addresses on even, non-512 boundaries appears
> > to be working correctly.
> >
> > Any additional testing and/or comments welcome.
>
>  Hi, Daniel.
>
>  I don't think the change is a good idea.  We may be able to relax
> alignment constraints on some hardware to certain levels, but IMHO it
> will be very difficult to verify.  All internal block IO code follows
> strict block boundary alignment.  And as raw IOs (especially unaligned
> ones) aren't very common operations, they won't get tested much.  Then
> when some rare (probably not an open source one) application uses it on
> some rare buggy hardware, it may cause *very* strange things.
>
>  Also, I don't think it will improve application programmers'
> convenience.  As each piece of hardware employs a different DMA
> alignment, we need to implement a way to export the alignment to user
> space and enforce it.  So, in the end, the user application must do
> aligned allocation accordingly.  Just following the block boundary will
> be easier.
>
>  I don't know why you wanna relax the alignment requirement, but
> wouldn't it be easier to just write/use a block-aligned allocator for
> such buffers?  It will even make the program more portable.
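
A block-aligned allocator of the kind Tejun suggests could be sketched in
user space roughly as follows. The helper names are hypothetical; only
posix_memalign() and free() are real library calls. The sketch assumes the
device block size is a power of two (512 in the discussion above).

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L /* for posix_memalign() */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: allocate `nblocks` blocks of `block_size` bytes whose
 * start address is a multiple of `block_size`. Returns NULL on failure.
 * Release the buffer with free().
 */
void *alloc_block_aligned(size_t block_size, size_t nblocks)
{
    void *buf = NULL;

    /* posix_memalign() needs a power-of-two multiple of sizeof(void *). */
    assert(block_size >= sizeof(void *));
    assert((block_size & (block_size - 1)) == 0);

    /* Reject empty requests and sizes that would overflow size_t. */
    if (nblocks == 0 || nblocks > SIZE_MAX / block_size)
        return NULL;
    if (posix_memalign(&buf, block_size, block_size * nblocks) != 0)
        return NULL;
    return buf;
}

/* Returns non-zero when `p` sits on a `block_size` boundary. */
int is_block_aligned(const void *p, size_t block_size)
{
    return ((uintptr_t)p % block_size) == 0;
}
```

A buffer obtained this way satisfies the existing address check, and its
length is a whole number of blocks, so the I/O size requirement is met as
well.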




I can imagine a reason for relaxing the alignment. I keep getting asked
whether we can do an "O_DIRECT mount option".  Database folks want to
make sure all accesses to files in a given filesystem are O_DIRECT
(whether the database is accessing them or some random program like ftp,
scp, or cp is). This is mainly to ensure that buffered accesses to
the files don't pollute the pagecache (while the database is using
O_DIRECT access). Seems like a logical request, but not easy to do :(

Thanks,
Badari



Re: [announce] linux kernel performance project launch at sourceforge.net

2005-07-14 Thread randy_dunlap
On Fri, 15 Jul 2005 00:27:48 +0200 Andi Kleen wrote:

> > > Some oprofile listings from a few of the test runs would be also nice.
> > 
> > That is in the works.  We will upload profile data.  I'm having problem
> > with oprofile on some versions of kernel and that is being investigated
> > right now.
> 
> If you run statically compiled kernels you could as well use the
> old style readprofile.  It just doesn't work with modules.

It can be made to work with modules (and has been)[1],
but I'd just stick with not using modules, given a choice.

---
~Randy

[1] http://developer.osdl.org/rddunlap/modprofile/
(against 2.6.6)


Re: [patch 2.6.13-git] 8250 tweaks

2005-07-14 Thread Sam Song
Russell King <[EMAIL PROTECTED]> wrote:
> Well, in this case, the "whinging" resulted in
> finding a _real_ bug and locating why your ports 
> weren't being found.  So I guess it's
> good for something.

Indeed! The old kernel didn't have such an advantage.

> Can you mail me a diff of the changes you made to
> arch/ppc/platforms/sandpoint.h please?  

Certainly. 

> If that file is being used it seems that you 
> actually have 4 ports defined in total.  However,
> I'm a little confused because the sandpoint.h
> defines don't seem to match your original dmesg 
> output.

Well, I use a sandpoint-based board, not the same as
the reference one. There are two serial ports on the
board and I enabled them both with IRQ9/10.
In addition, there is no 8259 on this board.

Pls don't apply this patch:-)

Thanks,

Sam





sandpoint-8250.diff
Description: 2389572820-sandpoint-8250.diff


Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt

2005-07-14 Thread Linus Torvalds


On Fri, 15 Jul 2005, Jesper Juhl wrote:
>
> It's buggy, that I know. setting kernel_hz (the new boot parameter) to 
> 250 causes my system clock to run at something like 4-5 times normal 
> speed

4 times normal. You don't actually make the timer interrupt happen at 
250Hz, so the timer will be programmed to run at the full 1kHz.

You also need to actually change the LATCH define (in 
include/linux/jiffies.h) to take this into account (there might be 
something else too).

So 

/* LATCH is used in the interval timer and ftape setup. */
#define LATCH  ((CLOCK_TICK_RATE + HZ/2) / HZ)  /* For divider */

should become something like

/* LATCH is used in the interval timer and ftape setup. */
#define LATCH  ((CLOCK_TICK_RATE*jiffies_increment + HZ/2) / HZ)  /* For divider */

and you might be getting closer.

Of course, you need to make sure that LATCH is used only after 
jiffies_increment is set up. See "setup_pit_timer(void)" in
arch/i386/kernel/timers/timer_pit.c for more details.
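
The arithmetic behind the modified LATCH can be checked with a small
user-space sketch. The function name is hypothetical, and the values are
passed in as parameters so the sketch does not depend on kernel headers.
The PIT input clock of 1193182 Hz used in the note below is an assumption
about CLOCK_TICK_RATE on i386.

```c
#include <assert.h>

/*
 * Same rounding as the LATCH macro, with jiffies_increment folded in:
 * the PIT divisor needed so that the timer fires hz / jiffies_increment
 * times per second.
 */
unsigned long latch_for(unsigned long clock_tick_rate, unsigned long hz,
                        unsigned long jiffies_increment)
{
    assert(hz > 0 && jiffies_increment > 0);
    return (clock_tick_rate * jiffies_increment + hz / 2) / hz;
}
```

With a 1193182 Hz clock and HZ at 1000, an increment of 1 gives a divisor
of 1193, an increment of 4 gives 4773 and an increment of 10 gives 11932.
If the divisor stays at 1193 while jiffies advances by 4 per interrupt,
time runs four times too fast, which matches the symptom reported.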

Linus


Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt

2005-07-14 Thread Zwane Mwaikambo
On Fri, 15 Jul 2005, Lee Revell wrote:

> On Fri, 2005-07-15 at 14:08 +1000, Con Kolivas wrote:
> > Audio did show slightly larger max latencies but nothing that would be of 
> > significance.
> > 
> > On video, maximum latencies are only slightly larger at HZ 250, all the 
> > desired cpu was achieved, but the average latency and number of missed 
> > deadlines was significantly higher.
> 
> Because audio timing is driven by the soundcard interrupt while video,
> like MIDI, relies heavily on timers.

In interbench it's not driven by a soundcard interrupt.



Re: RFC: Hugepage COW

2005-07-14 Thread Christoph Lameter
On Fri, 15 Jul 2005, David Gibson wrote:

> I'm still not at all sure what you're getting at.  Do you mean the
> demand-allocation patches which were floating around at some point - I
> gather they're important for doing sensible NUMA allocation of
> hugepages.  They have a small overlap with the COW code, in the fault
> handler, but not much.

Yes I meant that. I do not have time right now but I will be trying to 
contribute to this if things slow down a bit. Keep me posted.




Re: [PATCH] Filesystem capabilities support

2005-07-14 Thread Nicholas Hans Simmonds
On Fri, Jul 15, 2005 at 05:45:58AM +0200, Jesper Juhl wrote:
> On 7/16/05, Nicholas Hans Simmonds <[EMAIL PROTECTED]> wrote:
> 
> While I'm not qualified to comment on the implementation I do have a
> few small codingstyle comments :-)
> 
> 
> > diff --git a/fs/read_write.c b/fs/read_write.c
> > --- a/fs/read_write.c
> > +++ b/fs/read_write.c
> > @@ -14,6 +14,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> > 
> >  #include 
> >  #include 
> > @@ -303,6 +304,16 @@ ssize_t vfs_write(struct file *file, con
> > else
> > ret = do_sync_write(file, buf, count, pos);
> > if (ret > 0) {
> > +#ifdef CONFIG_SECURITY_FS_CAPABILITIES
> > +   struct dentry *d = file->f_dentry;
> > +   if(d->d_inode->i_op && d->d_inode->i_op->
> 
>   if (d->d_inode->i_op ...
> 
> > +   
> > removexattr) {
> > +   down(&d->d_inode->i_sem);
> > +   d->d_inode->i_op->removexattr(d,
> > +   XATTR_CAP_SET);
> > +   up(&d->d_inode->i_sem);
> > +   }
> > +#endif /* CONFIG_SECURITY_FS_CAPABILITIES */
> > fsnotify_modify(file->f_dentry);
> > current->wchar += ret;
> > }
> > diff --git a/include/linux/capability.h b/include/linux/capability.h
> > --- a/include/linux/capability.h
> > +++ b/include/linux/capability.h
> > @@ -39,7 +39,19 @@ typedef struct __user_cap_data_struct {
> >  __u32 permitted;
> >  __u32 inheritable;
> >  } __user *cap_user_data_t;
> > -
> > +
> > +struct cap_xattr_data {
> > +   __u32 version;
> > +   __u32 mask_effective;
> > +   __u32 effective;
> > +   __u32 mask_permitted;
> > +   __u32 permitted;
> > +   __u32 mask_inheritable;
> > +   __u32 inheritable;
> > +};
> > +
> > +#define XATTR_CAP_SET XATTR_SECURITY_PREFIX "cap_set"
> > +
> >  #ifdef __KERNEL__
> > 
> >  #include 
> > diff --git a/security/Kconfig b/security/Kconfig
> > --- a/security/Kconfig
> > +++ b/security/Kconfig
> > @@ -60,6 +60,13 @@ config SECURITY_CAPABILITIES
> >   This enables the "default" Linux capabilities functionality.
> >   If you are unsure how to answer this question, answer Y.
> > 
> > +config SECURITY_FS_CAPABILITIES
> > +   bool "Filesystem Capabilities (EXPERIMENTAL)"
> > +   depends on SECURITY && EXPERIMENTAL
> > +   help
> > + This permits a process' capabilities to be set by an extended
> > + attribute in the security namespace (security.cap_set).
> > +
> >  config SECURITY_ROOTPLUG
> > tristate "Root Plug Support"
> > depends on USB && SECURITY
> > diff --git a/security/commoncap.c b/security/commoncap.c
> > --- a/security/commoncap.c
> > +++ b/security/commoncap.c
> > @@ -111,9 +111,15 @@ void cap_capset_set (struct task_struct
> > 
> >  int cap_bprm_set_security (struct linux_binprm *bprm)
> >  {
> > +#ifdef CONFIG_SECURITY_FS_CAPABILITIES
> > +   ssize_t (*bprm_getxattr)(struct dentry *,const char *,void 
> > *,size_t);
> > +   struct dentry *bprm_dentry;
> > +   ssize_t ret;
> > +   struct cap_xattr_data caps;
> > +#endif /* CONFIG_SECURITY_FS_CAPABILITIES */
> > +
> > /* Copied from fs/exec.c:prepare_binprm. */
> > 
> > -   /* We don't have VFS support for capabilities yet */
> > cap_clear (bprm->cap_inheritable);
> > cap_clear (bprm->cap_permitted);
> > cap_clear (bprm->cap_effective);
> > @@ -134,6 +140,44 @@ int cap_bprm_set_security (struct linux_
> > if (bprm->e_uid == 0)
> > cap_set_full (bprm->cap_effective);
> > }
> > +
> > +#ifdef CONFIG_SECURITY_FS_CAPABILITIES
> > +   /* Locate any VFS capabilities: */
> > +
> > +   bprm_dentry = bprm->file->f_dentry;
> > +   if(!(bprm_dentry->d_inode->i_op &&
> 
>   if (!(bprm_dentry->d_inode->i_op ...
> 
> > +   bprm_dentry->d_inode->i_op->getxattr))
> > +   return 0;
> > +   bprm_getxattr = bprm_dentry->d_inode->i_op->getxattr;
> > +
> > +   down(&bprm_dentry->d_inode->i_sem);
> > +   ret = bprm_getxattr(bprm_dentry,XATTR_CAP_SET,&caps,sizeof(caps));
> > +   up(&bprm_dentry->d_inode->i_sem);
> > +   if(ret == sizeof(caps)) {
> 
>   if (ret == sizeof(caps)) {
> 
> > +   be32_to_cpus(&caps.version);
> > +   be32_to_cpus(&caps.effective);
> > +   be32_to_cpus(&caps.mask_effective);
> > +   be32_to_cpus(&caps.permitted);
> > +   be32_to_cpus(&caps.mask_permitted);
> > +   be32_to_cpus(&caps.inheritable);
> > +   be32_to_cpus(&caps.mask_inheritable);
> > +   if(caps.version 

Patch for ub and blank CDs in 2.6.12

2005-07-14 Thread Pete Zaitcev
This patch fixes a microcode lockup in my CD-ROM adapters when a blank
CD is inserted. However, do not try to burn CDs yet! I'm pretty sure
that trying it will end in coasters.

 - Fix a few cases where we were unable to resynchronize with replies
   for previous commands. The main thing is to keep reading replies
   in case of a stall. This is done with the new state CLRRS.
 - Since I am forgetting the basic state machine already, document it.
 - Move counter increments in the looping path into its own function.
 - Fix a harmless buglet in case CSW read fails to submit: do not
   override state.
 - Implement Alan Stern's idea for adaptive signature checking.

Signed-off-by: Pete Zaitcev <[EMAIL PROTECTED]>

diff -urp -X dontdiff linux-2.6.12/drivers/block/ub.c linux-2.6.12-lem/drivers/block/ub.c
--- linux-2.6.12/drivers/block/ub.c 2005-06-21 12:58:18.0 -0700
+++ linux-2.6.12-lem/drivers/block/ub.c 2005-07-14 21:25:03.0 -0700
@@ -23,6 +23,7 @@
  *  -- Exterminate P3 printks
  *  -- Resove XXX's
  *  -- Redo "benh's retries", perhaps have spin-up code to handle them. V:D=?
+ *  -- CLEAR, CLR2STS, CLRRS seem to be ripe for refactoring.
  */
 #include 
 #include 
@@ -38,6 +39,73 @@
 #define UB_MAJOR 180
 
 /*
+ * The command state machine is the key model for understanding of this driver.
+ *
+ * The general rule is that all transitions are done towards the bottom
+ * of the diagram, thus preventing any loops.
+ *
+ * An exception to that is how the STAT state is handled. A counter allows it
+ * to be re-entered along the path marked with [C].
+ *
+ *   ++
+ *   ! INIT   !
+ *   ++
+ *   !
+ *ub_scsi_cmd_start fails ->--\
+ *   !!
+ *   V!
+ *   ++   !
+ *   ! CMD!   !
+ *   ++   !
+ *   !++  !
+ * was -EPIPE -->>! CLEAR  !  !
+ *   !++  !
+ *   !!   !
+ * was error -->- ! ->\
+ *   !!   !
+ *  /--<-- cmd->dir == NONE ? !   !
+ *  !!!   !
+ *  !V!   !
+ *  !++   !   !
+ *  !! DATA   !   !   !
+ *  !++   !   !
+ *  !!   +-+  !   !
+ *  !  was -EPIPE -->--->! CLR2STS !  !   !
+ *  !!   +-+  !   !
+ *  !!!   !   !
+ *  !!  was error --> ! ->\
+ *  !  was error -->- ! - ! ->\
+ *  !!!   !   !
+ *  !V!   !   !
+ *  \--->++   !   !   !
+ *   ! STAT   !<--/   !   !
+ *  /--->++   !   !
+ *  !!!   !
+ * [C] was -EPIPE -->---\ !   !
+ *  !!  ! !   !
+ *  +< len == 0 ! !   !
+ *  !!  ! !   !
+ *  !  was error -->--!-->\
+ *  !!  ! !   !
+ *  +< bad CSW  ! !   !
+ *  +< bad tag  ! !   !
+ *  !!  V !   !
+ *  !! ++ !   !
+ *  !! ! CLRRS  ! !   !
+ *  !! ++ !   !
+ *  !!  ! !   !
+ *  \--- 

Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt

2005-07-14 Thread Lee Revell
On Fri, 2005-07-15 at 14:08 +1000, Con Kolivas wrote:
> Audio did show slightly larger max latencies but nothing that would be of 
> significance.
> 
> On video, maximum latencies are only slightly larger at HZ 250, all the 
> desired cpu was achieved, but the average latency and number of missed 
> deadlines was significantly higher.

Because audio timing is driven by the soundcard interrupt while video,
like MIDI, relies heavily on timers.

Lee 



Re: [discuss] Re: NUMA support for dual core Opteron

2005-07-14 Thread yhlu
EFI support in x86-64?

Does EFI only support IA64?

Is ACPI in EFI?

YH

On 7/14/05, Andi Kleen <[EMAIL PROTECTED]> wrote:
> On Thu, 14 Jul 2005 20:52:58 -0700
> yhlu <[EMAIL PROTECTED]> wrote:
> 
> > Andi,
> >
> > What do you think about making the x86-64 kernel support the openfirmware interface?
> 
> I don't like it. We already have the old x86 BIOS interfaces and ACPI
> and at some point EFI. No need for more.
> 
> -Andi
>


Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt

2005-07-14 Thread Con Kolivas
On Fri, 15 Jul 2005 09:25, Linus Torvalds wrote:
> On Thu, 14 Jul 2005, Lee Revell wrote:
> > On Thu, 2005-07-14 at 09:37 -0700, Linus Torvalds wrote:
> > > I have to say, this whole thread has been pretty damn worthless in
> > > general in my not-so-humble opinion.
> >
> > This thread has really gone OT, but to revisit the original issue for a
> > bit, are you still unwilling to consider leaving the default HZ at 1000
> > for 2.6.13?
>
> Yes. I see absolutely no point to it until I actually hear people who have
> actually tried some real load that doesn't work. Dammit, I want a real
> user who says that he can noticeably see his DVD stuttering, not some
> theory.

Disclaimer - This is not proof of a real world dvd stuttering, simply a 
benchmarked result. My code may be crap, but then the real apps out there may 
also be.

Results from interbench v0.21 
(http://ck.kolivas.org/apps/interbench/interbench-0.21.tar.bz2)

2.6.13-rc1 on a pentium4 3.06

HZ=1000:
--- Benchmarking Audio in the presence of loads ---
Load      Latency +/- SD (ms)   Max Latency   % Desired CPU   % Deadlines Met
None      0.012 +/- 0.00196     0.021         100             100
Video     1.28  +/- 0.509       2.01          100             100
X         0.289 +/- 0.578       2             100             100
Burn      0.014 +/- 0.002       0.023         100             100
Write     0.025 +/- 0.0349      0.49          100             100
Read      0.02  +/- 0.00383     0.052         100             100
Compile   0.023 +/- 0.00752     0.054         100             100
Memload   0.222 +/- 0.892       9.04          100             100

--- Benchmarking Video in the presence of loads ---
Load      Latency +/- SD (ms)   Max Latency   % Desired CPU   % Deadlines Met
None      0.012 +/- 0.00169     0.023         100             100
X         2.55  +/- 2.37        18.7          100             75.8
Burn      1.08  +/- 1.06        16.7          100             88.2
Write     0.224 +/- 0.215       16.7          100             97.8
Read      0.019 +/- 0.00354     0.059         100             100
Compile   4.55  +/- 4.53        17.6          100             57.5
Memload   1.3   +/- 1.34        51.5          100             88


HZ=250:
--- Benchmarking Audio in the presence of loads ---
Load      Latency +/- SD (ms)   Max Latency   % Desired CPU   % Deadlines Met
None      0.011 +/- 0.00152     0.022         100             100
Video     0.157 +/- 0.398       3.62          100             100
X         1.3   +/- 1.82        4.01          100             100
Burn      0.014 +/- 0.00142     0.026         100             100
Write     0.022 +/- 0.0125      0.092         100             100
Read      0.021 +/- 0.00366     0.048         100             100
Compile   0.03  +/- 0.0469      0.559         100             100
Memload   0.144 +/- 0.681       8.05          100             100

--- Benchmarking Video in the presence of loads ---
Load      Latency +/- SD (ms)   Max Latency   % Desired CPU   % Deadlines Met
None      5     +/- 4.99        16.7          100             54
X         9.98  +/- 8.94        20.7          100             31.2
Burn      16.6  +/- 16.6        16.7          100             0.167
Write     4.11  +/- 4.08        16.7          100             60.8
Read      2.55  +/- 2.53        16.7          100             73.8
Compile   15.6  +/- 15.6        17.7          100             3.5
Memload   2.91  +/- 2.92        45.4          100             72.5


Audio did show slightly larger max latencies but nothing that would be of 
significance.

On video, maximum latencies are only slightly larger at HZ 250, all the 
desired cpu was achieved, but the average latency and number of missed 
deadlines was significantly higher.

Cheers,
Con




Re: [discuss] Re: NUMA support for dual core Opteron

2005-07-14 Thread Andi Kleen
On Thu, 14 Jul 2005 20:52:58 -0700
yhlu <[EMAIL PROTECTED]> wrote:

> Andi,
> 
> What do you think about making the x86-64 kernel support the openfirmware interface?

I don't like it. We already have the old x86 BIOS interfaces and ACPI
and at some point EFI. No need for more.

-Andi


[ANNOUNCE] Interbench v0.21

2005-07-14 Thread Con Kolivas

Interbench is an application designed to benchmark interactivity in Linux.

Version 0.21 update

http://ck.kolivas.org/apps/interbench/interbench-0.21.tar.bz2


Changes:

Changed the design to run the benchmarked and background loads as separate 
processes that spawn their own threads instead of everything running as a 
thread of the same process. This was suggested to me originally by Ingo 
Molnar who noticed significant slowdown due to conflict over ->mm->mmap_sem, 
invalidating the benchmark results when run in real time mode. This makes a 
large difference to the latencies measured under mem_load particularly when 
running real time benchmarks on a RT-PREEMPT kernel.

Accounting changes to max_latency to only measure the largest latency of a 
single scheduling frame - this makes max_latency much smaller (and probably 
more realistic). Often you may see max latency exactly one frame wide now 
(consistent with one dropped frame) such as 16.7ms on video.

Minor cleanups.

Cheers,
Con




Re: [discuss] Re: NUMA support for dual core Opteron

2005-07-14 Thread yhlu
Andi,

What do you think about making the x86-64 kernel support the openfirmware interface?

Can we borrow some code from ppc64 arch?

YH


On 7/14/05, Andi Kleen <[EMAIL PROTECTED]> wrote:
> On Thu, Jul 14, 2005 at 07:46:49PM -0700, yhlu wrote:
> > p.s. can you use powernow when acpi is disabled?
> 
> Only on uniprocessor machines.
> 
> > p.p.s. Does powerpc64 support ACPI, or can ACPI only be used by x86?
> 
> powerpc64 uses openfirmware, not ACPI.
> 
> -Andi
>


Re: [PATCH] Filesystem capabilities support

2005-07-14 Thread Jesper Juhl
On 7/16/05, Nicholas Hans Simmonds <[EMAIL PROTECTED]> wrote:

While I'm not qualified to comment on the implementation I do have a
few small codingstyle comments :-)


> diff --git a/fs/read_write.c b/fs/read_write.c
> --- a/fs/read_write.c
> +++ b/fs/read_write.c
> @@ -14,6 +14,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
> 
>  #include 
>  #include 
> @@ -303,6 +304,16 @@ ssize_t vfs_write(struct file *file, con
> else
> ret = do_sync_write(file, buf, count, pos);
> if (ret > 0) {
> +#ifdef CONFIG_SECURITY_FS_CAPABILITIES
> +   struct dentry *d = file->f_dentry;
> +   if(d->d_inode->i_op && d->d_inode->i_op->

  if (d->d_inode->i_op ...

> +   removexattr) {
> +   down(&d->d_inode->i_sem);
> +   d->d_inode->i_op->removexattr(d,
> +   XATTR_CAP_SET);
> +   up(&d->d_inode->i_sem);
> +   }
> +#endif /* CONFIG_SECURITY_FS_CAPABILITIES */
> fsnotify_modify(file->f_dentry);
> current->wchar += ret;
> }
> diff --git a/include/linux/capability.h b/include/linux/capability.h
> --- a/include/linux/capability.h
> +++ b/include/linux/capability.h
> @@ -39,7 +39,19 @@ typedef struct __user_cap_data_struct {
>  __u32 permitted;
>  __u32 inheritable;
>  } __user *cap_user_data_t;
> -
> +
> +struct cap_xattr_data {
> +   __u32 version;
> +   __u32 mask_effective;
> +   __u32 effective;
> +   __u32 mask_permitted;
> +   __u32 permitted;
> +   __u32 mask_inheritable;
> +   __u32 inheritable;
> +};
> +
> +#define XATTR_CAP_SET XATTR_SECURITY_PREFIX "cap_set"
> +
>  #ifdef __KERNEL__
> 
>  #include 
> diff --git a/security/Kconfig b/security/Kconfig
> --- a/security/Kconfig
> +++ b/security/Kconfig
> @@ -60,6 +60,13 @@ config SECURITY_CAPABILITIES
>   This enables the "default" Linux capabilities functionality.
>   If you are unsure how to answer this question, answer Y.
> 
> +config SECURITY_FS_CAPABILITIES
> +   bool "Filesystem Capabilities (EXPERIMENTAL)"
> +   depends on SECURITY && EXPERIMENTAL
> +   help
> + This permits a process' capabilities to be set by an extended
> + attribute in the security namespace (security.cap_set).
> +
>  config SECURITY_ROOTPLUG
> tristate "Root Plug Support"
> depends on USB && SECURITY
> diff --git a/security/commoncap.c b/security/commoncap.c
> --- a/security/commoncap.c
> +++ b/security/commoncap.c
> @@ -111,9 +111,15 @@ void cap_capset_set (struct task_struct
> 
>  int cap_bprm_set_security (struct linux_binprm *bprm)
>  {
> +#ifdef CONFIG_SECURITY_FS_CAPABILITIES
> +   ssize_t (*bprm_getxattr)(struct dentry *,const char *,void *,size_t);
> +   struct dentry *bprm_dentry;
> +   ssize_t ret;
> +   struct cap_xattr_data caps;
> +#endif /* CONFIG_SECURITY_FS_CAPABILITIES */
> +
> /* Copied from fs/exec.c:prepare_binprm. */
> 
> -   /* We don't have VFS support for capabilities yet */
> cap_clear (bprm->cap_inheritable);
> cap_clear (bprm->cap_permitted);
> cap_clear (bprm->cap_effective);
> @@ -134,6 +140,44 @@ int cap_bprm_set_security (struct linux_
> if (bprm->e_uid == 0)
> cap_set_full (bprm->cap_effective);
> }
> +
> +#ifdef CONFIG_SECURITY_FS_CAPABILITIES
> +   /* Locate any VFS capabilities: */
> +
> +   bprm_dentry = bprm->file->f_dentry;
> +   if(!(bprm_dentry->d_inode->i_op &&

  if (!(bprm_dentry->d_inode->i_op ...

> +   bprm_dentry->d_inode->i_op->getxattr))
> +   return 0;
> +   bprm_getxattr = bprm_dentry->d_inode->i_op->getxattr;
> +
> +   down(&bprm_dentry->d_inode->i_sem);
> +   ret = bprm_getxattr(bprm_dentry,XATTR_CAP_SET,&caps,sizeof(caps));
> +   up(&bprm_dentry->d_inode->i_sem);
> +   if(ret == sizeof(caps)) {

  if (ret == sizeof(caps)) {

> +   be32_to_cpus(&caps.version);
> +   be32_to_cpus(&caps.effective);
> +   be32_to_cpus(&caps.mask_effective);
> +   be32_to_cpus(&caps.permitted);
> +   be32_to_cpus(&caps.mask_permitted);
> +   be32_to_cpus(&caps.inheritable);
> +   be32_to_cpus(&caps.mask_inheritable);
> +   if(caps.version == _LINUX_CAPABILITY_VERSION) {

  if (caps.version ...

> +   cap_t(bprm->cap_effective) &= caps.mask_effective;
> +   cap_t(bprm->cap_effective) |= caps.effective;
> +
> +   cap_t(bprm->cap_permitted) &= caps.mask_permitted;
> +   

Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt

2005-07-14 Thread Jesper Juhl

Linus Torvalds wrote:


On Thu, 14 Jul 2005, Lee Revell wrote:


I don't think this will fly because we take a big performance hit by
calculating HZ at runtime.



I think it might be an acceptable solution for a distribution that really 
needed it, since it should be fairly simple. However, it's definitely not 
the right solution.


HOWEVER. I bet that somebody who really really cares (hint hint) could
easily make HZ be 1000, and then dynamically tweak the divisor at bootup
to be either 1000, 250, or 100, and then increment "jiffies" by 1, 4 or
10.

My wild guess is that this is 20 lines of code, plus another 20 for 
"setup", so that you can choose between 100/250/1000 Hz with a kernel 
command line.


It wouldn't be "dynamic" at first - you'd just set it up at bootup, and 
set a "jiffies_increment" variable, and change the


jiffies_64++;

into

jiffies_64 += jiffies_increment;

and you'd be done. 

Really. I dare you guys. First one to send me a tested patch gets a gold 
star.
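
A rough userspace sketch of the scheme described above, for illustration
only. It assumes HZ stays fixed at 1000 while the timer interrupt fires 100,
250 or 1000 times per second; jiffies_increment_for and simulate_jiffies are
made-up names, not kernel functions.

```c
#include <stdint.h>

#define HZ 1000	/* compile-time constant, as in the proposal */

/*
 * Hypothetical helper: map the real interrupt rate to the step added to
 * jiffies on every tick. Returns 0 for a rate that does not divide HZ.
 */
unsigned int jiffies_increment_for(unsigned int timer_hz)
{
	if (timer_hz == 0 || timer_hz > HZ || HZ % timer_hz != 0)
		return 0;
	return HZ / timer_hz;
}

/* Count jiffies over a number of seconds of timer interrupts. */
uint64_t simulate_jiffies(unsigned int timer_hz, unsigned int seconds)
{
	uint64_t jiffies_64 = 0;
	uint64_t ticks = (uint64_t)timer_hz * seconds;
	unsigned int jiffies_increment = jiffies_increment_for(timer_hz);
	uint64_t i;

	for (i = 0; i < ticks; i++)
		jiffies_64 += jiffies_increment;	/* was: jiffies_64++ */

	return jiffies_64;
}
```

In each supported case one second of interrupts advances the counter by
1000, which is the property the scheme relies on. If the interrupt rate were
left at 1000 per second while the increment is 4, the same loop would add
4000 per second.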




I don't know if this will earn me that gold star or not, but it's what I 
have at this point.
It's buggy, that I know. Setting kernel_hz (the new boot parameter) to
250 causes my system clock to run at something like 4-5 times normal
speed, key repeat is all funny (super fast), and there are various other
nasty things. However, it does build and it does boot (i386 only, as
that's all I have), and apart from the kernel's notion of time being way
off it does seem to be a step in the right direction.
My never having looked at the kernel's timer code before is most likely
the explanation for the bugs, that and the fact that I didn't try to
tackle the bogomips calculation at all...  Oh, and it'll probably be
even stranger if you build the kernel with CONFIG_HZ set to anything
other than 1000. If this is what we want to do, then I guess CONFIG_HZ
needs to go away completely and just always be 1000, and people should
use the boot option to change it if needed/wanted.


Anyway, is this somewhere along the lines of what you were thinking?

The patch is below, but I've also attached it since I suspect 
thunderbird will mangle it.



diff -upr linux-2.6.13-rc3-orig/arch/i386/kernel/time.c linux-2.6.13-rc3/arch/i386/kernel/time.c
--- linux-2.6.13-rc3-orig/arch/i386/kernel/time.c	2005-07-14 20:33:35.0 +0200
+++ linux-2.6.13-rc3/arch/i386/kernel/time.c	2005-07-15 04:02:50.0 +0200

@@ -75,6 +75,7 @@ int pit_latch_buggy;  /* ext
 #include "do_timer.h"

 u64 jiffies_64 = INITIAL_JIFFIES;
+u16 jiffies_increment = 1;

 EXPORT_SYMBOL(jiffies_64);

@@ -481,3 +482,27 @@ void __init time_init(void)

time_init_hook();
 }
+
+static int __init jiffies_increment_setup(char *str)
+{
+   printk(KERN_NOTICE "setting up jiffies_increment : ");
+   if (str) {
+   printk("kernel_hz = %s, ", str);
+   } else {
+   printk("kernel_hz is unset, ");
+   }
+   if (!strncmp("100", str, 3)) {
+   jiffies_increment = 10;
+   printk("jiffies_increment set to 10, effective HZ will be 100\n");
+   } else if (!strncmp("250", str, 3)) {
+   jiffies_increment = 4;
+   printk("jiffies_increment set to 4, effective HZ will be 250\n");
+   } else {
+   jiffies_increment = 1;
+   printk("jiffies_increment set to 1, effective HZ will be 1000\n");
+   }
+
+   return 1;
+}
+
+__setup("kernel_hz=", jiffies_increment_setup);
diff -upr linux-2.6.13-rc3-orig/arch/i386/kernel/timers/timer_pm.c linux-2.6.13-rc3/arch/i386/kernel/timers/timer_pm.c
--- linux-2.6.13-rc3-orig/arch/i386/kernel/timers/timer_pm.c	2005-07-14 20:33:35.0 +0200
+++ linux-2.6.13-rc3/arch/i386/kernel/timers/timer_pm.c	2005-07-15 02:59:39.0 +0200

@@ -176,7 +176,7 @@ static void mark_offset_pmtmr(void)

/* compensate for lost ticks */
if (lost >= 2)
-   jiffies_64 += lost - 1;
+   jiffies_64 += (lost * jiffies_increment) - 1;

/* don't calculate delay for first run,
   or if we've got less then a tick */
diff -upr linux-2.6.13-rc3-orig/arch/i386/kernel/timers/timer_tsc.c linux-2.6.13-rc3/arch/i386/kernel/timers/timer_tsc.c
--- linux-2.6.13-rc3-orig/arch/i386/kernel/timers/timer_tsc.c	2005-07-14 20:33:35.0 +0200
+++ linux-2.6.13-rc3/arch/i386/kernel/timers/timer_tsc.c	2005-07-15 02:59:13.0 +0200

@@ -193,7 +193,7 @@ static void mark_offset_tsc_hpet(void)
offset = hpet_readl(HPET_T0_CMP) - hpet_tick;
if (unlikely(((offset - hpet_last) > hpet_tick) && (hpet_last != 0))) {
int lost_ticks = (offset - hpet_last) / hpet_tick;
-   jiffies_64 += lost_ticks;
+   jiffies_64 += lost_ticks * jiffies_increment;
}
hpet_last = hpet_current;

@@ -415,7 +415,7 @@ static void mark_offset_tsc(void)
lost = delta/(1000000/HZ);
delay = delta%(1000000/HZ);
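
Two details of the hunks above may be worth a second look. First,
strncmp("100", str, 3) also matches "1000", so kernel_hz=1000 would select
an increment of 10. Second, the lost-tick compensation in timer_pm.c
subtracts 1 after scaling rather than before. A small userspace sketch of
what was presumably intended; parse_kernel_hz and lost_tick_jiffies are
hypothetical names and this is not the posted patch.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for the kernel_hz= handler. Only an exact "100"
 * or "250" changes the increment; anything else, including "1000" and
 * NULL, falls back to 1.
 */
unsigned int parse_kernel_hz(const char *str)
{
	if (str == NULL)
		return 1;
	if (strcmp(str, "100") == 0)
		return 10;
	if (strcmp(str, "250") == 0)
		return 4;
	return 1;
}

/*
 * Jiffies to add for lost ticks. The tick being handled is accounted for
 * by the normal path, so only (lost - 1) ticks need compensating, and each
 * of them is worth one increment.
 */
unsigned long long lost_tick_jiffies(unsigned int lost, unsigned int increment)
{
	if (lost < 2)
		return 0;
	return (unsigned long long)(lost - 1) * increment;
}
```

For lost = 3 and an increment of 4 this yields 8, where the expression in
the hunk, (lost * jiffies_increment) - 1, yields 11.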

Re: [PATCH] ramfs: pretend dirent sizes

2005-07-14 Thread Andrew Morton
Jan Blunck <[EMAIL PROTECTED]> wrote:
>
> This patch adds bogo dirent sizes for ramfs like already available for 
> tmpfs.
> 
> Although i_size of directories isn't covered by the POSIX standard it is 
> a bad idea to always set it to zero. Therefore pretend a bogo dirent 
> size for directory i_sizes.
> 

Does it really matter?

> +static int ramfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *dentry)
> +{
> + dir->i_size += BOGO_DIRENT_SIZE;
> + return simple_link(old_dentry, dir, dentry);
> +}
> +
> +static int ramfs_unlink(struct inode *dir, struct dentry *dentry)
> +{
> + dir->i_size -= BOGO_DIRENT_SIZE;
> + return simple_unlink(dir, dentry);
> +}
> +
> +static int ramfs_rmdir(struct inode *dir, struct dentry *dentry)
> +{
> + int ret;
> +
> + ret = simple_rmdir(dir, dentry);
> + if (ret != -ENOTEMPTY)
> + dir->i_size -= BOGO_DIRENT_SIZE;
> +
> + return ret;
> +}
> +
> +static int ramfs_rename(struct inode *old_dir, struct dentry *old_dentry,
> + struct inode *new_dir, struct dentry *new_dentry)
> +{
> + int ret;
> +
> + ret = simple_rename(old_dir, old_dentry, new_dir, new_dentry);
> + if (ret != -ENOTEMPTY) {
> + old_dir->i_size -= BOGO_DIRENT_SIZE;
> + new_dir->i_size += BOGO_DIRENT_SIZE;
> + }
> +
> + return ret;
> +}
> +

I wonder if these should be in libfs - sysfs has the same problem, for
example and someone might want to come along and fix that up too.


Re: [PATCH] Filesystem capabilities support

2005-07-14 Thread Nicholas Hans Simmonds
On Thu, Jul 14, 2005 at 04:05:17PM -0400, Horst von Brand wrote:
> Nicholas Hans Simmonds <[EMAIL PROTECTED]> wrote:
> 
> [...]
> 
> > Other than this, what are the general thoughts about this method as
> > opposed to just using a well defined byte order?
> 
> I'd prefer a defined byte order. That way it won't bite too hard if I
> happen to move a filesystem (image) from PC to SPARC or whatever.
> -- 
> Dr. Horst H. von Brand   User #22616 counter.li.org
> Departamento de Informatica Fono: +56 32 654431
> Universidad Tecnica Federico Santa Maria  +56 32 654239
> Casilla 110-V, Valparaiso, ChileFax:  +56 32 797513

Fair enough. With inotify now in Linus' tree, my patch will conflict
so I've fixed this in the following new patch which also stores the
xattr data in big-endian format. I've tested it this time and it
seems to work. Also if anyone can think of a neater method of byte-
swapping the structure (some sort of string operation?) I'd be glad
to hear it as the current code looks a bit ugly.

Thanks as ever,

Nicholas
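
On the question of a neater way to byte-swap the structure: since every
member of struct cap_xattr_data is a 32-bit word, one option is to treat the
attribute value as an array of big-endian words and convert it in a loop. A
userspace sketch under that assumption follows. cap_xattr_decode and
CAP_XATTR_WORDS are invented names; in the kernel the loop body would
presumably be be32_to_cpus() on each word.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Userspace copy of the structure from the patch (__u32 -> uint32_t). */
struct cap_xattr_data {
	uint32_t version;
	uint32_t mask_effective;
	uint32_t effective;
	uint32_t mask_permitted;
	uint32_t permitted;
	uint32_t mask_inheritable;
	uint32_t inheritable;
};

#define CAP_XATTR_WORDS (sizeof(struct cap_xattr_data) / sizeof(uint32_t))

/*
 * Decode a big-endian attribute value into host byte order.
 * Returns 0 on success, -1 if the length does not match the structure.
 */
int cap_xattr_decode(const unsigned char *buf, size_t len,
		     struct cap_xattr_data *out)
{
	uint32_t words[CAP_XATTR_WORDS];
	size_t i;

	if (len != sizeof(words))
		return -1;

	for (i = 0; i < CAP_XATTR_WORDS; i++) {
		const unsigned char *p = buf + 4 * i;

		words[i] = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
			   ((uint32_t)p[2] << 8) | (uint32_t)p[3];
	}

	/* All members are uint32_t, so the struct has no padding. */
	memcpy(out, words, sizeof(words));
	return 0;
}
```

Adding a field to the structure then needs no change to the conversion
code, as long as the new field is also a 32-bit word.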

diff --git a/fs/read_write.c b/fs/read_write.c
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -14,6 +14,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -303,6 +304,16 @@ ssize_t vfs_write(struct file *file, con
else
ret = do_sync_write(file, buf, count, pos);
if (ret > 0) {
+#ifdef CONFIG_SECURITY_FS_CAPABILITIES
+   struct dentry *d = file->f_dentry;
+   if(d->d_inode->i_op && d->d_inode->i_op->
+   removexattr) {
+   down(&d->d_inode->i_sem);
+   d->d_inode->i_op->removexattr(d,
+   XATTR_CAP_SET);
+   up(&d->d_inode->i_sem);
+   }
+#endif /* CONFIG_SECURITY_FS_CAPABILITIES */
fsnotify_modify(file->f_dentry);
current->wchar += ret;
}
diff --git a/include/linux/capability.h b/include/linux/capability.h
--- a/include/linux/capability.h
+++ b/include/linux/capability.h
@@ -39,7 +39,19 @@ typedef struct __user_cap_data_struct {
 __u32 permitted;
 __u32 inheritable;
 } __user *cap_user_data_t;
-  
+
+struct cap_xattr_data {
+   __u32 version;
+   __u32 mask_effective;
+   __u32 effective;
+   __u32 mask_permitted;
+   __u32 permitted;
+   __u32 mask_inheritable;
+   __u32 inheritable;
+};
+
+#define XATTR_CAP_SET XATTR_SECURITY_PREFIX "cap_set"
+
 #ifdef __KERNEL__
 
 #include 
diff --git a/security/Kconfig b/security/Kconfig
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -60,6 +60,13 @@ config SECURITY_CAPABILITIES
  This enables the "default" Linux capabilities functionality.
  If you are unsure how to answer this question, answer Y.
 
+config SECURITY_FS_CAPABILITIES
+   bool "Filesystem Capabilities (EXPERIMENTAL)"
+   depends on SECURITY && EXPERIMENTAL
+   help
+ This permits a process' capabilities to be set by an extended
+ attribute in the security namespace (security.cap_set).
+
 config SECURITY_ROOTPLUG
tristate "Root Plug Support"
depends on USB && SECURITY
diff --git a/security/commoncap.c b/security/commoncap.c
--- a/security/commoncap.c
+++ b/security/commoncap.c
@@ -111,9 +111,15 @@ void cap_capset_set (struct task_struct 
 
 int cap_bprm_set_security (struct linux_binprm *bprm)
 {
+#ifdef CONFIG_SECURITY_FS_CAPABILITIES
+   ssize_t (*bprm_getxattr)(struct dentry *,const char *,void *,size_t);
+   struct dentry *bprm_dentry;
+   ssize_t ret;
+   struct cap_xattr_data caps;
+#endif /* CONFIG_SECURITY_FS_CAPABILITIES */
+
/* Copied from fs/exec.c:prepare_binprm. */
 
-   /* We don't have VFS support for capabilities yet */
cap_clear (bprm->cap_inheritable);
cap_clear (bprm->cap_permitted);
cap_clear (bprm->cap_effective);
@@ -134,6 +140,44 @@ int cap_bprm_set_security (struct linux_
if (bprm->e_uid == 0)
cap_set_full (bprm->cap_effective);
}
+
+#ifdef CONFIG_SECURITY_FS_CAPABILITIES
+   /* Locate any VFS capabilities: */
+
+   bprm_dentry = bprm->file->f_dentry;
+   if(!(bprm_dentry->d_inode->i_op &&
+   bprm_dentry->d_inode->i_op->getxattr))
+   return 0;
+   bprm_getxattr = bprm_dentry->d_inode->i_op->getxattr;
+
+   down(&bprm_dentry->d_inode->i_sem);
+   ret = bprm_getxattr(bprm_dentry,XATTR_CAP_SET,&caps,sizeof(caps));
+   up(&bprm_dentry->d_inode->i_sem);
+   if(ret == sizeof(caps)) {
+   be32_to_cpus(&caps.version);
+   be32_to_cpus(&caps.effective);
+   

RFC: IOMMU bypass

2005-07-14 Thread Stephen Rothwell
Hi all,

We (Anton Blanchard and others) have been trying to figure out the best
(or any) way to allow for IOMMU bypass when setting up DMA mappings on
particular devices.  Our current idea is to hang a structure of pointers
to DMA mapping operations off the struct device and inherit it from the
device's parent.  This would allow for per-bus (rather than per-bus_type)
mapping operations and also allow a driver to override the bus's
operations for a particular device.
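
To make the proposal concrete, here is a minimal userspace sketch. All names
(struct dma_mapping_ops, the dma_ops member, device_dma_ops) are
hypothetical, and inheritance is modelled as a lookup that falls back to the
parent; copying the pointer when the device is registered would be the other
obvious choice.

```c
#include <stddef.h>

struct device;

/* Hypothetical table of DMA mapping operations. */
struct dma_mapping_ops {
	const char *name;
	unsigned long (*map_single)(struct device *dev, void *ptr, size_t size);
};

/* Cut-down stand-in for struct device. */
struct device {
	struct device *parent;
	const struct dma_mapping_ops *dma_ops;	/* NULL: inherit from parent */
};

/* Find the operations in effect for a device by walking up the tree. */
const struct dma_mapping_ops *device_dma_ops(const struct device *dev)
{
	for (; dev != NULL; dev = dev->parent)
		if (dev->dma_ops != NULL)
			return dev->dma_ops;
	return NULL;
}
```

A bus device would carry the IOMMU operations, its children would leave the
pointer NULL, and a driver wanting bypass would set its own device's pointer
to a direct-mapping table.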

Does this make sense?  Comments (hopefully constructive) please.

Is there a better/simpler/more sensible way to do this?

-- 
Cheers,
Stephen Rothwell[EMAIL PROTECTED]
http://www.canb.auug.org.au/~sfr/




Re: [discuss] Re: NUMA support for dual core Opteron

2005-07-14 Thread Andi Kleen
On Thu, Jul 14, 2005 at 07:46:49PM -0700, yhlu wrote:
> p.s. can you use powernow when acpi is disabled?

Only on uniprocessor machines.

> p.p.s. Does powerpc64 support ACPI, or can ACPI only be used on x86?

powerpc64 uses openfirmware, not ACPI.

-Andi


Re: 2.6.13-rc3 ACPI regression and hang on x86-64

2005-07-14 Thread Yu, Luming
On Thursday 14 July 2005 18:36, Mikael Pettersson wrote:
> On my x86-64 laptop (Targa Visionary 811: Athlon64 + VIA chipset,
> Arima OEM:d HW also sold by eMachines and others), ACPI is broken
> and hangs the x86-64 2.6.13-rc3 kernel.
>
> During boot, ACPI reduces the screen's brightness (it's always
> done this in the x86-64 kernels but not the i386 ones), so I
> have to press a specific key combination (Fn+F8) to increase the
> brightness. This worked up to and including the 2.6.13-rc2 kernel,
> but with 2.6.13-rc3 it causes an error message:
>
> acpi_ec-0217 [04] acpi_ec_leave_burst_mo: --->status fail
This message is a warning.

>
> on the console, and then the machine is hung hard.
If you don't press that key, does the machine still hang?

>
> With the i386 kernel, both this key combination and the other one
> for reducing the brightness work as expected.
>


[PATCH] ramfs: pretend dirent sizes

2005-07-14 Thread Jan Blunck
This patch adds bogo dirent sizes for ramfs like already available for 
tmpfs.


Although i_size of directories isn't covered by the POSIX standard it is 
a bad idea to always set it to zero. Therefore pretend a bogo dirent 
size for directory i_sizes.


Jan

Signed-off-by: Jan Blunck <[EMAIL PROTECTED]>

 fs/ramfs/inode.c |   51 +++
 1 files changed, 47 insertions(+), 4 deletions(-)

Index: linux-2.6/fs/ramfs/inode.c
===
--- linux-2.6.orig/fs/ramfs/inode.c
+++ linux-2.6/fs/ramfs/inode.c
@@ -38,6 +38,9 @@
 /* some random number */
 #define RAMFS_MAGIC	0x858458f6
 
+/* Pretend that each entry is of this size in directory's i_size */
+#define BOGO_DIRENT_SIZE 20
+
 static struct super_operations ramfs_ops;
 static struct address_space_operations ramfs_aops;
 static struct inode_operations ramfs_file_inode_operations;
@@ -77,6 +80,7 @@ struct inode *ramfs_get_inode(struct sup
 
 			/* directory inodes start off with i_nlink == 2 (for "." entry) */
 			inode->i_nlink++;
+			inode->i_size = 2 * BOGO_DIRENT_SIZE;
 			break;
 		case S_IFLNK:
 			inode->i_op = &page_symlink_inode_operations;
@@ -97,6 +101,7 @@ ramfs_mknod(struct inode *dir, struct de
 	int error = -ENOSPC;
 
 	if (inode) {
+		dir->i_size += BOGO_DIRENT_SIZE;
 		if (dir->i_mode & S_ISGID) {
 			inode->i_gid = dir->i_gid;
 			if (S_ISDIR(mode))
@@ -132,6 +137,7 @@ static int ramfs_symlink(struct inode * 
 		int l = strlen(symname)+1;
 		error = page_symlink(inode, symname, l);
 		if (!error) {
+			dir->i_size += BOGO_DIRENT_SIZE;
 			if (dir->i_mode & S_ISGID)
 inode->i_gid = dir->i_gid;
 			d_instantiate(dentry, inode);
@@ -142,6 +148,43 @@ static int ramfs_symlink(struct inode * 
 	return error;
 }
 
+static int ramfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *dentry)
+{
+	dir->i_size += BOGO_DIRENT_SIZE;
+	return simple_link(old_dentry, dir, dentry);
+}
+
+static int ramfs_unlink(struct inode *dir, struct dentry *dentry)
+{
+	dir->i_size -= BOGO_DIRENT_SIZE;
+	return simple_unlink(dir, dentry);
+}
+
+static int ramfs_rmdir(struct inode *dir, struct dentry *dentry)
+{
+	int ret;
+
+	ret = simple_rmdir(dir, dentry);
+	if (ret != -ENOTEMPTY)
+		dir->i_size -= BOGO_DIRENT_SIZE;
+
+	return ret;
+}
+
+static int ramfs_rename(struct inode *old_dir, struct dentry *old_dentry,
+			struct inode *new_dir, struct dentry *new_dentry)
+{
+	int ret;
+
+	ret = simple_rename(old_dir, old_dentry, new_dir, new_dentry);
+	if (ret != -ENOTEMPTY) {
+		old_dir->i_size -= BOGO_DIRENT_SIZE;
+		new_dir->i_size += BOGO_DIRENT_SIZE;
+	}
+
+	return ret;
+}
+
 static struct address_space_operations ramfs_aops = {
 	.readpage	= simple_readpage,
 	.prepare_write	= simple_prepare_write,
@@ -164,13 +207,13 @@ static struct inode_operations ramfs_fil
 static struct inode_operations ramfs_dir_inode_operations = {
 	.create		= ramfs_create,
 	.lookup		= simple_lookup,
-	.link		= simple_link,
-	.unlink		= simple_unlink,
+	.link		= ramfs_link,
+	.unlink		= ramfs_unlink,
 	.symlink	= ramfs_symlink,
 	.mkdir		= ramfs_mkdir,
-	.rmdir		= simple_rmdir,
+	.rmdir		= ramfs_rmdir,
 	.mknod		= ramfs_mknod,
-	.rename		= simple_rename,
+	.rename		= ramfs_rename,
 };
 
 static struct super_operations ramfs_ops = {
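
The accounting the patch performs can be modelled outside the kernel. A
minimal sketch, assuming a directory starts at two pretend entries and each
successful create or remove moves i_size by BOGO_DIRENT_SIZE; struct
dir_model and the dir_* helpers are hypothetical.

```c
/* Same value as in the patch. */
#define BOGO_DIRENT_SIZE 20

/* Hypothetical stand-in for a directory inode. */
struct dir_model {
	long long i_size;
	unsigned int entries;
};

/* A new directory is sized as if it held two entries. */
void dir_init(struct dir_model *dir)
{
	dir->entries = 0;
	dir->i_size = 2 * BOGO_DIRENT_SIZE;
}

/* Account for one new entry (mknod, symlink, link, rename target). */
void dir_add_entry(struct dir_model *dir)
{
	dir->entries++;
	dir->i_size += BOGO_DIRENT_SIZE;
}

/*
 * Account for one removed entry. The size is only touched once the
 * removal is known to be possible. Returns 0 on success, -1 otherwise.
 */
int dir_remove_entry(struct dir_model *dir)
{
	if (dir->entries == 0)
		return -1;
	dir->entries--;
	dir->i_size -= BOGO_DIRENT_SIZE;
	return 0;
}
```

Adjusting the size only after the operation has succeeded keeps the value
from drifting when an operation fails for a reason other than the one being
checked.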


[PATCH] I2C-MPC: Restore code removed

2005-07-14 Thread Kumar Gala
Greg,

If we can get this to Linus before 2.6.13 is released that would be a good 
thing.

I2C-MPC: Restore code removed

A previous patch to remove support for the OCP device model was way
too generous and removed some of the platform device model code, oops.

Signed-off-by: Kumar Gala <[EMAIL PROTECTED]>

---
commit 0a224850142b1169b5a67735e51268057d24b833
tree fd8ba3598b6ba6585e2dd90c2d53138d7e6c7d17
parent 1eccf76f7b943d1e4b722c7f0847876127cad8b7
author Kumar K. Gala <[EMAIL PROTECTED]> Thu, 14 Jul 2005 21:41:36 -0500
committer Kumar K. Gala <[EMAIL PROTECTED]> Thu, 14 Jul 2005 21:41:36 -0500

 drivers/i2c/busses/i2c-mpc.c |   94 ++
 1 files changed, 94 insertions(+), 0 deletions(-)

diff --git a/drivers/i2c/busses/i2c-mpc.c b/drivers/i2c/busses/i2c-mpc.c
--- a/drivers/i2c/busses/i2c-mpc.c
+++ b/drivers/i2c/busses/i2c-mpc.c
@@ -288,6 +288,100 @@ static struct i2c_adapter mpc_ops = {
.retries = 1
 };
 
+static int fsl_i2c_probe(struct device *device)
+{
+   int result = 0;
+   struct mpc_i2c *i2c;
+   struct platform_device *pdev = to_platform_device(device);
+   struct fsl_i2c_platform_data *pdata;
+   struct resource *r = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+
+   pdata = (struct fsl_i2c_platform_data *) pdev->dev.platform_data;
+
+   if (!(i2c = kmalloc(sizeof(*i2c), GFP_KERNEL))) {
+   return -ENOMEM;
+   }
+   memset(i2c, 0, sizeof(*i2c));
+
+   i2c->irq = platform_get_irq(pdev, 0);
+   i2c->flags = pdata->device_flags;
+   init_waitqueue_head(&i2c->queue);
+
+   i2c->base = ioremap((phys_addr_t)r->start, MPC_I2C_REGION);
+
+   if (!i2c->base) {
+   printk(KERN_ERR "i2c-mpc - failed to map controller\n");
+   result = -ENOMEM;
+   goto fail_map;
+   }
+
+   if (i2c->irq != 0)
+   if ((result = request_irq(i2c->irq, mpc_i2c_isr,
+ SA_SHIRQ, "i2c-mpc", i2c)) < 0) {
+   printk(KERN_ERR
+  "i2c-mpc - failed to attach interrupt\n");
+   goto fail_irq;
+   }
+
+   mpc_i2c_setclock(i2c);
+   dev_set_drvdata(device, i2c);
+
+   i2c->adap = mpc_ops;
+   i2c_set_adapdata(&i2c->adap, i2c);
+   i2c->adap.dev.parent = &pdev->dev;
+   if ((result = i2c_add_adapter(&i2c->adap)) < 0) {
+   printk(KERN_ERR "i2c-mpc - failed to add adapter\n");
+   goto fail_add;
+   }
+
+   return result;
+
+  fail_add:
+   if (i2c->irq != 0)
+   free_irq(i2c->irq, NULL);
+  fail_irq:
+   iounmap(i2c->base);
+  fail_map:
+   kfree(i2c);
+   return result;
+};
+
+static int fsl_i2c_remove(struct device *device)
+{
+   struct mpc_i2c *i2c = dev_get_drvdata(device);
+
+   i2c_del_adapter(&i2c->adap);
+   dev_set_drvdata(device, NULL);
+
+   if (i2c->irq != 0)
+   free_irq(i2c->irq, i2c);
+
+   iounmap(i2c->base);
+   kfree(i2c);
+   return 0;
+};
+
+/* Structure for a device driver */
+static struct device_driver fsl_i2c_driver = {
+   .name = "fsl-i2c",
+   .bus = &platform_bus_type,
+   .probe = fsl_i2c_probe,
+   .remove = fsl_i2c_remove,
+};
+
+static int __init fsl_i2c_init(void)
+{
+   return driver_register(&fsl_i2c_driver);
+}
+
+static void __exit fsl_i2c_exit(void)
+{
+   driver_unregister(&fsl_i2c_driver);
+}
+
+module_init(fsl_i2c_init);
+module_exit(fsl_i2c_exit);
+
 MODULE_AUTHOR("Adrian Cox <[EMAIL PROTECTED]>");
 MODULE_DESCRIPTION
 ("I2C-Bus adapter for MPC107 bridge and MPC824x/85xx/52xx processors");


Re: [discuss] Re: NUMA support for dual core Opteron

2005-07-14 Thread yhlu
I didn't see any problem with NUMA on a LinuxBIOS + 8-way dual-core system.
Of course, ACPI support in the kernel is disabled.

p.s. can you use powernow when acpi is disabled?
p.p.s. Does powerpc64 support ACPI, or can ACPI only be used on x86?

YH

On 7/14/05, Andi Kleen <[EMAIL PROTECTED]> wrote:
> [closed mailing list dropped. Sorry I have no plans to argue with
> your mailbots]
> 
> On Thu, Jul 14, 2005 at 01:00:01PM -0600, Ronald G. Minnich wrote:
> > if there is any chance of getting along without ACPI entries that is best.
> > Linux did do this once already, for SMP K8: K8 can boot and run NUMA
> > without an SRAT table. What more is needed for dual core, and could Linux
> > support in this area be extended?
> 
> The dual core NUMA parsing problem could be probably fixed. I personally
> have no plans to work on it though, since the ACPI method works fine.
> 
> Feel free to submit patches.
> 
> However with 90+W CPUs I would strongly recommend having support
> for PowerNow! and the old style PST table doesn't support
> dual core or SMP, so you need ACPI for that anyways.
> 
> -Andi
>


PS/2 Keyboard is dead after resume.

2005-07-14 Thread Andrew Haninger
Hello.

I'm using Linux Kernel 2.6.12.2 plus suspend 2.1.9.9 and acpi-20050408
with the hibernate-1.10 script. My machine is a Shuttle SK43G which
has a VIA KM400 chipset with an Athlon XP CPU.

Suspension seems to work well. However, when I resume, the keyboard is
dead and there is a warning in dmesg before and after suspension:

atkbd.c: Spurious ACK on isa0060/serio0. Some program, like XFree86,
might be trying access hardware directly.
Please include the following information in bug reports:
- SUSPEND core   : 2.1.9.9
- Kernel Version : 2.6.12.2
- Compiler vers. : 3.3
- Attempt number : 1
- Pageset sizes  : 5821 (5821 low) and 118350 (118350 low).
- Parameters : 0 32 0 1 0 5
- Calculations   : Image size: 124376. Ram to suspend: 2240.
- Limits : 126960 pages RAM. Initial boot: 123894.
- Overall expected compression percentage: 0.
- Compressor lzf enabled.
  Compressed 508604416 bytes into 23739845 (95 percent compression).
- Swapwriter active.
  Swap available for image: 487964 pages.
- Filewriter inactive.
- Preemptive kernel.
- Max extents used: 4
- I/O speed: Write 251 MB/s, Read 198 MB/s.
Resume block device is defe0860.
Real Time Clock Driver v1.12
atkbd.c: Spurious ACK on isa0060/serio0. Some program, like XFree86,
might be trying access hardware directly.


This machine doesn't have XFree86 on it.

I am presuming that this is a bug since I've used the exact same
kernel+patches (with hibernate 1.09 script) on another machine without
issues. I'm not sure if it's a suspension bug or if it's a kernel bug
that is brought to light by the suspend2 patches. If I'm wrong and
I've made a mistake, I'd love to hear it.

Thanks.

-Andy


Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt

2005-07-14 Thread Fernando Lopez-Lezcano
On Thu, 2005-07-14 at 16:49, Linus Torvalds wrote:
> On Thu, 14 Jul 2005, Lee Revell wrote:
> > 
> > And I'm incredibly frustrated by this insistence on hard data when it's
> > completely obvious to anyone who knows the first thing about MIDI that
> > HZ=250 will fail in situations where HZ=1000 succeeds.
> 
> Ok, guys. How many people have this MIDI thing? How many of you can't be 
> bothered to set the default to suit your usage?
> 
> > It's straight from the MIDI spec.  Your argument is pretty close to "the
> > MIDI spec is wrong, no one can hear the difference between 1ms and 4ms".
> 
> No.
> 
> YOUR argument is "nobody else matters, only I do".
> 
> MY argument is that this is a case of give and take. 

Take from "few" multimedia users, give to "many" laptop users. Where
"few" and "many" are not very well defined quantities, but obviously
"many" > "few" :-) 

As to how few is few. I don't claim to know, but users that bother to
subscribe to the Planet CCRMA[*] mailing list number 750+, so that's one
datapoint. Total users of Planet CCRMA, I have no idea. Most of them
will use MIDI, either externally through hardware interfaces or
internally through the ALSA sequencer api. 

Planet CCRMA includes custom kernels with Ingo's patches for low
latency, so I will have to configure them with HZ=1000 (or 500 or
whatever) in 2.6.13+. Oh well. 

HZ=250 is a setback anyway, as many advances had been made recently in
the stock kernel that made it more and more suitable to multimedia work
(_GREAT_ work BTW). That raised my hopes that, eventually, I would not
have to build kernels, just apps, as stock kernels would be good enough.
This will make the wait longer. 

Sigh, I'll be patient and dream about high resolution timers or other
technically elegant solutions that will not penalize multimedia apps or
laptops... 

-- Fernando

[*] http://ccrma.stanford.edu/planetccrma/software/




Re: sata_sil 3112 activity LED patch

2005-07-14 Thread Christian Kroll
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Hello,

I have tested the patch against my DawiControl DC-150 RAID controller
which is basically an add-on card with a SiI 3112 ASIC and a flash ROM.
The activity LED of my case is directly connected to the add-on card.

Unfortunately your patch doesn't have any effect on the LED. The
activity LED gets turned on by the card's BIOS at boot time and
continues to shine until I shut down the computer.
On the other hand it did not erase my Flash ROM and I haven't spotted
any data loss so far.

The LED does work as expected under that other OS as soon as Silicon
Image's reference driver is loaded, hence it is connected correctly.

Test setup:
I'm using DawiControl's version of the BIOS and not the reference BIOS
of Silicon Image. The test system is a Tualatin Celeron 1400 with an
i440BX based mainboard. The following hard disk is connected to the
controller: Seagate ST3160827AS (native SATA interface). The sata_sil
driver is loaded as a module.
Test kernel is vanilla 2.6.12.2. No tainted modules were used
while doing these tests.

If you require more information, don't hesitate to contact me.

Regards
Christian Kroll

- --
Christian Kroll 
GnuPG Fingerprint: DA5D 5BFA 5C95 FD09 2A72  517E 10CB DCD5 71ED 7E35

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.1 (GNU/Linux)

iD8DBQFC1xsQEMvc1XHtfjURAqKQAJ0fp5EtdymeUsiklcqYsCR9Q7VyngCeIfKV
Sb/wTjlvfk6MPMk/KEBkBPY=
=g7Vc
-END PGP SIGNATURE-




Re: [RFC][PATCH] PCI bus class driver rewrite for 2.6.13-rc2 [0/9]

2005-07-14 Thread Greg KH
On Thu, Jul 14, 2005 at 04:54:56AM -0400, Adam Belay wrote:
> Hi all,
> 
> I'm in the process of overhauling some aspects of the PCI subsystem.
> This patch series is a rewrite of the PCI probing and detection code.
> It creates a well defined PCI bus class API and allows a standard PCI
> driver to bind to PCI bridge devices.  This results in the following:
> 
> * cleaner code
> * improved driver core support
> * the option of adding new PCI bridge drivers
> * better power management

This looks great, thanks for doing this.

> For these changes to be fully effective, the following code (some of
> which was broken by these changes) will need to be fixed:



Good luck with all of this, it's a lot :)

thanks,

greg k-h


Re: Why is 2.6.12.2 less stable on my laptop than 2.6.10?

2005-07-14 Thread Dave Airlie
> That, of course, you cannot do. But, you can regression test a lot of
> other things, and having a default test suite that is constantly being
> added to and always being run before releases (that test hardware
> agnostic stuff) could help cut down on the number of regressions in
> new releases.
> You can't test everything this way, nor should you, but you can test
> many things, and adding a bit of formal testing to the release
> procedure wouldn't be a bad thing IMO.

But if you read people's complaints about regressions, they are nearly
always to do with hardware that used to work not working any more ..
alps touchpads, sound cards, software suspend.. so these people still
gain nothing by you regression testing anything so you still get as
many reports.. the -rc series is meant to provide the testing for the
release so nothing really big gets through (like can't boot from IDE
anymore or something like that)

Dave.


Re: Why is 2.6.12.2 less stable on my laptop than 2.6.10?

2005-07-14 Thread Parag Warudkar
On Thursday 14 July 2005 20:38, Andi Kleen wrote:
> It's basically impossible to regression test swsusp except to release it.
> Its success or failure depends on exactly the driver
> combination/platform/BIOS version etc.  e.g. all drivers have to cooperate
> and the particular bugs in your BIOS need to be worked around etc. Since
> that is quite fragile regressions are common.

I have always wondered how Windows got it right circa 1995 - Version after 
version, several different hardwares and it always works reliably. 
I am using Linux since 1997 and not a single time have I succeeded in getting 
it to suspend and resume reliably. 

Is it such an un-interesting subject to warrant serious effort or there is a 
lot of hardware documentation missing or in general the driver model and OS 
design itself makes it impossible to get suspend / resume right?

Parag


Re: Why is 2.6.12.2 less stable on my laptop than 2.6.10?

2005-07-14 Thread Andi Kleen
On Thu, Jul 14, 2005 at 10:09:11PM -0400, Parag Warudkar wrote:
> I have always wondered how Windows got it right circa 1995 - Version after 
> version, several different hardwares and it always works reliably. 
> I am using Linux since 1997 and not a single time have I succeeded in getting 
> it to suspend and resume reliably. 

What happens with Windows is that the Laptop vendor takes the
frozen Windows version available at the time the machine hits the market 
and then tweaks the BIOS and the drivers until everything runs and then
releases the machine.

But if you use newer (or older) W. releases or even service packs or different
drivers on that machine you end up exactly with the same problem.

> Is it such an un-interesting subject to warrant serious effort or there is a 
> lot of hardware documentation missing or in general the driver model and OS 
> design itself makes it impossible to get suspend / resume right?

I think you underestimate the complexity of the problem. Suspend/resume
is a fragile cooperation of many many different components in the
kernel/firmware/hardware and all of them have to work flawlessly
together.  That's hard.

-Andi



Re: Why is 2.6.12.2 less stable on my laptop than 2.6.10?

2005-07-14 Thread Andi Kleen
> You can't test everything this way, nor should you, but you can test
> many things, and adding a bit of formal testing to the release
> procedure wouldn't be a bad thing IMO.

In the Linux model that's left to the distributions. In fact doing it properly
takes months. You wouldn't want to wait months for a new mainline kernel.

Formal testing is not really compatible with "release early, release often".

You could do things like "run LTP first", but in practice LTP rarely finds
bugs.

-Andi


Re: Why is 2.6.12.2 less stable on my laptop than 2.6.10?

2005-07-14 Thread Dave Jones
On Fri, Jul 15, 2005 at 03:45:28AM +0200, Jesper Juhl wrote:
 
 > > > The problem is the process, not the code.
 > > > * The issues are too much ad-hoc code flux without enough
 > > > disciplined/formal regression testing and review.
 > > 
 > > It's basically impossible to regression test swsusp except to release it.
 > > Its success or failure depends on exactly the driver
 > > combination/platform/BIOS version etc.  e.g. all drivers have to
 > > cooperate and the particular bugs in your BIOS need to be worked
 > > around etc. Since that is quite fragile regressions are common.
 > > 
 > > However in some other cases I agree some more regression testing
 > > before release would be nice. But that's not how Linux works.  Linux
 > > does regression testing after release.
 > > 
 > And who says that couldn't change?
 > 
 > In my opinion it would be nice if Linus/Andrew had some basic
 > regression tests they could run on kernels before releasing them.

The problem is that this wouldn't cover the more painful problems
such as hardware specific problems.

As Fedora kernel maintainer, I frequently get asked why people's
sound cards stopped working when they did an update, or why
their system no longer boots, usually followed by a
"wasn't this update tested before it was released?"

The bulk of all the regressions I see reported every time
I put out a kernel update rpm that rebases to a newer
upstream release are in drivers. Those just aren't going
to be caught by folks that don't have the hardware.

The only way to cover as many combinations of hardware
as are out there is by releasing test kernels. (Updates-testing
repository for Fedora users, or -rc kernels in Linus' case).
If users won't/don't test those 'test' releases, we're
going to regress when the final release happens, there's
no two ways about it.

Dave



Re: Why is 2.6.12.2 less stable on my laptop than 2.6.10?

2005-07-14 Thread Jesper Juhl
On 7/15/05, Chris Friesen <[EMAIL PROTECTED]> wrote:
> Jesper Juhl wrote:
> 
> > In my opinion it would be nice if Linus/Andrew had some basic
> > regression tests they could run on kernels before releasing them.
> 
> How do you regression test behaviour on broken hardware (and BIOSes)
> that you don't have?
> 
That, of course, you cannot do. But, you can regression test a lot of
other things, and having a default test suite that is constantly being
added to and always being run before releases (that test hardware
agnostic stuff) could help cut down on the number of regressions in
new releases.
You can't test everything this way, nor should you, but you can test
many things, and adding a bit of formal testing to the release
procedure wouldn't be a bad thing IMO.

-- 
Jesper Juhl <[EMAIL PROTECTED]>
Don't top-post  http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please  http://www.expita.com/nomime.html


Re: Why is 2.6.12.2 less stable on my laptop than 2.6.10?

2005-07-14 Thread Chris Friesen

Jesper Juhl wrote:

> In my opinion it would be nice if Linus/Andrew had some basic
> regression tests they could run on kernels before releasing them.

How do you regression test behaviour on broken hardware (and BIOSes)
that you don't have?


Chris




Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt

2005-07-14 Thread Bill Davidsen

Linus Torvalds wrote:

> On Thu, 14 Jul 2005, Lee Revell wrote:
> > I don't think this will fly because we take a big performance hit by
> > calculating HZ at runtime.
>
> I think it might be an acceptable solution for a distribution that really
> needed it, since it should be fairly simple. However, it's definitely not
> the right solution.
>
> HOWEVER. I bet that somebody who really really cares (hint hint) could
> easily make HZ be 1000, and then dynamically tweak the divisor at bootup
> to be either 1000, 250, or 100, and then increment "jiffies" by 1, 4 or
> 10.
>
> My wild guess is that this is 20 lines of code, plus another 20 for
> "setup", so that you can choose between 100/250/1000 Hz with a kernel
> command line.
>
> It wouldn't be "dynamic" at first - you'd just set it up at bootup, and
> set a "jiffies_increment" variable, and change the
>
>     jiffies_64++;
>
> into
>
>     jiffies_64 += jiffies_increment;
>
> and you'd be done.
>
> Really. I dare you guys. First one to send me a tested patch gets a gold
> star.
>
> Then, a year from now, people will realize how _easy_ it is to change the
> jiffies_increment on the fly, and add a /sys/kernel/timer_frequency file,
> and then you can switch it around at run-time.
>
> Trust me. When I say that the right thing to do is to just have a fixed
> (but high) HZ value, and just changing the timer rate, I'm -right-.
>
> I'm always right. This time I'm just even more right than usual.

And humble, too ;-)

Do you actually have something against tickless, or just don't think it 
can be done in reasonable time? I can see this needing very careful 
thought on SMP.


--
bill davidsen <[EMAIL PROTECTED]>
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979



Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt

2005-07-14 Thread Eric St-Laurent
On Thu, 2005-07-14 at 17:24 -0700, Linus Torvalds wrote:
> 
> On Thu, 14 Jul 2005, Lee Revell wrote:
> 
> Trust me. When I say that the right thing to do is to just have a fixed 
> (but high) HZ value, and just changing the timer rate, I'm -right-.
> 
> I'm always right. This time I'm just even more right than usual.

Of course you are, jiffies are simple and efficient.

But it may be worthwhile to provide better/simpler API for relative
timeouts and also better hide the implementation details of the tick
system.


If I sum up the discussion from my POV:

- use a 32-bit tick counter on 32-bit platforms and use a 64-bit counter
on 64-bit platforms

- keep the constant HZ=1000 (ms resolution) on 32-bit platforms

- remove the assumption that timer interrupts and jiffies are a 1:1 thing
(jiffies may be incremented by >1 ticks at timer interrupt)

- determine jiffies_increment at boot

- have a slow clock mode to help power management (adjust
jiffies_increment by the slowdown factor)

- it may be useful to bump up HZ to 1e6 (us res.) or 1e9 (ns res.) on
64-bit platforms, if there are benefits such as better accuracy during
time unit conversions or if higher frequency timer hardware is
available/viable.

- it may also be useful to bump HZ on -RT (Real-time) kernels, or with
-HRT (High-resolution timers support). Users of those kernels are willing
to pay the cost of the overhead to have better resolution

- avoid direct usage of the jiffies variable, instead use jiffies()
(inline or MACRO), IMO monotonic_clock() would be a better name

- provide a relative timeout API (see my previous post, or Alan's
suggestions)

- remove most of the direct use of jiffies through the code and replace
them with msleep(), relative timer, etc

- use human units for those APIs


- Eric St-Laurent




Re: RFC: Hugepage COW

2005-07-14 Thread Christoph Lameter
On Fri, 15 Jul 2005, David Gibson wrote:

> Well, the COW patch implements a fault handler, obviously.  What
> specifically were you thinking about?

About a fault handler of course and about surrounding scalability issues.
I worked on some hugepage related patches last fall. Have you had a look 
at the work of Ken, Ray and me on the subject?




Re: RFC: Hugepage COW

2005-07-14 Thread David Gibson
On Thu, Jul 14, 2005 at 10:24:33AM -0700, Christoph Lameter wrote:
> On Thu, 7 Jul 2005, David Gibson wrote:
> 
> > Now that the hugepage code has been consolidated across the
> > architectures, it becomes much easier to implement copy-on-write.
> > Hugepage COW is of limited utility of itself, however, it is
> > essentially a prerequisite for any of a number of methods of allowing
> > userland programs to automatically use hugepages without code changes
> > e.g. hugepage malloc() libraries, implicit hugepage mmap(), hugepage
> > ELF segments.  For certain applications (particularly enormous HPC
> > FORTRAN programs), these can result in a large performance
> > improvement.
> > 
> > Thoughts?  Flames?
> 
> Great stuff. I am glad that you are cleaning up the hugepages and are 
> making progress improving them. What are your thoughts on implementing 
> fault handling for huge pages?

Well, the COW patch implements a fault handler, obviously.  What
specifically were you thinking about?

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/people/dgibson


Re: Why is 2.6.12.2 less stable on my laptop than 2.6.10?

2005-07-14 Thread Jesper Juhl
On 15 Jul 2005 02:38:58 +0200, Andi Kleen <[EMAIL PROTECTED]> wrote:
> Mark Gross <[EMAIL PROTECTED]> writes:
> >
> > The problem is the process, not the code.
> > * The issues are too much ad-hoc code flux without enough
> > disciplined/formal regression testing and review.
> 
> It's basically impossible to regression test swsusp except to release it.
> Its success or failure depends on exactly the driver combination/platform/BIOS
> version etc.  e.g. all drivers have to cooperate and the particular
> bugs in your BIOS need to be worked around etc. Since that is quite fragile
> regressions are common.
> 
> However in some other cases I agree some more regression testing
> before release would be nice. But that's not how Linux works.  Linux
> does regression testing after release.
> 
And who says that couldn't change?

In my opinion it would be nice if Linus/Andrew had some basic
regression tests they could run on kernels before releasing them.
There are plenty of "Linux test" projects out there that could be
borrowed from to create some sort of regression test harness for them
to run prior to release.   It would be super nice if they had a suite
of tests to run and could then drop a mail on lkml saying 2.6.x is
almost ready to go, but it currently fails regression tests #x, #y &
#z, we need to get those fixed first before we can release this - and
then every time a bug was found that could reasonably be tested for in
the future it would be added to the regression test suite...  That
would lead to more consistent quality I believe.


-- 
Jesper Juhl <[EMAIL PROTECTED]>
Don't top-post  http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please  http://www.expita.com/nomime.html


Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt

2005-07-14 Thread Jesper Juhl
On 7/15/05, Linus Torvalds <[EMAIL PROTECTED]> wrote:
> 
> 
> On Thu, 14 Jul 2005, Lee Revell wrote:
> >
> > I don't think this will fly because we take a big performance hit by
> > calculating HZ at runtime.
> 
> I think it might be an acceptable solution for a distribution that really
> needed it, since it should be fairly simple. However, it's definitely not
> the right solution.
> 
> HOWEVER. I bet that somebody who really really cares (hint hint) could
> easily make HZ be 1000, and then dynamically tweak the divisor at bootup
> to be either 1000, 250, or 100, and then increment "jiffies" by 1, 4 or
> 10.
> 
[...]
> 
> Really. I dare you guys. First one to send me a tested patch gets a gold
> star.
> 

Testing a patch right now, I'll send it to you as soon as it doesn't
blow up on boot (which it currently does).

-- 
Jesper Juhl <[EMAIL PROTECTED]>
Don't top-post  http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please  http://www.expita.com/nomime.html


Re: pc_keyb: controller jammed (0xA7)

2005-07-14 Thread Pete Zaitcev
On Thu, 14 Jul 2005 18:27:01 +0200, Thoralf Will <[EMAIL PROTECTED]> wrote:

> I didn't find any useful answer anywhere so far, hope it's ok to ask here.
> I'm currently trying to get a 2.4.31 up and running on an IBM
> BladeCenter HS20/8843. (base system is a stripped down RH9)
> 
> When booting the kernel the console is spammed with:
>pc_keyb: controller jammed (0xA7)
> But it seems there are no further consequences and the keyboard is
> working.

I saw a patch for it by Brian Maly, and yes, it was for 2.4.x.
Maybe he can send you a rediff against Marcelo's current tree.

However, is there a reason you're running 2.4.31 in Summer of 2005?
Did you try 2.6, does that one do the same thing? It has a rather
different infrastructure with the serio.

-- Pete


Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt

2005-07-14 Thread Eric St-Laurent
On Thu, 2005-07-14 at 23:37 +0100, Alan Cox wrote:

> In actual fact you also want to fix users of
> 
>   while(time_before(foo, jiffies)) { whack(mole); }
> 
> to become
> 
>   init_timeout();
>   timeout.expires = jiffies + n
>   add_timeout();
>   while(!timeout_expired()) {}
> 
> Which is a trivial wrapper around timers as we have them now

Or something like this:

struct timeout_timer {
        unsigned long expires;
};

static inline void timeout_set(struct timeout_timer *timer,
                               unsigned int msecs)
{
        timer->expires = jiffies + msecs_to_jiffies(msecs);
}

static inline int timeout_expired(struct timeout_timer *timer)
{
        return (time_after(jiffies, timer->expires));
}

It provides a nice API for relative timeouts without adding overhead.
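
A minimal user-space sketch of how a caller might use such an API.
Everything outside timeout_set()/timeout_expired() is an assumption made
so that it builds outside the kernel: the jiffies variable, HZ, and
simplified stand-ins for msecs_to_jiffies() and time_after(), plus a
hypothetical poll_until_ready() helper that advances the counter by hand
to simulate ticks.

```c
#define HZ 1000                         /* assumed tick rate for this model */

static unsigned long jiffies;           /* stand-in for the kernel tick counter */

/* Wraparound-safe "a is after b", same idea as the kernel's time_after(). */
#define time_after(a, b) ((long)((b) - (a)) < 0)

/* Simplified stand-in: convert milliseconds to ticks, rounding up. */
static unsigned long msecs_to_jiffies(unsigned int msecs)
{
        return ((unsigned long)msecs * HZ + 999) / 1000;
}

struct timeout_timer {
        unsigned long expires;
};

static void timeout_set(struct timeout_timer *timer, unsigned int msecs)
{
        timer->expires = jiffies + msecs_to_jiffies(msecs);
}

static int timeout_expired(const struct timeout_timer *timer)
{
        return time_after(jiffies, timer->expires);
}

/*
 * Hypothetical caller: poll until the simulated device becomes ready at
 * tick `ready_at`, or give up after `timeout_ms`.  Returns the number of
 * polls that found the device not ready, or -1 on timeout.
 */
static int poll_until_ready(unsigned long ready_at, unsigned int timeout_ms)
{
        struct timeout_timer t;
        int polls = 0;

        timeout_set(&t, timeout_ms);
        while (!timeout_expired(&t)) {
                if (jiffies >= ready_at)
                        return polls;
                polls++;
                jiffies++;              /* simulate one tick passing */
        }
        return -1;
}
```

The caller never touches jiffies arithmetic itself and only passes
milliseconds, which is the point of the proposal.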


- Eric St-Laurent




Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt

2005-07-14 Thread Bill Davidsen

David Lang wrote:

> On Wed, 13 Jul 2005, Bill Davidsen wrote:
> > How serious is the 1/HZ = sane problem, and more to the point how
> > many programs get the HZ value with a system call as opposed to
> > including a header or building it in? I know some of my older
> > programs use header files, that was part of the planning for the
> > future even before 2.5 started. At the time I didn't expect to have
> > to use the system call.
>
> in binary 1/100 or 1/1000 are not sane values to start with so I don't
> think that this is likely to be that critical (remembering that
> the kernel doesn't do floating point math)

The kernel isn't the issue, it's programs which do timing and get values
in ticks which they convert to time by dividing by HZ. I at least get
that from a header; the proper way would be with the syscall.
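
As an illustration of asking at run time instead of using a header
constant, here is a small sketch using the POSIX calls
sysconf(_SC_CLK_TCK) and times(). The two helper names are made up for
this example. Note that the value returned is the tick unit exported to
user space, which need not be the kernel's internal HZ.

```c
#include <sys/times.h>
#include <unistd.h>

/* Clock ticks per second as reported to user space; -1 on error. */
static long user_ticks_per_second(void)
{
        return sysconf(_SC_CLK_TCK);
}

/* User-mode CPU time of this process in seconds; negative on error. */
static double user_cpu_seconds(void)
{
        struct tms t;
        long hz = user_ticks_per_second();

        if (hz <= 0 || times(&t) == (clock_t)-1)
                return -1.0;
        /* times() reports ticks; divide by the queried rate, not a constant. */
        return (double)t.tms_utime / (double)hz;
}
```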


--
bill davidsen <[EMAIL PROTECTED]>
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979



Re: interrupt hooking in kernel 2.6

2005-07-14 Thread Lee Revell
On Fri, 2005-07-15 at 03:55 +0300, Zvi Rackover wrote:
> hello all,
> 
> i wish to write a module for i386 that can hook interrupts. the module
> loads its own interrupt descriptor table instead of the default
> system's table. after executing my own handler(s), the old appropriate
> handler will be called.

I think you want the "I-Pipe", just posted to LKML a few weeks ago.

Lee



interrupt hooking in kernel 2.6

2005-07-14 Thread Zvi Rackover
hello all,

i wish to write a module for i386 that can hook interrupts. the module
loads its own interrupt descriptor table instead of the default
system's table. after executing my own handler(s), the old appropriate
handler will be called.
i could not find any documentation or sample code explaining how this
is done in 2.6. There are very few outdated examples on the web which
apparently are not suitable.
can anyone help me out?

yours,
zvi


Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt

2005-07-14 Thread Linus Torvalds


On Fri, 15 Jul 2005, Jesper Juhl wrote:
>
> Even if we only have to do it once at boot?  The thought was to detect
> what type of machine we are booting on, figure out what a good HZ
> would be for that type of box, then set that HZ value and treat it as
> a constant from that point forward.

No, it really should be a compile-time constant, or a lot of things get a 
lot more expensive. There's a HZ embedded in a lot of places, and some of 
them are divides, for example. Others do optimized special cases based on 
static knowledge of what HZ is.

So this is why I so strongly argue that we should have a constant HZ, but 
a dynamic _increment_ of "jiffies". Nobody (obviously) depends on jiffies 
being constant, so it's ok to increment jiffies by pretty much any value.

Yeah, yeah, there might be some _very_ few code-paths (bogomips, I think)  
that may look at when "jiffies" changes, and actually measure one tick 
that way. They would need to be taught that they don't measure "one" tick 
any more, they measure "jiffies_increment" ticks or something.

But I really wouldn't be surprised if the bogomips calibration loop was 
really the only thing that needed some small tweaking for increments of 
other than one.
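
A rough sketch of the adjustment described above for code that times
itself against a change in "jiffies". This is not the kernel's actual
calibration code; loops_per_jiffy() and its parameters are hypothetical
names for illustration. The only point is that the counter difference
observed across one change is jiffies_increment, not necessarily 1, so
the loop count has to be divided by it.

```c
/*
 * loops:           busy-loop iterations counted between two consecutive
 *                  changes of the tick counter
 * jiffies_before:  counter value just before the change
 * jiffies_after:   counter value just after the change
 *
 * Returns iterations per single jiffy (1/HZ second), or 0 if the
 * counter did not move.
 */
static unsigned long loops_per_jiffy(unsigned long loops,
                                     unsigned long jiffies_before,
                                     unsigned long jiffies_after)
{
        /* With a dynamic increment this is jiffies_increment, e.g. 4. */
        unsigned long elapsed = jiffies_after - jiffies_before;

        if (elapsed == 0)
                return 0;
        return loops / elapsed;
}
```

With HZ fixed at 1000, a 250 Hz interrupt rate that counts 4000000
iterations per interrupt and a 1000 Hz rate that counts 1000000 both
come out at the same per-jiffy figure.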

(Oh, we'll find other things we want to fix up, and such a change would
result in other changes down the line, no question about that.  But I
don't think it would be very much at all, and I don't think it would 
turn out at all traumatic).

Linus


Re: Why is 2.6.12.2 less stable on my laptop than 2.6.10?

2005-07-14 Thread Andi Kleen
Mark Gross <[EMAIL PROTECTED]> writes:
> 
> The problem is the process, not the code.
> * The issues are too much ad-hoc code flux without enough disciplined/formal
> regression testing and review.

It's basically impossible to regression test swsusp except to release it. 
Its success or failure depends on exactly the driver combination/platform/BIOS
version etc.  e.g. all drivers have to cooperate and the particular
bugs in your BIOS need to be worked around etc. Since that is quite fragile
regressions are common.

However in some other cases I agree some more regression testing
before release would be nice. But that's not how Linux works.  Linux
does regression testing after release.

-Andi


Question on thread_info

2005-07-14 Thread Purshotam Roy
Hello,
Could anybody provide any information as to why thread_info is kept
at the top of the stack?  Also, does it mean that when thread_info is
kept there, it is also kept at the bottom of the heap?


Re: [rfc patch 2/2] direct-io: remove address alignment check

2005-07-14 Thread Tejun Heo

Daniel McNeil wrote:

> This patch relaxes the direct i/o alignment check so that user addresses
> do not have to be a multiple of the device block size.
>
> I've done some preliminary testing and it mostly works on an ext3
> file system on a ide disk.  I have seen trouble when the user address
> is on an odd byte boundary.  Sometimes the data is read back incorrectly
> on read and sometimes I get these kernel error messages:
>     hda: dma_timer_expiry: dma status == 0x60
>     hda: DMA timeout retry
>     hda: timeout waiting for DMA
>     hda: status error: status=0x58 { DriveReady SeekComplete DataRequest }
>     ide: failed opcode was: unknown
>     hda: drive not ready for command
>
> Doing direct-io with user addresses on even, non-512 boundaries appears
> to be working correctly.
>
> Any additional testing and/or comments welcome.



 Hi, Daniel.

 I don't think the change is a good idea.  We may be able to relax
alignment constraints on some hardware to certain levels, but IMHO it
will be very difficult to verify.  All internal block IO code follows
strict block boundary alignment.  And as raw IOs (especially unaligned
ones) aren't very common operations, they won't get tested much.  Then
when some rare (probably not an open source one) application uses it on
some rare buggy hardware, it may cause *very* strange things.

 Also, I don't think it will improve application programmer's
convenience.  As each hardware employs different DMA alignment, we need
to implement a way to export the alignment to user space and enforce it.
So, in the end, user application must do aligned allocation
accordingly.  Just following block boundary will be easier.

 I don't know why you wanna relax the alignment requirement, but
wouldn't it be easier to just write/use block-aligned allocator for such
buffers?  It will even make the program more portable.
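
For illustration, a block-aligned allocator in user space can be a thin
wrapper around the standard posix_memalign() call. This is only a
sketch: the helper names are invented here, and 512 is just the usual
block size mentioned in this thread, not something queried from the
device.

```c
#define _POSIX_C_SOURCE 200112L
#include <stdint.h>
#include <stdlib.h>

/*
 * Allocate `size` bytes aligned to `block_size`.  posix_memalign()
 * requires the alignment to be a power of two and a multiple of
 * sizeof(void *).  Returns NULL on failure; release with free().
 */
static void *alloc_block_aligned(size_t block_size, size_t size)
{
        void *buf = NULL;

        if (posix_memalign(&buf, block_size, size) != 0)
                return NULL;
        return buf;
}

/* Check that a pointer sits on a block_size boundary. */
static int is_aligned(const void *p, size_t block_size)
{
        return ((uintptr_t)p % block_size) == 0;
}
```

A buffer from alloc_block_aligned(512, 4096) could then be handed to
read() or write() on a file opened with O_DIRECT, with the I/O size and
file offset kept at multiples of the block size as well.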


--
tejun


Re: rcu-refcount stacker performance

2005-07-14 Thread Joe Seigh

[EMAIL PROTECTED] wrote:

> Quoting Paul E. McKenney ([EMAIL PROTECTED]):
> > OK, but in the above case, "do something" cannot be sleeping, since
> > it is under rcu_read_lock().
>
> Oh, but that's not quite what the code is doing, rather it is doing:
>
>     rcu_read_lock
>     while get next element from list
>         inc element.refcount
>         rcu_read_unlock
>         do something
>         rcu_read_lock
>         dec refcount
>     rcu_read_unlock



I've been experimenting with various lock-free methods in user space, which
is preemptive.   Stuff like RCU, RCU+SMR which I've mentioned before,
and some atomically thread-safe reference counting.  I have a proxy
GC based on the latter called APPC (atomic pointer proxy collector).
Basically you use a proxy refcounted object for the whole list
rather than every element in the list.  Before you access the list,
you increment the refcount of the proxy object, and afterwards you
decrement it.  One interlocked instruction for each so performance
wise it looks like a reader lock which never blocks.

Writers enqueue deleted nodes on the collector object and then
push a new collector object in place.

The collector objects look like


   proxy_anchor -> c_obj <- c_obj <- c_obj
  ^
  | reader

The previous collector objects are back linked so when a reader
thread releases it, all unreferenced collector objects have
deallocation performed on them and their attached nodes.

A bit sketchy.  You can see a working example of this using
C++ refcounted pointers (which can't be used in the kernel
naturally, you'll have to implement your own) at
http://atomic-ptr-plus.sourceforge.net/

--
Joe Seigh



Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt

2005-07-14 Thread Linus Torvalds


On Thu, 14 Jul 2005, Lee Revell wrote:
> 
> I don't think this will fly because we take a big performance hit by
> calculating HZ at runtime.

I think it might be an acceptable solution for a distribution that really 
needed it, since it should be fairly simple. However, it's definitely not 
the right solution.

HOWEVER. I bet that somebody who really really cares (hint hint) could
easily make HZ be 1000, and then dynamically tweak the divisor at bootup
to be either 1000, 250, or 100, and then increment "jiffies" by 1, 4 or
10.

My wild guess is that this is 20 lines of code, plus another 20 for 
"setup", so that you can choose between 100/250/1000 Hz with a kernel 
command line.

It wouldn't be "dynamic" at first - you'd just set it up at bootup, and 
set a "jiffies_increment" variable, and change the

jiffies_64++;

into

jiffies_64 += jiffies_increment;

and you'd be done. 
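
A user-space model of that idea, for illustration only. HZ stays a
compile-time constant of 1000 while the simulated interrupt rate is
chosen at "boot". The names timer_setup(), timer_tick() and simulate()
are invented for this sketch and are not kernel functions.

```c
#include <stdint.h>

#define HZ 1000                         /* fixed tick unit, never changes */

static uint64_t jiffies_64;             /* stand-in for the kernel counter */
static unsigned int jiffies_increment = 1;

/* Hypothetical boot-time setup: choose the real interrupt rate. */
static int timer_setup(unsigned int timer_hz)
{
        if (timer_hz == 0 || HZ % timer_hz != 0)
                return -1;              /* only rates that divide HZ evenly */
        jiffies_increment = HZ / timer_hz;  /* 1000 -> 1, 250 -> 4, 100 -> 10 */
        return 0;
}

/* What the timer interrupt would do on every hardware tick. */
static void timer_tick(void)
{
        jiffies_64 += jiffies_increment;
}

/* Simulate `seconds` of interrupts at `timer_hz`; return jiffies elapsed. */
static uint64_t simulate(unsigned int timer_hz, unsigned int seconds)
{
        jiffies_64 = 0;
        if (timer_setup(timer_hz) != 0)
                return 0;
        for (unsigned int i = 0; i < timer_hz * seconds; i++)
                timer_tick();
        return jiffies_64;
}
```

Whatever rate is picked, one simulated second advances the counter by
HZ, so code that converts with the constant HZ keeps working.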

Really. I dare you guys. First one to send me a tested patch gets a gold 
star.

Then, a year from now, people will realize how _easy_ it is to change the
jiffies_increment on the fly, and add a /sys/kernel/timer_frequency file, 
and then you can switch it around at run-time.

Trust me. When I say that the right thing to do is to just have a fixed 
(but high) HZ value, and just changing the timer rate, I'm -right-.

I'm always right. This time I'm just even more right than usual.

Linus


Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt

2005-07-14 Thread Jesper Juhl
On 7/15/05, Lee Revell <[EMAIL PROTECTED]> wrote:
> On Fri, 2005-07-15 at 02:04 +0200, Jesper Juhl wrote:
> > While reading this thread it occurred to me that perhaps what we
> > really want (besides sub HZ timers) might be for the kernel to
> > auto-tune HZ?
> >
> > Would it make sense to introduce a new config option (say
> > CONFIG_HZ_AUTO) that when selected does something like this at boot:
> >
> > if (running_on_a_laptop()) {
> > set_HZ_to(250);
> > }
> 
> I don't think this will fly because we take a big performance hit by
> calculating HZ at runtime.
> 
Even if we only have to do it once at boot?  The thought was to detect
what type of machine we are booting on, figure out what a good HZ
would be for that type of box, then set that HZ value and treat it as
a constant from that point forward.

-- 
Jesper Juhl <[EMAIL PROTECTED]>
Don't top-post  http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please  http://www.expita.com/nomime.html


Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt

2005-07-14 Thread Lee Revell
On Fri, 2005-07-15 at 02:04 +0200, Jesper Juhl wrote:
> While reading this thread it occurred to me that perhaps what we
> really want (besides sub HZ timers) might be for the kernel to
> auto-tune HZ?
> 
> Would it make sense to introduce a new config option (say
> CONFIG_HZ_AUTO) that when selected does something like this at boot:
> 
> if (running_on_a_laptop()) {
> set_HZ_to(250);
> }

I don't think this will fly because we take a big performance hit by
calculating HZ at runtime.

Lee



Re: [PATCH] RealTimeSync Patch

2005-07-14 Thread Andrew Morton
Elias Kesh <[EMAIL PROTECTED]> wrote:
>
> Hello,
> 
> I would like to get some feedback on this patch for the kernel.  Its sole 
> purpose is to help in reducing boot time by not waiting to synchronize the 
> clock edge with the hardware clock. This when combined with other boot 
> reduction patches can bring the kernel boot time to well under 10 seconds, in 
> most cases two or three seconds.  In a desktop system this patch is probably 
> insignificant, however several patches like this in a set top box or cell 
> phone will be significant.

Please wordwrap your emails at column 72 or thereabouts.

>  I understand that there may be some concerns with patches like these so I 
> would like to start a discussion so that I can better understand what the 
> issues are. The members of the CELinux Forum have quite a bit we would like 
> to contribute.

You should send the patches to this mailing list, just as you have done here.

> Looking at the archives I see that an Intel patch was submitted back in 
> October but I am unable to determine what the resolution was.

What patch was that?

> This patch included is for PPC but other architectures are available on the 
> patch web site below.

I get connection refused from tree.celinuxforum.org

> Detailed information on the patch can be found here:
> http://tree.celinuxforum.org/CelfPubWiki/RTCNoSync
> 
> In addition, other patches for boot time reduction can be found here:
> http://tree.celinuxforum.org/CelfPubWiki/PatchArchive

Finish the patches and just send them.  No fuss.  See
http://www.zip.com.au/~akpm/linux/patches/stuff/tpp.txt for a few details.

> *
> Fast boot options (FASTBOOT) [N/y/?] (NEW) y
>   Disable synch on read of Real Time Clock (RTC_NO_SYNC) [N/y/?] (NEW) y
> 

This particular feature seems to be ppc-specific and hence the folks at
[EMAIL PROTECTED] should be involved.  Probably the
CONFIG_RTC_NO_SYNC Kconfig option should be in arch/ppc/Kconfig - one would
need to see all the patches to determine that.

It might be better to use a kernel boot option rather than another
compile-time option for this - you'd need to discuss that with other ppc
people.   Or perhaps the code in there is just being dumb and can be fixed.

In general, it's taking way too long to get all these CELinux patches
into the outside world.  Thanks for getting this one on the wires.  Let's
get them all done and finish this thing.


Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt

2005-07-14 Thread Jesper Juhl
On 7/13/05, Chris Wedgwood <[EMAIL PROTECTED]> wrote:
> On Wed, Jul 13, 2005 at 01:48:57PM -0700, Andrew Morton wrote:
> 
> > Len Brown, a year ago: "The bottom line number to laptop users is
> > battery lifetime.  Just today somebody complained to me that Windows
> > gets twice the battery life that Linux does."
> 
> It seems the motivation for lower HZ is really:
> 
>(1) ACPI/SMM suckage in laptops
> 
>(2) NUMA systems with *horrible* remote memory latencies
> 
> Both can be detected from your .config and we could set HZ as needed
> there and everyone else could avoid this surely?
> 

While reading this thread it occurred to me that perhaps what we
really want (besides sub HZ timers) might be for the kernel to
auto-tune HZ?

Would it make sense to introduce a new config option (say
CONFIG_HZ_AUTO) that when selected does something like this at boot:

    if (running_on_a_laptop()) {
        set_HZ_to(250);
    } else if (running_on_large_NUMA_box()) {
        set_HZ_to(100);
    } else if (running_on_multimedia_box()) {
        set_HZ_to(1000);
    } else {
        set_HZ_to_some_other_sane_default();
    }

and if user wants to not use the auto detection they can select a
certain HZ in their .config instead of CONFIG_HZ_AUTO.


Just wanted to throw the idea up in the air in case it made sense.
Feel free to pick it apart or simply ignore it. :-)


-- 
Jesper Juhl <[EMAIL PROTECTED]>
Don't top-post  http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please  http://www.expita.com/nomime.html


Re: [rfc patch 2/2] direct-io: remove address alignment check

2005-07-14 Thread Daniel McNeil
On Thu, 2005-07-14 at 16:39, Andrew Morton wrote:
> Daniel McNeil <[EMAIL PROTECTED]> wrote:
> >
> > Do drivers have problems with odd addresses or with
> >  non-512 addresses?
> 
> I do recall hearing rumours that some bus-masters have fairly strict memory
> alignment requirements.  A cacheline size, perhaps - that would be 32 bytes
> given the age of the hardware.
> 
> But yeah, it's v.  risky to assume that all bus masters can cope with
> memory alignments down to two bytes.
> 
> It would be sane to put the minimum alignment into ->backing_dev_info,
> default to 512, get the device drivers to override that as they are tested.
> 
> But this introduces a very very bad problem: people will write applications
> which work on their hardware, ship the things and then find that the apps
> break on other people's hardware.  So we can't do that.
> 
> Instead, we need to work out the minimum alignment requirement for all disk
> controllers and DMA controllers and motherboards in the world.  And that
> includes catering for weird ones which appear to work but which
> occasionally fail in mysterious ways with finer alignments.  That's hard. 
> It's easier to continue to make application developers jump through hoops.

I was hoping this patch would help turn rumors into real data :)

If we did put min alignment into backing_dev_info, we could implement
the equivalent of bounce buffers for direct-io -- or just fall back
to buffered i/o like it does sometimes anyway.  That way applications
would not break, just get worse performance on some hardware.

Right now I just wanted to get the issues on the table, get some test
results, and see how to proceed from there.  Since this patch only
affects direct i/o, getting test results shouldn't cause too many
problems.

Thanks,

Daniel
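
The fallback idea above can be sketched in a few lines of C. This is only
an illustration under stated assumptions: pick_io_path(), enum io_path and
the min_align parameter are hypothetical names, not kernel identifiers, and
min_align merely stands in for the per-device minimum alignment (default
512) that was proposed for backing_dev_info.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical I/O paths; these are not kernel identifiers. */
enum io_path { IO_DIRECT, IO_FALLBACK };

/*
 * Decide how to service a request for a user buffer at 'addr'.
 * 'min_align' models a per-device minimum alignment and is assumed
 * to be a power of two (512 by default in the proposal).
 */
static enum io_path pick_io_path(uintptr_t addr, size_t min_align)
{
    if ((addr & (min_align - 1)) == 0)
        return IO_DIRECT;   /* buffer already meets the device's alignment */
    return IO_FALLBACK;     /* bounce the data or use buffered i/o instead */
}
```

With min_align left at 512, a buffer at 0x2002 takes the slower path
instead of failing; a driver that had been tested down to two-byte
alignment would pass 2 and let the same buffer go direct.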



Re: Why is 2.6.12.2 less stable on my laptop than 2.6.10?

2005-07-14 Thread Andrew Morton
Mark Gross <[EMAIL PROTECTED]> wrote:
>
> I know this is a broken record, but the development process within the LKML
>  isn't resulting in more stable and better code.  Some process change could
>  be a good thing.

We rely upon people (such as [EMAIL PROTECTED]) to send bug reports.

>  Why does my alps mouse pad have to stop working every time I test a new 
>  "STABLE" kernel?  

The alps driver is always broken.  Seems to be a feature.

Please test 2.6.13-rc3 and if it also fails send a comprehensive bug report
to Dmitry Torokhov <[EMAIL PROTECTED]> and Vojtech Pavlik
<[EMAIL PROTECTED]>

>  Why does swsusp have to start hanging on shutdown and startup randomly?

swsusp also is a problematic feature.  You appear to have chosen two of the
very most problematic parts of the kernel (you missed ACPI) and then
generalised them to the whole.  That isn't valid.

Please test 2.6.13-rc3 and if it also fails send a comprehensive bug report
to Pavel Machek <[EMAIL PROTECTED]>


Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt

2005-07-14 Thread Lee Revell
On Thu, 2005-07-14 at 16:49 -0700, Linus Torvalds wrote:
> 
> On Thu, 14 Jul 2005, Lee Revell wrote:
> > 
> > And I'm incredibly frustrated by this insistence on hard data when it's
> > completely obvious to anyone who knows the first thing about MIDI that
> > HZ=250 will fail in situations where HZ=1000 succeeds.
> 
> Ok, guys. How many people have this MIDI thing?

>  How many of you can't be 
> bothered to set the default to suit your usage?

Very few, and even fewer, respectively.  But, we'd still like to be able
to use the same kernel image as everyone else if possible.

I guess we'll have to deal with it until a variable tick solution is
ready.

Lee




Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt

2005-07-14 Thread Lee Revell
On Thu, 2005-07-14 at 16:49 -0700, Linus Torvalds wrote:
> YOUR argument is "nobody else matters, only I do".
> 
> MY argument is that this is a case of give and take. 

I wouldn't say that.  I do agree with you that HZ=1000 for everyone is
problematic, I just feel that a reasonable compromise is CONFIG_HZ with
the default left at 1000.

Lee



Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt

2005-07-14 Thread Linus Torvalds


On Thu, 14 Jul 2005, Lee Revell wrote:
> 
> And I'm incredibly frustrated by this insistence on hard data when it's
> completely obvious to anyone who knows the first thing about MIDI that
> HZ=250 will fail in situations where HZ=1000 succeeds.

Ok, guys. How many people have this MIDI thing? How many of you can't be 
bothered to set the default to suit your usage?

> It's straight from the MIDI spec.  Your argument is pretty close to "the
> MIDI spec is wrong, no one can hear the difference between 1ms and 4ms".

No.

YOUR argument is "nobody else matters, only I do".

MY argument is that this is a case of give and take. 

Linus


Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt

2005-07-14 Thread Lee Revell
On Thu, 2005-07-14 at 16:25 -0700, Linus Torvalds wrote:
> On Thu, 14 Jul 2005, Lee Revell wrote:
> > This thread has really gone OT, but to revisit the original issue for a
> > bit, are you still unwilling to consider leaving the default HZ at 1000
> > for 2.6.13?
> 
> Yes. I see absolutely no point to it until I actually hear people who have 
> actually tried some real load that doesn't work. Dammit, I want a real 
> user who says that he can noticeably see his DVD stuttering, not some
> theory.

Well, on the plus side, this will probably drive the development of a
mergeable variable tick solution, as I can't imagine the distros will
want to have to ship a separate HZ=1000 kernel for multimedia use.

Lee



RE: 2.6.9 chrdev_open: serial_core: uart_open

2005-07-14 Thread karl malbrain
chrdev_open issues a lock_kernel() before calling uart_open.

It would appear that servicing the blocking open request uart_open goes to
sleep with the kernel locked.  Would this shut down subsequent access to
opening "/dev/tty"???

karl m





Re: [rfc patch 2/2] direct-io: remove address alignment check

2005-07-14 Thread Daniel McNeil
On Thu, 2005-07-14 at 16:16, Badari Pulavarty wrote:
> How does your patch ensure that we meet the driver alignment
> restrictions ? Like you said, you need at least "even" byte alignment
> for IDE etc..
> 
> And also, are there any restrictions on how much the "minimum" IO
> size has to be ? I mean, can I read "1" byte ? I guess you are
> not relaxing it (yet)..
> 

This patch does not change the i/o size requirements -- they
must be a multiple of device block size (usually 512).

It only relaxes the address alignment restriction.  I do not
know what the driver alignment restrictions are.  Without the
1st patch, it was impossible to relax the address space
check and have direct-io generate the correct i/o's to submit.

This 2nd patch, is just for testing and generating feedback
to find out what the address alignment issues are.  Then
we can decide how to proceed.

Did you look over the 1st patch?  Comments?

Thanks,

Daniel
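
The difference between the two checks discussed in this thread can be
sketched as follows. The names blkbits and blocksize_mask come from the
patch itself; the wrapper functions dio_ok_before() and dio_ok_after()
are hypothetical and only isolate the mask test, leaving out the rest of
__blockdev_direct_IO().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Check as it stood before the patch: address AND size block-aligned. */
static bool dio_ok_before(uintptr_t addr, size_t size, unsigned int blkbits)
{
    uintptr_t blocksize_mask = ((uintptr_t)1 << blkbits) - 1;

    return !((addr & blocksize_mask) || (size & blocksize_mask));
}

/* Check with the patch applied: only the size must be block-aligned. */
static bool dio_ok_after(uintptr_t addr, size_t size, unsigned int blkbits)
{
    uintptr_t blocksize_mask = ((uintptr_t)1 << blkbits) - 1;

    (void)addr;     /* the address is no longer examined */
    return (size & blocksize_mask) == 0;
}
```

For a 512-byte block size (blkbits = 9), a 512-byte transfer into a
buffer at 0x1002 is rejected by the first check and accepted by the
second, while a 100-byte transfer is rejected by both.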
 



Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt

2005-07-14 Thread Lee Revell
On Thu, 2005-07-14 at 16:25 -0700, Linus Torvalds wrote:
> Yes. I see absolutely no point to it until I actually hear people who have 
> actually tried some real load that doesn't work. Dammit, I want a real 
> user who says that he can noticeably see his DVD stuttering, not some
> theory.
> 
> I'm incredibly fed up with the theoretical whining. 
> 

And I'm incredibly frustrated by this insistence on hard data when it's
completely obvious to anyone who knows the first thing about MIDI that
HZ=250 will fail in situations where HZ=1000 succeeds.

Do you really consider this "theoretical whining"?

http://lkml.org/lkml/2005/7/13/229

It's straight from the MIDI spec.  Your argument is pretty close to "the
MIDI spec is wrong, no one can hear the difference between 1ms and 4ms".

Lee







Re: [rfc patch 2/2] direct-io: remove address alignment check

2005-07-14 Thread Andrew Morton
Daniel McNeil <[EMAIL PROTECTED]> wrote:
>
> Do drivers have problems with odd addresses or with
>  non-512 addresses?

I do recall hearing rumours that some bus-masters have fairly strict memory
alignment requirements.  A cacheline size, perhaps - that would be 32 bytes
given the age of the hardware.

But yeah, it's v.  risky to assume that all bus masters can cope with
memory alignments down to two bytes.

It would be sane to put the minimum alignment into ->backing_dev_info,
default to 512, get the device drivers to override that as they are tested.

But this introduces a very very bad problem: people will write applications
which work on their hardware, ship the things and then find that the apps
break on other people's hardware.  So we can't do that.

Instead, we need to work out the minimum alignment requirement for all disk
controllers and DMA controllers and motherboards in the world.  And that
includes catering for weird ones which appear to work but which
occasionally fail in mysterious ways with finer alignments.  That's hard. 
It's easier to continue to make application developers jump through hoops.


Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt

2005-07-14 Thread Linus Torvalds


On Thu, 14 Jul 2005, Linus Torvalds wrote:
> 
> But what you can do is to have HZ at some reasonably high value (ie in the 
> kHz range), and then slow down the system clock to conserve energy, and 
> increment jiffies by 16 or 32 when in "slow clock mode".

Btw, it doesn't have to even be a slow-down due to the kernel's decision.

In a VM environment, the timer interrupt might be erratic, and the timer 
interrupt might read some hardware register (TSC or something) and use 
_that_ to increment jiffies by the "proper" amount.

See? The point is that "jiffies" is useful exactly because it's very cheap 
to read portably (there are no portable high-performance alternatives) and 
because it has the right resolution to be useful in 32 bits.

That doesn't mean that the code that updates it can't be clever. We
already have code that updates it that is a lot more intelligent than 99%
of the code that reads it:  we update it in 64 bits under xtime_lock, even
though most readers use a lock-less 32-bit read and only get a partial
value - the part they care about.

And this is a wonderful property that everybody seems to be ignoring, even 
though we have absolutely tons of code that just takes all of this 
goodness for granted.

Linus


Re: [patch 2.6] remove PCI_BRIDGE_CTL_VGA handling from setup-bus.c

2005-07-14 Thread Ivan Kokshaysky
On Thu, Jul 14, 2005 at 06:42:30PM -0400, Jon Smirl wrote:
> You need to take this code into account, from arch/i386/pci/fixup.c

Yes, I've seen that (nice code, btw :-). But my code snippet has
nothing to do with x86 or any particular architecture - it just
shows that some hypothetical platform that doesn't have working
PCI firmware won't be hurt heavily by that patch. ;-)

Ivan.


Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt

2005-07-14 Thread Linus Torvalds


On Thu, 14 Jul 2005, Lee Revell wrote:
>
> On Thu, 2005-07-14 at 09:37 -0700, Linus Torvalds wrote:
> > I have to say, this whole thread has been pretty damn worthless in
> > general in my not-so-humble opinion.
> > 
> 
> This thread has really gone OT, but to revisit the original issue for a
> bit, are you still unwilling to consider leaving the default HZ at 1000
> for 2.6.13?

Yes. I see absolutely no point to it until I actually hear people who have 
actually tried some real load that doesn't work. Dammit, I want a real 
user who says that he can noticeably see his DVD stuttering, not some
theory.

I'm incredibly fed up with the theoretical whining. 

Linus


Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt

2005-07-14 Thread Linus Torvalds


On Thu, 14 Jul 2005, Alan Cox wrote:
> 
> I suspect the problem for some of this is that people think of jiffies
> as incrementing by 1. If HZ is right then jiffies can be in nS, it just
> won't increment by 1.

No, jiffies _cannot_ be in nS, because of the fact that then it doesn't 
fit in a word any more. A lot of things want timeouts in the tens of 
minutes, and a jiffy clock that tries to be in nS just screws that up
entirely, and forces people to use u64.

Which is much more expensive to compare on 32-bit architectures due to 
nasty atomicity issues. 

So you want to keep the "normal" timeout 32-bit. In ten years we may not 
care any more. For the foreseeable future we definitely do.

> It's also why jiffies() is better on some platforms
> because many machines can answer "what time is it" far more accurately
> than they can set interrupts.

That's not what "jiffies" are about. If you want accurate time, use
something else, like gettimeofday. The timeouts are _only_ relevant on the
scale of a timer interrupt, since by definition that's what we're waiting
for.

So accuracy is a total non-issue. The only kind of accuracy we care about 
is "how often can the timer ticks happen".

Linus
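
The word-size argument above comes down to simple arithmetic, sketched
here in C. Both helpers are hypothetical illustrations, not kernel code:
wrap_seconds_32() computes how long a 32-bit counter lasts at a given
tick rate, and tick_after() shows a wrap-safe comparison in the spirit
of the kernel's time_after(), assuming the two values are less than
2^31 ticks apart.

```c
#include <stdbool.h>
#include <stdint.h>

/* Seconds until a 32-bit tick counter wraps at the given tick rate. */
static double wrap_seconds_32(double ticks_per_second)
{
    return 4294967296.0 / ticks_per_second;    /* 2^32 ticks in total */
}

/*
 * True if tick 'a' is later than tick 'b'.  The unsigned subtraction
 * wraps modulo 2^32, and the signed reinterpretation gives the right
 * answer as long as the two ticks are less than 2^31 apart.
 */
static bool tick_after(uint32_t a, uint32_t b)
{
    return (int32_t)(b - a) < 0;
}
```

At 1000 ticks per second the counter lasts a little under 50 days, so a
timeout of tens of minutes fits comfortably in one word. At one tick per
nanosecond it wraps in about 4.3 seconds, which is why such a clock
would force 64-bit values.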


Re: [rfc patch 2/2] direct-io: remove address alignment check

2005-07-14 Thread Badari Pulavarty
How does your patch ensure that we meet the driver alignment
restrictions ? Like you said, you need at least "even" byte alignment
for IDE etc..

And also, are there any restrictions on how much the "minimum" IO
size has to be ? I mean, can I read "1" byte ? I guess you are
not relaxing it (yet)..

Thanks,
Badari

On Wed, 2005-07-13 at 16:43 -0700, Daniel McNeil wrote:
> This patch relaxes the direct i/o alignment check so that user addresses
> do not have to be a multiple of the device block size.
> 
> I've done some preliminary testing and it mostly works on an ext3
> file system on a ide disk.  I have seen trouble when the user address
> is on an odd byte boundary.  Sometimes the data is read back incorrectly
> on read and sometimes I get these kernel error messages:
>   hda: dma_timer_expiry: dma status == 0x60
>   hda: DMA timeout retry
>   hda: timeout waiting for DMA
>   hda: status error: status=0x58 { DriveReady SeekComplete DataRequest }
>   ide: failed opcode was: unknown
>   hda: drive not ready for command
> 
> Doing direct-io with user addresses on even, non-512 boundaries appears
> to be working correctly.
> 
> Any additional testing and/or comments welcome.
> 
> Signed-off-by: Daniel McNeil <[EMAIL PROTECTED]>
> 
> --- linux-2.6.12.orig/fs/direct-io.c  2005-06-28 16:39:39.0 -0700
> +++ linux-2.6.12/fs/direct-io.c   2005-06-28 16:39:59.0 -0700
> @@ -1147,7 +1147,9 @@ __blockdev_direct_IO(int rw, struct kioc
>   goto out;
>   }
>  
> - /* Check the memory alignment.  Blocks cannot straddle pages */
> + /*
> +  * Check the i/o.  It must be a multiple of device block size.
> +  */
>   for (seg = 0; seg < nr_segs; seg++) {
>   addr = (unsigned long)iov[seg].iov_base;
>   size = iov[seg].iov_len;
> @@ -1156,7 +1158,7 @@ __blockdev_direct_IO(int rw, struct kioc
>   if (bdev)
>blkbits = bdev_blkbits;
>   blocksize_mask = (1 << blkbits) - 1;
> - if ((addr & blocksize_mask) || (size & blocksize_mask))
> + if (size & blocksize_mask)
>   goto out;
>   }
>   }
> 
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-aio' in
> the body to [EMAIL PROTECTED]  For more info on Linux AIO,
> see: http://www.kvack.org/aio/
> Don't email: [EMAIL PROTECTED]
> 



Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt

2005-07-14 Thread Lee Revell
On Thu, 2005-07-14 at 09:37 -0700, Linus Torvalds wrote:
> I have to say, this whole thread has been pretty damn worthless in
> general in my not-so-humble opinion.
> 

This thread has really gone OT, but to revisit the original issue for a
bit, are you still unwilling to consider leaving the default HZ at 1000
for 2.6.13?

Lee



Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt

2005-07-14 Thread Linus Torvalds


On Thu, 14 Jul 2005, Alan Cox wrote:
>
> > just doesn't realize that the latter is a bit more complicated exactly 
> > because the latter is a hell of a lot more POWERFUL. Trying to get rid of 
> > jiffies for some religious reason is _stupid_.
> 
> Getting rid of jiffies in its current form is a huge win for very
> non-religious reasons. Jiffies is expensive in power management and
> virtualisation because you have to maintain it.

No, you're now confusing "interrupts" with "jiffies".

There is no conceptual 1:1 thing between those two.

It so happens that traditionally we've kept them 1:1, but there's nothing 
that says that we can't slow down the interrupt source and just increment 
jiffies by a factor of the slowdown when the interrupt _does_ happen.

But no, that does NOT mean that "jiffies" should just count nanoseconds, 
and the problem would be solved. The fact is, most users of jiffies only 
care about the low 32 bits on 32-bit architectures, and that's fine as 
long as jiffies are in the millisecond range, since it still leaves a 
useful timeout value for almost everything (and then only long-range stuff 
needs to use "u64" for their timeouts).

In other words, we want a clock that is _known_ to not be very accurate,
but that is easy to just read from a memory location, and that has some
relationship to a timer tick in the sense that it should be at least in
the order-of-magnitude range for what a timer tick can cause.

Anybody who asks for nanoseconds is confused. That just forces you to use 
a 64-bit value, where no such value is needed. Things like TCP 
retransmission timeouts would be totally _idiotic_ to be made in 
nanoseconds: it would just make the socket data structures larger, and it 
has zero relevance, since the actual timer tick doesn't have that kind of 
resolution _anyway_.

The current "jiffies" actually fits all of these problems _wonderfully_
well. Yes, it needs to be converted from "struct timeval" and friends, but
it needs to be converted exactly _because_ of the good properties it has,
namely that it fits in a word, and is _relevant_ to what a timer interrupt
ends up being.

Look at 99% of the use of jiffies: it uses _jiffies_. It doesn't use
"jiffies_64", even though that's actually what gets updated. And it does
that _exactly_ because almost _nobody_ cares to pay the price of 64 bit
issues (both structure memory usage, and atomicity costs on 32-bit
architectures).

And I claim that you _cannot_ do better.

But what you can do is to have HZ at some reasonably high value (ie in the 
kHz range), and then slow down the system clock to conserve energy, and 
increment jiffies by 16 or 32 when in "slow clock mode". And then, when 
there is a multimedia app or something that asks for high-precision
timers, you speed the interrupts up again, and you increment jiffies by 1.

It's that simple. And it really _is_ simple. 

And guys, the fact is, jiffies works _today_. Making this change won't 
break anything, and won't introduce any new concepts, and won't break any 
existing drivers. In contrast, introducing _yet_ another timekeeping 
mechanism is not only going to be objectively _worse_ than jiffies from a 
technical standpoint, it's going to be a total disaster from a transition 
standpoint too. 

Linus
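
The "slow clock mode" described above can be sketched in user-space C.
Everything here is a hypothetical model rather than kernel code: struct
tick_state, timer_irq(), run_irqs() and read_jiffies() are made-up
names, and locking is left out entirely. The point is only that the
interrupt rate and the jiffies increment are decoupled, while readers
keep using the cheap low 32 bits.

```c
#include <stdint.h>

struct tick_state {
    uint64_t jiffies_64;        /* full counter; this is what gets updated */
    unsigned int ticks_per_irq; /* 1 in fast mode, e.g. 16 in slow mode */
};

/* One timer interrupt: advance by however many ticks it stands for. */
static void timer_irq(struct tick_state *ts)
{
    ts->jiffies_64 += ts->ticks_per_irq;
}

/* Deliver n timer interrupts in a row. */
static void run_irqs(struct tick_state *ts, unsigned int n)
{
    while (n--)
        timer_irq(ts);
}

/* What most readers want: just the low 32 bits. */
static uint32_t read_jiffies(const struct tick_state *ts)
{
    return (uint32_t)ts->jiffies_64;
}
```

Running 1000 interrupts with ticks_per_irq = 1 and then 10 interrupts
with ticks_per_irq = 16 leaves the counter at 1160, the same value that
1160 fast-mode interrupts would have produced, so code that only
compares jiffies values does not need to know which mode was active.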


Re: [PATCH] Add security_task_post_setgid

2005-07-14 Thread Chris Wright
* Jan Engelhardt ([EMAIL PROTECTED]) wrote:
> the following patch adds a post_setgid() security hook, and necessary dummy 
> funcs.

why?


Re: Problem with kernel 2.6.11

2005-07-14 Thread Bernd Schubert
Hello Francois,

> > > I have a problem with a program named Gaussian (http://www.gaussian.com)
> > > (versions g98 or g03) and FC 4.0 (default kernel 2.6.11): I am used to
> > > take Gaussian binaries compiled on the RedHat 9.0 version, and used them
> > > on FC 2.0 or FC 3.0. If I try to do so, on FC 4.0. (with the default
> > > kernel) Gaussian stops (both g98 and g03 versions) with the following
> > > error message:

could you please tell me which compiler you used to compile Gaussian?
It's most likely pgf77 (PGI), but the version is also important. If
it was 5.2, you just ran into bugs we already experienced some time ago.
I also posted a warning about that to the CCL list. On the CCL list I also saw
there were problems with PGI-6.0, but I never bothered to test this
myself, as our gaussian-binaries compiled with PGI-5.1 seem to work
fine. Also, binaries from the PGI compiler are in our experience rather
sensitive to the glibc version. I'm not absolutely sure what's causing
that, but somehow I'm under the impression that the PGI libraries, which all
binaries created with the PGI compiler are linked with, do some odd
optimizations.  So to make sure that it's really a kernel issue you should use
the libc of the compiler system (via LD_LIBRARY_PATH) or compile Gaussian
statically.

> stat64("/home/fyd/0QM_SCR/Gau-3174.inp", 0xbf9db114) = -1 ENOENT (No such file

I'm a bit tired now and maybe I'm interpreting it wrong, but I think you
should use strace -f ...

> rt_sigprocmask(SIG_SETMASK, [RTMIN], NULL, 8) = 0
> --- SIGCHLD (Child exited) @ 0 (0) ---

Same here.

Cheers,
Bernd

-- 
Bernd Schubert
PCI / Theoretische Chemie
Universität Heidelberg
INF 229
69120 Heidelberg
e-mail: [EMAIL PROTECTED]



Re: [patch 2.6] remove PCI_BRIDGE_CTL_VGA handling from setup-bus.c

2005-07-14 Thread Ivan Kokshaysky
On Thu, Jul 14, 2005 at 06:39:43PM -0400, Jon Smirl wrote:
> I had the wrong define, this is the one I was thinking of 
> IORESOURCE_BUS_HAS_VGA

Oh, I definitely agree about that one. It's been unused for a couple of
years, at least. Let's kill it, please.

Ivan.


Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt

2005-07-14 Thread Chris Wedgwood
On Thu, Jul 14, 2005 at 01:41:44PM -0700, Christoph Lameter wrote:

> AFAIK John simply wants to change jiffies to count in nanoseconds
> since bootup and then call it "clock_monotonic".

Clocks and counters drift, so calling it seconds would be
misleading.  It would really only be good for approximate timing.

I think we should call it something arbitrary and work towards having a
separate mechanism for time of day (which could end up being much more
expensive to use but less frequently needed).

> One 64 bit value no splitting into seconds and nanoseconds anymore.

Using a 64-bit value is a pain on some (many?) 32-bit CPUs.


Re: Thread_Id

2005-07-14 Thread J.A. Magallon

On 07.14, RVK wrote:
> Ian Campbell wrote:
> 
> >On Thu, 2005-07-14 at 16:32 +0530, RVK wrote:
> >  
> >
> >>Ian Campbell wrote:
> >>
> >>
> >>>What Arjan is saying is that pthread_t is a cookie -- this means that
> >>>you cannot interpret it in any way, it is just a "thing" which you can
> >>>pass back to the API, that pthread_t happens to be typedef'd to unsigned
> >>>long int is irrelevant.
> >>>  
> >>>
> >>Do you want to say for both 2.6.x and 2.4.x I should interpret that way ?
> >>
> >>
> >
> >As I understand it, yes, you should never try and assign any meaning to
> >the values. The fact that you may have been able to find some apparent
> >meaning under 2.4 is just a coincidence.
> >
> >  
> >
> I am sorry, I don't agree on this. This confusion has been created only because
> of the different behavior of pthread_self() on 2.4.18 and 2.6.x kernels.
> And I am looking to clarify my doubt. I can't digest this at all.
> 

It is simple: no one ever told you that a pthread_t has anything to do
with a pid. pthreads is a standard and portable interface that
guarantees you can port _pthread_ code between POSIX systems. It uses
an internal opaque type to identify threads, but you should never rely on
it having anything to do with pids. The fact that somewhere-in-time-in-some-os
the pthread_t equals the pid/tid/etc. is just pure chance. If you had
code relying on this, it is just broken. Where is it stated whether pthread_t
is the tid, an index into a table internal to the pthread library, a pointer
to a struct (mmm, broken on 64 bits?) or what ?

What if:
- you switch kernels and thread library implementation ?
- you go solaris (it has user level threads ?)

I think one of the sources of the confusion is that:
- man pages about system calls talk about 'threads', but that should be
  read as 'sibling _processes_ created via clone(CLONE_THREAD) syscall'.
- man pages about the pthreads library also talk about 'threads', but that
  should be read as 'posix threads created via pthread_create'.
And no one guarantees that both 'threads' are the same.

If you just want to use gettid(), don't go further that clone().
If you use pthread_create(), forget about gettid().

(AFAIK ;) )
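
To make the distinction concrete, here is a minimal sketch, assuming Linux with a POSIX threads library. The names worker_report, kernel_tid() and run_worker() are made up for illustration. The pthread_t values are only ever compared with pthread_equal(), while the kernel thread id is fetched separately through the raw SYS_gettid syscall and is never derived from the pthread_t.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical record filled in by the worker thread. */
struct worker_report {
	pthread_t creator;     /* opaque handle of the creating thread */
	int same_as_creator;   /* pthread_equal() result seen by the worker */
	pid_t tid;             /* kernel thread id of the worker (Linux only) */
};

/* Kernel thread id of the caller, via the raw syscall (Linux-specific). */
static pid_t kernel_tid(void)
{
	return (pid_t)syscall(SYS_gettid);
}

static void *worker(void *arg)
{
	struct worker_report *r = arg;

	/* Portable: compare opaque handles only through pthread_equal(). */
	r->same_as_creator = pthread_equal(pthread_self(), r->creator) != 0;
	/* Non-portable: ask the kernel directly, never cast a pthread_t. */
	r->tid = kernel_tid();
	return NULL;
}

/* Start one worker, wait for it, and return 0 on success, -1 on failure. */
static int run_worker(struct worker_report *r)
{
	pthread_t t;

	r->creator = pthread_self();
	if (pthread_create(&t, NULL, worker, r) != 0)
		return -1;
	return pthread_join(t, NULL) == 0 ? 0 : -1;
}
```

Calling run_worker() from main() and printing the tid shows a kernel id that the program obtained without ever looking inside a pthread_t, which is the only usage the standard promises will keep working across thread libraries.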

--
J.A. Magallon  \   Software is like sex:
werewolf!able!es \ It's better when it's free
Mandriva Linux release 2006.0 (Cooker) for i586
Linux 2.6.12-jam9 (gcc 4.0.1 (4.0.1-0.2mdk for Mandriva Linux release 2006.0))




Re: [announce] linux kernel performance project launch at sourceforge.net

2005-07-14 Thread Andi Kleen
"Chen, Kenneth W" <[EMAIL PROTECTED]> writes:

> I'm pleased to announce that we have established a linux kernel
> performance project, hosted at sourceforge.net:
> 
> http://kernel-perf.sourceforge.net

That's very cool. Thanks a lot.

Would it be possible to add 2.4.30 numbers and perhaps one or two 
distro kernels (let's say RHEL3/4, SLES8/9) to the graphs 
as data points for comparison? These are all very tuned
kernels and would show where mainline is worse than them.

Also how did you run netperf? Locally or to some other machine? 
Perhaps that should be documented.

Some oprofile listings from a few of the test runs would be also nice.

-Andi


Re: Buffer Over-runs, was Open source firewalls

2005-07-14 Thread Brian O'Mahoney
First, there are endless ways of stopping DAMAGE from buffer
over-runs in code that accepts user data, e.g. extend the buffer, don't
use dangerous strxxx functions, and so on. So while you can move
stuff to proxies, and that has been done extensively, e.g.
for sendmail, it is a cop-out; far better to fix the application.
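
As an illustration of the "don't use dangerous strxxx functions" point, here is a minimal sketch of a bounded copy built on the standard snprintf(). The helper name copy_bounded() is hypothetical; the idea is only that the destination size is always passed along and truncation is reported instead of silently overrunning the buffer.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Copy src into dst, which has room for dst_size bytes.
 * The result is always NUL-terminated when dst_size > 0.
 * Returns 0 if the whole string fit, -1 if it had to be truncated
 * (or if dst_size is 0), so the caller can reject oversized input.
 */
static int copy_bounded(char *dst, size_t dst_size, const char *src)
{
	int n;

	if (dst_size == 0)
		return -1;

	/* snprintf never writes more than dst_size bytes, terminator included. */
	n = snprintf(dst, dst_size, "%s", src);
	if (n < 0 || (size_t)n >= dst_size)
		return -1;
	return 0;
}
```

A caller would typically treat the -1 return as "input rejected" rather than carry on with the truncated copy.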

Next, while all buffer over-runs are very bad, it is only those
that stamp on the stack, overwriting the return address stored
there and implanting viral code to be executed, that are truly
__EVIL__.

To do that you need to know a lot of things: the architecture,
i.e. executing x86 code on a ppc will get you nowhere; you must
know, and be able to debug your mal-ware against, a stable
target, and this is why the _VERY_ slowly patched Windoze is
so vulnerable; and finally you really need to know the stack
base, top of stack, normally growing downward, and ... be able
to actually run code out of the stack space;

and if any one of these conditions is not true, e.g. I compiled
sendmail with a newer GCC, stack is not executable, ...

the exploit just fails or crashes an app and then you go after
why?

but your system is not compromised.

One final point: in practice, you get lots of unwanted packets
off the internet, and in general you do not want them on your
internal net, both for performance and security reasons. If you
drop them on your router or firewall then you don't need to
worry if the remote app is mal-ware.

-- 
mit freundlichen Grüßen, Brian.



Re: [PATCH] Fix the recent C-state with FADT regression

2005-07-14 Thread Andrew Morton
Venkatesh Pallipadi <[EMAIL PROTECTED]> wrote:
>
> 
> 
> Attached patch fixes the recent C-state based on FADT regression reported by
> Kevin.
> 
> Please apply.
> 
> Thanks,
> Venki
> 
> 
> Fix the regression with c1_default_handler on some systems where C-states come
> from FADT.
> 
> Thanks to Kevin Radloff for identifying the issue and also root causing on 
> exact line of code that is causing the issue.
> 
> Signed-off-by: Venkatesh Pallipadi <[EMAIL PROTECTED]>
> 
> diff -purN  linux-2.6.13-rc1-mm1//drivers/acpi/processor_idle.c.org 
> linux-2.6.13-rc1-mm1//drivers/acpi/processor_idle.c
> --- linux-2.6.13-rc1-mm1//drivers/acpi/processor_idle.c.org   2005-07-14 
> 23:19:45.038854688 -0700
> +++ linux-2.6.13-rc1-mm1//drivers/acpi/processor_idle.c   2005-07-14 
> 23:21:47.292269344 -0700
> @@ -881,7 +881,7 @@ static int acpi_processor_get_power_info
>   result = acpi_processor_get_power_info_cst(pr);
>   if ((result) || (acpi_processor_power_verify(pr) < 2)) {
>   result = acpi_processor_get_power_info_fadt(pr);
> - if (result)
> + if ((result) || (acpi_processor_power_verify(pr) < 2))
>   result = acpi_processor_get_power_info_default_c1(pr);
>   }
>

It turns out I've had this in my tree since July 6 (from Jindrich
Makovicka), sent to Len on July 9.  Maybe he's merged it somewhere already.

I have seven acpi patches here, some of which have been in -mm for a very
long time.  I'll resend them all.  Could someone please promptly ack them for
2.6.13 or merge them somewhere or nack the things?

Thanks.


RE: [announce] linux kernel performance project launch at sourceforge.net

2005-07-14 Thread Chen, Kenneth W
Alexey Dobriyan wrote on Thursday, July 14, 2005 3:34 PM
> On Friday 15 July 2005 00:21, Chen, Kenneth W wrote:
> > I'm pleased to announce that we have established a linux kernel
> > performance project, hosted at sourceforge.net:
> > 
> > http://kernel-perf.sourceforge.net
> 
> Perhaps, some cool-looking graphs instead of tables. Or you can write
> the numbers in red where the left derivative is smaller than zero. ;-)

I think sourceforge is being pounded hard by all the friendly kernel
hackers hitting the kernel performance page :-)  The site is just being
slow.  There are both tables and color charts in the result pages.

- Ken



Re: [announce] linux kernel performance project launch at sourceforge.net

2005-07-14 Thread Andi Kleen
> > Some oprofile listings from a few of the test runs would be also nice.
> 
> That is in the works.  We will upload profile data.  I'm having problems
> with oprofile on some kernel versions and that is being investigated
> right now.

If you run statically compiled kernels you could as well use the
old style readprofile.  It just doesn't work with modules.

-Andi


Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt

2005-07-14 Thread Alan Cox
On Iau, 2005-07-14 at 21:13, Linus Torvalds wrote:
> There is no way to avoid having some kind of counter to specify time.  
> NONE. The only choice is what you base your notion of time on, and how you
> represent it. Do you represent it as two separate counters and try to make
> it look like "fractions of seconds", or do you represent it as a single
> counter, and make it look like "ticks".

I suspect the problem for some of this is that people think of jiffies
as incrementing by 1. If HZ is right then jiffies can be in ns, it just
won't increment by 1. It's also why jiffies() is better on some platforms
because many machines can answer "what time is it" far more accurately
than they can set interrupts.




Re: [patch 2.6] remove PCI_BRIDGE_CTL_VGA handling from setup-bus.c

2005-07-14 Thread Jon Smirl
On 7/14/05, Ivan Kokshaysky <[EMAIL PROTECTED]> wrote:
> It shouldn't be a problem. These days we have a lot of arch hooks
> in the PCI layer. I'd probably start with the following:

You need to take this code into account, from arch/i386/pci/fixup.c

/*
 * Fixup to mark boot BIOS video selected by BIOS before it changes
 *
 * From information provided by "Jon Smirl" <[EMAIL PROTECTED]>
 *
 * The standard boot ROM sequence for an x86 machine uses the BIOS
 * to select an initial video card for boot display. This boot video
 * card will have its BIOS copied to 0xC0000 in system RAM.
 * IORESOURCE_ROM_SHADOW is used to associate the boot video
 * card with this copy. On laptops this copy has to be used since
 * the main ROM may be compressed or combined with another image.
 * See pci_map_rom() for use of this flag. IORESOURCE_ROM_SHADOW
 * is marked here since the boot video device will be the only enabled
 * video device at this point.
 */

static void __devinit pci_fixup_video(struct pci_dev *pdev)
{
	struct pci_dev *bridge;
	struct pci_bus *bus;
	u16 config;

	if ((pdev->class >> 8) != PCI_CLASS_DISPLAY_VGA)
		return;

	/* Is VGA routed to us? */
	bus = pdev->bus;
	while (bus) {
		bridge = bus->self;
		if (bridge) {
			pci_read_config_word(bridge, PCI_BRIDGE_CONTROL,
						&config);
			if (!(config & PCI_BRIDGE_CTL_VGA))
				return;
		}
		bus = bus->parent;
	}
	pci_read_config_word(pdev, PCI_COMMAND, &config);
	if (config & (PCI_COMMAND_IO | PCI_COMMAND_MEMORY)) {
		pdev->resource[PCI_ROM_RESOURCE].flags |= IORESOURCE_ROM_SHADOW;
		printk(KERN_DEBUG "Boot video device is %s\n", pci_name(pdev));
	}
}
DECLARE_PCI_FIXUP_HEADER(PCI_ANY_ID, PCI_ANY_ID, pci_fixup_video);

-- 
Jon Smirl
[EMAIL PROTECTED]


Re: [patch 2.6] remove PCI_BRIDGE_CTL_VGA handling from setup-bus.c

2005-07-14 Thread Jon Smirl
On 7/14/05, Ivan Kokshaysky <[EMAIL PROTECTED]> wrote:
> On Thu, Jul 14, 2005 at 10:07:34AM -0400, Jon Smirl wrote:
> > I don't think it has ever been working in the 2.6 series. If you are
> > getting rid of it get rid of the #define PCI_BRIDGE_CTL_VGA in pci.h
> > too since this code was the only user.
> 
> No. The PCI_BRIDGE_CTL_VGA is not something artificial, it just describes
> some well defined hardware bit in the p2p bridge config header, so anyone
> working on VGA switching API will have to use it.

I had the wrong define; the one I was thinking of is IORESOURCE_BUS_HAS_VGA.

> > This code is part of VGA arbitration which BenH is addressing with a
> > more globally comprehensive patch. Ben's code will probably replace
> > it.
> 
> Yes, I've heard Ben is working on this, but I've yet to see the code. ;-)
> Any pointers?
> 
> Ivan.
> 


-- 
Jon Smirl
[EMAIL PROTECTED]


Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt

2005-07-14 Thread Alan Cox
> just doesn't realize that the latter is a bit more complicated exactly 
> because the latter is a hell of a lot more POWERFUL. Trying to get rid of 
> jiffies for some religious reason is _stupid_.

Getting rid of jiffies in its current form is a huge win for very
non-religious reasons. Jiffies is expensive in power management and
virtualisation because you have to maintain it.

Swap jiffies for jiffies() and the world gets a lot better. 

In actual fact you also want to fix users of

while(time_before(foo, jiffies)) { whack(mole); }

to become

init_timeout();
timeout.expires = jiffies + n;
add_timeout();
while(!timeout_expired()) {}

Which is a trivial wrapper around timers as we have them now




Re: [PATCH] Add security_task_post_setgid

2005-07-14 Thread Christoph Hellwig
On Thu, Jul 14, 2005 at 11:42:46PM +0200, Jan Engelhardt wrote:
> Hi,
> 
> 
> the following patch adds a post_setgid() security hook, and necessary dummy 
> funcs.

... and why exactly would we want these?



RE: 2.6.9: serial_core: uart_open

2005-07-14 Thread karl malbrain
> -Original Message-
> From: Russell King
> Sent: Thursday, July 14, 2005 11:57 AM
> To: karl malbrain
> Cc: [EMAIL PROTECTED] Kernel. Org
> Subject: Re: 2.6.9: serial_core: uart_open
>
>
> On Thu, Jul 14, 2005 at 10:16:23AM -0700, karl malbrain wrote:
> > I'd love to do a ps listing for you, but, except for the mouse,
> the system
> > is completely unresponsive after issuing the blocking open("/dev/ttyS1",
> > O_RDWR).
> >
> > Telnet is dead; the console will respond to the mouse, but the
> only thing I
> > can do is close the xterm window and thereby kill the process.
> I can launch
> > a new xterm window from the menu using the mouse, but the new
> window is dead
> > and has no cursor nor bash prompt.
> >
> > The clock on the display is being updated.  After several hours
> the system
> > reboots on its own.
> >
> > I recall from my DOS days that 8250/16550 programming requires that the
> > specific IIR source be responded to, or the chip will immediately
> > turn-around with another interrupt.  It doesn't look like 8250.c is
> > organized to respond directly to the modem-status-change value
> in IIR which
> > requires reading MSR to reset.
>
> Well, at this point interrupts are enabled, and _are_ handled.  The
> only thing we use the IIR for is to answer the question "did this
> device say it had an interrupt?"
>
> If it did, we unconditionally read the MSR without fail.
>
> So, I've no idea what so ever about what's going on here.  I don't
> understand why your system is behaving the way it is.  Therefore,
> I don't think we can progress this any further, sorry.

AT LAST I HAVE SOME DATA!!!

The problem is that ALL SYSTEM CALLS to open "/dev/tty" are blocking!! even
with O_NDELAY set and even from completely disjoint sessions.  I discovered
this via issuing "strace sh".  That's why the new xterm windows froze.

The original process doing the open("/dev/ttyS1", O_RDWR) is listed in the
ps aux listing as status S+.
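
For reference, a minimal sketch of the kind of open that would normally be expected to return at once. O_NONBLOCK is the POSIX spelling of the older O_NDELAY flag on Linux; O_NOCTTY is added here as an assumption so the port does not become the controlling terminal. The helper names are hypothetical.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Open a device node for read/write without waiting for carrier
 * and without making it the controlling terminal.
 * Returns a file descriptor, or -1 on error (errno is set by open()).
 */
static int open_nonblocking(const char *path)
{
	return open(path, O_RDWR | O_NONBLOCK | O_NOCTTY);
}

/* Report whether O_NONBLOCK is set on an already open descriptor. */
static int is_nonblocking(int fd)
{
	int flags = fcntl(fd, F_GETFL);

	return flags != -1 && (flags & O_NONBLOCK) != 0;
}
```

Running a small program built around open_nonblocking("/dev/ttyS1") under strace would show whether the open() call itself is the one that never returns.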

Hope this helps karl m





Re: pci_size() error condition

2005-07-14 Thread Ivan Kokshaysky
On Thu, Jul 14, 2005 at 11:04:00AM -0500, John Rose wrote:
> Okay, point taken :)  So for cases of base == maxbase, why would we ever
> want to return a nonzero value?  What is the intended purpose of the
> second part of that conditional?

Well, just two examples (both for PCI IO limited to 16 bits for simplicity,
but still from real life):
1. Consider some BAR that defines 16 bytes of IO space. It's
   perfectly valid for the PCI firmware to program this BAR to
   its max value, so after writing all 1s during the probe and proper
   masking we have base == maxbase == 0xfff0. But, since all high
   order bits are all 1s, (((base | size) & mask) != mask) is false,
   and we return correct value of 16.
2. Another BAR of some broken PCI device (typically, IDE controller)
   has *read-only* value of 0x1f0, for instance. After writing all 1s
   we still read back the same 0x1f0, so base == maxbase == 0x1f0.
   But the second part of that "if" clause is now true, so we return 0,
   which means that the BAR is invalid and must be ignored.
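
The two cases can be checked with a small standalone model of the sizing logic. This is a simplified sketch, not the kernel's pci_size() verbatim: the name bar_size() is made up, it returns the decoded size in bytes (0 for an invalid BAR), and the mask 0xfffc used in the usage note is an assumption standing in for a 16-bit I/O BAR whose two low bits are flag bits.

```c
#include <stdint.h>

/*
 * base:    BAR value as originally programmed (flag bits masked off)
 * maxbase: value read back after writing all 1s (flag bits masked off)
 * mask:    which bits of the BAR are address bits
 * Returns the decoded size in bytes, or 0 if the BAR must be ignored.
 */
static uint32_t bar_size(uint32_t base, uint32_t maxbase, uint32_t mask)
{
	uint32_t bits = mask & maxbase;   /* address bits the device implements */
	uint32_t extent;

	if (!bits)
		return 0;

	/* The lowest implemented bit is the decode size; minus 1 is the extent. */
	extent = (bits & ~(bits - 1)) - 1;

	/*
	 * base == maxbase is only legitimate if the BAR was already
	 * programmed to its maximum, i.e. all high-order bits are 1s.
	 */
	if (base == maxbase && ((base | extent) & mask) != mask)
		return 0;

	return extent + 1;
}
```

With these assumptions, bar_size(0xfff0, 0xfff0, 0xfffc) gives 16 for the first example, and bar_size(0x1f0, 0x1f0, 0xfffc) gives 0 for the read-only BAR in the second.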

Ivan.


Re: [RFC][PATCH 0/4] new human-time soft-timer subsystem

2005-07-14 Thread Roman Zippel
Hi,

On Thu, 14 Jul 2005, Nishanth Aravamudan wrote:

> We no longer use jiffies (the variable) as the basis for determining
> what "time" a timer should expire or when it should be added. Instead,
> we use a new function, do_monotonic_clock(), which is simply a wrapper
> for getnstimeofday().

And suddenly a simple 32bit integer becomes a complex 64bit integer, which 
requires hardware access to read a timer and additional conversion into ns.
Why is suddenly everyone so obsessed with molesting something simple and 
cute as jiffies?

bye, Roman


Re: [announce] linux kernel performance project launch at sourceforge.net

2005-07-14 Thread Alexey Dobriyan
On Friday 15 July 2005 00:21, Chen, Kenneth W wrote:
> I'm pleased to announce that we have established a linux kernel
> performance project, hosted at sourceforge.net:
> 
> http://kernel-perf.sourceforge.net

Perhaps, some cool-looking graphs instead of tables. Or you can write the
numbers in red where the left derivative is smaller than zero. ;-)

> Comprehensive performance data from our tests will be published for easy
> access. 

Great! No, really. This means statistical errors.


RE: [announce] linux kernel performance project launch at sourceforge.net

2005-07-14 Thread Chen, Kenneth W
[EMAIL PROTECTED] wrote on Thursday, July 14, 2005 3:18 PM
> "Chen, Kenneth W" <[EMAIL PROTECTED]> writes:
> > I'm pleased to announce that we have established a linux kernel
> > performance project, hosted at sourceforge.net:
> > 
> > http://kernel-perf.sourceforge.net
> 
> That's very cool. Thanks a lot.

Thank you.


> Would it be possible to add 2.4.30 numbers and perhaps one or two 
> distro kernels (let's say RHEL3/4, SLES8/9) to the graphs 
> as data points for comparison? These are all very tuned
> kernels and would show where mainline is worse than them.

We did have a distro kernel in the graph originally and later decided
to go with pure mainline kernels for consistency.  We will see what we
can do to add them in the future.


> Also how did you run netperf? Locally or to some other machine? 
> Perhaps that should be documented.

Yes, that was netperf, running locally.  Thanks for the suggestion, I
will document that appropriately.


> Some oprofile listings from a few of the test runs would be also nice.

That is in the works.  We will upload profile data.  I'm having problems
with oprofile on some kernel versions and that is being investigated
right now.

- Ken



[patch 1/2] block/cpqarray: Audit return code of create_proc_*

2005-07-14 Thread domen
From: Christophe Lucas <[EMAIL PROTECTED]>


Audit return of create_proc_* functions.

Signed-off-by: Christophe Lucas <[EMAIL PROTECTED]>
Signed-off-by: Domen Puncer <[EMAIL PROTECTED]>
---
 cpqarray.c |7 ++-
 1 files changed, 6 insertions(+), 1 deletion(-)

Index: quilt/drivers/block/cpqarray.c
===
--- quilt.orig/drivers/block/cpqarray.c
+++ quilt/drivers/block/cpqarray.c
@@ -212,13 +212,18 @@ static struct proc_dir_entry *proc_array
  */
 static void __init ida_procinit(int i)
 {
+   struct proc_dir_entry *ent;
+
if (proc_array == NULL) {
proc_array = proc_mkdir("cpqarray", proc_root_driver);
if (!proc_array) return;
}
 
-   create_proc_read_entry(hba[i]->devname, 0, proc_array,
+   ent = create_proc_read_entry(hba[i]->devname, 0, proc_array,
   ida_proc_get_info, hba[i]);
+   if (!ent)
+   printk(KERN_WARNING 
+   "cpqarray: Unable to create /proc entry.\n");
 }
 
 /*

--


[patch 2/2] drivers/block/umem.c: Use DMA_{32,64}BIT_MASK and correct call to pci_set_dma_mask()

2005-07-14 Thread domen
From: Tobias Klauser <[EMAIL PROTECTED]>


Since nobody replied to Domen's request for clarification [1], here's a
patch to fix drivers/block/umem.c to correctly evaluate the return value
of pci_set_dma_mask() both times. The function returns non-zero on
error, so this seems to be correct.

[1] http://lists.osdl.org/mailman/htdig/kernel-janitors/2005-May/004119.html

Use the DMA_{64,32}BIT_MASK constants from dma-mapping.h when calling
pci_set_dma_mask()
This patch includes dma-mapping.h explicitly because it caused errors
on some architectures otherwise.
See http://marc.theaimsgroup.com/?t=10800199301=1=2 for details

Signed-off-by: Tobias Klauser <[EMAIL PROTECTED]>
Signed-off-by: Domen Puncer <[EMAIL PROTECTED]>

---
 umem.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

Index: quilt/drivers/block/umem.c
===
--- quilt.orig/drivers/block/umem.c
+++ quilt/drivers/block/umem.c
@@ -892,8 +892,8 @@ static int __devinit mm_pci_probe(struct
 printk(KERN_INFO "Micro Memory(tm) controller #%d found at %02x:%02x (PCI Mem Module (Battery Backup))\n",
   card->card_number, dev->bus->number, dev->devfn);
 
-   if (pci_set_dma_mask(dev, 0xLL) &&
-   !pci_set_dma_mask(dev, 0xLL)) {
+   if (pci_set_dma_mask(dev, DMA_64BIT_MASK) &&
+   pci_set_dma_mask(dev, DMA_32BIT_MASK)) {
printk(KERN_WARNING "MM%d: NO suitable DMA found\n",num_cards);
return  -ENOMEM;
}
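
The intent of the corrected condition can be modelled outside the kernel. In this sketch set_mask_fn is a hypothetical stand-in for pci_set_dma_mask() with the same convention (0 on success, non-zero on error), and the three dev_* functions are made-up devices used only to exercise the logic. Probing should fail only when both the 64-bit and the 32-bit mask are rejected.

```c
/* Hypothetical stand-in for pci_set_dma_mask(): 0 on success, non-zero on error. */
typedef int (*set_mask_fn)(unsigned long long mask);

#define MASK_64 0xffffffffffffffffULL
#define MASK_32 0x00000000ffffffffULL

/* Try the 64-bit mask first, fall back to 32-bit; -1 only if both fail. */
static int pick_dma_mask(set_mask_fn set_mask)
{
	if (set_mask(MASK_64) && set_mask(MASK_32))
		return -1;
	return 0;
}

/* Made-up devices for illustration. */
static int dev_64bit(unsigned long long mask)
{
	(void)mask;
	return 0;                          /* accepts any mask */
}

static int dev_32bit_only(unsigned long long mask)
{
	return mask == MASK_32 ? 0 : -1;   /* rejects the 64-bit mask */
}

static int dev_no_dma(unsigned long long mask)
{
	(void)mask;
	return -1;                         /* rejects every mask */
}
```

With the old form of the test (the second call negated), the 32-bit-only device would have been refused and the device that rejects every mask would have been accepted, which is the inversion the patch removes.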

--


[patch 1/1] Audit return code of create_proc_*

2005-07-14 Thread domen
From: Christophe Lucas <[EMAIL PROTECTED]>


Audit return of create_proc_* functions.

Signed-off-by: Christophe Lucas <[EMAIL PROTECTED]>
Signed-off-by: Domen Puncer <[EMAIL PROTECTED]>
---
 ecard.c |   11 +--
 1 files changed, 9 insertions(+), 2 deletions(-)

Index: quilt/arch/arm26/kernel/ecard.c
===
--- quilt.orig/arch/arm26/kernel/ecard.c
+++ quilt/arch/arm26/kernel/ecard.c
@@ -522,9 +522,16 @@ static struct proc_dir_entry *proc_bus_e
 
 static void ecard_proc_init(void)
 {
+   struct proc_dir_entry *proc_entry;
proc_bus_ecard_dir = proc_mkdir("ecard", proc_bus);
-   create_proc_info_entry("devices", 0, proc_bus_ecard_dir,
-   get_ecard_dev_info);
+   if (!proc_bus_ecard_dir)
+   printk(KERN_WARNING "Unable to create proc dir entry.\n");
+   else {
+   proc_entry = create_proc_info_entry("devices", 0,
+   proc_bus_ecard_dir, get_ecard_dev_info);
+   if (!proc_entry)
+   printk(KERN_WARNING "ecard: Unable to create proc entry\n");
+   }
 }
 
 #define ec_set_resource(ec,nr,st,sz,flg)   \

--

