date:20050825

Re: 2.6.13-rc6: halt instead of reboot

2005-08-25 Thread Eric W. Biederman

Meelis Roos <[EMAIL PROTECTED]> writes:

>> When skimming through the code I thought that reboot_thru_bios was the
>> default.
>
> My bad. I retested it and reboot=w was the one that works.
>
>> If you can't track this down we can at least dig up your board DMI ID
>> and put it in the list of systems that need to go through the BIOS to reboot.
>
> I have good news - it is the ACPI merge commit
> 5028770a42e7bc4d15791a44c28f0ad539323807 that seems to break reboot. Also
> acpi=off works around it.
>
> So the "poweroff instead of reboot" seems to be an ACPI regression :(

Does this small patch fix the problem for you?

In another bug fix we started calling acpi_sleep_prepare on the shutdown
path called by reboot/halt/poweroff, so the code would always be called with
interrupts enabled.  This just modifies it so we only call this code
if we know we are going to power off the system.

That change was added in the patch you indicate as where the
regression started.  

---

 drivers/acpi/sleep/poweroff.c |    6 +++++-
 1 files changed, 5 insertions(+), 1 deletions(-)

7194b86c5e67aaf9ce8c25482441e87d700f057d
diff --git a/drivers/acpi/sleep/poweroff.c b/drivers/acpi/sleep/poweroff.c
--- a/drivers/acpi/sleep/poweroff.c
+++ b/drivers/acpi/sleep/poweroff.c
@@ -55,7 +55,11 @@ void acpi_power_off(void)
 
 static int acpi_shutdown(struct sys_device *x)
 {
-	return acpi_sleep_prepare(ACPI_STATE_S5);
+	if (system_state == SYSTEM_POWER_OFF) {
+		/* Prepare if we are going to power off the system */
+		return acpi_sleep_prepare(ACPI_STATE_S5);
+	}
+	return 0;
 }
 
 static struct sysdev_class acpi_sysclass = {
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.13-rc6: halt instead of reboot

2005-08-25 Thread Meelis Roos


> When skimming through the code I thought that reboot_thru_bios was the
> default.

My bad. I retested it and reboot=w was the one that works.

> If you can't track this down we can at least dig up your board DMI ID
> and put it in the list of systems that need to go through the BIOS to reboot.

I have good news - it is the ACPI merge commit
5028770a42e7bc4d15791a44c28f0ad539323807 that seems to break reboot.
Also acpi=off works around it.

So the "poweroff instead of reboot" seems to be an ACPI regression :(

--
Meelis Roos ([EMAIL PROTECTED])

Re: PATCH: ide: ide-disk freeze support for hdaps

2005-08-25 Thread Yani Ioannou

Hi Bartlomiej,

Thank you for your feedback :), as this is my first dabble in
ide/block drivers I certainly need it!

On 8/25/05, Bartlomiej Zolnierkiewicz <[EMAIL PROTECTED]> wrote:
> +config IDEDISK_FREEZE
> 
> Is there any advantage of having it as a config option?

The main reasons I added the config option:
- the freeze feature is really only useful to an (increasing) niche of
mobile computers with an accelerometer.

- it might actually be detrimental to most other systems, you would
never want to freeze the queue on most machines - especially a
production system, and for that reason alone it seemed sensible to me
to be able to selectively remove it completely.

- to re-inforce the experimental nature of the patch, and disable it
by default (although this could be achieved just with EXPERIMENTAL I
suppose).

> Please make the interface accept number of seconds (as suggested by Jens)
> and remove this module parameter. This way interface will be more flexible
> and cleaner.  I really don't see any advantage in doing "echo 1 > ..." instead
> of "echo x > ..." (Pavel, please explain).

Either way is easy enough to implement. Note though that I'd
expect the userspace app to thaw the device when danger is out of
the way (the timeout is mainly there to ensure that the queue isn't
frozen forever, and should probably be higher). Personally I don't
have much of an opinion either way though... what's the consensus?
:).

I can understand Pavel's opinion in that an enable/disable attribute in
sysfs seems the norm, and is more intuitive. Also what should 'cat
/sys/block/hda/device/freeze' return for a 'echo x >
/sys/block/hda/device/freeze' sysfs attribute? The seconds remaining?
1/0 for frozen/thawed?
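
To make the question concrete, here is one possible answer sketched in
plain userspace C (not sysfs code). It assumes that writing x freezes for
x seconds, writing 0 thaws immediately, and reading returns the seconds
remaining. The struct and function names are made up for illustration.

```c
#include <assert.h>

/* Hypothetical per-device freeze state; not taken from the patch. */
struct freeze_state {
	int frozen;		/* non-zero while the queue is frozen */
	unsigned long deadline;	/* time in seconds at which the freeze expires */
};

/* Models 'echo x > freeze': x > 0 freezes for x seconds, x == 0 thaws. */
void freeze_store(struct freeze_state *st, unsigned long now,
		  unsigned long seconds)
{
	st->frozen = seconds != 0;
	st->deadline = now + seconds;
}

/* Models 'cat freeze': seconds remaining, or 0 once thawed or expired. */
unsigned long freeze_show(const struct freeze_state *st, unsigned long now)
{
	if (!st->frozen || now >= st->deadline)
		return 0;
	return st->deadline - now;
}
```

With this reading, a userspace daemon can tell "thawed" (0) from "still
frozen" (non-zero) without a second attribute.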

> +static void freeze_expire(unsigned long data);
> +static struct timer_list freeze_timer =
> +   TIMER_INITIALIZER(freeze_expire, 0, 0);
> 
> There needs to be a timer per device not a global one
> (it works for a current special case of T42 but sooner
>  or later we will hit this problem).

I was considering that, but I am confused as to whether each drive has
its own queue or not? (I really am a newbie to this stuff...). If so
then yes there should be a per-device timer.

> queue handling should be done through block layer helpers
> (as described in Jens' email) - we will need them for libata too.

Good point, I'll try to move as much as I can up to the block layer,
it helps when it comes to implementing freeze for libata as you point
out too.

> At this time attribute can still be in use (because refcounting is done
> on drive->gendev), you need to add "disk" class to ide-disk driver
> (drivers/scsi/st.c looks like a good example how to do it).

I missed that completely, I'll look at changing it.

> IMO this should also be handled by block layer
> which has all needed information, Jens?
> 
> While at it: I think that sysfs support should be moved to block layer (queue
> attributes) and storage driver should only need to provide queue_freeze_fn
> and queue_thaw_fn functions (similarly to cache flush support).  This should
> be done now not later because this stuff is exposed to the user-space.

I was actually considering using a queue attribute originally, but in
my indecision I decided to go with Jens' suggestion. A queue attribute
does make sense in that the attribute primarily is there to freeze the
queue, but it would also be performing the head park. Would a queue
attribute be confusing because of that?

> 
> +   /*
>  * Sanity: don't accept a request that isn't a PM request
>  * if we are currently power managed. This is very important 
> as
>  * blk_stop_queue() doesn't prevent the elv_next_request()
> @@ -1661,6 +1671,9 @@ int ide_do_drive_cmd (ide_drive_t *drive
> where = ELEVATOR_INSERT_FRONT;
> rq->flags |= REQ_PREEMPT;
> }
> +   if (action == ide_next)
> +   where = ELEVATOR_INSERT_FRONT;
> +
> __elv_add_request(drive->queue, rq, where, 0);
> ide_do_request(hwgroup, IDE_NO_IRQ);
> spin_unlock_irqrestore(&ide_lock, flags);
> 
> Why is this needed?

I think Jon discussed that in a previous thread, but basically
although ide_next is documented in the comment for ide_do_drive_cmd,
there isn't (as far as Jon or I could see) anything actually handling
it. This patch is carried over from Jon's work and adds the code to
handle ide_next by inserting the request at the front of the queue.

> Overall, very promising work!

Thanks :-), most of it is Jon's work, and Jens' suggestions though.

Yani

P.S. Sorry about the lack of [] around PATCH...lack of sleep. It's more
of an RFC anyway.

Re: [patch] Real-Time Preemption, -RT-2.6.13-rc4-V0.7.52-01

2005-08-25 Thread Steven Rostedt

On Fri, 2005-08-12 at 14:58 +0200, Ingo Molnar wrote:
> FYI, in -53-05 i've added a bh->b_update_lock, which enabled me to get 
> rid of the bitlock ugliness in fs/buffer.c. Maybe it could be used to 
> have a better fix for the jbd bitlock thing too?

Well, I just spent several hours trying to use the b_update_lock in
implementing something to replace the bit spinlocks for RT.  It's
getting really ugly and I just hit a stone wall.

The problem is that I have two locks to work with. A
jbd_lock_bh_journal_head and a jbd_lock_bh_state. Unfortunately, I also
have a ranking order of:

jbd_lock_bh_state -> j_state_lock -> jbd_lock_bh_journal_head

If the ranking wasn't like this, I could probably make a little more
progress.
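
To make the constraint concrete, here is a tiny userspace sketch (not
kernel code) that encodes the ranking above and rejects an acquisition
that would invert it. The tracker and all of its names are made up for
illustration only.

```c
#include <assert.h>
#include <stdbool.h>

/* Ranks follow the order quoted above; a lower rank must be taken first. */
enum jbd_lock_rank {
	RANK_BH_STATE = 1,		/* jbd_lock_bh_state */
	RANK_J_STATE_LOCK = 2,		/* j_state_lock */
	RANK_BH_JOURNAL_HEAD = 3,	/* jbd_lock_bh_journal_head */
};

/* Hypothetical per-task tracker: highest rank currently held (0 = none). */
struct rank_tracker {
	int highest_held;
};

/* Records the lock and returns true if taking it respects the ranking. */
bool rank_acquire(struct rank_tracker *t, enum jbd_lock_rank rank)
{
	if ((int)rank <= t->highest_held)
		return false;	/* would invert the ranking */
	t->highest_held = rank;
	return true;
}

/* Drops everything; enough for a sketch that never releases out of order. */
void rank_release_all(struct rank_tracker *t)
{
	t->highest_held = 0;
}
```

The awkward part is visible here: whatever replaces jbd_lock_bh_state has
to be usable at rank 1, before anything is known about the journal head.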

The jbd_lock_bh_journal_head is used to protect against creating a
journal_head and adding it to a buffer_head.  This was the obvious
choice to use your b_update_lock as a replacement, since I need to have
a lock before I acquired a journal descriptor.

The jbd_lock_bh_state was going to exist in the journal descriptor that
is stored in the buffer_head private data.  But this led to a problem
when the descriptor is deleted.  The private data is freed while the
lock is held.  So, keeping the lock in the journal descriptor meant it
could be freed before it was unlocked.

I started adding code to delay the freeing of the descriptor until after
the lock was released, but this added another problem.  There might be
another process waiting on this lock, and when it gets it, it tests if
the buffer_head even has a journal descriptor for it. So, even if I
delayed the freeing, another process could be waiting on the lock, so you
still may have a premature free.  Not to mention that this code was
becoming _very_ intrusive, since the freeing takes place deep inside
functions that acquire the lock.

So this lock has the same problem as the jbd_lock_bh_journal_head: you
have a buffer_head and you want to take this lock before you know that
this buffer_head even has a journal descriptor attached to it.

So, the only other solutions that I can think of are:

a) add yet another (bloat) lock to the buffer head.

b) Still use your b_update_lock for the jbd_lock_bh_journal_head and
change the jbd_lock_bh_state to what I discussed earlier, namely the
hash wait_on_bit code.

So do you have any ideas?

-- Steve


Re: process creation time increases linearly with shmem

2005-08-25 Thread Linus Torvalds

On Fri, 26 Aug 2005, Nick Piggin wrote:
> 
> > Skipping MAP_SHARED in fork() sounds like a good idea to me...
> > 
> 
> Indeed. Linus, can you remember why we haven't done this before?

Hmm. Historical reasons. Also, if the child ends up needing it, it will 
now have to fault them in.

That said, I think it's a valid optimization. Especially as the child 
_probably_ doesn't need it (ie there's at least some likelihood of an 
execve() or similar).
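
As a userspace illustration of why this is safe: a MAP_SHARED mapping is
backed by the same pages in parent and child, so nothing is lost if the
child has to fault them in later. The sketch below only demonstrates that
sharing; it says nothing about how fork() copies page tables, and the
function name is made up.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the value the parent reads back after the child wrote, or -1. */
int shared_value_after_fork(void)
{
	int *shared = mmap(NULL, sizeof(*shared), PROT_READ | PROT_WRITE,
			   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	pid_t pid;
	int status, value;

	if (shared == MAP_FAILED)
		return -1;
	*shared = 0;

	pid = fork();
	if (pid < 0) {
		munmap(shared, sizeof(*shared));
		return -1;
	}
	if (pid == 0) {
		*shared = 42;	/* child touches the shared page */
		_exit(0);
	}
	if (waitpid(pid, &status, 0) < 0) {
		munmap(shared, sizeof(*shared));
		return -1;
	}
	value = *shared;	/* parent sees the child's store */
	munmap(shared, sizeof(*shared));
	return value;
}
```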

Linus

[PATCH 2.6.13-rc7 2/2] New Syscall: set rlimits of any process (reworked)

2005-08-25 Thread Wieland Gmeiner

This is the second of two patches, it implements the setprlimit()
syscall.

Implementation: This patch provides a new syscall setprlimit() for
writing a given process resource limits for i386. Its implementation
follows closely the setrlimit syscall. It is given a pid as an
additional argument. If the given pid equals zero the current process
rlimits are written and the behaviour resembles the behaviour of
setrlimit. Otherwise some checking on the validity of the given pid is
done and if the given process is found access is granted if
- the calling process holds the CAP_SYS_PTRACE capability or
- the calling process uid equals the uid, euid, suid of the target
  process and the calling process gid equals the gid, egid, sgid of
  the target process.
(This resembles the behaviour of the ptrace system call.)
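
The access rule above can be sketched as a plain predicate. This is a
userspace restatement of the description, not the kernel helper from the
patch; the struct and function names are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical snapshot of the target's ids; not a kernel structure. */
struct target_ids {
	uid_t uid, euid, suid;
	gid_t gid, egid, sgid;
};

/*
 * Access is granted if the caller holds CAP_SYS_PTRACE, or if its uid
 * matches all three target uids and its gid matches all three target gids.
 */
bool prlimit_access_ok(uid_t caller_uid, gid_t caller_gid,
		       bool has_cap_sys_ptrace, const struct target_ids *t)
{
	if (has_cap_sys_ptrace)
		return true;
	return caller_uid == t->uid && caller_uid == t->euid &&
	       caller_uid == t->suid && caller_gid == t->gid &&
	       caller_gid == t->egid && caller_gid == t->sgid;
}
```

Note that a target with a differing euid (a setuid program, say) is
refused unless the caller has the capability.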

Simple programs for testing the syscalls can be found on
http://stud4.tuwien.ac.at/~e8607062/studies/soc/patches/


Signed-off-by: Wieland Gmeiner <[EMAIL PROTECTED]>



---

 arch/i386/kernel/syscall_table.S |    1 
 include/asm-i386/unistd.h        |    3 -
 kernel/sys.c                     |  114 ---
 security/selinux/hooks.c         |   14 +++-
 4 files changed, 85 insertions(+), 47 deletions(-)

diff -puN arch/i386/kernel/syscall_table.S~setprlimit arch/i386/kernel/syscall_table.S
--- linux-2.6.13-rc7/arch/i386/kernel/syscall_table.S~setprlimit	2005-08-26 05:09:13.0 +0200
+++ linux-2.6.13-rc7-wieland/arch/i386/kernel/syscall_table.S	2005-08-26 05:09:34.0 +0200
@@ -295,3 +295,4 @@ ENTRY(sys_call_table)
 	.long sys_inotify_add_watch
 	.long sys_inotify_rm_watch
 	.long sys_getprlimit
+	.long sys_setprlimit	/* 295 */
diff -puN include/asm-i386/unistd.h~setprlimit include/asm-i386/unistd.h
--- linux-2.6.13-rc7/include/asm-i386/unistd.h~setprlimit	2005-08-26 05:09:13.0 +0200
+++ linux-2.6.13-rc7-wieland/include/asm-i386/unistd.h	2005-08-26 05:09:34.0 +0200
@@ -300,8 +300,9 @@
 #define __NR_inotify_add_watch	292
 #define __NR_inotify_rm_watch	293
 #define __NR_getprlimit		294
+#define __NR_setprlimit		295
 
-#define NR_syscalls 295
+#define NR_syscalls 296
 
 /*
  * user-visible error numbers are in the range -1 - -128: see
diff -puN kernel/sys.c~setprlimit kernel/sys.c
--- linux-2.6.13-rc7/kernel/sys.c~setprlimit	2005-08-26 05:09:13.0 +0200
+++ linux-2.6.13-rc7-wieland/kernel/sys.c	2005-08-26 05:09:34.0 +0200
@@ -1600,6 +1600,78 @@ asmlinkage long sys_getrlimit(unsigned i
 	return rlim_do_getprlimit(0, resource, rlim);
 }
 
+static inline long rlim_do_setprlimit(pid_t pid, unsigned int resource,
+				      struct rlimit __user *rlim)
+{
+	struct rlimit new_rlim, *old_rlim;
+	int retval;
+	task_t *p;
+
+	if (resource >= RLIM_NLIMITS)
+		return -EINVAL;
+	if (pid < 0)
+		return -EINVAL;
+	if(copy_from_user(&new_rlim, rlim, sizeof(*rlim)))
+		return -EFAULT;
+	if (new_rlim.rlim_cur > new_rlim.rlim_max)
+		return -EINVAL;
+
+	retval = -ESRCH;
+	read_lock(&tasklist_lock);
+	if (pid == 0) {
+		p = current;
+	} else {
+		p = find_task_by_pid(pid);
+	}
+	if (p) {
+		retval = -EPERM;
+		if (pid && !prlim_check_perm(p))
+			goto out;
+
+		old_rlim = p->signal->rlim + resource;
+		if ((new_rlim.rlim_max > old_rlim->rlim_max) &&
+		    !capable(CAP_SYS_RESOURCE))
+			goto out;
+		if (resource == RLIMIT_NOFILE && new_rlim.rlim_max > NR_OPEN)
+			goto out;
+
+		retval = security_task_rlimit(p, resource, &new_rlim);
+		if (retval)
+			goto out;
+
+		task_lock(p->group_leader);
+		*old_rlim = new_rlim;
+		task_unlock(p->group_leader);
+
+		if (resource == RLIMIT_CPU &&
+		    new_rlim.rlim_cur != RLIM_INFINITY &&
+		    (cputime_eq(p->signal->it_prof_expires, cputime_zero) ||
+		     new_rlim.rlim_cur <= cputime_to_secs(
+			     p->signal->it_prof_expires))) {
+			cputime_t cputime = secs_to_cputime(new_rlim.rlim_cur);
+			spin_lock_irq(&p->sighand->siglock);
+			set_process_cpu_timer(p, CPUCLOCK_PROF,
+					      &cputime, NULL);
+			spin_unlock_irq(&p->sighand->siglock);
+		}
+	}
+
+out:
+	read_unlock(&tasklist_lock);
+	return retval;
+}
+
+asmlinkage long sys_setprlimit(pid_t pid, unsigned int resource,
+			       struct rlimit __user *rlim)
+{
+	return rlim_do_setprlimit(pid, resource, rlim);
+}
+
+asmlinkage long

[PATCH 2.6.13-rc7 1/2] New Syscall: get rlimits of any process (reworked)

2005-08-25 Thread Wieland Gmeiner

Hi all!

First I would like to thank everyone who commented on my code.

I understand that this won't go into mainline but nevertheless I would
like to work on it further as it is a great learning experience to me.

I incorporated the changes suggested to me by this list (at least I hope
so), any comments highly appreciated.

Thanks,
Wieland


Rationale: Currently resource usage limits (rlimits) can only be 
set inside a process space, or inherited from the parent process.
It would be useful to allow adjusting resource limits for running 
processes, e.g. tuning the resource usage of daemon processes under 
changing workloads without restarting them.

Implementation: This patch provides a new syscall getprlimit() for
reading a given process resource limits for i386. Its implementation
follows closely the getrlimit syscall. It is given a pid as an
additional argument. If the given pid equals zero the current process
rlimits are read and the behaviour resembles the behaviour of
getrlimit. Otherwise some checking on the validity of the given pid is
done and if the given process is found access is granted if
- the calling process holds the CAP_SYS_PTRACE capability or
- the calling process uid equals the uid, euid, suid of the target
  process and the calling process gid equals the gid, egid, sgid of
  the target process.
(This resembles the behaviour of the ptrace system call.)
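
For reference, the pid == 0 case is meant to behave like a plain
getrlimit() call on the current process. A minimal userspace sketch of
that baseline, using only the existing getrlimit() interface (the wrapper
name is made up):

```c
#include <assert.h>
#include <sys/resource.h>

/* Reads the caller's own RLIMIT_NOFILE; returns 0 on success, -1 on error. */
int read_own_nofile_limit(struct rlimit *out)
{
	return getrlimit(RLIMIT_NOFILE, out);
}
```

A getprlimit(0, RLIMIT_NOFILE, &rl) call is intended to return the same
soft and hard values as this wrapper.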


See the followup for the writing syscall.

Simple programs for testing the syscalls can be found on
http://stud4.tuwien.ac.at/~e8607062/studies/soc/patches/

Signed-off-by: Wieland Gmeiner <[EMAIL PROTECTED]>




---

 arch/i386/kernel/syscall_table.S |    1 
 include/asm-i386/unistd.h        |    3 -
 include/linux/security.h         |   25 +++-
 kernel/sys.c                     |   81 ++-
 security/dummy.c                 |    5 +-
 security/selinux/hooks.c         |   17 +---
 6 files changed, 105 insertions(+), 27 deletions(-)

diff -puN arch/i386/kernel/syscall_table.S~getprlimit arch/i386/kernel/syscall_table.S
--- linux-2.6.13-rc7/arch/i386/kernel/syscall_table.S~getprlimit	2005-08-26 05:01:17.0 +0200
+++ linux-2.6.13-rc7-wieland/arch/i386/kernel/syscall_table.S	2005-08-26 05:01:46.0 +0200
@@ -294,3 +294,4 @@ ENTRY(sys_call_table)
 	.long sys_inotify_init
 	.long sys_inotify_add_watch
 	.long sys_inotify_rm_watch
+	.long sys_getprlimit
diff -puN include/asm-i386/unistd.h~getprlimit include/asm-i386/unistd.h
--- linux-2.6.13-rc7/include/asm-i386/unistd.h~getprlimit	2005-08-26 05:01:17.0 +0200
+++ linux-2.6.13-rc7-wieland/include/asm-i386/unistd.h	2005-08-26 05:01:46.0 +0200
@@ -299,8 +299,9 @@
 #define __NR_inotify_init	291
 #define __NR_inotify_add_watch	292
 #define __NR_inotify_rm_watch	293
+#define __NR_getprlimit		294
 
-#define NR_syscalls 294
+#define NR_syscalls 295
 
 /*
  * user-visible error numbers are in the range -1 - -128: see
diff -puN include/linux/security.h~getprlimit include/linux/security.h
--- linux-2.6.13-rc7/include/linux/security.h~getprlimit	2005-08-26 05:01:17.0 +0200
+++ linux-2.6.13-rc7-wieland/include/linux/security.h	2005-08-26 05:01:46.0 +0200
@@ -584,10 +584,12 @@ struct swap_info_struct;
  * @p contains the task_struct of process.
  * @nice contains the new nice value.
  * Return 0 if permission is granted.
- * @task_setrlimit:
- * Check permission before setting the resource limits of the current
- * process for @resource to @new_rlim.  The old resource limit values can
- * be examined by dereferencing (current->signal->rlim + resource).
+ * @task_rlimit:
+ * Check permission before reading the resource limits of the process @p
+ * for @resource or setting the limits to @new_rlim.  The old resource
+ * limit values can be examined by dereferencing
+ * (p->signal->rlim + resource).
+ * @p contains the task_struct for the process.
  * @resource contains the resource whose limit is being set.
  * @new_rlim contains the new limits for @resource.
  * Return 0 if permission is granted.
@@ -1156,7 +1158,8 @@ struct security_operations {
 	int (*task_getsid) (struct task_struct * p);
 	int (*task_setgroups) (struct group_info *group_info);
 	int (*task_setnice) (struct task_struct * p, int nice);
-	int (*task_setrlimit) (unsigned int resource, struct rlimit * new_rlim);
+	int (*task_rlimit) (struct task_struct * p, unsigned int resource,
+			    struct rlimit * new_rlim);
 	int (*task_setscheduler) (struct task_struct * p, int policy,
 				  struct sched_param * lp);
 	int (*task_getscheduler) (struct task_struct * p);
@@ -1798,10 +1801,11 @@ static inline int security_task_setnice 
 	return security_ops->task_setnice (p, nice);
 }
 
-static inline int security_task_setrlimit

Re: 2.6.12 Performance problems

2005-08-25 Thread Danial Thom

--- Ben Greear <[EMAIL PROTECTED]> wrote:

> Danial Thom wrote:
> >
> > --- Ben Greear <[EMAIL PROTECTED]> wrote:
> >
> >>Danial Thom wrote:
> >>
> >>>I think the consensus is that 2.6 has made trade-offs that lower
> >>>raw throughput, which is what a networking device needs. So as a
> >>>router or network appliance, 2.6 seems less suitable. A raw
> >>>bridging test on a 2.0GHz Opteron system:
> >>>
> >>>FreeBSD 4.9: Drops no packets at 900K pps
> >>>Linux 2.4.24: Starts dropping packets at 350K pps
> >>>Linux 2.6.12: Starts dropping packets at 100K pps
> >>
> >>I ran some quick tests using kernel 2.6.11, 1ms tick (HZ=1000),
> >>SMP kernel. Hardware is P-IV 3.0GHz + HT on a new SuperMicro
> >>motherboard with 64/133MHz PCI-X bus.  NIC is dual Intel pro/1000.
> >>Kernel is close to stock 2.6.11.
>
> > What GigE adapters did you use? Clearly every driver is going to be
> > different. My experience is that a 3.4GHz P4 is about the
> > performance of a 2.0GHz Opteron. I have to try your tuning script
> > tomorrow.
>
> Intel pro/1000, as I mentioned.  I haven't tried any other NIC that
> comes close in performance to the e1000.
>
> > If your test is still set up, try compiling something large while
> > doing the test. The drops go through the roof in my tests.
>
> Installing RH9 on the box now to try some tests...
>
> Disk access always robs networking, in my experience, so I am not
> surprised you see bad network performance while compiling.
>
> Ben

It would be useful if there were some way to find
out "what" is getting "robbed". If networking has
priority, then what is keeping it from getting
back to processing the rx interrupts? 

Ah, the e1000 has built-in interrupt moderation.
I can't get into my lab until tomorrow afternoon,
but if you get a chance try setting ITR in
e1000_main.c to something larger, like 20K, and
see if it makes a difference. At 200K pps that
would cause an interrupt every 10 packets, which
may allow the routine to grab back the cpu more
often.
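
The arithmetic behind that estimate, as a small sketch. It assumes the
ITR value is a cap in interrupts per second; the function name is made up
for illustration.

```c
#include <assert.h>

/* Average number of packets handled per interrupt at a given cap. */
unsigned int packets_per_interrupt(unsigned int packets_per_sec,
				   unsigned int interrupts_per_sec)
{
	if (interrupts_per_sec == 0)
		return 0;	/* avoid dividing by zero */
	return packets_per_sec / interrupts_per_sec;
}
```

packets_per_interrupt(200000, 20000) gives 10, which is the "every 10
packets" figure above.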

Danial


[PATCH] [CIFS] Fix for oops in fs/locks.c in 2.6.13-rc running connectathon byte range lock test over cifs

2005-08-25 Thread Steve French

The recent change to locks_remove_flock code in fs/locks.c changes how 
byte range locks are removed from closing files, which shows up a bug in 
cifs.   The assumption in the cifs code was that the close call sent to 
the server would remove any pending locks on the server on this file, 
but that is no longer safe as the fs/locks.c code on the client wants 
unlock of 0 to PATH_MAX to remove all locks (at least from this client, 
it is not possible AFAIK to remove all locks from other clients made to 
the server copy of the file).   Note that cifs locks are different from 
posix locks - and it is not possible to map posix locks perfectly on the 
wire yet, due to restrictions of the cifs network protocol, even to 
Samba without adding a new request type to the network protocol (which 
we plan to do for Samba 3.0.21 within a few months), but the local 
client will have the correct POSIX view of the lock in most cases.

The correct fix for cifs for this would involve a bigger change than I 
would like to do this late in the 2.6.13-rc cycle - and would involve 
cifs keeping track of all unmerged (uncoalesced) byte range locks for 
each remote inode and scanning that list to remove locks that intersect 
or fall wholly within the range - locks that intersect may have to be 
reacquired with the smaller, remaining range.
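
A rough sketch of the range arithmetic that such tracking would need,
written as standalone C rather than cifs code. The struct and function
are hypothetical; ranges are inclusive byte offsets.

```c
#include <assert.h>

/* Hypothetical inclusive byte range [start, end]. */
struct byte_range {
	unsigned long long start;
	unsigned long long end;
};

/*
 * Remove 'unlock' from 'held'. Writes the pieces of 'held' that remain
 * locked into out[] and returns how many there are (0, 1 or 2).
 */
int range_subtract(struct byte_range held, struct byte_range unlock,
		   struct byte_range out[2])
{
	int n = 0;

	/* No intersection: the held lock is untouched. */
	if (unlock.end < held.start || unlock.start > held.end) {
		out[0] = held;
		return 1;
	}
	/* Part of the held range before the unlocked region survives. */
	if (unlock.start > held.start) {
		out[n].start = held.start;
		out[n].end = unlock.start - 1;
		n++;
	}
	/* Part of the held range after the unlocked region survives. */
	if (unlock.end < held.end) {
		out[n].start = unlock.end + 1;
		out[n].end = held.end;
		n++;
	}
	return n;
}
```

A return of 0 means the lock falls wholly within the unlocked range and
can simply be dropped; 1 or 2 pieces would have to be reacquired.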


The immediate need though is for the following fix to get into 2.6.13 to 
at least avoid the oops in the vfs.

[CIFS] Fix oops in fs/locks.c on close of file with pending locks

Signed-off-by: Steve French <[EMAIL PROTECTED]>

diff -Naur old/fs/cifs/file.c new/fs/cifs/file.c
--- old/fs/cifs/file.c   2005-08-25 21:53:47.0 -0500
+++ new/fs/cifs/file.c   2005-08-25 21:54:56.0 -0500
@@ -643,7 +643,7 @@
 				netfid, length,
 				pfLock->fl_start, numUnlock, numLock, lockType,
 				wait_flag);
-	if (rc == 0 && (pfLock->fl_flags & FL_POSIX))
+	if (pfLock->fl_flags & FL_POSIX)
 		posix_lock_file_wait(file, pfLock);
 	FreeXid(xid);
 	return rc;


The original problem report follows.  Thanks to Shaggy for the initial 
analysis.


Dave Kleikamp wrote:


Running the connectathon lock tests, I hit this BUG:

[ 3094.124950] [ cut here ]
[ 3094.124959] kernel BUG at fs/locks.c:1920!
[ 3094.124962] invalid operand:  [#1]
[ 3094.124964] PREEMPT 
[ 3094.124966] Modules linked in: cifs ipt_TCPMSS iptable_filter ip_tables blowfish sha256 dummy radeon irda crc_ccitt airo e1000 pcmcia yenta_socket rsrc_nonstatic pcmcia_core ntfs jfs

[ 3094.124981] CPU:0
[ 3094.124982] EIP:0060:[]Not tainted VLI
[ 3094.124984] EFLAGS: 00010246   (2.6.13-rc7) 
[ 3094.124993] EIP is at locks_remove_flock+0x7e/0x140

[ 3094.124997] eax: dc925b74   ebx: c66159f4   ecx: 0001   edx: 0001
[ 3094.125001] esi: c6615a8c   edi: c66159f4   ebp: c50ffec0   esp: d0c27e78
[ 3094.125004] ds: 007b   es: 007b   ss: 0068
[ 3094.125008] Process tlocklfs (pid: 12264, threadinfo=d0c27000 task=c6b03570)
[ 3094.125010] Stack: cb210ec0 d0c27e9c  10c27000 0001    
[ 3094.125017]8000 0023 cb210ec0 d0c27f1c  d69cc3c0 e1d96b1a e1d911ca 
[ 3094.125025]00fe08bf d69cc3c0 1f2f  8000   0001 
[ 3094.125032] Call Trace:

[ 3094.125038]  [] _FreeXid+0x1a/0x30 [cifs]
[ 3094.125058]  [] cifs_lock+0x17a/0x530 [cifs]
[ 3094.125074]  [] locks_remove_posix+0x131/0x140
[ 3094.125080]  [] inotify_dentry_parent_queue_event+0xa0/0xd0
[ 3094.125089]  [] __fput+0xa7/0x200
[ 3094.125098]  [] filp_close+0x4d/0x80
[ 3094.125103]  [] sys_close+0x6b/0xa0
[ 3094.125108]  [] syscall_call+0x7/0xb
[ 3094.125115] Code: 74 1b 89 c6 8b 06 85 c0 75 f3 e8 3e 39 33 00 81 c4 cc 00 00 00 5b 5e 5f 5d c3 8d 76 00 0f b6 50 28 f6 c2 02 75 22 f6 c2 20 75 0a <0f> 0b 80 07 2c f7 4d c0 eb cd 89 34 24 bf 02 00 00 00 89 7c 24 
[ 3094.125147]  


I believe it is caused by this patch (stale POSIX lock handling):
http://www.kernel.org/git/gitweb.cgi?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=c293621bbf678a3d85e3ed721c3921c8a670610d

The bit responsible is:

@@ -1888,12 +1908,7 @@ void locks_remove_flock(struct file *fil

   while ((fl = *before) != NULL) {
   if (fl->fl_file == filp) {
-   /*
-    * We might have a POSIX lock that was created at the same time
-    * the filp was closed for the last time. Just remove that too,
-    * regardless of ownership, since nobody can own it.
-    */
-   if (IS_FLOCK(fl) || IS_POSIX(fl)) {
+   if (IS_FLOCK(fl)) {
   locks_delete_lock(before);
   continue;
   }


Leaving this:

   if (fl->fl_file == filp) {
   if (IS_FLOCK(fl)) {

[SOLVED] Re: Re: Problem with kernel image in a Prep Boot on PowerPC

2005-08-25 Thread Márcio Oliveira




John W. Linville wrote:

> On Wed, Aug 24, 2005 at 02:52:44PM -0300, Márcio Oliveira wrote:
>
>> The command rdev can change the default root partition on x86 linux
>> systems with pre-built kernels.
>
> Of course...I meant I don't know of anything like that for PPC.
>
>> About the CONFIG_CMDLINE in the kernel configuration, I found it in lots
>> of files in the kernel source tree and I'd like to know which file I
>> need to change this value (/usr/src/linux/arch/ppc64/defconfig ?).
>
> Probably just in your .config file:
>
>	cp arch/ppc64/defconfig .config
>	vi .config # Change CONFIG_CMDLINE here
>	make oldconfig
>
>> According to this doc:
>> http://www-128.ibm.com/developerworks/eserver/library/es-SW_RAID_LINUX.html,
>> ppc64 can use zImage-style boot wrapper, so I'm trying it.
>
> Cool...I think you will like having that as an option.
>
> John

John,

I made the changes in the kernel that you recommended and the server
boots OK!

Thanks a lot.

Márcio Oliveira.


Memory-Mapping with LFS

2005-08-25 Thread Andreas Baer

Who is the memory mapping expert? :)

What are the current file size limits for memory mapping via glibc's
mmap() function on linux:

- for a native 32-Bit System not using LFS?
- for a native 32-Bit System using LFS?
- for a native 64-Bit System?

(linux-kernel >2.6, of course)

It would be nice if someone could tell me what I have to consider if I
want to use memory mapping for files. I'm currently a little bit
confused about it (information overflow :)). Personal opinions about
speed (maybe increase or decrease for large files) are also welcome.

--

The glibc documentation says:
"Since mmapped pages can be stored back to their file when physical
memory is low, it is possible to mmap files orders of magnitude larger
than both the physical memory and swap space. The only limit is address
space. The theoretical limit is 4GB on a 32-bit machine - however, the
actual limit will be smaller since some areas will be reserved for other
purposes. If the LFS interface is used the file size on 32-bit systems
is not limited to 2GB (offsets are signed which reduces the addressable
area of 4GB by half); the full 64-bit are available."

- I doubt that the full 64 bits (something in the exabyte range) are
available in practical use. Right or wrong?


I've also found an old kernel-list e-mail from 2004 that says:
"There is a limit per process in the kernel vm that prevent from
mmapping more than 512GB of data."

- Is this still true for the current kernel?


An example:

Let's presume the following case. I have an 8 GB file, 1 GB physical
memory and I want to use memory mapping for that file using LFS on a
32-Bit machine.

- Is it possible?

If yes, let's presume that I have 2 or more pointers, that are
frequently pointing to completely different places and switch the data
they are pointing to.

- How is it managed (by the kernel)? Through the pages, that are
mentioned in the glibc documentation above? Are these page operations
really faster than normal random file access (lseek etc)?
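
To make the 8 GB example concrete: since address space is the limit, a
32-bit process cannot map the whole file at once, but it can map one
window at a time at a 64-bit offset. A rough sketch of what I have in
mind, assuming LFS is enabled via _FILE_OFFSET_BITS; the helper names are
made up, and mmap() requires the offset to be a multiple of the page size.

```c
#define _FILE_OFFSET_BITS 64	/* LFS: 64-bit off_t on 32-bit systems */
#include <assert.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical handle for one mapped window of a file. */
struct file_window {
	void *base;	/* page-aligned start of the mapping */
	size_t map_len;	/* length passed to mmap()/munmap() */
	char *data;	/* pointer to the requested offset */
};

/* Round a file offset down to the start of its page. */
off_t window_base(off_t offset, long page_size)
{
	return offset - (offset % page_size);
}

/* Map 'len' bytes starting at 'offset' read-only; 0 on success, -1 on error. */
int map_window(int fd, off_t offset, size_t len, struct file_window *w)
{
	long page_size = sysconf(_SC_PAGESIZE);
	off_t aligned = window_base(offset, page_size);
	size_t delta = (size_t)(offset - aligned);
	void *p = mmap(NULL, len + delta, PROT_READ, MAP_PRIVATE, fd, aligned);

	if (p == MAP_FAILED)
		return -1;
	w->base = p;
	w->map_len = len + delta;
	w->data = (char *)p + delta;
	return 0;
}

/* Release a window obtained from map_window(). */
void unmap_window(struct file_window *w)
{
	munmap(w->base, w->map_len);
}
```

Two pointers into distant parts of the file would then be two such
windows, each unmapped and remapped as it moves.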


Re: [RFC] RT-patch update to remove the global pi_lock

2005-08-25 Thread Steven Rostedt

On Thu, 2005-08-25 at 16:09 -0400, Steven Rostedt wrote:

> A word of caution (aka. disclaimer). This is still new.  I still expect
> there are some cases in the code that were missed and can cause a dead
> lock or other bad side effect.  Hopefully, we can iron these all out.
> Also, I noticed that since the task takes its own pi_lock for most of
> the code, if something locks up and a NMI goes off, the down_trylock in
> printk will also lock when it tries to take its own pi_lock.

OK, found my first bug :-)

Just so everyone knows.  In rt.c, all pi_waiter access (reading or
writing) must be protected by the task's pi_lock, and all access to the
lock's wait_list must be protected by the lock's wait_lock.  The magic
is in the locking order :-).

-- Steve

Signed-off-by: Steven Rostedt <[EMAIL PROTECTED]>

Index: linux_realtime_goliath/kernel/rt.c
===
--- linux_realtime_goliath/kernel/rt.c  (revision 306)
+++ linux_realtime_goliath/kernel/rt.c  (working copy)
@@ -671,6 +671,7 @@
struct rt_mutex_waiter *w;
struct plist *curr1;
 
+   __raw_spin_lock(old_owner->task->pi_lock);
TRACE_WARN_ON_LOCKED(plist_empty(>pi_list));
TRACE_WARN_ON_LOCKED(lock_owner(lock));
 
@@ -681,6 +682,7 @@
}
TRACE_WARN_ON_LOCKED(1);
 ok:
+   __raw_spin_unlock(old_owner->task->pi_lock);
return;
 }
 
@@ -734,6 +736,8 @@
if (old_owner == new_owner)
return;
 
+   TRACE_BUG_ON_LOCKED(!spin_is_locked(_owner->task->pi_lock));
+   TRACE_BUG_ON_LOCKED(!spin_is_locked(_owner->task->pi_lock));
plist_for_each_safe(curr1, next1, _owner->task->pi_waiters) {
w = plist_entry(curr1, struct rt_mutex_waiter, pi_list);
if (w->lock == lock) {
@@ -932,6 +936,8 @@
/*
 * Add SCHED_NORMAL tasks to the end of the waitqueue (FIFO):
 */
+   TRACE_BUG_ON_LOCKED(!spin_is_locked(>pi_lock));
+   TRACE_BUG_ON_LOCKED(!spin_is_locked(>wait_lock));
 #ifndef ALL_TASKS_PI
if (!rt_task(task)) {
plist_add(>list, >wait_list);
@@ -939,6 +945,7 @@
return;
}
 #endif
+   __raw_spin_lock(_owner(lock)->task->pi_lock);
plist_add(>pi_list, _owner(lock)->task->pi_waiters);
/*
 * Add RT tasks to the head:
@@ -949,11 +956,9 @@
 * If the waiter has higher priority than the owner
 * then temporarily boost the owner:
 */
-   if (task->prio < lock_owner(lock)->task->prio) {
-   __raw_spin_lock(_owner(lock)->task->pi_lock);
+   if (task->prio < lock_owner(lock)->task->prio)
pi_setprio(lock, lock_owner(lock)->task, task->prio);
-   __raw_spin_unlock(_owner(lock)->task->pi_lock);
-   }
+   __raw_spin_unlock(_owner(lock)->task->pi_lock);
 }
 
 /*



Re: Redundant up operation in stop_machine.c ?(2.6.12)

2005-08-25 Thread Rusty Russell

On Thu, 2005-08-25 at 21:59 +0800, Yingchao Zhou wrote:
> In stop_machine function, there are codes:
>   if (ret < 0) {
>   stopmachine_set_state(STOPMACHINE_EXIT);
>   up(_mutex);
>   return ret;
>   }
> And in __stop_machine_run ,there are:
>   if (!IS_ERR(p)) {
>   kthread_bind(p, cpu);
>   wake_up_process(p);
>   wait_for_completion();
>   }
>   up(_mutex);
> 
> Is the first up op really redundant?

Yes, it seems you have found a bug.  I tested it (inserting a spurious
failure), and indeed, it gets up'ed twice.

Good catch!
Rusty.

Name: Redundant up operation in stop_machine.c
Signed-off-by: Rusty Russell <[EMAIL PROTECTED]> (authored)

Yingchao Zhou <[EMAIL PROTECTED]> noticed that we up() in
stop_machine on failure, and also in the caller (unconditionally).

Index: linux-2.6.13-rc7-git1-Misc/kernel/stop_machine.c
===
--- linux-2.6.13-rc7-git1-Misc.orig/kernel/stop_machine.c   2005-08-26 11:18:00.0 +1000
+++ linux-2.6.13-rc7-git1-Misc/kernel/stop_machine.c   2005-08-26 12:05:01.0 +1000
@@ -115,7 +115,6 @@
/* If some failed, kill them all. */
if (ret < 0) {
stopmachine_set_state(STOPMACHINE_EXIT);
-   up(_mutex);
return ret;
}
 

-- 
A bad analogy is like a leaky screwdriver -- Richard Braakman
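
To see why the extra up() matters, here is a toy user-space model of the pattern described above: a callee that releases the mutex on failure, and a caller that releases it unconditionally. This is only a sketch; `toy_sem`, `callee`, `caller` and `count_after_failure` are hypothetical names, and the toy semaphore only tracks its count without ever blocking.

```c
/* Toy counting semaphore: just enough to track the count, no blocking. */
struct toy_sem { int count; };

static void toy_down(struct toy_sem *s) { s->count--; }
static void toy_up(struct toy_sem *s)   { s->count++; }

/* Callee: the buggy variant releases the mutex itself on failure. */
static int callee(struct toy_sem *m, int fail, int buggy)
{
    if (fail) {
        if (buggy)
            toy_up(m);          /* first up(): the redundant one */
        return -1;
    }
    return 0;
}

/* Caller: takes the mutex, runs the callee, always releases the mutex. */
static int caller(struct toy_sem *m, int fail, int buggy)
{
    int ret;

    toy_down(m);
    ret = callee(m, fail, buggy);
    toy_up(m);                  /* unconditional up() */
    return ret;
}

/* Count left behind after one failing run, starting from 1 (unlocked). */
static int count_after_failure(int buggy)
{
    struct toy_sem m = { 1 };

    caller(&m, 1, buggy);
    return m.count;
}
```

With the redundant up() the count ends at 2 instead of 1, so two later callers could both get through down() at the same time, and the semaphore no longer behaves as a mutex.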


Re: process creation time increases linearly with shmem

2005-08-25 Thread Rik van Riel

On Fri, 26 Aug 2005, Nick Piggin wrote:

> > Skipping MAP_SHARED in fork() sounds like a good idea to me...
> 
> Indeed. Linus, can you remember why we haven't done this before?

Where "this" looks something like the patch below, shamelessly
merging Nick's and Andy's patches and adding the initialization
of retval.

I suspect this may be a measurable win on database servers with
a web frontend, where the connections to the database server are
set up basically for each individual query, and don't stick around
for a long time.

No, I haven't actually tested this patch - but feel free to go
wild while I sign off for the night.

Signed-off-by: Rik van Riel <[EMAIL PROTECTED]>

--- linux-2.6.12/kernel/fork.c.mapshared  2005-08-25 18:40:44.0 -0400
+++ linux-2.6.12/kernel/fork.c  2005-08-25 18:47:16.0 -0400
@@ -184,7 +184,7 @@
 {
struct vm_area_struct * mpnt, *tmp, **pprev;
struct rb_node **rb_link, *rb_parent;
-   int retval;
+   int retval = 0;
unsigned long charge;
struct mempolicy *pol;

@@ -265,7 +265,10 @@
rb_parent = >vm_rb;

mm->map_count++;
-   retval = copy_page_range(mm, current->mm, tmp);
+   /* Skip pte copying if page faults can take care of things. */
+   if (!file || !(tmp->vm_flags & VM_SHARED) ||
+   is_vm_hugetlb_page(vma))
+   retval = copy_page_range(mm, current->mm, tmp);
spin_unlock(>page_table_lock);

if (tmp->vm_ops && tmp->vm_ops->open)
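
The condition in the patch can be read as a standalone predicate. The sketch below restates it outside the kernel so the cases are easy to check; `struct vma_info` and `must_copy_ptes` are hypothetical names, and the `VM_SHARED` value is illustrative only. The reasoning assumed here is that anonymous and private (copy-on-write) pages may exist only in the parent's page tables, while shared file pages can be found again through the page cache on a fault.

```c
#include <stdbool.h>

#define VM_SHARED 0x00000008UL  /* illustrative value; the kernel has its own definition */

/* Hypothetical, simplified stand-in for the fields the patch inspects. */
struct vma_info {
    bool          has_file;    /* mapping is backed by a file */
    unsigned long vm_flags;    /* mapping flags, only VM_SHARED is examined */
    bool          is_hugetlb;  /* hugetlb mappings are always copied */
};

/* True when fork() has to copy the page tables eagerly; false when the
 * child can rebuild them lazily through page faults. */
static bool must_copy_ptes(const struct vma_info *v)
{
    return !v->has_file || !(v->vm_flags & VM_SHARED) || v->is_hugetlb;
}
```

Only the "file-backed, shared, not hugetlb" case skips the copy, which is the shared memory segment case that started this thread.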

Initramfs and TMPFS!

2005-08-25 Thread dwilson24



This is in reference to Chris Wedgwood's patch.

Wouldn't it be better to put overmount_rootfs in initramfs.c
and call it only if there's an initramfs?

printk(KERN_INFO "checking if image is initramfs...");
err = unpack_to_rootfs((char *)initrd_start,
initrd_end - initrd_start, 1);
if (!err) {
printk(" it is\n");
#ifdef CONFIG_EARLYUSERSPACE_ON_TMPFS
overmount_rootfs();
#endif /* CONFIG_EARLYUSERSPACE_ON_TMPFS */
unpack_to_rootfs((char *)initrd_start,
initrd_end - initrd_start, 0);
free_initrd_mem(initrd_start, initrd_end);
return;

Re: [PATCH 2.6.13-rc7 2/2] completely disable cpu_exclusive sched domain

2005-08-25 Thread Nick Piggin


Paul Jackson wrote:

> At the suggestion of Nick Piggin and Dinakar, totally disable
> the facility to allow cpu_exclusive cpusets to define dynamic
> sched domains in Linux 2.6.13, in order to avoid problems
> first reported by John Hawkes (corrupt sched data structures
> and kernel oops).
>
> This has been built for ppc64, i386, ia64, x86_64, sparc, alpha.
> It has been built, booted and tested for cpuset functionality
> on an SN2 (ia64).
>
> Dinakar or Nick - could you verify that it for sure does avoid
> the problems Hawkes reported.  Hawkes is out of town, and I don't
> have the recipe to reproduce what he found.



Thanks Paul, I was never able to reproduce the problem, but
I'm sure Dinakar should be able to test.

Acked-by: Nick Piggin <[EMAIL PROTECTED]>

--
SUSE Labs, Novell Inc.




Re: cache regresions with 2.6.1x ?

2005-08-25 Thread Andrew Morton

jerome lacoste <[EMAIL PROTECTED]> wrote:
>
> On 8/23/05, Andrew Morton <[EMAIL PROTECTED]> wrote:
>  > jerome lacoste <[EMAIL PROTECTED]> wrote:
>  > >
>  > > I am on a Dell Inspiron 8100 laptop with 512 M and 1G disk cache. I
>  > >  usually have at least 4 big applications running simultaneously: a
>  > >  Java IDE, firefox, firefox and X. All that under the Gnome desktop.
>  > >
>  > >  I've sometimes seen periods where my laptop goes kind of nuts. While
>  > >  the cpu is still at 0%, the workload goes to 100% (as shown in the
>  > >  gnome process monitor) (I haven't checked in other means, e.g. top or
>  > >  /proc info as my machine is unusable).
>  > >
>  > >  But with my latest upgrade to 2.6.12 from 2.6.10, the hanging happens
>  > >  much more often. It lasts for over 30 seconds.
>  > >
>  > >  Could this hanging be related to swapping?
>  > >  Are there any VM regression lately that would make a kernel less
>  > >  appropriate for desktop use?
>  > >  How can I investigate that further?
>  > 
>  > 10-20 lines of `vmstat 1' output while it's happening would help.
> 
>  Here it goes. Maybe just some bad swapping?

Maybe.  There's certainly a ton of swapping happening.

>  [EMAIL PROTECTED]> vmstat 1
>  procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
>   r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
>   1  7 588164   7424  18612 106908   1373444   1012  8  2 85  5
>   2  4 587996   6152  18624 108092  404  664   540  2892 1201  2631 70  9  0 21
>   0 12 588276   5160  18620 109188  664 1244   860  1244 1195   615 46  5  0 50
>   0 13 588140   4912  18628 109188  2160   216 8 1156   245  0  0  0 100
>   0 17 588536   4892  18628 109972  132  576   132   576 1172   353 32  4  0 64
>   0 16 589096   5016  18628 1101920  608 4   628 1169   247  7  2  0 91
>   0 16 589780   5636  18632 1101360  716 0   808 1181   261  1  0  0 99

But maybe a memory leak.  Can you take a copy of /proc/meminfo and
/proc/slabinfo when this is happening?


Question regardings inodes and anon_hash_chain in 2.4/2.6

2005-08-25 Thread Gerard Snitselaar

I know that anon_hash_chain has gone away in 2.6 because
the inodes for special filesystems like sockfs, pipefs,
etc are now associated with a superblock. Should these
inodes have i_hash linked into the inode hashtable then?
It appears in 2.4 now they are associated with superblocks
as well.

I have been working on a problem dealing with inodes on a 2.4 kernel, 
and was walking inode_in_use and saw inodes that were unhashed. They
all are associated with superblocks for special types. My question
is, is this expected behavior or should they be getting hashed?

Thanks




Re: process creation time increases linearly with shmem

2005-08-25 Thread Nick Piggin


Rik van Riel wrote:
> On Thu, 25 Aug 2005, Nick Piggin wrote:
>
> > fork() can be changed so as not to set up page tables for
> > MAP_SHARED mappings. I think that has other tradeoffs like
> > initially causing several unavoidable faults reading
> > libraries and program text.
>
> Actually, libraries and program text are usually mapped
> MAP_PRIVATE, so those would still be copied.

Yep, that seems to be the case here.

> Skipping MAP_SHARED in fork() sounds like a good idea to me...

Indeed. Linus, can you remember why we haven't done this before?

--
SUSE Labs, Novell Inc.




Re: 2.6.13-rc7-rt1

2005-08-25 Thread Daniel Walker

On Fri, 2005-08-26 at 02:22 +0200, Thomas Gleixner wrote:
> On Thu, 2005-08-25 at 16:29 -0700, Daniel Walker wrote:
> > Devastating latency on a 3 GHz Xeon .. Maybe the raw_spinlock in the
> > timer base is creating an unbounded latency?
> 
> The lock is only held for really short periods. The only possible long
> period would be migration of timers from a dead hotplug cpu to another.
> I guess that's not the case.
> 
> Do you have HIGH_RES_TIMERS enabled ?

No. The cascade has a very long worst case.

Daniel


Re: [PATCH] removes filp_count_lock and changes nr_files type to atomic_t

2005-08-25 Thread Nick Piggin


Eric Dumazet wrote:

Furthermore, a lazy sync would mean to change sysctl proc_handler for 
"file-nr" to perform a synchronize before calling proc_dointvec, this 
would be really obscure.
 


I was only using your terminology (ie. the 'lazy' synch after the
atomic is updated).

Actually, a better idea would be to make a specific sysctl handler
like Christoph said.

Unless you can show some improvement, it would be better not to introduce
the racy hack (even if it is mostly harmless).


Unless the fs people had a problem with that.

And you may as well get rid of the atomic_inc_return which can be more
expensive on some platforms and doesn't buy you much.
  atomic_inc;
  atomic_read;
Should be enough if you don't care about lost updates here, yeah?



You mean :

atomic_inc();
lazeyvalue = atomic_read();

instead of

lazeyvalue = atomic_inc_return();



Yes.

In fact I couldn't find one architecture where the latter would be more 
expensive.




atomic_inc_return guarantees a memory barrier, while the former
statements do not. I'm fairly sure it will be more expensive on
a POWER5.
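
As a rough user-space analogue of the two variants being compared, here is a sketch using C11 atomics. It is not the kernel atomic_t API, and the mapping to kernel barrier semantics is only approximate; `nr_files`, `inc_then_read` and `inc_return` are names chosen for illustration.

```c
#include <stdatomic.h>

static atomic_int nr_files;

/* Variant A: relaxed increment followed by a relaxed read.  Another thread
 * may update the counter between the two steps, so the value returned is
 * only approximate, and no ordering of surrounding accesses is implied. */
static int inc_then_read(void)
{
    atomic_fetch_add_explicit(&nr_files, 1, memory_order_relaxed);
    return atomic_load_explicit(&nr_files, memory_order_relaxed);
}

/* Variant B: a single read-modify-write with sequentially consistent
 * ordering, returning exactly the value this increment produced. */
static int inc_return(void)
{
    return atomic_fetch_add(&nr_files, 1) + 1;
}
```

Variant A is enough when a slightly stale value is acceptable, which is the "lost updates don't matter" case discussed above. Variant B pays for the ordering guarantee on architectures where that is not free.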

--
SUSE Labs, Novell Inc.




[PATCH] IB: fix use-after-free in user verbs cleanup

2005-08-25 Thread Roland Dreier

Hi Andrew,

I'd like to get this into 2.6.13 if possible.  If it's too late, it's
not the end of the world -- we can wait for 2.6.13.1.  But it's a
tiny, obvious patch that fixes a crash that at least one person
actually hit running a normal application:
http://openib.org/pipermail/openib-general/2005-August/010248.html

Thanks,
  Roland


Fix a use-after-free bug in userspace verbs cleanup: we can't touch
mr->device after we free mr by calling ib_dereg_mr().

diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
--- a/drivers/infiniband/core/uverbs_main.c
+++ b/drivers/infiniband/core/uverbs_main.c
@@ -130,13 +130,14 @@ static int ib_dealloc_ucontext(struct ib
 
list_for_each_entry_safe(uobj, tmp, >mr_list, list) {
struct ib_mr *mr = idr_find(_uverbs_mr_idr, uobj->id);
+   struct ib_device *mrdev = mr->device;
struct ib_umem_object *memobj;
 
idr_remove(_uverbs_mr_idr, uobj->id);
ib_dereg_mr(mr);
 
memobj = container_of(uobj, struct ib_umem_object, uobject);
-   ib_umem_release_on_close(mr->device, >umem);
+   ib_umem_release_on_close(mrdev, >umem);
 
list_del(>list);
kfree(memobj);
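
The shape of the fix, reduced to plain C: copy the field you still need out of the object before the call that frees it. This is a generic sketch, not the InfiniBand code; `struct device`, `struct region`, `region_free`, `release_on_close` and `cleanup` are all hypothetical stand-ins.

```c
#include <stdlib.h>

struct device { int id; };
struct region { struct device *dev; };

static int released_on;   /* records which device the cleanup hook saw */

/* Frees the region; the caller must not touch it afterwards. */
static void region_free(struct region *r)
{
    free(r);
}

/* Cleanup hook that still needs the device the region belonged to. */
static void release_on_close(struct device *dev)
{
    released_on = dev->id;
}

static void cleanup(struct region *r)
{
    struct device *dev = r->dev;  /* copy what we need before freeing */

    region_free(r);               /* r must not be dereferenced past here */
    release_on_close(dev);        /* uses the saved pointer, not r->dev */
}
```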

[PATCH 2.6.13-rc7] dcdbas: add Dell Systems Management Base Driver with sysfs support

2005-08-25 Thread Doug Warzecha

This patch adds the Dell Systems Management Base Driver with sysfs support.

This driver has been tested with Dell OpenManage.

Signed-off-by: Doug Warzecha <[EMAIL PROTECTED]>
---

diff -uprN linux-2.6.13-rc7.orig/Documentation/dcdbas.txt linux-2.6.13-rc7/Documentation/dcdbas.txt
--- linux-2.6.13-rc7.orig/Documentation/dcdbas.txt  1969-12-31 18:00:00.0 -0600
+++ linux-2.6.13-rc7/Documentation/dcdbas.txt   2005-08-25 10:58:40.0 -0500
@@ -0,0 +1,91 @@
+Overview
+
+The Dell Systems Management Base Driver provides a sysfs interface for
+systems management software such as Dell OpenManage to perform system
+management interrupts and host control actions (system power cycle or
+power off after OS shutdown) on certain Dell systems.
+
+Dell OpenManage requires this driver on the following Dell PowerEdge systems:
+300, 1300, 1400, 400SC, 500SC, 1500SC, 1550, 600SC, 1600SC, 650, 1655MC,
+700, and 750.  Other Dell software such as the open source libsmbios project
+is expected to make use of this driver, and it may include the use of this
+driver on other Dell systems.
+
+The Dell libsmbios project aims towards providing access to as much BIOS
+information as possible.  See http://linux.dell.com/libsmbios/main/ for
+more information about the libsmbios project.
+
+
+System Management Interrupt
+
+On some Dell systems, systems management software must access certain
+management information via a system management interrupt (SMI).  The SMI data
+buffer must reside in 32-bit address space, and the physical address of the
+buffer is required for the SMI.  The driver maintains the memory required for
+the SMI and provides a way for the application to generate the SMI.
+The driver creates the following sysfs entries for systems management
+software to perform these system management interrupts:
+
+/sys/devices/platform/dcdbas/smi_data
+/sys/devices/platform/dcdbas/smi_data_buf_phys_addr
+/sys/devices/platform/dcdbas/smi_data_buf_size
+/sys/devices/platform/dcdbas/smi_request
+
+Systems management software must perform the following steps to execute
+a SMI using this driver:
+
+1) Lock smi_data.
+2) Write system management command to smi_data.
+3) Write "1" to smi_request to generate a calling interface SMI or
+   "2" to generate a raw SMI.
+4) Read system management command response from smi_data.
+5) Unlock smi_data.
+
+
+Host Control Action
+
+Dell OpenManage supports a host control feature that allows the administrator
+to perform a power cycle or power off of the system after the OS has finished
+shutting down.  On some Dell systems, this host control feature requires that
+a driver perform a SMI after the OS has finished shutting down.
+
+The driver creates the following sysfs entries for systems management software
+to schedule the driver to perform a power cycle or power off host control
+action after the system has finished shutting down:
+
+/sys/devices/platform/dcdbas/host_control_action
+/sys/devices/platform/dcdbas/host_control_smi_type
+/sys/devices/platform/dcdbas/host_control_on_shutdown
+
+Dell OpenManage performs the following steps to execute a power cycle or
+power off host control action using this driver:
+
+1) Write host control action to be performed to host_control_action.
+2) Write type of SMI that driver needs to perform to host_control_smi_type.
+3) Write "1" to host_control_on_shutdown to enable host control action.
+4) Initiate OS shutdown.
+   (Driver will perform host control SMI when it is notified that the OS
+   has finished shutting down.)
+
+
+Host Control SMI Type
+
+The following table shows the value to write to host_control_smi_type to
+perform a power cycle or power off host control action:
+
+PowerEdge SystemHost Control SMI Type
+-
+  300 HC_SMITYPE_TYPE1
+ 1300 HC_SMITYPE_TYPE1
+ 1400 HC_SMITYPE_TYPE2
+  500SC   HC_SMITYPE_TYPE2
+ 1500SC   HC_SMITYPE_TYPE2
+ 1550 HC_SMITYPE_TYPE2
+  600SC   HC_SMITYPE_TYPE2
+ 1600SC   HC_SMITYPE_TYPE2
+  650 HC_SMITYPE_TYPE2
+ 1655MC   HC_SMITYPE_TYPE2
+  700 HC_SMITYPE_TYPE3
+  750 HC_SMITYPE_TYPE3
+
+
diff -uprN linux-2.6.13-rc7.orig/drivers/firmware/dcdbas.c linux-2.6.13-rc7/drivers/firmware/dcdbas.c
--- linux-2.6.13-rc7.orig/drivers/firmware/dcdbas.c 1969-12-31 18:00:00.0 -0600
+++ linux-2.6.13-rc7/drivers/firmware/dcdbas.c  2005-08-25 19:01:20.0 -0500
@@ -0,0 +1,596 @@
+/*
+ *  dcdbas.c: Dell Systems Management Base Driver
+ *
+ *  The Dell Systems Management Base Driver provides a sysfs interface for
+ *  systems management software to perform System Management Interrupts (SMIs)
+ *  and Host Control Actions (power cycle or power off after OS shutdown) on
+ *  Dell systems.
+ *
+ *  See Documentation/dcdbas.txt for more information.
+ *
+ *  Copyright (C) 1995-2005 Dell Inc.
+ *
+

Re: 2.6.13-rc7-rt1

2005-08-25 Thread Thomas Gleixner

On Thu, 2005-08-25 at 16:29 -0700, Daniel Walker wrote:
> Devastating latency on a 3 GHz Xeon .. Maybe the raw_spinlock in the
> timer base is creating an unbounded latency?

The lock is only held for really short periods. The only possible long
period would be migration of timers from a dead hotplug cpu to another.
I guess that's not the case.

Do you have HIGH_RES_TIMERS enabled ?

> ( softirq-timer/1-13   |#1): new 66088 us maximum-latency critical section.
>  => started at timestamp 1857957769: <__down_mutex+0x5f/0x295>
>  =>   ended at timestamp 1858023857: <_raw_spin_unlock_irq+0x16/0x39>

Which mutex was taken and which raw spinlock was released?

__down_mutex / _raw_spin_unlock_irq is an asymmetric pair. Those sections
should be symmetric.

I have the feeling that the call trace is not complete.

> {rt_secret_rebuild+0}

hint: rt_secret_rebuild() is a well known long running timer callback
function. 

tglx


Re: libata-dev queue updated

2005-08-25 Thread Adrian Bunk

On Fri, Aug 26, 2005 at 09:04:37AM +0900, Tomita, Haruo wrote:
> On  Thursday, August 25, 2005 8:02 PM (JST), Adrian Bunk wrote:
> 
> > > 2.6.13- rc7-libata1.patch.bz2 was used. 
> > > A combined mode of ata_piix seems not to work. 
> > > Is the following patch correct?
> > > 
> > > diff -urN linux-2.6.13-rc7.orig/drivers/scsi/Kconfig 
> > linux-2.6.13-rc7/drivers/scsi/Kconfig
> > > --- linux-2.6.13-rc7.orig/drivers/scsi/Kconfig
> > 2005-08-25 13:44:33.0 +0900
> > > +++ linux-2.6.13-rc7/drivers/scsi/Kconfig 2005-08-25 
> > 14:33:38.0 +0900
> > > @@ -424,7 +424,7 @@
> > >  source "drivers/scsi/megaraid/Kconfig.megaraid"
> > >  
> > >  config SCSI_SATA
> > > - tristate "Serial ATA (SATA) support"
> > > + bool "Serial ATA (SATA) support"
> > >   depends on SCSI
> > >   help
> > > This driver family supports Serial ATA host controllers
> > 
> > No, this bug reintroduces a problem with SCSI=m.
> 
> Please explain this bug in detail. 

Assuming your patch is applied:

With SCSI=m and SCSI_SATA=y this allows the static enabling of the SATA 
drivers with unwanted effects, e.g.:
- SCSI=m, SCSI_SATA=y, SCSI_ATA_ADMA=y
  -> SCSI_ATA_ADMA is built statically but scsi/built-in.o is not linked 
 into the kernel
- SCSI=m, SCSI_SATA=y, SCSI_ATA_ADMA=y, SCSI_SATA_AHCI=m
  -> SCSI_ATA_ADMA and libata are built statically but 
 scsi/built-in.o is not linked into the kernel,
 SCSI_SATA_AHCI is built modular (unresolved symbols due to missing 
  libata)

> > Which problem do you face?
> > And how did this change alone fix it for you?
> 
> I am using Intel 82801EB SATA controller.
> 2.6.13-rc7-libata1.patch.bz2 worked as PATA when 82801EB was used in a 
> combined mode. 
> Does quirk_intel_ide_combined() work effectively?

I still do not see how your proposed patch would have _any_ influence on 
your problem.

> Thanks,
> Haruo

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed


RE: libata-dev queue updated

2005-08-25 Thread Tomita, Haruo

On  Thursday, August 25, 2005 8:02 PM (JST), Adrian Bunk wrote:

> > 2.6.13- rc7-libata1.patch.bz2 was used. 
> > A combined mode of ata_piix seems not to work. 
> > Is the following patch correct?
> > 
> > diff -urN linux-2.6.13-rc7.orig/drivers/scsi/Kconfig 
> linux-2.6.13-rc7/drivers/scsi/Kconfig
> > --- linux-2.6.13-rc7.orig/drivers/scsi/Kconfig  
> 2005-08-25 13:44:33.0 +0900
> > +++ linux-2.6.13-rc7/drivers/scsi/Kconfig   2005-08-25 
> 14:33:38.0 +0900
> > @@ -424,7 +424,7 @@
> >  source "drivers/scsi/megaraid/Kconfig.megaraid"
> >  
> >  config SCSI_SATA
> > -   tristate "Serial ATA (SATA) support"
> > +   bool "Serial ATA (SATA) support"
> > depends on SCSI
> > help
> >   This driver family supports Serial ATA host controllers
> 
> No, this bug reintroduces a problem with SCSI=m.

Please explain this bug in detail. 

> Which problem do you face?
> And how did this change alone fix it for you?

I am using an Intel 82801EB SATA controller.
2.6.13-rc7-libata1.patch.bz2 worked as PATA when 82801EB was used in a
combined mode.
Does quirk_intel_ide_combined() work effectively?

Thanks,
Haruo

Re: [PATCH 2/2] pipe: do not return POLLERR for fifo_poll

2005-08-25 Thread Andrew Morton

Pekka Enberg <[EMAIL PROTECTED]> wrote:
>
> This patch changes fifo_poll not to return POLLERR to take care of a FIXME
> in fs/pipe.c stating that "Most unices do not set POLLERR for fifos." The
> comment has been there since 2.3.99-pre3 so either apply this patch or
> alternatively, I can send a new one removing the unnecessary abstraction.
> 
> ...
> --- 2.6-mm.orig/fs/pipe.c
> +++ 2.6-mm/fs/pipe.c
> @@ -399,8 +399,8 @@ pipe_ioctl(struct inode *pino, struct fi
>  }
>  
>  /* No kernel lock held - fine */
> -static unsigned int
> -pipe_poll(struct file *filp, poll_table *wait)
> +static inline unsigned int
> +__pipe_poll(struct file *filp, poll_table *wait, int can_err)
>  {
>   unsigned int mask;
>   struct inode *inode = filp->f_dentry->d_inode;
> @@ -420,15 +420,24 @@ pipe_poll(struct file *filp, poll_table 
>  
>   if (filp->f_mode & FMODE_WRITE) {
>   mask |= (nrbufs < PIPE_BUFFERS) ? POLLOUT | POLLWRNORM : 0;
> - if (!info->readers)
> + if (can_err && !info->readers)
>   mask |= POLLERR;
>   }
>  
>   return mask;
>  }
>  
> -/* FIXME: most Unices do not set POLLERR for fifos */
> -#define fifo_poll pipe_poll
> +static unsigned int
> +pipe_poll(struct file *filp, poll_table *wait)
> +{
> + return __pipe_poll(filp, wait, 1);
> +}
> +
> +static unsigned int
> +fifo_poll(struct file *filp, poll_table *wait)
> +{
> + return __pipe_poll(filp, wait, 0);
> +}
>  
>  static int
>  pipe_release(struct inode *inode, int decr, int decw)

A userspace-visible change, no?

So there's a risk in changing it.  What do we get in return?  Worried.
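
For anyone who wants to see what userspace currently observes, here is a small probe. It is only a sketch and uses an anonymous pipe, because that is the simplest way to get a writer with no readers; the patch under discussion would change FIFOs only, so the same check would have to be repeated on a descriptor opened from a mkfifo() node to see the difference. `poll_writer_without_reader` is a hypothetical helper name.

```c
#include <poll.h>
#include <unistd.h>

/* Returns the revents reported for the write end of an anonymous pipe whose
 * read end has been closed, or -1 on error. */
static int poll_writer_without_reader(void)
{
    int fds[2];
    struct pollfd pfd;
    int n;

    if (pipe(fds) < 0)
        return -1;
    close(fds[0]);                 /* no readers left */

    pfd.fd = fds[1];
    pfd.events = POLLOUT;
    pfd.revents = 0;
    n = poll(&pfd, 1, 0);          /* zero timeout: just sample the state */
    close(fds[1]);
    return n < 0 ? -1 : pfd.revents;
}
```

Any program that tests `revents & POLLERR` to notice that the reader went away is the kind of user that could be affected by a change here.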

Re: A Great Idea (tm) about reimplementing NLS.

2005-08-25 Thread Daniel B.

Alan Cox wrote:
> 
> On Sul, 2005-06-19 at 18:55, Pavel Machek wrote:
> ...
> >
> > If we are serious about utf-8 support in ext3, we should return
> > -EINVAL if someone passes non-canonical utf-8 string.
> 
> That would ironically not be standards compliant

Which standards?

The standards I've read (mostly XML- and web-related specs)
do say that non-standard UTF-8 octet sequences should be rejected.
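
To make "rejecting" concrete, here is a sketch of a checker that accepts only well-formed UTF-8 in the RFC 3629 sense: shortest-form encodings, no surrogates, nothing above U+10FFFF. It assumes that "non-canonical" in this thread means ill-formed sequences such as overlong encodings, not Unicode normalization forms. `utf8_is_wellformed` is a hypothetical name.

```c
#include <stddef.h>

/* Returns 1 if s[0..n) is well-formed UTF-8, 0 otherwise. */
static int utf8_is_wellformed(const unsigned char *s, size_t n)
{
    size_t i = 0;

    while (i < n) {
        unsigned char c = s[i];
        unsigned long cp, min;
        size_t need, k;

        if (c < 0x80) {                       /* ASCII, one byte */
            i++;
            continue;
        } else if ((c & 0xE0) == 0xC0) {      /* two-byte lead */
            need = 1; cp = c & 0x1F; min = 0x80;
        } else if ((c & 0xF0) == 0xE0) {      /* three-byte lead */
            need = 2; cp = c & 0x0F; min = 0x800;
        } else if ((c & 0xF8) == 0xF0) {      /* four-byte lead */
            need = 3; cp = c & 0x07; min = 0x10000;
        } else {
            return 0;                         /* stray continuation or invalid lead */
        }

        if (n - i - 1 < need)
            return 0;                         /* truncated sequence */
        for (k = 1; k <= need; k++) {
            if ((s[i + k] & 0xC0) != 0x80)
                return 0;                     /* not a continuation byte */
            cp = (cp << 6) | (s[i + k] & 0x3F);
        }
        if (cp < min)
            return 0;                         /* overlong (non-shortest) form */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                         /* UTF-16 surrogate range */
        if (cp > 0x10FFFF)
            return 0;                         /* beyond the Unicode range */
        i += need + 1;
    }
    return 1;
}
```

The classic case is the two-byte sequence C0 AF, an overlong encoding of '/', which a checker like this refuses instead of silently treating it as a path separator.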


Daniel
-- 
Daniel Barclay
[EMAIL PROTECTED]

Re: [PATCH] drivers/hwmon/*: kfree() correct pointers

2005-08-25 Thread Jonathan Corbet

> Already fixed in Greg's i2c tree and -mm for quite some time now...

So it is.  The comment says, however, that "the existing code works
somewhat by accident."  In the case of the 9240 driver, however, the
existing code demonstrably does not work - it oopsed on me.  The patch
in Greg's tree looks fine (it's a straightforward fix, after all); I'd
recommend that it be merged before 2.6.13.

jon

Jonathan Corbet
Executive editor, LWN.net
[EMAIL PROTECTED]


Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-25 Thread Benjamin Herrenschmidt

On Fri, 2005-08-26 at 09:18 +1000, Paul Mackerras wrote:
> Benjamin Herrenschmidt writes:
> 
> > Ok, so what is the problem then ? Why do we have to wait at all ? Why
> > not just unplug/replug right away ?
> 
> We'd have to be absolutely certain that the driver could not possibly
> take another interrupt or try to access the device on behalf of the
> old instance of the device by the time it returned from the remove
> function.  I'm not sure I'd trust most drivers that far...

Hrm... If a driver gets that wrong, then it will also blow up when
unloaded as a module. All drivers should be fully shut down by the time
they return from remove(). free_irq() is synchronous as is iounmap() and
both of those are usually called as part of remove(). I wouldn't be too
worried here.

Ben.



Re: 2.6.13-rc7-rt1

2005-08-25 Thread Daniel Walker

Devastating latency on a 3 GHz Xeon .. Maybe the raw_spinlock in the
timer base is creating an unbounded latency?

Daniel

( softirq-timer/1-13   |#1): new 66088 us maximum-latency critical section.
 => started at timestamp 1857957769: <__down_mutex+0x5f/0x295>
 =>   ended at timestamp 1858023857: <_raw_spin_unlock_irq+0x16/0x39>

Call Trace:{check_critical_timing+491} 
{rt_secret_rebuild+0}
   {trace_irqs_on+100} 
{_raw_spin_unlock_irq+22}
   {run_timer_softirq+1916} 
{ksoftirqd+241}
   {ksoftirqd+0} {kthread+218}
   {child_rip+8} {kthread+0}
   {child_rip+0}
---
| preempt count:  ]
| 0-level deep critical section nesting:


 =>   dump-end timestamp 1858101914




Re: Linux-2.6.13-rc7

2005-08-25 Thread Antonino A. Daplas

Sorry. Here's the start of the thread.

Tony

On Tue, 23 Aug 2005 22:08:13 -0700 (PDT)
Linus Torvalds <[EMAIL PROTECTED]> wrote:

> Antonino A. Daplas:
>   intelfb/fbdev: Save info->flags in a local variable
> Sylvain Meyer:
>   intelfb: Do not ioremap entire graphics aperture

One of these changes broke intelfb. The same .config from 2.6.13-rc6
no longer works for -rc7. After booting the screen stays black, but
I can type blindly. I can also start X. dmesg does not show anything
unusual. Any ideas?

Re: Inotify problem [was Re: 2.6.13-rc6-mm1]

2005-08-25 Thread Johannes Berg

On Thu, 2005-08-25 at 16:10 -0700, George Anzinger wrote:

> That IS strange.  1024 is on a "level" boundary, but the next level is 
> 2**15, not 2**11.  I will take a look.

Remember that the level is never filled, so maybe the smallest level
just gets an offset or something? Well, you're the expert I suppose, so
apologies if this didn't make sense. Just crossed my mind :)

johannes


Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-25 Thread Paul Mackerras

Benjamin Herrenschmidt writes:

> Ok, so what is the problem then ? Why do we have to wait at all ? Why
> not just unplug/replug right away ?

We'd have to be absolutely certain that the driver could not possibly
take another interrupt or try to access the device on behalf of the
old instance of the device by the time it returned from the remove
function.  I'm not sure I'd trust most drivers that far...

Paul.

Re: Inotify problem [was Re: 2.6.13-rc6-mm1]

2005-08-25 Thread George Anzinger


John McCutchan wrote:
> On Thu, 2005-08-25 at 11:54 -0700, George Anzinger wrote:
>> Robert Love wrote:
>>> On Thu, 2005-08-25 at 09:33 -0400, John McCutchan wrote:
>>>> On Thu, 2005-08-25 at 22:07 +1200, Reuben Farrelly wrote:
>> ~
>> I think the best thing is to take idr into user space and emulate the
>> problem usage.  To this end, from the log it appears that you _might_ be
>> moving between 0, 1 and 2 entries increasing the number each time.  It
>> also appears that the failure happens here:
>>
>> add 1023
>> add 1024
>> find 1024  or is it the remove that fails?  It also looks like 1024 got
>> allocated twice.  Am I reading the log correctly?
>
> You are reading the log correctly. There are two bugs. One is that if we
> pass X to idr_get_new_above, it can return X again (doesn't ever seem to
> return < X). The other problem is that the find fails on 1024 (and 2048
> if we skip 1024).

That IS strange.  1024 is on a "level" boundary, but then next level is
2**15, not 2**11.  I will take a look.

>> So, is it correct to assume that the tree is empty save these two at
>> this time?  I am just trying to figure out what the test program needs
>> to do.
>
> Yes that is the exact scenario. Only 2 id's are used at any given time,
> and once we hit 1024 things break. This doesn't happen when the tree is
> not empty.
>
> Thanks for looking at this!
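
A user-space harness for the scenario discussed above might look roughly
like the sketch below. It only replays the access pattern (at most two ids
live, each new id requested at or above the previous one, then looked up).
The sketch_* table is a hypothetical stand-in and is not the kernel idr
code; the real idr implementation would have to be dropped in behind the
same three calls to reproduce the failure at 1024.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define SKETCH_MAX_ID 4096

/* Hypothetical stand-in for the id table; NOT the kernel idr. */
struct sketch_ids {
        bool used[SKETCH_MAX_ID];
};

/* Stand-in for idr_get_new_above(): smallest free id >= start, or -1. */
static int sketch_get_new_above(struct sketch_ids *t, int start)
{
        for (int id = start < 0 ? 0 : start; id < SKETCH_MAX_ID; id++) {
                if (!t->used[id]) {
                        t->used[id] = true;
                        return id;
                }
        }
        return -1;
}

static bool sketch_find(const struct sketch_ids *t, int id)
{
        return id >= 0 && id < SKETCH_MAX_ID && t->used[id];
}

static void sketch_remove(struct sketch_ids *t, int id)
{
        if (id >= 0 && id < SKETCH_MAX_ID)
                t->used[id] = false;
}

/*
 * Replay the pattern: drop the older of the two live ids, allocate a new
 * one at or above the newest, then look it up.  Returns the first id at
 * which a check fails, or -1 if none failed before reaching limit.
 */
static int replay_two_live_ids(struct sketch_ids *t, int limit)
{
        int older = -1, newer = -1;

        assert(limit <= SKETCH_MAX_ID);
        for (;;) {
                if (older >= 0)
                        sketch_remove(t, older);

                int start = newer < 0 ? 0 : newer;
                int id = sketch_get_new_above(t, start);

                if (id < 0 || id >= limit)
                        return -1;      /* range exhausted, no failure seen */
                if (id < start || id == newer)
                        return id;      /* went backwards or handed out twice */
                if (!sketch_find(t, id))
                        return id;      /* the "find fails" symptom */

                older = newer;
                newer = id;
        }
}
```

With the stand-in table the replay runs clean, which is the point: any id
returned by replay_two_live_ids() once the real allocator is substituted
identifies where the allocator itself misbehaves.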


--
George Anzinger   george@mvista.com
HRT (High-res-timers):  http://sourceforge.net/projects/high-res-timers/

Re: Need better is_better_time_interpolator() algorithm

2005-08-25 Thread Alex Williamson

On Thu, 2005-08-25 at 17:40 -0400, [EMAIL PROTECTED] wrote:
> > (frequency) * (1/drift) * (1/latency) * (1/(jitter_factor * cpus))
> 
> (Note that 1/cpus, being a constant for all evaluations of this
> expression, has no effect on the final ranking.)

   I was sloppy expressing how the jitter factors in, but I've got code
below that should make it more clear.

> The usual way it's done is with some fiddle factors:
> 
> quality_a^a * quality_b^b * quality_c^c
> 
> Or, equivalently:
> 
> a * log(quality_a) + b * log(quality_b) + c * log(quality_c)
> 
> Then you use the a, b and c factors to weight the relative importance
> of them.  Your suggestion is equivalent to setting all the exponents to 1.
> 
> But you can also say that "a is twice as important as b" in a
> consistent manner.

   Right.  It's the weighting factors themselves that I'm clueless
about.  I'm hoping someone will chime in to help us prioritize and
weight clock attributes appropriately.  The code I've been poking at is
below (by no means ready for inclusion).  I think this brings the
components I'm aware of together, and makes an attempt to do something
meaningful with them.
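
As a side illustration of the weighting scheme quoted above (this is not
the code from the patch), the product-with-exponents form can be sketched
as follows. The struct and function names are hypothetical, only small
non-negative integer exponents are handled so that no libm is needed, and
jitter and CPU count are left out on purpose, so the numbers it produces
are not the "goodness" values mentioned in this mail.

```c
#include <assert.h>

/* Hypothetical description of a time source; field names are illustrative. */
struct sketch_clock {
        double frequency_hz;
        double drift_ppm;
        double latency_ns;
};

/* Exponents ("fiddle factors") weighting each attribute; 0 ignores it. */
struct sketch_weights {
        unsigned int freq_exp;
        unsigned int drift_exp;
        unsigned int latency_exp;
};

/* base raised to a small non-negative integer power. */
static double sketch_ipow(double base, unsigned int exp)
{
        double r = 1.0;

        while (exp--)
                r *= base;
        return r;
}

/* frequency^a * (1/drift)^b * (1/latency)^c; higher is better. */
static double sketch_clock_score(const struct sketch_clock *c,
                                 const struct sketch_weights *w)
{
        assert(c->drift_ppm > 0.0 && c->latency_ns > 0.0);
        return sketch_ipow(c->frequency_hz, w->freq_exp) /
               (sketch_ipow(c->drift_ppm, w->drift_exp) *
                sketch_ipow(c->latency_ns, w->latency_exp));
}
```

Setting every exponent to 1 gives the plain product from the original
proposal; raising or zeroing an exponent is how "a is more important than
b" would be expressed, and it can change which clock ranks first.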

   On my system, I have 2 NUMA nodes, each with 4 processors.  The cycle
counter runs at 1.5GHz with a 750ppm drift.  The latency of this timer
comes out to ~30ns.  The cycle counter is subject to drift, so I divide
by the square of the cpus and arrive at a "goodness" factor of around
950.

  I also have 2 HPETs on the system, one on each node.  These run at
250MHz with an assumed drift of 500ppm (although the current hpet code
makes the drift much higher).  I measure the access latency of these to
be around 200ns.  Because the HPETs have different access latency
depending on the node, I use the NUMA info to get an average access
latency of ~300ns.  These timers are not subject to jitter, eliminating
that factor.  This results in a "goodness" factor of ~1450.

   I don't really know if this makes sense, but it seems to do what I
think it should.  If I were to add another node to the system, I would
more strongly favor the HPETs time, if I removed a node I would revert
to the cycle counter.  Anyway, I think it might be a good starting point
for further experimentation.  Patch below.

Alex
-- 
Alex Williamson HP Linux & Open Source Lab

 arch/ia64/kernel/cyclone.c  |3 -
 arch/ia64/kernel/time.c |3 -
 arch/ia64/sn/kernel/sn2/timer.c |3 -
 arch/sparc64/kernel/time.c  |3 -
 drivers/char/hpet.c |   12 +++-
 include/linux/hpet.h|1 
 include/linux/timex.h   |2 
 kernel/timer.c  |  113 +++-
 8 files changed, 133 insertions(+), 7 deletions(-)

diff -r b40794c1ac45 arch/ia64/kernel/cyclone.c
--- a/arch/ia64/kernel/cyclone.cWed Aug 24 12:00:22 2005
+++ b/arch/ia64/kernel/cyclone.cThu Aug 25 16:29:22 2005
@@ -23,7 +23,8 @@
.shift =16,
.frequency =CYCLONE_TIMER_FREQ,
.drift =-100,
-   .mask = (1LL << 40) - 1
+   .mask = (1LL << 40) - 1,
+   .node = -1
 };

 int __init init_cyclone_clock(void)
diff -r b40794c1ac45 arch/ia64/kernel/time.c
--- a/arch/ia64/kernel/time.c   Wed Aug 24 12:00:22 2005
+++ b/arch/ia64/kernel/time.c   Thu Aug 25 16:29:22 2005
@@ -48,7 +48,8 @@
 static struct time_interpolator itc_interpolator = {
.shift = 16,
.mask = 0xLL,
-   .source = TIME_SOURCE_CPU
+   .source = TIME_SOURCE_CPU,
+   .node = -1
 };

 static irqreturn_t
diff -r b40794c1ac45 arch/ia64/sn/kernel/sn2/timer.c
--- a/arch/ia64/sn/kernel/sn2/timer.c   Wed Aug 24 12:00:22 2005
+++ b/arch/ia64/sn/kernel/sn2/timer.c   Thu Aug 25 16:29:22 2005
@@ -25,7 +25,8 @@
.drift = -1,
.shift = 10,
.mask = (1LL << 55) - 1,
-   .source = TIME_SOURCE_MMIO64
+   .source = TIME_SOURCE_MMIO64,
+   .node = -1
 };

 void __init sn_timer_init(void)
diff -r b40794c1ac45 arch/sparc64/kernel/time.c
--- a/arch/sparc64/kernel/time.cWed Aug 24 12:00:22 2005
+++ b/arch/sparc64/kernel/time.cThu Aug 25 16:29:22 2005
@@ -1048,7 +1048,8 @@
 static struct time_interpolator sparc64_cpu_interpolator = {
.source =   TIME_SOURCE_CPU,
.shift  =   16,
-   .mask   =   0xLL
+   .mask   =   0xLL,
+   .node   =   -1
 };

 /* The quotient formula is taken from the IA64 port. */
diff -r b40794c1ac45 drivers/char/hpet.c
--- a/drivers/char/hpet.c   Wed Aug 24 12:00:22 2005
+++ b/drivers/char/hpet.c   Thu Aug 25 16:29:22 2005
@@ -82,6 +82,7 @@
unsigned long hp_delta;
unsigned int hp_ntimer;
unsigned int hp_which;
+   acpi_handle handle;
struct hpet_dev hp_dev[1];
 };

@@ -702,6 +703,7 @@
 {
 #ifdef CONFIG_TIME_INTERPOLATION

Re: Linux-2.6.13-rc7

2005-08-25 Thread Al Viro

On Thu, Aug 25, 2005 at 03:16:49PM -0700, Richard Henderson wrote:
> On Thu, Aug 25, 2005 at 08:07:55PM +0100, Al Viro wrote:
> > IMO that's a question to rth: why do we really need to block always_inline
> > on alpha?
> 
> Because I use "extern inline" in the proper way.  That is, I have both
> inline and out-of-line versions of some routines.  These routines have
> their address taken to be put into the alpha_machine_vector structures,
> so we're guaranteed that they'll be out-of-line at least once.
> 
> But if you define inline to always_inline, the compiler complains when
> it's forced to fall back to the out-of-line copy.  And rightly so -- the
> feature was INVENTED for using compiler intrinsics that would in fact
> not produce valid assembly unless certain parameters are constants.
> 
> I've complained about this before.  You always-inline savages have 
> absconded with ALL THREE inline keywords -- "inline", "__inline" and
> "__inline__" -- so there is in fact no way to accomplish what I want.
> 
> So in a fit of pique I've locally undone not just one, but all of the
> always-inline crap.
> 
> All that said, something's wrong if we couldn't generate an out-of-line
> copy of kmalloc.  The entire block protected by __builtin_constant_p
> should have been eliminated.  File a gcc bugzilla report.  

It is eliminated.  As the result, the compile-time checks disappear.
In this case it's more or less harmless - we miss some bugs that could
be caught at compile time, but that's it.  In case of e.g. xchg() (same
technique of calling undefined function in the code that gets eliminated
if everything's right) it gave genuine bugs - gcc decided to create an
uninlined copy and to hell it went:

static inline unsigned long
__xchg(volatile void *ptr, unsigned long x, int size)
{
        switch (size) {
        case 1:
                return __xchg_u8(ptr, x);
        case 2:
                return __xchg_u16(ptr, x);
        case 4:
                return __xchg_u32(ptr, x);
        case 8:
                return __xchg_u64(ptr, x);
        }
        __xchg_called_with_bad_pointer();
        return x;
}
#define xchg(ptr,x)  \
  ({ \
 __typeof__(*(ptr)) _x_ = (x);   \
 (__typeof__(*(ptr))) __xchg((ptr), (unsigned long)_x_, sizeof(*(ptr))); \
  })

blows to hell, since we have no way to tell gcc that it should _never_
be done non-inlined.  Well, no way short of making __xchg a macro...

So what do you propose to use for that class of compile-time checks?
#define whenever they are used?
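
For readers who have not seen the trick, here is a minimal user-space
sketch of the pattern under discussion. All sketch_* names are made up for
illustration. Two deliberate differences from the kernel code: the "bad
size" function is defined (it aborts) so that the example links at any
optimization level, whereas the kernel leaves it undefined so that a
surviving call turns into a link error; and the exchange itself is a plain
load and store, not an atomic operation. The always_inline attribute and
__typeof__ are GCC extensions.

```c
#include <stdlib.h>

/*
 * In the kernel this function is declared but never defined, so a call
 * that survives dead-code elimination fails at link time.  It is defined
 * here only to keep the sketch linkable everywhere.
 */
static void sketch_xchg_bad_size(void)
{
        abort();
}

/*
 * Forced inlining makes 'size' a compile-time constant at every call
 * site, so the switch collapses to a single case.  An out-of-line copy
 * would keep the whole switch, including the bad-size call.
 */
static inline __attribute__((always_inline)) unsigned long
sketch_xchg_sized(volatile void *ptr, unsigned long x, int size)
{
        switch (size) {
        case 4: {
                volatile unsigned int *p = ptr;
                unsigned int old = *p;

                *p = (unsigned int)x;   /* NOT atomic; illustration only */
                return old;
        }
        case 8: {
                volatile unsigned long long *p = ptr;
                unsigned long long old = *p;

                *p = x;                 /* NOT atomic; illustration only */
                return (unsigned long)old;
        }
        }
        sketch_xchg_bad_size();
        return x;
}

#define sketch_xchg(ptr, x) \
        ((__typeof__(*(ptr)))sketch_xchg_sized((ptr), (unsigned long)(x), \
                                               sizeof(*(ptr))))
```

The sketch assumes an LP64 target for the 8-byte case. It shows why the
check depends on inlining: once the compiler emits a standalone copy of
sketch_xchg_sized(), the bad-size call is reachable code again.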

Re: 2.6.13-rc7-rt1

2005-08-25 Thread Daniel Walker

Never mind, the original patch looks fine.

Daniel

On Thu, 2005-08-25 at 23:54 +0200, Ingo Molnar wrote:
> * Daniel Walker <[EMAIL PROTECTED]> wrote:
> 
> > @@ -257,6 +257,7 @@ void check_preempt_wakeup(struct task_st
> >  * hangs and race conditions.
> >  */
> > if (!preempt_count() &&
> > +   !__raw_irqs_disabled() &&
> > p->prio < current->prio &&
> > rt_task(p) &&
> > (current->rcu_read_lock_nesting != 0 ||
> 
> did you get a false positive? If yes, in what code/driver?
> 
>   Ingo


Re: Linux-2.6.13-rc7

2005-08-25 Thread Richard Henderson

On Thu, Aug 25, 2005 at 08:07:55PM +0100, Al Viro wrote:
> IMO that's a question to rth: why do we really need to block always_inline
> on alpha?

Because I use "extern inline" in the proper way.  That is, I have both
inline and out-of-line versions of some routines.  These routines have
their address taken to be put into the alpha_machine_vector structures,
so we're guaranteed that they'll be out-of-line at least once.

But if you define inline to always_inline, the compiler complains when
it's forced to fall back to the out-of-line copy.  And rightly so -- the
feature was INVENTED for using compiler intrinsics that would in fact
not produce valid assembly unless certain parameters are constants.

I've complained about this before.  You always-inline savages have 
absconded with ALL THREE inline keywords -- "inline", "__inline" and
"__inline__" -- so there is in fact no way to accomplish what I want.

So in a fit of pique I've locally undone not just one, but all of the
always-inline crap.

All that said, something's wrong if we couldn't generate an out-of-line
copy of kmalloc.  The entire block protected by __builtin_constant_p
should have been eliminated.  File a gcc bugzilla report.  

r~

[PATCH] late spinlock initialization in ieee1394/ohci

2005-08-25 Thread Al Viro

spinlock used in irq handler should be initialized before registering
irq, even if we know that our device has interrupts disabled; handler
is registered shared and taking spinlock is done unconditionally.  As
it is, we can and do get oopsen on boot for some configuration, depending
on irq routing - I've got a reproducer.

Signed-off-by: Al Viro <[EMAIL PROTECTED]>

diff -urN RC13-rc7-base/drivers/ieee1394/ohci1394.c 
current/drivers/ieee1394/ohci1394.c
--- RC13-rc7-base/drivers/ieee1394/ohci1394.c   2005-08-24 01:56:37.0 
-0400
+++ current/drivers/ieee1394/ohci1394.c 2005-08-25 18:02:49.0 -0400
@@ -478,7 +478,6 @@
int num_ports, i;
 
spin_lock_init(&ohci->phy_reg_lock);
-   spin_lock_init(&ohci->event_lock);
 
/* Put some defaults to these undefined bus options */
buf = reg_read(ohci, OHCI1394_BusOptions);
@@ -3402,7 +3401,14 @@
/* We hopefully don't have to pre-allocate IT DMA like we did
 * for IR DMA above. Allocate it on-demand and mark inactive. */
ohci->it_legacy_context.ohci = NULL;
+   spin_lock_init(&ohci->event_lock);
 
+   /*
+* interrupts are disabled, all right, but... due to SA_SHIRQ we
+* might get called anyway.  We'll see no event, of course, but
+* we need to get to that "no event", so enough should be initialized
+* by that point.
+*/
if (request_irq(dev->irq, ohci_irq_handler, SA_SHIRQ,
 OHCI1394_DRIVER_NAME, ohci))
FAIL(-ENOMEM, "Failed to allocate shared interrupt %d", 
dev->irq);
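
The ordering problem this patch fixes can be shown with a small user-space
simulation. Nothing below is kernel API: the fake lock, the single shared
"irq line" and all sketch_* names are hypothetical, and the handler returns
-1 where the real driver would oops on an uninitialized spinlock.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical lock that remembers whether it was ever initialized. */
struct sketch_lock {
        bool initialized;
        bool held;
};

static void sketch_lock_init(struct sketch_lock *l)
{
        l->initialized = true;
        l->held = false;
}

struct sketch_dev {
        struct sketch_lock event_lock;
        unsigned int pending_events;
};

typedef int (*sketch_handler_t)(void *dev_id);

/* One simulated shared interrupt line with a single registered handler. */
static sketch_handler_t sketch_line_handler;
static void *sketch_line_dev;

static int sketch_request_irq(sketch_handler_t handler, void *dev_id)
{
        sketch_line_handler = handler;
        sketch_line_dev = dev_id;
        return 0;
}

/* Another device on the same line fires: our handler runs too. */
static int sketch_raise_shared_irq(void)
{
        return sketch_line_handler ? sketch_line_handler(sketch_line_dev) : 0;
}

/* Takes the lock unconditionally, like the handler in the patch. */
static int sketch_irq_handler(void *dev_id)
{
        struct sketch_dev *dev = dev_id;
        int handled;

        if (!dev->event_lock.initialized)
                return -1;              /* the real driver would oops here */
        dev->event_lock.held = true;
        handled = dev->pending_events != 0;
        dev->pending_events = 0;
        dev->event_lock.held = false;
        return handled;
}

/* Correct order: everything the handler touches is set up first. */
static int sketch_probe(struct sketch_dev *dev)
{
        sketch_lock_init(&dev->event_lock);
        return sketch_request_irq(sketch_irq_handler, dev);
}
```

With sketch_probe() a foreign interrupt is harmless and reports "no
event"; registering the handler before sketch_lock_init() makes the same
interrupt hit the uninitialized lock.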

Re: 2.6.13-rc7-rt1

2005-08-25 Thread Daniel Walker

On Thu, 2005-08-25 at 23:54 +0200, Ingo Molnar wrote:
> * Daniel Walker <[EMAIL PROTECTED]> wrote:
> 
> > @@ -257,6 +257,7 @@ void check_preempt_wakeup(struct task_st
> >  * hangs and race conditions.
> >  */
> > if (!preempt_count() &&
> > +   !__raw_irqs_disabled() &&
> > p->prio < current->prio &&
> > rt_task(p) &&
> > (current->rcu_read_lock_nesting != 0 ||
> 
> did you get a false positive? If yes, in what code/driver?


Yes, one was in trigger_softirqs(), the other in init_sd. Both traces
below.

BUG: swapper/1, possible wake_up race on softirq-scsi/3/31

Call Trace:{wake_up_process+17} 
{trigger_softirqs+76}
   {__scsi_done+101} 
{ata_scsi_rbuf_fill+114}
   {scsi_done+0} {ata_scsi_simulate+320}
   {scsi_done+0} {ata_scsi_queuecmd+228}
   {scsi_dispatch_cmd+488} 
{scsi_request_fn+1046}
   {blk_insert_request+156} 
{scsi_insert_special_req+51}
   {scsi_wait_req+339} 
{__scsi_mode_sense+228}
   {sd_revalidate_disk+3037} 
{add_preempt_count_ti+35}
   {atomic_dec_and_spin_lock+52} 
{rescan_partitions+133}
   {do_open+672} {blkdev_get+161}

BUG: swapper/1, possible wake_up race on softirq-scsi/1/15

Call Trace:{wake_up_p   
{driver_register+91} {init_sd+31}
   {init+503} {child_rip+8}
   {init+0} {child_rip+0}



[PATCH 5/7] spufs: Use a system call instead of ioctl

2005-08-25 Thread Arnd Bergmann

This patch makes it possible to use a system call instead of
an ioctl to run spu code on spufs.

This is only provided for reference, the current patch is
unlikely to be used in future versions.

We are planning to move to a model where creation/destruction
of SPU threads as well as entering the execution is done
with new system calls instead of mkdir, rmdir and this
call.
This will make an SPU thread a property of a Linux user
space thread instead of the global object that can be
used by any process.

Signed-off-by: Arnd Bergmann <[EMAIL PROTECTED]>

--

 arch/ppc64/kernel/misc.S   |2 
 fs/spufs/Makefile  |2 
 fs/spufs/file.c|   12 +
 fs/spufs/spufs.h   |2 
 fs/spufs/spurun.c  |   96 +
 include/asm-ppc/unistd.h   |3 -
 include/asm-ppc64/unistd.h |3 -
 include/linux/syscalls.h   |3 +
 kernel/sys_ni.c|1 
 9 files changed, 121 insertions(+), 3 deletions(-)

--- linux-cg.orig/arch/ppc64/kernel/misc.S  2005-08-25 23:12:31.254917248 
-0400
+++ linux-cg/arch/ppc64/kernel/misc.S   2005-08-25 23:12:36.582906000 -0400
@@ -1132,6 +1132,7 @@ _GLOBAL(sys_call_table32)
.llong .sys_inotify_init/* 275 */
.llong .sys_inotify_add_watch
.llong .sys_inotify_rm_watch
+   .llong .sys_spu_run
 
.balign 8
 _GLOBAL(sys_call_table)
@@ -1413,3 +1414,4 @@ _GLOBAL(sys_call_table)
.llong .sys_inotify_init/* 275 */
.llong .sys_inotify_add_watch
.llong .sys_inotify_rm_watch
+   .llong .sys_spu_run
--- linux-cg.orig/fs/spufs/Makefile 2005-08-25 23:12:34.935890192 -0400
+++ linux-cg/fs/spufs/Makefile  2005-08-25 23:12:36.582906000 -0400
@@ -1,5 +1,7 @@
 obj-$(CONFIG_SPU_FS) += spufs.o
+syscall-$(CONFIG_SPU_FS) += spurun.o
 
+obj-y += $(syscall-y) $(syscall-m)
 spufs-y += inode.o file.o context.o switch.o
 
 # Rules to build switch.o with the help of SPU tool chain
--- linux-cg.orig/fs/spufs/file.c   2005-08-25 23:12:31.259916488 -0400
+++ linux-cg/fs/spufs/file.c2005-08-25 23:12:36.584905696 -0400
@@ -406,6 +406,16 @@ out:
return ret;
 }
 
+static int spufs_run_open(struct inode *inode, struct file *file)
+{
+   struct spufs_inode_info *i = SPUFS_I(inode);
+   file->private_data = i->i_ctx;
+
+   i->i_spu_run = spufs_run_spu;
+
+   return nonseekable_open(inode, file);
+}
+
 struct spufs_run_arg {
u32 npc;/* inout: Next Program Counter */
u32 status; /* out:   SPU status */
@@ -472,7 +482,7 @@ static long spufs_run_ioctl(struct file 
 }
 
 static struct file_operations spufs_run_fops = {
-   .open   = spufs_pipe_open,
+   .open   = spufs_run_open,
.unlocked_ioctl = spufs_run_ioctl,
.compat_ioctl   = spufs_run_ioctl,
.read   = spufs_run_read,
--- linux-cg.orig/fs/spufs/spufs.h  2005-08-25 23:12:31.261916184 -0400
+++ linux-cg/fs/spufs/spufs.h   2005-08-25 23:12:36.584905696 -0400
@@ -46,6 +46,8 @@ struct spu_context {
 
 struct spufs_inode_info {
struct spu_context *i_ctx;
+   long (*i_spu_run)(struct file *filp, struct spu_context *ctx,
+   u32 *npc, u32 *result);
struct inode vfs_inode;
 };
 #define SPUFS_I(inode) \
--- linux-cg.orig/fs/spufs/spurun.c 1969-12-31 19:00:00.0 -0500
+++ linux-cg/fs/spufs/spurun.c  2005-08-25 23:12:36.585905544 -0400
@@ -0,0 +1,96 @@
+/*
+ * SPU file system -- run system call
+ *
+ * (C) Copyright IBM Deutschland Entwicklung GmbH 2005
+ *
+ * Author: Arnd Bergmann <[EMAIL PROTECTED]>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2, or (at your option)
+ * any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+#include "spufs.h"
+
+/**
+ * sys_spu_run - run code loaded into an SPU
+ *
+ * @unpc:next program counter for the SPU
+ * @ustatus: status of the SPU
+ *
+ * This system call transfers the control of execution of a
+ * user space thread to an SPU. It will return when the
+ * SPU has finished executing or when it hits an error
+ * condition and it will be interrupted if a signal needs
+ * to be delivered to a handler in user space.
+ *
+ * The next program counter is set to the passed value
+ * before the SPU starts fetching code and the user space
+ * pointer gets updated with the new value when returning

Re: PowerOP Take 2 0/3 Intro

2005-08-25 Thread Todd Poynor


Jordan Crouse wrote:

> Todd - do you have a ChangeLog from Take 1? :)


Right, here's what's changed in this version...

The generic structure of an operating point as an array of integers is 
dropped.  A struct powerop_point is now an entirely backend-defined 
struct of arbitrary fields.


There is no more PowerOP core layer; all data structures and functions 
for core functionality are provided by the machine-specific backend.


The diagnostic sysfs UI has been split out into a separate, optional 
patch.  A more full-featured UI allowing operating point creation and 
activation via sysfs has also been provided in that patch.  This UI 
primarily serves as an example for experimentation purposes, but is 
pretty close to what a basic userspace-based policy manager might need 
to switch operating points in response to infrequent changes in system 
state.


The UI also embodies the notion of a list of "named operating points" 
that could be registered by other means, such as loading a module with 
data structures that encode the desired operating points (as David 
Brownell has suggested).  The named operating points registered from 
such other interfaces can also be activated from the sysfs UI (that is, 
the hardware can be told to run at that operating point), as an example 
of how to tie in userspace policy managers with such a scheme.
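
To make the idea concrete, a backend-defined operating point plus a list
of named operating points could be sketched in plain C as below. Only the
name struct powerop_point comes from the description above; the two fields,
the registry and every sketch_* function are invented for illustration and
are not taken from the patches.

```c
#include <stddef.h>
#include <string.h>

/*
 * Backend-defined operating point.  The fields are hypothetical; a real
 * backend would list whatever power parameters its hardware exposes.
 */
struct powerop_point {
        int cpu_mhz;
        int core_mv;
};

struct sketch_named_point {
        const char *name;
        struct powerop_point point;
};

#define SKETCH_MAX_POINTS 8

static struct sketch_named_point sketch_points[SKETCH_MAX_POINTS];
static size_t sketch_npoints;
static struct powerop_point sketch_current;

/* Register a named operating point; returns 0 on success, -1 if full. */
static int sketch_register_point(const char *name, struct powerop_point p)
{
        if (sketch_npoints == SKETCH_MAX_POINTS)
                return -1;
        sketch_points[sketch_npoints].name = name;
        sketch_points[sketch_npoints].point = p;
        sketch_npoints++;
        return 0;
}

static const struct powerop_point *sketch_find_point(const char *name)
{
        for (size_t i = 0; i < sketch_npoints; i++)
                if (strcmp(sketch_points[i].name, name) == 0)
                        return &sketch_points[i].point;
        return NULL;
}

/*
 * "Activate" a point by name.  A real backend would program the hardware
 * here; the sketch only records the selection.
 */
static int sketch_activate_point(const char *name)
{
        const struct powerop_point *p = sketch_find_point(name);

        if (!p)
                return -1;
        sketch_current = *p;
        return 0;
}
```

A userspace policy manager would then only deal in names such as "fast"
or "slow", while the set of parameters behind each name stays entirely in
the backend.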


The example platform backend this time is for an embedded system: the TI 
OMAP1 family of processors used for numerous mobile phones and PDAs.  It 
may better illustrate why managing multiple power parameters might be a 
useful capability.  I haven't put out an example of cpufreq integration 
this time, but the idea has changed little from before.


In case it's getting lost in all these details, the main point of all 
this is to pose the question: "are arbitrary power parameters arranged 
into a set with mutually consistent values (called here an operating 
point) a good low-level abstraction for system power management of a 
wide variety of platforms?"  Thanks,


--
Todd

[PATCH 6/7] spufs: allow O_ASYNC on mailbox files

2005-08-25 Thread Arnd Bergmann

This patch makes it possible to receive user-defined
signals when the spufs ibox and wbox files are accessed
from an SPE, so data can be read/written from/to
them again.

Unfortunately, this kind of messes with the layering
of the high- and low-level parts of the code, so I'm
currently thinking about dropping the functionality
again.

If anyone has strong opinions on how useful or harmful
this patch is, please tell me.
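
For reference, this is roughly how a user-space program would ask for
SIGIO on a descriptor that supports fasync. It is a generic sketch using
the standard fcntl() interface; the function and handler names are
illustrative, and the caller is assumed to have opened the ibox or wbox
file itself (no spufs path is assumed here).

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

static volatile sig_atomic_t sketch_sigio_seen;

static void sketch_on_sigio(int signo)
{
        (void)signo;
        sketch_sigio_seen = 1;  /* only set a flag in the handler */
}

/*
 * Ask the kernel to send SIGIO to this process when fd becomes readable
 * or writable.  Returns 0 on success, -1 on error (errno is set).
 */
static int sketch_enable_sigio(int fd)
{
        struct sigaction sa;
        int flags;

        memset(&sa, 0, sizeof(sa));
        sa.sa_handler = sketch_on_sigio;
        sigemptyset(&sa.sa_mask);
        if (sigaction(SIGIO, &sa, NULL) < 0)
                return -1;

        /* Direct the signal at ourselves. */
        if (fcntl(fd, F_SETOWN, getpid()) < 0)
                return -1;

        /* Setting O_ASYNC is what ends up in the driver's .fasync method. */
        flags = fcntl(fd, F_GETFL);
        if (flags < 0)
                return -1;
        if (fcntl(fd, F_SETFL, flags | O_ASYNC) < 0)
                return -1;
        return 0;
}
```

After this call the program can do other work and check
sketch_sigio_seen, then retry the read or write on the mailbox file.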

Signed-off-by: Arnd Bergmann <[EMAIL PROTECTED]>

--

 arch/ppc64/kernel/spu_base.c |7 +++
 fs/spufs/file.c  |   16 
 include/asm-ppc64/spu.h  |2 ++
 3 files changed, 25 insertions(+)

--- linux-cg.orig/arch/ppc64/kernel/spu_base.c  2005-08-25 23:12:30.395914664 
-0400
+++ linux-cg/arch/ppc64/kernel/spu_base.c   2005-08-25 23:12:46.857941720 
-0400
@@ -136,6 +136,7 @@ static int __spu_trap_data_map(struct sp
 static int __spu_trap_mailbox(struct spu *spu)
 {
wake_up_all(&spu->ibox_wq);
+   kill_fasync(&spu->ibox_fasync, SIGIO, POLLIN);
 
/* atomically disable SPU mailbox interrupts */
spin_lock(&spu->register_lock);
@@ -171,6 +172,7 @@ static int __spu_trap_tag_group(struct s
 static int __spu_trap_spubox(struct spu *spu)
 {
wake_up_all(&spu->wbox_wq);
+   kill_fasync(&spu->wbox_fasync, SIGIO, POLLOUT);
 
/* atomically disable SPU mailbox interrupts */
spin_lock(&spu->register_lock);
@@ -394,6 +396,8 @@ EXPORT_SYMBOL(spu_alloc);
 void spu_free(struct spu *spu)
 {
down(&spu_mutex);
+   spu->ibox_fasync = NULL;
+   spu->wbox_fasync = NULL;
list_add_tail(&spu->list, &spu_list);
up(&spu_mutex);
 }
@@ -676,6 +680,9 @@ static int __init create_spu(struct devi
init_waitqueue_head(&spu->wbox_wq);
init_waitqueue_head(&spu->ibox_wq);
 
+   spu->ibox_fasync = NULL;
+   spu->wbox_fasync = NULL;
+
down(&spu_mutex);
spu->number = number++;
ret = spu_request_irqs(spu);
--- linux-cg.orig/fs/spufs/file.c   2005-08-25 23:12:36.584905696 -0400
+++ linux-cg/fs/spufs/file.c2005-08-25 23:12:46.858941568 -0400
@@ -198,6 +198,13 @@ size_t spu_ibox_read(struct spu *spu, u3
 }
 EXPORT_SYMBOL(spu_ibox_read);
 
+static int spufs_ibox_fasync(int fd, struct file *file, int on)
+{
+   struct spu_context *ctx;
+   ctx = file->private_data;
+   return fasync_helper(fd, file, on, &ctx->spu->ibox_fasync);
+}
+
 static ssize_t spufs_ibox_read(struct file *file, char __user *buf,
size_t len, loff_t *pos)
 {
@@ -253,6 +260,7 @@ static struct file_operations spufs_ibox
.open   = spufs_pipe_open,
.read   = spufs_ibox_read,
.poll   = spufs_ibox_poll,
+   .fasync = spufs_ibox_fasync,
 };
 
 static ssize_t spufs_ibox_stat_read(struct file *file, char __user *buf,
@@ -302,6 +310,13 @@ size_t spu_wbox_write(struct spu *spu, u
 }
 EXPORT_SYMBOL(spu_wbox_write);
 
+static int spufs_wbox_fasync(int fd, struct file *file, int on)
+{
+   struct spu_context *ctx;
+   ctx = file->private_data;
+   return fasync_helper(fd, file, on, &ctx->spu->wbox_fasync);
+}
+
 static ssize_t spufs_wbox_write(struct file *file, const char __user *buf,
size_t len, loff_t *pos)
 {
@@ -353,6 +368,7 @@ static struct file_operations spufs_wbox
.open   = spufs_pipe_open,
.write  = spufs_wbox_write,
.poll   = spufs_wbox_poll,
+   .fasync = spufs_wbox_fasync,
 };
 
 static ssize_t spufs_wbox_stat_read(struct file *file, char __user *buf,
--- linux-cg.orig/include/asm-ppc64/spu.h   2005-08-25 23:12:30.400913904 
-0400
+++ linux-cg/include/asm-ppc64/spu.h2005-08-25 23:12:46.860941264 -0400
@@ -127,6 +127,8 @@ struct spu {
wait_queue_head_t stop_wq;
wait_queue_head_t ibox_wq;
wait_queue_head_t wbox_wq;
+   struct fasync_struct *ibox_fasync;
+   struct fasync_struct *wbox_fasync;
 
char irq_c0[8];
char irq_c1[8];


[PATCH 4/7] spufs: spu-side context switch code

2005-08-25 Thread Arnd Bergmann

Add the source code that is used to generate spu_save_dump.h and
spu_restore_dump.h. Since a full spu tool chain is needed to
generate these files, the default remains to use the shipped
versions in order to keep the number of tools for building the
kernel down.

From: Mark Nutter <[EMAIL PROTECTED]>
Signed-off-by: Arnd Bergmann <[EMAIL PROTECTED]>

--

 Makefile   |   46 +++
 spu_restore.c  |  336 +
 spu_restore_crt0.S |  116 ++
 spu_save.c |  203 
 spu_save_crt0.S|  102 
 spu_utils.h|  160 +
 6 files changed, 963 insertions(+)

--- linux-cg.orig/fs/spufs/Makefile 2005-08-16 20:11:10.396001984 -0400
+++ linux-cg/fs/spufs/Makefile  2005-08-16 20:11:43.730998928 -0400
@@ -2,4 +2,50 @@ obj-$(CONFIG_SPU_FS) += spufs.o
 
 spufs-y += inode.o file.o context.o switch.o
 
+# Rules to build switch.o with the help of SPU tool chain
+SPU_CROSS  := spu-
+SPU_CC := $(SPU_CROSS)gcc
+SPU_AS := $(SPU_CROSS)gcc
+SPU_LD := $(SPU_CROSS)ld
+SPU_READELF:= $(SPU_CROSS)readelf
+SPU_CFLAGS := -O2 -Wall -I$(srctree)/include -I$(objtree)/include2
+SPU_AFLAGS := -c -D__ASSEMBLY__ -I$(srctree)/include -I$(objtree)/include2
+SPU_LDFLAGS:= -N -Ttext=0x0
+
 $(obj)/switch.o: $(obj)/spu_save_dump.h $(obj)/spu_restore_dump.h
+
+# Compile SPU files
+  cmd_spu_cc = $(SPU_CC) $(SPU_CFLAGS) -c -o $@ $<
+quiet_cmd_spu_cc = SPU_CC  $@
+$(obj)/spu_%.o: $(src)/spu_%.c
+   $(call if_changed,spu_cc)
+
+# Assemble SPU files
+  cmd_spu_as = $(SPU_AS) $(SPU_AFLAGS) -o $@ $<
+quiet_cmd_spu_as = SPU_AS  $@
+$(obj)/spu_%.o: $(src)/spu_%.S
+   $(call if_changed,spu_as)
+
+# Link SPU Executables
+  cmd_spu_ld = $(SPU_LD) $(SPU_LDFLAGS) -o $@ $^
+quiet_cmd_spu_ld = SPU_LD  $@
+$(obj)/spu_%: $(obj)/spu_%.o $(obj)/spu_%_crt0.o
+   $(call if_changed,spu_ld)
+
+# create C code from ELF executable
+cmd_hexdump   = ( \
+   echo "/*" ; \
+   echo " * $@: Copyright (C) 2005 IBM." ; \
+   echo " * Hex-dump auto generated from $<.c." ; \
+   echo " * Do not edit!" ; \
+   echo " */" ; \
+   echo "static unsigned int $*_code[] __page_aligned = {" ; \
+   $(SPU_READELF) -x1 -x2 $< | \
+   grep -v "Hex dump of section" | \
+   grep -v "^$$" | \
+   $(AWK) -- '{ print "0x"$$2", 0x"$$3", 0x"$$4", 0x"$$5", " }' ; \
+   echo "};" ; \
+   ) > $@
+quiet_cmd_hexdump = HEXDUMP $@
+$(obj)/%_dump.h: $(obj)/%
+   $(call if_changed,hexdump)
--- linux-cg.orig/fs/spufs/spu_restore.c1969-12-31 19:00:00.0 
-0500
+++ linux-cg/fs/spufs/spu_restore.c 2005-08-16 20:11:43.737997864 -0400
@@ -0,0 +1,336 @@
+/*
+ * spu_restore.c
+ *
+ * (C) Copyright IBM Corp. 2005
+ *
+ * SPU-side context restore sequence outlined in
+ * Synergistic Processor Element Book IV
+ *
+ * Author: Mark Nutter <[EMAIL PROTECTED]>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2, or (at your option)
+ * any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ *
+ */
+
+
+#ifndef LS_SIZE
+#define LS_SIZE 0x40000 /* 256K (in bytes) */
+#endif
+
+typedef unsigned int u32;
+typedef unsigned long long u64;
+
+#include 
+#include 
+#include "spu_utils.h"
+
+#define BR_INSTR   0x327fff80  /* br -4 */
+#define NOP_INSTR  0x4020  /* nop   */
+#define HEQ_INSTR  0x7b00  /* heq $0, $0*/
+#define STOP_INSTR 0x  /* stop 0x0  */
+#define ILLEGAL_INSTR  0x0080  /* illegal instr */
+#define RESTORE_COMPLETE   0x3ffc  /* stop 0x3ffc   */
+
+static inline void fetch_regs_from_mem(addr64 lscsa_ea)
+{
+   unsigned int ls = (unsigned int)&regs_spill[0];
+   unsigned int size = sizeof(regs_spill);
+   unsigned int tag_id = 0;
+   unsigned int cmd = 0x40;/* GET */
+
+   spu_writech(mfc_ls_addr, ls);
+   spu_writech(mfc_ea_hi, lscsa_ea.ui[0]);
+   spu_writech(mfc_ea_low, lscsa_ea.ui[1]);
+   spu_writech(mfc_dma_size, size);
+   spu_writech(mfc_tag_id, tag_id);
+   spu_writech(mfc_cmd_queue, cmd);
+}
+
+static inline void restore_upper_240kb(addr64 lscsa_ea)
+{
+   unsigned

[PATCH 7/7] spufs: Add a register file for the debugger

2005-08-25 Thread Arnd Bergmann

In order to debug spu threads, we need access to the registers
of the running SPU. Unfortunately, this is only possible when
the SPU context is saved to memory.

This patch adds operations that enable accessing an SPU
in either runnable or saved state. We use an RW semaphore
to protect the state of the SPU from changing underneath
us, while we are holding it readable. In order to change
the state, it is acquired writeable and a context save
or restore is executed before downgrading the semaphore
to read-only.

Future schedulers will likely be built on top of this.

From: Ulrich Weigand <[EMAIL PROTECTED]>
Signed-off-by: Arnd Bergmann <[EMAIL PROTECTED]>

--

 context.c   |   48 
 file.c  |  239 
 spufs.h |3 
 4 files changed, 243 insertions(+), 47 deletions(-)

--- linux-cg.orig/fs/spufs/context.c2005-08-25 23:12:20.725920136 -0400
+++ linux-cg/fs/spufs/context.c 2005-08-25 23:12:52.415895512 -0400
@@ -53,6 +53,8 @@ struct spu_context *alloc_spu_context(vo
init_rwsem(&ctx->backing_sema);
spin_lock_init(&ctx->mmio_lock);
kref_init(&ctx->kref);
+   init_rwsem(&ctx->state_sema);
+   ctx->state = SPU_STATE_SAVED;
goto out;
 out_free:
kfree(ctx);
@@ -82,4 +84,50 @@ void put_spu_context(struct spu_context 
kref_put(>kref, _spu_context);
 }
 
+void spu_acquire(struct spu_context *ctx)
+{
+   down_read(&ctx->state_sema);
+}
+
+void spu_release(struct spu_context *ctx)
+{
+   up_read(&ctx->state_sema);
+}
+
+void spu_acquire_runnable(struct spu_context *ctx)
+{
+   down_read(&ctx->state_sema);
+
+   if (ctx->state == SPU_STATE_RUNNABLE
+   || ctx->state == SPU_STATE_LOCKED)
+   return;
+
+   up_read(&ctx->state_sema);
+   down_write(&ctx->state_sema);
 
+   if (ctx->state == SPU_STATE_SAVED) {
+   spu_restore(&ctx->csa, ctx->spu);
+   ctx->state = SPU_STATE_RUNNABLE;
+   }
+
+   downgrade_write(&ctx->state_sema);
+}
+
+void spu_acquire_saved(struct spu_context *ctx)
+{
+   down_read(&ctx->state_sema);
+
+   if (ctx->state == SPU_STATE_SAVED
+   || ctx->state == SPU_STATE_LOCKED)
+   return;
+
+   up_read(&ctx->state_sema);
+   down_write(&ctx->state_sema);
+
+   if (ctx->state == SPU_STATE_RUNNABLE) {
+   spu_save(&ctx->csa, ctx->spu);
+   ctx->state = SPU_STATE_SAVED;
+   }
+
+   downgrade_write(&ctx->state_sema);
+}
--- linux-cg.orig/fs/spufs/file.c   2005-08-25 23:12:46.858941568 -0400
+++ linux-cg/fs/spufs/file.c2005-08-25 23:12:52.418895056 -0400
@@ -32,6 +32,7 @@
 
 #include "spufs.h"
 
+
 static int
 spufs_mem_open(struct inode *inode, struct file *file)
 {
@@ -44,23 +45,22 @@ static ssize_t
 spufs_mem_read(struct file *file, char __user *buffer,
size_t size, loff_t *pos)
 {
-   struct spu *spu;
-   struct spu_context *ctx;
+   struct spu_context *ctx = file->private_data;
+   char *local_store;
int ret;
 
-   ctx = file->private_data;
-   spu = ctx->spu;
-
+   spu_acquire(ctx);
down_read(&ctx->backing_sema);
-   if (spu->number & 0/*1*/) {
-   ret = generic_file_read(file, buffer, size, pos);
-   goto out;
-   }
 
-   ret = simple_read_from_buffer(buffer, size, pos,
-   spu->local_store, LS_SIZE);
-out:
+   if (ctx->state == SPU_STATE_SAVED)
+   local_store = ctx->csa.lscsa->ls;
+   else
+   local_store = ctx->spu->local_store;
+
+   ret = simple_read_from_buffer(buffer, size, pos, local_store, LS_SIZE);
+
up_read(&ctx->backing_sema);
+   spu_release(ctx);
return ret;
 }
 
@@ -69,17 +69,28 @@ spufs_mem_write(struct file *file, const
size_t size, loff_t *pos)
 {
struct spu_context *ctx = file->private_data;
-   struct spu *spu = ctx->spu;
-
-   if (spu->number & 0) //1)
-   return generic_file_write(file, buffer, size, pos);
+   char *local_store;
+   int ret;
 
size = min_t(ssize_t, LS_SIZE - *pos, size);
if (size <= 0)
return -EFBIG;
*pos += size;
-   return copy_from_user(spu->local_store + *pos - size,
-   buffer, size) ? -EFAULT : size;
+   
+   spu_acquire(ctx);
+   down_read(&ctx->backing_sema);
+
+   if (ctx->state == SPU_STATE_SAVED)
+   local_store = ctx->csa.lscsa->ls;
+   else
+   local_store = ctx->spu->local_store;
+
+   ret = copy_from_user(local_store + *pos - size,
+buffer, size) ? -EFAULT : size;
+
+   up_read(&ctx->backing_sema);
+   spu_release(ctx);
+   return ret;
 }
 
 static int
@@ -88,9 +99,9 @@ spufs_mem_mmap(struct file *file, struct
struct spu_context *ctx = file->private_data;
struct spu *spu = ctx->spu;
unsigned long pfn;

[PATCH 0/7] Cell SPU file system, snapshot 4

2005-08-25 Thread Arnd Bergmann

Thankfully, there is now documentation available to the world about
the Cell architecture (http://cell.scei.co.jp/e_download.html), so I
am now able to disclose more of our work on the SPU file system.

This is a rather big update compared to the previous version, as it
contains work from Mark Nutter and Ulrich Weigand to support context
save and restore of SPUs. This release should still be fully compatible
with the previous ones, but we intend to make incompatible changes
in the future.

Arnd <><


[PATCH 2/7] spufs: switchable spu contexts

2005-08-25 Thread Arnd Bergmann

Add some infrastructure for saving and restoring the context of an
SPE. This patch creates a new structure that can hold the whole
state of a physical SPE in memory. It also contains code that
avoids races during the context switch and the binary code that
is loaded to the SPU in order to access its registers.

The actual PPE- and SPE-side context switch code are two separate
patches.

From: Mark Nutter <[EMAIL PROTECTED]>
Signed-off-by: Arnd Bergmann <[EMAIL PROTECTED]>

--

 arch/ppc64/kernel/spu_base.c|   27 ++
 fs/spufs/Makefile   |4 
 fs/spufs/context.c  |   18 +
 fs/spufs/spu_restore_dump.h_shipped |  342 
 fs/spufs/spu_save_dump.h_shipped|  290 ++
 fs/spufs/spufs.h|2 
 fs/spufs/switch.c   |  174 ++
 include/asm-ppc64/spu.h |   76 
 include/asm-ppc64/spu_csa.h |  256 ++
 9 files changed, 1185 insertions(+), 4 deletions(-)

diff -urN linux-2.6.13-rc6.fix/arch/ppc64/kernel/spu_base.c linux-2.6.13-rc6/arch/ppc64/kernel/spu_base.c
--- linux-2.6.13-rc6.fix/arch/ppc64/kernel/spu_base.c   2005-08-23 00:25:53.201118036 +0200
+++ linux-2.6.13-rc6/arch/ppc64/kernel/spu_base.c   2005-08-23 00:21:50.242294578 +0200
@@ -62,7 +62,9 @@
 static void spu_restart_dma(struct spu *spu)
 {
struct spu_priv2 __iomem *priv2 = spu->priv2;
-   out_be64(&priv2->mfc_control_RW, MFC_CNTL_RESTART_DMA_COMMAND);
+
+   if (!test_bit(SPU_CONTEXT_SWITCH_PENDING_nr, &spu->flags))
+   out_be64(&priv2->mfc_control_RW, MFC_CNTL_RESTART_DMA_COMMAND);
 }
 
 static int __spu_trap_data_seg(struct spu *spu, unsigned long ea)
@@ -72,6 +74,11 @@
 
pr_debug("%s\n", __FUNCTION__);
 
+   if (test_bit(SPU_CONTEXT_SWITCH_ACTIVE_nr, &spu->flags)) {
+   printk("%s: invalid access during switch!\n", __func__);
+   return 1;
+   }
+
if (REGION_ID(ea) != USER_REGION_ID) {
pr_debug("invalid region access at %016lx\n", ea);
return 1;
@@ -98,6 +105,7 @@
return 0;
 }  
 
+extern int hash_page(unsigned long ea, unsigned long access, unsigned long trap); //XXX
 static int __spu_trap_data_map(struct spu *spu, unsigned long ea)
 {
unsigned long dsisr;
@@ -107,8 +115,21 @@
priv1 = spu->priv1;
dsisr = in_be64(&priv1->mfc_dsisr_RW);
 
-   wake_up(&spu->stop_wq);
+   /* Handle kernel space hash faults immediately.
+  User hash faults need to be deferred to process context. */
+   if ((dsisr & MFC_DSISR_PTE_NOT_FOUND)
+   && REGION_ID(ea) != USER_REGION_ID
+   && hash_page(ea, _PAGE_PRESENT, 0x300) == 0) {
+   spu_restart_dma(spu);
+   return 0;
+   }
+
+   if (test_bit(SPU_CONTEXT_SWITCH_ACTIVE_nr, &spu->flags)) {
+   printk("%s: invalid access during switch!\n", __func__);
+   return 1;
+   }
 
+   wake_up(&spu->stop_wq);
return 0;
 }
 
@@ -378,7 +399,6 @@
 }
 EXPORT_SYMBOL(spu_free);
 
-extern int hash_page(unsigned long ea, unsigned long access, unsigned long trap); //XXX
 static int spu_handle_mm_fault(struct spu *spu)
 {
struct spu_priv1 __iomem *priv1;
@@ -646,6 +666,7 @@
spu->slb_replace = 0;
spu->mm = NULL;
spu->class_0_pending = 0;
+   spu->flags = 0UL;
spin_lock_init(&spu->register_lock);
 
out_be64(&spu->priv1->mfc_sdr_RW, mfspr(SPRN_SDR1));
diff -urN linux-2.6.13-rc6.fix/fs/spufs/Makefile linux-2.6.13-rc6/fs/spufs/Makefile
--- linux-2.6.13-rc6.fix/fs/spufs/Makefile  2005-08-22 23:47:52.087135000 +0200
+++ linux-2.6.13-rc6/fs/spufs/Makefile  2005-08-22 23:50:30.742186383 +0200
@@ -1,3 +1,5 @@
 obj-$(CONFIG_SPU_FS) += spufs.o
 
-spufs-y += inode.o file.o context.o
+spufs-y += inode.o file.o context.o switch.o
+
+$(obj)/switch.o: $(obj)/spu_save_dump.h $(obj)/spu_restore_dump.h
diff -urN linux-2.6.13-rc6.fix/fs/spufs/context.c linux-2.6.13-rc6/fs/spufs/context.c
--- linux-2.6.13-rc6.fix/fs/spufs/context.c 2005-08-22 23:47:52.088135000 +0200
+++ linux-2.6.13-rc6/fs/spufs/context.c 2005-08-22 23:50:30.743186743 +0200
@@ -22,6 +22,7 @@
 
 #include 
 #include 
+#include 
 #include "spufs.h"
 
 struct spu_context *alloc_spu_context(void)
@@ -30,9 +31,25 @@
ctx = kmalloc(sizeof *ctx, GFP_KERNEL);
if (!ctx)
goto out;
+   /* Future enhancement: do not call spu_alloc()
+* here.  This step should be deferred until
+* spu_run()!!
+*
+* More work needs to be done to read(),
+* write(), mmap(), etc., so that operations
+* are performed on CSA when the context is
+* not currently being run.  In this way we
+* can support arbitrarily large number of
+* entries in /spu, allow state queries, etc.
+*/
ctx->spu = spu_alloc();
if (!ctx->spu)
goto

Re: [PATCH] drivers/hwmon/*: kfree() correct pointers

2005-08-25 Thread Jean Delvare

Hi Alexey,

> The adm9240 driver, in adm9240_detect(), allocates a structure.  The
> error path attempts to kfree() ->client field of it (second one),
> resulting in an oops (or slab corruption) if the hardware is not
> present.
> 
> ->client field in adm1026, adm1031, smsc47b397 and smsc47m1 is the
> first in ${HWMON}_data structure, but fix them too.

Already fixed in Greg's i2c tree and -mm for quite some time now...

http://www.kernel.org/pub/linux/kernel/people/gregkh/gregkh-2.6/gregkh-02-i2c/i2c-hwmon-class-01.patch

Thanks,
-- 
Jean Delvare

[PATCH] bogus function type in qdio

2005-08-25 Thread Al Viro

In qdio_get_micros() volatile in return type is plain noise (even with old
gccisms it would make no sense - noreturn function returning __u64 is a
bit odd ;-)

Signed-off-by: Al Viro <[EMAIL PROTECTED]>

diff -urN RC13-rc7-emac-iounmap/drivers/s390/cio/qdio.c RC13-rc7-attr-misc/drivers/s390/cio/qdio.c
--- RC13-rc7-emac-iounmap/drivers/s390/cio/qdio.c   2005-08-24 01:58:29.0 -0400
+++ RC13-rc7-attr-misc/drivers/s390/cio/qdio.c  2005-08-25 00:54:22.0 -0400
@@ -112,7 +112,7 @@
 
 /* SCRUBBER HELPER ROUTINES **/
 
-static inline volatile __u64 
+static inline __u64 
 qdio_get_micros(void)
 {
 return (get_clock() >> 10); /* time>>12 is microseconds */

[PATCH] bogus iounmap() in emac

2005-08-25 Thread Al Viro

Dumb typo: iounmap(&_pointer_variable).

Signed-off-by: Al Viro <[EMAIL PROTECTED]>

diff -urN RC13-rc7-m68k-adb.patch/drivers/net/ibm_emac/ibm_emac_core.c RC13-rc7-emac-iounmap/drivers/net/ibm_emac/ibm_emac_core.c
--- RC13-rc7-m68k-adb.patch/drivers/net/ibm_emac/ibm_emac_core.c    2005-08-24 01:58:29.0 -0400
+++ RC13-rc7-emac-iounmap/drivers/net/ibm_emac/ibm_emac_core.c  2005-08-25 00:54:21.0 -0400
@@ -1253,7 +1253,7 @@
 TAH_MR_CVR | TAH_MR_ST_768 | TAH_MR_TFS_10KB | TAH_MR_DTFP |
 TAH_MR_DIG);
 
-   iounmap(&tahp);
+   iounmap(tahp);
 
return 0;
 }

Re: 2.6.13-rc7-rt1

2005-08-25 Thread Ingo Molnar


* Daniel Walker <[EMAIL PROTECTED]> wrote:

> @@ -257,6 +257,7 @@ void check_preempt_wakeup(struct task_st
>* hangs and race conditions.
>*/
>   if (!preempt_count() &&
> + !__raw_irqs_disabled() &&
>   p->prio < current->prio &&
>   rt_task(p) &&
>   (current->rcu_read_lock_nesting != 0 ||

did you get a false positive? If yes, in what code/driver?

Ingo

Asus a8v-e Deluxe lockups

2005-08-25 Thread Lawrence Walton


Hi!

 I just switched out motherboards and CPUs, from an Asus K8V SE Deluxe
to an Asus A8V-E Deluxe, and from a 754-pin 3200+ to a 939-pin 3200+.
I am now having some fairly serious instability issues. The system
locks up completely with no oops. After disabling CONFIG_HIGHMEM and
CONFIG_PREEMPT_VOLUNTARY the system is considerably more stable, but not
completely.

These lockups are, I guess, PCI/SCSI/FS related. They are most
often triggered by, of all things, mutt. Compiling the kernel, bzipping
files and general I/O seem fine, but reading my mail with mutt
triggers the lockup.

Things that are out of the ordinary about this machine:

* PCI-E 

* SCSI card (LSI 53c1030 Fusion-MPT) 

* SCSI disk 

* X does not work at this time. (I've not had a stable machine 
to troubleshoot.)

I have switched out RAM, power supplies, CPU fans, and network
cards.

I'm up for suggestions before I switch back to the old motherboard
and CPU.

.config, output from ver_linux, and lspci -vvv are included.

#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.13-rc7
# Thu Aug 25 12:39:58 2005
#
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_UID16=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y
# CONFIG_CLEAN_COMPILE is not set
CONFIG_BROKEN=y
CONFIG_BROKEN_ON_SMP=y
CONFIG_INIT_ENV_ARG_LIMIT=32

#
# General setup
#
CONFIG_LOCALVERSION=""
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_POSIX_MQUEUE=y
CONFIG_BSD_PROCESS_ACCT=y
# CONFIG_BSD_PROCESS_ACCT_V3 is not set
CONFIG_SYSCTL=y
# CONFIG_AUDIT is not set
CONFIG_HOTPLUG=y
CONFIG_KOBJECT_UEVENT=y
# CONFIG_IKCONFIG is not set
# CONFIG_EMBEDDED is not set
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_ALL=y
# CONFIG_KALLSYMS_EXTRA_PASS is not set
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_SHMEM=y
CONFIG_CC_ALIGN_FUNCTIONS=0
CONFIG_CC_ALIGN_LABELS=0
CONFIG_CC_ALIGN_LOOPS=0
CONFIG_CC_ALIGN_JUMPS=0
# CONFIG_TINY_SHMEM is not set
CONFIG_BASE_SMALL=0

#
# Loadable module support
#
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
# CONFIG_MODULE_FORCE_UNLOAD is not set
CONFIG_OBSOLETE_MODPARM=y
# CONFIG_MODVERSIONS is not set
# CONFIG_MODULE_SRCVERSION_ALL is not set
CONFIG_KMOD=y

#
# Processor type and features
#
CONFIG_X86_PC=y
# CONFIG_X86_ELAN is not set
# CONFIG_X86_VOYAGER is not set
# CONFIG_X86_NUMAQ is not set
# CONFIG_X86_SUMMIT is not set
# CONFIG_X86_BIGSMP is not set
# CONFIG_X86_VISWS is not set
# CONFIG_X86_GENERICARCH is not set
# CONFIG_X86_ES7000 is not set
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
# CONFIG_MPENTIUMII is not set
# CONFIG_MPENTIUMIII is not set
# CONFIG_MPENTIUMM is not set
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
# CONFIG_MK7 is not set
CONFIG_MK8=y
# CONFIG_MCRUSOE is not set
# CONFIG_MEFFICEON is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MGEODEGX1 is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_MVIAC3_2 is not set
# CONFIG_X86_GENERIC is not set
CONFIG_X86_CMPXCHG=y
CONFIG_X86_XADD=y
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_HPET_TIMER=y
CONFIG_HPET_EMULATE_RTC=y
# CONFIG_SMP is not set
CONFIG_PREEMPT_NONE=y
# CONFIG_PREEMPT_VOLUNTARY is not set
# CONFIG_PREEMPT is not set
CONFIG_X86_UP_APIC=y
CONFIG_X86_UP_IOAPIC=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_TSC=y
CONFIG_X86_MCE=y
CONFIG_X86_MCE_NONFATAL=y
# CONFIG_X86_MCE_P4THERMAL is not set
# CONFIG_TOSHIBA is not set
# CONFIG_I8K is not set
CONFIG_X86_REBOOTFIXUPS=y
# CONFIG_MICROCODE is not set
# CONFIG_X86_MSR is not set
# CONFIG_X86_CPUID is not set

#
# Firmware Drivers
#
# CONFIG_EDD is not set
CONFIG_NOHIGHMEM=y
# CONFIG_HIGHMEM4G is not set
# CONFIG_HIGHMEM64G is not set
CONFIG_SELECT_MEMORY_MODEL=y
CONFIG_FLATMEM_MANUAL=y
# CONFIG_DISCONTIGMEM_MANUAL is not set
# CONFIG_SPARSEMEM_MANUAL is not set
CONFIG_FLATMEM=y
CONFIG_FLAT_NODE_MEM_MAP=y
# CONFIG_MATH_EMULATION is not set
CONFIG_MTRR=y
# CONFIG_EFI is not set
CONFIG_REGPARM=y
CONFIG_SECCOMP=y
# CONFIG_HZ_100 is not set
CONFIG_HZ_250=y
# CONFIG_HZ_1000 is not set
CONFIG_HZ=250
CONFIG_PHYSICAL_START=0x10
# CONFIG_KEXEC is not set

#
# Power management options (ACPI, APM)
#
CONFIG_PM=y
# CONFIG_PM_DEBUG is not set
# CONFIG_SOFTWARE_SUSPEND is not set

#
# ACPI (Advanced Configuration and Power Interface) Support
#
CONFIG_ACPI=y
CONFIG_ACPI_BOOT=y
CONFIG_ACPI_INTERPRETER=y
CONFIG_ACPI_SLEEP=y
CONFIG_ACPI_SLEEP_PROC_FS=y
# CONFIG_ACPI_SLEEP_PROC_SLEEP is not set
CONFIG_ACPI_AC=m
CONFIG_ACPI_BATTERY=m
CONFIG_ACPI_BUTTON=m
CONFIG_ACPI_VIDEO=m
# CONFIG_ACPI_HOTKEY is not set
CONFIG_ACPI_FAN=m
CONFIG_ACPI_PROCESSOR=m
CONFIG_ACPI_THERMAL=m

Re: 2.6.13-rc7-rt1

2005-08-25 Thread Daniel Walker


Wakeup race checking shouldn't trigger when interrupts are off. Here's a
fix.

Daniel

Index: linux-2.6.12/kernel/rt.c
===
--- linux-2.6.12.orig/kernel/rt.c   2005-08-25 21:33:43.0 +
+++ linux-2.6.12/kernel/rt.c2005-08-25 21:44:20.0 +
@@ -257,6 +257,7 @@ void check_preempt_wakeup(struct task_st
 * hangs and race conditions.
 */
if (!preempt_count() &&
+   !__raw_irqs_disabled() &&
p->prio < current->prio &&
rt_task(p) &&
(current->rcu_read_lock_nesting != 0 ||



Re: 2.6.13-rc6: halt instead of reboot

2005-08-25 Thread Meelis Roos



I'm searching my way through changesets.

rc2 was OK, rc3 was broken.
60a762b6a6dec17cc4339b60154902fd04c2f9f2 was OK too - the commit before 
the ACPI merge on 2005-07-12.


Currently compiling 5028770a42e7bc4d15791a44c28f0ad539323807 - the ACPI 
merge commit. Will see tomorrow whether it works.


--
Meelis Roos ([EMAIL PROTECTED])

Re: [patch 8/8] PCI Error Recovery: PPC64 core recovery routines

2005-08-25 Thread Benjamin Herrenschmidt

On Thu, 2005-08-25 at 11:21 -0500, Linas Vepstas wrote:
> On Thu, Aug 25, 2005 at 10:49:03AM +1000, Benjamin Herrenschmidt was heard to 
> remark:
> > 
> > Of course, we'll possibly end up with a different ethX or whatever, but
> 
> Yep, but that's not an issue, since all the various device-naming
> schemes are supposed to be fixing this. It's a distinct problem;
> it needs to be solved even across cold-boots. 

Ok, so what is the problem then ? Why do we have to wait at all ? Why
not just unplug/replug right away ?

> (Didn't I ever tell you about the day I added a new disk controller to
> my system, and /dev/hda became /dev/hde and thus /home was mounted on
> /usr and /var as /etc and all hell broke loose? Owww, device naming
> is a serious issue for home users and even more so for enterprise-class 
> users).
> 
> --linas
> 


Re: Memory problem w/ recent kernels on 2x Opteron with 12 GB RAM

2005-08-25 Thread Rafael J. Wysocki

On Wednesday, 24 of August 2005 23:21, Andi Kleen wrote:
> On Wednesday 24 August 2005 23:08, Rafael J. Wysocki wrote:
> > Hi,
> >
> > I'm currently seeing a memory problem on a NUMA-enabled dual-Opteron 250
> > box with the 2.6.12.5 and 2.6.13-rc* (up to 7) kernels.  Namely, the box
> > has 12 GB of RAM, 8 GB of which is installed on the first node.  The whole
> > memory is detected but then only the first 8 GB of it is made available
> > (minus some hardware-related holes), as though the memory on the second
> > node were discarded for some reason.
> 
> 
> Boot log please?

Sorry for the delay.  The BIOS upgrade has fixed the problem.

Greetings,
Rafael


-- 
- Would you tell me, please, which way I ought to go from here?
- That depends a good deal on where you want to get to.
-- Lewis Carroll "Alice's Adventures in Wonderland"

Re: Need better is_better_time_interpolator() algorithm

2005-08-25 Thread linux

> (frequency) * (1/drift) * (1/latency) * (1/(jitter_factor * cpus))

(Note that 1/cpus, being a constant for all evaluations of this
expression, has no effect on the final ranking.)
The usual way it's done is with some fiddle factors:

quality_a^a * quality_b^b * quality_c^c

Or, equivalently:

a * log(quality_a) + b * log(quality_b) + c * log(quality_c)

Then you use the a, b and c factors to weight the relative importance
of them.  Your suggestion is equivalent to setting all the exponents to 1.

But you can also say that "a is twice as important as b" in a
consistent manner.
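
To make the weighting concrete, here is a small sketch of the sum-of-logs
form. It assumes the per-quality logarithms have already been computed
(with "lower is better" qualities such as drift entered as negative
logs); the function name and the numbers in the usage note are
illustrative only, not taken from any existing interpolator code.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Weighted sum of log-quality terms:
 *   score = weight[0]*log_quality[0] + weight[1]*log_quality[1] + ...
 * A larger score means a better candidate.  Hypothetical helper.
 */
static long long weighted_log_score(const long long *log_quality,
				    const int *weight, size_t n)
{
	long long score = 0;
	size_t i;

	for (i = 0; i < n; i++)
		score += (long long)weight[i] * log_quality[i];
	return score;
}
```

With log terms {10, 4} for one candidate and {8, 5} for another, equal
weights {1, 1} rank the first higher (14 vs 13), while weights {1, 3},
which make the second quality three times as important, flip the order
(22 vs 23).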

Note that computing a few bits of log_2 is not hard to do in integer
math if you're not too anxious about efficiency:

#include <assert.h>

unsigned log2(unsigned x)
{
	unsigned result = 31;
	unsigned i;

	assert(x);
	/* Normalize: shift left until the top bit is set. */
	while (!(x & (1u<<31))) {
		x <<= 1;
		result--;
	}
	/* Think of x as a 1.31-bit fixed-point number, 1 <= x < 2 */
	for (i = 0; i < NUM_FRACTION_BITS; i++) {
		unsigned long long y = x;
		/* Square x and compare to 2. */
		y *= x;
		result <<= 1;
		if (y & (1ull<<63)) {
			result++;
			x = (unsigned)(y >> 32);
		} else {
			x = (unsigned)(y >> 31);
		}
	}
	return result;
}

Setting NUM_FRACTION_BITS to 16 or so would give enough room for
reasonable-sized weights and not have the total overflow 32 bits.

[PATCH] drivers/hwmon/*: kfree() correct pointers

2005-08-25 Thread Alexey Dobriyan

The adm9240 driver, in adm9240_detect(), allocates a structure.  The
error path attempts to kfree() ->client field of it (second one),
resulting in an oops (or slab corruption) if the hardware is not present.

->client field in adm1026, adm1031, smsc47b397 and smsc47m1 is the first in
${HWMON}_data structure, but fix them too.
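
For readers unfamiliar with the pattern, the sketch below shows why
freeing the embedded member is wrong. The structures are simplified,
hypothetical stand-ins (not the real i2c_client or driver data
structures), and plain calloc()/free() stand in for the kernel allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not the real hwmon/i2c structures. */
struct fake_client {
	int addr;
};

struct fake_data {
	int type;			/* first member */
	struct fake_client client;	/* embedded second member */
};

/* Allocate the container and hand back a pointer to the embedded client. */
static struct fake_client *fake_detect(struct fake_data **out)
{
	struct fake_data *data = calloc(1, sizeof(*data));

	if (!data)
		return NULL;
	*out = data;
	return &data->client;
}

/* Error path: free what the allocator returned, never the member. */
static void fake_detect_cleanup(struct fake_data *data)
{
	free(data);	/* free(&data->client) would be undefined behaviour */
}
```

When the embedded member happens to be first in the structure the two
pointers coincide and the wrong call goes unnoticed, which is presumably
why the other four drivers did not oops.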

Signed-off-by: Jonathan Corbet <[EMAIL PROTECTED]>
Signed-off-by: Alexey Dobriyan <[EMAIL PROTECTED]>
---

 drivers/hwmon/adm1026.c|2 +-
 drivers/hwmon/adm1031.c|2 +-
 drivers/hwmon/adm9240.c|2 +-
 drivers/hwmon/smsc47b397.c |2 +-
 drivers/hwmon/smsc47m1.c   |2 +-
 5 files changed, 5 insertions(+), 5 deletions(-)

diff -uprN linux-vanilla/drivers/hwmon/adm1026.c linux-hwmon/drivers/hwmon/adm1026.c
--- linux-vanilla/drivers/hwmon/adm1026.c   2005-08-25 18:57:18.0 +0400
+++ linux-hwmon/drivers/hwmon/adm1026.c 2005-08-26 01:16:07.0 +0400
@@ -1691,7 +1691,7 @@ int adm1026_detect(struct i2c_adapter *a
 
/* Error out and cleanup code */
 exitfree:
-   kfree(new_client);
+   kfree(data);
 exit:
return err;
 }
diff -uprN linux-vanilla/drivers/hwmon/adm1031.c linux-hwmon/drivers/hwmon/adm1031.c
--- linux-vanilla/drivers/hwmon/adm1031.c   2005-08-25 18:57:18.0 +0400
+++ linux-hwmon/drivers/hwmon/adm1031.c 2005-08-26 01:16:26.0 +0400
@@ -834,7 +834,7 @@ static int adm1031_detect(struct i2c_ada
return 0;
 
 exit_free:
-   kfree(new_client);
+   kfree(data);
 exit:
return err;
 }
diff -uprN linux-vanilla/drivers/hwmon/adm9240.c linux-hwmon/drivers/hwmon/adm9240.c
--- linux-vanilla/drivers/hwmon/adm9240.c   2005-08-25 18:57:18.0 +0400
+++ linux-hwmon/drivers/hwmon/adm9240.c 2005-08-26 01:16:40.0 +0400
@@ -616,7 +616,7 @@ static int adm9240_detect(struct i2c_ada
 
return 0;
 exit_free:
-   kfree(new_client);
+   kfree(data);
 exit:
return err;
 }
diff -uprN linux-vanilla/drivers/hwmon/smsc47b397.c linux-hwmon/drivers/hwmon/smsc47b397.c
--- linux-vanilla/drivers/hwmon/smsc47b397.c    2005-08-25 18:57:18.0 +0400
+++ linux-hwmon/drivers/hwmon/smsc47b397.c  2005-08-26 01:21:11.0 +0400
@@ -298,7 +298,7 @@ static int smsc47b397_detect(struct i2c_
return 0;
 
 error_free:
-   kfree(new_client);
+   kfree(data);
 error_release:
release_region(addr, SMSC_EXTENT);
return err;
diff -uprN linux-vanilla/drivers/hwmon/smsc47m1.c linux-hwmon/drivers/hwmon/smsc47m1.c
--- linux-vanilla/drivers/hwmon/smsc47m1.c  2005-08-25 18:57:18.0 +0400
+++ linux-hwmon/drivers/hwmon/smsc47m1.c    2005-08-26 01:21:28.0 +0400
@@ -495,7 +495,7 @@ static int smsc47m1_detect(struct i2c_ad
return 0;
 
 error_free:
-   kfree(new_client);
+   kfree(data);
 error_release:
release_region(address, SMSC_EXTENT);
return err;


Re: 2.6.13-rc7-rt1

2005-08-25 Thread Daniel Walker

On Thu, 2005-08-25 at 19:45 +0200, Ingo Molnar wrote:
> * Daniel Walker <[EMAIL PROTECTED]> wrote:
> 
> > Does anyone have x86_64 working in PREEMPT_RT ?
> 
> builds fine, but doesnt seem to boot at the moment. Havent investigated 
> yet.

I tested an EM64T machine, and it hung during boot. But this patch fixed
it; does it do anything for you?

Daniel

Index: linux-2.6.12/arch/x86_64/kernel/smpboot.c
===
--- linux-2.6.12.orig/arch/x86_64/kernel/smpboot.c  2005-08-25 19:39:04.0 +
+++ linux-2.6.12/arch/x86_64/kernel/smpboot.c   2005-08-25 20:42:38.0 +
@@ -750,7 +750,6 @@ static int __cpuinit do_boot_cpu(int cpu
 
 do_rest:
 
-   cpu_pda[cpu].pcurrent = c_idle.idle;
 
start_rip = setup_trampoline();
 
@@ -789,6 +788,8 @@ do_rest:
apic_read(APIC_ESR);
}
 
+   cpu_pda[cpu].pcurrent = c_idle.idle;
+
/*
 * Status is now clean
 */



Re: [RFC] RT-patch update to remove the global pi_lock

2005-08-25 Thread Daniel Walker

On Thu, 2005-08-25 at 16:09 -0400, Steven Rostedt wrote:

> A word of caution (aka. disclaimer). This is still new.  I still expect
> there are some cases in the code that were missed and can cause a
> deadlock or other bad side effect.  Hopefully, we can iron these all out.
> Also, I noticed that since the task takes its own pi_lock for most of
> the code, if something locks up and an NMI goes off, the down_trylock in
> printk will also lock when it tries to take its own pi_lock.

maybe it's time for ALL_TASKS_PI ?


Daniel


Re: [OT] volatile keyword

2005-08-25 Thread Christopher Friesen


Vadim Lobanov wrote:

> I figured it was something along these lines. In that case, is the
> following code (from kernel/posix-timers.c) really doing the right
> thing?
>
> do
> expires = timr->it_timer.expires;
> while ((volatile long) (timr->it_timer.expires) != expires);
>
> Seems it's casting the value, not the pointer.


Someone else will have to give the definitive answer, but it looks 
suspicious to me...


Chris

[-mm patch] relayfs: upgraded read() implementation

2005-08-25 Thread Tom Zanussi

Hi,

The current relayfs read implementation works fine, but was designed
to be used mainly for 'draining' the buffer after a tracing run.  It
turns out that people really want to be able to read from the buffer
during a live trace, for example the blktrace application submitted
recently:

http://marc.theaimsgroup.com/?l=linux-kernel&m=112480046405961&w=2

Here's an improved read implementation for relayfs which allows for
that.

This version has been tested pretty thoroughly, using both the
blktrace application and a new example I added to the relay-apps
tarball called 'readtest' which is basically a unit test for the read
functionality.  All the tests I've come up with have passed and it
looks pretty solid at this point.  Here's a link to the test code:

http://prdownloads.sourceforge.net/relayfs/relay-apps-0.8.tar.gz?download

Andrew, please apply.

Thanks,

Tom


Signed-off-by: Tom Zanussi <[EMAIL PROTECTED]>

diff -urpN -X dontdiff linux-2.6.13-rc6-mm2/Documentation/filesystems/relayfs.txt linux-2.6.13-rc6-mm2-cur/Documentation/filesystems/relayfs.txt
--- linux-2.6.13-rc6-mm2/Documentation/filesystems/relayfs.txt  2005-08-25 19:28:59.0 -0500
+++ linux-2.6.13-rc6-mm2-cur/Documentation/filesystems/relayfs.txt  2005-08-25 17:07:48.0 -0500
@@ -82,10 +82,15 @@ mmap()   results in channel buffer being 
 memory space. Note that you can't do a partial mmap - you must
 map the entire file, which is NRBUF * SUBBUFSIZE.
 
-read()  read the contents of a channel buffer.  If there are active
-channel writers, results may be unpredictable - users should
-make sure that all logging to the channel has ended before
-using read().
+read()  read the contents of a channel buffer.  The bytes read are
+'consumed' by the reader i.e. they won't be available again
+to subsequent reads.  If the channel is being used in
+no-overwrite mode (the default), it can be read at any time
+even if there's an active kernel writer.  If the channel is
+being used in overwrite mode and there are active channel
+writers, results may be unpredictable - users should make
+sure that all logging to the channel has ended before using
+read() with overwrite mode.
 
 poll()  POLLIN/POLLRDNORM/POLLERR supported.  User applications are
 notified when sub-buffer boundaries are crossed.
@@ -256,8 +261,8 @@ consulted.
 
 The default subbuf_start() implementation, used if the client doesn't
 define any callbacks, or doesn't define the subbuf_start() callback,
-implements the simplest possible 'overwrite' mode i.e. it does nothing
-but return 1.
+implements the simplest possible 'no-overwrite' mode i.e. it does
+nothing but return 0.
 
 Header information can be reserved at the beginning of each sub-buffer
 by calling the subbuf_start_reserve() helper function from within the
diff -urpN -X dontdiff linux-2.6.13-rc6-mm2/fs/relayfs/inode.c linux-2.6.13-rc6-mm2-cur/fs/relayfs/inode.c
--- linux-2.6.13-rc6-mm2/fs/relayfs/inode.c 2005-08-25 19:29:02.0 -0500
+++ linux-2.6.13-rc6-mm2-cur/fs/relayfs/inode.c 2005-08-25 18:21:31.0 -0500
@@ -295,101 +295,143 @@ static int relayfs_release(struct inode 
 }
 
 /**
- * relayfs_read_start - find the first available byte to read
- *
- * If the read_pos is in the middle of padding, return the
- * position of the first actually available byte, otherwise
- * return the original value.
+ * relayfs_read_consume - update the consumed count for the buffer
  */
-static inline size_t relayfs_read_start(size_t read_pos,
-   size_t avail,
-   size_t start_subbuf,
-   struct rchan_buf *buf)
+static void relayfs_read_consume(struct rchan_buf *buf,
+size_t read_pos,
+size_t bytes_consumed)
 {
-   size_t read_subbuf, adj_read_subbuf;
-   size_t padding, padding_start, padding_end;
size_t subbuf_size = buf->chan->subbuf_size;
size_t n_subbufs = buf->chan->n_subbufs;
+   size_t read_subbuf;
 
-   read_subbuf = read_pos / subbuf_size;
-   adj_read_subbuf = (read_subbuf + start_subbuf) % n_subbufs;
+   if (buf->bytes_consumed + bytes_consumed > subbuf_size) {
+   relay_subbufs_consumed(buf->chan, buf->cpu, 1);
+   buf->bytes_consumed = 0;
+   }
 
-   if ((read_subbuf + 1) * subbuf_size <= avail) {
-   padding = buf->padding[adj_read_subbuf];
-   padding_start = (read_subbuf + 1) * subbuf_size - padding;
-   padding_end = (read_subbuf + 1) * subbuf_size;
-   if (read_pos >= padding_start && read_pos < padding_end) {
-   read_subbuf = (read_subbuf + 1) % n_subbufs;
-   read_pos = read_subbuf * subbuf_size;
+   buf->bytes_consumed +=

Re: [OT] volatile keyword

2005-08-25 Thread Vadim Lobanov

On Thu, 25 Aug 2005, Christopher Friesen wrote:

> Vadim Lobanov wrote:
>
> > I'm positive I'm doing something wrong here. In fact, I bet it's the
> > volatile cast within the loop that's wrong; but I'm not sure how to do
> > it correctly. Any help / pointers / discussion would be appreciated.
>
> You need to cast it as dereferencing a volatile pointer.
>
> Chris
>

I figured it was something along these lines. In that case, is the
following code (from kernel/posix-timers.c) really doing the right
thing?

	do
		expires = timr->it_timer.expires;
	while ((volatile long) (timr->it_timer.expires) != expires);

Seems it's casting the value, not the pointer.

-VadimL

Re: [PATCH 5/5] Remove unnecesary capability hooks in rootplug.

2005-08-25 Thread Chris Wright

* [EMAIL PROTECTED] ([EMAIL PROTECTED]) wrote:
> @@ -1527,7 +1533,8 @@ static int selinux_vm_enough_memory(long
>   int rc, cap_sys_admin = 0;
>   struct task_security_struct *tsec = current->security;
>  
> - rc = secondary_ops->capable(current, CAP_SYS_ADMIN);
> + rc = secondary_ops->capable ?
> + secondary_ops->capable(current, CAP_SYS_ADMIN) : 0;

I don't think this really makes sense.  It says the default secondary
thinks you have the capability.  Safe since SELinux double checks, but
not really accurate.

thanks,
-chris

Re: [PATCH 5/5] Remove unnecesary capability hooks in rootplug.

2005-08-25 Thread Chris Wright

* Chris Wright ([EMAIL PROTECTED]) wrote:
> * Stephen Smalley ([EMAIL PROTECTED]) wrote:
> > e.g. if secondary_ops->capable is null, the SELinux tests aren't going
> > to show that, because they will still see that the SELinux permission
> > checks are working correctly.  They only test failure/success for the
> > SELinux permission checks, not for the capability checks, so if you
> > unhook capabilities, they won't notice.
> 
> Yes, I see.  I thought the tests you were referring to were 
> "if (secondary_ops->capable)" not LTP tests.  Capability is still a
> module that can be loaded (or built-in).  So the only issue is its
> security_ops is now NULL where it was a trivial return 0 function.
> Aside from the oversight Serge fixed, I don't think there's any issue.

Bah, of course, that's inaccurate because you unconditionally set the
secondary to the default.  So, indeed, the default case (nothing actively
loaded as secondary) will get secondary_ops filled with NULL only.
Seems simplest to just fill the default with cap calls where applicable,
but I had hoped to eliminate that.
Thoughts?
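
The two options in this thread can be compared in a small userland sketch. This is not the LSM code: `struct sec_ops`, `default_capable`, `call_capable_checked` and `fixup_ops` are hypothetical names, and the deny-everything default is chosen only to make the difference between the two approaches visible.

```c
#include <stddef.h>

/* Hypothetical hook table; the real security_operations is far larger. */
struct sec_ops {
	int (*capable)(int cap);	/* 0 if permitted, negative if denied */
};

/* Hypothetical stand-in for a capability-module default: deny everything. */
static int default_capable(int cap)
{
	(void)cap;
	return -1;
}

/* Option 1: test the hook at each call site; a NULL hook reads as "permitted". */
static int call_capable_checked(const struct sec_ops *ops, int cap)
{
	return ops->capable ? ops->capable(cap) : 0;
}

/* Option 2: fill unset hooks once, so call sites never see NULL. */
static void fixup_ops(struct sec_ops *ops)
{
	if (!ops->capable)
		ops->capable = default_capable;
}
```

With option 1 an empty secondary silently answers "yes" to every capability query; with option 2 the answer comes from whatever default is installed, at the cost of keeping the default hooks around.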

thanks,
-chris

Re: PowerOP Take 2 0/3 Intro

2005-08-25 Thread Jordan Crouse

Todd - do you have a ChangeLog from Take 1? :)

Jordan
-- 
Jordan Crouse
Senior Linux Engineer
AMD - Personal Connectivity Solutions Group



Re: oops in 2.6.13-rc6-git12 in tcp/netfilter routines

2005-08-25 Thread Sven Schuster


Hi Harald,

On Thu, Aug 25, 2005 at 06:55:50PM +0200, Harald Welte told us:
> Is it true that PeerGuardian is a proprietary application?  I'm not
> going to debug this problem using a proprietary ip_queue program, sorry.

sorry to jump in here, but I took a quick look at PeerGuardian,
according to
http://methlabs.org/wiki/license_information
it's open source.  The source code is available at
http://methlabs.org/projects/peerguardian-linuxosx/

HTH

Sven

-- 
Linux zion.homelinux.com 2.6.13-rc6-mm2 #3 Thu Aug 25 14:53:55 CEST 2005 i686 
athlon i386 GNU/Linux
 22:56:18 up  7:40,  1 user,  load average: 0.46, 0.14, 0.04



Re: [OT] volatile keyword

2005-08-25 Thread Christopher Friesen


Vadim Lobanov wrote:


> I'm positive I'm doing something wrong here. In fact, I bet it's the
> volatile cast within the loop that's wrong; but I'm not sure how to do
> it correctly. Any help / pointers / discussion would be appreciated.

You need to cast it as dereferencing a volatile pointer.

Chris
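
Spelled out, the qualifier has to apply to the object being read, not to the value that has already been read. A minimal sketch follows, assuming a plain `unsigned long` flag as in the test programs; the helper names `read_ulong_once` and `wait_until_clear` are illustrative and do not come from the thread or the kernel.

```c
/*
 * Read *p through a volatile-qualified lvalue, so the compiler has to
 * emit a load every time this is called. A cast such as
 * (volatile unsigned long)(x) only qualifies the value after it has
 * been loaded, which has no effect on how x is accessed.
 */
static inline unsigned long read_ulong_once(const unsigned long *p)
{
	return *(const volatile unsigned long *)p;
}

/* Spin until *lock becomes zero, reloading it on every iteration. */
static void wait_until_clear(const unsigned long *lock)
{
	while (read_ulong_once(lock))
		;
}
```

Note that `volatile` only forces the access to happen; it provides neither atomicity nor memory ordering, so code shared between real threads would normally use proper synchronization primitives instead.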

[PATCH] fix adm9240 oops

2005-08-25 Thread Jonathan Corbet

The adm9240 driver, in adm9240_detect(), allocates a structure.  The
error path attempts to kfree() a subfield of that structure, resulting
in an oops (or slab corruption) if the hardware is not present.  This
one seems worth fixing for 2.6.13.

jon

Signed-off-by: Jonathan Corbet <[EMAIL PROTECTED]>

--- 2.6.13-rc7/drivers/hwmon/adm9240.c.orig 2005-08-25 14:30:04.0 -0600
+++ 2.6.13-rc7/drivers/hwmon/adm9240.c  2005-08-25 14:30:26.0 -0600
@@ -616,7 +616,7 @@ static int adm9240_detect(struct i2c_ada
 
 	return 0;
 exit_free:
-	kfree(new_client);
+	kfree(data);
 exit:
 	return err;
 }
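
The class of bug fixed here can be shown in a userland sketch. The types and the function below are hypothetical stand-ins (they are not the adm9240 driver's definitions, and `0x2d` is an arbitrary illustrative address); the point is only that a pointer to an embedded member is not the pointer the allocator returned.

```c
#include <stdlib.h>

struct client {
	int addr;
};

struct sensor_data {
	int id;			/* placed first, so &data->client != data */
	struct client client;	/* embedded member, not a separate allocation */
};

/* Returns 0 and stores the allocation in *out, or -1 with *out set to NULL. */
static int sensor_detect(int hw_present, struct sensor_data **out)
{
	struct sensor_data *data = calloc(1, sizeof(*data));
	struct client *new_client;

	*out = NULL;
	if (!data)
		return -1;

	new_client = &data->client;
	new_client->addr = 0x2d;	/* arbitrary illustrative address */

	if (!hw_present)
		goto exit_free;

	*out = data;
	return 0;

exit_free:
	/* free(new_client) would hand the allocator a pointer it never returned */
	free(data);
	return -1;
}
```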

Re: Petition for gas grices

2005-08-25 Thread Danial Thom



--- Lee Revell <[EMAIL PROTECTED]> wrote:

> On Thu, 2005-08-25 at 14:44 -0400, Lee Revell wrote:
> > On Thu, 2005-08-25 at 14:20 -0400, Michael Krufky wrote:
> > > Todd Bailey wrote:
> > > 
> > > > I'm all for this but I think there is little uncle George can do.
> > > 
> > > Was it necessary to cc this to everybody in the world?
> > 
> > God, I can't believe this epidemic of bitching about gas prices has
> > invaded LKML of all places.
> 
> Sorry did not mean to send that to the list.
> 
> Lee

If you weren't such a big dope you'd own oil
company stocks like the rest of us real
Americans. Participate in America rather than
bashing it :)





Re: 2.6.12 Performance problems

2005-08-25 Thread Danial Thom



--- Ben Greear <[EMAIL PROTECTED]> wrote:

> Danial Thom wrote:
> 
> > The tests I reported were on UP systems. Perhaps
> > the default settings are better for this in 2.4,
> > since that is what I used, and you used your
> > hacks for both.
> 
> My modifications to the kernel are unlikely to speed anything
> up, and probably will slow things down ever so slightly.
> 
> I can try with a UP kernel, but my machine at least has a single
> processor.  I'm using the SMP kernel to take advantage of HT.
> 
> > Are you getting drops or overruns (or both)? I
> > would assume drops is a decision to drop rather
> > than an overrun which is a ring overrun. Overruns
> > would imply more about performance than tuning,
> > I'd think.
> 
> I was seeing lots of NIC errors...in fact, it was showing a great many
> more errors than packets sent to it, so I just ignored them.
> 
> I increased the TxDescriptors and RxDescriptors and that helped a little.
> 
> Increasing the transmit queue for the NIC to 2000 also helped a little.
> 
> > I wouldn't think that HT would be appropriate for
> > this sort of setup...?
> 
> 2.6.11 seems to be faster when running SMP kernel on this system.

HT and SMP are not the same animal, are they? My
understanding is that an HT aware scheduler is
likely to make things worse most of the time,
particularly for systems not running a lot of
threads..


> > 
> > You're using a dual PCI-X NIC rather than the
> > onboard ports? Supermicro runs their onboard
> 
> Of course.  Never found a motherboard yet with decent built-in
> NICs.  The built-ins on this board are tg3 and they must be on
> a slow bus, because they cannot go faster than about 700Mbps
> (using big pkts).

If it's the P8SCI or the same design they are on a
1X PCIE that's shared with the PCI-X. Pretty hokey
stuff. It's also a low-end controller amongst the
Broadcom parts.

Danial





[OT] volatile keyword

2005-08-25 Thread Vadim Lobanov

Hi,

The recent discussion on the list concerning memory barriers and write
ordering took a side-trip to the volatile keyword, especially its
correct / incorrect usage. Someone posted a link to the LKML archives,
in which the argument is made that it is best to keep 'volatile' _out_
of variable and structure definitions, and _into_ the code that uses
them. I was curious, so I decided to try this out (looking at
kernel/posix-timers.c for inspiration)...

Here's sample userland program number one, written one way:
===
#include <stdio.h>
#include <pthread.h>
#include <unistd.h>

struct sync {
	volatile unsigned long lock;
	volatile unsigned long value;
};

struct sync data;

void * thread (void * arg) {
	sleep(5);
	data.value = 0;
	data.lock = 0;

	return NULL;
}

int main (void) {
	pthread_t other;

	data.lock = 1;
	data.value = 1;
	pthread_create(&other, NULL, thread, NULL);
	while (data.lock);
	printf("Value is %lu.\n", data.value);
	pthread_join(other, NULL);

	return 0;
}
===

And here's what should be the same program, written the "suggested" way:
===
#include <stdio.h>
#include <pthread.h>
#include <unistd.h>

struct sync {
	unsigned long lock;
	unsigned long value;
};

struct sync data;

void * thread (void * arg) {
	sleep(5);
	data.value = 0;
	data.lock = 0;

	return NULL;
}

int main (void) {
	pthread_t other;

	data.lock = 1;
	data.value = 1;
	pthread_create(&other, NULL, thread, NULL);
	while ((volatile unsigned long)(data.lock));
	printf("Value is %lu.\n", data.value);
	pthread_join(other, NULL);

	return 0;
}
===

The first program works exactly as expected. The second program,
however, never syncs with the sleeping thread. In fact, for the second
program, gcc compiles the while loop down to an infinite loop.

I'm positive I'm doing something wrong here. In fact, I bet it's the
volatile cast within the loop that's wrong; but I'm not sure how to do
it correctly. Any help / pointers / discussion would be appreciated.

Thanks. :-)
-VadimL

Re: [PATCH] Ext3 online resizing locking issue

2005-08-25 Thread Glauber de Oliveira Costa

 
> NAK, this is wrong:
> 
> > +   lock_super(sb);
> > err = ext3_group_extend(sb, EXT3_SB(sb)->s_es, n_blocks_count);
> > +   unlock_super(sb);
> 
> This basically reverses the order of locking between lock_super() and
> journal_start() (the latter acts like a lock because it can block on a
> resource if the journal is too full for the new transaction.)  That's
> the opposite order to normal, and will result in a potential deadlock.
> 
Ooops! Missed that. But I agree with the point. 

 
> But the _right_ fix, if you really want to keep that code, is probably
> to move all the resize locking to a separate lock that ranks outside the
> journal_start.  The easy workaround is to drop the superblock lock and
> reacquire it around the journal_start(); it would be pretty easy to make
> that work robustly as far as ext3 is concerned, but I suspect there may
> be VFS-layer problems if we start dropping the superblock lock in the
> middle of the s_ops->remount() call --- Al?
> 

Just a question here. With s_lock held by the remount code, we're
altering the struct super_block, and believing we're safe. We try to
acquire it inside the resize functions, because we're trying to modify
this same data. Thus, if we rely on another lock, aren't we probably
messing something up? (For example, both the group_extend and remount code
potentially modify the s_flags field. If we ioctl and remount at the same
time, each one with a different lock, something could go wrong.) Am I
missing something here?
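
The ordering constraint discussed above can be modelled in userland with two mutexes. This is only a sketch under stated assumptions: `journal` stands in for the blocking behaviour of journal_start(), `super` for lock_super(), and `set_flag_ordered`, `resize_path` and `remount_path` are hypothetical names, not ext3 functions. Both paths update the shared flags under the same inner lock and take the two locks in one fixed order, which is what rules out the deadlock.

```c
#include <pthread.h>

/* Hypothetical stand-ins for the two resources named in the thread. */
static pthread_mutex_t journal = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t super = PTHREAD_MUTEX_INITIALIZER;
static unsigned long s_flags;

/* Every path takes journal first, then super, and releases in reverse. */
static void set_flag_ordered(unsigned long flag)
{
	pthread_mutex_lock(&journal);
	pthread_mutex_lock(&super);
	s_flags |= flag;
	pthread_mutex_unlock(&super);
	pthread_mutex_unlock(&journal);
}

static void resize_path(void)
{
	set_flag_ordered(0x1);
}

static void remount_path(void)
{
	set_flag_ordered(0x2);
}
```

If one path took `super` before `journal` while the other kept the order shown, two threads could each hold one lock and wait forever for the other.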

glauber

Conntrack problem, machines freeze

2005-08-25 Thread Lukasz Spaleniak

Hello,

I have a simple Linux router with three Fast Ethernet cards (Intel, e100
driver). About two months ago it started hanging. The machine freezes
completely (no oops). First of all, when it's booting, a few
messages like this appear on screen:

NF_IP_ASSERT: ip_conntrack_core.c:1128(ip_conntrack_alter_reply)

I suppose they show up before the firewall script loads its rules (simple NAT).
After that, sometimes it works for a very long time, sometimes it freezes
after a few seconds. One time I logged this message before it froze:

kernel: LIST_DELETE: ip_conntrack_core.c:302 `&ct->tuplehash[IP_CT_DIR_REPLY]'(decb6084) not in &ip_conntrack_hash[hr].

Components that have already been replaced:
- computer hardware (twice to a new one)
- fast ethernet cards (tried with intel, realtek and 3com)
- fresh system (debian sarge)
- switches

Router and switches are connected to UPS (dedicated, also replaced).

This is a vanilla 2.4.31 kernel; the problem also exists with kernels
2.4.30 and 2.4.29. I also tried the grsecurity patch (hoping it could
help), but it didn't.

If you have any idea what I can try to fix this, please let me know.

Thank you for your time.


Best regards,
Lukasz Spaleniak


-- 
spalek on zigzag dot pl
GCM dpu s: a--- C++ UL P+ L+++ E--- W+ N+ K- w O- M V-
PGP t--- 5 X+ R- tv-- b DI- D- G e-- h! r y+

Re: Petition for gas grices

2005-08-25 Thread Lee Revell

On Thu, 2005-08-25 at 14:44 -0400, Lee Revell wrote:
> On Thu, 2005-08-25 at 14:20 -0400, Michael Krufky wrote:
> > Todd Bailey wrote:
> > 
> > > I'm all for this but I think there is little uncle George can do.
> > 
> > Was it necessary to cc this to everybody in the world?
> 
> God, I can't believe this epidemic of bitching about gas prices has
> invaded LKML of all places.

Sorry did not mean to send that to the list.

Lee


Re: [RFC] RT-patch update to remove the global pi_lock

2005-08-25 Thread Steven Rostedt

On Thu, 2005-08-25 at 19:47 +0200, Ingo Molnar wrote:

> your patch works great here, on 3 separate systems: a 1-way, a 2/4-way 
> and an 8-way.
> 
> the 1-way system performed so well running the SMP kernel that i first 
> thought i booted the UP kernel by accident :-)
> 
> on the 8-way box, "hackbench 10" got _3.7_ times faster (!!!).
> 
> i have booted the 8-way box without your patch once more because i didnt 
> believe the results initially and thought they were some benchmarking 
> fluke. But no, it wasnt a fluke. The kernel profiles are nicely flat 
> now.
> 
> i've released 2.6.13-rc7-rt2 with your patch included. This is certainly 
> a major milestone for PREEMPT_RT, it is now a first-class scalability 
> citizen on SMP too. Great work Steven!
> 
>   Ingo

A word of caution (aka. disclaimer). This is still new.  I still expect
there are some cases in the code that were missed and can cause a
deadlock or other bad side effect.  Hopefully, we can iron these all out.
Also, I noticed that since the task takes its own pi_lock for most of
the code, if something locks up and an NMI goes off, the down_trylock in
printk will also lock when it tries to take its own pi_lock.

I stated earlier that I converted printk to use early_printk (serial)
on an oops_in_progress, so I wouldn't need to worry about the unlocking
circus that needs to be done.  So, I'm sure if something goes wrong now,
you won't see anything, even with an NMI.

If someone else would like to fix that, I'm sure Ingo would be willing
to accept patches :-)

-- Steve


Re: process creation time increases linearly with shmem

2005-08-25 Thread Rik van Riel

On Thu, 25 Aug 2005, Nick Piggin wrote:

> fork() can be changed so as not to set up page tables for
> MAP_SHARED mappings. I think that has other tradeoffs like
> initially causing several unavoidable faults reading
> libraries and program text.

Actually, libraries and program text are usually mapped
MAP_PRIVATE, so those would still be copied.

Skipping MAP_SHARED in fork() sounds like a good idea to me...

-- 
All Rights Reversed

[PATCH 2.6.13-rc7 1/2] undo partial cpu_exclusive sched domain disabling

2005-08-25 Thread Paul Jackson

The partial disabling of Dinakar's new facility to allow
cpu_exclusive cpusets to define dynamic sched domains
doesn't go far enough.  At the suggestion of Nick Piggin
and Dinakar, let us instead totally disable this facility
for 2.6.13, in order to avoid problems first reported
by John Hawkes (corrupt sched data structures and kernel oops).

This patch removes the partial disabling code in 2.6.13-rc7,
in anticipation of the next patch, which will totally disable
it instead.

Signed-off-by: Paul Jackson <[EMAIL PROTECTED]>

Index: linux-2.6.13-rc7/kernel/cpuset.c
===
--- linux-2.6.13-rc7.orig/kernel/cpuset.c
+++ linux-2.6.13-rc7/kernel/cpuset.c
@@ -636,25 +636,6 @@ static void update_cpu_domains(struct cp
return;
 
/*
-* Hack to avoid 2.6.13 partial node dynamic sched domain bug.
-* Require the 'cpu_exclusive' cpuset to include all (or none)
-* of the CPUs on each node, or return w/o changing sched domains.
-* Remove this hack when dynamic sched domains fixed.
-*/
-   {
-   int i, j;
-
-   for_each_cpu_mask(i, cur->cpus_allowed) {
-   cpumask_t mask = node_to_cpumask(cpu_to_node(i));
-
-   for_each_cpu_mask(j, mask) {
-   if (!cpu_isset(j, cur->cpus_allowed))
-   return;
-   }
-   }
-   }
-
-   /*
 * Get all cpus from parent's cpus_allowed not part of exclusive
 * children
 */

-- 
  I won't rest till it's the best ...
  Programmer, Linux Scalability
  Paul Jackson <[EMAIL PROTECTED]> 1.650.933.1373

[PATCH 2.6.13-rc7 2/2] completely disable cpu_exclusive sched domain

2005-08-25 Thread Paul Jackson

At the suggestion of Nick Piggin and Dinakar, totally disable
the facility to allow cpu_exclusive cpusets to define dynamic
sched domains in Linux 2.6.13, in order to avoid problems
first reported by John Hawkes (corrupt sched data structures
and kernel oops).

This has been built for ppc64, i386, ia64, x86_64, sparc, alpha.
It has been built, booted and tested for cpuset functionality
on an SN2 (ia64).

Dinakar or Nick - could you verify that it for sure does avoid
the problems Hawkes reported.  Hawkes is out of town, and I don't
have the recipe to reproduce what he found.

Signed-off-by: Paul Jackson <[EMAIL PROTECTED]> 
 

Index: linux-2.6.13-rc7/kernel/cpuset.c
===
--- linux-2.6.13-rc7.orig/kernel/cpuset.c
+++ linux-2.6.13-rc7/kernel/cpuset.c
@@ -627,6 +627,14 @@ static int validate_change(const struct 
  * Call with cpuset_sem held.  May nest a call to the
  * lock_cpu_hotplug()/unlock_cpu_hotplug() pair.
  */
+
+/*
+ * Hack to avoid 2.6.13 partial node dynamic sched domain bug.
+ * Disable letting 'cpu_exclusive' cpusets define dynamic sched
+ * domains, until the sched domain can handle partial nodes.
+ * Remove this #if hackery when sched domains fixed.
+ */
+#if 0
 static void update_cpu_domains(struct cpuset *cur)
 {
struct cpuset *c, *par = cur->parent;
@@ -667,6 +675,11 @@ static void update_cpu_domains(struct cp
partition_sched_domains(&pspan, &cspan);
unlock_cpu_hotplug();
 }
+#else
+static void update_cpu_domains(struct cpuset *cur)
+{
+}
+#endif
 
 static int update_cpumask(struct cpuset *cs, char *buf)
 {

-- 
  I won't rest till it's the best ...
  Programmer, Linux Scalability
  Paul Jackson <[EMAIL PROTECTED]> 1.650.933.1373

Re: [RFC] RT-patch update to remove the global pi_lock

2005-08-25 Thread Steven Rostedt

On Thu, 2005-08-25 at 21:34 +0200, Ingo Molnar wrote:
> * Steven Rostedt <[EMAIL PROTECTED]> wrote:
> 
> > > does the system truly lock up, or is this some transitional condition?  
> > > In any case, i agree that this should be debugged independently of the 
> > > pi_lock patch.
> > 
> > Hmm, I forgot that you took out the bit_spin_lock fixes.  I think this 
> > > may be caused by them.  I haven't looked further into it yet.
> 
> yeah, i took them out because they clashed with upstream changes. Note 
> that i meanwhile also introduced a per-bh lock, which might make it 
> easier to fix the deadlock:
> 
>  --- linux.orig/fs/buffer.c
>  +++ linux/fs/buffer.c
>  @@ -537,8 +537,7 @@ static void end_buffer_async_read(struct
>   * decide that the page is now completely done.
>   */
>  first = page_buffers(page);
>  -   local_irq_save(flags);
>  -   bit_spin_lock(BH_Uptodate_Lock, &first->b_state);
>  +   spin_lock_irqsave(&first->b_uptodate_lock, flags);
>  clear_buffer_async_read(bh);
>  unlock_buffer(bh);
>  tmp = bh;
> 
> could jbd reuse this lock - or would it need another lock?

I think it can.  I'm looking into it right now, but first I'm updating my
logdev to the latest release.  I stripped it all out after submitting
that pi_lock patch and now I have to put it back in!   I didn't save the
updates that I added earlier, so I'm reworking things now.  The logging
definitely helps me, since that was a major factor in getting that
pi_lock patch done so quickly.

-- Steve



Re: [PATCH][-mm] Generic VFS fallback for security xattrs

2005-08-25 Thread Stephen Smalley

On Thu, 2005-08-25 at 13:43 -0400, Stephen Smalley wrote:
> This patch modifies the VFS setxattr, getxattr, and listxattr code to
> fall back to the security module for security xattrs if the filesystem
> does not support xattrs natively.  This allows security modules to
> export the incore inode security label information to userspace even
> if the filesystem does not provide xattr storage, and eliminates the
> need to individually patch various pseudo filesystem types to provide
> such access.  The patch removes the existing xattr code from devpts
> and tmpfs as it is then no longer needed.
> 
> The patch restructures the code flow slightly to reduce duplication
> between the normal path and the fallback path, but this should only
> have one user-visible side effect - a program may get -EACCES rather
> than -EOPNOTSUPP if policy denied access but the filesystem didn't
> support the operation anyway.  Note that the post_setxattr hook call
> is not needed in the fallback case, as the inode_setsecurity hook call
> handles the incore inode security state update directly.  In contrast,
> we do call fsnotify in both cases.
> 
> Please include in -mm for wider testing prior to merging in 2.6.14.
> 
> ---
> 
>  fs/Kconfig                 |   43
>  fs/devpts/Makefile         |    1
>  fs/devpts/inode.c          |   21
>  fs/devpts/xattr_security.c |   47
>  fs/xattr.c                 |   80
>  mm/shmem.c                 |   85
>  6 files changed, 49 insertions(+), 228 deletions(-)

Sorry, forgot to explicitly sign off on the patch:

Signed-off-by:  Stephen Smalley <[EMAIL PROTECTED]>

-- 
Stephen Smalley
National Security Agency


Re: [RFC] RT-patch update to remove the global pi_lock

2005-08-25 Thread Ingo Molnar


* Steven Rostedt <[EMAIL PROTECTED]> wrote:

> > does the system truly lock up, or is this some transitional condition?  
> > In any case, i agree that this should be debugged independently of the 
> > pi_lock patch.
> 
> Hmm, I forgot that you took out the bit_spin_lock fixes.  I think this 
> may be caused by them.  I haven't looked further into it yet.

yeah, i took them out because they clashed with upstream changes. Note 
that i meanwhile also introduced a per-bh lock, which might make it 
easier to fix the deadlock:

 --- linux.orig/fs/buffer.c
 +++ linux/fs/buffer.c
 @@ -537,8 +537,7 @@ static void end_buffer_async_read(struct
  * decide that the page is now completely done.
  */
 first = page_buffers(page);
 -   local_irq_save(flags);
 -   bit_spin_lock(BH_Uptodate_Lock, &first->b_state);
 +   spin_lock_irqsave(&first->b_uptodate_lock, flags);
 clear_buffer_async_read(bh);
 unlock_buffer(bh);
 tmp = bh;

could jbd reuse this lock - or would it need another lock?

Ingo

Initramfs and TMPFS!

2005-08-25 Thread dwilson24


>I'm not subscribed to the list and I use lynx and a small mda
>called msmtp, so I know it's awkward (perhaps mostly for me).
>>People seem to be CCing you, can't you reply to the message you
>>receive that way? That's how everyone else who doesn't subscribe
>>gets along...
>>Anyway, if you insist on sending things manually, you could add the
>>correct References and/or In-Reply-To headers by hand as well.

That free email-address <[EMAIL PROTECTED]> doesn't seem to be working,
so I haven't gotten any mail there in awhile.

I have another email address that does work, but I have no interface to it.

I can download mail using fetchmail and read/reply using mutt, perhaps
that would do.

This is the good email-address: <[EMAIL PROTECTED]>

Re: Question about usb-storage: Sometimes partitions are not recognized.

2005-08-25 Thread Pete Zaitcev

On Thu, 25 Aug 2005 15:26:27 +0200, Manuel Schneider <[EMAIL PROTECTED]> wrote:

> When I plug them in, they will be recognized by hotplug (I'm using udev), the 
> module usb-storage will be loaded and the device nodes are created.
> 
> BUT: There is normally just ONE device node for the disc block device.
> Partitions are not available.
> I can "solve" this by just starting fdisk (and shutting it down again without 
> changing anything) on the given block device - after that, all the partitions 
> are available. [...]

We need more data. First, "Kernel 2.6.x" is not good enough.
Give us a precise version on which you observe this.
Second, running with CONFIG_USB_STORAGE_DEBUG may yield a useful trace.
I am not quite sure about that though, as this seems to be some
misunderstanding between the block level and SCSI.

Problems with block device open() not working right fall squarely
into purview of Al Viro, but he's quite busy right now. Someone
has to identify the exact scenario. I suppose adding a few printks
around fs/block_dev.c may be more beneficial than any USB debugging.

-- Pete


Re: [PATCH][-mm] Generic VFS fallback for security xattrs

2005-08-25 Thread James Morris

On Thu, 25 Aug 2005, Stephen Smalley wrote:

> Please include in -mm for wider testing prior to merging in 2.6.14.

Acked-by: James Morris <[EMAIL PROTECTED]>


-- 
James Morris
<[EMAIL PROTECTED]>

Re: [PATCH 0/5] LSM hook updates

2005-08-25 Thread Chris Wright

* Chris Wright ([EMAIL PROTECTED]) wrote:
> I'll have some numbers tomorrow.  If you'd like to run SELinux that'd
> be quite useful.

These are just lmbench and kernel build numbers (certainly not the best
for real benchmark numbers, but easy to get a quick view run).  This is
just baseline (i.e. default, nothing loaded).

This is x86_64 (1 HT core) 2GB.

Kernel build:

old hooks               new hooks
---------               ---------
real    7m2.313s        real    7m1.542s
user    6m25.012s       user    6m25.484s
sys     0m56.580s       sys     0m56.008s

real    7m3.376s        real    7m0.593s
user    6m25.412s       user    6m24.184s
sys     0m57.140s       sys     0m56.936s

real    7m2.643s        real    7m1.280s
user    6m23.840s       user    6m25.408s
sys     0m57.668s       sys     0m55.935s

real    7m0.015s        real    7m0.712s
user    6m23.964s       user    6m24.820s
sys     0m57.940s       sys     0m56.520s

real    7m3.204s        real    7m0.592s
user    6m23.868s       user    6m24.652s
sys     0m57.712s       sys     0m56.460s

real    7m1.961s        real    7m1.328s
user    6m24.416s       user    6m25.284s
sys     0m57.252s       sys     0m56.184s
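
To put the six runs per kernel into a single figure, the wall-clock ("real") times above can be averaged. The sketch below is an editorial illustration, not part of the original measurements; the array and function names are assumptions, and the values are the "real" rows converted to seconds.

```c
#include <stddef.h>

/* "real" times from the kernel build table, converted to seconds. */
static const double old_real[] = {
	422.313, 423.376, 422.643, 420.015, 423.204, 421.961
};
static const double new_real[] = {
	421.542, 420.593, 421.280, 420.712, 420.592, 421.328
};

/* Arithmetic mean of n samples; returns 0.0 for an empty set. */
static double mean(const double *v, size_t n)
{
	double sum = 0.0;
	size_t i;

	for (i = 0; i < n; i++)
		sum += v[i];
	return n ? sum / (double)n : 0.0;
}
```

Calling `mean(old_real, 6)` and `mean(new_real, 6)` gives about 422.25 s and 421.01 s, a difference of roughly 1.2 s (about 0.3%). With only six runs each and no variance analysis, that is best read as "no measurable regression" rather than as a confirmed speedup.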


Basic system parameters
----------------------------------------------------
Host      OS            Description              Mhz
--------- ------------- ----------------------- ----
vert.sous Linux 2.6.13- x86_64-linux-gnu-oldhoo 2997
vert.sous Linux 2.6.13- x86_64-linux-gnu-oldhoo 2997
vert.sous Linux 2.6.13- x86_64-linux-gnu-oldhoo 2997
vert.sous Linux 2.6.13- x86_64-linux-gnu-oldhoo 2997

vert.sous Linux 2.6.13- x86_64-linux-gnu-newhoo 2997
vert.sous Linux 2.6.13- x86_64-linux-gnu-newhoo 2997
vert.sous Linux 2.6.13- x86_64-linux-gnu-newhoo 2997
vert.sous Linux 2.6.13- x86_64-linux-gnu-newhoo 2997

Processor, Processes - times in microseconds - smaller is better
-------------------------------------------------------------------------------
Host      OS             Mhz null null      open selct  sig  sig fork exec   sh
                             call  I/O stat clos   TCP inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ----- ---- ---- ---- ---- ----
vert.sous Linux 2.6.13- 2997 0.22 0.39 14.1 16.4  14.9 0.36 4.77 199. 684. 2524
vert.sous Linux 2.6.13- 2997 0.22 0.39 14.1 16.4  15.0 0.36 4.68 198. 689. 2530
vert.sous Linux 2.6.13- 2997 0.23 0.39 14.1 16.4  14.2 0.36 4.74 198. 690. 2528
vert.sous Linux 2.6.13- 2997 0.22 0.39 14.1 16.4  14.9 0.37 4.71 199. 684. 2532

vert.sous Linux 2.6.13- 2997 0.22 0.39 14.1 16.3  14.2 0.37 4.66 195. 679. 2497
vert.sous Linux 2.6.13- 2997 0.22 0.39 14.1 16.3  14.8 0.37 4.67 198. 681. 2511
vert.sous Linux 2.6.13- 2997 0.23 0.40 14.1 16.3  15.0 0.37 4.67 197. 678. 2512
vert.sous Linux 2.6.13- 2997 0.23 0.39 14.1 16.3  15.6 0.37 4.70 197. 681. 2508

Context switching - times in microseconds - smaller is better
-------------------------------------------------------------------------
Host      OS            2p/0K 2p/16K 2p/64K 8p/16K 8p/64K 16p/16K 16p/64K
                        ctxsw ctxsw  ctxsw  ctxsw  ctxsw  ctxsw   ctxsw
--------- ------------- ----- ------ ------ ------ ------ ------- -------
vert.sous Linux 2.6.13- 6.120 7.1500 9.6900 7.1600   11.8 7.78000    18.0
vert.sous Linux 2.6.13- 6.140 7.1000 9.6700 7.1600   11.7 7.93000    18.1
vert.sous Linux 2.6.13- 6.080 7.1100 9.6900 7.2100   11.9 8.14000    18.0
vert.sous Linux 2.6.13- 6.070 7.1000 9.7100 7.3000   12.9 7.85000    18.1

vert.sous Linux 2.6.13- 5.820 6.8900 9.4200 7.0600   12.2 7.77000    18.0
vert.sous Linux 2.6.13- 5.830 6.9700 9.5400 7.       13.6 7.99000    17.9
vert.sous Linux 2.6.13- 5.870 6.8200 9.5000 7.3000   12.1 8.15000    17.8
vert.sous Linux 2.6.13- 5.870 6.9200 9.5400 7.1200   11.4 7.91000    18.3

*Local* Communication latencies in microseconds - smaller is better
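---------------------------------------------------------------------
Host      OS            2p/0K  Pipe AF    UDP   RPC/  TCP   RPC/ TCP
                        ctxsw       UNIX        UDP         TCP  conn
--------- ------------- ----- ----- ---- ----- ----- ----- ----- ----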
vert.sous Linux 2.6.13- 6.180  15.2 33.9  29.9  42.3  55.9  72.2 106.
vert.sous Linux 2.6.13- 6.140  15.2 33.8  30.1  42.5  55.8  72.5 107.
vert.sous Linux 2.6.13- 6.080  15.1 34.0  30.0  42.5  55.9  72.6 107.
vert.sous Linux 2.6.13- 6.070  14.7 34.1  30.2  42.4  55.7  72.5 107.

vert.sous Linux 2.6.13- 5.820  14.1 33.8  30.0  42.0  54.9  71.0 106.
vert.sous Linux 2.6.13- 5.830  14.4 33.9  30.2  42.1  54.9  71.0 106.
vert.sous Linux 2.6.13- 5.870  14.6 34.1  29.9  42.0  54.9  71.2 106.
vert.sous Linux 2.6.13- 5.870  14.6 34.3  29.8  42.2  54.8  71.0 106.

File & VM system latencies in microseconds - smaller is better
--
Host OS   0K File  10K File  MmapProtPage

Re: [PATCH] Ext3 online resizing locking issue

2005-08-25 Thread Stephen C. Tweedie

Hi,

On Wed, 2005-08-24 at 22:03, Glauber de Oliveira Costa wrote:

> This simple patch provides a fix for a locking issue found in the online
> resizing code. The problem actually happened while trying to resize the
> filesystem through the resize=xxx option in a remount.

NAK, this is wrong:

> + lock_super(sb);
>   err = ext3_group_extend(sb, EXT3_SB(sb)->s_es, n_blocks_count);
> + unlock_super(sb);

This basically reverses the order of locking between lock_super() and
journal_start() (the latter acts like a lock because it can block on a
resource if the journal is too full for the new transaction.)  That's
the opposite order to normal, and will result in a potential deadlock.
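
The ordering problem can be illustrated outside the kernel with a toy lock-rank checker. This is only a sketch: the ranks, `struct lock_tracker`, and both function names are hypothetical and are not ext3 or JBD code. The one thing taken from the explanation above is that journal_start() normally comes before lock_super().

```c
#include <stdbool.h>

/* Hypothetical ranks: a lower rank must be taken first (outermost). */
enum lock_rank { RANK_JOURNAL = 1, RANK_SUPER = 2 };

#define MAX_HELD 4

/* Stack of ranks currently held by one imaginary task. */
struct lock_tracker {
    enum lock_rank held[MAX_HELD];
    int depth;
};

/* Refuse any acquisition whose rank is not above the most recent one. */
bool tracker_acquire(struct lock_tracker *t, enum lock_rank r)
{
    if (t->depth >= MAX_HELD)
        return false;
    if (t->depth > 0 && t->held[t->depth - 1] >= r)
        return false;
    t->held[t->depth++] = r;
    return true;
}

/* Check a whole acquisition sequence against the rank rule. */
bool order_is_safe(const enum lock_rank *seq, int n)
{
    struct lock_tracker t = { .depth = 0 };

    for (int i = 0; i < n; i++)
        if (!tracker_acquire(&t, seq[i]))
            return false;
    return true;
}
```

Under these assumed ranks the sequence journal, then superblock passes, while superblock, then journal is rejected. The rejected sequence is the order the quoted hunk would introduce.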

> + {Opt_resize, "resize=%u"},
>   {Opt_err, NULL},
> - {Opt_resize, "resize"},

Right, that's disabled for now.  I guess the easy fix here is just to
remove the code entirely, given that we have locking problems with
trying to fix it!

But the _right_ fix, if you really want to keep that code, is probably
to move all the resize locking to a separate lock that ranks outside the
journal_start.  The easy workaround is to drop the superblock lock and
reacquire it around the journal_start(); it would be pretty easy to make
that work robustly as far as ext3 is concerned, but I suspect there may
be VFS-layer problems if we start dropping the superblock lock in the
middle of the s_ops->remount() call --- Al?

--Stephen

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Need better is_better_time_interpolator() algorithm

2005-08-25 Thread john stultz

On Thu, 2005-08-25 at 12:43 -0600, Alex Williamson wrote:
> On Thu, 2005-08-25 at 10:36 -0700, john stultz wrote:
> > On Thu, 2005-08-25 at 10:44 -0600, Alex Williamson wrote:
> > > How can we munge these all together to come up with a single goodness
> > > factor for comparison?  There's probably a thesis covering algorithms to
> > > handle this.  Anyone know of one or have some good ideas?  Thanks,
> > 
> > With my timeofday rework code, the timesource structure (which was
> > > influenced by the time interpolators) just uses a fixed "priority" value.
> ...
> > Realistically I don't think too many systems will have multiple out of
> > tree timesources, so assigning the correct priority value shouldn't be
> > too difficult.
> > 
> > This just seemed a bit more straightforward than sorting out some
> > weighting algorithm for their properties to select the best timesource. 
> 
> I don't know that it's that uncommon.  Simply having one non-arch
> specific timer is enough to need to decide whether it's better than a
> generic timer. I assume pretty much every arch has a cycle timer.  For
> smaller boxes, this might be the preferred timer given its latency even
> if something like an hpet exists (mmio accesses are expensive).  How do
> you hard code a value that can account for that?  

Well, in my patches I set the default priority of cycle timer as being
very high, say 350, and the slower MMIO device as being a decent 250.
Then if we boot up on say a NUMA system, or if we find the cycle
counters to be out of sync, the cycle counter init code drops the
timesource priority of the cycle counter to something like 50.
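
A minimal sketch of that scheme, assuming a much simpler structure than the real timeofday rework patches use. `struct timesource` here, the enum names, and both functions are hypothetical; only the example values 350, 250 and 50 come from the paragraph above.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a timesource with a fixed priority. */
struct timesource {
    const char *name;
    int priority;
};

/* Example priorities mentioned in the mail. */
enum { PRIO_CYCLE = 350, PRIO_MMIO = 250, PRIO_CYCLE_UNSYNCED = 50 };

/* Init-time tweak: demote the cycle counter when it cannot be trusted. */
void cycle_counter_init(struct timesource *ts, int counters_in_sync)
{
    ts->priority = counters_in_sync ? PRIO_CYCLE : PRIO_CYCLE_UNSYNCED;
}

/* Pick the highest-priority source; returns NULL for an empty list. */
const struct timesource *select_timesource(const struct timesource *list,
                                           size_t n)
{
    const struct timesource *best = NULL;

    for (size_t i = 0; i < n; i++)
        if (best == NULL || list[i].priority > best->priority)
            best = &list[i];
    return best;
}
```

With a synced cycle counter the selection picks it over the MMIO device. Once the init code demotes it to 50, the MMIO device wins without any weighting logic.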


> I agree, we could
> easily go too far and produce some bloated algorithm, but maybe it's
> simply a weighted product of a few variables.
> 
> To start with, what would this do:
> 
> (frequency) * (1/drift) * (1/latency) * (1/(jitter_factor * cpus))

It just seems that something like this could be for the most part
precomputed when you're writing the time_interpolator code into a single
priority value that the init code can tweak as needed.

> Something this simple at least starts to dynamically bring the factors
> together.  All else being equal (and with no weighting), this would give
> the 1.5GHz/750ppm timer a higher priority than the 250MHz/500ppm timer.
> Is that good?  I like your idea to make this user tunable after boot,
> but I still think there has to be a way to make a smarter decision up
> front.  Thanks,
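
The unweighted product quoted above can be written out directly. This is a sketch under stated assumptions: `struct timer_props`, its field names, and the units (Hz, ppm, ns) are chosen for illustration and do not come from the time_interpolator code.

```c
/* Hypothetical inputs for the proposed goodness product. */
struct timer_props {
    double frequency_hz;
    double drift_ppm;
    double latency_ns;
    double jitter_factor;
    unsigned int cpus;
};

/* (frequency) * (1/drift) * (1/latency) * (1/(jitter_factor * cpus)) */
double timer_goodness(const struct timer_props *p)
{
    return p->frequency_hz
         * (1.0 / p->drift_ppm)
         * (1.0 / p->latency_ns)
         * (1.0 / (p->jitter_factor * (double)p->cpus));
}
```

With latency, jitter factor and CPU count all set to 1, the 1.5GHz/750ppm timer scores about 2,000,000 and the 250MHz/500ppm timer about 500,000, which matches the ordering described in the quoted text.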

Shrug. I agree it needs improvement, so don't let me stop you from doing
something like what you have above. I just think it's more complex than
necessary and might result in less predictable interpolator selections.

thanks
-john



Re: Initramfs and TMPFS!

2005-08-25 Thread Alan Jenkins

> On Thu, Aug 25, 2005 at 12:32:50AM -0400, [EMAIL PROTECTED] wrote: 
> > Right, but it would be nice to have that option if initramfs 
> > using tmpfs becomes part of the kernel. 
> 
> But it's not needed so why add bloat? 

I'm not subscribed, so sorry if this doesn't fall into the original
thread.  I'm curious as to why the kernel has to include the decoder -
why you can't just run a self-extracting executable in an empty
initramfs (with a preset capacity if needs be).

Re: Inotify problem [was Re: 2.6.13-rc6-mm1]

2005-08-25 Thread Johannes Berg

On Thu, 2005-08-25 at 11:54 -0700, George Anzinger wrote:

> I think the best thing is to take idr into user space and emulate the 
> problem usage.  

Good plan, I guess. Do you think that's easy?

> To this end, from the log it appears that you _might_ be 
> moving between 0, 1 and 2 entries increasing the number each time.  It 
> also appears that the failure happens here:
> add 1023
> add 1024
> find 1024  or is it the remove that fails?  It also looks like 1024 got 
> allocated twice.  Am I reading the log correctly?

Remove 1024 fails, but add(please make it >1024) seems to return 1024,
and find(1024) also seems to fail. Well, remove() probably has to
find(), but I'm not really sure what inotify does (maybe find first, to
see if it's valid).

> So, is it correct to assume that the tree is empty save these two at 
> this time?  I am just trying to figure out what the test program needs 
> to do.

Yes, but all the smaller ones have been in it at some point in time.

johannes


Re: Inotify problem [was Re: 2.6.13-rc6-mm1]

2005-08-25 Thread John McCutchan

On Thu, 2005-08-25 at 21:03 +0200, Johannes Berg wrote:
> On Thu, 2005-08-25 at 11:54 -0700, George Anzinger wrote:
> 
> > I think the best thing is to take idr into user space and emulate the 
> > problem usage.  
> 
> Good plan, I guess. Do you think that's easy?
> 
> > To this end, from the log it appears that you _might_ be 
> > moving between 0, 1 and 2 entries increasing the number each time.  It 
> > also appears that the failure happens here:
> > add 1023
> > add 1024
> > find 1024  or is it the remove that fails?  It also looks like 1024 got 
> > allocated twice.  Am I reading the log correctly?
> 
> Remove 1024 fails, but add(please make it >1024) seems to return 1024,
> and find(1024) also seems to fail. Well, remove() probably has to
> find(), but I'm not really sure what inotify does (maybe find first, to
> see if it's valid).

Just to clarify, the remove() he is talking about isn't idr_remove, it
is inotify's remove. idr_find() is failing at 1024 which causes
inotify's remove to fail.

-- 
John McCutchan <[EMAIL PROTECTED]>

Re: Inotify problem [was Re: 2.6.13-rc6-mm1]

2005-08-25 Thread John McCutchan

On Thu, 2005-08-25 at 11:54 -0700, George Anzinger wrote:
> Robert Love wrote:
> > On Thu, 2005-08-25 at 09:33 -0400, John McCutchan wrote:
> > 
> >>On Thu, 2005-08-25 at 22:07 +1200, Reuben Farrelly wrote:
> >>
> ~
> >>>dovecot: Aug 25 19:31:26 Warning: IMAP(gilly): removing wd 1022 from 
> >>>inotify fd 4
> >>>dovecot: Aug 25 19:31:27 Warning: IMAP(gilly): inotify_add_watch returned 
> >>>1023
> >>>dovecot: Aug 25 19:31:27 Warning: IMAP(gilly): inotify_add_watch returned 
> >>>1024
> >>>dovecot: Aug 25 19:31:27 Warning: IMAP(gilly): removing wd 1024 from 
> >>>inotify fd 4
> >>>dovecot: Aug 25 19:31:27 Error: IMAP(gilly): inotify_rm_watch() failed: 
> >>>Invalid argument
> >>>dovecot: Aug 25 19:31:27 Warning: IMAP(gilly): removing wd 1023 from 
> >>>inotify fd 4
> >>>dovecot: Aug 25 19:31:28 Warning: IMAP(gilly): inotify_add_watch returned 
> >>>1024
> >>>dovecot: Aug 25 19:31:28 Warning: IMAP(gilly): inotify_add_watch returned 
> >>>1024
> >>>
> >>>Note the incrementing wd value even though we are removing them as we go..
> >>>
> >>
> >>What kernel are you running? The wd's should ALWAYS be incrementing, you
> >>should never get the same wd as you did before. From your log, you are
> >>getting the same wd (after you inotify_rm_watch it). I can reproduce
> >>this bug on 2.6.13-rc7.
> >>
> >>idr_get_new_above 
> >>
> >>isn't returning something above.
> >>
> >>Also, the idr layer seems to be breaking when we pass in 1024. I can
> >>reproduce that on my 2.6.13-rc7 system as well.
> >>
> >>
> >>>This is using latest CVS of dovecot code and with 2.6.12-rc6-mm(1|2) 
> >>>kernel.
> >>>
> >>>Robert, John, what do you think?   Is this possibly related to the oops 
> >>>seen 
> >>>in the log that I reported earlier?  (Which is still showing up 2-3 times 
> >>>per 
> >>>day, btw)
> >>
> >>There is definitely something broken here.
> > 
> > 
> > Jim, George-
> > 
> > We are seeing a problem in the idr layer.  If we do idr_find(1024) when,
> > say, a low valued idr, like, zero, is unallocated, NULL is returned.
> 
> I think the best thing is to take idr into user space and emulate the 
> problem usage.  To this end, from the log it appears that you _might_ be 
> moving between 0, 1 and 2 entries increasing the number each time.  It 
> also appears that the failure happens here:
> add 1023
> add 1024
> find 1024  or is it the remove that fails?  It also looks like 1024 got 
> allocated twice.  Am I reading the log correctly?

You are reading the log correctly. There are two bugs. One is that if we
pass X to idr_get_new_above, it can return X again (doesn't ever seem to
return < X). The other problem is that the find fails on 1024 (and 2048
if we skip 1024).

> 
> So, is it correct to assume that the tree is empty save these two at 
> this time?  I am just trying to figure out what the test program needs 
> to do.

Yes that is the exact scenario. Only 2 id's are used at any given time,
and once we hit 1024 things break. This doesn't happen when the tree is
not empty.
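
The usage pattern described above could be driven from user space along these lines. This is a sketch, not the kernel idr: `struct id_model` is a deliberately trivial array-backed table standing in for it, and every name below is hypothetical. The model's allocator returns the smallest free id at or above the requested start, so the driver loop asks for last + 1. A real test program would replace the three `model_*` calls with the idr functions under test and keep the same checks.

```c
#include <stddef.h>

#define MODEL_MAX_IDS 4096

/* Toy id table: slot[i] != NULL means id i is allocated. Not the kernel idr. */
struct id_model {
    void *slot[MODEL_MAX_IDS];
};

/* Bind ptr to the smallest free id >= start; returns the id, or -1 if none. */
int model_get_new_above(struct id_model *m, void *ptr, int start)
{
    if (start < 0 || ptr == NULL)
        return -1;
    for (int id = start; id < MODEL_MAX_IDS; id++) {
        if (m->slot[id] == NULL) {
            m->slot[id] = ptr;
            return id;
        }
    }
    return -1;
}

void *model_find(const struct id_model *m, int id)
{
    return (id >= 0 && id < MODEL_MAX_IDS) ? m->slot[id] : NULL;
}

void model_remove(struct id_model *m, int id)
{
    if (id >= 0 && id < MODEL_MAX_IDS)
        m->slot[id] = NULL;
}

/*
 * Emulate the reported pattern: at most two ids live at once, each new id
 * requested above the last one handed out. Returns the number of failures:
 * an id that did not increase, or an id that cannot be found when expected.
 */
int emulate_watch_churn(struct id_model *m, int rounds)
{
    static int token;   /* any non-NULL pointer works as a payload */
    int failures = 0, last = -1, prev = -1;

    for (int i = 0; i < rounds; i++) {
        int id = model_get_new_above(m, &token, last + 1);

        if (id <= last || model_find(m, id) == NULL)
            failures++;
        if (prev >= 0) {
            /* the remove path looks the id up first, so check that too */
            if (model_find(m, prev) == NULL)
                failures++;
            model_remove(m, prev);
        }
        prev = id;
        last = id;
    }
    return failures;
}
```

Running the loop for a few thousand rounds crosses both the 1024 and 2048 boundaries mentioned above; any nonzero return from `emulate_watch_churn` points at one of the two reported symptoms.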

Thanks for looking at this!
-- 
John McCutchan <[EMAIL PROTECTED]>

Re: Initramfs and TMPFS!

2005-08-25 Thread Ian Campbell

On Thu, 2005-08-25 at 14:15 -0400, [EMAIL PROTECTED] wrote:
>>Could you please please pretty please get an RFC compliant mailer that
>>generates  "In-Reply-To" and preferable even "References" headers?
>>Right
>>now every mail you write starts a new thread instead of referencing to
>>the previous one. See http://lkml.org/lkml/2005/8/25/180/ to see what
>>I mean.
> 
> I'm not subscribed to the list and I use lynx and a small mda
> called msmtp, so I know it's awkward (perhaps mostly for me).

People seem to be CCing you, can't you reply to the message you receive
that way? That's how everyone else who doesn't subscribe gets along...

Anyway, if you insist on sending things manually, you could add the
correct References and/or In-Reply-To headers by hand as well.

Ian.
-- 
Ian Campbell

Experience is the worst teacher.  It always gives the test first and
the instruction afterward.


Re: 2.6: how do I this in sysfs?

2005-08-25 Thread Christoph Hellwig

> > typedef struct _CSMI_SAS_IDENTIFY {
> >__u8  bDeviceType;
> >__u8  bRestricted;
> >__u8  bInitiatorPortProtocol;
> >__u8  bTargetPortProtocol;
> >__u8  bRestricted2[8];
> >__u8  bSASAddress[8];
> >__u8  bPhyIdentifier;
> >__u8  bSignalClass;
> >__u8  bReserved[6];
> > } CSMI_SAS_IDENTIFY,
> >   *PCSMI_SAS_IDENTIFY;

please compare this with struct sas_identify in
include/linux/scsi_transport_sas.h and look at
drivers/scsi/scsi_transport_sas.c on how it's exposed.

> > typedef struct _CSMI_SAS_PHY_ENTITY {
> >CSMI_SAS_IDENTIFY Identify;
> >__u8  bPortIdentifier;
> >__u8  bNegotiatedLinkRate;
> >__u8  bMinimumLinkRate;
> >__u8  bMaximumLinkRate;
> >__u8  bPhyChangeCount;
> >__u8  bAutoDiscover;
> >__u8  bReserved[2];
> >CSMI_SAS_IDENTIFY Attached;
> > } CSMI_SAS_PHY_ENTITY,
> >   *PCSMI_SAS_PHY_ENTITY;

and this one to struct sas_port_attrs.

This is after my minimal sas transport class, please also read the
thread about it on linux-scsi


Re: Linux-2.6.13-rc7

2005-08-25 Thread Al Viro

On Thu, Aug 25, 2005 at 11:27:32AM +0400, Alexey Dobriyan wrote:
> Mine is alpha-unknown-linux-gnu-gcc (GCC) 3.4.4 (Gentoo 3.4.4)
> 
> > Which place triggers it in your build?
> 
> net/ipv4/route.c:3152, call to rt_hash_lock_init().
> 
> From preprocessed source (reformatted):
> ---
> typedef struct {
>   volatile unsigned int lock;
> 
>   int on_cpu;
>   int line_no;
>   void *previous;
>   struct task_struct * task;
>   const char *base_file;
> } spinlock_t;
> 
> static inline void *kmalloc(size_t size, unsigned int flags)

Oh, lovely...

a) gcc4 on alpha refuses to make that inline
b) bug is real, indeed - spinlock debugging + >32 CPU => panic in ip_rt_init()

IMO that's a question to rth: why do we really need to block always_inline
on alpha?


Re: Petition for gas grices

2005-08-25 Thread Patrick McFarland

On Thursday 25 August 2005 02:44 pm, Lee Revell wrote:
> Take the fucking bus, ride a bike, or just fucking move closer to work.
> What ever gave all you people the idea that driving 50 miles each way to
> work was sustainable in the first place?  I can't believe how many
> otherwise rational people have a gigantic blind spot for this.

Or get a hybrid vehicle. IMHO it's seriously worth it. Doing 50mpg+ would be 
cheap even if gas is $5 per gallon (predicted price for 2006). Better than 
the 3mpg SUV mechanical whales everyone drives around. Costs like $90 to fill 
the tank, and you have to fill the tank just going up the road to get a loaf 
of bread.

-- 
Patrick "Diablo-D3" McFarland || [EMAIL PROTECTED]
"Computer games don't affect kids; I mean if Pac-Man affected us as kids, we'd 
all be running around in darkened rooms, munching magic pills and listening to
repetitive electronic music." -- Kristian Wilson, Nintendo, Inc, 1989


Re: Building the kernel with Cygwin

2005-08-25 Thread linux-os \(Dick Johnson\)

On Thu, 25 Aug 2005, Christopher Faylor wrote:

> On Thu, Aug 25, 2005 at 01:05:24PM -0400, linux-os (Dick Johnson) wrote:
>> On Thu, 25 Aug 2005, Chris du Quesnay wrote:
>>> The scripts/basic directory contains a fixdep.exe after the make is
>>> run.  There is no fixdep file.  I tried renaming the fixdep.exe to
>>> fixdep, but that also resulted in the same make error.
>>
>> Ah yes! The Makefile will not execute 'fixdep.exe'; it executes 'fixdep'
>> --hard coded.  I don't know how well cygwin emulates a Unix
>> environment, but maybe you can use an alias???  ..  Like...  alias
>> fixdep='fixdep.exe'
>
> How about a symlink?
>
> ln -s fixdep.exe fixdep
>

Maybe, I don't know.

I have Cygwin on my laptop, but never put the kernel on it so
I haven't tried.

> cgf
> --
> Christopher Faylor			spammer? ->	[EMAIL PROTECTED]
> Cygwin Co-Project Leader  [EMAIL PROTECTED]
> TimeSys, Inc.
>

Cheers,
Dick Johnson
Penguin : Linux version 2.6.12.5 on an i686 machine (5537.79 BogoMips).
Warning : 98.36% of all statistics are fiction.
.
I apologize for the following. I tried to kill it with the above dot :

The information transmitted in this message is confidential and may be 
privileged.  Any review, retransmission, dissemination, or other use of this 
information by persons or entities other than the intended recipient is 
prohibited.  If you are not the intended recipient, please notify Analogic 
Corporation immediately - by replying to this message or by sending an email to 
[EMAIL PROTECTED] - and destroy all copies of this information, including any 
attachments, without reading or disclosing them.

Thank you.

Re: 2.6: how do I this in sysfs?

2005-08-25 Thread Andrew Patterson

On Thu, 2005-08-25 at 13:52 -0500, Miller, Mike (OS Dev) wrote:
> I've been asked to pass this on for some kind of clarification. 
> We have management apps requiring specific information from the Smart
> Array controller. We're trying to use sysfs to accomplish the task. An
> example of what we need to do is below. I'm sure some of you will
> recognize this as CSMI.
> The basic question is this: how do you pass complex data structures back
> and forth between user/kernelspace and still abide by the rules around
> sysfs like: one attribute per file, text files only, etc?

Wouldn't you be able to post these items in sysfs attributes in the SAS
transport layer, assuming the cciss driver used the SAS transport layer?
Then the LLDD would be responsible for retrieving/setting the attribute
using whatever method is appropriate (dma, request queues, etc). 
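
To make the "one attribute per file, text only" rule concrete, here is a userspace sketch of how a single field from the quoted structures, the 8-byte `bSASAddress`, might be rendered as one line of text. The helper name, the big-endian byte order, and the `0x%016llx` output format are assumptions made for illustration; this is not the SAS transport class code.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical userspace helper: render an 8-byte SAS address as one line of
 * text, the way a single attribute file would present one value.
 * Returns the number of characters written, or -1 if buf is too small.
 */
int format_sas_address(const uint8_t addr[8], char *buf, size_t len)
{
    unsigned long long value = 0;

    for (int i = 0; i < 8; i++)
        value = (value << 8) | addr[i];   /* big-endian byte order assumed */

    int n = snprintf(buf, len, "0x%016llx\n", value);

    return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

Each scalar in the binary structure would get its own small formatter of this kind, so a management application reads many short text files instead of one packed buffer.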

I was hoping to start work on adding SDI operations to Christoph
Hellwig's SAS transport layer sometime next week.  

Andrew

> 
> Thanks,
> mikem
> > 
> > We have a storage controller which has some features which
> > work more or less as follows, but are not really "regular i/o"
> > in the sense that they are used for configuration or
> > management of devices rather than being the primary purpose
> > of the devices.
> > 
> > The host constructs a somewhat complex data buffer according
> > to a predefined convention, and fills out certain parts of
> > the buffer to formulate what could be a query, or perhaps
> > configuration data.
> > It then constructs a command which includes scatter gather
> > elements which reference this data buffer, and writes the bus
> > address of the command to a register on the controller.
> > 
> > The controller reads the command and data buffer from host
> > memory, and DMAs the results of the query into the same data
> > buffer, and issues an interrupt to the host.  So there's a
> > bidirectional transfer of data to/from the data buffer.
> > 
> > For example, one of the data buffers the controller understands
> > looks like what's below:
> > 
> > User applications need to be able to use this interface to
> > talk to the controller.  What's the recommended way to
> > implement such an interface?
> > 
> > // CC_CSMI_SAS_GET_PHY_INFO
> > typedef struct _COMMAND_HEADER {
> >    __u32 IOControllerNumber;
> >    __u32 Length;
> >    __u32 ReturnCode;
> >    __u32 Timeout;
> >    __u16 Direction;
> > } COMMAND_HEADER, *PCOMMAND_HEADER;
> > 
> > typedef struct _CSMI_SAS_IDENTIFY {
> >__u8  bDeviceType;
> >__u8  bRestricted;
> >__u8  bInitiatorPortProtocol;
> >__u8  bTargetPortProtocol;
> >__u8  bRestricted2[8];
> >__u8  bSASAddress[8];
> >__u8  bPhyIdentifier;
> >__u8  bSignalClass;
> >__u8  bReserved[6];
> > } CSMI_SAS_IDENTIFY,
> >   *PCSMI_SAS_IDENTIFY;
> > 
> > typedef struct _CSMI_SAS_PHY_ENTITY {
> >CSMI_SAS_IDENTIFY Identify;
> >__u8  bPortIdentifier;
> >__u8  bNegotiatedLinkRate;
> >__u8  bMinimumLinkRate;
> >__u8  bMaximumLinkRate;
> >__u8  bPhyChangeCount;
> >__u8  bAutoDiscover;
> >__u8  bReserved[2];
> >CSMI_SAS_IDENTIFY Attached;
> > } CSMI_SAS_PHY_ENTITY,
> >   *PCSMI_SAS_PHY_ENTITY;
> > 
> > typedef struct _CSMI_SAS_PHY_INFO {
> >__u8  bNumberOfPhys;
> >__u8  bReserved[3];
> >CSMI_SAS_PHY_ENTITY Phy[32];
> > } CSMI_SAS_PHY_INFO,
> >   *PCSMI_SAS_PHY_INFO;
> > 
> > typedef struct _CSMI_SAS_PHY_INFO_BUFFER {
> >COMMAND_HEADER IoctlHeader;
> >CSMI_SAS_PHY_INFO Information;
> > } CSMI_SAS_PHY_INFO_BUFFER,
> >   *PCSMI_SAS_PHY_INFO_BUFFER;
> > 
> 

