Re: [RFC] [PATCH] Multiple kprobes at an address

2005-04-20 Thread Suparna Bhattacharya
On Tue, Apr 19, 2005 at 09:22:34AM -0400, Ananth N Mavinakayanahalli wrote:
 Hi,
 
 Some instrumentation tools on Linux, like Itrace and systemtap
 (http://sourceware.org/systemtap) now use the kprobe infrastructure to
 gather information. One of the requirements of projects like systemtap 
 is the ability to define multiple kprobes at a given address.
 
 To this end, here is a patch that provides the feature. Patch is
 against linux-2.6.12-rc2.
 
 This patch provides the facility to register multiple kprobes at the
 same address using the existing interfaces. The house-keeping in
 case of multiple kprobes is taken care of within the base kprobes
 infrastructure.
 
 Another approach considered was to have a layer above the existing
 kprobes base (no modification to current kprobes base at all). A patch
 to this end is available at:
 
 http://sourceware.org/ml/systemtap/2005-q2/msg00089.html
 
 This approach would also allow for two sets of interfaces for
 un/registering kprobes. There have been quite a few discussions on the
 systemtap lists about whether two interfaces are necessary. But, with
 the current kprobes locking model, the layered approach leaves room
 for a few kprobe registration races.

Maybe a low-level __register_kprobe() that expects to be called
under the kprobes_lock would do the trick? That could also help
smooth out the remaining rough edges in the register/unregister paths.
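
The locked/unlocked split suggested above can be sketched in userspace C. This is only an illustration of the pattern, not kprobes code: register_probe_locked() plays the role of the proposed __register_kprobe(), and every name here (struct probe, lock_probes(), register_probe_layered(), demo()) is hypothetical. A C11 atomic_flag stands in for the kernel spinlock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PROBES 8

struct probe { void *addr; };                 /* hypothetical stand-in for struct kprobe */

static atomic_flag probe_lock = ATOMIC_FLAG_INIT;  /* stands in for the kprobes lock */
static bool probe_lock_held;                  /* debug aid, meaningful only in this single-threaded demo */
static struct probe *table[MAX_PROBES];
static size_t nr_probes;

static void lock_probes(void)
{
    while (atomic_flag_test_and_set(&probe_lock))
        ;                                     /* spin until the lock is free */
    probe_lock_held = true;
}

static void unlock_probes(void)
{
    probe_lock_held = false;
    atomic_flag_clear(&probe_lock);
}

/* Lookup by address; the caller must already hold the lock. */
static struct probe *get_probe_locked(void *addr)
{
    assert(probe_lock_held);
    for (size_t i = 0; i < nr_probes; i++)
        if (table[i]->addr == addr)
            return table[i];
    return NULL;
}

/* Low-level registration; the caller must already hold the lock. */
static int register_probe_locked(struct probe *p)
{
    assert(probe_lock_held);
    if (nr_probes == MAX_PROBES)
        return -1;
    table[nr_probes++] = p;
    return 0;
}

/* Existing-style interface: takes the lock itself. */
int register_probe(struct probe *p)
{
    int ret;

    lock_probes();
    ret = register_probe_locked(p);
    unlock_probes();
    return ret;
}

/* Layered interface: lookup and insert happen in one critical section,
 * so no other registration can slip in between the two steps. */
int register_probe_layered(struct probe *p, bool *shared)
{
    int ret;

    lock_probes();
    *shared = get_probe_locked(p->addr) != NULL;
    ret = register_probe_locked(p);
    unlock_probes();
    return ret;
}

/* Registers two probes at one address; returns 0 if the second saw the first. */
int demo(void)
{
    static int target;
    static struct probe a = { &target }, b = { &target };
    bool shared = false;

    if (register_probe(&a) != 0)
        return -1;
    if (register_probe_layered(&b, &shared) != 0 || !shared)
        return -1;
    return nr_probes == 2 ? 0 : -1;
}
```

The point of the split is that a layer above can combine several low-level steps under a single lock acquisition instead of racing between separate locked calls.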

 
 Both approaches are architecture agnostic.
 
 Other kprobe enhancements are in the pipeline, such as improved kprobe
 locking and support for return address probes. Given that, the main
 questions to be answered now for the multi-kprobe feature are:
 
 1. Is the approach taken by the patch attached good?
 2. Do we take the layered approach?
 3. If we take the layered approach, do we need multiple interfaces?

I think the criteria would depend on whether you can avoid policy
creep down the line, e.g. ordering/priorities amongst multiple kprobe
handlers, conditionally bypassing handlers in the chain, etc. I'm
not sure if the requirements for these tools are fully clear as yet;
some of the users intended for multi-kprobes are probably still on
the drawing board. So I guess you can cross that bridge when you
come to it.

Regards
Suparna

 4. Would existing users be impacted by the change?
 
 Comments?
 
 Thanks,
 Ananth
 

 diff -Naurp temp/linux-2.6.12-rc2/include/linux/kprobes.h linux-2.6.12-rc2/include/linux/kprobes.h
 --- temp/linux-2.6.12-rc2/include/linux/kprobes.h 2005-04-18 11:44:57.0 -0400
 +++ linux-2.6.12-rc2/include/linux/kprobes.h  2005-04-18 13:36:23.0 -0400
 @@ -43,6 +43,9 @@ typedef int (*kprobe_fault_handler_t) (s
  struct kprobe {
   struct hlist_node hlist;
  
 + /* list of kprobes for multi-handler support */
 + struct list_head list;
 +
   /* location of the probe point */
   kprobe_opcode_t *addr;
  
 diff -Naurp temp/linux-2.6.12-rc2/kernel/kprobes.c linux-2.6.12-rc2/kernel/kprobes.c
 --- temp/linux-2.6.12-rc2/kernel/kprobes.c 2005-04-18 11:44:57.0 -0400
 +++ linux-2.6.12-rc2/kernel/kprobes.c 2005-04-18 13:41:38.0 -0400
 @@ -44,6 +44,7 @@ static struct hlist_head kprobe_table[KP
  
  unsigned int kprobe_cpu = NR_CPUS;
  static DEFINE_SPINLOCK(kprobe_lock);
 +static struct kprobe *curr_kprobe;
  
  /* Locks kprobe: irqs must be disabled */
  void lock_kprobes(void)
 @@ -73,22 +74,142 @@ struct kprobe *get_kprobe(void *addr)
   return NULL;
  }
  
 +/*
 + * Aggregate handlers for multiple kprobes support - these handlers
 + * take care of invoking the individual kprobe handlers on p->list
 + */
 +int aggr_pre_handler(struct kprobe *p, struct pt_regs *regs)
 +{
 +	struct kprobe *kp;
 +
 +	list_for_each_entry(kp, &p->list, list) {
 +		if (kp->pre_handler) {
 +			curr_kprobe = kp;
 +			kp->pre_handler(kp, regs);
 +			curr_kprobe = NULL;
 +		}
 +	}
 +	return 0;
 +}
 +
 +void aggr_post_handler(struct kprobe *p, struct pt_regs *regs,
 +		unsigned long flags)
 +{
 +	struct kprobe *kp;
 +
 +	list_for_each_entry(kp, &p->list, list) {
 +		if (kp->post_handler) {
 +			curr_kprobe = kp;
 +			kp->post_handler(kp, regs, flags);
 +			curr_kprobe = NULL;
 +		}
 +	}
 +	return;
 +}
 +
 +int aggr_fault_handler(struct kprobe *p, struct pt_regs *regs, int trapnr)
 +{
 +	/*
 +	 * if we faulted during the execution of a user specified
 +	 * probe handler, invoke just that probe's fault handler
 +	 */
 +	if (curr_kprobe && curr_kprobe->fault_handler) {
 +		if (curr_kprobe->fault_handler(curr_kprobe, regs, trapnr))
 +			return 1;
 +	}
 +	return 0;
 +}
 +
 +/* 
 + * Fill in the required fields of the manager kprobe. Replace the 
 + * earlier kprobe in the hlist with the manager kprobe
 + */ 
 +static inline 

i830 lockup

2005-04-20 Thread Harish K Harshan
Hello,

   I am developing a device driver for the AxiomTek AX5621H data
acquisition card, and I am encountering some problems on a particular
machine. This driver works fine on normal machines, but crashes
on an Industrial PC with an Intel 830 2-piece board (the main board
goes into a PCI and an ISA slot at the same time on the expansion board,
which houses all the PCI, ISA and AGP slots). The error I get is
given below:

CPU 0 : Machine Check Exception : 0004
Bank 0 : a20084010400
Kernel panic : CPU context corrupt
In interrupt handler - not syncing

The DMA channel I use is DMA1, and I check for free interrupts and then
allocate them accordingly. So there are no conflicts, I assume. Also there
are no conflicts on the address range I have allocated (for now it is
0x300). The card supports only DMA channels 1 and 3. I have tried both, with
the same result. If anyone among you has had experience with such a
problem, any help in fixing this matter would be greatly appreciated.


Thank You.
Harish Harshan.



-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH scsi-misc-2.6 01/05] scsi: make blk layer set REQ_SOFTBARRIER when a request is dispatched

2005-04-20 Thread Jens Axboe
On Wed, Apr 20 2005, Tejun Heo wrote:
 01_scsi_blk_make_started_requests_ordered.patch
 
   Reordering already started requests is without any real
   benefit and causes problems if the request has its
   driver-specific resources allocated (as in SCSI).  This patch
   makes elv_next_request() set REQ_SOFTBARRIER automatically
   when a request is dispatched.
 
   As both as and cfq schedulers don't allow passing requeued
   requests, the only behavior change is that requests deferred
   by prep_fn won't be passed by other requests.  This change
   shouldn't cause any problem.  The only affected driver other
   than SCSI is i2o_block.
 
 Signed-off-by: Tejun Heo [EMAIL PROTECTED]
 
  elevator.c |8 
  1 files changed, 4 insertions(+), 4 deletions(-)
 
 Index: scsi-reqfn-export/drivers/block/elevator.c
 ===
 --- scsi-reqfn-export.orig/drivers/block/elevator.c   2005-04-20 08:13:01.0 +0900
 +++ scsi-reqfn-export/drivers/block/elevator.c 2005-04-20 08:13:33.0 +0900
 @@ -370,11 +370,11 @@ struct request *elv_next_request(request
  
   while ((rq = __elv_next_request(q)) != NULL) {
   /*
 -  * just mark as started even if we don't start it, a request
 -  * that has been delayed should not be passed by new incoming
 -  * requests
 +  * just mark as started even if we don't start it.
 +  * also, as a request that has been delayed should not
 +  * be passed by new incoming requests, set softbarrier.
*/
 - rq->flags |= REQ_STARTED;
 + rq->flags |= REQ_STARTED | REQ_SOFTBARRIER;
  
   if (rq == q->last_merge)
   q->last_merge = NULL;

Do it on requeue, please - not on the initial spotting of the request.

-- 
Jens Axboe



Re: [PATCH scsi-misc-2.6 01/05] scsi: make blk layer set REQ_SOFTBARRIER when a request is dispatched

2005-04-20 Thread Tejun Heo
Jens Axboe wrote:
On Wed, Apr 20 2005, Tejun Heo wrote:
01_scsi_blk_make_started_requests_ordered.patch
Reordering already started requests is without any real
benefit and causes problems if the request has its
driver-specific resources allocated (as in SCSI).  This patch
makes elv_next_request() set REQ_SOFTBARRIER automatically
when a request is dispatched.
As both as and cfq schedulers don't allow passing requeued
requests, the only behavior change is that requests deferred
by prep_fn won't be passed by other requests.  This change
shouldn't cause any problem.  The only affected driver other
than SCSI is i2o_block.
Signed-off-by: Tejun Heo [EMAIL PROTECTED]
elevator.c |8 
1 files changed, 4 insertions(+), 4 deletions(-)
Index: scsi-reqfn-export/drivers/block/elevator.c
===
--- scsi-reqfn-export.orig/drivers/block/elevator.c 2005-04-20 08:13:01.0 +0900
+++ scsi-reqfn-export/drivers/block/elevator.c  2005-04-20 08:13:33.0 +0900
@@ -370,11 +370,11 @@ struct request *elv_next_request(request
while ((rq = __elv_next_request(q)) != NULL) {
/*
-* just mark as started even if we don't start it, a request
-* that has been delayed should not be passed by new incoming
-* requests
+* just mark as started even if we don't start it.
+* also, as a request that has been delayed should not
+* be passed by new incoming requests, set softbarrier.
 */
-   rq->flags |= REQ_STARTED;
+   rq->flags |= REQ_STARTED | REQ_SOFTBARRIER;
if (rq == q->last_merge)
q->last_merge = NULL;

Do it on requeue, please - not on the initial spotting of the request.

The thing is that we also need to set REQ_SOFTBARRIER on
BLKPREP_DEFER.  So, it will be two places - in elv_next_request and
blk_requeue_request.  The end result will be the same.  Do you think
doing it on the requeue paths is better?

--
tejun


[PATCH] remove some usesless casts

2005-04-20 Thread Jörn Engel
Squashfs is extremely cast-happy.  This patch makes it less so.
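
For readers wondering why the casts in this patch are removable: in C, a void * converts implicitly to and from any object pointer type, so neither a void * field nor an allocator's return value needs a cast. A minimal userspace sketch, with hypothetical stand-in types (struct sb_info, struct super) and calloc() in place of kmalloc():

```c
#include <assert.h>
#include <stdlib.h>

struct sb_info { int devblksize; };   /* hypothetical stand-in for squashfs_sb_info */
struct super { void *s_fs_info; };    /* hypothetical stand-in for struct super_block */

/* Returns 0 if the cast-free assignments behave as expected. */
int demo(void)
{
    struct super s;
    struct sb_info *info;
    int ok;

    s.s_fs_info = calloc(1, sizeof(struct sb_info));  /* no cast on the allocation */
    if (!s.s_fs_info)
        return -1;

    info = s.s_fs_info;               /* void * converts to struct sb_info * implicitly */
    info->devblksize = 1024;

    assert((void *)info == s.s_fs_info);              /* same object either way */
    ok = info->devblksize == 1024;
    free(s.s_fs_info);
    return ok ? 0 : -1;
}
```

This holds for C only; C++ would reject the implicit conversion from void *.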

Jörn

-- 
If you're willing to restrict the flexibility of your approach,
you can almost always do something better.
-- John Carmack


Signed-off-by: Jörn Engel [EMAIL PROTECTED]
---

 fs/squashfs/inode.c |   63 ++--
 1 files changed, 27 insertions(+), 36 deletions(-)

--- linux-2.6.12-rc2cow/fs/squashfs/inode.c~squashfs_cu1 2005-04-20 07:52:46.0 +0200
+++ linux-2.6.12-rc2cow/fs/squashfs/inode.c 2005-04-20 08:11:10.254367656 +0200
@@ -111,7 +111,7 @@ struct inode_operations squashfs_dir_ino
 static struct buffer_head *get_block_length(struct super_block *s,
int *cur_index, int *offset, int *c_byte)
 {
-   squashfs_sb_info *msBlk = (squashfs_sb_info *)s->s_fs_info;
+   squashfs_sb_info *msBlk = s->s_fs_info;
unsigned short temp;
struct buffer_head *bh;
 
@@ -176,7 +176,7 @@ unsigned int squashfs_read_data(struct s
unsigned int index, unsigned int length,
unsigned int *next_index)
 {
-   squashfs_sb_info *msBlk = (squashfs_sb_info *)s->s_fs_info;
+   squashfs_sb_info *msBlk = s->s_fs_info;
struct buffer_head *bh[((SQUASHFS_FILE_MAX_SIZE - 1) >>
msBlk->devblksize_log2) + 2];
unsigned int offset = index & ((1 << msBlk->devblksize_log2) - 1);
@@ -285,7 +285,7 @@ int squashfs_get_cached_block(struct sup
int length, unsigned int *next_block,
unsigned int *next_offset)
 {
-   squashfs_sb_info *msBlk = (squashfs_sb_info *)s->s_fs_info;
+   squashfs_sb_info *msBlk = s->s_fs_info;
int n, i, bytes, return_length = length;
unsigned int next_index;
 
@@ -390,7 +390,7 @@ static int get_fragment_location(struct 
unsigned int *fragment_start_block,
unsigned int *fragment_size)
 {
-   squashfs_sb_info *msBlk = (squashfs_sb_info *)s->s_fs_info;
+   squashfs_sb_info *msBlk = s->s_fs_info;
unsigned int start_block =
msBlk->fragment_index[SQUASHFS_FRAGMENT_INDEX(fragment)];
int offset = SQUASHFS_FRAGMENT_INDEX_OFFSET(fragment);
@@ -434,7 +434,7 @@ static struct squashfs_fragment_cache *g
int length)
 {
int i, n;
-   squashfs_sb_info *msBlk = (squashfs_sb_info *)s->s_fs_info;
+   squashfs_sb_info *msBlk = s->s_fs_info;
 
for (;;) {
down(&msBlk->fragment_mutex);
@@ -466,8 +466,7 @@ static struct squashfs_fragment_cache *g
SQUASHFS_CACHED_FRAGMENTS;

if (msBlk->fragment[i].data == NULL)
-   if (!(msBlk->fragment[i].data =
-   (unsigned char *) kmalloc
+   if (!(msBlk->fragment[i].data = kmalloc
(SQUASHFS_FILE_MAX_SIZE,
 GFP_KERNEL))) {
ERROR("Failed to allocate fragment "
@@ -509,7 +508,7 @@ static struct squashfs_fragment_cache *g
 static struct inode *squashfs_new_inode(struct super_block *s,
squashfs_base_inode_header *inodeb, unsigned int ino)
 {
-   squashfs_sb_info *msBlk = (squashfs_sb_info *)s->s_fs_info;
+   squashfs_sb_info *msBlk = s->s_fs_info;
squashfs_super_block *sBlk = &msBlk->sBlk;
struct inode *i = new_inode(s);
 
@@ -535,7 +534,7 @@ static struct inode *squashfs_new_inode(
 static struct inode *squashfs_iget(struct super_block *s, squashfs_inode inode)
 {
struct inode *i;
-   squashfs_sb_info *msBlk = (squashfs_sb_info *)s->s_fs_info;
+   squashfs_sb_info *msBlk = s->s_fs_info;
squashfs_super_block *sBlk = &msBlk->sBlk;
unsigned int block = SQUASHFS_INODE_BLK(inode) +
sBlk->inode_table_start;
@@ -837,13 +836,12 @@ static int squashfs_fill_super(struct su
 
TRACE("Entered squashfs_read_superblock\n");
 
-   if (!(s->s_fs_info = (void *) kmalloc(sizeof(squashfs_sb_info),
-   GFP_KERNEL))) {
+   if (!(s->s_fs_info = kmalloc(sizeof(squashfs_sb_info), GFP_KERNEL))) {
ERROR("Failed to allocate superblock\n");
return -ENOMEM;
}
memset(s->s_fs_info, 0, sizeof(squashfs_sb_info));
-   msBlk = (squashfs_sb_info *) s->s_fs_info;
+   msBlk = s->s_fs_info;
sBlk = &msBlk->sBlk;

msBlk->devblksize = sb_min_blocksize(s, BLOCK_SIZE);
@@ -914,8 +912,7 @@ static int squashfs_fill_super(struct su
s->s_op = &squashfs_ops;
 
/* Init inode_table block pointer array */
-   if (!(msBlk->block_cache = (squashfs_cache *)
-   

Re: i830 lockup

2005-04-20 Thread Chris Wedgwood
On Wed, Apr 20, 2005 at 12:36:45PM +0530, Harish K Harshan wrote:

 CPU 0 : Machine Check Exception : 0004
 Bank 0 : a20084010400
 Kernel panic : CPU context corrupt
 In interrupt handler - not syncing

CPU got messed up...  could be a bad CPU/cache/chipset or simply it's
overheating or has a bad power supply.


Re: [Lse-tech] Re: [RFC PATCH] Dynamic sched domains aka Isolated cpusets

2005-04-20 Thread Dinakar Guniguntala
On Tue, Apr 19, 2005 at 10:23:48AM -0700, Paul Jackson wrote:
 
 How does this play out in your interface?  Are you convinced that
 your invariants are preserved at all times, to all users?  Can
 you present a convincing argument to others that this is so?


Let me give an example of how the current version of isolated cpusets can
be used and hopefully clarify my approach.


Consider a system with 8 cpus that needs to run a mix of workloads.
One set of applications has low latency requirements and another
set has a mixed workload. The administrator decides to allot
2 cpus to the low latency application and the rest to other apps.
To do this, he creates two cpusets.
(All cpusets are considered to be exclusive for this discussion.)

   cpuset             cpus   isolated   cpus_allowed   isolated_map
   top                0-7    1          0-7            0
   top/lowlat         0-1    0          0-1            0
   top/others         2-7    0          2-7            0

He now wants to partition the system along these lines as he wants
to isolate lowlat from the rest of the system to ensure that
a. No tasks from the parent cpuset (top_cpuset in this case)
   use these cpus
b. load balance does not run across all cpus 0-7

He does this by

cd /mount-point/lowlat
/bin/echo 1 > cpu_isolated

Internally it takes the cpuset_sem, does some sanity checks and ensures
that these cpus are not visible to any other cpuset including its parent
(by removing these cpus from its parent's cpus_allowed mask and adding
them to its parent's isolated_map) and then calls sched code to partition
the system as

[0-1] [2-7]

   The internal state of the data structures is as follows

   cpuset             cpus   isolated   cpus_allowed   isolated_map
   top                0-7    1          2-7            0-1
   top/lowlat         0-1    1          0-1            0
   top/others         2-7    0          2-7            0

---


The administrator now wants to further partition the others cpuset into
a cpu intensive application and a batch one

   cpuset             cpus   isolated   cpus_allowed   isolated_map
   top                0-7    1          2-7            0-1
   top/lowlat         0-1    1          0-1            0
   top/others         2-7    0          2-7            0
   top/others/cint    2-3    0          2-3            0
   top/others/batch   4-7    0          4-7            0


If now the administrator wants to isolate the cint cpuset...

cd /mount-point/others
/bin/echo 1 > cpu_isolated

(At this point no new sched domains are built
 as there exists a sched domain which exactly
 matches the cpus in the others cpuset.)

cd /mount-point/others/cint
/bin/echo 1 > cpu_isolated

At this point cpus from the others cpuset are also taken away from its
parent's cpus_allowed mask and put into the parent's isolated_map. This means
that the parent's cpus_allowed mask is empty.  This now results in
partitioning the others cpuset and building two new sched domains as follows

[2-3] [4-7]

Notice that cpus 0-1, having already been isolated, are not affected
by this operation.

   cpuset             cpus   isolated   cpus_allowed   isolated_map
   top                0-7    1          0              0-7
   top/lowlat         0-1    1          0-1            0
   top/others         2-7    1          4-7            2-3
   top/others/cint    2-3    1          2-3            0
   top/others/batch   4-7    0          4-7            0
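
The mask bookkeeping in the tables above can be modelled with plain bit operations. This is a toy sketch, not the patch's code: struct cpuset_model, cpu_range(), isolate() and demo() are hypothetical names, one byte stands in for a cpumask on the 8-cpu example, and the sanity checks and sched domain rebuild are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of one cpuset; bit n set means cpu n is in the mask. */
struct cpuset_model {
    struct cpuset_model *parent;
    uint8_t cpus;
    uint8_t cpus_allowed;
    uint8_t isolated_map;
    int isolated;
};

/* Build a mask covering cpus first..last inclusive. */
static uint8_t cpu_range(int first, int last)
{
    uint8_t mask = 0;

    for (int cpu = first; cpu <= last; cpu++)
        mask |= (uint8_t)(1u << cpu);
    return mask;
}

/* Isolate cs: hide its cpus from the parent's cpus_allowed and
 * record them in the parent's isolated_map. */
static void isolate(struct cpuset_model *cs)
{
    assert(cs->parent != NULL);
    cs->parent->cpus_allowed &= (uint8_t)~cs->cpus;
    cs->parent->isolated_map |= cs->cpus;
    cs->isolated = 1;
}

/* Replays the example: isolate lowlat, then others, then cint. */
int demo(void)
{
    struct cpuset_model top    = { NULL,    cpu_range(0, 7), cpu_range(0, 7), 0, 1 };
    struct cpuset_model lowlat = { &top,    cpu_range(0, 1), cpu_range(0, 1), 0, 0 };
    struct cpuset_model others = { &top,    cpu_range(2, 7), cpu_range(2, 7), 0, 0 };
    struct cpuset_model cint   = { &others, cpu_range(2, 3), cpu_range(2, 3), 0, 0 };

    isolate(&lowlat);
    if (top.cpus_allowed != cpu_range(2, 7) || top.isolated_map != cpu_range(0, 1))
        return -1;

    isolate(&others);
    isolate(&cint);
    if (top.cpus_allowed != 0 || top.isolated_map != cpu_range(0, 7))
        return -1;
    if (others.cpus_allowed != cpu_range(4, 7) || others.isolated_map != cpu_range(2, 3))
        return -1;
    return 0;
}
```

The end state matches the last table: top has an empty cpus_allowed with isolated_map 0-7, and others keeps 4-7 with 2-3 moved to its isolated_map.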

---

The admin now wants to run more applications in the cint cpuset
and decides to borrow a couple of cpus from the batch cpuset.
He removes cpus 4-5 from batch and adds them to cint.

   cpuset             cpus   isolated   cpus_allowed   isolated_map
   top                0-7    1          0              0-7
   top/lowlat         0-1    1          0-1            0
   top/others         2-7    1          6-7            2-5
   top/others/cint    2-5    1          2-5            0
   top/others/batch   6-7    0          6-7            0

As cint is already isolated, adding cpus causes the sched domains to be
rebuilt for all cpus covered by its cpus_allowed and its parent's
cpus_allowed, so the new sched domains will look as follows

[2-5] [6-7]

cpus 0-1 are of course still not affected

Similarly the admin can remove cpus from cint, which will
result in the domains being rebuilt to what they were before

[2-3] [4-7]

---


Hope this clears up my approach. Also note that we still need 

Re: [RFC] FUSE permission modell (Was: fuse review bits)

2005-04-20 Thread Miklos Szeredi
 Likely because it's a chroot vulnerability.
 
 It allows a process to obtain a reference to the root vfsmount that
 doesn't have chroot checks performed on it.
 
 Consider the following pseudo example:
 
[...]

 if main is run within a chroot where its / is on the same vfsmount as
  its .., then the application can step out of the chroot using clone(2).
 
 Note: using chdir in a vfsmount outside of your namespace works, however
 you won't be able to walk off that vfsmount (to its parent or children).

How about fixing fchdir, so it checks whether you have gone outside the
tree under current->fs->rootmnt?  Should be fairly easy to do.

Miklos


Re: E1000 - page allocation failure - saga continues :(

2005-04-20 Thread Yann Dupont
Lukas Hejtmanek wrote:

On Tue, Apr 19, 2005 at 09:23:46AM +0200, Yann Dupont wrote:

Have you turned NAPI on? I tried with it off on e1000 and ...
surprise!
I haven't had any messages for 12 hours now (usually I got them in less than 1 hour)



I have NAPI on. I tried to turn it off but my test failed, I can see allocation
failure again.

  

Well, forgive me :)
I have turned NAPI back on and my box is still happy 19 hours later...

So it's obviously not NAPI.

The problem is that between the 2 incarnations of the kernel (2.6.11.7, with
kswapd messages on those that work well), I also changed some other options.
Not exactly the best way to track bugs :(

Anyway, I'll try to catch THE option that makes the kernel not so happy
under heavy stress. Stay tuned,

-- 
Yann Dupont, CRI de l'université de Nantes
Tel: 02.51.12.53.91 - Fax: 02.51.12.58.60 - [EMAIL PROTECTED]



Re: [2.6 patch] drivers/ieee1394/: remove unneeded EXPORT_SYMBOL's

2005-04-20 Thread Arjan van de Ven
On Wed, 2005-04-20 at 00:00 +0200, Stefan Richter wrote:
 Arjan van de Ven wrote:
  On Tue, 2005-04-19 at 15:13 -0400, Jody McIntyre wrote:
  On Sun, Apr 17, 2005 at 09:57:07PM +0200, Adrian Bunk wrote:
   This patch removes unneeded EXPORT_SYMBOL's.
 ...
  Given the objections to your December patch, why should we accept this
  one now?
  
  since there still isn't a user ??
 
 There are users (though not in the kernel at the moment)

nor for the last 5 months... how long will it be ?



Re: [RFC PATCH] Dynamic sched domains aka Isolated cpusets

2005-04-20 Thread Dinakar Guniguntala
On Tue, Apr 19, 2005 at 08:26:39AM -0700, Paul Jackson wrote:
  * Your understanding of cpu_exclusive is not the same as mine.

Sorry for creating confusion by what I said earlier. I do understand
exactly what cpu_exclusive means. It's just that when I started
working on this (a long time ago) I had a different notion, and that is
what I was referring to. I probably should never have brought that up.

 
  Since isolated cpusets are trying to partition the system, this
  can be restricted to only the first level of cpusets.
 
 I do not think such a restriction is a good idea.  For example, let's say
 our 8 CPU system has the following cpusets:
 

And my current implementation has no such restriction, I was only
suggesting that to simplify the code.

 
  Also I think we can add further restrictions in terms of not being able
  to change (add/remove) cpus within an isolated cpuset.
 
 My approach agrees on this restriction.  Earlier I wrote:
  Also note that adding or removing a cpu from a cpuset that has
  its domain_cpu_current flag set true must fail, and similarly
  for domain_mem_current.
 
 This restriction is required in my approach because the CPUs in the
 domain_cpu_current cpusets (the isolated CPUs, in your terms) form a
 partition (disjoint cover) of the CPUs in the system, which property
 would be violated immediately if any CPU were added or removed from any
 cpuset defining the partition.

See my other note explaining how things work currently. I do feel that
this restriction is not good.

-Dinakar


Re: GPL violation by CorAccess?

2005-04-20 Thread Bernd Petrovitsch
On Tue, 2005-04-19 at 17:37 -0600, Chris Friesen wrote:
 Richard B. Johnson wrote:
 
  No. Accompany it with a written offer to __provide__ the source
  code for any GPL stuff they used (like the kernel or drivers).
  Anything at the application-level is NOT covered by the GPL.

That depends on the software used there.

  They do not have to give away their trade-secrets.

Unless they coded them into GPL software ...

 GPL'd applications would still be covered by the GPL, no?

Good question: Strictly speaking, if you omit the GPL in the delivered
software/product/whatever, you have violated the GPL yourself and thus
lose all rights which are given to you through the GPL.

 If I buy their product, I should be able to ask them for the source to 
 all GPL'd entities that are present in the system, including the kernel, 
 drivers, and all GPL'd userspace apps.

ACK.

 Any *new* apps that they wrote they would of course be free to keep private.

As long as they do not statically link against LGPL (or GPL) code and as
long as they do not link dynamically against GPL code. And there are
probably more rules.

Bernd
-- 
Firmix Software GmbH   http://www.firmix.at/
mobil: +43 664 4416156 fax: +43 1 7890849-55
  Embedded Linux Development and Services





Re: [PATCH x86_64] Live Patching Function on 2.6.11.7

2005-04-20 Thread Takashi Ikebe
Hi,
I think the basic assumptions between us and you do not match.
Our assumption is that live patching is not for debugging, but is a real
operational method to fix a very important process which cannot be stopped.
Live patching fixes the important process's bug without disrupting the process.

Chris Wedgwood wrote:
On Wed, Apr 20, 2005 at 01:18:23PM +0900, Takashi Ikebe wrote:

Well, Live patching is just a patch, so I think the developer of
patch should know the original source code well.
In which case they could fix the application.
Yes, so they provide us the patch module, and we want to apply the patch 
as live patching.

Well, as you said some application can do that, but some application
can not continue service with your suggestion.
Such as?
Please think about a process which uses connection-oriented
communication such as TCP (it's only an example) between users and
server. During the status copy, all the sessions between users and server
are disconnected...

They don't have to be. 
???
To take over the application status, connection-oriented
communications (SOCK_STREAM) need to be disconnected by close().
The same network port cannot be bound by multiple processes.

How can you do that??
Users don't want to disconnect (and we don't want to disconnect them either),
but the server process needs to do so to take over the status.

can not save the existing service at all.
 
Yes they can.

It's one example, but similar problems may occur whenever processes
use resources which are mainly controlled by the kernel.
What resources?  We can migrate memory and file descriptors?  What is
missing?
For example,
the current process's rlimit resources.
You never set the current rusage on the new process.
Especially, ru_utime and ru_stime are very important to critical applications.
I don't know much about resources, but there may be more (I hope not..)
--
Takashi Ikebe
NTT Network Service Systems Laboratories
9-11, Midori-Cho 3-Chome Musashino-Shi,
Tokyo 180-8585 Japan
Tel : +81 422 59 4246, Fax : +81 422 60 4012
e-mail : [EMAIL PROTECTED]


Re: [PATCH scsi-misc-2.6 01/05] scsi: make blk layer set REQ_SOFTBARRIER when a request is dispatched

2005-04-20 Thread Tejun Heo
 Hello, Jens.

On Wed, Apr 20, 2005 at 08:30:10AM +0200, Jens Axboe wrote:
 Do it on requeue, please - not on the initial spotting of the request.

 This is the reworked version of the patch.  It sets REQ_SOFTBARRIER
in two places - in elv_next_request() on BLKPREP_DEFER and in
blk_requeue_request().

 Other patches apply cleanly with this patch or the original one and
the end result is the same, so take your pick.  :-)


 Signed-off-by: Tejun Heo [EMAIL PROTECTED]


Index: scsi-reqfn-export/drivers/block/elevator.c
===
--- scsi-reqfn-export.orig/drivers/block/elevator.c 2005-04-20 16:24:26.0 +0900
+++ scsi-reqfn-export/drivers/block/elevator.c  2005-04-20 16:31:36.0 +0900
@@ -291,6 +291,13 @@ void elv_requeue_request(request_queue_t
}
 
/*
+* the request is prepped and may have some resources allocated.
+* allowing unprepped requests to pass this one may cause resource
+* deadlock.  turn on softbarrier.
+*/
+   rq->flags |= REQ_SOFTBARRIER;
+
+   /*
 * if iosched has an explicit requeue hook, then use that. otherwise
 * just put the request at the front of the queue
 */
@@ -386,6 +393,12 @@ struct request *elv_next_request(request
if (ret == BLKPREP_OK) {
break;
} else if (ret == BLKPREP_DEFER) {
+   /*
+* the request may have been (partially) prepped.
+* we need to keep this request in the front to
+* avoid resource deadlock.  turn on softbarrier.
+*/
+   rq->flags |= REQ_SOFTBARRIER;
rq = NULL;
break;
} else if (ret == BLKPREP_KILL) {


Re: [PATCH x86_64] Live Patching Function on 2.6.11.7

2005-04-20 Thread Chris Wedgwood
On Wed, Apr 20, 2005 at 04:35:07PM +0900, Takashi Ikebe wrote:

 I think basic assumption between us and you is not match...

No, I think at a high-level they do.

 Our assumption, the live patching is not for debug, but for the real
 operation method to fix very very important process which can not
 stop.

I understand that.

It might be, though, that you could do what you want with some kind
of enhanced ptrace or debugging interface that would also be of value
to other people and probably be simpler than your proposed patch.

 Live patching fixes the important process's bug without disrupting the
 process.

I understand that.

 To take over the application status, connection-oriented
 communications (SOCK_STREAM) need to be disconnected by close().
 The same network port cannot be bound by multiple processes.

AF_UNIX socket with SCM_RIGHTS
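
For reference, a minimal sketch of descriptor passing with SCM_RIGHTS over an AF_UNIX socket, using the standard sendmsg()/recvmsg() ancillary-data interface. The helper names send_fd(), recv_fd() and demo() are made up for illustration. The demo passes a pipe descriptor within one process only to stay self-contained; the mechanism is the same for any open descriptor, including a connected or listening TCP socket handed to another process on the same host.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send descriptor fd over the AF_UNIX socket sock; returns 0 on success. */
static int send_fd(int sock, int fd)
{
    char byte = 'F';                  /* one byte of real data accompanies the control message */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        struct cmsghdr align;         /* forces correct alignment of buf */
        char buf[CMSG_SPACE(sizeof(int))];
    } u;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    assert(fd >= 0);
    memset(&msg, 0, sizeof(msg));
    memset(&u, 0, sizeof(u));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor from sock; returns the new fd or -1. */
static int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        struct cmsghdr align;
        char buf[CMSG_SPACE(sizeof(int))];
    } u;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;
    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}

/* Passes a pipe's write end across a socketpair and writes through the copy. */
int demo(void)
{
    int sv[2], pfd[2], got;
    char c = 0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0 || pipe(pfd) < 0)
        return -1;
    if (send_fd(sv[0], pfd[1]) < 0)
        return -1;
    got = recv_fd(sv[1]);
    if (got < 0 || write(got, "x", 1) != 1 || read(pfd[0], &c, 1) != 1)
        return -1;
    close(got); close(pfd[0]); close(pfd[1]); close(sv[0]); close(sv[1]);
    return c == 'x' ? 0 : -1;
}
```

The AF_UNIX socket is only the local transport for the handover; the descriptor being passed keeps whatever type it had in the sender.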

 especially, ru_utime and ru_stime are very important to critical
 applications.

how so?  what is magical about these that can't be dealt with in
userspace should it span 2+ processes?


Re: [PATCH scsi-misc-2.6 01/05] scsi: make blk layer set REQ_SOFTBARRIER when a request is dispatched

2005-04-20 Thread Nick Piggin
On Wed, 2005-04-20 at 16:40 +0900, Tejun Heo wrote:
  Hello, Jens.
 
 On Wed, Apr 20, 2005 at 08:30:10AM +0200, Jens Axboe wrote:
  Do it on requeue, please - not on the initial spotting of the request.
 
  This is the reworked version of the patch.  It sets REQ_SOFTBARRIER
 in two places - in elv_next_request() on BLKPREP_DEFER and in
 blk_requeue_request().
 
  Other patches apply cleanly with this patch or the original one and
 the end result is the same, so take your pick.  :-)
 

I'm not sure that you need *either* one.

As far as I'm aware, REQ_SOFTBARRIER is used when feeding requests
into the top of the block layer, and is used to guarantee the device
driver gets the requests in a specific ordering.

When dealing with the requests at the other end (ie.
elevator_next_req_fn, blk_requeue_request), then ordering does not
change.

That is - if you call elevator_next_req_fn and don't dequeue the
request, then that's the same request you'll get next time.

And blk_requeue_request will push the request back onto the end of
the queue in a LIFO manner.

So I think adding barriers, apart from not doing anything, confuses
the issue because it suggests there *could* be reordering without
them.

Or am I completely wrong? It's been a while since I last got into
the code.


-- 
SUSE Labs, Novell Inc.





AMD Mobile Sempron and kernel config

2005-04-20 Thread Lars Täuber
Hallo there,

hopefully my message is not too disturbing.
We own a notebook with a mobile sempron processor.
Which CPU choice is the one we should take?

CONFIG_MK7 or CONFIG_MK8 ?

Of course it's for a 32bit kernel.

Here is the output of /proc/cpuinfo:

processor   : 0
vendor_id   : AuthenticAMD
cpu family  : 15
model   : 28
model name  : Mobile AMD Sempron(tm) Processor 2800+
stepping: 0
cpu MHz : 800.190
cache size  : 256 KB
fdiv_bug: no
hlt_bug : no
f00f_bug: no
coma_bug: no
fpu : yes
fpu_exception   : yes
cpuid level : 1
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
pat pse36 clflush mmx fxsr sse sse2 pni syscall nx mmxext fxsr_opt 3dnowext 
3dnow lahf_lm
bogomips: 1589.24


Thanks a lot
Lars


Re: [PATCH x86_64] Live Patching Function on 2.6.11.7

2005-04-20 Thread Takashi Ikebe
Chris Wedgwood wrote:
 On Wed, Apr 20, 2005 at 04:35:07PM +0900, Takashi Ikebe wrote:

  To takeover the application status, connection type
  communications(SOCK_STREAM) are need to be disconnected by close().
  Same network port is not allowed to bind by multiple processes

 AF_UNIX socket with SCM_RIGHTS

Hmm, most Internet-based services use TCP over IPv4 or IPv6, or SCTP.
AF_UNIX cannot be used for inter-node communication.
--
Takashi Ikebe
NTT Network Service Systems Laboratories
9-11, Midori-Cho 3-Chome Musashino-Shi,
Tokyo 180-8585 Japan
Tel : +81 422 59 4246, Fax : +81 422 60 4012
e-mail : [EMAIL PROTECTED]


Re: [PATCH x86_64] Live Patching Function on 2.6.11.7

2005-04-20 Thread Chris Wedgwood
On Wed, Apr 20, 2005 at 04:57:31PM +0900, Takashi Ikebe wrote:

 hmm.. most internet base services will use TCPv4 TCPv6 SCTP...
 AF_UNIX can not use as inter-nodes communication.

You can send file descriptors (the actual file descriptors
themselves, not their contents) to another process over a socket.

A nearly ten-year-old example is attached (i.e. this isn't new or
magical or specific to Linux).

/* sendfd.c - v. crude example of passing fds */

/*
 * I tested this on HPUX10 using gcc -D_XOPEN_SOURCE_EXTENDED sendfd.c -lxnet
 *
 * Expected output is hello world
 */

#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <sys/un.h>

/* Note: msg_control may be msg_accrights, etc. */

int sendfd(int conn, int fd)
{
	size_t len = CMSG_SPACE(sizeof(int));
	struct cmsghdr *hdr = (struct cmsghdr *)calloc(1, len);
	struct msghdr msg;
	struct iovec iov;
	char dummy = 0;
	int rc;

	if (hdr == NULL)
		return -1;

	hdr->cmsg_len = CMSG_LEN(sizeof(int));
	hdr->cmsg_level = SOL_SOCKET;
	hdr->cmsg_type = SCM_RIGHTS;
	*(int *)CMSG_DATA(hdr) = fd;

	/* some systems (Linux among them) only pass the control data
	 * along with at least one byte of ordinary data */
	iov.iov_base = &dummy;
	iov.iov_len = 1;

	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = hdr;
	msg.msg_controllen = len;

	rc = sendmsg(conn, &msg, 0);
	free(hdr);
	return rc < 0 ? -1 : 0;
}

int recvfd(int conn)
{
	size_t len = CMSG_SPACE(sizeof(int));
	struct cmsghdr *hdr = (struct cmsghdr *)calloc(1, len);
	struct msghdr msg;
	struct iovec iov;
	char dummy;
	int rc;

	if (hdr == NULL)
		return -1;

	/* room for the dummy data byte sent by sendfd() */
	iov.iov_base = &dummy;
	iov.iov_len = 1;

	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = hdr;
	msg.msg_controllen = len;

	rc = recvmsg(conn, &msg, 0);

	if (rc >= 0 &&
	    msg.msg_controllen >= CMSG_LEN(sizeof(int)) &&
	    hdr->cmsg_len == CMSG_LEN(sizeof(int)) &&
	    hdr->cmsg_level == SOL_SOCKET &&
	    hdr->cmsg_type == SCM_RIGHTS)
	{
		int fd = *(int *)CMSG_DATA(hdr);
		free(hdr);
		return fd;
	}

	free(hdr);
	return -1;
}

int main()
{
	int fds[2];
	int fd = -1;
	int rc = socketpair(AF_UNIX, SOCK_STREAM, 0, fds);
	pid_t pid;

	if (rc)
	{
		perror("socketpair");
		return 1;
	}

	/* punt */
	system("echo hello world > testfile");

	switch (pid = fork())
	{
	case 0:
		// open this in the child proc
		fd = open("testfile", O_RDONLY);
		if (fd < 0)
			perror("open(testfile)");
		rc = sendfd(fds[1], fd);
		if (rc)
			perror("sendfd");
		_exit(0);
	case -1:
		perror("fork");
		return 1;
	default:
		waitpid(pid, NULL, 0);
	}

	/* the descriptor stays queued on the socket after the child exits */
	fd = recvfd(fds[0]);

	if (fd < 0)
	{
		perror("recvfd");
		return 1;
	}

	close(fds[0]);
	close(fds[1]);

	/* make the received file our stdin and let cat print it */
	dup2(fd, 0);
	close(fd);
	execl("/bin/cat", "cat", (char *)NULL);
	perror("execl");
	return 1;
}


Re: [PATCH x86_64] Live Patching Function on 2.6.11.7

2005-04-20 Thread Miquel van Smoorenburg
In article [EMAIL PROTECTED],
Takashi Ikebe  [EMAIL PROTECTED] wrote:
Chris Wedgwood wrote:
 On Wed, Apr 20, 2005 at 04:35:07PM +0900, Takashi Ikebe wrote: 
 
To takeover the application status, connection type
communications(SOCK_STREAM) are need to be disconnected by close().
Same network port is not allowed to bind by multiple processes
 
 
 AF_UNIX socket with SCM_RIGHTS
 
hmm.. most internet base services will use TCPv4 TCPv6 SCTP...
AF_UNIX can not use as inter-nodes communication.

No, Chris means filedescriptor passing.

You can pass any existing open filedescriptor to another process
using an AF_UNIX socket.

For example, the existing running process creates a UNIX socket in
/var/run/mysocket that the new process can connect() to. The
processes can then not only exchange all kinds of information,
the old process can even send open filedescriptors over to
the new process without closing/re-opening.

See man 7 unix, ANCILLARY MESSAGES - SCM_RIGHTS

ANCILLARY MESSAGES
   Ancillary data is sent and received using  sendmsg(2)  and  recvmsg(2).
   For  historical  reasons  the  ancillary message types listed below are
   specified with a SOL_SOCKET type even though they are PF_UNIX specific.
   To  send  them  set  the  cmsg_level  field  of  the  struct cmsghdr to
   SOL_SOCKET and the cmsg_type field to the type.  For  more  information
   see cmsg(3).


   SCM_RIGHTS
  Send or receive a set of open file descriptors from another pro-
  cess.  The data portion contains an integer array  of  the  file
  descriptors.   The passed file descriptors behave as though they
  have been created with dup(2).

Mike.



Re: [PATCH x86_64] Live Patching Function on 2.6.11.7

2005-04-20 Thread Takashi Ikebe
Chris Wedgwood wrote:
 On Wed, Apr 20, 2005 at 04:57:31PM +0900, Takashi Ikebe wrote:

  hmm.. most internet base services will use TCPv4 TCPv6 SCTP...
  AF_UNIX can not use as inter-nodes communication.

 You can send file descriptors (the actually file descriptors
 themselves, not their contents) to another process over a socket.

 A nearly ten-year old example is attached (ie. this isn't new or
 magical or specific to Linux).

 int main()
 {
 int fds[2];
 int fd = -1;
 int rc = socketpair(AF_UNIX, SOCK_STREAM, 0, fds);

Hmm, interesting enough.
But please see the man page:

NOTES
   On Linux, the only supported domain for this call is AF_UNIX (or
   synonymously, AF_LOCAL).  (Most implementations have the same
   restriction.)

Only for AF_UNIX..


Well, as many have said, live patching is a very historical and
authoritative function, especially for carriers and telecom vendors.
If Linux wants to be adopted in the mission-critical world, this function is
essential.

-- 
Takashi Ikebe
NTT Network Service Systems Laboratories
9-11, Midori-Cho 3-Chome Musashino-Shi,
Tokyo 180-8585 Japan
Tel : +81 422 59 4246, Fax : +81 422 60 4012
e-mail : [EMAIL PROTECTED]

Re: [2.6 patch] drivers/net/sk98lin/: possible cleanups

2005-04-20 Thread Christoph Hellwig
On Wed, Apr 20, 2005 at 04:15:26AM +0200, Adrian Bunk wrote:
 This patch contains the following possible cleanups:
 - make needlessly global functions static
 - remove unused code

Not sure it's worth doing much on this, as the driver is being
obsoleted by the skge driver.


Re: enforcing DB immutability

2005-04-20 Thread linux
[A discussion on the git list about how to provide a hardlinked file
that *cannot* be modified by an editor, but must be replaced by
a new copy.]

[EMAIL PROTECTED] wrote all of:
 perhaps having a new 'immutable hardlink' feature in the Linux VFS 
 would help? I.e. a hardlink that can only be readonly followed, and 
 can be removed, but cannot be chmod-ed to a writeable hardlink. That i 
 think would be a large enough barrier for editors/build-tools not to 
 play the tricks they already do that makes 'readonly' files virtually 
 meaningless.
 
 immutable hardlinks have the following advantage: a hardlink by design 
 hides the information where the link comes from. So even if an editor 
 wanted to play stupid games and override the immutability - it doesnt 
 know where the DB object is. (sure, it could find it if it wants to, 
 but that needs real messing around - editors wont do _that_)

 so the only sensible thing the editor/tool can do when it wants to 
 change the file is precisely what we want: it will copy the hardlinked 
 files's contents to a new file, and will replace the old file with the 
 new file - a copy on write. No accidental corruption of the DB's 
 contents.

This is not a horrible idea, but it touches on another sore point I've
worried about for a while.

The obvious way to do the above *without* changing anything is just to
remove all write permission to the file.  But because I'm the owner, some
piece of software running with my permissions can just decide to change
the permissions back and modify the file anyway.  Good old 7th edition
let you give files away, which could have addressed that (chmod a-w; chown
phantom_user), but BSD took that ability away to make accounting work.
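
A minimal sketch of the point above, assuming a POSIX system: removing
write permission does not protect a file from code running as its owner,
because the owner may simply chmod it back. The helper names can_write()
and owner_override() are made up for illustration.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns 1 if path can be opened for writing right now, 0 otherwise. */
static int can_write(const char *path)
{
	int fd = open(path, O_WRONLY);

	if (fd < 0)
		return 0;
	close(fd);
	return 1;
}

/*
 * What any program running as the file's owner can do to a file that
 * was made read-only with chmod a-w: restore the write bit and carry on.
 * Returns 0 if the file ends up writable again, -1 otherwise.
 */
static int owner_override(const char *path)
{
	if (chmod(path, 0644) < 0)
		return -1;	/* not the owner (and not root) */
	return can_write(path) ? 0 : -1;
}
```

Calling owner_override() on a file the caller owns succeeds even after
chmod a-w, which is why the permission bits alone are not a barrier here.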

The upshot is that, while separate users keep malware from harming the
*system*, if I run a piece of malware, it can blow away every file I
own and make me unhappy.  When (notice I'm not saying if) commercial
spyware for Linux becomes common, it can also read every file I own.

Unless I have root access, Linux is no safer *for me* than Redmondware!

Since I *do* have root access, I often set up sandbox users and try
commercial binaries in that environment, but it's a pain and laziness
often wins.  I want a feature that I can wrap in a script, so that I
can run a commercial binary in a nicely restricted environment.

Or maybe I even want to set up a personal root level, and run
my normal interactive shells in a slightly restricted environment
(within which I could make a more-restricted world to run untrusted
binaries).  Then I could solve the immutable DB issue by having a
setuid binary that would make checked-in files unwriteable at my
normal permission level.

Obviously, a fundamental change to the Unix permissions model won't
be available to solve short-term problems, but I thought I'd raise
the issue to get people thinking about longer-term solutions.


Re: [PATCH x86_64] Live Patching Function on 2.6.11.7

2005-04-20 Thread Chris Wedgwood
On Wed, Apr 20, 2005 at 05:45:00PM +0900, Takashi Ikebe wrote:

 Only for AF_UNIX..

I'm sure that means AF_UNIX is restricted for the socket you use to
pass the file descriptors, not a restriction on the file descriptors
themselves.  I don't see why the kernel would care what the
descriptors are.
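
A minimal sketch of that distinction, assuming Linux or another system
with the cmsg(3) macros: the channel is an AF_UNIX socket, while the
descriptor being handed over is an ordinary TCP listening socket. The
helper names (send_fd, recv_fd, make_tcp_listener, tcp_port) are made up
for illustration.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one descriptor. */
union fd_ctl {
	char buf[CMSG_SPACE(sizeof(int))];
	struct cmsghdr align;
};

/* Fill in a msghdr carrying one data byte plus the control buffer. */
static void fd_msg_init(struct msghdr *msg, struct iovec *iov, char *byte,
			union fd_ctl *ctl)
{
	memset(ctl, 0, sizeof(*ctl));
	memset(msg, 0, sizeof(*msg));
	iov->iov_base = byte;
	iov->iov_len = 1;
	msg->msg_iov = iov;
	msg->msg_iovlen = 1;
	msg->msg_control = ctl->buf;
	msg->msg_controllen = sizeof(ctl->buf);
}

/* Send descriptor fd over the AF_UNIX socket chan; 0 on success. */
static int send_fd(int chan, int fd)
{
	struct msghdr msg;
	struct iovec iov;
	union fd_ctl ctl;
	char byte = 0;
	struct cmsghdr *cm;

	fd_msg_init(&msg, &iov, &byte, &ctl);
	cm = CMSG_FIRSTHDR(&msg);
	cm->cmsg_level = SOL_SOCKET;
	cm->cmsg_type = SCM_RIGHTS;
	cm->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cm), &fd, sizeof(int));
	return sendmsg(chan, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor from chan; returns it, or -1 on failure. */
static int recv_fd(int chan)
{
	struct msghdr msg;
	struct iovec iov;
	union fd_ctl ctl;
	char byte;
	struct cmsghdr *cm;
	int fd;

	fd_msg_init(&msg, &iov, &byte, &ctl);
	if (recvmsg(chan, &msg, 0) != 1)
		return -1;
	cm = CMSG_FIRSTHDR(&msg);
	if (!cm || cm->cmsg_level != SOL_SOCKET ||
	    cm->cmsg_type != SCM_RIGHTS ||
	    cm->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;
	memcpy(&fd, CMSG_DATA(cm), sizeof(int));
	return fd;
}

/* TCP listener on 127.0.0.1 with a kernel-chosen port, or -1. */
static int make_tcp_listener(void)
{
	struct sockaddr_in sa;
	int s = socket(AF_INET, SOCK_STREAM, 0);

	if (s < 0)
		return -1;
	memset(&sa, 0, sizeof(sa));
	sa.sin_family = AF_INET;
	sa.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	if (bind(s, (struct sockaddr *)&sa, sizeof(sa)) < 0 ||
	    listen(s, 8) < 0) {
		close(s);
		return -1;
	}
	return s;
}

/* Local TCP port of socket s in host byte order, or -1. */
static int tcp_port(int s)
{
	struct sockaddr_in sa;
	socklen_t len = sizeof(sa);

	if (getsockname(s, (struct sockaddr *)&sa, &len) < 0)
		return -1;
	return ntohs(sa.sin_port);
}
```

The descriptor returned by recv_fd() refers to the same listening socket
as the one passed to send_fd(), so a replacement process could keep
calling accept() on it without the port ever being closed and re-bound.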

 Well, as many said Live patching is very historical  authoritative
 function on especially carrier, telecom vendor.

Linux doesn't have it now, so it's not historical in the Linux space.

 If linux want to be adopted on mission critical world, this function
 is esseintial.

But Linux is used in mission critical places and we don't have that
feature.


Module that loads new Interrupt Descriptor Table

2005-04-20 Thread Zvi Rackover
Hello all,
 
  I would like to write a program that monitors various system
parameters in real time. One of these is counting the number of
interrupts. I would like to implement my own interrupt handler so that
each handler counts the number of interrupts of its respective type.
 I have read various guides and tutorials but none of them have
discussed this matter.
 The module is intended to be installed on an Intel architecture machine.
 Any tips or source code would be graciously accepted.
 
 Regards,
  Zvi


Re: [Alsa-devel] [PATCH] sound: trivial warning fix for emu10k1

2005-04-20 Thread Takashi Iwai
At Tue, 19 Apr 2005 23:45:16 +0200 (CEST),
Jesper Juhl wrote:
 
 When building with gcc -W sound/pci/emu10k1/emupcm.c produces this little 
 warning in 2.6.12-rc2-mm3 : 
   sound/pci/emu10k1/emupcm.c:265: warning: `inline' is not at beginning of 
 declaration
 No big deal, but trivial to fix.
 
 
 Signed-off-by: Jesper Juhl [EMAIL PROTECTED]

Thanks, applied to ALSA tree now.


Takashi


Re: [PATCH scsi-misc-2.6 01/05] scsi: make blk layer set REQ_SOFTBARRIER when a request is dispatched

2005-04-20 Thread Tejun Heo
Nick Piggin wrote:
 On Wed, 2005-04-20 at 16:40 +0900, Tejun Heo wrote:
 
 Hello, Jens.

On Wed, Apr 20, 2005 at 08:30:10AM +0200, Jens Axboe wrote:

Do it on requeue, please - not on the initial spotting of the request.

 This is the reworked version of the patch.  It sets REQ_SOFTBARRIER
in two places - in elv_next_request() on BLKPREP_DEFER and in
blk_requeue_request().

 Other patches apply cleanly with this patch or the original one and
the end result is the same, so take your pick.  :-)

 
 
 I'm not sure that you need *either* one.
 
 As far as I'm aware, REQ_SOFTBARRIER is used when feeding requests
 into the top of the block layer, and is used to guarantee the device
 driver gets the requests in a specific ordering.
 
 When dealing with the requests at the other end (ie.
 elevator_next_req_fn, blk_requeue_request), then ordering does not
 change.
 
 That is - if you call elevator_next_req_fn and don't dequeue the
 request, then that's the same request you'll get next time.
 
 And blk_requeue_request will push the request back onto the end of
 the queue in a LIFO manner.
 
 So I think adding barriers, apart from not doing anything, confuses
 the issue because it suggests there *could* be reordering without
 them.
 
 Or am I completely wrong? It's been a while since I last got into
 the code.

 Well, yeah, all schedulers have dispatch queue (noop has only the
dispatch queue) and use them to defer/requeue, so no reordering will
happen, but I'm not sure they are required to be like this or just
happen to be implemented so.

 Hmm, well, it seems that setting REQ_SOFTBARRIER on requeue path isn't
necessary as we have INSERT_FRONT policy on requeue, and if
elv_next_req_fn() is required to return the same request when the
request isn't dequeued, you're right and we don't need this patch at
all.  We are guaranteed that all requeued requests are served in LIFO
manner.

 BTW, the same un-dequeued request rule is sort of already broken as
INSERT_FRONT request passes a returned but un-dequeued request, but,
then again, we need this behavior as we have to favor fully-prepped
requests over partially-prepped one.

-- 
tejun



Re: [PATCH scsi-misc-2.6 01/05] scsi: make blk layer set REQ_SOFTBARRIER when a request is dispatched

2005-04-20 Thread Jens Axboe
On Wed, Apr 20 2005, Tejun Heo wrote:
 Nick Piggin wrote:
  On Wed, 2005-04-20 at 16:40 +0900, Tejun Heo wrote:
  
  Hello, Jens.
 
 On Wed, Apr 20, 2005 at 08:30:10AM +0200, Jens Axboe wrote:
 
 Do it on requeue, please - not on the initial spotting of the request.
 
  This is the reworked version of the patch.  It sets REQ_SOFTBARRIER
 in two places - in elv_next_request() on BLKPREP_DEFER and in
 blk_requeue_request().
 
  Other patches apply cleanly with this patch or the original one and
 the end result is the same, so take your pick.  :-)
 
  
  
  I'm not sure that you need *either* one.
  
  As far as I'm aware, REQ_SOFTBARRIER is used when feeding requests
  into the top of the block layer, and is used to guarantee the device
  driver gets the requests in a specific ordering.
  
  When dealing with the requests at the other end (ie.
  elevator_next_req_fn, blk_requeue_request), then ordering does not
  change.
  
  That is - if you call elevator_next_req_fn and don't dequeue the
  request, then that's the same request you'll get next time.
  
  And blk_requeue_request will push the request back onto the end of
  the queue in a LIFO manner.
  
  So I think adding barriers, apart from not doing anything, confuses
  the issue because it suggests there *could* be reordering without
  them.
  
  Or am I completely wrong? It's been a while since I last got into
  the code.
 
  Well, yeah, all schedulers have dispatch queue (noop has only the
 dispatch queue) and use them to defer/requeue, so no reordering will
 happen, but I'm not sure they are required to be like this or just
 happen to be implemented so.

Precisely, I feel much better making sure SOFTBARRIER is set so that we
_know_ that a scheduler following the outlined rules will do the right
thing.

  Hmm, well, it seems that setting REQ_SOFTBARRIER on requeue path isn't
 necessary as we have INSERT_FRONT policy on requeue, and if
 elv_next_req_fn() is required to return the same request when the
 request isn't dequeued, you're right and we don't need this patch at
 all.  We are guaranteed that all requeued requests are served in LIFO
 manner.

After a requeue, it is not required to return the same request again.

  BTW, the same un-dequeued request rule is sort of already broken as
 INSERT_FRONT request passes a returned but un-dequeued request, but,
 then again, we need this behavior as we have to favor fully-prepped
 requests over partially-prepped one.

INSERT_FRONT really should skip requests with REQ_STARTED on the
dispatch list to be fully safe.
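
A toy userspace model of that rule, not the block layer's code: front
insertion walks past any request the driver has already been shown, so a
peeked but un-dequeued request keeps its place at the head. All names
(toy_req, toy_queue, TOY_STARTED and the functions) are invented for this
sketch; TOY_STARTED merely stands in for REQ_STARTED.

```c
#include <stddef.h>

#define TOY_STARTED 0x1u	/* stands in for REQ_STARTED */

struct toy_req {
	int id;
	unsigned flags;
	struct toy_req *next;
};

struct toy_queue {
	struct toy_req *head;	/* next request handed to the driver */
};

/* Front insertion that never passes a request the driver has seen. */
static void toy_insert_front(struct toy_queue *q, struct toy_req *rq)
{
	struct toy_req **pos = &q->head;

	/* skip the leading run of started requests */
	while (*pos && ((*pos)->flags & TOY_STARTED))
		pos = &(*pos)->next;
	rq->next = *pos;
	*pos = rq;
}

/* Show the head to the driver without dequeuing it; marks it started. */
static struct toy_req *toy_peek(struct toy_queue *q)
{
	if (q->head)
		q->head->flags |= TOY_STARTED;
	return q->head;
}
```

With this rule, two successive toy_peek() calls return the same request
even if a front insertion happens in between, while requests that have
not been started are still served in LIFO order among themselves.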

-- 
Jens Axboe



Re: [PATCH scsi-misc-2.6 01/05] scsi: make blk layer set REQ_SOFTBARRIER when a request is dispatched

2005-04-20 Thread Nick Piggin
Jens Axboe wrote:
 On Wed, Apr 20 2005, Tejun Heo wrote:

  Well, yeah, all schedulers have dispatch queue (noop has only the
  dispatch queue) and use them to defer/requeue, so no reordering will
  happen, but I'm not sure they are required to be like this or just
  happen to be implemented so.

 Precisely, I feel much better making sure SOFTBARRIER is set so that we
 _know_ that a scheduler following the outlined rules will do the right
 thing.

Well yeah, at the moment I am just following implementations as
defining the standard.

  Hmm, well, it seems that setting REQ_SOFTBARRIER on requeue path isn't
  necessary as we have INSERT_FRONT policy on requeue, and if
  elv_next_req_fn() is required to return the same request when the
  request isn't dequeued, you're right and we don't need this patch at
  all.  We are guaranteed that all requeued requests are served in LIFO
  manner.

 After a requeue, it is not required to return the same request again.

Well I guess not.

Would there be any benefit to reordering after a requeue?

  BTW, the same un-dequeued request rule is sort of already broken as
  INSERT_FRONT request passes a returned but un-dequeued request, but,
  then again, we need this behavior as we have to favor fully-prepped
  requests over partially-prepped one.

 INSERT_FRONT really should skip requests with REQ_STARTED on the
 dispatch list to be fully safe.

I guess this could be one use of 'reordering' after a requeue.

I'm not sure this would need a REQ_SOFTBARRIER either though, really.

Your basic io scheduler framework - ie. a FIFO dispatch list which
can have requests requeued on the front models pretty well what the
block layer needs of the elevator.

Considering all requeues and all elv_next_request but not dequeued
requests would have this REQ_SOFTBARRIER bit set, any other model
that theoretically would allow reordering would degenerate to this
dispatch list behaviour, right?

In which case, the dispatch list is effectively basically the most
efficient way to do it? In which case should we just explicitly
document that in the API?
--
SUSE Labs, Novell Inc.


Re: Module that loads new Interrupt Descriptor Table

2005-04-20 Thread Arjan van de Ven
On Wed, 2005-04-20 at 11:58 +0300, Zvi Rackover wrote:
 Hello all,
  
   I would like to write a program that monitors various system
 parameters in real time. One of these is counting the number of
 interrupts. I would like to implement my own interrupt handler so that
 each handler counts the number of interrupt of its respective type.

ehm
the kernel already keeps this kind of data, see /proc/interrupts

why would you want to collect it *again* ?
(or do you want to generally hook interrupts like some other people want
to hook syscalls?)
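
A rough sketch of reading those existing counters from user space. It
assumes the usual /proc/interrupts layout (a label, a colon, then one
decimal count per CPU, then a free-form description); the function names
are made up for illustration.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse one line of /proc/interrupts. On success stores the label
 * (the text before the colon) and the sum of the per-CPU counts, and
 * returns 0. Returns -1 for lines without a colon, such as the
 * "CPU0 CPU1 ..." header.
 */
static int parse_irq_line(const char *line, char *label, size_t label_sz,
			  unsigned long long *total)
{
	const char *colon = strchr(line, ':');
	const char *p = line;
	size_t n;

	if (!colon)
		return -1;
	while (*p == ' ' || *p == '\t')
		p++;
	n = (size_t)(colon - p);
	if (n == 0 || n >= label_sz)
		return -1;
	memcpy(label, p, n);
	label[n] = '\0';

	/* sum the run of purely numeric fields that follows the colon */
	*total = 0;
	p = colon + 1;
	for (;;) {
		char *end;
		unsigned long long v;

		while (*p == ' ' || *p == '\t')
			p++;
		if (!isdigit((unsigned char)*p))
			break;
		v = strtoull(p, &end, 10);
		if (*end != ' ' && *end != '\t' && *end != '\n' && *end != '\0')
			break;	/* e.g. "2-edge": part of the description */
		*total += v;
		p = end;
	}
	return 0;
}

/* Print "label<TAB>total" for every counter line read from in. */
static int dump_irq_totals(FILE *in, FILE *out)
{
	char line[4096], label[32];
	unsigned long long total;
	int n = 0;

	while (fgets(line, sizeof(line), in)) {
		if (parse_irq_line(line, label, sizeof(label), &total) == 0) {
			fprintf(out, "%s\t%llu\n", label, total);
			n++;
		}
	}
	return n;
}
```

A caller would open /proc/interrupts with fopen() and pass the stream to
dump_irq_totals(); sampling it twice and subtracting gives interrupt
rates without any kernel changes.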




Re: VST and Sched Load Balance

2005-04-20 Thread Srivatsa Vaddagiri
On Tue, Apr 19, 2005 at 09:07:49AM -0700, Nish Aravamudan wrote:
  +   if (jiffies - sd1->last_balance >= interval) {


 Sorry for the late reply, but shouldn't this jiffies comparison be
 done with time_after() or time_before()?

I think it is not needed. The check should handle the overflow case as well.

This probably assumes that you don't sleep longer than (2^32 - 1) jiffies
(which is ~1193 hrs at HZ=1000). The current VST implementation lets us
sleep far less than that limit (~896 ms), since it uses a 32-bit number for
sampling the TSC. When it is upgraded to use a 64-bit number, we may have to
ensure that this limit (1193 hrs) is not exceeded.
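
A small userspace sketch of why the plain subtraction survives a counter
wrap, assuming a 32-bit tick counter. The type jiffies_t and the helper
names are stand-ins for illustration, not kernel definitions; after()
mirrors the idea behind the kernel's time_after() macro.

```c
#include <stdint.h>

/* Userspace stand-in for a 32-bit jiffies counter. */
typedef uint32_t jiffies_t;

/*
 * Ticks elapsed from 'then' to 'now'. Unsigned subtraction is done
 * modulo 2^32, so the result is right across one wrap of the counter
 * as long as fewer than 2^32 ticks have really passed.
 */
static jiffies_t ticks_since(jiffies_t now, jiffies_t then)
{
	return now - then;
}

/* The form of the check in the patch: has 'interval' elapsed? */
static int balance_due(jiffies_t now, jiffies_t last, jiffies_t interval)
{
	return ticks_since(now, last) >= interval;
}

/*
 * Ordering test in the style of time_after(a, b): true if a is later
 * than b, valid while the two are less than 2^31 ticks apart.
 */
static int after(jiffies_t a, jiffies_t b)
{
	return (int32_t)(b - a) < 0;
}
```

The elapsed-time form tolerates gaps of up to 2^32 - 1 ticks, whereas the
signed comparison only tolerates 2^31 - 1, which is consistent with the
limit quoted above.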

-- 


Thanks and Regards,
Srivatsa Vaddagiri,
Linux Technology Center,
IBM Software Labs,
Bangalore, INDIA - 560017


Re: Module that loads new Interrupt Descriptor Table

2005-04-20 Thread Zvi Rackover
On 4/20/05, Arjan van de Ven [EMAIL PROTECTED] wrote:
 On Wed, 2005-04-20 at 11:58 +0300, Zvi Rackover wrote:
  Hello all,
 
I would like to write a program that monitors various system
  parameters in real time. One of these is counting the number of
  interrupts. I would like to implement my own interrupt handler so that
  each handler counts the number of interrupt of its respective type.
 
 ehm
 the kernel already keeps this kind of data, see /proc/interrupts
 
 why would you want to collect it *again* ?
 (or do you want to generally hook interrupts like some other people want
 to hook syscalls?)
 
 
You have a good point - I should have given a better description
of my problem.
This is an educational project I'm working on and I need to actually
see the contents of the Interrupt Descriptor Table so I can load it
into user space and have it displayed in a nice GUI.
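
For the display side, a minimal sketch of decoding one 8-byte gate
descriptor as laid out on 32-bit x86, assuming the raw bytes have already
been copied out of the table by some other means (how to obtain them is
not covered here). The struct and function names are invented for the
sketch.

```c
#include <stdint.h>

/* Decoded view of one 32-bit protected-mode IDT gate descriptor. */
struct idt_gate {
	uint32_t offset;	/* handler entry point */
	uint16_t selector;	/* code segment selector */
	uint8_t type;		/* 0x5 task, 0xE interrupt, 0xF trap gate */
	uint8_t dpl;		/* privilege level needed to invoke it */
	uint8_t present;	/* 1 if the entry is valid */
};

/*
 * Layout of the 8 raw bytes (little-endian):
 *   0-1 offset bits 15..0    2-3 segment selector
 *   4   reserved             5   P | DPL | 0 | type
 *   6-7 offset bits 31..16
 */
static struct idt_gate decode_idt_gate(const uint8_t raw[8])
{
	struct idt_gate g;

	g.offset = (uint32_t)raw[0] | ((uint32_t)raw[1] << 8) |
		   ((uint32_t)raw[6] << 16) | ((uint32_t)raw[7] << 24);
	g.selector = (uint16_t)(raw[2] | (raw[3] << 8));
	g.type = raw[5] & 0x0f;
	g.dpl = (raw[5] >> 5) & 0x3;
	g.present = raw[5] >> 7;
	return g;
}
```

A GUI could call decode_idt_gate() on each of the 256 entries and show
the handler address, selector and gate type per vector.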


Re: [PATCH scsi-misc-2.6 01/05] scsi: make blk layer set REQ_SOFTBARRIER when a request is dispatched

2005-04-20 Thread Jens Axboe
On Wed, Apr 20 2005, Nick Piggin wrote:
 Hmm, well, it seems that setting REQ_SOFTBARRIER on requeue path isn't
 necessary as we have INSERT_FRONT policy on requeue, and if
 elv_next_req_fn() is required to return the same request when the
 request isn't dequeued, you're right and we don't need this patch at
 all.  We are guaranteed that all requeued requests are served in LIFO
 manner.
 
 
 After a requeue, it is not required to return the same request again.
 
 
 Well I guess not.
 
 Would there be any benefit to reordering after a requeue?

Logic dictates that requeues should maintain ordering, since we don't
want to reorder around the original io scheduler decisions. But you
could be requeuing more than one request.

 BTW, the same un-dequeued request rule is sort of already broken as
 INSERT_FRONT request passes a returned but un-dequeued request, but,
 then again, we need this behavior as we have to favor fully-prepped
 requests over partially-prepped one.
 
 
 INSERT_FRONT really should skip requests with REQ_STARTED on the
 dispatch list to be fully safe.
 
 
 I guess this could be one use of 'reordering' after a requeue.

Yeah, or perhaps the io scheduler might determine that a request has
higher prio than a requeued one.  I'm not sure what semantics to place
on soft-barrier, I've always taken it to mean 'maintain ordering if
convenient' where the hard-barrier must be followed.

 I'm not sure this would need a REQ_SOFTBARRIER either though, really.
 
 Your basic io scheduler framework - ie. a FIFO dispatch list which
 can have requests requeued on the front models pretty well what the
 block layer needs of the elevator.
 
 Considering all requeues and all elv_next_request but not dequeued
 requests would have this REQ_SOFTBARRIER bit set, any other model
 that theoretically would allow reordering would degenerate to this
 dispatch list behaviour, right?

Not sure I follow this - I don't want REQ_SOFTBARRIER set automatically
on elv_next_request() return, it should only happen on requeues.
REQ_STARTED implies that you should not pass this request, since the io
scheduler is required to return this request again until dequeue is
called. But the result is the same, correct.

 In which case, the dispatch list is effectively basically the most
 efficient way to do it? In which case should we just explicitly
 document that in the API?

You lost me, please detail exactly what behaviour you want documented :)

-- 
Jens Axboe



Re: [PATCH scsi-misc-2.6 01/05] scsi: make blk layer set REQ_SOFTBARRIER when a request is dispatched

2005-04-20 Thread Nick Piggin
Jens Axboe wrote:
 On Wed, Apr 20 2005, Nick Piggin wrote:

  I guess this could be one use of 'reordering' after a requeue.

 Yeah, or perhaps the io scheduler might determine that a request has
 higher prio than a requeued one.  I'm not sure what semantics to place

I guess this is possible. It is often only a single request
that is on the dispatch list though, so I don't know if it
would make sense to reorder it by priority again.

 on soft-barrier, I've always taken it to mean 'maintain ordering if
 convenient' where the hard-barrier must be followed.

I've thought it was SOFTBARRIER ensures the device driver (and
hardware?) sees the request in this order, and HARDBARRIER ensures
it reaches stable storage in this order.

Not exactly sure why you would want a softbarrier and not a
hardbarrier. Maybe for special commands.

  I'm not sure this would need a REQ_SOFTBARRIER either though, really.

  Your basic io scheduler framework - ie. a FIFO dispatch list which
  can have requests requeued on the front models pretty well what the
  block layer needs of the elevator.

  Considering all requeues and all elv_next_request but not dequeued
  requests would have this REQ_SOFTBARRIER bit set, any other model
  that theoretically would allow reordering would degenerate to this
  dispatch list behaviour, right?

 Not sure I follow this - I don't want REQ_SOFTBARRIER set automatically
 on elv_next_request() return, it should only happen on requeues.
 REQ_STARTED implies that you should not pass this request, since the io
 scheduler is required to return this request again until dequeue is
 called. But the result is the same, correct.

OK - but I'm wondering would it ever make sense to do it any
other way? I would have thought no, in which case we can document
that requests seen by 'elv_next_request', and those requeued back
into the device will not be reordered, and so Tejun does not need
to set REQ_SOFTBARRIER.

But I'm not so sure now... it isn't really that big a deal ;)
So whatever you're happy with is fine. Sorry for the noise.
--
SUSE Labs, Novell Inc.


Re: [PATCH scsi-misc-2.6 01/05] scsi: make blk layer set REQ_SOFTBARRIER when a request is dispatched

2005-04-20 Thread Jens Axboe
On Wed, Apr 20 2005, Nick Piggin wrote:
 Jens Axboe wrote:
 On Wed, Apr 20 2005, Nick Piggin wrote:
 
 
 I guess this could be one use of 'reordering' after a requeue.
 
 
 Yeah, or perhaps the io scheduler might determine that a request has
 higher prio than a requeued one.  I'm not sure what semantics to place
 
 I guess this is possible. It is often only a single request
 that is on the dispatch list though, so I don't know if it
 would make sense to reorder it by priority again.

Depends entirely on the io scheduler. CFQ may put several on the
dispatch list.

 on soft-barrier, I've always taken it to mean 'maintain ordering if
 convenient' where the hard-barrier must be followed.
 
 
 I've thought it was SOFTBARRIER ensures the device driver (and
 hardware?) sees the request in this order, and HARDBARRIER ensures
 it reaches stable storage in this order.
 
 Not exactly sure why you would want a softbarrier and not a
 hardbarrier. Maybe for special commands.

It is the cleaner interpretation. CFQ marks requests as requeued
internally and gives preference to them for reissue, but it may return
another first (actually, I think it even checks for ->requeued on
dispatch sort, so it won't right now).

 I'm not sure this would need a REQ_SOFTBARRIER either though, really.
 
 Your basic io scheduler framework - ie. a FIFO dispatch list which
 can have requests requeued on the front models pretty well what the
 block layer needs of the elevator.
 
 Considering all requeues and all elv_next_request but not dequeued
 requests would have this REQ_SOFTBARRIER bit set, any other model
 that theoretically would allow reordering would degenerate to this
 dispatch list behaviour, right?
 
 
 Not sure I follow this - I don't want REQ_SOFTBARRIER set automatically
 on elv_next_request() return, it should only happen on requeues.
 REQ_STARTED implies that you should not pass this request, since the io
 scheduler is required to return this request again until dequeue is
 called. But the result is the same, correct.
 
 
 OK - but I'm wondering would it ever make sense to do it any
 other way? I would have thought no, in which case we can document
 that requests seen by 'elv_next_request', and those requeued back
 into the device will not be reordered, and so Tejun does not need
 to set REQ_SOFTBARRIER.
 
 But I'm not so sure now... it isn't really that big a deal ;)
 So whatever you're happy with is fine. Sorry for the noise.

It's not noise, it would be nice to have this entirely documented so
that there isn't any confusion on what is guaranteed vs what currently
happens in most places.

But I don't want to document that they are never reordered. For requeues
it makes sense to maintain ordering in most cases, but it also may make
sense to reorder for higher priority io. If the driver does _not_ want a
particular request reordered for data integrity reasons, then that needs
to be explicitly specified.

-- 
Jens Axboe



Mercurial v0.1 - a minimal scalable distributed SCM

2005-04-20 Thread Matt Mackall
http://selenic.com/mercurial/

April 19, 2005

I've spent the past couple weeks working on a completely new
proof-of-concept SCM. The goals:

 - to initially be as simple (and thereby hackable) as possible
 - to be as scalable as possible
 - to be memory, disk, and bandwidth efficient
 - to be able to do clone/branch and pull/sync style development

It's still very early on, but I think I'm doing surprisingly well. Now
that I've got something that actually does some interesting things if
you poke at it right, I figure it's time to throw it out there.
Here's what I've got so far:

 - O(1) file revision storage and retrieval with efficient delta compression
 - efficient append-only layout for rsync and http range protocols
 - bare bones commit, checkout, stat, history
 - working clone/branch and pull/sync functionality
 - functional enough to be self-hosting[1]
 - all in less than 600 lines of Python

When I say pull/sync works, that means I can do:

 hg merge other-repo

and it will pull all changesets/deltas that are in other-repo that I
don't have, merge them into the changeset history graph, and do the
same for all files changed for those deltas. It will call out to
a user-specified merge tool like tkdiff for a proper 3-way merge with
the nearest common ancestor in the case of conflicts.

Right now, cloning/branching is simply a matter of cp -al or
rsync (mercurial knows how to break hardlinks if needed).

Some benchmarks from my laptop:

 - prepare for commit of Linux 2.6.10: ~1s
 - commit Linux 2.6.10: 27s
 - checkout Linux 2.6.10: 45s
 - full tree stat for added/changed/deleted files: 1s
 - local hardlink clone: 1.5s
 - empty merge between full trees: .1s
 - trivial 3-way merge with changes to Makefile: ~1s

Much thought has gone into what the best asymptotic performance can be
for the various things an SCM has to do and the core algorithms and
data structures here should scale relatively painlessly to arbitrary
numbers of changesets, files, and file revisions.

What remains to be done:

 - a halfway-usable command line tool
 - remote (network) repository support
 - diff generation
 - changelog entry editing
 - various manual interventions for merge
 - handle rename
 - handle rollback
 - all sorts of other error handling
 - all sorts of cleanup, packaging, documentation, testing...

Sample usage:

 export HGMERGE=tkmerge   # set a 3-way merge tool
 mkdir foo
 hg create                # create a repository in .hg/
 echo foo > Makefile
 hg add Makefile          # add a file to the current changeset
 hg commit                # commit files currently marked for add or delete
 hg history               # show all changesets
 hg index Makefile        # show change
 touch Makefile
 hg stat                  # find changed files
 cd ..; cp -al foo branch  # make a branch
 hg merge ../branch-foo    # sync changesets from a branch

[1] though the repository format is still in flux

-- 
Mathematics is the supreme nostalgia of our time.


Re: [PATCH][2.4.30] __attribute__ placement

2005-04-20 Thread Vinay K Nallamothu
Hi,

Realised few things are not ok with my earlier patch. The __attribute__
((packed)) problem exists only with typedefed structs. The correct
syntax is

typedef struct ... {  } __attribute__((packed)) newtype;

Please find the updated patch. Hope I got it right this time.
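
As a minimal sketch of the placement rule above (the struct and type
names are made up for illustration and are not taken from the patch),
the attribute sits between the closing brace and the typedef name. This
assumes GCC or a compatible compiler:

```c
#include <assert.h>	/* static_assert (C11) */
#include <stdint.h>

/* Hypothetical descriptor, for illustration only: natural alignment. */
typedef struct {
	uint8_t  id;
	uint32_t address;
} plain_desc_t;

/* Attribute placed after the closing brace, before the typedef name. */
typedef struct {
	uint8_t  id;
	uint32_t address;
} __attribute__((packed)) packed_desc_t;

/* Packed layout has no padding between the members: 1 + 4 bytes. */
static_assert(sizeof(packed_desc_t) == 5, "packed_desc_t should be 5 bytes");
static_assert(sizeof(packed_desc_t) <= sizeof(plain_desc_t),
	      "packing must not grow the struct");
```

If the attribute took effect, both compile-time checks pass; a build
failure on the first one would suggest the attribute was ignored.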

Thanks
Vinay

 arch/s390x/kernel/ptrace.c  |6 +++---
 drivers/net/gt96100eth.h|4 ++--
 drivers/s390/net/qeth.h |4 ++--
 drivers/s390/net/qeth_mpc.h |6 +++---
 4 files changed, 10 insertions(+), 10 deletions(-)

==
diff -urN linux-2.4.30/arch/s390x/kernel/ptrace.c 
linux-2.4.30-nvk/arch/s390x/kernel/ptrace.c
--- linux-2.4.30/arch/s390x/kernel/ptrace.c 2003-06-14 10:05:28.0 
+0530
+++ linux-2.4.30-nvk/arch/s390x/kernel/ptrace.c 2005-04-20 15:29:35.0 
+0530
@@ -146,14 +146,14 @@
 typedef struct
 {
__u32 cr[3];
-} per_cr_words32  __attribute__((packed));
+} __attribute__((packed)) per_cr_words32;
 
 typedef struct
 {
__u16  perc_atmid;  /* 0x096 */
__u32  address; /* 0x098 */
__u8   access_id;   /* 0x0a1 */
-} per_lowcore_words32  __attribute__((packed));
+} __attribute__((packed)) per_lowcore_words32;
 
 typedef struct
 {
@@ -177,7 +177,7 @@
union {
per_lowcore_words32 words;
} lowcore; 
-} per_struct32 __attribute__((packed));
+} __attribute__((packed)) per_struct32;
 
 struct user_regs_struct32
 {
diff -urN linux-2.4.30/drivers/net/gt96100eth.h 
linux-2.4.30-nvk/drivers/net/gt96100eth.h
--- linux-2.4.30/drivers/net/gt96100eth.h   2003-06-14 10:03:16.0 
+0530
+++ linux-2.4.30-nvk/drivers/net/gt96100eth.h   2005-04-20 15:34:14.0 
+0530
@@ -214,7 +214,7 @@
u32 cmdstat;
u32 next;
u32 buff_ptr;
-} gt96100_td_t __attribute__ ((packed));
+} __attribute__ ((packed)) gt96100_td_t;
 
 typedef struct {
 #ifdef DESC_BE
@@ -227,7 +227,7 @@
u32 cmdstat;
u32 next;
u32 buff_ptr;
-} gt96100_rd_t __attribute__ ((packed));
+} __attribute__ ((packed)) gt96100_rd_t;
 
 
 /* Values for the Tx command-status descriptor entry. */
diff -urN linux-2.4.30/drivers/s390/net/qeth.h 
linux-2.4.30-nvk/drivers/s390/net/qeth.h
--- linux-2.4.30/drivers/s390/net/qeth.h2004-12-04 17:45:26.0 
+0530
+++ linux-2.4.30-nvk/drivers/s390/net/qeth.h2005-04-20 15:31:52.0 
+0530
@@ -760,14 +760,14 @@
 typedef struct qeth_ringbuffer_t {
qdio_buffer_t buffer[QDIO_MAX_BUFFERS_PER_Q];
qeth_ringbuffer_element_t ringbuf_element[QDIO_MAX_BUFFERS_PER_Q];
-} qeth_ringbuffer_t __attribute__ ((packed,aligned(PAGE_SIZE)));
+} __attribute__((packed)) qeth_ringbuffer_t __attribute__ ((aligned(PAGE_SIZE)));
 
 typedef struct qeth_dma_stuff_t {
unsigned char *sendbuf;
unsigned char *recbuf;
ccw1_t read_ccw;
ccw1_t write_ccw;
-} qeth_dma_stuff_t __attribute__ ((packed,aligned(PAGE_SIZE)));
+} __attribute__((packed)) qeth_dma_stuff_t __attribute__ ((aligned(PAGE_SIZE)));
 
 typedef struct qeth_perf_stats_t {
unsigned int skbs_rec;
diff -urN linux-2.4.30/drivers/s390/net/qeth_mpc.h 
linux-2.4.30-nvk/drivers/s390/net/qeth_mpc.h
--- linux-2.4.30/drivers/s390/net/qeth_mpc.h2004-12-04 17:45:26.0 
+0530
+++ linux-2.4.30-nvk/drivers/s390/net/qeth_mpc.h2005-04-20 
15:33:16.0 +0530
@@ -460,7 +460,7 @@
__u8 unique_id[8];
} create_destroy_addr;
} data;
-} ipa_cmd_t __attribute__ ((packed));
+} __attribute__ ((packed)) ipa_cmd_t;
 
 #define QETH_IOC_MAGIC 0x22
 #define QETH_IOCPROC_REGISTER _IOWR(QETH_IOC_MAGIC, 1, int)
@@ -506,7 +506,7 @@
__u8 snmp_data[ARP_DATA_SIZE];
} snmp_subcommand;
} data;
-} snmp_ipa_setadp_cmd_t __attribute__ ((packed));
+} __attribute__ ((packed)) snmp_ipa_setadp_cmd_t;
 
 typedef struct arp_cmd_t {
__u8 command;
@@ -539,7 +539,7 @@
} setassparms;
 snmp_ipa_setadp_cmd_t setadapterparms; 
} data;
-} arp_cmd_t __attribute__ ((packed));
+} __attribute__ ((packed)) arp_cmd_t;
 
 
 

==

On Mon, 2005-04-18 at 20:37 +0530, Vinay K Nallamothu wrote:
 Hi,
 
 The variable attributes packed and align when used with structure
 should have the following order:
 
 struct ... {...} __attribute__((packed)) var;
 
 This patch fixes few instances where the variable and attributes are
 placed the other way around and had no effect.

-- 
Views expressed in this mail are those of the individual sender and 
do not bind Gsec1 Limited. or its subsidiary, unless the sender has done
so expressly with due authority of Gsec1.

Re: [PATCH x86_64] Live Patching Function on 2.6.11.7

2005-04-20 Thread Rik van Riel
On Wed, 20 Apr 2005, Takashi Ikebe wrote:

 Well, as many said Live patching is very historical  authoritative
 function on especially carrier, telecom vendor.
 If linux want to be adopted on mission critical world, this function is
 essential.

Yes, if you want to use Linux in those scenarios you will
need to change the telco programs to use shared memory and
file descriptor passing, instead of live patching.

-- 
Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it. - Brian W. Kernighan


Re: [PATCH][2.6.12-rc2] __attribute__ placement

2005-04-20 Thread Vinay K Nallamothu
Hi,

Apparently my previous patch incorrectly fixed few cases where the
problem didn't exist. The __attribute__((packed)) issue is applicable
only to typedefed structs and the correct syntax for this should be:

typedef struct ... { ... } __attribute__((packed)) new_type;

Please find the updated patch. Hope I haven't introduced any new bugs.

Thanks
Vinay

 drivers/net/gt96100eth.h  |4 ++--
 include/asm-m68knommu/MC68328.h   |2 +-
 include/asm-m68knommu/MC68EZ328.h |2 +-
 include/asm-m68knommu/MC68VZ328.h |2 +-
 4 files changed, 5 insertions(+), 5 deletions(-)

=
diff -urN linux-2.6.12-rc2/drivers/net/gt96100eth.h 
linux-2.6.12-rc2-nvk/drivers/net/gt96100eth.h
--- linux-2.6.12-rc2/drivers/net/gt96100eth.h   2005-04-07 18:56:46.0 
+0530
+++ linux-2.6.12-rc2-nvk/drivers/net/gt96100eth.h   2005-04-20 
15:50:20.0 +0530
@@ -214,7 +214,7 @@
u32 cmdstat;
u32 next;
u32 buff_ptr;
-} gt96100_td_t __attribute__ ((packed));
+} __attribute__ ((packed)) gt96100_td_t;
 
 typedef struct {
 #ifdef DESC_BE
@@ -227,7 +227,7 @@
u32 cmdstat;
u32 next;
u32 buff_ptr;
-} gt96100_rd_t __attribute__ ((packed));
+} __attribute__ ((packed)) gt96100_rd_t;
 
 
 /* Values for the Tx command-status descriptor entry. */
diff -urN linux-2.6.12-rc2/include/asm-m68knommu/MC68328.h 
linux-2.6.12-rc2-nvk/include/asm-m68knommu/MC68328.h
--- linux-2.6.12-rc2/include/asm-m68knommu/MC68328.h2005-04-07 
18:55:40.0 +0530
+++ linux-2.6.12-rc2-nvk/include/asm-m68knommu/MC68328.h2005-04-20 
15:47:37.0 +0530
@@ -993,7 +993,7 @@
   volatile unsigned short int pad1;
   volatile unsigned short int pad2;
   volatile unsigned short int pad3;
-} m68328_uart __attribute__((packed));
+} __attribute__((packed)) m68328_uart;
 
 
 /**
diff -urN linux-2.6.12-rc2/include/asm-m68knommu/MC68EZ328.h 
linux-2.6.12-rc2-nvk/include/asm-m68knommu/MC68EZ328.h
--- linux-2.6.12-rc2/include/asm-m68knommu/MC68EZ328.h  2005-04-07 
18:55:40.0 +0530
+++ linux-2.6.12-rc2-nvk/include/asm-m68knommu/MC68EZ328.h  2005-04-20 
15:48:27.0 +0530
@@ -815,7 +815,7 @@
   volatile unsigned short int nipr;
   volatile unsigned short int pad1;
   volatile unsigned short int pad2;
-} m68328_uart __attribute__((packed));
+} __attribute__((packed)) m68328_uart;
 
 
 /**
diff -urN linux-2.6.12-rc2/include/asm-m68knommu/MC68VZ328.h 
linux-2.6.12-rc2-nvk/include/asm-m68knommu/MC68VZ328.h
--- linux-2.6.12-rc2/include/asm-m68knommu/MC68VZ328.h  2005-04-07 
18:55:40.0 +0530
+++ linux-2.6.12-rc2-nvk/include/asm-m68knommu/MC68VZ328.h  2005-04-20 
15:48:01.0 +0530
@@ -909,7 +909,7 @@
   volatile unsigned short int nipr;
   volatile unsigned short int hmark;
   volatile unsigned short int unused;
-} m68328_uart __attribute__((packed));
+} __attribute__((packed)) m68328_uart;
 
 
 

=
 On Mon, 2005-04-18 at 20:46 +0530, Vinay K Nallamothu wrote:
  Hi,
  
  The variable attributes packed and align when used with struct,
  should have the following order:
  
  struct ... {...} __attribute__((packed)) var;
  
  This patch fixes few instances where the variable and attributes are
  placed the other way around and had no effect.
  



Re: [PATCH Linux 2.6.12-rc2 01/04] blk: use find_first_zero_bit() in blk_queue_start_tag()

2005-04-20 Thread Tejun Heo
01_blk_tag_map_use_find_first_zero_bit.patch

blk_queue_start_tag() hand-coded searching for the first zero
bit in the tag map.  Replace it with find_first_zero_bit().
With this patch, blk_queue_start_tag() no longer needs the
remainder of the tag map to be filled with 1s, thus allowing it
to work properly with the next remove_real_max_depth patch.
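
The idea in the description above can be sketched in userspace as
follows. This is only an illustration under assumptions: first_zero_bit()
and alloc_tag() are hypothetical stand-ins, not the kernel's
find_first_zero_bit() or blk_queue_start_tag(), and alloc_tag() returns
-1 on failure where the kernel function returns 1.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/*
 * Bounded first-zero-bit search (hypothetical helper). Returns the index
 * of the first clear bit in map[0..nbits), or nbits if every bit in that
 * range is set. Bits at or beyond nbits are never examined.
 */
static size_t first_zero_bit(const unsigned long *map, size_t nbits)
{
	assert(map != NULL);
	for (size_t i = 0; i < nbits; i++) {
		unsigned long word = map[i / WORD_BITS];

		if (!(word & (1UL << (i % WORD_BITS))))
			return i;
	}
	return nbits;
}

/* Allocate a tag: returns the tag number, or -1 when all tags are busy. */
static long alloc_tag(unsigned long *map, size_t max_depth)
{
	size_t tag = first_zero_bit(map, max_depth);

	if (tag >= max_depth)
		return -1;
	map[tag / WORD_BITS] |= 1UL << (tag % WORD_BITS);
	return (long)tag;
}
```

Because the search is bounded by max_depth, the unused bits of the last
word can stay zero and are still never handed out as tags.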

Signed-off-by: Tejun Heo [EMAIL PROTECTED]

 ll_rw_blk.c |   13 -
 1 files changed, 4 insertions(+), 9 deletions(-)

Index: blk-fixes/drivers/block/ll_rw_blk.c
===
--- blk-fixes.orig/drivers/block/ll_rw_blk.c	2005-04-20 20:36:35.0 +0900
+++ blk-fixes/drivers/block/ll_rw_blk.c	2005-04-20 20:36:36.0 +0900
@@ -967,8 +967,7 @@ EXPORT_SYMBOL(blk_queue_end_tag);
 int blk_queue_start_tag(request_queue_t *q, struct request *rq)
 {
 	struct blk_queue_tag *bqt = q->queue_tags;
-	unsigned long *map = bqt->tag_map;
-	int tag = 0;
+	int tag;
 
 	if (unlikely((rq->flags & REQ_QUEUED))) {
 		printk(KERN_ERR 
@@ -977,14 +976,10 @@ int blk_queue_start_tag(request_queue_t 
 		BUG();
 	}
 
-	for (map = bqt->tag_map; *map == -1UL; map++) {
-		tag += BLK_TAGS_PER_LONG;
-
-		if (tag >= bqt->max_depth)
-			return 1;
-	}
+	tag = find_first_zero_bit(bqt->tag_map, bqt->max_depth);
+	if (tag >= bqt->max_depth)
+		return 1;
 
-	tag += ffz(*map);
 	__set_bit(tag, bqt->tag_map);
 
 	rq->flags |= REQ_QUEUED;



[PATCH Linux 2.6.12-rc2 00/04] blk: generic tag support fixes

2005-04-20 Thread Tejun Heo
 Hello, Jens.

 These are fixes to generic tag support in the blk layer.  They all
compile okay and I've proofread them, but as I don't have any HBA which
uses generic tag support, I wasn't able to test directly.  However,
all changes are fairly straightforward.

[ Start of patch descriptions ]

01_blk_tag_map_use_find_first_zero_bit.patch
: use find_first_zero_bit() in blk_queue_start_tag()

blk_queue_start_tag() hand-coded searching for the first zero
bit in the tag map.  Replace it with find_first_zero_bit().
With this patch, blk_queue_start_tag() no longer needs the
remainder of the tag map to be filled with 1s, thus allowing it
to work properly with the next remove_real_max_depth patch.

02_blk_tag_map_remove_real_max_depth.patch
: remove blk_queue_tag->real_max_depth optimization

blk_queue_tag->real_max_depth was used to optimize out
unnecessary allocations/frees on tag resize.  However, the
whole thing was very broken - tag_map was never allocated to
real_max_depth resulting in access beyond the end of the map,
bits in [max_depth..real_max_depth] were set when initializing
a map and copied when resizing resulting in pre-occupied tags.

As the gain of the optimization is very small, well, almost
nil, remove the whole thing.

03_blk_tag_map_remove_BLK_TAGS_PER_LONG.patch
: remove BLK_TAGS_{PER_LONG|MASK}

Replace BLK_TAGS_PER_LONG with BITS_PER_LONG and remove unused
BLK_TAGS_MASK.

04_blk_tag_map_error_handling_cleanup.patch
: cleanup generic tag support error messages

Add KERN_ERR and __FUNCTION__ to generic tag error messages,
and add a comment in blk_queue_end_tag() which explains the
silent failure path.

[ End of patch descriptions ]

 Thanks a lot. :-)



Re: [PATCH Linux 2.6.12-rc2 02/04] blk: remove blk_queue_tag->real_max_depth optimization

2005-04-20 Thread Tejun Heo
02_blk_tag_map_remove_real_max_depth.patch

blk_queue_tag->real_max_depth was used to optimize out
unnecessary allocations/frees on tag resize.  However, the
whole thing was very broken - tag_map was never allocated to
real_max_depth resulting in access beyond the end of the map,
bits in [max_depth..real_max_depth] were set when initializing
a map and copied when resizing resulting in pre-occupied tags.

As the gain of the optimization is very small, well, almost
nil, remove the whole thing.
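
The sizing change in this patch amounts to rounding the bit count up to
whole words instead of always adding one word. A small userspace sketch,
with hypothetical helper names and a locally defined word size standing
in for the kernel macros:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)

/* Number of unsigned longs needed to hold `depth` bits, rounded up. */
static size_t bitmap_words(size_t depth)
{
	assert(depth <= SIZE_MAX - BITS_PER_ULONG);	/* avoid overflow */
	return (depth + BITS_PER_ULONG - 1) / BITS_PER_ULONG;
}

/* The replaced formula: always one word more than depth / word size. */
static size_t bitmap_words_old(size_t depth)
{
	return depth / BITS_PER_ULONG + 1;
}
```

The two only differ when depth is an exact multiple of the word size,
where the old formula yields one extra word.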

Signed-off-by: Tejun Heo [EMAIL PROTECTED]

 drivers/block/ll_rw_blk.c |   35 ++-
 include/linux/blkdev.h|1 -
 2 files changed, 10 insertions(+), 26 deletions(-)

Index: blk-fixes/drivers/block/ll_rw_blk.c
===
--- blk-fixes.orig/drivers/block/ll_rw_blk.c	2005-04-20 20:36:36.0 +0900
+++ blk-fixes/drivers/block/ll_rw_blk.c	2005-04-20 20:36:37.0 +0900
@@ -716,7 +716,7 @@ struct request *blk_queue_find_tag(reque
 {
 	struct blk_queue_tag *bqt = q->queue_tags;
 
-	if (unlikely(bqt == NULL || tag >= bqt->real_max_depth))
+	if (unlikely(bqt == NULL || tag >= bqt->max_depth))
 		return NULL;
 
 	return bqt->tag_index[tag];
@@ -774,9 +774,9 @@ EXPORT_SYMBOL(blk_queue_free_tags);
 static int
 init_tag_map(request_queue_t *q, struct blk_queue_tag *tags, int depth)
 {
-	int bits, i;
 	struct request **tag_index;
 	unsigned long *tag_map;
+	int nr_ulongs;
 
 	if (depth > q->nr_requests * 2) {
 		depth = q->nr_requests * 2;
@@ -788,24 +788,17 @@ init_tag_map(request_queue_t *q, struct 
 	if (!tag_index)
 		goto fail;
 
-	bits = (depth / BLK_TAGS_PER_LONG) + 1;
-	tag_map = kmalloc(bits * sizeof(unsigned long), GFP_ATOMIC);
+	nr_ulongs = ALIGN(depth, BLK_TAGS_PER_LONG) / BLK_TAGS_PER_LONG;
+	tag_map = kmalloc(nr_ulongs * sizeof(unsigned long), GFP_ATOMIC);
 	if (!tag_map)
 		goto fail;
 
 	memset(tag_index, 0, depth * sizeof(struct request *));
-	memset(tag_map, 0, bits * sizeof(unsigned long));
+	memset(tag_map, 0, nr_ulongs * sizeof(unsigned long));
 	tags->max_depth = depth;
-	tags->real_max_depth = bits * BITS_PER_LONG;
 	tags->tag_index = tag_index;
 	tags->tag_map = tag_map;
 
-	/*
-	 * set the upper bits if the depth isn't a multiple of the word size
-	 */
-	for (i = depth; i < bits * BLK_TAGS_PER_LONG; i++)
-		__set_bit(i, tag_map);
-
 	return 0;
 fail:
 	kfree(tag_index);
@@ -870,32 +863,24 @@ int blk_queue_resize_tags(request_queue_
 	struct blk_queue_tag *bqt = q->queue_tags;
 	struct request **tag_index;
 	unsigned long *tag_map;
-	int bits, max_depth;
+	int max_depth, nr_ulongs;
 
 	if (!bqt)
 		return -ENXIO;
 
 	/*
-	 * don't bother sizing down
-	 */
-	if (new_depth <= bqt->real_max_depth) {
-		bqt->max_depth = new_depth;
-		return 0;
-	}
-
-	/*
 	 * save the old state info, so we can copy it back
 	 */
 	tag_index = bqt->tag_index;
 	tag_map = bqt->tag_map;
-	max_depth = bqt->real_max_depth;
+	max_depth = bqt->max_depth;
 
 	if (init_tag_map(q, bqt, new_depth))
 		return -ENOMEM;
 
 	memcpy(bqt->tag_index, tag_index, max_depth * sizeof(struct request *));
-	bits = max_depth / BLK_TAGS_PER_LONG;
-	memcpy(bqt->tag_map, tag_map, bits * sizeof(unsigned long));
+	nr_ulongs = ALIGN(max_depth, BLK_TAGS_PER_LONG) / BLK_TAGS_PER_LONG;
+	memcpy(bqt->tag_map, tag_map, nr_ulongs * sizeof(unsigned long));
 
 	kfree(tag_index);
 	kfree(tag_map);
@@ -925,7 +910,7 @@ void blk_queue_end_tag(request_queue_t *
 
 	BUG_ON(tag == -1);
 
-	if (unlikely(tag >= bqt->real_max_depth))
+	if (unlikely(tag >= bqt->max_depth))
 		return;
 
 	if (unlikely(!__test_and_clear_bit(tag, bqt->tag_map))) {
Index: blk-fixes/include/linux/blkdev.h
===
--- blk-fixes.orig/include/linux/blkdev.h	2005-04-20 20:36:35.0 +0900
+++ blk-fixes/include/linux/blkdev.h	2005-04-20 20:36:37.0 +0900
@@ -294,7 +294,6 @@ struct blk_queue_tag {
 	struct list_head busy_list;	/* fifo list of busy tags */
 	int busy;			/* current depth */
 	int max_depth;			/* what we will send to device */
-	int real_max_depth;		/* what the array can hold */
 	atomic_t refcnt;		/* map can be shared */
 };
 


Re: [PATCH Linux 2.6.12-rc2 03/04] blk: remove BLK_TAGS_{PER_LONG|MASK}

2005-04-20 Thread Tejun Heo
03_blk_tag_map_remove_BLK_TAGS_PER_LONG.patch

Replace BLK_TAGS_PER_LONG with BITS_PER_LONG and remove unused
BLK_TAGS_MASK.

Signed-off-by: Tejun Heo [EMAIL PROTECTED]

 drivers/block/ll_rw_blk.c |4 ++--
 include/linux/blkdev.h|3 ---
 2 files changed, 2 insertions(+), 5 deletions(-)

Index: blk-fixes/drivers/block/ll_rw_blk.c
===
--- blk-fixes.orig/drivers/block/ll_rw_blk.c	2005-04-20 20:36:37.0 +0900
+++ blk-fixes/drivers/block/ll_rw_blk.c	2005-04-20 20:36:38.0 +0900
@@ -788,7 +788,7 @@ init_tag_map(request_queue_t *q, struct 
 	if (!tag_index)
 		goto fail;
 
-	nr_ulongs = ALIGN(depth, BLK_TAGS_PER_LONG) / BLK_TAGS_PER_LONG;
+	nr_ulongs = ALIGN(depth, BITS_PER_LONG) / BITS_PER_LONG;
 	tag_map = kmalloc(nr_ulongs * sizeof(unsigned long), GFP_ATOMIC);
 	if (!tag_map)
 		goto fail;
@@ -879,7 +879,7 @@ int blk_queue_resize_tags(request_queue_
 		return -ENOMEM;
 
 	memcpy(bqt->tag_index, tag_index, max_depth * sizeof(struct request *));
-	nr_ulongs = ALIGN(max_depth, BLK_TAGS_PER_LONG) / BLK_TAGS_PER_LONG;
+	nr_ulongs = ALIGN(max_depth, BITS_PER_LONG) / BITS_PER_LONG;
 	memcpy(bqt->tag_map, tag_map, nr_ulongs * sizeof(unsigned long));
 
 	kfree(tag_index);
Index: blk-fixes/include/linux/blkdev.h
===
--- blk-fixes.orig/include/linux/blkdev.h	2005-04-20 20:36:37.0 +0900
+++ blk-fixes/include/linux/blkdev.h	2005-04-20 20:36:38.0 +0900
@@ -285,9 +285,6 @@ enum blk_queue_state {
 	Queue_up,
 };
 
-#define BLK_TAGS_PER_LONG	(sizeof(unsigned long) * 8)
-#define BLK_TAGS_MASK		(BLK_TAGS_PER_LONG - 1)
-
 struct blk_queue_tag {
 	struct request **tag_index;	/* map of busy tags */
 	unsigned long *tag_map;		/* bit map of free/busy tags */



Re: [PATCH Linux 2.6.12-rc2 04/04] blk: cleanup generic tag support error messages

2005-04-20 Thread Tejun Heo
04_blk_tag_map_error_handling_cleanup.patch

Add KERN_ERR and __FUNCTION__ to generic tag error messages,
and add a comment in blk_queue_end_tag() which explains the
silent failure path.
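
The message format this patch moves to (a prefix naming the reporting
function, followed by the message) can be mimicked in userspace as in
the sketch below. It is an illustration only: TAG_ERR(), tag_err_impl()
and report_missing_tag() are hypothetical names, and snprintf() into a
buffer stands in for printk(KERN_ERR ...).

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Format "<func>: <message>" into buf; hypothetical userspace helper. */
static void tag_err_impl(char *buf, size_t len, const char *func,
			 const char *fmt, ...)
{
	va_list ap;
	int n;

	assert(buf != NULL && len > 0);
	n = snprintf(buf, len, "%s: ", func);
	if (n < 0 || (size_t)n >= len)
		return;	/* prefix alone filled (or overflowed) the buffer */

	va_start(ap, fmt);
	vsnprintf(buf + n, len - (size_t)n, fmt, ap);
	va_end(ap);
}

/* __func__ expands in the caller, so the prefix names the reporting site. */
#define TAG_ERR(buf, len, ...) \
	tag_err_impl((buf), (len), __func__, __VA_ARGS__)

/* Example caller; returns the length of the formatted message. */
static size_t report_missing_tag(char *buf, size_t len, int tag)
{
	TAG_ERR(buf, len, "tag %d is missing\n", tag);
	return strlen(buf);
}
```

The standard __func__ is used here in place of the GCC-specific
__FUNCTION__ that appears in the patch.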

Signed-off-by: Tejun Heo [EMAIL PROTECTED]

 ll_rw_blk.c |   18 +-
 1 files changed, 13 insertions(+), 5 deletions(-)

Index: blk-fixes/drivers/block/ll_rw_blk.c
===
--- blk-fixes.orig/drivers/block/ll_rw_blk.c	2005-04-20 20:36:38.0 +0900
+++ blk-fixes/drivers/block/ll_rw_blk.c	2005-04-20 20:36:40.0 +0900
@@ -911,10 +911,15 @@ void blk_queue_end_tag(request_queue_t *
 	BUG_ON(tag == -1);
 
 	if (unlikely(tag >= bqt->max_depth))
+		/*
+		 * This can happen after tag depth has been reduced.
+		 * FIXME: how about a warning or info message here?
+		 */
 		return;
 
 	if (unlikely(!__test_and_clear_bit(tag, bqt->tag_map))) {
-		printk("attempt to clear non-busy tag (%d)\n", tag);
+		printk(KERN_ERR "%s: attempt to clear non-busy tag (%d)\n",
+		       __FUNCTION__, tag);
 		return;
 	}
 
@@ -923,7 +928,8 @@ void blk_queue_end_tag(request_queue_t *
 	rq->tag = -1;
 
 	if (unlikely(bqt->tag_index[tag] == NULL))
-		printk("tag %d is missing\n", tag);
+		printk(KERN_ERR "%s: tag %d is missing\n",
+		       __FUNCTION__, tag);
 
 	bqt->tag_index[tag] = NULL;
 	bqt->busy--;
@@ -956,8 +962,9 @@ int blk_queue_start_tag(request_queue_t 
 
 	if (unlikely((rq->flags & REQ_QUEUED))) {
 		printk(KERN_ERR 
-		       "request %p for device [%s] already tagged %d",
-		       rq, rq->rq_disk ? rq->rq_disk->disk_name : "?", rq->tag);
+		       "%s: request %p for device [%s] already tagged %d",
+		       __FUNCTION__, rq,
+		       rq->rq_disk ? rq->rq_disk->disk_name : "?", rq->tag);
 		BUG();
 	}
 
@@ -1000,7 +1007,8 @@ void blk_queue_invalidate_tags(request_q
 		rq = list_entry_rq(tmp);
 
 		if (rq->tag == -1) {
-			printk("bad tag found on list\n");
+			printk(KERN_ERR
+			       "%s: bad tag found on list\n", __FUNCTION__);
 			list_del_init(&rq->queuelist);
 			rq->flags &= ~REQ_QUEUED;
 		} else



Re: [PATCH] Updated: Dynamic Tick version 050408-1 - C-state measures

2005-04-20 Thread Dominik Brodowski
On Tue, Apr 19, 2005 at 11:03:30PM +0200, Thomas Renninger wrote:
  All we need to do is to update the diff. Without dynamic ticks, if the
  idle loop didn't get called each jiffy, it was a big hint that there was so
  much activity in between, and if there is activity, there is most likely
  also bus master activity, or at least more work to do, so interrupt activity
  is likely. Therefore we assume there was bm_activity even if there was none.
 
 If I understand this right you want at least wait 32 (or whatever value) ms 
 if there was bm activity,
 before it is allowed to trigger C3/C4?

That's the theory of operation of the current algorithm. I think that we
should do that small change to the current algorithm which allows us to keep
C3/C4 working with dyn-idle first, and then think of a very small abstraction
layer to test different idle algorithms, and -- possibly -- use different
ones for different usages.
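
The theory of operation discussed here could be sketched roughly as
follows. This is only an illustration under stated assumptions, not the
code in processor_idle.c: bm_history_update() and c3_allowed() are
hypothetical names, the history is assumed to hold one bit per tick over
a 32-tick window, and ticks on which the idle loop was not called are
counted as busy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BM_HISTORY_TICKS 32u

static_assert(BM_HISTORY_TICKS == sizeof(uint32_t) * 8,
	      "history window must match the width of the bitmask");

/*
 * Advance the bus-master history by `ticks_elapsed` ticks. Bit 0 is the
 * current tick. Ticks that passed without the idle loop being called are
 * assumed busy; only the current tick uses the observed value.
 */
static uint32_t bm_history_update(uint32_t history, unsigned ticks_elapsed,
				  bool bm_active_now)
{
	if (ticks_elapsed > BM_HISTORY_TICKS)
		ticks_elapsed = BM_HISTORY_TICKS;

	for (unsigned i = 1; i < ticks_elapsed; i++)
		history = (history << 1) | 1u;	/* missed tick: assume busy */

	if (ticks_elapsed > 0)
		history <<= 1;			/* slot for the current tick */
	if (bm_active_now)
		history |= 1u;
	return history;
}

/* C3 only when nothing was seen in the window and nothing is active now. */
static bool c3_allowed(uint32_t history, bool bm_active_now)
{
	return !bm_active_now && history == 0;
}
```

With dynamic ticks, ticks_elapsed can be large even though the system
was idle, which is why the "missed tick means busy" assumption needs
revisiting in that case.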

 I think the problem is (at least I made the experience with this particular
 machine) that bm activity comes very often and regularly (each 30-150ms?).
 
 I think the approach to directly adjust the latency to a deeper sleep state 
 if the
 average bus master and OS activity is low is very efficient.
 
 Because I don't consider whether there was bm_activity the last ms, I only
 consider the average, it seems to happen that I try to trigger
 C3/C4 when there is just something copied and some bm active ?!?

I don't think that this is perfect behaviour: if the system is idle, and
there is _currently_ bus master activity, the CPU should be put into C1 or
C2 type sleep. If you select C3 and actually enter it, you're risking
DMA issues, AFAICS.

 The patch is useless if these failures end up in system freezes on
 other machines...

I know that my patch is useless in its current form, but I wanted to share
it as a different way of doing things. 

 The problem with the old approach is, that after (doesn't matter C1-Cx)
 sleep and dyn_idle_tick, the chance to wake up because of bm activity is
 very likely.
 You enter idle() again - there was bm_activity - C2. Wake up after e.g.
 50ms, because of bm_activity again (bm_sts bit set) - stay in C2, wake up
 after 40ms - bm activity... You only have the chance to get into deeper
 states if the sleeps are interrupted by an interrupt, not bm activity.

That's a side-effect, indeed. However: if there _is_ bus master activity, we
must not enter C3, AFAICS.

Dominik


Re: [PATCH Linux 2.6.12-rc2 00/04] blk: generic tag support fixes

2005-04-20 Thread Jens Axboe
On Wed, Apr 20 2005, Tejun Heo wrote:
  Hello, Jens.
 
  These are fixes to generic tag support in the blk layer.  They all
 compile okay and I've proof read it, but as I don't have any HBA which
 uses generic tag support, I wasn't able to test directly.  However,
 all changes are fairly straight-forward.

All patches look good, thanks!

-- 
Jens Axboe



Re: [PATCH] Updated: Dynamic Tick version 050408-1 - C-state measures

2005-04-20 Thread Pavel Machek
Hi!

  Because I don't consider whether there was bm_activity the last ms, I only
  consider the average, it seems to happen that I try to trigger
  C3/C4 when there is just something copied and some bm active ?!?
 
 I don't think that this is perfect behaviour: if the system is idle, and
 there is _currently_ bus master activity, the CPU should be put into C1 or
 C2 type sleep. If you select C3 and actually enter it, you're risking
 DMA issues, AFAICS.

What kinds of DMA issues? Waiting 32msec or so is only heuristic; it
can go wrong any time. It would be really bad if it corrupted data or
something like that.
Pavel

-- 
Boycott Kodak -- for their patent abuse against Java.


[RFC][PATCH] nameing reserved pages [0/3]

2005-04-20 Thread KAMEZAWA Hiroyuki
Hi,
There are several types of PG_reserved pages:
(a) Memory Hole
(b) Used by Kernel
(c) Set by drivers
(d) Isolated by MCA
(e) Used by perfmon
etc.
I think it's useful to distinguish the many types of PG_reserved pages.
For example, Memory Hotplug can ignore (a).
Two patches, [1/3] and [2/3], are for naming PG_reserved pages.
The type of a page is recorded in page->private.
I'm not sure whether this is safe or not, so only reserved-at-boot pages are 
named, currently.
Patch [3/3] is an interface to show the state of memmap, /dev/memstate.
In /dev/memstate, the file offset is the pfn and a byte represents the state of a page.
In this patch, memory holes and reserved pages have their own values.
Below is the output from my box.
0xff --- Invalid page
0x00 --- Common page
0x02 --- Reserved at boot page
[EMAIL PROTECTED] char]#  od  -t x1 -j 0 -N 65535 /dev/memstate
0000000 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
*
0001540 ff ff ff ff ff ff ff ff ff ff ff ff ff ff 02 02
0001560 02 02 02 02 02 02 02 02 02 02 02 02 02 02 02 02
*
0002400 02 02 02 00 00 00 00 00 00 02 02 02 02 02 02 02
0002420 02 02 02 02 02 02 02 02 02 02 02 02 02 02 02 02
*
0003400 02 02 02 02 02 02 02 02 02 02 02 00 00 00 00 00
0003420 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
*
0010000 02 02 02 02 02 02 02 02 02 02 02 02 02 02 02 02
*
0010640 02 02 02 02 02 02 02 02 02 02 02 02 02 02 00 00
0010660 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
This would be useful for Memory-Hotplug and other purposes.
I think more detailed types could be supported.
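
The byte-per-pfn layout described above lends itself to simple tooling. As a minimal userspace sketch (not part of the patch set), the helpers below count how many pages in a buffer carry a given state byte and measure a run of identical states. The names count_state and run_length are hypothetical, and only the three state values listed above are assumed.

```c
#include <stddef.h>

/* State bytes as listed in the posting; other values are not assumed. */
enum {
	STATE_COMMON   = 0x00,
	STATE_RESERVED = 0x02,	/* reserved at boot */
	STATE_INVALID  = 0xff
};

/* Count pages in buf[0..n) whose state byte equals `state`. */
static size_t count_state(const unsigned char *buf, size_t n,
			  unsigned char state)
{
	size_t i, hits = 0;

	for (i = 0; i < n; i++)
		if (buf[i] == state)
			hits++;
	return hits;
}

/*
 * Length of the run of identical state bytes starting at index `start`.
 * Because the file offset is the pfn, this is a count of contiguous pfns.
 */
static size_t run_length(const unsigned char *buf, size_t n, size_t start)
{
	size_t i = start;

	if (start >= n)
		return 0;
	while (i < n && buf[i] == buf[start])
		i++;
	return i - start;
}
```

The buffer would typically come from a plain read() on /dev/memstate after seeking to the first pfn of interest; that part is left out here because it depends on the device actually being present.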
Thanks.
-- Kame [EMAIL PROTECTED]


Re: [PATCH] Updated: Dynamic Tick version 050408-1 - C-state measures

2005-04-20 Thread Dominik Brodowski
On Wed, Apr 20, 2005 at 01:57:39PM +0200, Pavel Machek wrote:
 Hi!
 
   Because I don't consider whether there was bm_activity the last ms, I only
   consider the average, it seems to happen that I try to trigger
   C3/C4 when there is just something copied and some bm active ?!?
  
  I don't think that this is perfect behaviour: if the system is idle, and
  there is _currently_ bus master activity, the CPU should be put into C1 or
  C2 type sleep. If you select C3 and actually enter it, you're risking
  DMA issues, AFAICS.
 
 What kinds of DMA issues? Waiting 32msec or so is only heuristic; it
 can go wrong any time. It would be really bad if it corrupted data or
 something like that.

loop()
   a) bus mastering activity is going on at the very moment
   b) the CPU is entering C3
   c) the CPU is woken out of C3 because of bus mastering activity

the repeated delay between b) and c) might be problematic, as can be seen
by the comment in processor_idle.c:

 * TBD: A better policy might be to fallback to the demotion
 *  state (use it for this quantum only) istead of
 *  demoting -- and rely on duration as our sole demotion
 *  qualification.  This may, however, introduce DMA
 *  issues (e.g. floppy DMA transfer overrun/underrun).
 */

I'm not so worried about floppy DMA but about the ipw2x00 issues here.
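
To make the concern concrete, here is a minimal sketch of the policy argued for above: when bus master activity is happening right now, cap the sleep state at C2 instead of entering C3. This is illustrative only. pick_cstate, enum cstate and the bm_active_now flag are hypothetical names, not the processor_idle.c implementation.

```c
#include <stdbool.h>

enum cstate { CSTATE_C1 = 1, CSTATE_C2 = 2, CSTATE_C3 = 3 };

/*
 * Choose the state to enter for this idle period.
 * `wanted` is what the promotion heuristic asked for (hypothetical input);
 * `bm_active_now` says whether bus mastering is in progress at this moment.
 */
static enum cstate pick_cstate(enum cstate wanted, bool bm_active_now)
{
	/* Per the loop above, C3 during bus mastering just gets woken again. */
	if (bm_active_now && wanted >= CSTATE_C3)
		return CSTATE_C2;
	return wanted;
}
```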

Dominik


[RFC][PATCH] nameing reserved pages [1/3]

2005-04-20 Thread KAMEZAWA Hiroyuki
inline functions for naming pages.
-- Kame

Add page_type definitions and helper functions for naming reserved pages.

A reserved page's type is stored in page->private.

This is a weak naming method: anyone can overwrite it.

This information is used by /dev/memstate in a following patch.

Signed-off-by: KAMEZAWA Hiroyuki [EMAIL PROTECTED]


---

 linux-2.6.12-rc2-kamezawa/include/linux/mm.h |   31 +++
 1 files changed, 31 insertions(+)

diff -puN include/linux/mm.h~name_reserved include/linux/mm.h
--- linux-2.6.12-rc2/include/linux/mm.h~name_reserved	2005-04-20 09:37:48.0 +0900
+++ linux-2.6.12-rc2-kamezawa/include/linux/mm.h	2005-04-20 10:38:01.0 +0900
@@ -348,6 +348,37 @@ static inline void put_page(struct page 
 #endif /* CONFIG_HUGETLB_PAGE */
 
 /*
+ * Type of Pages. This is used in /dev/memstate.
+ * value range is 0-255.
+ */
+enum page_type {
+	Page_Common = 0,
+	Min_Reserved_Types = 1,
+	Rserved_Unknwon = 1,
+	Reserved_At_Boot,
+	Max_Reserved_Types,
+	Page_Invalid = 0xff
+};
+/*
+ * Basically, page->private has no meaning without PG_private.
+ * Here, we use page->private for PG_reserved pages to record type of a page.
+ * Because a page is reserved, anyone will not modify page->private.
+ * When it is freed, page->private will be overwritten by some code.
+ */
+static inline void set_page_reserved(struct page *page, unsigned char type)
+{
+	SetPageReserved(page);
+	page->private = type;
+}
+
+static inline unsigned char reserved_page_type(struct page *page)
+{
+	if (!PageReserved(page))
+		return 0;
+	return (unsigned char)page->private;
+}
+
+/*
  * Multiple processes may see the same page. E.g. for untouched
  * mappings of /dev/null, all processes see the same page full of
  * zeroes, and text pages of executables and shared libraries have

_
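
For readers who want to experiment with the naming idea outside the kernel, the following is a minimal userspace sketch of the two helpers. struct mock_page and the mock_ prefixed functions are hypothetical stand-ins; only the enum values Page_Common, Reserved_At_Boot (which works out to 2 in the patch's enum) and Page_Invalid are taken from the patch.

```c
#include <stdbool.h>

/* Userspace stand-in for struct page: only the two fields the idea needs. */
struct mock_page {
	bool reserved;		/* models the PG_reserved flag */
	unsigned long private;	/* models page->private */
};

enum page_type {
	Page_Common = 0,
	Reserved_At_Boot = 2,
	Page_Invalid = 0xff
};

/* Mark the page reserved and record why in the private field. */
static void mock_set_page_reserved(struct mock_page *page, unsigned char type)
{
	page->reserved = true;
	page->private = type;
}

/* Report the recorded type, or Page_Common for a non-reserved page. */
static unsigned char mock_reserved_page_type(const struct mock_page *page)
{
	if (!page->reserved)
		return Page_Common;
	return (unsigned char)page->private;
}
```

Because the type lives in an ordinary field, any later store to it changes what is reported, which is the "weak naming" caveat mentioned in the description.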


[RFC][PATCH] nameing reserved pages [2/3]

2005-04-20 Thread KAMEZAWA Hiroyuki
naming reserved-at-boot page.
-- Kame

Name reserved-at-boot pages as Reserved_At_Boot.

Signed-off-by: KAMEZAWA Hiroyuki [EMAIL PROTECTED]


---

 linux-2.6.12-rc2-kamezawa/mm/page_alloc.c |2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)

diff -puN mm/page_alloc.c~mark_reserved_boot mm/page_alloc.c
--- linux-2.6.12-rc2/mm/page_alloc.c~mark_reserved_boot	2005-04-20 10:39:22.0 +0900
+++ linux-2.6.12-rc2-kamezawa/mm/page_alloc.c	2005-04-20 10:39:22.0 +0900
@@ -1589,7 +1589,7 @@ void __init memmap_init_zone(unsigned lo
 		set_page_zone(page, NODEZONE(nid, zone));
 		set_page_count(page, 0);
 		reset_page_mapcount(page);
-		SetPageReserved(page);
+		set_page_reserved(page, Reserved_At_Boot);
 		INIT_LIST_HEAD(&page->lru);
 #ifdef WANT_PAGE_VIRTUAL
 		/* The shift won't overflow because ZONE_NORMAL is below 4G. */

_


[RFC][PATCH] nameing reserved pages [3/3]

2005-04-20 Thread KAMEZAWA Hiroyuki
showing state of memmap by /dev/memstate.
-- Kame

/dev/memstate shows the status of memmap.
A user can read the state of each page as one byte.
This feature is useful for Memory-hotplug and other cases
where we have to find out what a page is used for.

Signed-off-by: KAMEZAWA Hiroyuki [EMAIL PROTECTED]



---

 linux-2.6.12-rc2-kamezawa/drivers/char/mem.c |   63 +++
 1 files changed, 63 insertions(+)

diff -puN drivers/char/mem.c~show_memstate drivers/char/mem.c
--- linux-2.6.12-rc2/drivers/char/mem.c~show_memstate	2005-04-20 10:39:40.0 +0900
+++ linux-2.6.12-rc2-kamezawa/drivers/char/mem.c	2005-04-20 16:51:45.0 +0900
@@ -715,6 +715,59 @@ static int open_port(struct inode * inod
 	return capable(CAP_SYS_RAWIO) ? 0 : -EPERM;
 }
 
+static inline unsigned char get_page_type(struct page *page)
+{
+	if ( !PageReserved(page) )
+		return Page_Common;
+	return reserved_page_type(page);
+}
+
+static ssize_t read_memstate(struct file *file, char __user *buf,
+				size_t count, loff_t *ppos)
+{
+	unsigned long pfn = *ppos;
+	unsigned long left, written;
+	ssize_t ret;
+	int len, i;
+	struct page *page;
+	char *kbuf;
+
+	if (!count)
+		return 0;
+
+	if (!access_ok(VERIFY_WRITE, buf, count))
+		return -EFAULT;
+
+	left = count;
+	written = 0;
+	kbuf = (char *)__get_free_page(GFP_KERNEL);
+
+	if (!kbuf)
+		return -ENOMEM;
+	ret = -EFAULT;
+	/* copy data */
+	while (left) {
+		len = (left < PAGE_SIZE) ? left : PAGE_SIZE;
+		for (i = 0; i < len; i++, pfn++) {
+			if ( !pfn_valid(pfn) ) {
+				kbuf[i] = Page_Invalid;
+				continue;
+			}
+			page = pfn_to_page(pfn);
+			kbuf[i] = get_page_type(page);
+		}
+		if (copy_to_user(buf, kbuf, len))
+			goto err_out;
+		written += len;
+		left -= len;
+	}
+	*ppos = pfn;
+	ret = written;
+ err_out:
+	free_page((unsigned long)kbuf);
+	return ret;
+}
+
 #define zero_lseek null_lseek
 #define full_lseek  null_lseek
 #define write_zero write_null
@@ -770,6 +823,11 @@ static struct file_operations full_fops 
 	.write		= write_full,
 };
 
+static struct file_operations memstate_fops = {
+	.llseek		= memory_lseek,
+	.read		= read_memstate,
+};
+
 static ssize_t kmsg_write(struct file * file, const char __user * buf,
 			  size_t count, loff_t *ppos)
 {
@@ -825,6 +883,9 @@ static int memory_open(struct inode * in
 		case 11:
 			filp->f_op = &kmsg_fops;
 			break;
+		case 12:
+			filp->f_op = &memstate_fops;
+			break;
 		default:
 			return -ENXIO;
 	}
@@ -854,6 +915,7 @@ static const struct {
 	{8, "random",  S_IRUGO | S_IWUSR,           &random_fops},
 	{9, "urandom", S_IRUGO | S_IWUSR,           &urandom_fops},
 	{11,"kmsg",    S_IRUGO | S_IWUSR,           &kmsg_fops},
+	{12,"memstate", S_IRUSR | S_IRGRP,  &memstate_fops},
 };
 
 static struct class_simple *mem_class;
@@ -870,6 +932,7 @@ static int __init chr_dev_init(void)
 		class_simple_device_add(mem_class,
 					MKDEV(MEM_MAJOR, devlist[i].minor),
 					NULL, devlist[i].name);
+		printk("creating mem device %d %d\n",i,devlist[i].minor);
 		devfs_mk_cdev(MKDEV(MEM_MAJOR, devlist[i].minor),
 				S_IFCHR | devlist[i].mode, devlist[i].name);
 	}

_
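
The chunked copy loop in read_memstate() can be tried out in userspace. The sketch below is a simplified model under stated assumptions: CHUNK stands in for PAGE_SIZE, memcpy() stands in for copy_to_user(), and state_of() is a hypothetical replacement for get_page_type(). One difference from the loop as posted: the destination pointer is advanced after every chunk. The posted loop does not appear to advance buf, so reads larger than one page may be worth checking against the real patch.

```c
#include <stddef.h>
#include <string.h>

#define CHUNK 8	/* stands in for PAGE_SIZE in this sketch */

/* Hypothetical per-pfn state lookup, standing in for get_page_type(). */
static unsigned char state_of(unsigned long pfn)
{
	return (pfn % 3 == 0) ? 0x02 : 0x00;
}

/*
 * Fill `dst` with `count` state bytes starting at `pfn`, one chunk at a
 * time. `dst` is advanced after each chunk so that later chunks do not
 * overwrite earlier ones.
 */
static size_t read_states(unsigned char *dst, size_t count, unsigned long pfn)
{
	unsigned char chunk[CHUNK];
	size_t left = count, written = 0;

	while (left) {
		size_t len = (left < CHUNK) ? left : CHUNK;
		size_t i;

		for (i = 0; i < len; i++, pfn++)
			chunk[i] = state_of(pfn);
		memcpy(dst, chunk, len);
		dst += len;
		written += len;
		left -= len;
	}
	return written;
}
```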


Re: [PATCH] Updated: Dynamic Tick version 050408-1 - C-state measures

2005-04-20 Thread Pavel Machek
Hi!

    Because I don't consider whether there was bm_activity the last ms, I only
    consider the average, it seems to happen that I try to trigger
    C3/C4 when there is just something copied and some bm active ?!?
   
   I don't think that this is perfect behaviour: if the system is idle, and
   there is _currently_ bus master activity, the CPU should be put into C1 or
   C2 type sleep. If you select C3 and actually enter it, you're risking
   DMA issues, AFAICS.
  
  What kinds of DMA issues? Waiting 32msec or so is only heuristic; it
  can go wrong any time. It would be really bad if it corrupted data or
  something like that.
 
 loop()
    a) bus mastering activity is going on at the very moment
    b) the CPU is entering C3
    c) the CPU is woken out of C3 because of bus mastering activity
 
 the repeated delay between b) and c) might be problematic, as can be seen
 by the comment in processor_idle.c:
 
  * TBD: A better policy might be to fallback to the demotion
  *  state (use it for this quantum only) istead of
  *  demoting -- and rely on duration as our sole demotion
  *  qualification.  This may, however, introduce DMA
  *  issues (e.g. floppy DMA transfer overrun/underrun).
  */
 
 I'm not so worried about floppy DMA but about the ipw2x00 issues here.

Like ipw2x00 loses packets if this happens too often?

Pavel
-- 
Boycott Kodak -- for their patent abuse against Java.


[RFC/PATCH] unregister_node() for hotplug use

2005-04-20 Thread Keiichiro Tokunaga
  This adds a generic function, unregister_node().
It is used to remove the sysfs objects of a node that is going away
during hotplug.  It is available when CONFIG_HOTPLUG=y.
The patch is against 2.6.12-rc2-mm3.

Thanks,
Keiichiro Tokunaga

Signed-off-by: Keiichiro Tokunaga [EMAIL PROTECTED]
---

 linux-2.6.12-rc2-mm3-kei/drivers/base/node.c  |   29 --
 linux-2.6.12-rc2-mm3-kei/include/linux/node.h |6 -
 2 files changed, 32 insertions(+), 3 deletions(-)

diff -puN drivers/base/node.c~numa_hp_base drivers/base/node.c
--- linux-2.6.12-rc2-mm3/drivers/base/node.c~numa_hp_base	2005-04-14 20:49:37.0 +0900
+++ linux-2.6.12-rc2-mm3-kei/drivers/base/node.c	2005-04-14 20:49:37.0 +0900
@@ -136,7 +136,7 @@ static SYSDEV_ATTR(distance, S_IRUGO, no
  *
  * Initialize and register the node device.
  */
-int __init register_node(struct node *node, int num, struct node *parent)
+int __devinit register_node(struct node *node, int num, struct node *parent)
 {
 	int error;
 
@@ -145,6 +145,9 @@ int __init register_node(struct node *no
 	error = sysdev_register(&node->sysdev);
 
 	if (!error){
+		/*
+		 * If you add new object here, delete it when unregistering.
+		 */
 		sysdev_create_file(&node->sysdev, &attr_cpumap);
 		sysdev_create_file(&node->sysdev, &attr_meminfo);
 		sysdev_create_file(&node->sysdev, &attr_numastat);
@@ -153,8 +156,30 @@ int __init register_node(struct node *no
 	return error;
 }
 
+/*
+ * unregister_node - Remove objects of a node going away from sysfs.
+ * @node - node going away
+ *
+ * This is used only for hotplug.
+ */
+#ifdef CONFIG_HOTPLUG
+void unregister_node(struct node *node)
+{
+	if (node == NULL)
+		return;
+
+	sysdev_remove_file(&node->sysdev, &attr_cpumap);
+	sysdev_remove_file(&node->sysdev, &attr_meminfo);
+	sysdev_remove_file(&node->sysdev, &attr_numastat);
+	sysdev_remove_file(&node->sysdev, &attr_distance);
+
+	sysdev_unregister(&node->sysdev);
+}
+EXPORT_SYMBOL(register_node);
+EXPORT_SYMBOL(unregister_node);
+#endif /* CONFIG_HOTPLUG */
 
-int __init register_node_type(void)
+static int __init register_node_type(void)
 {
 	return sysdev_class_register(&node_class);
 }
diff -puN include/linux/node.h~numa_hp_base include/linux/node.h
--- linux-2.6.12-rc2-mm3/include/linux/node.h~numa_hp_base	2005-04-14 20:49:37.0 +0900
+++ linux-2.6.12-rc2-mm3-kei/include/linux/node.h	2005-04-14 20:49:37.0 +0900
@@ -21,12 +21,16 @@
 
 #include <linux/sysdev.h>
 #include <linux/cpumask.h>
+#include <linux/module.h>
 
 struct node {
 	struct sys_device	sysdev;
 };
 
-extern int register_node(struct node *, int, struct node *);
+extern int __devinit register_node(struct node *, int, struct node *);
+#ifdef CONFIG_HOTPLUG
+extern void unregister_node(struct node *node);
+#endif
 
 #define to_node(sys_device) container_of(sys_device, struct node, sysdev)
 

_
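
The comment added to register_node() asks that anything created there also be removed when unregistering. One way to keep the two sides from drifting apart is to drive both from a single table. The following is a userspace sketch of that idea only, not a proposal for the actual node.c code; struct mock_node, node_attr_names and the mock_ functions are hypothetical.

```c
#include <stddef.h>

/* Hypothetical attribute table; the names mirror the files in the patch. */
static const char *const node_attr_names[] = {
	"cpumap", "meminfo", "numastat", "distance",
};
#define NODE_ATTR_COUNT (sizeof(node_attr_names) / sizeof(node_attr_names[0]))

/* Userspace stand-in for struct node: tracks which files currently exist. */
struct mock_node {
	int registered;
	int file_present[NODE_ATTR_COUNT];
};

static void mock_register_node(struct mock_node *node)
{
	size_t i;

	node->registered = 1;
	for (i = 0; i < NODE_ATTR_COUNT; i++)
		node->file_present[i] = 1;	/* models sysdev_create_file() */
}

static void mock_unregister_node(struct mock_node *node)
{
	size_t i;

	if (node == NULL)
		return;
	for (i = 0; i < NODE_ATTR_COUNT; i++)
		node->file_present[i] = 0;	/* models sysdev_remove_file() */
	node->registered = 0;			/* models sysdev_unregister() */
}

/* Number of attribute files that still exist for this node. */
static size_t mock_files_present(const struct mock_node *node)
{
	size_t i, n = 0;

	for (i = 0; i < NODE_ATTR_COUNT; i++)
		n += node->file_present[i] ? 1 : 0;
	return n;
}
```

With one table, adding a fifth attribute changes both paths at once, so nothing is left behind in the mock when the node goes away.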


working around gcc bogons (was: Re: Whirlpool oopses in 2.6.11 and 2.6.12-rc2)

2005-04-20 Thread Denis Vlasenko
[resending to lkml for wider audience]

   modprobe tcrypt hangs the box on both kernels.
   The last printks are:
   
   wp256 test runs ok
   
   testing wp384
   NNnUnable to handle kernel paging request at virtual address eXXX
   
   Nothing is printed after this and system locks up solid.
   No Sysrq-B.
   
   IIRC, 2.6.9 was okay.
  
  Update: it does not oops on another machine. CPU or .config related,
  I'll look into it...
 
 Any update?  This is candidate for -stable fixing if it's an actual bug.

Yes. wp512_process_buffer() was using 3k of stack if compiled with -O2.
The appended wp512.c (sans table at top) is instrumented to show it.
Use "make crypto/wp512.s".

The meat of the matter is this:

L[0]  = C0[BYTE7(K[0])] XEND
X(L[0]) C1[BYTE6(K[7])] XEND
X(L[0]) C2[BYTE5(K[6])] XEND
X(L[0]) C3[BYTE4(K[5])] XEND
X(L[0]) C4[BYTE3(K[4])] XEND
X(L[0]) C5[BYTE2(K[3])] XEND
X(L[0]) C6[BYTE1(K[2])] XEND
X(L[0]) C7[BYTE0(K[1])] XEND
X(L[0]) rc[r];

It can be done either as "v = c1 ^ c2 ^ c3 ^ c4;"
or as "v = c1; v ^= c2; v ^= c3; v ^= c4;". The macros
X and XEND make it possible to test both ways.

gcc 3.2.3 (IIRC, the box is at home) eats
~3k of stack for first method. gcc 3.4.1
seems to do better, but gcc 3.2.3 is in wide
use.
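
For anyone who wants to see the two expansions side by side, here is a small standalone sketch of the X/XEND trick. It uses uint64_t instead of the kernel's u64 and a four-element input instead of the Whirlpool tables; the function names xor_single_expr and xor_stepwise are made up for this example. Both functions compute the same value; only the shape of the statement the compiler sees differs.

```c
#include <stdint.h>

/* Form 1: the macros expand to one long XOR expression. */
#define X(a) ^
#define XEND
static uint64_t xor_single_expr(const uint64_t c[4])
{
	uint64_t v;

	v     = c[0] XEND
	X(v)    c[1] XEND
	X(v)    c[2] XEND
	X(v)    c[3];
	return v;
}
#undef X
#undef XEND

/* Form 2: the macros expand to a series of separate "v ^= ..." statements. */
#define X(a) a ^=
#define XEND ;
static uint64_t xor_stepwise(const uint64_t c[4])
{
	uint64_t v;

	v     = c[0] XEND
	X(v)    c[1] XEND
	X(v)    c[2] XEND
	X(v)    c[3];
	return v;
}
#undef X
#undef XEND
```

Comparing the stack usage of the two forms would still require inspecting the generated assembly for the compiler in question; the sketch only shows that the source-level result is identical.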

There is more.

#define BYTE7(v) ((u8)((v) >> 56))
gcc produces a full u32 load, shift and zero-extend
for this one.

#define BYTE7(v) (((u8*)&v)[7])
gcc does a simple byte load.

There is even more.

#define BYTE3(v) ((u8)((u32)(v) >> 24))
                       ^^^^^
Without this, gcc produces:
shrdl   $8, %edx, %eax <=== not needed
andl    $255, %eax

This is all seen on i386 only. I expect this to be different
on each arch.
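
The two ways of picking a byte out of a 64-bit value can be checked against each other with a small standalone sketch. It uses stdint types instead of the kernel's u8/u64, and byte_by_shift, byte_by_load and host_is_little_endian are names invented for this example. The load variant has to know the host byte order, which is the portability cost of that approach.

```c
#include <stdint.h>

/* Shift-based extraction: byte n (0 = least significant) of a 64-bit value. */
static uint8_t byte_by_shift(uint64_t v, unsigned n)
{
	return (uint8_t)(v >> (8 * n));
}

/* Returns 1 on a little-endian host, 0 otherwise. */
static int host_is_little_endian(void)
{
	const uint16_t probe = 1;

	return *(const uint8_t *)&probe == 1;
}

/*
 * Load-based extraction: read the byte straight from memory.
 * The index depends on host byte order: byte n sits at offset n on a
 * little-endian host and at offset 7 - n on a big-endian one.
 */
static uint8_t byte_by_load(const uint64_t *v, unsigned n)
{
	const uint8_t *p = (const uint8_t *)v;

	return host_is_little_endian() ? p[n] : p[7 - n];
}
```

Which of the two a given gcc version compiles to better code is exactly the open question of this mail; the sketch only demonstrates that they agree on the result.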

I'd like to generate good code, yet without heavy tailoring
for gcc versions and CPU architectures.

What shall I do?
--
vda

/**
 * The core Whirlpool transform.
 */

static void wp512_process_buffer(struct wp512_ctx *wctx) {
	int i, r;
	u64 K[8];        /* the round key */
	u64 block[8];    /* mu(buffer) */
	u64 state[8];    /* the cipher state */
	u64 L[8];

	for (i = 0; i < 8; i++) {
		block[i] = be64_to_cpu( ((__be64*)wctx->buffer)[i] );
	}

	state[0] = block[0] ^ (K[0] = wctx->hash[0]);
	state[1] = block[1] ^ (K[1] = wctx->hash[1]);
	state[2] = block[2] ^ (K[2] = wctx->hash[2]);
	state[3] = block[3] ^ (K[3] = wctx->hash[3]);
	state[4] = block[4] ^ (K[4] = wctx->hash[4]);
	state[5] = block[5] ^ (K[5] = wctx->hash[5]);
	state[6] = block[6] ^ (K[6] = wctx->hash[6]);
	state[7] = block[7] ^ (K[7] = wctx->hash[7]);


// gcc optimizer bug: first method is noticeably
// worse than second: loads full u32, shifts and
// zero-extends low u8 to u32
#if 0
 #define BYTE7(v) ((u8)((v) >> 56))
 #define BYTE6(v) ((u8)((v) >> 48))
 #define BYTE5(v) ((u8)((v) >> 40))
 #define BYTE4(v) ((u8)((v) >> 32))
 // gcc optimizer bug: without (u32) below will emit
 // spurious shrd insns
 #define BYTE3(v) ((u8)((u32)(v) >> 24))
 #define BYTE2(v) ((u8)((u32)(v) >> 16))
 #define BYTE1(v) ((u8)((u32)(v) >>  8))
 #define BYTE0(v) ((u8)(v))
#else
// little-endian
 #define BYTE7(v) (((u8*)&v)[7])
 #define BYTE6(v) (((u8*)&v)[6])
 #define BYTE5(v) (((u8*)&v)[5])
 #define BYTE4(v) (((u8*)&v)[4])
 #define BYTE3(v) (((u8*)&v)[3])
 #define BYTE2(v) (((u8*)&v)[2])
 #define BYTE1(v) (((u8*)&v)[1])
 #define BYTE0(v) (((u8*)&v)[0])
#endif

// gcc -O2 optimizer bug: second method
// causes excessive spills (~3K stack used)
#if 1
 #define X(a) a ^=
 #define XEND ;
#else
 #define X(a) ^
 #define XEND
#endif
	for (r = 1; r <= WHIRLPOOL_ROUNDS; r++) {
asm("#1");
		L[0]  = C0[BYTE7(K[0])] XEND
		X(L[0]) C1[BYTE6(K[7])] XEND
		X(L[0]) C2[BYTE5(K[6])] XEND
		X(L[0]) C3[BYTE4(K[5])] XEND
		X(L[0]) C4[BYTE3(K[4])] XEND
		X(L[0]) C5[BYTE2(K[3])] XEND
		X(L[0]) C6[BYTE1(K[2])] XEND
		X(L[0]) C7[BYTE0(K[1])] XEND
		X(L[0]) rc[r];
asm("#2");

		L[1]  = C0[BYTE7(K[1])] XEND
		X(L[1]) C1[BYTE6(K[0])] XEND
		X(L[1]) C2[BYTE5(K[7])] XEND
		X(L[1]) C3[BYTE4(K[6])] XEND
		X(L[1]) C4[BYTE3(K[5])] XEND
		X(L[1]) C5[BYTE2(K[4])] XEND
		X(L[1]) C6[BYTE1(K[3])] XEND
		X(L[1]) C7[BYTE0(K[2])];

		L[2]  = C0[BYTE7(K[2])] XEND
		X(L[2]) C1[BYTE6(K[1])] XEND
		X(L[2]) C2[BYTE5(K[0])] XEND
		X(L[2]) C3[BYTE4(K[7])] XEND
		X(L[2]) C4[BYTE3(K[6])] XEND
		X(L[2]) C5[BYTE2(K[5])] XEND
		X(L[2]) C6[BYTE1(K[4])] XEND
		X(L[2]) C7[BYTE0(K[3])];

		L[3]  = C0[BYTE7(K[3])] XEND
		X(L[3]) C1[BYTE6(K[2])] XEND
		X(L[3]) 

Re: [PATCH] Updated: Dynamic Tick version 050408-1 - C-state measures

2005-04-20 Thread Dominik Brodowski
Hi,

On Wed, Apr 20, 2005 at 02:08:46PM +0200, Pavel Machek wrote:
 Like ipw2x00 looses packets if this happens too often?

See "PCI latency error if C3 enabled" on http://ipw2100.sf.net -- it causes
network instability and frequent firmware restarts.

Dominik


[no subject]

2005-04-20 Thread vxoohxrniwih


Re: [RFC][PATCH] nameing reserved pages [0/3]

2005-04-20 Thread Arjan van de Ven
On Wed, 2005-04-20 at 21:02 +0900, KAMEZAWA Hiroyuki wrote:
 Hi,
 
 There are several types of PG_reserved pages,
 (a) Memory Hole
 (b) Used by Kernel
 (c) Set by drivers
 (d) Isorated by MCA
 (e) used by perfmon
 etc
 
 I think it's useful to distinguish many types of PG_reserved pages.

I'm not so sure about this at all.

 For example, Memory Hotplug can ignore (a).

Memory Hotplug can also use page_is_ram().

/dev/memstate really looks like a bad idea to me as well... I'd rather
have fewer /dev/*mem* devices than more.





Re: [RFC/PATCH] unregister_node() for hotplug use

2005-04-20 Thread Arjan van de Ven

 diff -puN drivers/base/node.c~numa_hp_base drivers/base/node.c
 --- linux-2.6.12-rc2-mm3/drivers/base/node.c~numa_hp_base 2005-04-14 
 20:49:37.0 +0900
 +++ linux-2.6.12-rc2-mm3-kei/drivers/base/node.c  2005-04-14 
 20:49:37.0 +0900
 @@ -136,7 +136,7 @@ static SYSDEV_ATTR(distance, S_IRUGO, no
   
 +EXPORT_SYMBOL(register_node);
 +EXPORT_SYMBOL(unregister_node);
 +#endif /* CONFIG_HOTPLUG */
  

Please make these EXPORT_SYMBOL_GPL; the rest of sysfs is too, and these
are very much deep kernel internals that are Linux-specific.




Re: [PATCH] Updated: Dynamic Tick version 050408-1 - C-state measures

2005-04-20 Thread Thomas Renninger
Dominik Brodowski wrote:
 On Tue, Apr 19, 2005 at 11:03:30PM +0200, Thomas Renninger wrote:
All we need to do is to update the diff. Without dynamic ticks, if the
idle loop didn't get called each jiffy, it was a big hint that there was so
much activity in between, and if there is activity, there is most likely
also bus master activity, or at least more work to do, so interrupt activity
is likely. Therefore we assume there was bm_activity even if there was none.

If I understand this right you want at least wait 32 (or whatever value) ms 
if there was bm activity,
before it is allowed to trigger C3/C4?
 
 That's the theory of operation of the current algorithm. I think that we
 should do that small change to the current algorithm which allows us to keep
 C3/C4 working with dyn-idle first, and then think of a very small abstraction
 layer to test different idle algroithms, and -- possibly -- use different
 ones for different usages.
 
I think the problem is (at least I made the experience with this particular
machine) that bm activity comes very often and regularly (each 30-150ms?).

I think the approach to directly adjust the latency to a deeper sleep state 
if the
average bus master and OS activity is low is very efficient.

Because I don't consider whether there was bm_activity the last ms, I only
consider the average, it seems to happen that I try to trigger
C3/C4 when there is just something copied and some bm active ?!?
 
 I don't think that this is perfect behaviour: if the system is idle, and
 there is _currently_ bus master activity, the CPU should be put into C1 or
 C2 type sleep. If you select C3 and actually enter it, you're risking
 DMA issues, AFAICS.
 
On my system triggering C3/C4 is just ignored (sleep_ticks < 0).
These ignored attempts (C3/C4 failures) seem to depend directly on how much
bm_activity there actually is.
With the current method (wait at least 30 ms after bm activity before
triggering C3/C4) these failures never happened.
As mentioned, using bm_promotion_ms you can reduce the failures, but never
reach zero.
If these failures lead to system freezes on other systems, my next sentence
(about my patch) is valid.

The patch is useless if these failures end up in system freezes on
other machines...
 
 I know that my patch is useless in its current form, but I wanted to share
 it as a different way of doing things. 
 
The problem with the old approach is, that after (doesn't matter C1-Cx)
sleep and dyn_idle_tick, the chance to wake up because of bm activity is
very likely.
You enter idle() again - there was bm_activity - C2. Wake up after e.g.
50ms, because of bm_activity again (bm_sts bit set) - stay in C2, wake up
after 40ms - bm activity... You only have the chance to get into deeper
states if the sleeps are interrupted by an interrupt, not bm activity.
 
 That's a side-effect, indeed. However: if there _is_ bus master activity, we
 must not enter C3, AFAICS.
 

What about a mixed approach: only reprogram the timer if you want to go to
deeper sleeping states (C3-Cx) when bm activity comes into play?

It's the only way you can say: for the last xy ms there was no bm activity
(use bm_history), so now it's safe to sleep. It is also efficient: don't
sleep forever in C1/C2, because the bm_sts bit will probably be set
afterwards and you need to wait another xy ms in C1/C2, which ends in an
endless loop ...

That way the timer is only disabled where it is really useful, on C3-Cx
machines (or are there other cases?).
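
A rough sketch of the "no bm activity for the last xy ms" check, written as standalone C. It assumes one history bit per millisecond with the newest sample in bit 0; that granularity, the struct bm_state wrapper and the function names are assumptions of this sketch, not something taken from processor_idle.c.

```c
#include <stdint.h>
#include <stdbool.h>

/* One bit per elapsed interval; bit 0 is the most recent one. */
struct bm_state {
	uint32_t bm_history;
};

/* Record whether bus master activity was seen in the latest interval. */
static void bm_record(struct bm_state *s, bool bm_sts)
{
	s->bm_history = (s->bm_history << 1) | (bm_sts ? 1u : 0u);
}

/* True if none of the last `window_ms` intervals (1..32) saw activity. */
static bool bm_quiet_for(const struct bm_state *s, unsigned window_ms)
{
	uint32_t mask = (window_ms >= 32) ? 0xffffffffu
					  : ((1u << window_ms) - 1u);

	return (s->bm_history & mask) == 0;
}
```

With such a helper, the decision to reprogram the timer and go for C3/C4 could be gated on bm_quiet_for() returning true, while C1/C2 sleeps keep the regular tick so the history continues to be sampled.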


Thomas


Re: GPL violation by CorAccess?

2005-04-20 Thread Steven Rostedt
On Wed, 2005-04-20 at 09:30 +0200, Bernd Petrovitsch wrote:

 
 As long as they do not statically link against LGPL (or GPL) code and as
 long as they do not link dynamically agaist GPL code. And there are
 probably more rules .
 

Actually, I believe that the LGPL allows for static linking as well. As
long as you only interact with the library through the defined API, it
is OK.

From the LGPL preamble:

 The precise terms and conditions for copying, distribution and
modification follow.  Pay close attention to the difference between a
"work based on the library" and a "work that uses the library".  The
former contains code derived from the library, whereas the latter must
be combined with the library in order to run.


Point number 5 of TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND
MODIFICATION:

 5. A program that contains no derivative of any portion of the
Library, but is designed to work with the Library by being compiled or
linked with it, is called a "work that uses the Library".  Such a
work, in isolation, is not a derivative work of the Library, and
therefore falls outside the scope of this License.


So, I would say that the LGPL _does_ allow static linking with non-GPL
work.


-- Steve




Re: [2.6 patch] drivers/net/sk98lin/: possible cleanups

2005-04-20 Thread Adrian Bunk
On Wed, Apr 20, 2005 at 09:39:28AM +0100, Christoph Hellwig wrote:
 On Wed, Apr 20, 2005 at 04:15:26AM +0200, Adrian Bunk wrote:
  This patch contains the following possible cleanups:
  - make needlessly global functions static
  - remove unused code
 
 Not sure it's worth doing much on this, as the driver is beeing
 obsoleted by the skge driver.

I know, but as long as it's in the kernel I'm sending such patches.

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed



Re: GPL violation by CorAccess?

2005-04-20 Thread Arjan van de Ven
On Wed, 2005-04-20 at 08:49 -0400, Steven Rostedt wrote:
 On Wed, 2005-04-20 at 09:30 +0200, Bernd Petrovitsch wrote:
 
  
  As long as they do not statically link against LGPL (or GPL) code and as
  long as they do not link dynamically agaist GPL code. And there are
  probably more rules .
  
 
 Actually, I believe that the LGPL allows for static linking as well.

It does, as long as you provide the .o files of your own stuff so that
the end user can relink with, say, a bugfixed version of the library.




Re: GPL violation by CorAccess?

2005-04-20 Thread Steven Rostedt
On Wed, 2005-04-20 at 14:57 +0200, Arjan van de Ven wrote:
 On Wed, 2005-04-20 at 08:49 -0400, Steven Rostedt wrote:
  On Wed, 2005-04-20 at 09:30 +0200, Bernd Petrovitsch wrote:
  
   
   As long as they do not statically link against LGPL (or GPL) code and as
   long as they do not link dynamically agaist GPL code. And there are
   probably more rules .
   
  
  Actually, I believe that the LGPL allows for static linking as well.
 
 it does, as long as you provide the .o files of your own stuff so that
 the end user can relink with  say a bugfixed version of library.

I don't see that in the license.  As point 5 showed: "Such a
work, in isolation, is not a derivative work of the Library, and
therefore falls outside the scope of this License." So you don't need to
do anything more than supply the source of the LGPL work. In fact, it
may not be a good idea to add a bugfixed version of the library without
going through the vendor. You don't know if the application that uses
it depends on the side effects of the bug.

-- Steve




Re: [PATCH x86_64] Live Patching Function on 2.6.11.7

2005-04-20 Thread Ralf Baechle
On Mon, Apr 18, 2005 at 02:25:06AM -0700, Chris Wedgwood wrote:

  The call switching folks have been doing live patching at least
  since I worked on it, over 25 years ago.  This is not just
  marketing.
 
 That still doesn't explain *why* live patching is needed.

The more optimization a modern compiler does the less practical a patching
approach seems for anything but very trivial fixes.

I'd try a shared library based approach for on the fly updates.

  Ralf


[Patch] Staircase cpu scheduler v11

2005-04-20 Thread Con Kolivas
The staircase single priority array foreground/background cpu scheduler has 
been updated to version 11.

Numerous minor behavioural issues have been abolished with a much cleaner 
simple mathematical priority elevation/dropping mechanism and virtually no 
interactivity estimation algorithms exist in the design. Behaviour across 
all loads appears to have been improved with this.

Worst case latencies have been much improved with in-kernel work on behalf of 
user processes being debited from the user processes.

Lots of micro-optimisations were added and throughput has improved.

All forms of gaming and audio issues appear to have been abolished.

A rolled up patch for 2.6.11 is here:
http://ck.kolivas.org/patches/2.6/2.6.11/2.6.11_to_staircase11.diff

and an incremental from 2.6.11-ck4 to staircase 11 is here:
http://ck.kolivas.org/patches/2.6/2.6.11/2.6.11-ck4_to_staircase11.diff

The next -ck release will use this as its base.

Thanks to all the people testing this work and giving feedback.

Cheers,
Con




Re: GPL violation by CorAccess?

2005-04-20 Thread Michael Poole
Steven Rostedt writes:

 On Wed, 2005-04-20 at 14:57 +0200, Arjan van de Ven wrote:
 On Wed, 2005-04-20 at 08:49 -0400, Steven Rostedt wrote:
  On Wed, 2005-04-20 at 09:30 +0200, Bernd Petrovitsch wrote:
  
   
   As long as they do not statically link against LGPL (or GPL) code and as
   long as they do not link dynamically agaist GPL code. And there are
   probably more rules .
   
  
  Actually, I believe that the LGPL allows for static linking as well.
 
 it does, as long as you provide the .o files of your own stuff so that
 the end user can relink with  say a bugfixed version of library.

 I don't see that in the license.  As point 5 showed: Such a
 work, in isolation, is not a derivative work of the Library, and
 therefore falls outside the scope of this License.

"Such a work" refers to "A program that contains no derivative of any
portion of the library."  A program that is statically linked against
the library clearly contains part or all of the library, and cannot
qualify for the lower threshold of section 5.  Section 5 is talking
about late binding to the library; dynamic linking is one example.

For programs distributed as object code that does contain part of the
library, the distributor must -- sooner or later -- comply with 6(a)
(allow the user to relink) or 6(b) (use dynamic linking).

Michael Poole


Re: GPL violation by CorAccess?

2005-04-20 Thread Arjan van de Ven
On Wed, 2005-04-20 at 09:07 -0400, Steven Rostedt wrote:
 On Wed, 2005-04-20 at 14:57 +0200, Arjan van de Ven wrote:
  On Wed, 2005-04-20 at 08:49 -0400, Steven Rostedt wrote:
   On Wed, 2005-04-20 at 09:30 +0200, Bernd Petrovitsch wrote:
   

    As long as they do not statically link against LGPL (or GPL) code and as
    long as they do not link dynamically against GPL code. And there are
    probably more rules.

   
   Actually, I believe that the LGPL allows for static linking as well.
  
  it does, as long as you provide the .o files of your own stuff so that
  the end user can relink with, say, a bugfixed version of the library.
 
 I don't see that in the license.  As point 5 showed: Such a
 work, in isolation, is not a derivative work of the Library, and

you missed the point of "in isolation". If you do NOT link against the lib,
i.e. your app is in isolation, you don't have to care about the LGPL. That is
what it says. The moment you do link, you are no longer "in isolation".




Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-20 Thread Jakob Oestergaard
On Tue, Apr 19, 2005 at 06:46:28PM -0400, Trond Myklebust wrote:
 On Tue, 19.04.2005 at 21:45 (+0200), Jakob Oestergaard wrote:
 
  It mounts a home directory from a 2.6.6 NFS server - the client and
  server are on a hub'ed 100Mbit network.
  
  On the earlier 2.6 client I/O performance was as one would expect on
  hub'ed 100Mbit - meaning, not exactly stellar, but you'd get around 4-5
  MB/sec and decent interactivity.
 
 OK, hold it right there...
 
...
 Also, does that hub support NICs that do autonegotiation? (I'll bet the
 answer is no).

*blush*

Ok Trond, you got me there - I don't know why upgrading the client made
the problem much more visible though, but the *server* had negotiated
full duplex rather than half (the client negotiated half ok). Fixing
that on the server side made the client pleasant to work with again.
Mom's a happy camper again now  ;)

Sorry for jumping the gun there...

To get back to the original problem;

I wonder if (as was discussed) the tg3 driver on my NFS server is
dropping packets, causing the 2.6.11 NFS client to misbehave... This
didn't make sense to me before (since earlier clients worked well), but
having just seen this other case where a broken server setup caused
2.6.11 clients to misbehave (where earlier clients were fine), maybe it
could be an explanation...

Will try either changing tg3 driver or putting in an e1000 on my NFS
server - I will let you know about the status on this when I know more.

Thanks all,

-- 

 / jakob



Re: [RFC][PATCH] nameing reserved pages [0/3]

2005-04-20 Thread Kamezawa Hiroyuki
Arjan van de Ven wrote:

  For example, Memory Hotplug can ignore (a).

 Memory Hotplug can also use page_is_ram().

Yes, we can use page_is_ram() for finding (a) memory holes.
But I'd like to catch other removable PG_reserved pages, like (d) isolated
by MCA, (e) used by perfmon, and
some of (b) used by the kernel and (c) set by drivers.
What I'm thinking of is detecting whether memory is hot-removable or not
before actually removing it.

 /dev/memstate really looks like a bad idea to me as well... I rather
 have less than more /dev/*mem*

For showing page usage and its location, I've thought of other
interfaces: sysfs, procfs...
But I have no idea.
The physical memory area is a vast space and I want to use lseek() or
ioctl(). (I don't like ioctl().)
Do you have any recommendation?

-- Kame


Re: [RFC][PATCH] nameing reserved pages [0/3]

2005-04-20 Thread Arjan van de Ven
On Wed, 2005-04-20 at 23:15 +0900, Kamezawa Hiroyuki wrote:
 Arjan van de Ven wrote:
 
 For example, Memory Hotplug can ignore (a).
 
 
 
 Memory Hotplug can also use page_is_ram().
   
 
 Yes. we can use page_is_ram() for finding (a) memory hole.
 But I'd like to catch other removable PG_reserved pages like (d) isolated
 by MCA, (e) used by perfmon and
 some of (b) used by kernel and (c) set by drivers.
 What I'm thinking of is to detect whether memory is hot-removable or not
 before removing actually.

MCA's probably shouldn't set PG_reserved; I don't see why they should.
They could just steal the page and leak it.

 
 /dev/memstate really looks like a bad idea to me as well... I rather
 have less than more /dev/*mem*
   
 
 For showing page usage and its location, I've thought of other 
 interface, sysfs, procfs...
 But I have no idea.

Why do you want this exported to userspace? There is absolutely no way
you can get this exported race free without shutting the VM down, and
without being race free this information has absolutely no meaning !!
(and when you shut the VM down you really shouldn't depend on userspace
anymore either)





Re: [RFC][PATCH] nameing reserved pages [0/3]

2005-04-20 Thread Kamezawa Hiroyuki
Arjan van de Ven wrote:
 On Wed, 2005-04-20 at 23:15 +0900, Kamezawa Hiroyuki wrote:

 MCA's probably shouldn't set PG_reserved; I don't see why they should.
 They could just steal the page and leak it.

Actually, leaked pages cannot be hot-removed/replaced. So we have to
track which pages were removed by MCA.
I think setting PG_reserved and setting page->private = Removed_by_MCA is a
simple idea.

   /dev/memstate really looks like a bad idea to me as well... I rather
   have less than more /dev/*mem*

  For showing page usage and its location, I've thought of other
  interface, sysfs, procfs...
  But I have no idea.

 Why do you want this exported to userspace? There is absolutely no way
 you can get this exported race free without shutting the VM down, and
 without being race free this information has absolutely no meaning !!

No meaning?
Before memory hot-remove, we can guess whether memory is hot-removable
or not.
As you say, this is not atomic and not fully reliable.

After a failed memory hot-remove, detecting why the hot-remove failed is
very important.
I think that when memory hot-remove fails, the memory area is isolated until it
is pushed back by an operator.
We can then get a real snapshot of the specified memory area.

Regards,
-- Kame


Re: [PATCH x86_64] Live Patching Function on 2.6.11.7

2005-04-20 Thread Chris Friesen
Rik van Riel wrote:
 On Wed, 20 Apr 2005, Takashi Ikebe wrote:

  Well, as many said, live patching is a very historical and authoritative
  function, especially for carriers and telecom vendors.
  If Linux wants to be adopted in the mission-critical world, this function is
  essential.

 Yes, if you want to use Linux in those scenarios you will
 need to change the telco programs to use shared memory and
 file descriptor passing, instead of live patching.

Unfortunately we're also dealing (in many cases) with pre-existing
software coming over from other OS's.  The beancounters want to avoid
rewriting the millions of lines of application code, so they'd rather
add the missing support to the kernel.

If it doesn't go into mainline, we'll just end up with a bunch of 
different telco-patches being maintained on the side.  I highly doubt 
all the applications will get fixed any time soon.

Chris


NForce4 ide problems?

2005-04-20 Thread ismail dönmez
Hi all,

I recently bought an Asus A8N-SLI mobo and an AMD 3500+ CPU for my
system but my ide drive seems to have some problems with them. Here is
what I get at boot :

<snip>
hda: 156368016 sectors (80060 MB) w/2048KiB Cache, CHS=16383/255/63, UDMA(100)
hda: cache flushes supported
 /dev/ide/host0/bus0/target0/lun0:hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
ide: failed opcode was: unknown
hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
ide: failed opcode was: unknown
hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
ide: failed opcode was: unknown
</snip>

First I thought it was a bad IDE cable (because I wasn't using the one
that came with the mobo), so I tried the brand new cable that came with
the mobo and the same error happened. Also, trying to do something like:

hdparm -m16 -c -u1 -d1 -Xudma2 /dev/hda

results in a cpu exception thrown and a kernel panic after that. Full
dmesg log is attached. I appreciate any help/comments.

P.S: I tried with kernel 2.6.10 and 2.6.12-rc2 and same problems happen

Regards,
ismail


-- 
Time is what you make of it
Bootdata ok (command line is [EMAIL PROTECTED] ro root=307)
Linux version 2.6.12-rc2 ([EMAIL PROTECTED]) (gcc version 4.0.0 20050413 
(prerelease) (Debian 4.0-0pre11)) #3 Wed Apr 20 17:07:40 EEST 2005
BIOS-provided physical RAM map:
 BIOS-e820:  - 0009f800 (usable)
 BIOS-e820: 0009f800 - 000a (reserved)
 BIOS-e820: 000f - 0010 (reserved)
 BIOS-e820: 0010 - 3fff (usable)
 BIOS-e820: 3fff - 3fff3000 (ACPI NVS)
 BIOS-e820: 3fff3000 - 4000 (ACPI data)
 BIOS-e820: e000 - f000 (reserved)
 BIOS-e820: fec0 - fec01000 (reserved)
 BIOS-e820: fee0 - fef0 (reserved)
 BIOS-e820: fefffc00 - ff00 (reserved)
 BIOS-e820:  - 0001 (reserved)
ACPI: RSDP (v000 Nvidia) @ 0x000f74b0
ACPI: RSDT (v001 Nvidia AWRDACPI 0x42302e31 AWRD 0x) @ 
0x3fff3000
ACPI: FADT (v001 Nvidia AWRDACPI 0x42302e31 AWRD 0x) @ 
0x3fff3040
ACPI: MCFG (v001 Nvidia AWRDACPI 0x42302e31 AWRD 0x) @ 
0x3fff9280
ACPI: MADT (v001 Nvidia AWRDACPI 0x42302e31 AWRD 0x) @ 
0x3fff9200
ACPI: DSDT (v001 NVIDIA AWRDACPI 0x1000 MSFT 0x010e) @ 
0x
On node 0 totalpages: 262128
  DMA zone: 4096 pages, LIFO batch:1
  Normal zone: 258032 pages, LIFO batch:16
  HighMem zone: 0 pages, LIFO batch:1
Nvidia board detected. Ignoring ACPI timer override.
ACPI: Local APIC address 0xfee0
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Processor #0 15:7 APIC version 16
ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1])
ACPI: IOAPIC (id[0x02] address[0xfec0] gsi_base[0])
IOAPIC[0]: apic_id 2, version 17, address 0xfec0, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: BIOS IRQ0 pin2 override ignored.
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
ACPI: INT_SRC_OVR (bus 0 bus_irq 14 global_irq 14 high edge)
ACPI: INT_SRC_OVR (bus 0 bus_irq 15 global_irq 15 high edge)
ACPI: IRQ9 used by override.
ACPI: IRQ14 used by override.
ACPI: IRQ15 used by override.
Setting APIC routing to flat
Using ACPI (MADT) for SMP configuration information
Built 1 zonelists
Kernel command line: [EMAIL PROTECTED] ro root=307 console=tty0
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 131072 bytes)
time.c: Using 1.193182 MHz PIT timer.
time.c: Detected 2211.359 MHz processor.
Console: colour dummy device 80x25
Dentry cache hash table entries: 262144 (order: 9, 2097152 bytes)
Inode-cache hash table entries: 131072 (order: 8, 1048576 bytes)
Memory: 1026920k/1048512k available (1908k kernel code, 20892k reserved, 955k 
data, 132k init)
Calibrating delay loop... 4374.52 BogoMIPS (lpj=2187264)
Mount-cache hash table entries: 256
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 512K (64 bytes/line)
CPU: AMD Athlon(tm) 64 Processor 3500+ stepping 0a
Using local APIC NMI watchdog using perfctr0
Using local APIC timer interrupts.
Detected 12.564 MHz APIC timer.
NET: Registered protocol family 16
PCI: Using configuration type 1
mtrr: v2.0 (20020519)
ACPI: Subsystem revision 20050309
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (:00)
PCI: Probing PCI hardware (bus 00)
PCI: Transparent bridge - :00:09.0
Boot video device is :01:00.0
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.HUB0._PRT]
ACPI: PCI Interrupt Link [LNK1] (IRQs 3 4 5 7 9 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LNK2] (IRQs 3 4 5 7 9 10 11 12 14 

Re: [PATCH x86_64] Live Patching Function on 2.6.11.7

2005-04-20 Thread Chris Friesen
Ralf Baechle wrote:

 I'd try a shared library based approach for on the fly updates.

The version that I've seen imposed requirements on the application for
this to work properly.

There are tradeoffs either way.

Chris


writev to scsi disks

2005-04-20 Thread Dheeraj Pandey
I was wondering if I did a simple writev to a SCSI disk, does it take
the sg path to the device? I am guessing sg (REQ_SPECIAL) is only
true for character devices (and ioctl's) and not block devices.

These are my questions:
  - Is sg a common feature among SCSI disks these days? How do I know
what disks support this feature (any capabilities published by the driver)?
  - How does one make writev work for SCSI disk (as a block device)
in direct_io?
  - If I use SCSI disks as character device, can I simply use writev on the
character device file, and will sg codepath be taken?
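
For reference, the call being asked about can be sketched as below. This is only a minimal userspace illustration of writev() gathering two buffers into one write; the helper name gather_write() and the use of an ordinary file path are assumptions made for the example, not anything from the question. For a block device opened with O_DIRECT, the buffers and lengths would additionally need suitable alignment, which this sketch does not attempt.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical helper: write two strings to `path` with a single
 * writev() call. Returns the number of bytes written, or -1 on error.
 */
static ssize_t gather_write(const char *path, const char *a, const char *b)
{
	struct iovec iov[2];
	ssize_t n;
	int fd;

	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return -1;

	/* Two separate buffers, submitted as one gathered write. */
	iov[0].iov_base = (void *)a;
	iov[0].iov_len = strlen(a);
	iov[1].iov_base = (void *)b;
	iov[1].iov_len = strlen(b);

	n = writev(fd, iov, 2);
	close(fd);
	return n;
}
```

On a regular file, gather_write(path, "hello ", "world\n") should return 12. Whether such a request ever reaches the sg driver is a separate question that this sketch does not answer.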


Need material on Ip Stack development

2005-04-20 Thread Shan Vernakar
Hello All, 

I am a student. I need some information on developing the IP stack in
Linux, and I would be grateful if you could help me out with this. If you
have any material or links, please send them to me.

Thanking you
Shan


[no subject]

2005-04-20 Thread onemzolfqns


Re: [PATCH] remove some usesless casts

2005-04-20 Thread Phillip Lougher
Jörn Engel wrote:
 Squashfs is extremely cast-happy.  This patch makes it less so.

 Jörn

Hi,

Thanks for the patch.  Unnecessary casts were one of the things 
mentioned when I submitted the patches to the LKML, and therefore I 
suspect most of them have been already fixed (but I will apply your 
patch to check).

I will send revised patches to the LKML soon, most of the issues raised 
by the comments have been fixed, the current delay is being caused by 
the 4GB limit re-work.

Regards
Phillip


Re: [2.6 patch] drivers/ieee1394/: remove unneeded EXPORT_SYMBOL's

2005-04-20 Thread Stefan Richter
Arjan van de Ven wrote:
 On Wed, 2005-04-20 at 00:00 +0200, Stefan Richter wrote:
  There are users (though not in the kernel at the moment)
 nor for the last 5 months... how long will it be ?

Have there been problems with the API during the past 5 months, except 
that several kernel trees are using some parts of the API? (We are 
actually speaking about two APIs of the ieee1394 framework.)

Which problems are solved by this patch? Do they outweigh the problems 
it creates? The latter have been discussed. Dismissing them as Other 
People's Problems does not nullify them.

Where is the agreed-upon, published plan for removal of features in 
ieee1394?
--
Stefan Richter
-=-=-=-= -=-- =-=--
http://arcgraph.de/sr/


Re: [2.6 patch] drivers/ieee1394/: remove unneeded EXPORT_SYMBOL's

2005-04-20 Thread Arjan van de Ven
On Wed, 2005-04-20 at 18:31 +0200, Stefan Richter wrote:
 Arjan van de Ven wrote:
  On Wed, 2005-04-20 at 00:00 +0200, Stefan Richter wrote:
 There are users (though not in the kernel at the moment)
  nor for the last 5 months... how long will it be ?
 
 Have there been problems with the API during the past 5 months, except 
 that several kernel trees are using some parts of the API? (We are 
 actually speaking about two APIs of the ieee1394 framework.)
 
 Which problems are solved by this patch? 

exports and the functions beneath them cause the kernel binary to be bigger
than needed. If nothing is using an api... is it really the right one?




[2.6 patch] drivers/net/hamradio/baycom_epp.c: cleanups

2005-04-20 Thread Adrian Bunk
The times when tricky goto's produced better code are long gone.

This patch should express the same in a better way, please check whether 
I made any mistake.

Signed-off-by: Adrian Bunk [EMAIL PROTECTED]

---

 drivers/net/hamradio/baycom_epp.c |  126 --
 1 files changed, 36 insertions(+), 90 deletions(-)

--- linux-2.6.12-rc2-mm3/drivers/net/hamradio/baycom_epp.c.old  2005-04-20 16:18:47.0 +0200
+++ linux-2.6.12-rc2-mm3/drivers/net/hamradio/baycom_epp.c  2005-04-20 17:14:36.0 +0200
@@ -374,29 +374,6 @@
 }
 
 /* - */
-/*
- * high performance HDLC encoder
- * yes, it's ugly, but generates pretty good code
- */
-
-#define ENCODEITERA(j) \
-({ \
-if (!(notbitstream & (0x1f0 << j)))\
-goto stuff##j; \
-  encodeend##j:;  \
-})
-
-#define ENCODEITERB(j)  \
-({  \
-  stuff##j: \
-bitstream &= ~(0x100 << j); \
-bitbuf = (bitbuf & (((2 << j) << numbit) - 1)) |\
-((bitbuf & ~(((2 << j) << numbit) - 1)) << 1);  \
-numbit++;   \
-notbitstream = ~bitstream;  \
-goto encodeend##j;  \
-})
-
 
 static void encode_hdlc(struct baycom_state *bc)
 {
@@ -405,6 +382,7 @@
int pkt_len;
 unsigned bitstream, notbitstream, bitbuf, numbit, crc;
unsigned char crcarr[2];
+   int j;

if (bc->hdlctx.bufcnt > 0)
return;
@@ -429,24 +407,14 @@
pkt_len--;
if (!pkt_len)
bp = crcarr;
-   ENCODEITERA(0);
-   ENCODEITERA(1);
-   ENCODEITERA(2);
-   ENCODEITERA(3);
-   ENCODEITERA(4);
-   ENCODEITERA(5);
-   ENCODEITERA(6);
-   ENCODEITERA(7);
-   goto enditer;
-   ENCODEITERB(0);
-   ENCODEITERB(1);
-   ENCODEITERB(2);
-   ENCODEITERB(3);
-   ENCODEITERB(4);
-   ENCODEITERB(5);
-   ENCODEITERB(6);
-   ENCODEITERB(7);
-   enditer:
+   for (j = 0; j < 8; j++)
+   if (unlikely(!(notbitstream & (0x1f0 << j)))) {
+   bitstream &= ~(0x100 << j);
+   bitbuf = (bitbuf & (((2 << j) << numbit) - 1)) |
+   ((bitbuf & ~(((2 << j) << numbit) - 1)) << 1);
+   numbit++;
+   notbitstream = ~bitstream;
+   }
numbit += 8;
while (numbit >= 8) {
*wp++ = bitbuf;
@@ -612,37 +580,6 @@
bc->stats.rx_packets++;
 }
 
-#define DECODEITERA(j)\
-({\
-if (!(notbitstream & (0x0fc << j)))  /* flag or abort */  \
-goto flgabrt##j;  \
-if ((bitstream & (0x1f8 << j)) == (0xf8 << j))   /* stuffed bit */\
-goto stuff##j;\
-  enditer##j:  ;   \
-})
-
-#define DECODEITERB(j) \
-({ \
-  flgabrt##j:  \
-if (!(notbitstream & (0x1fc << j))) {  /* abort received */\
-state = 0; \
-goto enditer##j;   \
-}  \
-if ((bitstream & (0x1fe << j)) != (0x0fc << j))   /* flag received */  \
-goto enditer##j;   \
-if (state) \
-do_rxpacket(dev);  \
-bc->hdlcrx.bufcnt = 0; \
-bc->hdlcrx.bufptr = bc->hdlcrx.buf;\
-state = 1; \
-numbits = 
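
When checking the new loop against the old macros, it can help to compare with a plain bit-at-a-time version of HDLC bit stuffing. The sketch below is not the driver's code: hdlc_stuff_bits() is a hypothetical userspace helper that assumes LSB-first transmission and stores one output bit per byte, which is slow but easy to verify by hand.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical reference helper, not taken from baycom_epp.c.
 * HDLC bit stuffing, one bit at a time, LSB first (an assumption):
 * after five consecutive 1 bits, a 0 bit is inserted.
 * Output bits are stored one per byte in out[]; returns the bit count.
 * Stops early if out[] is full.
 */
static size_t hdlc_stuff_bits(const uint8_t *in, size_t len,
			      uint8_t *out, size_t outmax)
{
	size_t n = 0;
	unsigned int ones = 0;
	size_t i;
	int b;

	for (i = 0; i < len; i++) {
		for (b = 0; b < 8; b++) {
			uint8_t bit = (in[i] >> b) & 1;

			if (n >= outmax)
				return n;
			out[n++] = bit;
			ones = bit ? ones + 1 : 0;
			if (ones == 5) {
				if (n >= outmax)
					return n;
				out[n++] = 0;	/* stuffed bit */
				ones = 0;
			}
		}
	}
	return n;
}
```

For a single input byte 0xff this yields nine output bits, with the stuffed 0 in the sixth position.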

Re: enforcing DB immutability

2005-04-20 Thread Erik Mouw
On Wed, Apr 20, 2005 at 08:41:15AM -, [EMAIL PROTECTED] wrote:
 [A discussion on the git list about how to provide a hardlinked file
 that *cannot* be modified by an editor, but must be replaced by
 a new copy.]

Some time ago there was somebody working on copy-on-write links: once
you modify a cow-linked file, the file contents are copied, the file is
unlinked and you can safely work on the new file. It has some horrible
semantics in that the inode number of the opened file changes, I don't
know if applications are or should be aware of that.


Erik

-- 
+-- Erik Mouw -- www.harddisk-recovery.com -- +31 70 370 12 90 --
| Lab address: Delftechpark 26, 2628 XH, Delft, The Netherlands


[PATCH 2.6.12-rc2] aoe [1/6]: improve allowed interfaces configuration

2005-04-20 Thread Ed L Cashin

improve allowed interfaces configuration

Signed-off-by: Ed L. Cashin [EMAIL PROTECTED]

diff -uprN a/Documentation/aoe/aoe.txt b/Documentation/aoe/aoe.txt
--- a/Documentation/aoe/aoe.txt 2005-04-20 11:40:55.0 -0400
+++ b/Documentation/aoe/aoe.txt 2005-04-20 11:42:20.0 -0400
@@ -33,6 +33,9 @@ USING DEVICE NODES
   "cat /dev/etherd/err" blocks, waiting for error diagnostic output,
   like any retransmitted packets.
 
+  The /dev/etherd/interfaces special file is obsoleted by the
+  aoe_iflist boot option and module option (and its sysfs entry
+  described in the next section).
   "echo eth2 eth4 > /dev/etherd/interfaces" tells the aoe driver to
   limit ATA over Ethernet traffic to eth2 and eth4.  AoE traffic from
   untrusted networks should be ignored as a matter of security.
@@ -89,3 +92,24 @@ USING SYSFS
   e4.7eth1  up
   e4.8eth1  up
   e4.9eth1  up
+
+  When the aoe driver is a module, use
+  /sys/module/aoe/parameters/aoe_iflist instead of
+  /dev/etherd/interfaces to limit AoE traffic to the network
+  interfaces in the given whitespace-separated list.  Unlike the old
+  character device, the sysfs entry can be read from as well as
+  written to.
+
+  It's helpful to trigger discovery after setting the list of allowed
+  interfaces.  If your distro provides an aoe-discover script, you can
+  use that.  Otherwise, you can directly use the /dev/etherd/discover
+  file described above.
+
+DRIVER OPTIONS
+
+  There is a boot option for the built-in aoe driver and a
+  corresponding module parameter, aoe_iflist.  Without this option,
+  all network interfaces may be used for ATA over Ethernet.  Here is a
+  usage example for the module parameter.
+
+modprobe aoe_iflist="eth1 eth3"
diff -uprN a/drivers/block/aoe/aoenet.c b/drivers/block/aoe/aoenet.c
--- a/drivers/block/aoe/aoenet.c2005-04-20 11:41:18.0 -0400
+++ b/drivers/block/aoe/aoenet.c2005-04-20 11:42:20.0 -0400
@@ -7,6 +7,7 @@
 #include <linux/hdreg.h>
 #include <linux/blkdev.h>
 #include <linux/netdevice.h>
+#include <linux/moduleparam.h>
 #include "aoe.h"
 
 #define NECODES 5
@@ -26,6 +27,19 @@ enum {
 };
 
 static char aoe_iflist[IFLISTSZ];
+module_param_string(aoe_iflist, aoe_iflist, IFLISTSZ, 0600);
+MODULE_PARM_DESC(aoe_iflist, " aoe_iflist=\"dev1 [dev2 ...]\n");
+
+#ifndef MODULE
+static int __init aoe_iflist_setup(char *str)
+{
+   strncpy(aoe_iflist, str, IFLISTSZ);
+   aoe_iflist[IFLISTSZ - 1] = '\0';
+   return 1;
+}
+
+__setup("aoe_iflist=", aoe_iflist_setup);
+#endif
 
 int
 is_aoe_netif(struct net_device *ifp)
@@ -36,7 +50,8 @@ is_aoe_netif(struct net_device *ifp)
if (aoe_iflist[0] == '\0')
return 1;
 
-   for (p = aoe_iflist; *p; p = q + strspn(q, WHITESPACE)) {
+   p = aoe_iflist + strspn(aoe_iflist, WHITESPACE);
+   for (; *p; p = q + strspn(q, WHITESPACE)) {
q = p + strcspn(p, WHITESPACE);
if (q != p)
len = q - p;


-- 
  Ed L. Cashin [EMAIL PROTECTED]
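
The token walk in is_aoe_netif() can be tried out in userspace with a sketch like the following. The function name iflist_allows() and the WHITESPACE definition are assumptions for illustration (the driver's own WHITESPACE definition is not shown in this patch); only the strspn()/strcspn() scanning mirrors the patch.

```c
#include <string.h>

/* Assumed separator set; the driver's real definition is not shown here. */
#define WHITESPACE " \t\v\f\n"

/*
 * Hypothetical userspace helper: returns 1 if `name` appears in the
 * whitespace-separated `list`, or if `list` is empty (no restriction).
 */
static int iflist_allows(const char *list, const char *name)
{
	const char *p, *q;
	size_t len;

	if (list[0] == '\0')
		return 1;	/* empty list: every interface is allowed */

	/* Skip leading whitespace first, as the patch does. */
	p = list + strspn(list, WHITESPACE);
	for (; *p; p = q + strspn(q, WHITESPACE)) {
		q = p + strcspn(p, WHITESPACE);	/* end of current token */
		len = (size_t)(q - p);
		if (strlen(name) == len && strncmp(name, p, len) == 0)
			return 1;
	}
	return 0;
}
```

With the list "  eth1\teth3\n" both eth1 and eth3 match, while eth2 and eth10 do not; an empty list allows every interface.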



Re: [2.6 patch] drivers/ieee1394/: remove unneeded EXPORT_SYMBOL's

2005-04-20 Thread Stefan Richter
Arjan van de Ven wrote:
 If nothing is using an api

Check the archive.
--
Stefan Richter
-=-=-=-= -=-- =-=--
http://arcgraph.de/sr/


[PATCH 2.6.12-rc2] aoe [2/6]: aoe-stat should work for built-in as well as module

2005-04-20 Thread Ed L Cashin

aoe-stat should work for built-in as well as module

Signed-off-by: Ed L. Cashin [EMAIL PROTECTED]

diff -uprN a/Documentation/aoe/status.sh b/Documentation/aoe/status.sh
--- a/Documentation/aoe/status.sh   2005-04-20 11:40:55.0 -0400
+++ b/Documentation/aoe/status.sh   2005-04-20 11:42:20.0 -0400
@@ -14,10 +14,6 @@ test ! -d "$sysd/block" && {
echo "$me Error: sysfs is not mounted" 1>&2
exit 1
 }
-test -z "`lsmod | grep '^aoe'`" && {
-   echo  "$me Error: aoe module is not loaded" 1>&2
-   exit 1
-}
 
 for d in `ls -d $sysd/block/etherd* 2>/dev/null | grep -v p` end; do
# maybe ls comes up empty, so we use "end"


-- 
  Ed L. Cashin [EMAIL PROTECTED]



[PATCH 2.6.12-rc2] aoe [3/6]: update the documentation to mention aoetools

2005-04-20 Thread Ed L Cashin

update the documentation to mention aoetools

Signed-off-by: Ed L. Cashin [EMAIL PROTECTED]

diff -uprN a/Documentation/aoe/aoe.txt b/Documentation/aoe/aoe.txt
--- a/Documentation/aoe/aoe.txt 2005-04-20 11:42:20.0 -0400
+++ b/Documentation/aoe/aoe.txt 2005-04-20 11:42:21.0 -0400
@@ -4,6 +4,16 @@ The EtherDrive (R) HOWTO for users of 2.
 
   It has many tips and hints!
 
+The aoetools are userland programs that are designed to work with this
+driver.  The aoetools are on sourceforge.
+
+  http://aoetools.sourceforge.net/
+
+The scripts in this Documentation/aoe directory are intended to
+document the use of the driver and are not necessary if you install
+the aoetools.
+
+
 CREATING DEVICE NODES
 
   Users of udev should find the block device nodes created
@@ -33,19 +43,17 @@ USING DEVICE NODES
   "cat /dev/etherd/err" blocks, waiting for error diagnostic output,
   like any retransmitted packets.
 
-  The /dev/etherd/interfaces special file is obsoleted by the
-  aoe_iflist boot option and module option (and its sysfs entry
-  described in the next section).
   "echo eth2 eth4 > /dev/etherd/interfaces" tells the aoe driver to
   limit ATA over Ethernet traffic to eth2 and eth4.  AoE traffic from
-  untrusted networks should be ignored as a matter of security.
+  untrusted networks should be ignored as a matter of security.  See
+  also the aoe_iflist driver option described below.
 
   "echo > /dev/etherd/discover" tells the driver to find out what AoE
   devices are available.
 
   These character devices may disappear and be replaced by sysfs
-  counterparts, so distribution maintainers are encouraged to create
-  scripts that use these devices.
+  counterparts.  Using the commands in aoetools insulates users from
+  these implementation details.
 
   The block devices are named like this:
 
@@ -69,7 +77,8 @@ USING SYSFS
   through which we are communicating with the remote AoE device.
 
   There is a script in this directory that formats this information
-  in a convenient way.
+  in a convenient way.  Users with aoetools can use the aoe-stat
+  command.
 
   [EMAIL PROTECTED] root# sh Documentation/aoe/status.sh 
  e10.0eth3  up
@@ -101,9 +110,9 @@ USING SYSFS
   written to.
 
   It's helpful to trigger discovery after setting the list of allowed
-  interfaces.  If your distro provides an aoe-discover script, you can
-  use that.  Otherwise, you can directly use the /dev/etherd/discover
-  file described above.
+  interfaces.  The aoetools package provides an aoe-discover script
+  for this purpose.  You can also directly use the
+  /dev/etherd/discover special file described above.
 
 DRIVER OPTIONS
 


-- 
  Ed L. Cashin [EMAIL PROTECTED]



[PATCH 2.6.12-rc2] aoe [5/6]: add firmware version to info in sysfs

2005-04-20 Thread Ed L Cashin

add firmware version to info in sysfs

Signed-off-by: Ed L. Cashin [EMAIL PROTECTED]

diff -uprN a/drivers/block/aoe/aoeblk.c b/drivers/block/aoe/aoeblk.c
--- a/drivers/block/aoe/aoeblk.c2005-04-20 11:41:18.0 -0400
+++ b/drivers/block/aoe/aoeblk.c2005-04-20 11:42:23.0 -0400
@@ -37,6 +37,13 @@ static ssize_t aoedisk_show_netif(struct
 
return snprintf(page, PAGE_SIZE, "%s\n", d->ifp->name);
 }
+/* firmware version */
+static ssize_t aoedisk_show_fwver(struct gendisk * disk, char *page)
+{
+   struct aoedev *d = disk->private_data;
+
+   return snprintf(page, PAGE_SIZE, "0x%04x\n", (unsigned int) d->fw_ver);
+}
 
 static struct disk_attribute disk_attr_state = {
.attr = {.name = "state", .mode = S_IRUGO },
@@ -50,6 +57,10 @@ static struct disk_attribute disk_attr_n
.attr = {.name = "netif", .mode = S_IRUGO },
.show = aoedisk_show_netif
 };
+static struct disk_attribute disk_attr_fwver = {
+   .attr = {.name = "fwver", .mode = S_IRUGO },
+   .show = aoedisk_show_fwver
+};
 
 static void
 aoedisk_add_sysfs(struct aoedev *d)
@@ -57,6 +68,7 @@ aoedisk_add_sysfs(struct aoedev *d)
sysfs_create_file(&d->gd->kobj, &disk_attr_state.attr);
sysfs_create_file(&d->gd->kobj, &disk_attr_mac.attr);
sysfs_create_file(&d->gd->kobj, &disk_attr_netif.attr);
+   sysfs_create_file(&d->gd->kobj, &disk_attr_fwver.attr);
 }
 void
 aoedisk_rm_sysfs(struct aoedev *d)
@@ -64,6 +76,7 @@ aoedisk_rm_sysfs(struct aoedev *d)
sysfs_remove_link(&d->gd->kobj, "state");
sysfs_remove_link(&d->gd->kobj, "mac");
sysfs_remove_link(&d->gd->kobj, "netif");
+   sysfs_remove_link(&d->gd->kobj, "fwver");
 }
 
 static int


-- 
  Ed L. Cashin [EMAIL PROTECTED]
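
The output format of the new fwver attribute can be checked in userspace with a small stand-in. Everything except the format string is hypothetical here: struct fake_aoedev, SHOW_BUF_SIZE and show_fwver() are made up for the example, and the 16-bit type of fw_ver is an assumption.

```c
#include <stdio.h>

#define SHOW_BUF_SIZE 4096	/* assumption: stand-in for PAGE_SIZE */

/* Hypothetical stand-in for the one field of struct aoedev used here. */
struct fake_aoedev {
	unsigned short fw_ver;	/* assumed to be a 16-bit value */
};

/* Formats the firmware version the same way the patch's show routine does. */
static int show_fwver(const struct fake_aoedev *d, char *page)
{
	return snprintf(page, SHOW_BUF_SIZE, "0x%04x\n",
			(unsigned int)d->fw_ver);
}
```

A firmware version of 0x4009 would be shown as the 7-character line "0x4009\n", and smaller values are zero-padded to four hex digits.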



[PATCH 2.6.12-rc2] aoe [6/6]: update version number to 10

2005-04-20 Thread Ed L Cashin

update version number to 10

Signed-off-by: Ed L. Cashin [EMAIL PROTECTED]

--- b/drivers/block/aoe/aoe.h   2005-04-20 11:42:19.0 -0400
+++ b/drivers/block/aoe/aoe.h   2005-04-20 11:42:22.0 -0400
@@ -1,5 +1,5 @@
 /* Copyright (c) 2004 Coraid, Inc.  See COPYING for GPL terms. */
-#define VERSION 6
+#define VERSION 10
 #define AOE_MAJOR 152
 #define DEVICE_NAME "aoe"
 


-- 
  Ed L. Cashin [EMAIL PROTECTED]



Re: GPL violation by CorAccess?

2005-04-20 Thread Pavel Machek
Hi!

> > I have seen a device by CorAccess which apparently uses Linux and didn't find
> > anything that would suggest it complies with the GPL, though I had access to the
> > complete shipping package. Does anyone know of a known GPL violation by
> > this company, or should I investigate further?
>
> Well what is the case if you use unmodified GPL code, do you still have
> to provide sources to the end user if you give them binaries?  I would
> guess yes, but IANAL.
>
> As far as I can tell their system is a geode GX1 so runs standard x86
> software.  Maybe they didn't have to modify any of the linux kernel to
> run what they needed.  Their applications are their business of course.
> It looks like they use QT as the gui toolkit, which I don't off hand
> know the current license conditions of.  Then there is the web browser
> and such, which has its own license conditions.  Of course for all I
> know their user manual has an offer of sending a CD with the sources if
> you ask.  Does anyone actually have their product that could check for
> that?

QT is GPLed, IIRC, not LGPLed, meaning you can't link it with a
proprietary application without a license from Trolltech.
								Pavel
-- 
Boycott Kodak -- for their patent abuse against Java.


RE: [discuss] [Patch] X86_64 TASK_SIZE cleanup

2005-04-20 Thread Zou, Nanhai
Hi Andi,
   What is your comment on this patch?
Here is another example of a bug that this patch will fix.
The following piece of code gets a successful mmap even when compiled
with -m32:
   
   int *p;
   p = mmap((void *)(0xE000UL), 0x1UL, PROT_READ|PROT_WRITE,
MAP_FIXED|MAP_PRIVATE|MAP_ANON, 0, 0);

I believe there are other kinds of corner-case bugs in mm and fs,
e.g. in mremap and munmap.
This patch will fix those bugs as well.
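
As a rough illustration of the kind of check at stake here, the sketch below tests whether a requested mapping fits below a task-size limit. The helper name range_fits, the COMPAT_TASK_SIZE constant, and its 4 GB value are assumptions made for this example; none of them is taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limit for a 32-bit (compat) task: 4 GB. Assumed for illustration. */
#define COMPAT_TASK_SIZE UINT64_C(0x100000000)

/*
 * Return true if [addr, addr + len) lies entirely below task_size.
 * Written so that addr + len is never computed and therefore cannot wrap.
 */
bool range_fits(uint64_t addr, uint64_t len, uint64_t task_size)
{
    if (len > task_size)
        return false;
    return addr <= task_size - len;
}
```

A mapping routine that applies such a check consistently could return ENOMEM for an out-of-range request instead of handing back an address that the task cannot use.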

Zou Nan hai
> -----Original Message-----
> From: Zou, Nanhai
> Sent: Tuesday, April 19, 2005 12:37 AM
> To: 'Andi Kleen'
> Cc: [EMAIL PROTECTED]; linux-kernel@vger.kernel.org; Siddha, Suresh B
> Subject: RE: [discuss] [Patch] X86_64 TASK_SIZE cleanup
>
>
> When a 32bit program is mapping a lot of hugepage vm_areas,
> hugetlb_get_unmapped_area may search beyond 4G, then the program will get a
> SIGFAULT instead of an errno of ENOMEM.
> This patch will fix that.
> I believe there are other inconsistent cases in generic code like mm and fs.
>
> Zou Nan hai
>
> > -----Original Message-----
> > From: Andi Kleen [mailto:[EMAIL PROTECTED]]
> > Sent: Monday, April 18, 2005 5:06 PM
> > To: Zou, Nanhai
> > Cc: [EMAIL PROTECTED]; Andi Kleen; linux-kernel@vger.kernel.org; Siddha, Suresh B
> > Subject: Re: [discuss] [Patch] X86_64 TASK_SIZE cleanup
> >
> > On Sat, Apr 16, 2005 at 09:34:25AM +0800, Zou, Nanhai wrote:
> > >
> > > Hi,
> > > This patch will clean up the X86_64 compatibility mode TASK_SIZE
> > > define thus fix some bugs found in X86_64 compatibility mode program.
> >
> > Fix what bugs exactly?  Please a detailed description.
> >
> > -Andi


[PATCH 2.6.12-rc2] aoe [4/6]: allow multiple aoe devices to have the same mac

2005-04-20 Thread Ed L Cashin

allow multiple aoe devices to have the same mac

Signed-off-by: Ed L. Cashin [EMAIL PROTECTED]

diff -u b/drivers/block/aoe/aoedev.c b/drivers/block/aoe/aoedev.c
--- b/drivers/block/aoe/aoedev.c    2005-04-20 11:42:18.0 -0400
+++ b/drivers/block/aoe/aoedev.c    2005-04-20 11:42:22.0 -0400
@@ -109,25 +109,22 @@
    spin_lock_irqsave(&devlist_lock, flags);
 
    for (d=devlist; d; d=d->next)
-       if (d->sysminor == sysminor
-       || memcmp(d->addr, addr, sizeof d->addr) == 0)
+       if (d->sysminor == sysminor)
            break;
 
    if (d == NULL && (d = aoedev_newdev(bufcnt)) == NULL) {
        spin_unlock_irqrestore(&devlist_lock, flags);
        printk(KERN_INFO "aoe: aoedev_set: aoedev_newdev failure.\n");
        return NULL;
-   }
+   } /* if newdev, (d->flags & DEVFL_UP) == 0 for below */
 
    spin_unlock_irqrestore(&devlist_lock, flags);
    spin_lock_irqsave(&d->lock, flags);
 
    d->ifp = ifp;
-
-   if (d->sysminor != sysminor
-   || (d->flags & DEVFL_UP) == 0) {
+   memcpy(d->addr, addr, sizeof d->addr);
+   if ((d->flags & DEVFL_UP) == 0) {
        aoedev_downdev(d); /* flushes outstanding frames */
-       memcpy(d->addr, addr, sizeof d->addr);
        d->sysminor = sysminor;
        d->aoemajor = AOEMAJOR(sysminor);
        d->aoeminor = AOEMINOR(sysminor);


-- 
  Ed L. Cashin [EMAIL PROTECTED]



Re: [RFC][PATCH] nameing reserved pages [0/3]

2005-04-20 Thread Dave Hansen
On Wed, 2005-04-20 at 16:30 +0200, Arjan van de Ven wrote:
> Why do you want this exported to userspace? There is absolutely no way
> you can get this exported race free without shutting the VM down, and
> without being race free this information has absolutely no meaning !!
> (and when you shut the VM down you really shouldn't depend on userspace
> anymore either)

The two cases where this is expected to be used are not concerned with
races.  The first is when a memory remove operation occurs.  It first
looks at the hotplug area, and removes all the pages that it can from
the allocator.  Then, it sets about migrating all of the other pages
that are being used for things like page cache or anonymous memory.

After that, the question sometimes remains why particular pages can't be
removed.  Kame's patch is an attempt to help figure that out.

That's one reason I suggested having an individual device file for each
of the memory areas that get added or removed.  It would keep the
confusion to a minimum, and you'd be more sure that what you were
looking at was information only for the memory area that is *almost*
removed.

I don't know what state the system is in when the kdump folks want to
read this information.

-- Dave



Re: [PATCH 2.6.12-rc2] aoe [1/6]: improve allowed interfaces configuration

2005-04-20 Thread Ed L Cashin
Randy.Dunlap [EMAIL PROTECTED] writes:

> On Wed, 20 Apr 2005 13:02:12 -0400 Ed L Cashin wrote:
>
> Just a nit/typo:
>
> | +modprobe aoe_iflist="eth1 eth3"
>
> |  static char aoe_iflist[IFLISTSZ];
> | +module_param_string(aoe_iflist, aoe_iflist, IFLISTSZ, 0600);
> | +MODULE_PARM_DESC(aoe_iflist, " aoe_iflist=\"dev1 [dev2 ...]\n");
>
> No leading space (" aoe_iflist=") and put a trailing \" in it:
>
>   +MODULE_PARM_DESC(aoe_iflist, "aoe_iflist=\"dev1 [dev2 ...]\"\n");

Thanks for catching that.


improve allowed interfaces configuration

Signed-off-by: Ed L. Cashin [EMAIL PROTECTED]

diff -uprN a/Documentation/aoe/aoe.txt b/Documentation/aoe/aoe.txt
--- a/Documentation/aoe/aoe.txt 2005-04-20 11:40:55.0 -0400
+++ b/Documentation/aoe/aoe.txt 2005-04-20 11:42:20.0 -0400
@@ -33,6 +33,9 @@ USING DEVICE NODES
   "cat /dev/etherd/err" blocks, waiting for error diagnostic output,
   like any retransmitted packets.
 
+  The /dev/etherd/interfaces special file is obsoleted by the
+  aoe_iflist boot option and module option (and its sysfs entry
+  described in the next section).
   "echo eth2 eth4 > /dev/etherd/interfaces" tells the aoe driver to
   limit ATA over Ethernet traffic to eth2 and eth4.  AoE traffic from
   untrusted networks should be ignored as a matter of security.
@@ -89,3 +92,24 @@ USING SYSFS
   e4.7            eth1              up
   e4.8            eth1              up
   e4.9            eth1              up
+
+  When the aoe driver is a module, use
+  /sys/module/aoe/parameters/aoe_iflist instead of
+  /dev/etherd/interfaces to limit AoE traffic to the network
+  interfaces in the given whitespace-separated list.  Unlike the old
+  character device, the sysfs entry can be read from as well as
+  written to.
+
+  It's helpful to trigger discovery after setting the list of allowed
+  interfaces.  If your distro provides an aoe-discover script, you can
+  use that.  Otherwise, you can directly use the /dev/etherd/discover
+  file described above.
+
+DRIVER OPTIONS
+
+  There is a boot option for the built-in aoe driver and a
+  corresponding module parameter, aoe_iflist.  Without this option,
+  all network interfaces may be used for ATA over Ethernet.  Here is a
+  usage example for the module parameter.
+
+modprobe aoe_iflist="eth1 eth3"
diff -uprN a/drivers/block/aoe/aoenet.c b/drivers/block/aoe/aoenet.c
--- a/drivers/block/aoe/aoenet.c    2005-04-20 11:41:18.0 -0400
+++ b/drivers/block/aoe/aoenet.c    2005-04-20 11:42:20.0 -0400
@@ -7,6 +7,7 @@
 #include <linux/hdreg.h>
 #include <linux/blkdev.h>
 #include <linux/netdevice.h>
+#include <linux/moduleparam.h>
 #include "aoe.h"
 
 #define NECODES 5
@@ -26,6 +27,19 @@ enum {
 };
 
 static char aoe_iflist[IFLISTSZ];
+module_param_string(aoe_iflist, aoe_iflist, IFLISTSZ, 0600);
+MODULE_PARM_DESC(aoe_iflist, "aoe_iflist=\"dev1 [dev2 ...]\"\n");
+
+#ifndef MODULE
+static int __init aoe_iflist_setup(char *str)
+{
+   strncpy(aoe_iflist, str, IFLISTSZ);
+   aoe_iflist[IFLISTSZ - 1] = '\0';
+   return 1;
+}
+
+__setup("aoe_iflist=", aoe_iflist_setup);
+#endif
 
 int
 is_aoe_netif(struct net_device *ifp)
@@ -36,7 +50,8 @@ is_aoe_netif(struct net_device *ifp)
    if (aoe_iflist[0] == '\0')
        return 1;
 
-   for (p = aoe_iflist; *p; p = q + strspn(q, WHITESPACE)) {
+   p = aoe_iflist + strspn(aoe_iflist, WHITESPACE);
+   for (; *p; p = q + strspn(q, WHITESPACE)) {
        q = p + strcspn(p, WHITESPACE);
        if (q != p)
            len = q - p;


-- 
  Ed L Cashin [EMAIL PROTECTED]
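
The list-walking loop in the is_aoe_netif() hunk above can be tried out in user space. The following is a minimal sketch, not the driver's code: the function name iflist_contains and the exact contents of WHITESPACE are assumptions made for the example, and only the strspn()/strcspn() scanning pattern comes from the patch.

```c
#include <string.h>

/* Assumed separator set; the driver's own WHITESPACE macro is not shown in the patch. */
#define WHITESPACE " \t\v\f\n"

/*
 * Return 1 if name appears as a whole token in the whitespace-separated
 * list, 0 otherwise. An empty list matches every name, mirroring the
 * "all interfaces allowed" default described in the documentation hunk.
 */
int iflist_contains(const char *list, const char *name)
{
    const char *p, *q;
    size_t len = strlen(name);

    if (list[0] == '\0')
        return 1;

    /* Skip leading whitespace so that the first token starts at p. */
    p = list + strspn(list, WHITESPACE);
    for (; *p; p = q + strspn(q, WHITESPACE)) {
        q = p + strcspn(p, WHITESPACE);   /* q = end of the current token */
        if ((size_t)(q - p) == len && strncmp(p, name, len) == 0)
            return 1;
    }
    return 0;
}
```

Comparing the token length before calling strncmp() keeps "eth1" from matching a longer entry such as "eth10", and skipping leading whitespace first means a list like "  eth1 eth3" is handled the same way as "eth1 eth3".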


Re: [PATCH 2.6.12-rc2] aoe [1/6]: improve allowed interfaces configuration

2005-04-20 Thread Randy.Dunlap
On Wed, 20 Apr 2005 13:02:12 -0400 Ed L Cashin wrote:

Just a nit/typo:

| +modprobe aoe_iflist="eth1 eth3"

|  static char aoe_iflist[IFLISTSZ];
| +module_param_string(aoe_iflist, aoe_iflist, IFLISTSZ, 0600);
| +MODULE_PARM_DESC(aoe_iflist, " aoe_iflist=\"dev1 [dev2 ...]\n");

No leading space (" aoe_iflist=") and put a trailing \" in it:

  +MODULE_PARM_DESC(aoe_iflist, "aoe_iflist=\"dev1 [dev2 ...]\"\n");


---
~Randy


Re: [RFC/PATCH] unregister_node() for hotplug use

2005-04-20 Thread Greg KH
On Wed, Apr 20, 2005 at 09:07:44PM +0900, Keiichiro Tokunaga wrote:
>   This is to add a generic function 'unregister_node()'.
> It is used to remove objects of a node going away for
> hotplug.  If CONFIG_HOTPLUG=y, it becomes available.
> This is against 2.6.12-rc2-mm3.

Please CC: this kind of stuff to the driver core maintainer, otherwise
it can get dropped...

Anyway, comments below:

> diff -puN drivers/base/node.c~numa_hp_base drivers/base/node.c
> --- linux-2.6.12-rc2-mm3/drivers/base/node.c~numa_hp_base 2005-04-14 20:49:37.0 +0900
> +++ linux-2.6.12-rc2-mm3-kei/drivers/base/node.c  2005-04-14 20:49:37.0 +0900
> @@ -136,7 +136,7 @@ static SYSDEV_ATTR(distance, S_IRUGO, no
>    *
>    * Initialize and register the node device.
>    */
> -int __init register_node(struct node *node, int num, struct node *parent)
> +int __devinit register_node(struct node *node, int num, struct node *parent)
>  {
>     int error;
>  
> @@ -145,6 +145,9 @@ int __init register_node(struct node *no
>     error = sysdev_register(&node->sysdev);
>  
>     if (!error){
> +       /*
> +        * If you add new object here, delete it when unregistering.
> +        */

Comment really isn't needed.

> +/*
> + * unregister_node - Remove objects of a node going away from sysfs.
> + * @node - node going away
> + *
> + * This is used only for hotplug.
> + */

If you are going to create function comments, at least use the proper
kerneldoc format.

> +#ifdef CONFIG_HOTPLUG

You don't provide function prototype for when CONFIG_HOTPLUG is not
enabled.

> +void unregister_node(struct node *node)
> +{
> +   if (node == NULL)
> +       return;

How can this happen?

> +
> +   sysdev_remove_file(&node->sysdev, &attr_cpumap);
> +   sysdev_remove_file(&node->sysdev, &attr_meminfo);
> +   sysdev_remove_file(&node->sysdev, &attr_numastat);
> +   sysdev_remove_file(&node->sysdev, &attr_distance);
> +
> +   sysdev_unregister(&node->sysdev);
> +}
> +EXPORT_SYMBOL(register_node);
> +EXPORT_SYMBOL(unregister_node);

All of sysfs and the driver core are EXPORT_SYMBOL_GPL().  Please follow
that convention.

> +#endif /* CONFIG_HOTPLUG */
>  
> -int __init register_node_type(void)
> +static int __init register_node_type(void)

Are you sure no one calls this?

>  {
>     return sysdev_class_register(&node_class);
>  }
> diff -puN include/linux/node.h~numa_hp_base include/linux/node.h
> --- linux-2.6.12-rc2-mm3/include/linux/node.h~numa_hp_base 2005-04-14 20:49:37.0 +0900
> +++ linux-2.6.12-rc2-mm3-kei/include/linux/node.h 2005-04-14 20:49:37.0 +0900
> @@ -21,12 +21,16 @@
>  
>  #include <linux/sysdev.h>
>  #include <linux/cpumask.h>
> +#include <linux/module.h>

Why?

  
>  struct node {
>     struct sys_device   sysdev;
>  };
>  
> -extern int register_node(struct node *, int, struct node *);
> +extern int __devinit register_node(struct node *, int, struct node *);

__devinit is not needed on a function prototype.

> +#ifdef CONFIG_HOTPLUG
> +extern void unregister_node(struct node *node);
> +#endif

Not needed for a function prototype.

thanks,

greg k-h
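
For readers unfamiliar with the kerneldoc comment format mentioned in the review above, here is a small stand-alone sketch. Everything in it is a stand-in chosen for illustration: struct node is a dummy type, CONFIG_HOTPLUG is defined by hand, and the function body only marks the node so the file compiles outside the kernel. The empty inline stub in the #else branch is one common kernel idiom for the "not enabled" case, shown here as an assumption rather than as what the review requires.

```c
/* Stand-in for the kernel's struct node, declared only so that the sketch compiles. */
struct node {
    int id;
};

#define CONFIG_HOTPLUG 1   /* assumed for the example; normally set by Kconfig */

#ifdef CONFIG_HOTPLUG
/**
 * unregister_node - remove a node's objects from sysfs
 * @node: node going away
 *
 * This is used only for hotplug.
 */
void unregister_node(struct node *node)
{
    /* The real function would remove the sysfs files and unregister the sysdev. */
    node->id = -1;
}
#else
/* Empty inline stub so that callers need no #ifdef when hotplug is disabled. */
static inline void unregister_node(struct node *node) { }
#endif
```

The points that distinguish kerneldoc from an ordinary comment are the opening "/**", the "name - short description" first line, and the "@param: description" lines that use a colon rather than a dash.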


Re: [PATCH 2.6.12-rc2] aoe [5/6]: add firmware version to info in sysfs

2005-04-20 Thread Randy.Dunlap

| add firmware version to info in sysfs
| 
| +static struct disk_attribute disk_attr_fwver = {
| + .attr = {.name = "fwver", .mode = S_IRUGO },
| + .show = aoedisk_show_fwver
| +};
| @@ -64,6 +76,7 @@ aoedisk_rm_sysfs(struct aoedev *d)
|   sysfs_remove_link(&d->gd->kobj, "state");
|   sysfs_remove_link(&d->gd->kobj, "mac");
|   sysfs_remove_link(&d->gd->kobj, "netif");
| + sysfs_remove_link(&d->gd->kobj, "fwver");


It's a good thing that you spelled out "firmware version"
for me.
Just seeing 'fwver' prompted these comments from others:


n vwls s bd  (well, it does have 'e'; maybe it shouldn't :)
friends fwver
fw is firewire


so something like 'firmware-version' would be appreciated
(for the sysfs filename).

Thanks,
---
~Randy


Re: Kernel page table and module text

2005-04-20 Thread Bodo Eggert [EMAIL PROTECTED]
Allison [EMAIL PROTECTED] wrote:

> I want to find where each module is loaded in memory by traversing the
> module list. Once I have the address and the size of the module, I
> want to read the bytes in memory of the module and hash them to check
> its integrity.

JFTR: This may work against random memory corruption, but it will fail to
detect attacks.
-- 
Top 100 things you don't want the sysadmin to say:
54. Uh huh..nu -k $USER.. no problemsure thing...



Serious performance degradation on a RAID with kernel 2.6.10-bk7 and later

2005-04-20 Thread Andreas Hirstius
Hi,
We have a rx4640 with 3x 3Ware 9500 SATA controllers and 24x WD740GD HDD 
in a software RAID0 configuration (using md).
With kernel 2.6.11 the read performance on the md is reduced by a factor 
of 20 (!!) compared to previous kernels.
The write rate to the md doesn't change!! (it actually improves a bit).

The configs for the kernels are basically identical.
Here is some vmstat output:
kernel 2.6.9: ~1GB/s read
procs                memory              swap          io         system        cpu
 r  b   swpd   free   buff     cache   si   so       bi    bo     in    cs us sy wa id
 1  1      0  12672   6592  15914112    0    0  1081344    56  15719  1583  0 11 14 74
 1  0      0  12672   6592  15915200    0    0  1130496     0  15996  1626  0 11 14 74
 0  1      0  12672   6592  15914112    0    0  1081344     0  15891  1570  0 11 14 74
 0  1      0  12480   6592  15914112    0    0  1081344     0  15855  1537  0 11 14 74
 1  0      0  12416   6592  15914112    0    0  1130496     0  16006  1586  0 12 14 74

kernel 2.6.11: ~55MB/s read
procs                memory              swap          io         system        cpu
 r  b   swpd   free   buff     cache   si   so       bi    bo     in    cs us sy wa id
 1  1      0  24448  37568  15905984    0    0    56934     0   5166  1862  0  1 24 75
 0  1      0  20672  37568  15909248    0    0    57280     0   5168  1871  0  1 24 75
 0  1      0  22848  37568  15907072    0    0    57306     0   5173  1874  0  1 24 75
 0  1      0  25664  37568  15903808    0    0    57190     0   5171  1870  0  1 24 75
 0  1      0  21952  37568  15908160    0    0    57267     0   5168  1871  0  1 24 75

Because the filesystem might have an impact on the measurement, dd on /dev/md0
was used to get information about the performance. 
This also opens the possibility to test with block sizes larger than the page size.
And it appears that the performance with kernel 2.6.11 is closely 
related to the block size.
For example if the block size is exactly a multiple (2) of the page 
size the performance is back to ~1.1GB/s.
The general behaviour is a bit more complicated:  

 1. bs <= 1.5 * ps : ~27-57MB/s (differs with ps)
 2. bs > 1.5 * ps && bs < 2 * ps : rate increases to max. rate
 3. bs = n * ps ; (n >= 2) : ~1.1GB/s (== max. rate)
 4. bs > n * ps && bs < ~(n+0.5) * ps ; (n > 2) : ~27-70MB/s (differs
    with ps)
 5. bs > ~(n+0.5) * ps && bs < (n+1) * ps ; (n > 2) : increasing rate
    in several, more or less, distinct steps (e.g. 1/3 of max. rate and
    then 2/3 of max. rate for 64k pages)

I've tested all four possible page sizes on Itanium (4k, 8k, 16k and 64k) and the pattern is 
always the same!!

With kernel 2.6.9 (any kernel before 2.6.10-bk6) the read rate is always at 
~1.1GB/s,
independent of the block size.
This simple patch solves the problem, but I have no idea of possible 
side-effects ...
--- linux-2.6.12-rc2_orig/mm/filemap.c  2005-04-04 18:40:05.0 +0200
+++ linux-2.6.12-rc2/mm/filemap.c   2005-04-20 10:27:42.0 +0200
@@ -719,7 +719,7 @@
    index = *ppos >> PAGE_CACHE_SHIFT;
    next_index = index;
    prev_index = ra.prev_page;
-   last_index = (*ppos + desc->count + PAGE_CACHE_SIZE-1) >> PAGE_CACHE_SHIFT;
+   last_index = (*ppos + desc->count + PAGE_CACHE_SIZE) >> PAGE_CACHE_SHIFT;
    offset = *ppos & ~PAGE_CACHE_MASK;
    isize = i_size_read(inode);
--- linux-2.6.12-rc2_orig/mm/readahead.c    2005-04-04 18:40:05.0 +0200
+++ linux-2.6.12-rc2/mm/readahead.c 2005-04-20 18:37:04.0 +0200
@@ -70,7 +70,7 @@
  */
 static unsigned long get_init_ra_size(unsigned long size, unsigned long max)
 {
-   unsigned long newsize = roundup_pow_of_two(size);
+   unsigned long newsize = size;
    if (newsize <= max / 64)
        newsize = newsize * newsize;

In order to keep this mail short, I've created a webpage that contains 
all the detailed information and some plots:
http://www.cern.ch/openlab-debugging/raid

Regards,
  Andreas Hirstius
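
To make the effect of the filemap.c change above concrete, the two last_index expressions can be evaluated side by side. This is only a worked-arithmetic sketch: the function names are made up for the example, and PAGE_CACHE_SHIFT is fixed at 12 (4k pages) as an assumption, whereas the measurements above also cover 8k, 16k and 64k pages.

```c
#define PAGE_CACHE_SHIFT 12                       /* assumed: 4k pages */
#define PAGE_CACHE_SIZE  (1UL << PAGE_CACHE_SHIFT)

/* The expression on the '-' line of the filemap.c hunk. */
unsigned long last_index_orig(unsigned long pos, unsigned long count)
{
    return (pos + count + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT;
}

/* The expression on the '+' line of the same hunk. */
unsigned long last_index_patched(unsigned long pos, unsigned long count)
{
    return (pos + count + PAGE_CACHE_SIZE) >> PAGE_CACHE_SHIFT;
}
```

For a read starting at offset 0, a count of 8192 bytes gives 2 with the original expression and 3 with the patched one, while a count of 6144 bytes gives 2 with both. In general the two expressions differ, by exactly one page, only when pos + count is a multiple of the page size.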


[PATCH] tty races

2005-04-20 Thread Jason Baron

There are a couple of tty race conditions, which lead to inconsistent tty 
reference counting and tty layer oopses.

The first is a tty_open vs. tty_close race in drivers/char/tty_io.c. 
Basically, the span from the time that tty->count is deemed to be 1 (so 
that we are going to free the tty) to the time that the TTY_CLOSING bit 
is set needs to be atomic with respect to the manipulation of tty->count 
in init_dev(). This atomicity was previously guarded by the BKL. However, 
this is no longer true with the addition of a down() call in the middle 
of release_dev()'s atomic path, since down() may sleep and the BKL is 
dropped across a sleep. So the down() needs to be either moved outside 
the atomic path or dropped. I would vote for simply dropping it, as I 
don't see why it is necessary.

The second race is tty_open vs. tty_open. I've seen this race when the 
virtual console is the tty driver. In con_open(), vc_allocate() is called 
if tty->count is 1. However, this check of tty->count is not guarded by 
'tty_sem'. Thus, it is possible for con_open() to never see tty->count 
as 1, and thus never call vc_allocate(). This leads to a NULL 
filp->private_data, and an oops.

The test case below reproduces these problems, and the patch fixes them. 
The test case uses /dev/tty9, which is generally restricted to root for 
open(). It may be possible to exploit these races using pseudo terminals, 
although I wasn't able to. A previous report of this issue, with an oops 
trace, was: http://www.ussg.iu.edu/hypermail/linux/kernel/0503.2/0017.html

thanks,

-Jason


--- linux/drivers/char/tty_io.c.bak
+++ linux/drivers/char/tty_io.c
@@ -1596,14 +1596,9 @@ static void release_dev(struct file * fi
     * each iteration we avoid any problems.
     */
    while (1) {
-       /* Guard against races with tty->count changes elsewhere and
-          opens on /dev/tty */
-          
-       down(&tty_sem);
        tty_closing = tty->count <= 1;
        o_tty_closing = o_tty &&
            (o_tty->count <= (pty_master ? 1 : 0));
-       up(&tty_sem);
        do_sleep = 0;
 
        if (tty_closing) {
@@ -1640,7 +1635,6 @@ static void release_dev(struct file * fi
     * block, so it's safe to proceed with closing.
     */
 
-   down(&tty_sem);
    if (pty_master) {
        if (--o_tty->count < 0) {
            printk(KERN_WARNING "release_dev: bad pty slave count "
@@ -1654,7 +1648,6 @@ static void release_dev(struct file * fi
                   tty->count, tty_name(tty, buf));
        tty->count = 0;
    }
-   up(&tty_sem);
 
    /*
     * We've decremented tty->count, so we need to remove this file
@@ -1844,9 +1837,10 @@ retry_open:
    }
 got_driver:
    retval = init_dev(driver, index, &tty);
-   up(&tty_sem);
-   if (retval)
+   if (retval) {
+       up(&tty_sem);
        return retval;
+   }
 
    filp->private_data = tty;
    file_move(filp, &tty->tty_files);
@@ -1863,6 +1857,7 @@ got_driver:
        else
            retval = -ENODEV;
    }
+   up(&tty_sem);
    filp->f_flags = saved_flags;
 
    if (!retval && test_bit(TTY_EXCLUSIVE, &tty->flags) && !capable(CAP_SYS_ADMIN))



#include <sys/types.h>
#include <sys/stat.h>
#include <sys/ioctl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <time.h>
#include <pthread.h>
#include <linux/fb.h>
#include <linux/vt.h>
#include <linux/kd.h>

#define NTHREADS 300

void *thread_function(void *arg);
/* updated by all threads without locking, so the totals are approximate */
int open_fail_num;
int open_success;

int
main(int argc, char *argv[])
{
    int i, j;
    pthread_t thread_id[NTHREADS];

    for(;;) {
        for(i=0; i < NTHREADS; i++) {
            pthread_create(&thread_id[i], NULL, thread_function, NULL);
        }
        for(j=0; j < NTHREADS; j++) {
            pthread_join(thread_id[j], NULL);
        }
        printf("open failures: %i\n", open_fail_num);
        printf("open success: %i\n", open_success);
    }
}

void *thread_function(void *arg)
{
    int fd;
    time_t t;
    int val;
    int ret;

    fd = open("/dev/tty9", O_RDWR);

    val = 0;
    //call an ioctl
    ret = ioctl(fd, KDGETMODE, &val);
    if (ret != 0) {
        perror("ioctl error\n");
    }

    if (fd < 0) {
        open_fail_num++;
    } else {
        open_success++;
    }
    /* just waste some random time */
    t = (time((time_t *)0) &31L) << 6;
    while (t-- > 0)
        (void)time((time_t *)0);
    close(fd);
    return NULL;
}
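
The second race described above, a count check made outside the lock that protects the count, can also be illustrated in user space. The sketch below is an analogy only and assumes a pthreads environment: the mutex sem stands in for tty_sem, count for tty->count, and allocations for the number of times the "first open" setup (vc_allocate() in the report) would run. All of these names are made up for the example.

```c
#include <pthread.h>

#define NOPENERS 64   /* arbitrary thread count chosen for the example */

static pthread_mutex_t sem = PTHREAD_MUTEX_INITIALIZER; /* plays the role of tty_sem */
static int count;        /* plays the role of tty->count */
static int allocations;  /* how many openers ran the "first open" setup */

/* One simulated open(): bump the count and, if we are first, do the setup. */
static void *opener(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&sem);
    count++;
    if (count == 1)      /* checked while still holding the lock */
        allocations++;
    pthread_mutex_unlock(&sem);
    return NULL;
}

/* Run up to NOPENERS concurrent openers and report how many did the setup. */
int run_openers(void)
{
    pthread_t tid[NOPENERS];
    int i, n = 0;

    count = 0;
    allocations = 0;
    for (i = 0; i < NOPENERS; i++)
        if (pthread_create(&tid[n], NULL, opener, NULL) == 0)
            n++;
    for (i = 0; i < n; i++)
        pthread_join(tid[i], NULL);
    return allocations;
}
```

Because the increment and the count == 1 test happen under the same lock, exactly one opener sees the count as 1. If the lock were released between the increment and the test, several openers could increment first and none of them would then observe 1, which is the situation the report describes for con_open().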





