Re: [UPDATED PATCH] Re: [Lse-tech] Re: [Patch 7/7] Generic netlink interface (delay accounting)

2006-03-11 Thread jamal
On Fri, 2006-10-03 at 22:09 +0530, Balbir Singh wrote:
 On Fri, Mar 10, 2006 at 09:53:53AM -0500, jamal wrote: 

  On kernel-user (in the case of response to #a or async notifiation as
  in #b) you really dont need to specify the TG/PID since they appear in
  the STATS etc.
 
 I see your point now. I am looking at other users of netlink like
 rtnetlink and I see the classical usage.
 
 We can implement TLV's in our code, but for the most part the data we exchange
 between the user - kernel has all the TLV's listed in the enum above.

 The major differnece is the type (pid/tgid). Hence we created a structure
 (taskstats) instead of using TLV's.

Something to remember:

1) TLVs are essentially giving you the flexibility to send optionally
appearing elements. It is up to the receiver (in the kernel or user
space) to check for the presence of mandatory elements or execute things
depending on the presence of certain TLVs. Example in your case:
if the tgid TLV appears then the user is requesting for that TLV
if the pid appears then they are requesting for that
if both appear then it is the  of the two.
You should always ignore TLVs you dont understand - to allow for forward
compatibility.

2)  The T part is essentially also encoding (semantically) what size
the value is; the L part is useful for validation. So the receiver
will always know what the size of the TLV is by definition and uses the
L to make sure it is the right size. Reject what is of the wrong size.

cheers,
jamal

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [UPDATED PATCH] Re: [Lse-tech] Re: [Patch 7/7] Generic netlink interface (delay accounting)

2006-03-10 Thread jamal
On Thu, 2006-09-03 at 20:07 +0530, Balbir Singh wrote:

 Please find the latest version of the patch for review. The genetlink
 code has been updated as per your review comments. The changelog is provided
 below
 
 1. Eliminated TASKSTATS_CMD_LISTEN and TASKSTATS_CMD_IGNORE
 2. Provide generic functions called genlmsg_data() and genlmsg_len()
in linux/net/genetlink.h
 3. Do not multicast all replies, multicast only events generated due
to task exit.
 4. The taskstats and taskstats_reply structures are now 64 bit aligned.
 5. Family id is dynamically generated.
 
 Please let us know if we missed something out.

Design still shaky IMO - now that i think i may understand what your end
goal is. 
Using the principles i described in earlier email, the problem you are
trying to solve is:

a) shipping of the taskstats from kernel to user-space asynchronously to
all listeners on multicast channel/group TASKSTATS_LISTEN_GRP
at the point when some process exits.
b) responding to queries issued by the user to the kernel for taskstats
of a particular defined tgid and/or pid combination. 

Did i summarize your goals correctly?

So lets stat with #b:
i) the message is multicast; there has to be a user space app registered
to the multicast group otherwise nothing goes to user space.
ii) user space issues a GET and it seems to me the appropriate naming
for the response is a NEW.

Lets go to #a:
The issued multicast messages are also NEW and no different from the
ones sent in response to a GET.

Having said that then, you have the following commands:

enum {
   TASKSTATS_CMD_UNSPEC,   /* Reserved */
   TASKSTATS_CMD_GET,  /* user - kernel query*/
   TASKSTATS_CMD_NEW, /* kernel - user update */
};

You also need the following TLVs

enum {
   TASKSTATS_TYPE_UNSPEC,   /* Reserved */
   TASKSTATS_TYPE_TGID,  /* The TGID */
   TASKSTATS_TYPE_PID, /* The PID */
   TASKSTATS_TYPE_STATS,/* carries the taskstats */
   TASKSTATS_TYPE_VERS, /* carries the version */
};

Refer to the doc i passed you and provide feedback if how to use the
above is not obvious.
 
The use of TLVs above implies that any of these can be optionally
appearing.
So when you are going from user-kernel with a GET a in #a above, then
you can specify the PID and/or TGID and you dont need to specify the
STATS and this would be perfectly legal.

On kernel-user (in the case of response to #a or async notifiation as
in #b) you really dont need to specify the TG/PID since they appear in
the STATS etc.
I take it you dont want to configure the values of taskstats from user
space, otherwise user-kernel will send a NEW as well.
I also take it dumping doesnt apply to you, so you dont need a dump
callback in your kernel code.
From what i described so far, you dont really need a header for yourself
either (maybe you need one to just store the VERSION?)

I didnt understand the point to the err field you had in the reply.
Netlink already does issue errors which can be read via perror. If this
is different from what you need, it may be an excuse to have your own
header.

I hope this helps.

cheers,
jamal



-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [UPDATED PATCH] Re: [Lse-tech] Re: [Patch 7/7] Generic netlink interface (delay accounting)

2006-03-10 Thread jamal
On Fri, 2006-10-03 at 09:53 -0500, jamal wrote:

 
 a) shipping of the taskstats from kernel to user-space asynchronously to
 all listeners on multicast channel/group TASKSTATS_LISTEN_GRP
 at the point when some process exits.
 b) responding to queries issued by the user to the kernel for taskstats
 of a particular defined tgid and/or pid combination. 
 
 Did i summarize your goals correctly?
 
 So lets stat with #b:
 i) the message is multicast; there has to be a user space app registered
 to the multicast group otherwise nothing goes to user space.

I mispoke:
The above applies to #a. 
For #b, the message from/to kernel to user is unicast.


cheers,
jamal


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [UPDATED PATCH] Re: [Lse-tech] Re: [Patch 7/7] Generic netlink interface (delay accounting)

2006-03-10 Thread Balbir Singh
On Fri, Mar 10, 2006 at 09:53:53AM -0500, jamal wrote:
 On Thu, 2006-09-03 at 20:07 +0530, Balbir Singh wrote:
  Please let us know if we missed something out.
 
 Design still shaky IMO - now that i think i may understand what your end
 goal is. 
 Using the principles i described in earlier email, the problem you are
 trying to solve is:
 
 a) shipping of the taskstats from kernel to user-space asynchronously to
 all listeners on multicast channel/group TASKSTATS_LISTEN_GRP
 at the point when some process exits.
 b) responding to queries issued by the user to the kernel for taskstats
 of a particular defined tgid and/or pid combination. 
 
 Did i summarize your goals correctly?

Yes, you did.

 
 So lets stat with #b:
 i) the message is multicast; there has to be a user space app registered
 to the multicast group otherwise nothing goes to user space.
 ii) user space issues a GET and it seems to me the appropriate naming
 for the response is a NEW.
 
 Lets go to #a:
 The issued multicast messages are also NEW and no different from the
 ones sent in response to a GET.
 
 Having said that then, you have the following commands:
 
 enum {
TASKSTATS_CMD_UNSPEC,   /* Reserved */
TASKSTATS_CMD_GET,  /* user - kernel query*/
TASKSTATS_CMD_NEW, /* kernel - user update */
 };
 
 You also need the following TLVs
 
 enum {
TASKSTATS_TYPE_UNSPEC,   /* Reserved */
TASKSTATS_TYPE_TGID,  /* The TGID */
TASKSTATS_TYPE_PID, /* The PID */
TASKSTATS_TYPE_STATS,  /* carries the taskstats */
TASKSTATS_TYPE_VERS,   /* carries the version */
 };
 
 Refer to the doc i passed you and provide feedback if how to use the
 above is not obvious.
  

I will look at the document, just got hold of it.

 The use of TLVs above implies that any of these can be optionally
 appearing.
 So when you are going from user-kernel with a GET a in #a above, then
 you can specify the PID and/or TGID and you dont need to specify the
 STATS and this would be perfectly legal.
 
 On kernel-user (in the case of response to #a or async notifiation as
 in #b) you really dont need to specify the TG/PID since they appear in
 the STATS etc.

I see your point now. I am looking at other users of netlink like
rtnetlink and I see the classical usage.

We can implement TLV's in our code, but for the most part the data we exchange
between the user - kernel has all the TLV's listed in the enum above.
The major differnece is the type (pid/tgid). Hence we created a structure
(taskstats) instead of using TLV's.

 I take it you dont want to configure the values of taskstats from user
 space, otherwise user-kernel will send a NEW as well.

Your understanding is correct.

 I also take it dumping doesnt apply to you, so you dont need a dump
 callback in your kernel code.

Yes, this is correct as well.

 From what i described so far, you dont really need a header for yourself
 either (maybe you need one to just store the VERSION?)
 

True, we do not need a header.

 I didnt understand the point to the err field you had in the reply.
 Netlink already does issue errors which can be read via perror. If this
 is different from what you need, it may be an excuse to have your own
 header.
 

Hmm.. Will look into this.

 I hope this helps.
 

Yes, it does immensely. Thanks for the detailed feedback.

 cheers,
 jamal
 
 

Warm Regards,
Balbir
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[UPDATED PATCH] Re: [Lse-tech] Re: [Patch 7/7] Generic netlink interface (delay accounting)

2006-03-09 Thread Balbir Singh
 Thanks for the clarification of the usage model. While our needs are 
 certainly much less complex,
 it is useful to know the range of options.
 
 There are no hard rules on what you need to be multicasting and as an
 example you could send periodic(aka time based) samples from the kernel
 on a multicast channel and that would be received by all. It did seem
 odd that you want to have a semi-promiscous mode where a response to a
 GET is multicast. If that is still what you want to achieve, then you
 should.
  
 Also if you can provide feedback whether the doc i sent was any use
 and what wasnt clear etc.
 also take a look at the excellent documentation Thomas Graf has put in
 the kernel for all the utilities for manipulating netlink messages and
 tell me if that should also be put in this doc (It is listed as a TODO).

Hello, Jamal,

Please find the latest version of the patch for review. The genetlink
code has been updated as per your review comments. The changelog is provided
below

1. Eliminated TASKSTATS_CMD_LISTEN and TASKSTATS_CMD_IGNORE
2. Provide generic functions called genlmsg_data() and genlmsg_len()
   in linux/net/genetlink.h
3. Do not multicast all replies, multicast only events generated due
   to task exit.
4. The taskstats and taskstats_reply structures are now 64 bit aligned.
5. Family id is dynamically generated.

Please let us know if we missed something out.

Thanks,
Balbir


Signed-off-by: Shailabh Nagar [EMAIL PROTECTED]
Signed-off-by: Balbir Singh [EMAIL PROTECTED]

---

 include/linux/delayacct.h |2 
 include/linux/taskstats.h |  128 
 include/net/genetlink.h   |   20 +++
 init/Kconfig  |   16 ++-
 kernel/Makefile   |1 
 kernel/delayacct.c|   56 ++
 kernel/taskstats.c|  244 ++
 7 files changed, 464 insertions(+), 3 deletions(-)

diff -puN include/linux/delayacct.h~delayacct-genetlink 
include/linux/delayacct.h
--- linux-2.6.16-rc5/include/linux/delayacct.h~delayacct-genetlink  
2006-03-09 17:15:31.0 +0530
+++ linux-2.6.16-rc5-balbir/include/linux/delayacct.h   2006-03-09 
17:15:31.0 +0530
@@ -15,6 +15,7 @@
 #define _LINUX_TASKDELAYS_H
 
 #include linux/sched.h
+#include linux/taskstats.h
 
 #ifdef CONFIG_TASK_DELAY_ACCT
 extern int delayacct_on;   /* Delay accounting turned on/off */
@@ -24,6 +25,7 @@ extern void __delayacct_tsk_init(struct 
 extern void __delayacct_tsk_exit(struct task_struct *);
 extern void __delayacct_blkio(void);
 extern void __delayacct_swapin(void);
+extern int delayacct_add_tsk(struct taskstats_reply *, struct task_struct *);
 
 static inline void delayacct_tsk_init(struct task_struct *tsk)
 {
diff -puN /dev/null include/linux/taskstats.h
--- /dev/null   2004-06-24 23:34:38.0 +0530
+++ linux-2.6.16-rc5-balbir/include/linux/taskstats.h   2006-03-09 
19:28:54.0 +0530
@@ -0,0 +1,128 @@
+/* taskstats.h - exporting per-task statistics
+ *
+ * Copyright (C) Shailabh Nagar, IBM Corp. 2006
+ *   (C) Balbir Singh,   IBM Corp. 2006
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of version 2.1 of the GNU Lesser General Public License
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it would be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+ */
+
+#ifndef _LINUX_TASKSTATS_H
+#define _LINUX_TASKSTATS_H
+
+/* Format for per-task data returned to userland when
+ * - a task exits
+ * - listener requests stats for a task
+ *
+ * The struct is versioned. Newer versions should only add fields to
+ * the bottom of the struct to maintain backward compatibility.
+ *
+ * To create the next version, bump up the taskstats_version variable
+ * and delineate the start of newly added fields with a comment indicating
+ * the version number.
+ */
+
+#define TASKSTATS_VERSION  1
+
+struct taskstats {
+   /* Maintain 64-bit alignment while extending */
+
+   /* Version 1 */
+#define TASKSTATS_NOPID-1
+   __s64   pid;
+   __s64   tgid;
+
+   /* XXX_count is number of delay values recorded.
+* XXX_total is corresponding cumulative delay in nanoseconds
+*/
+
+#define TASKSTATS_NOCPUSTATS   1
+   __u64   cpu_count;
+   __u64   cpu_delay_total;/* wait, while runnable, for cpu */
+   __u64   blkio_count;
+   __u64   blkio_delay_total;  /* sync,block io completion wait*/
+   __u64   swapin_count;
+   __u64   swapin_delay_total; /* swapin page fault wait*/
+
+   __u64   cpu_run_total;  /* cpu running time
+* no count available/provided */
+};
+
+
+#define TASKSTATS_LISTEN_GROUP 0x1
+
+/*
+ * Commands sent from userspace
+ * Not versioned. New commands should only be inserted at the enum's end
+ 

Re: [UPDATED PATCH] Re: [Lse-tech] Re: [Patch 7/7] Generic netlink interface (delay accounting)

2006-03-09 Thread Shailabh Nagar

Balbir Singh wrote:

snip


Hello, Jamal,

Please find the latest version of the patch for review. The genetlink
code has been updated as per your review comments. The changelog is provided
below

1. Eliminated TASKSTATS_CMD_LISTEN and TASKSTATS_CMD_IGNORE
2. Provide generic functions called genlmsg_data() and genlmsg_len()
  in linux/net/genetlink.h
 


Balbir,
it might be a good idea to split 2. out separately, since it has generic 
value beyond the

delay accounting patches (just like we did for the timespec_diff_ns change)


Thanks,
Shailabh


3. Do not multicast all replies, multicast only events generated due
  to task exit.
4. The taskstats and taskstats_reply structures are now 64 bit aligned.
5. Family id is dynamically generated.

Please let us know if we missed something out.

Thanks,
Balbir


Signed-off-by: Shailabh Nagar [EMAIL PROTECTED]
Signed-off-by: Balbir Singh [EMAIL PROTECTED]

---

include/linux/delayacct.h |2 
include/linux/taskstats.h |  128 

include/net/genetlink.h   |   20 +++
init/Kconfig  |   16 ++-
kernel/Makefile   |1 
kernel/delayacct.c|   56 ++

kernel/taskstats.c|  244 ++
7 files changed, 464 insertions(+), 3 deletions(-)

 


snip

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html