Re: Good design to expose debug info from kernel module
Sounds good, thanks (although it'll be harder to use from non-C programs). Do you have a good idea how to stream information as a response to ioctl? On Mon, Mar 30, 2015 at 3:31 PM, Gilboa Davara gilb...@gmail.com wrote: On Thu, Mar 26, 2015 at 11:36 PM, Elazar Leibovich elaz...@gmail.com wrote: Hi, I'm writing a kernel module, and I want to expose some debug information about it. The debug information is often of the form of request-response. For example: - Hey module, what's up with data at 0xe8ff0040c000? - Cached, populated two hours ago. - Hey module, please invalidate data at 0xe8ff0002cb00 - Sure thing. - Hey module, please record all accesses to 0xe8ff0006bbf0. - OK, ask me again for stats-5 ... - Hey module, what's in stats-5? - So far, 41 accesses by 22 users. Now, the question is, what is a good design to expose this information. I think that the most reasonable way to interact with userspace is through a debugfs file. In my view, I usually prefer using ioctl-over-character device with *strongly* defined API / structures. Looking at the examples, your API should be fairly simple: typedef struct _debug_request_ { request_t request_type; void *request_pointer; answer_t answer_type; union answer_data { int __out error_code; int __out next_answer_time; struct access_info { int __out access_count; int __out user_count; }; }; } debug_request_t; If you make sure you copy_from/to_user this structure in-and-out of user-space, you get fairly robust interface in a fairly short and easy to debug code. (E.g. compared to having to parse user-supplied text string in kernel mode or using debugfs [partial read/write, etc]). - Gilboa ___ Linux-il mailing list Linux-il@cs.huji.ac.il http://mailman.cs.huji.ac.il/mailman/listinfo/linux-il ___ Linux-il mailing list Linux-il@cs.huji.ac.il http://mailman.cs.huji.ac.il/mailman/listinfo/linux-il
Re: Good design to expose debug info from kernel module
On Mon, Mar 30, 2015 at 3:44 PM, Elazar Leibovich elaz...@gmail.com wrote: Sounds good, thanks (although it'll be harder to use from non-C programs). I usually complement each kernel module with a user-mode C library wrapped inside a nice / easy to use (...) C++ class that handles the difficult bits. It far easier to make a C++ class that handles idiot-proof (E.g. string commands) data than doing it inside the kernel. (Plus, when a user mode library goes up the flame due to bad input, you don't end up with a dead machine...) Do you have a good idea how to stream information as a response to ioctl? Use TLV (type / length / value-array). typedef union _answer_data_ { int __out error_code; int __out next_answer_time; struct access_info { int __out access_count; int __out user_count; }; } answer_data_t; typedef struct _debug_request_ { request_t request_type; void *request_pointer; answer_t answer_type; --- Answer type. ( Enum, define, etc ). int answer_buffer_size; --- User supplied answer buffer size (in answer_data union units). int __out answer_buffer_used; Answer buffer used size (number of answers copied by the kernel). answer_data_t *answer_data_array; -- Answer buffer array (user-mode pointer). } debug_request_t; Now, in-case of multiple answers, you simply copy_to_user((void *)debug_request_copy-answer_data_array[count], (void *)kernel_answer, sizeof(answer_data_t)); count ++; .. To copy each answer to the user buffer. Once the array is depleted, the kernel function returns with a valid error code (E.g. -EEXIST) and tells to the user mode caller that there's additional data to be read (E.g. until the kernel returns -EWOULDBLOCK). Alternatively (and somewhat more complex) you can setup mmap over file handle, and simply map a static kernel buffer directly to the user mode. Though, unless you need to stream huge amounts of data and/or ioctl performance penalty [50-200ns] is too high, I wouldn't bother. BTW, Most likely your are aware of this, but just in case, never access the user mode buffer directly. When you first access the request buffer you copy it (copy_from_user) to kernel debug_request (no need to allocate it, considering its size, you shouldn't have any issues using stack, even on i386). - Gilboa ___ Linux-il mailing list Linux-il@cs.huji.ac.il http://mailman.cs.huji.ac.il/mailman/listinfo/linux-il
Re: Good design to expose debug info from kernel module
And this is exactly my debate. On the one hand, using protobuf (or flatbuffers, or cap'nproto, or JSON, or msgpack, or whatever standard serialization format) has its share of complexity. You have to find the end of the struct, you have to have a parser both in kernel and in userspace, etc. On the other hand, what we're seeing here is, to paraphrase Greenspun, an ad hoc, informally-specified, bug-ridden, slow implementation of half of protocol buffers ;-) Given we're adding protobuf parser to the module, we can access it relatively easily from any programming language, we do need need a special syscall or include special headers (you need to get the ioctl number from the running module version). And we're getting more features (i.e., ZigZag encoding of integers, backwards compatibility) for free. I can definitely see your point that sticking to C structs is simpler for many things, but standard serialization could be simpler for others. PS Since you mentioned copy_from_user, I think it's worth mentioning that the reason for not accessing user's memory directly, is that without copy_from_user, the data could be swapped out in the middle of the memcpy. Or at least this is one of the reasons. Thanks, On Mon, Mar 30, 2015 at 5:08 PM, Gilboa Davara gilb...@gmail.com wrote: On Mon, Mar 30, 2015 at 3:44 PM, Elazar Leibovich elaz...@gmail.com wrote: Sounds good, thanks (although it'll be harder to use from non-C programs). I usually complement each kernel module with a user-mode C library wrapped inside a nice / easy to use (...) C++ class that handles the difficult bits. It far easier to make a C++ class that handles idiot-proof (E.g. string commands) data than doing it inside the kernel. (Plus, when a user mode library goes up the flame due to bad input, you don't end up with a dead machine...) Do you have a good idea how to stream information as a response to ioctl? Use TLV (type / length / value-array). typedef union _answer_data_ { int __out error_code; int __out next_answer_time; struct access_info { int __out access_count; int __out user_count; }; } answer_data_t; typedef struct _debug_request_ { request_t request_type; void *request_pointer; answer_t answer_type; --- Answer type. ( Enum, define, etc ). int answer_buffer_size; --- User supplied answer buffer size (in answer_data union units). int __out answer_buffer_used; Answer buffer used size (number of answers copied by the kernel). answer_data_t *answer_data_array; -- Answer buffer array (user-mode pointer). } debug_request_t; Now, in-case of multiple answers, you simply copy_to_user((void *)debug_request_copy-answer_data_array[count], (void *)kernel_answer, sizeof(answer_data_t)); count ++; .. To copy each answer to the user buffer. Once the array is depleted, the kernel function returns with a valid error code (E.g. -EEXIST) and tells to the user mode caller that there's additional data to be read (E.g. until the kernel returns -EWOULDBLOCK). Alternatively (and somewhat more complex) you can setup mmap over file handle, and simply map a static kernel buffer directly to the user mode. Though, unless you need to stream huge amounts of data and/or ioctl performance penalty [50-200ns] is too high, I wouldn't bother. BTW, Most likely your are aware of this, but just in case, never access the user mode buffer directly. When you first access the request buffer you copy it (copy_from_user) to kernel debug_request (no need to allocate it, considering its size, you shouldn't have any issues using stack, even on i386). - Gilboa ___ Linux-il mailing list Linux-il@cs.huji.ac.il http://mailman.cs.huji.ac.il/mailman/listinfo/linux-il
Re: Good design to expose debug info from kernel module
On 29 March 2015 at 08:14, Elazar Leibovich elaz...@gmail.com wrote: 4) I really think nowadays text is a bit obsolete. I can hardly think of a case where text would be more convenient than, say, json or HTML. Web browser is at your fingertips, and HTML table is easier to handle than whitespace separated output based on your terminal size. trtdeth0/tdtd100/td/tr is just as grep'able as the tab separated version, and is easier to view, sort, etc. By text I DID mean JSON. I think that using JSON should address all the concerns above and keep the protocol future-proof. I didn't find kernel-specific JSON parser implementations but suspect that it should be possible to use any simple user-space C implementation. [0] https://github.com/elazarl/cpu_affinity/blob/master/tracecpu.d [1] https://github.com/elazarl/cpu_affinity/blob/master/test/linux/plot_ftrace_sched_switch.py On Sat, Mar 28, 2015 at 12:49 AM, Amos Shapira amos.shap...@gmail.com wrote: If serialisation (aka marshalling) is considered, how about making it text based? Then you can use simple shell tools to talk to it. On 27 March 2015 at 22:34, Elazar Leibovich elaz...@gmail.com wrote: IMHO, C structs are no way near as usable as proper serialization format. For example, what about optional fields? What about variable length array? What about binary backwards compatibility? What about supporting other languages? It's not trivial to take a C struct and generate the proper struct.unpack string for it. Look at the complexity in perf_event_open(2), just parsing the event stream takes a good chunk of code[0], with many potential bugs. Parsing it with protobuf (or one of the other serialization formats) would take three lines or so, would be more efficient, and would be easier to program against, and less prone to bugs, etc. [0] Here is my take, and it's not even complete https://gist.github.com/elazarl/c8404686e71ef0b36cc7 On Fri, Mar 27, 2015 at 12:26 PM, guy keren guy.choo.ke...@gmail.com wrote: i imagine, if you use the proper 'packing' pragmas, you can simply mempcy structures, without really writing serialization code (there's no endianess issues, with both sides running on the same host, by definition). --guy On 03/27/2015 10:03 AM, Elazar Leibovich wrote: Thanks, didn't know netlink. You still need a solution to parse the sent message, where protocol buffers etc, can help. (e.g., binary data into struct mymodule_request). Or am I missing something? On Fri, Mar 27, 2015 at 3:33 AM, guy keren guy.choo.ke...@gmail.com wrote: take a look at this: http://www.linuxfoundation.org/collaborate/workgroups/networking/generic_netlink_howto (link got broken - place it all on a single line) --guy On 03/26/2015 11:36 PM, Elazar Leibovich wrote: Hi, I'm writing a kernel module, and I want to expose some debug information about it. The debug information is often of the form of request-response. For example: - Hey module, what's up with data at 0xe8ff0040c000? - Cached, populated two hours ago. - Hey module, please invalidate data at 0xe8ff0002cb00 - Sure thing. - Hey module, please record all accesses to 0xe8ff0006bbf0. - OK, ask me again for stats-5 ... - Hey module, what's in stats-5? - So far, 41 accesses by 22 users. Now, the question is, what is a good design to expose this information. I think that the most reasonable way to interact with userspace is through a debugfs file. The user would open the debugfs file in read+write mode, would write a request, and accept a response from it. As I see it, there are two fundamental problems needs to be solved: - Parsing the request from the client. - Writing the response in a recognizeable format. A simple solution I first came up with, is to use a ad-hoc request-response format. In my case, request and response are line delimited, request is a hex address, and response is a translated hex address. Here is the relevant snippet. struct pipe { DECLARE_KFIFO(fifo, T, (14)); wait_queue_head_t queue; char buf[100]; int buflen; char resp[100]; int resp_len; }; static DEFINE_MUTEX(mutex); static int open(struct inode *inode, struct file *file) { struct pipe *pipe; if (!(file-f_mode FMODE_READ) || !(file-f_mode FMODE_READ)) { pr_warn(must open with O_RDWR\n); return -EINVAL; } mutex_lock(mutex); pipe = kzalloc(sizeof(*pipe), GFP_KERNEL); INIT_KFIFO(pipe-fifo); init_waitqueue_head(pipe-queue); file-private = pipe; } static int write(struct file *file, const char __user *ubuf, size_t count, loff_t *ppos) { char *eol; size_t n = min_t(size_t, count, sizeof(pipe-buf)); struct pipe *pipe = file-private_data; if
Re: Good design to expose debug info from kernel module
I've been bitten by text based serialization formats: 0) You have to write a parser/printer for each new protocol. 1) Separating messages in a stream is trickier with text based formats: int rv = read(fd, len, sizeof(len)); assert(rv 0); rv = read(len, buf, sizeof(buf))); assert(rv 0); instead of (without handling errors): int len = 0; while ((eol = memchr(buf, '\n', len)) == NULL) { int n = read(buf + len, sizeof(buf)-len); assert(n 0); len += n; } char *leftover = eol+1; int leftover_len = len-(eol+1-buf); // keep leftover, and add it to next read 2) Text based serialization are hard to parse. It's nice that their readable from the terminal, but whenever I've had to analyze it a bit more, I had to resort to crazy regexp. Instead of abstract description, here is a real problem I was trying to solve. Sketch when did the machine context switched lately. With dtrace, it was a few minute work, as I had the event data parsed[0]. When using Linux, I had to trace ftrace's textual output, which took time to get right, AND, was still fragile, as weird file names could potentially break it[1]. 3) While for simple protocols text based semi-protocol could work, my experience is, when you have more complex requests (say, more than one field, and variable length array), it becomes a burden. Especially if you want to keep backwards compatibility. Using standard makes both consuming and producing requests and response almost effortless. 4) I really think nowadays text is a bit obsolete. I can hardly think of a case where text would be more convenient than, say, json or HTML. Web browser is at your fingertips, and HTML table is easier to handle than whitespace separated output based on your terminal size. trtdeth0/tdtd100/td/tr is just as grep'able as the tab separated version, and is easier to view, sort, etc. [0] https://github.com/elazarl/cpu_affinity/blob/master/tracecpu.d [1] https://github.com/elazarl/cpu_affinity/blob/master/test/linux/plot_ftrace_sched_switch.py On Sat, Mar 28, 2015 at 12:49 AM, Amos Shapira amos.shap...@gmail.com wrote: If serialisation (aka marshalling) is considered, how about making it text based? Then you can use simple shell tools to talk to it. On 27 March 2015 at 22:34, Elazar Leibovich elaz...@gmail.com wrote: IMHO, C structs are no way near as usable as proper serialization format. For example, what about optional fields? What about variable length array? What about binary backwards compatibility? What about supporting other languages? It's not trivial to take a C struct and generate the proper struct.unpack string for it. Look at the complexity in perf_event_open(2), just parsing the event stream takes a good chunk of code[0], with many potential bugs. Parsing it with protobuf (or one of the other serialization formats) would take three lines or so, would be more efficient, and would be easier to program against, and less prone to bugs, etc. [0] Here is my take, and it's not even complete https://gist.github.com/elazarl/c8404686e71ef0b36cc7 On Fri, Mar 27, 2015 at 12:26 PM, guy keren guy.choo.ke...@gmail.com wrote: i imagine, if you use the proper 'packing' pragmas, you can simply mempcy structures, without really writing serialization code (there's no endianess issues, with both sides running on the same host, by definition). --guy On 03/27/2015 10:03 AM, Elazar Leibovich wrote: Thanks, didn't know netlink. You still need a solution to parse the sent message, where protocol buffers etc, can help. (e.g., binary data into struct mymodule_request). Or am I missing something? On Fri, Mar 27, 2015 at 3:33 AM, guy keren guy.choo.ke...@gmail.com wrote: take a look at this: http://www.linuxfoundation.org/collaborate/workgroups/networking/generic_netlink_howto (link got broken - place it all on a single line) --guy On 03/26/2015 11:36 PM, Elazar Leibovich wrote: Hi, I'm writing a kernel module, and I want to expose some debug information about it. The debug information is often of the form of request-response. For example: - Hey module, what's up with data at 0xe8ff0040c000? - Cached, populated two hours ago. - Hey module, please invalidate data at 0xe8ff0002cb00 - Sure thing. - Hey module, please record all accesses to 0xe8ff0006bbf0. - OK, ask me again for stats-5 ... - Hey module, what's in stats-5? - So far, 41 accesses by 22 users. Now, the question is, what is a good design to expose this information. I think that the most reasonable way to interact with userspace is through a debugfs file. The user would open the debugfs file in read+write mode, would write a request, and accept a response from it. As I see it, there are two fundamental problems needs to be solved: - Parsing the request from the client. - Writing the response in a
Re: Good design to expose debug info from kernel module
Thanks, didn't know netlink. You still need a solution to parse the sent message, where protocol buffers etc, can help. (e.g., binary data into struct mymodule_request). Or am I missing something? On Fri, Mar 27, 2015 at 3:33 AM, guy keren guy.choo.ke...@gmail.com wrote: take a look at this: http://www.linuxfoundation.org/collaborate/workgroups/networking/generic_netlink_howto (link got broken - place it all on a single line) --guy On 03/26/2015 11:36 PM, Elazar Leibovich wrote: Hi, I'm writing a kernel module, and I want to expose some debug information about it. The debug information is often of the form of request-response. For example: - Hey module, what's up with data at 0xe8ff0040c000? - Cached, populated two hours ago. - Hey module, please invalidate data at 0xe8ff0002cb00 - Sure thing. - Hey module, please record all accesses to 0xe8ff0006bbf0. - OK, ask me again for stats-5 ... - Hey module, what's in stats-5? - So far, 41 accesses by 22 users. Now, the question is, what is a good design to expose this information. I think that the most reasonable way to interact with userspace is through a debugfs file. The user would open the debugfs file in read+write mode, would write a request, and accept a response from it. As I see it, there are two fundamental problems needs to be solved: - Parsing the request from the client. - Writing the response in a recognizeable format. A simple solution I first came up with, is to use a ad-hoc request-response format. In my case, request and response are line delimited, request is a hex address, and response is a translated hex address. Here is the relevant snippet. struct pipe { DECLARE_KFIFO(fifo, T, (14)); wait_queue_head_t queue; char buf[100]; int buflen; char resp[100]; int resp_len; }; static DEFINE_MUTEX(mutex); static int open(struct inode *inode, struct file *file) { struct pipe *pipe; if (!(file-f_mode FMODE_READ) || !(file-f_mode FMODE_READ)) { pr_warn(must open with O_RDWR\n); return -EINVAL; } mutex_lock(mutex); pipe = kzalloc(sizeof(*pipe), GFP_KERNEL); INIT_KFIFO(pipe-fifo); init_waitqueue_head(pipe-queue); file-private = pipe; } static int write(struct file *file, const char __user *ubuf, size_t count, loff_t *ppos) { char *eol; size_t n = min_t(size_t, count, sizeof(pipe-buf)); struct pipe *pipe = file-private_data; if (copy_from_user(pipe-buf[pipe-buflen], ubuf, n) return -EFAULT; eol = memchr(buf, '\n', n); if (eol == NULL) return count; *eol = '\0'; // TODO: wait when queue full if (!kfifo_in(pipe-fifo, processLine(buf), 1) return -EFAULT; wake_up_interruptible(pipe-queue); memmove(pipe-buf[0], pipe-buf[n], pipe-buflen-n); } static int read(struct file *file, const char __user *ubuf, size_t count, loff_t *ppos) { struct pipe *pipe = file-private_data; T req; wait_event_interruptible(pipe-queue, kfifo_out(pipe-fifo, req, 1)); process_request(req, pipe-resp, pipe-resp_len); if (count pipe-resp_len) return -EFAULT; // TODO: handle copy to client in parts if (copy_to_user(userbuf, buf, pipe-resp_len)) return -EFAULT; } Usage is: fd = io.FileIO(/debug/mymodule/file, r+) fd.write('req...') print fd.read(100) This is not so robust, for many reasons (look how many bugs are in this small and simple snippet), and some parts need to be repeated for each input type. What I've had in mind, in similar fashion to grpc.io, have the user write a size prefixed protocol buffer object to the file, and similarly read it as a response. Something like: fd = io.FileIO(/debug/mymodule/file, r+) fd.write(myReq.SerializeToString()) len = struct.unpack(i, fd.read(4)) Resp.ParseFromString(fd.read(len)) I believe it is not hard to create a kernel compatible protocol buffer code generator. When you have this in place, you have to write a very simple logic to add a new functionality to the debugfs file. Handler would essentially get pointers to a request struct, and a response struct, and would need to fill out the response struct. Are there similar solutions? What problems might my approach cause? Is there a better idea for this problem altogether? Thanks, ___ Linux-il mailing list Linux-il@cs.huji.ac.il http://mailman.cs.huji.ac.il/mailman/listinfo/linux-il ___ Linux-il mailing list Linux-il@cs.huji.ac.il http://mailman.cs.huji.ac.il/mailman/listinfo/linux-il ___ Linux-il mailing list Linux-il@cs.huji.ac.il http://mailman.cs.huji.ac.il/mailman/listinfo/linux-il
Re: Good design to expose debug info from kernel module
IMHO, C structs are no way near as usable as proper serialization format. For example, what about optional fields? What about variable length array? What about binary backwards compatibility? What about supporting other languages? It's not trivial to take a C struct and generate the proper struct.unpack string for it. Look at the complexity in perf_event_open(2), just parsing the event stream takes a good chunk of code[0], with many potential bugs. Parsing it with protobuf (or one of the other serialization formats) would take three lines or so, would be more efficient, and would be easier to program against, and less prone to bugs, etc. [0] Here is my take, and it's not even complete https://gist.github.com/elazarl/c8404686e71ef0b36cc7 On Fri, Mar 27, 2015 at 12:26 PM, guy keren guy.choo.ke...@gmail.com wrote: i imagine, if you use the proper 'packing' pragmas, you can simply mempcy structures, without really writing serialization code (there's no endianess issues, with both sides running on the same host, by definition). --guy On 03/27/2015 10:03 AM, Elazar Leibovich wrote: Thanks, didn't know netlink. You still need a solution to parse the sent message, where protocol buffers etc, can help. (e.g., binary data into struct mymodule_request). Or am I missing something? On Fri, Mar 27, 2015 at 3:33 AM, guy keren guy.choo.ke...@gmail.com wrote: take a look at this: http://www.linuxfoundation.org/collaborate/workgroups/networking/generic_netlink_howto (link got broken - place it all on a single line) --guy On 03/26/2015 11:36 PM, Elazar Leibovich wrote: Hi, I'm writing a kernel module, and I want to expose some debug information about it. The debug information is often of the form of request-response. For example: - Hey module, what's up with data at 0xe8ff0040c000? - Cached, populated two hours ago. - Hey module, please invalidate data at 0xe8ff0002cb00 - Sure thing. - Hey module, please record all accesses to 0xe8ff0006bbf0. - OK, ask me again for stats-5 ... - Hey module, what's in stats-5? - So far, 41 accesses by 22 users. Now, the question is, what is a good design to expose this information. I think that the most reasonable way to interact with userspace is through a debugfs file. The user would open the debugfs file in read+write mode, would write a request, and accept a response from it. As I see it, there are two fundamental problems needs to be solved: - Parsing the request from the client. - Writing the response in a recognizeable format. A simple solution I first came up with, is to use a ad-hoc request-response format. In my case, request and response are line delimited, request is a hex address, and response is a translated hex address. Here is the relevant snippet. struct pipe { DECLARE_KFIFO(fifo, T, (14)); wait_queue_head_t queue; char buf[100]; int buflen; char resp[100]; int resp_len; }; static DEFINE_MUTEX(mutex); static int open(struct inode *inode, struct file *file) { struct pipe *pipe; if (!(file-f_mode FMODE_READ) || !(file-f_mode FMODE_READ)) { pr_warn(must open with O_RDWR\n); return -EINVAL; } mutex_lock(mutex); pipe = kzalloc(sizeof(*pipe), GFP_KERNEL); INIT_KFIFO(pipe-fifo); init_waitqueue_head(pipe-queue); file-private = pipe; } static int write(struct file *file, const char __user *ubuf, size_t count, loff_t *ppos) { char *eol; size_t n = min_t(size_t, count, sizeof(pipe-buf)); struct pipe *pipe = file-private_data; if (copy_from_user(pipe-buf[pipe-buflen], ubuf, n) return -EFAULT; eol = memchr(buf, '\n', n); if (eol == NULL) return count; *eol = '\0'; // TODO: wait when queue full if (!kfifo_in(pipe-fifo, processLine(buf), 1) return -EFAULT; wake_up_interruptible(pipe-queue); memmove(pipe-buf[0], pipe-buf[n], pipe-buflen-n); } static int read(struct file *file, const char __user *ubuf, size_t count, loff_t *ppos) { struct pipe *pipe = file-private_data; T req; wait_event_interruptible(pipe-queue, kfifo_out(pipe-fifo, req, 1)); process_request(req, pipe-resp, pipe-resp_len); if (count pipe-resp_len) return -EFAULT; // TODO: handle copy to client in parts if (copy_to_user(userbuf, buf, pipe-resp_len)) return -EFAULT; } Usage is: fd = io.FileIO(/debug/mymodule/file, r+) fd.write('req...') print fd.read(100) This is not so robust, for many reasons (look how many bugs are in this small and simple snippet), and some parts need to be repeated for each input type. What I've had in mind, in similar fashion to grpc.io, have the user write a size prefixed protocol buffer object to the file, and similarly read it as a response. Something like: fd =
Re: Good design to expose debug info from kernel module
i imagine, if you use the proper 'packing' pragmas, you can simply mempcy structures, without really writing serialization code (there's no endianess issues, with both sides running on the same host, by definition). --guy On 03/27/2015 10:03 AM, Elazar Leibovich wrote: Thanks, didn't know netlink. You still need a solution to parse the sent message, where protocol buffers etc, can help. (e.g., binary data into struct mymodule_request). Or am I missing something? On Fri, Mar 27, 2015 at 3:33 AM, guy keren guy.choo.ke...@gmail.com wrote: take a look at this: http://www.linuxfoundation.org/collaborate/workgroups/networking/generic_netlink_howto (link got broken - place it all on a single line) --guy On 03/26/2015 11:36 PM, Elazar Leibovich wrote: Hi, I'm writing a kernel module, and I want to expose some debug information about it. The debug information is often of the form of request-response. For example: - Hey module, what's up with data at 0xe8ff0040c000? - Cached, populated two hours ago. - Hey module, please invalidate data at 0xe8ff0002cb00 - Sure thing. - Hey module, please record all accesses to 0xe8ff0006bbf0. - OK, ask me again for stats-5 ... - Hey module, what's in stats-5? - So far, 41 accesses by 22 users. Now, the question is, what is a good design to expose this information. I think that the most reasonable way to interact with userspace is through a debugfs file. The user would open the debugfs file in read+write mode, would write a request, and accept a response from it. As I see it, there are two fundamental problems needs to be solved: - Parsing the request from the client. - Writing the response in a recognizeable format. A simple solution I first came up with, is to use a ad-hoc request-response format. In my case, request and response are line delimited, request is a hex address, and response is a translated hex address. Here is the relevant snippet. struct pipe { DECLARE_KFIFO(fifo, T, (14)); wait_queue_head_t queue; char buf[100]; int buflen; char resp[100]; int resp_len; }; static DEFINE_MUTEX(mutex); static int open(struct inode *inode, struct file *file) { struct pipe *pipe; if (!(file-f_mode FMODE_READ) || !(file-f_mode FMODE_READ)) { pr_warn(must open with O_RDWR\n); return -EINVAL; } mutex_lock(mutex); pipe = kzalloc(sizeof(*pipe), GFP_KERNEL); INIT_KFIFO(pipe-fifo); init_waitqueue_head(pipe-queue); file-private = pipe; } static int write(struct file *file, const char __user *ubuf, size_t count, loff_t *ppos) { char *eol; size_t n = min_t(size_t, count, sizeof(pipe-buf)); struct pipe *pipe = file-private_data; if (copy_from_user(pipe-buf[pipe-buflen], ubuf, n) return -EFAULT; eol = memchr(buf, '\n', n); if (eol == NULL) return count; *eol = '\0'; // TODO: wait when queue full if (!kfifo_in(pipe-fifo, processLine(buf), 1) return -EFAULT; wake_up_interruptible(pipe-queue); memmove(pipe-buf[0], pipe-buf[n], pipe-buflen-n); } static int read(struct file *file, const char __user *ubuf, size_t count, loff_t *ppos) { struct pipe *pipe = file-private_data; T req; wait_event_interruptible(pipe-queue, kfifo_out(pipe-fifo, req, 1)); process_request(req, pipe-resp, pipe-resp_len); if (count pipe-resp_len) return -EFAULT; // TODO: handle copy to client in parts if (copy_to_user(userbuf, buf, pipe-resp_len)) return -EFAULT; } Usage is: fd = io.FileIO(/debug/mymodule/file, r+) fd.write('req...') print fd.read(100) This is not so robust, for many reasons (look how many bugs are in this small and simple snippet), and some parts need to be repeated for each input type. What I've had in mind, in similar fashion to grpc.io, have the user write a size prefixed protocol buffer object to the file, and similarly read it as a response. Something like: fd = io.FileIO(/debug/mymodule/file, r+) fd.write(myReq.SerializeToString()) len = struct.unpack(i, fd.read(4)) Resp.ParseFromString(fd.read(len)) I believe it is not hard to create a kernel compatible protocol buffer code generator. When you have this in place, you have to write a very simple logic to add a new functionality to the debugfs file. Handler would essentially get pointers to a request struct, and a response struct, and would need to fill out the response struct. Are there similar solutions? What problems might my approach cause? Is there a better idea for this problem altogether? Thanks, ___ Linux-il mailing list Linux-il@cs.huji.ac.il http://mailman.cs.huji.ac.il/mailman/listinfo/linux-il ___ Linux-il mailing list Linux-il@cs.huji.ac.il http://mailman.cs.huji.ac.il/mailman/listinfo/linux-il
Re: Good design to expose debug info from kernel module
take a look at this: http://www.linuxfoundation.org/collaborate/workgroups/networking/generic_netlink_howto (link got broken - place it all on a single line) --guy On 03/26/2015 11:36 PM, Elazar Leibovich wrote: Hi, I'm writing a kernel module, and I want to expose some debug information about it. The debug information is often of the form of request-response. For example: - Hey module, what's up with data at 0xe8ff0040c000? - Cached, populated two hours ago. - Hey module, please invalidate data at 0xe8ff0002cb00 - Sure thing. - Hey module, please record all accesses to 0xe8ff0006bbf0. - OK, ask me again for stats-5 ... - Hey module, what's in stats-5? - So far, 41 accesses by 22 users. Now, the question is, what is a good design to expose this information. I think that the most reasonable way to interact with userspace is through a debugfs file. The user would open the debugfs file in read+write mode, would write a request, and accept a response from it. As I see it, there are two fundamental problems needs to be solved: - Parsing the request from the client. - Writing the response in a recognizeable format. A simple solution I first came up with, is to use a ad-hoc request-response format. In my case, request and response are line delimited, request is a hex address, and response is a translated hex address. Here is the relevant snippet. struct pipe { DECLARE_KFIFO(fifo, T, (14)); wait_queue_head_t queue; char buf[100]; int buflen; char resp[100]; int resp_len; }; static DEFINE_MUTEX(mutex); static int open(struct inode *inode, struct file *file) { struct pipe *pipe; if (!(file-f_mode FMODE_READ) || !(file-f_mode FMODE_READ)) { pr_warn(must open with O_RDWR\n); return -EINVAL; } mutex_lock(mutex); pipe = kzalloc(sizeof(*pipe), GFP_KERNEL); INIT_KFIFO(pipe-fifo); init_waitqueue_head(pipe-queue); file-private = pipe; } static int write(struct file *file, const char __user *ubuf, size_t count, loff_t *ppos) { char *eol; size_t n = min_t(size_t, count, sizeof(pipe-buf)); struct pipe *pipe = file-private_data; if (copy_from_user(pipe-buf[pipe-buflen], ubuf, n) return -EFAULT; eol = memchr(buf, '\n', n); if (eol == NULL) return count; *eol = '\0'; // TODO: wait when queue full if (!kfifo_in(pipe-fifo, processLine(buf), 1) return -EFAULT; wake_up_interruptible(pipe-queue); memmove(pipe-buf[0], pipe-buf[n], pipe-buflen-n); } static int read(struct file *file, const char __user *ubuf, size_t count, loff_t *ppos) { struct pipe *pipe = file-private_data; T req; wait_event_interruptible(pipe-queue, kfifo_out(pipe-fifo, req, 1)); process_request(req, pipe-resp, pipe-resp_len); if (count pipe-resp_len) return -EFAULT; // TODO: handle copy to client in parts if (copy_to_user(userbuf, buf, pipe-resp_len)) return -EFAULT; } Usage is: fd = io.FileIO(/debug/mymodule/file, r+) fd.write('req...') print fd.read(100) This is not so robust, for many reasons (look how many bugs are in this small and simple snippet), and some parts need to be repeated for each input type. What I've had in mind, in similar fashion to grpc.io, have the user write a size prefixed protocol buffer object to the file, and similarly read it as a response. Something like: fd = io.FileIO(/debug/mymodule/file, r+) fd.write(myReq.SerializeToString()) len = struct.unpack(i, fd.read(4)) Resp.ParseFromString(fd.read(len)) I believe it is not hard to create a kernel compatible protocol buffer code generator. When you have this in place, you have to write a very simple logic to add a new functionality to the debugfs file. Handler would essentially get pointers to a request struct, and a response struct, and would need to fill out the response struct. Are there similar solutions? What problems might my approach cause? Is there a better idea for this problem altogether? Thanks, ___ Linux-il mailing list Linux-il@cs.huji.ac.il http://mailman.cs.huji.ac.il/mailman/listinfo/linux-il ___ Linux-il mailing list Linux-il@cs.huji.ac.il http://mailman.cs.huji.ac.il/mailman/listinfo/linux-il