Re: RFC: android logger feedback request
I'm back from vacation - sorry for the long delay in responding. First, let me say thanks to all those who have responded with ideas and suggestions. I'll be following up on many of them in the new few weeks. This is a background task for me, so it will likely go relatively slowly. I have volunteer offers of assistance, and some resources available for contract work, but it might take time to set that up. In general, though, things should be able to move forward. I could respond to a lot of messages individually, but to conserve space I'll focus on this one, and present an overall idea for how I plan to proceed. On 12/21/2011 11:05 PM, NeilBrown wrote: Possibly it would be useful to be clear what we all *are* really interested in, because I suspect there is a lot of variety. You appear to be interested in providing a great platform for a phone, in minimising unnecessary churn in that platform, and in having the freedom to optimise various aspects however you see fit. You appear to have little problem with maintaining some components out-of-mainline. This is all perfectly valid and very much in the spirit of freedom that binds us together. Others want to be able to run a main-line kernel underneath the Android user-space - and that is perfectly reasonable as well. I don't much care about either of those, but I want Linux to be high quality (according to my own personal standards of course) which means keeping bad stuff out (though you could argue that the horse has well and truly bolted there) and including good and useful stuff in. And I think Android has something to offer on both sides there :-) Thanks very much for this. I think it *is* important to take a step back and compare goals, to see how we can achieve the best result for all parties. Weighing all that up, I don't think it is useful to set our goal on getting Android to use a mainline kernel - that isn't going to happen. Rather we should focus primarily on making it *possible* to run android user-space on mainline. That could involve two things 1/ Making sure the interfaces that Android uses are abstracted at a suitable level so that when running a mainline kernel you can just slip in an alternate shared library with compatible functionality. 2/ Adding functionality to Linux so that it is possible to provide the functionality that Android needs. Agreed. Android should not *have* to use the interface that the mainline community decides is preferred, but nor should mainline be required to include the drivers that Android wants to use. History shows us that isn't going to happen. I agree with the sentiment in the first sentence, but I would like to make a few observations about this code, and the problem in general, for consideration. I'll do this below because it's a bit philosophical, and I'd rather focus on the way forward first. But if there was a fairly low-level API that Android used, then those in the community could who want Android on a mainline kernel could work to implement that API with whatever mixture of user-space and kernel-space that achieves consensus. Android could of course change to use the community version eventually if they found that was a good thing, or could keep using their own. So: bringing this back to the android logger... What I would like to see is a low-level API which is used to access logging for which alternate implementations could be written. Ideally this would be embodied in a single .so, but we have the source so that doesn't need to be a barrier. Then we could argue to our heart's content about how best to implement that API - Journal and nsyslogd and rsyslogd could all implement it in different ways and we could be one step closer to running Android on a mainline kernel, but the Android developers don't need to be involved (but can if they want to of course). I would be important that the API is specified clearly - neither under specified nor over-specified. That means that the Android implementation would need to explicitly forbid anything that isn't explicit permitted. This is because most testing will happen on the Android platform so it's actual behaviour will become the defacto standard. Could that be a possible way forward? I'd like to pursue this, as well as some of the minor code cleanups suggested by Andrew Morton. Here's what I'm thinking: I'd like to implement Neil's file-system based solution as a test case, and compare that with the existing code. I like the elegance of using existing filesystem semantics, and the removal of some hard-coded limits and policy from the kernel. A lower priority would be to also try a user-space-only solution, as described by Arnd. Optimally, it would be nice to have all three systems to compare (char dev, log fs, and ram fs) against each other. I'll try to use these under the existing logger library API to see if the current semantics
Re: RFC: android logger feedback request
On Fri, Jan 06, 2012 at 12:56:27PM -0800, Tim Bird wrote: Now, having said all that, let me go off into some philosophical weeds... Ah, this should be fun :) This code is only about 700 lines long, and specializes the kernel for an extremely popular (by usage) user space stack. Code of lower quality which specializes the kernel for much less-used hardware has been admitted into the kernel with significantly less fuss than this code has received. That's not an argument to accept the code as is, it's just an observation about the relative hurdle that this code faces compared to lots of other code that goes into the kernel. (And please don't interpret this as dissatisfaction with the feedback received.) If this code were a character driver for an obscure serial port on a lesser-known chip architecture, I don't think it would get any attention at all. As it is, it's looking like at least a few man months of work will be required, as well as some relatively unneeded changes to Android user space, to get this feature into a permanently acceptable state. I wouldn't be surprised to see this stretch into a few calendar years. Code that specializes the kernel in weird ways is accepted into the kernel all the time, and I've tried to figure out why this particular bit of code is treated differently. Especially since this code is self-contained, configurable, and imposes no perceivable long-term maintenance burden. (That's not true of all Android patches, but I believe it's true of this one). I have a few theories: 1) this is not tied to hardware, and as such represents a general feature (but people are not at all required to treat it as such, just as they are not required to use other people's weird drivers). That is exactly true. 2) people want to avoid duplication with other similar features (again, since it's self-contained and configurable, I don't know why it would bother people if this existed in tandem with other solutions - especially since it's so small) Again, very true in the duplication sense. It is general code, not tied to hardware, so it better justify itself somehow. 3) there is really no maintainer for this feature category, so discussions get bogged down as varying requirements and solutions are suggested, which can not easily be compared against each other (especially for non-existent implementations) In particular, it's unclear who I have to get the approval of for this code or some derivative of it to be accepted. That makes the development task a very open-ended one. No, I don't think this is a problem, you have bigger ones to deal with :) 4) this is for a popular use case, as opposed to some minor outlying thing, and so people perceive the need to get it exactly right. In this sense, the code would be a victim of it's own success. It's not the code's success here, it's the this solves a general problem in a very specific way issue, that just happens to be a use case a lot of different people want to see addressed. And since it is only solving it in a narrow way, that's a problem for all of those other people. 5) blocking this is perceived to be a way to accomplish a larger, related goal (if this is true then it has lots of interesting implications for the economics of open source work) No, I don't think that's the issue here. In general, there is a tension between the normal nature of adapting the kernel to the most general use cases, and the specialization that is performed in developing an embedded product. Often times, solutions to embedded requirements are very finely tuned to a particular field of use or situation, and don't lend themselves easily to the type of generalization that mainlining usually requires. Why do people get hung up on the embedded is special case here. Fact, it's not. Do you somehow think that the work done for the HUGE multiprocessor machines 10 years ago would have gotten away with hacks saying, this is a special case, only we care about this, when they were dealing with 4 and 8 way machines? No, they didn't, and as such, there's nothing special with embedded here either. Matter of fact, your machines are now more powerful than those old 10 year old machines, and are really general purpose computers, just used in a single/limited manner, again, just like those original big SMP machines were. So, I've said it many times before, and I'll say it again: Yes, you are special and unique, just like everyone else. The next person who says the embedded is different phrase again, owes me a beer of my choice. Which brings me to my last point and a question. Is it inconceivable for there to be a category of code in the Linux kernel which supports ONLY Android user space, and no other? That is, must every Android patch be generalized in some manner to a broader use case? No, that's fine, I have no problem with a drivers/android/ directory for android only stuff, because: I
Re: RFC: android logger feedback request
On 01/06/2012 01:20 PM, Greg KH wrote: On Fri, Jan 06, 2012 at 12:56:27PM -0800, Tim Bird wrote: 4) this is for a popular use case, as opposed to some minor outlying thing, and so people perceive the need to get it exactly right. In this sense, the code would be a victim of it's own success. It's not the code's success here, it's the this solves a general problem in a very specific way issue, that just happens to be a use case a lot of different people want to see addressed. And since it is only solving it in a narrow way, that's a problem for all of those other people. It's only a problem in the sense that those different people don't get something for free. The presence of an isolated, configurable driver somewhere in the kernel source, that addresses one group's issues and not another's doesn't impose any measurable burden on that other group. It's more of an opportunity cost thing. The opportunity to develop something more general can be reduced when a narrow solution is accepted. But this is exactly related to encouraging people to develop more general solutions, when narrow ones accomplish what they need (see point 5). I understand that the whole model is built on people contributing code that not only addresses their own needs, but the needs of others. But of course there should be balance. For example I have concerns about integrating this into the printk code path, because I think that code path is complex enough as it is, and that combining this functionality with that is likely to create more maintenance problems rather than less in the long run. (But I'll still plan to look at this option ...) Sometimes, features should just remain separate. 5) blocking this is perceived to be a way to accomplish a larger, related goal (if this is true then it has lots of interesting implications for the economics of open source work) No, I don't think that's the issue here. I don't think anyone is thinking: If I block this I can get Tim to do x. But there *is* a desire to create a more general logging solution. And not accepting this as is, is a way to encourage people to work on that. For some situations, I would object to that. But in this case I actually agree that there are more potential users of this, and it would be nice to solve more problems than just the Android case, with a consolidated code base. That's why I'm willing to invest in researching other loggers and doing some experimental development. In general, there is a tension between the normal nature of adapting the kernel to the most general use cases, and the specialization that is performed in developing an embedded product. Often times, solutions to embedded requirements are very finely tuned to a particular field of use or situation, and don't lend themselves easily to the type of generalization that mainlining usually requires. I should have left off embedded product for this thread. There is a tension in general between generalization and specialization. And that tension shows up in a lot of discussions about mainlining code. My goal here is to determine where people think the balance lies for this particular code. Why do people get hung up on the embedded is special case here. Fact, it's not. Do you somehow think that the work done for the HUGE multiprocessor machines 10 years ago would have gotten away with hacks saying, this is a special case, only we care about this, when they were dealing with 4 and 8 way machines? No, they didn't, and as such, there's nothing special with embedded here either. Matter of fact, your machines are now more powerful than those old 10 year old machines, and are really general purpose computers, just used in a single/limited manner, again, just like those original big SMP machines were. So, I've said it many times before, and I'll say it again: Yes, you are special and unique, just like everyone else. The next person who says the embedded is different phrase again, owes me a beer of my choice. Consider this your free beer token... :-) I believe embedded is different because we throw away so much code, that I just don't think desktop and enterprise do. The number of non-mainlined BSPs that have gone straight from product release to bit bucket is truly staggering. You're only looking at the Android case, where the machines are as powerful as they were 10 years ago. Broad swaths of embedded are not on that kind of hardware. It's different (there it is again, is that two beers?) being at the bottom side of the scalability spectrum rather than at the top. Everyone expected that eventually there would be more SMP machines in the world. This is not true of all embedded processors. We're used to developing code that we'll throw away after a single release. But I think this is beside the point for this particular code. I agree that it's worth some effort to try to generalize this logger code. Indeed, when I first saw the
Re: RFC: android logger feedback request
On Fri, Jan 06, 2012 at 02:41:52PM -0800, Tim Bird wrote: On 01/06/2012 01:20 PM, Greg KH wrote: On Fri, Jan 06, 2012 at 12:56:27PM -0800, Tim Bird wrote: 4) this is for a popular use case, as opposed to some minor outlying thing, and so people perceive the need to get it exactly right. In this sense, the code would be a victim of it's own success. It's not the code's success here, it's the this solves a general problem in a very specific way issue, that just happens to be a use case a lot of different people want to see addressed. And since it is only solving it in a narrow way, that's a problem for all of those other people. It's only a problem in the sense that those different people don't get something for free. The presence of an isolated, configurable driver somewhere in the kernel source, that addresses one group's issues and not another's doesn't impose any measurable burden on that other group. That's been proven incorrect over the long-run as we all must now maintain that user/kernel API for forever. You could say the same thing for adding new ioctls for individual drivers instead of creating a new syscall. Sometimes, in rare cases, it does make sense to create a new ioctl. But that decision has to be taken very carefully, as again, it needs to be standardized with all other drivers of that time, and maintained for forever. It's more of an opportunity cost thing. The opportunity to develop something more general can be reduced when a narrow solution is accepted. But this is exactly related to encouraging people to develop more general solutions, when narrow ones accomplish what they need (see point 5). When people work together on their narrow goals, they end up creating something that works for way more than just themselves, and in fact, it turns out it works better for people who originally wasn't even aware of the issues involved. Again, this has been proven over time with the Linux kernel development process. I wrote a whole chapter on this very topic in the book, Beautiful Code, if you want to hear more about it. In fact, I would argue, that is one of the top reasons why Linux has succeeded so well. Again, you really aren't unique, others will have the same issues/problems that you do, and if you solve them in a case that works for more than one, it makes everything better. I understand that the whole model is built on people contributing code that not only addresses their own needs, but the needs of others. But of course there should be balance. For example I have concerns about integrating this into the printk code path, because I think that code path is complex enough as it is, and that combining this functionality with that is likely to create more maintenance problems rather than less in the long run. (But I'll still plan to look at this option ...) Complex code paths are tough, I will grant you that. But if it was easy, then, you wouldn't be the person for the job :) Sometimes, features should just remain separate. When they do different things, of course. When they do the same thing, of course not. Why do people get hung up on the embedded is special case here. Fact, it's not. Do you somehow think that the work done for the HUGE multiprocessor machines 10 years ago would have gotten away with hacks saying, this is a special case, only we care about this, when they were dealing with 4 and 8 way machines? No, they didn't, and as such, there's nothing special with embedded here either. Matter of fact, your machines are now more powerful than those old 10 year old machines, and are really general purpose computers, just used in a single/limited manner, again, just like those original big SMP machines were. So, I've said it many times before, and I'll say it again: Yes, you are special and unique, just like everyone else. The next person who says the embedded is different phrase again, owes me a beer of my choice. Consider this your free beer token... :-) Oooh, nice, I'll take you up on that at ELC in a few weeks :) I believe embedded is different because we throw away so much code, that I just don't think desktop and enterprise do. The number of non-mainlined BSPs that have gone straight from product release to bit bucket is truly staggering. Yes, and that's sad. And maybe, if embedded feels that they are so special, and that contributing to mainline doesn't help or matter, then they shouldn't. Yes, I mean it, why push stuff upstream if you aren't going to maintain it and no one will ever build on top of it. Oh wait, you will end up building on it with your next iteration, so if you throw it away, oops, you just wasted time and money. And again, this mirrors the super computer case EXACTLY! In fact, I would argue that there are less machines shipped for those types of systems, than embedded, and do you see those developers saying that they should just keep their
Re: RFC: android logger feedback request
On Thu, Dec 29, 2011 at 9:39 AM, Andrew Morton a...@linux-foundation.org wrote: On Wed, 21 Dec 2011 14:59:15 -0800 Tim Bird tim.b...@am.sony.com wrote: Hi all, I'm looking for feedback on the Android logger code, The code looks nice. ... +/* logger_offset - returns index 'n' into the log via (optimized) modulus */ +#define logger_offset(n) ((n) (log-size - 1)) + This macro depends upon the presence of a local variable called log and is therefore all stinky. Pass log in as an arg and convert it to a regular C function. ... +/* + * get_entry_len - Grabs the length of the payload of the next entry starting + * from 'off'. + * + * Caller needs to hold log-mutex. + */ +static __u32 get_entry_len(struct logger_log *log, size_t off) +{ + __u16 val; + + switch (log-size - off) { + case 1: + memcpy(val, log-buffer + off, 1); + memcpy(((char *) val) + 1, log-buffer, 1); So numbers in the buffer are in host endian order. That's worth capturing in a comment somewhere. There must be a way of doing the above more neatly, using log-buffer[off] and log-buffer[0]. Perhaps using include/linux/unaligned/packed_struct.h. + break; + default: + memcpy(val, log-buffer + off, 2); get_unaligned() + } + + return sizeof(struct logger_entry) + val; +} + ... +static ssize_t logger_read(struct file *file, char __user *buf, + size_t count, loff_t *pos) +{ + struct logger_reader *reader = file-private_data; + struct logger_log *log = reader-log; + ssize_t ret; + DEFINE_WAIT(wait); + +start: + while (1) { + prepare_to_wait(log-wq, wait, TASK_INTERRUPTIBLE); + + mutex_lock(log-mutex); If mutex_lock() had to wait for the mutex, it will return in state TASK_RUNNING, thus rubbing out the effects of prepare_to_wait(). We'll just loop again so this is a benign buglet. Could we lose a wakeup by running prepaer_to_wait() outside the lock? I don't think so, but I stopped thinking about it when I saw the above bug. These two lines should be switched around. + ret = (log-w_off == reader-r_off); + mutex_unlock(log-mutex); + if (!ret) + break; + + if (file-f_flags O_NONBLOCK) { + ret = -EAGAIN; + break; + } + + if (signal_pending(current)) { + ret = -EINTR; + break; + } + + schedule(); + } + + finish_wait(log-wq, wait); + if (ret) + return ret; + + mutex_lock(log-mutex); + + /* is there still something to read or did we race? */ + if (unlikely(log-w_off == reader-r_off)) { + mutex_unlock(log-mutex); + goto start; + } + + /* get the size of the next entry */ + ret = get_entry_len(log, reader-r_off); + if (count ret) { + ret = -EINVAL; + goto out; + } + + /* get exactly one entry from the log */ + ret = do_read_log_to_user(log, reader, buf, ret); + +out: + mutex_unlock(log-mutex); + + return ret; +} + ... +/* + * clock_interval - is a c b in mod-space? Put another way, does the line + * from a to b cross c? + */ This is where my little brain started to hurt. I assume it's correct ;) +static inline int clock_interval(size_t a, size_t b, size_t c) +{ + if (b a) { + if (a c || b = c) + return 1; + } else { + if (a c b = c) + return 1; + } + + return 0; +} The explicit inline(s) are increaseingly old-fashioned. gcc is good about this now. ... +static ssize_t do_write_log_from_user(struct logger_log *log, + const void __user *buf, size_t count) +{ + size_t len; + + len = min(count, log-size - log-w_off); + if (len copy_from_user(log-buffer + log-w_off, buf, len)) + return -EFAULT; + + if (count != len) + if (copy_from_user(log-buffer, buf + len, count - len)) + return -EFAULT; Is it correct to simply return here without updating -w_off? We've just copied len bytes in from userspace. + log-w_off = logger_offset(log-w_off + count); + + return count; +} + +/* + * logger_aio_write - our write method, implementing support for write(), + * writev(), and aio_write(). Writes are our fast path, and we try to optimize + * them above all else. + */ +ssize_t logger_aio_write(struct kiocb *iocb, const struct iovec *iov, + unsigned long nr_segs, loff_t ppos) +{ + struct logger_log *log = file_get_log(iocb-ki_filp); + size_t orig = log-w_off; + struct logger_entry header; + struct timespec now; +
Re: RFC: android logger feedback request
On Wed, 21 Dec 2011 14:59:15 -0800 Tim Bird tim.b...@am.sony.com wrote: Hi all, I'm looking for feedback on the Android logger code, The code looks nice. ... +/* logger_offset - returns index 'n' into the log via (optimized) modulus */ +#define logger_offset(n) ((n) (log-size - 1)) + This macro depends upon the presence of a local variable called log and is therefore all stinky. Pass log in as an arg and convert it to a regular C function. ... +/* + * get_entry_len - Grabs the length of the payload of the next entry starting + * from 'off'. + * + * Caller needs to hold log-mutex. + */ +static __u32 get_entry_len(struct logger_log *log, size_t off) +{ + __u16 val; + + switch (log-size - off) { + case 1: + memcpy(val, log-buffer + off, 1); + memcpy(((char *) val) + 1, log-buffer, 1); So numbers in the buffer are in host endian order. That's worth capturing in a comment somewhere. There must be a way of doing the above more neatly, using log-buffer[off] and log-buffer[0]. Perhaps using include/linux/unaligned/packed_struct.h. + break; + default: + memcpy(val, log-buffer + off, 2); get_unaligned() + } + + return sizeof(struct logger_entry) + val; +} + ... +static ssize_t logger_read(struct file *file, char __user *buf, +size_t count, loff_t *pos) +{ + struct logger_reader *reader = file-private_data; + struct logger_log *log = reader-log; + ssize_t ret; + DEFINE_WAIT(wait); + +start: + while (1) { + prepare_to_wait(log-wq, wait, TASK_INTERRUPTIBLE); + + mutex_lock(log-mutex); If mutex_lock() had to wait for the mutex, it will return in state TASK_RUNNING, thus rubbing out the effects of prepare_to_wait(). We'll just loop again so this is a benign buglet. Could we lose a wakeup by running prepaer_to_wait() outside the lock? I don't think so, but I stopped thinking about it when I saw the above bug. These two lines should be switched around. + ret = (log-w_off == reader-r_off); + mutex_unlock(log-mutex); + if (!ret) + break; + + if (file-f_flags O_NONBLOCK) { + ret = -EAGAIN; + break; + } + + if (signal_pending(current)) { + ret = -EINTR; + break; + } + + schedule(); + } + + finish_wait(log-wq, wait); + if (ret) + return ret; + + mutex_lock(log-mutex); + + /* is there still something to read or did we race? */ + if (unlikely(log-w_off == reader-r_off)) { + mutex_unlock(log-mutex); + goto start; + } + + /* get the size of the next entry */ + ret = get_entry_len(log, reader-r_off); + if (count ret) { + ret = -EINVAL; + goto out; + } + + /* get exactly one entry from the log */ + ret = do_read_log_to_user(log, reader, buf, ret); + +out: + mutex_unlock(log-mutex); + + return ret; +} + ... +/* + * clock_interval - is a c b in mod-space? Put another way, does the line + * from a to b cross c? + */ This is where my little brain started to hurt. I assume it's correct ;) +static inline int clock_interval(size_t a, size_t b, size_t c) +{ + if (b a) { + if (a c || b = c) + return 1; + } else { + if (a c b = c) + return 1; + } + + return 0; +} The explicit inline(s) are increaseingly old-fashioned. gcc is good about this now. ... +static ssize_t do_write_log_from_user(struct logger_log *log, + const void __user *buf, size_t count) +{ + size_t len; + + len = min(count, log-size - log-w_off); + if (len copy_from_user(log-buffer + log-w_off, buf, len)) + return -EFAULT; + + if (count != len) + if (copy_from_user(log-buffer, buf + len, count - len)) + return -EFAULT; Is it correct to simply return here without updating -w_off? We've just copied len bytes in from userspace. + log-w_off = logger_offset(log-w_off + count); + + return count; +} + +/* + * logger_aio_write - our write method, implementing support for write(), + * writev(), and aio_write(). Writes are our fast path, and we try to optimize + * them above all else. + */ +ssize_t logger_aio_write(struct kiocb *iocb, const struct iovec *iov, + unsigned long nr_segs, loff_t ppos) +{ + struct logger_log *log = file_get_log(iocb-ki_filp); + size_t orig = log-w_off; + struct logger_entry header; + struct timespec now; + ssize_t ret = 0; + + now = current_kernel_time(); + + header.pid = current-tgid; +
Re: RFC: android logger feedback request
I have no problem leaving the logger driver in staging, but it seems that Tim is taking on the task to do the harder thing here, which probably would entail work on both sides, which as a openhandset alliance company member, he might have a change that someone like me might not :) I'd like to see it in staging. There is a general discussion going on about how such logging services should look (and indeed it goes back before Android bits), and it looks like a good way to proceed is going to be to play with the Android one - perhaps adding the fs model to it. So we need something and the Android interface at the very least is a prototype that has production proof of working. Alan -- To unsubscribe from this list: send the line unsubscribe linux-embedded in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: RFC: android logger feedback request
On Fri, Dec 23, 2011 at 03:22:31PM +, Alan Cox wrote: I have no problem leaving the logger driver in staging, but it seems that Tim is taking on the task to do the harder thing here, which probably would entail work on both sides, which as a openhandset alliance company member, he might have a change that someone like me might not :) I'd like to see it in staging. There is a general discussion going on about how such logging services should look (and indeed it goes back before Android bits), and it looks like a good way to proceed is going to be to play with the Android one - perhaps adding the fs model to it. So we need something and the Android interface at the very least is a prototype that has production proof of working. It is already in staging, look in the staging-next branch of linux-next. thanks, greg k-h -- To unsubscribe from this list: send the line unsubscribe linux-embedded in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: RFC: android logger feedback request
On Thursday 22 December 2011, NeilBrown wrote: If you created a 'logbuf' filesystem that used libfs to provide a single directory in which privileged processes could create files then you wouldn't need the kernel to know the allowed logs: radio, events, main, system. The size could be set by ftruncate() (by privileged used again) rather than being hardcoded. You would defined 'read' and 'write' much like you currently do to create a list of datagrams in a circular buffer and replace the ioctls by more standard interfaces: LOGGER_GET_LOG_BUG_SIZE would use 'stat' and the st_blocks field LOGGER_GET_LOG_LEN would use 'stat' and the st_size field LOGGER_GET_NEXT_ENTRY_LEN could use the FIONREAD ioctl LOGGER_FLUSH_LOG could use ftruncate The result would be much the same amount of code, but an interface which has fewer details hard-coded and is generally more versatile and accessible. I like the idea and was going to suggest something very similar, but I wonder if we could take the approach even further: * Remove all kernel code for this and use a user space library together with tmpfs * prepopulate the tmpfs at boot time with all the log buffers in the right size, and set the maximum file system size so that they cannot grow further. * Have minimal formatting in the log buffer: A few bytes header (ring buffer start and end) * Mandate that user space must use mmap and atomic operations to reserve space in the log and write to the files. * Provide a tool to get the log data out of the buffer again in a race-free way. Since any program that is allowed to write to the buffer can overwrite all existing information in it anyway, I think we don't actually need any kernel help in maintaining consistency of the contents either -- the reader will simply discard any data. The main thing we would not be able to guarantee without kernel help is proving the origin of individual messages, but I'm not sure if that is a design goal. Arnd -- To unsubscribe from this list: send the line unsubscribe linux-embedded in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: RFC: android logger feedback request
On Thu, Dec 22, 2011 at 15:59, Arnd Bergmann a...@arndb.de wrote: * Remove all kernel code for this and use a user space library together with tmpfs * prepopulate the tmpfs at boot time with all the log buffers in the right size, and set the maximum file system size so that they cannot grow further. * Have minimal formatting in the log buffer: A few bytes header (ring buffer start and end) * Mandate that user space must use mmap and atomic operations to reserve space in the log and write to the files. * Provide a tool to get the log data out of the buffer again in a race-free way. Since any program that is allowed to write to the buffer can overwrite all existing information in it anyway, I think we don't actually need any kernel help in maintaining consistency of the contents either -- the reader will simply discard any data. The main thing we would not be able to guarantee without kernel help is proving the origin of individual messages, but I'm not sure if that is a design goal. I'm very sure you want rate limiting, global sequence numbers and trusted properties added to the log. I can imagine that the kernel provides a single character device, instantiates a tmpfs mountpoint and splits all stuff that travels into that device into several files (ring buffers) in the tmpfs filesystem. The files are probably just split by the uid of the incoming message sender, which maps to an app at Android. Only the character device would be able to write to the file, but the user with the matching uid can read its own file back directly from tmpfs. I'm very convinced that we should not design a logging system today, where we give the ownership of the data to the actual untrusted process that logs. The current kmsg buffer could just be one additional file in this tmpfs mount. Kay -- To unsubscribe from this list: send the line unsubscribe linux-embedded in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RFC: android logger feedback request
Hi all, I'm looking for feedback on the Android logger code, to see what it would take to make this code acceptable for inclusion in the mainline kernel. Information about the features of Android logging system can be found at: http://elinux.org/Android_Logging_System This system creates a new system-wide logging service, in the kernel, for user-space message. It is more comparable to syslog than to the kernel log buffer, as it holds only user-space messages. It is optimized for write performance, since most of the time the log is written to and never read. It creates multiple log channels, to prevent an abundance of log messages in one channel from overwriting messages in another channel. The log channels have sizes fixed at kernel compile-time. Log messages are stored in very simple in-kernel buffers, that overflow old messages upon wrapping. A fixed set of attributes (pid, tid, timestamp and message), is kept for each message. By convention, Android puts a message priority and context tag into each message. In Android, this system uses a fixed set of device nodes with well-known names: /dev/log/main, /dev/log/events, /dev/log/radio and /dev/log/system. Operations on the log are done via a character device, using standard file operations and some ioctls. The code for this is below (I've moved it from linux-next drivers/staging/android for my own testing). Please let me know what issues you see with this code. One specific question I have is where is the most appropriate place for this code to live, in the kernel source tree? Other embedded systems might want to use this system (it is simpler than syslog, and superior in some ways), so I don't think it should remain in an android-specific directory. Thanks, -- Tim P.S. Sorry to do this right before the holidays. I'll be away from my work machine for most of the next few weeks. I can answer informational questions, and gather feedback, but I won't have a lot of time for testing new code until after the New Year. Here's the code: diff --git a/drivers/misc/logger.c b/drivers/misc/logger.c new file mode 100644 index 000..fa76ce7 --- /dev/null +++ b/drivers/misc/logger.c @@ -0,0 +1,616 @@ +/* + * drivers/misc/logger.c + * + * A Logging Subsystem + * + * Copyright (C) 2007-2008 Google, Inc. + * + * Robert Love rl...@google.com + * + * This software is licensed under the terms of the GNU General Public + * License version 2, as published by the Free Software Foundation, and + * may be copied, distributed, and modified under those terms. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + */ + +#include linux/sched.h +#include linux/module.h +#include linux/fs.h +#include linux/miscdevice.h +#include linux/uaccess.h +#include linux/poll.h +#include linux/slab.h +#include linux/time.h +#include logger.h + +#include asm/ioctls.h + +/* + * struct logger_log - represents a specific log, such as 'main' or 'radio' + * + * This structure lives from module insertion until module removal, so it does + * not need additional reference counting. The structure is protected by the + * mutex 'mutex'. + */ +struct logger_log { + unsigned char *buffer;/* the ring buffer itself */ + struct miscdevice misc; /* misc device representing the log */ + wait_queue_head_t wq; /* wait queue for readers */ + struct list_headreaders; /* this log's readers */ + struct mutexmutex; /* mutex protecting buffer */ + size_t w_off; /* current write head offset */ + size_t head; /* new readers start here */ + size_t size; /* size of the log */ +}; + +/* + * struct logger_reader - a logging device open for reading + * + * This object lives from open to release, so we don't need additional + * reference counting. The structure is protected by log-mutex. + */ +struct logger_reader { + struct logger_log *log; /* associated log */ + struct list_headlist; /* entry in logger_log's list */ + size_t r_off; /* current read head offset */ +}; + +/* logger_offset - returns index 'n' into the log via (optimized) modulus */ +#define logger_offset(n) ((n) (log-size - 1)) + +/* + * file_get_log - Given a file structure, return the associated log + * + * This isn't aesthetic. We have several goals: + * + * 1) Need to quickly obtain the associated log during an I/O operation + * 2) Readers need to maintain state (logger_reader) + * 3) Writers need to be very fast (open() should be a near no-op) + * + * In the reader case, we can trivially go file-logger_reader-logger_log. + * For a writer, we don't want to maintain a logger_reader, so we just go + * file-logger_log. Thus
Re: RFC: android logger feedback request
On Wed, 2011-12-21 at 15:19 -0800, Greg KH wrote: On Wed, Dec 21, 2011 at 02:59:15PM -0800, Tim Bird wrote: Hi all, I'm looking for feedback on the Android logger code, to see what it would take to make this code acceptable for inclusion in the mainline kernel. Information about the features of Android logging system can be found at: http://elinux.org/Android_Logging_System This system creates a new system-wide logging service, in the kernel, for user-space message. It is more comparable to syslog than to the kernel log buffer, as it holds only user-space messages. It is optimized for write performance, since most of the time the log is written to and never read. It creates multiple log channels, to prevent an abundance of log messages in one channel from overwriting messages in another channel. The log channels have sizes fixed at kernel compile-time. Log messages are stored in very simple in-kernel buffers, that overflow old messages upon wrapping. A fixed set of attributes (pid, tid, timestamp and message), is kept for each message. By convention, Android puts a message priority and context tag into each message. In Android, this system uses a fixed set of device nodes with well-known names: /dev/log/main, /dev/log/events, /dev/log/radio and /dev/log/system. Operations on the log are done via a character device, using standard file operations and some ioctls. The code for this is below (I've moved it from linux-next drivers/staging/android for my own testing). Please let me know what issues you see with this code. That all describes the current code, but you haven't described what's wrong with the existing syslog interface that requires this new driver to be written. And why can't the existing interface be fixed to address these (potential) shortcomings? One specific question I have is where is the most appropriate place for this code to live, in the kernel source tree? Other embedded systems might want to use this system (it is simpler than syslog, and superior in some ways), so I don't think it should remain in an android-specific directory. What way is it superior? Again, why not extend syslog? Why not fix syslog if this really is a superior thing? How does this tie into Kay and Lennard's proposal for work in this area? There is also some overlap functionality wise with pstore as well, as I believe the logger is used as a known location in memory where messages can be fetched from after a kernel panic or crash. thanks -john -- To unsubscribe from this list: send the line unsubscribe linux-embedded in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: RFC: android logger feedback request
On Wed, Dec 21, 2011 at 04:18:11PM -0800, john stultz wrote: On Wed, 2011-12-21 at 15:19 -0800, Greg KH wrote: On Wed, Dec 21, 2011 at 02:59:15PM -0800, Tim Bird wrote: Hi all, I'm looking for feedback on the Android logger code, to see what it would take to make this code acceptable for inclusion in the mainline kernel. Information about the features of Android logging system can be found at: http://elinux.org/Android_Logging_System This system creates a new system-wide logging service, in the kernel, for user-space message. It is more comparable to syslog than to the kernel log buffer, as it holds only user-space messages. It is optimized for write performance, since most of the time the log is written to and never read. It creates multiple log channels, to prevent an abundance of log messages in one channel from overwriting messages in another channel. The log channels have sizes fixed at kernel compile-time. Log messages are stored in very simple in-kernel buffers, that overflow old messages upon wrapping. A fixed set of attributes (pid, tid, timestamp and message), is kept for each message. By convention, Android puts a message priority and context tag into each message. In Android, this system uses a fixed set of device nodes with well-known names: /dev/log/main, /dev/log/events, /dev/log/radio and /dev/log/system. Operations on the log are done via a character device, using standard file operations and some ioctls. The code for this is below (I've moved it from linux-next drivers/staging/android for my own testing). Please let me know what issues you see with this code. That all describes the current code, but you haven't described what's wrong with the existing syslog interface that requires this new driver to be written. And why can't the existing interface be fixed to address these (potential) shortcomings? One specific question I have is where is the most appropriate place for this code to live, in the kernel source tree? Other embedded systems might want to use this system (it is simpler than syslog, and superior in some ways), so I don't think it should remain in an android-specific directory. What way is it superior? Again, why not extend syslog? Why not fix syslog if this really is a superior thing? How does this tie into Kay and Lennard's proposal for work in this area? There is also some overlap functionality wise with pstore as well, as I believe the logger is used as a known location in memory where messages can be fetched from after a kernel panic or crash. That sounds like a major overlap, can't pstore be used for this functionality today instead of the logger code? greg k-h -- To unsubscribe from this list: send the line unsubscribe linux-embedded in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: RFC: android logger feedback request
On Wed, 2011-12-21 at 16:27 -0800, Greg KH wrote: On Wed, Dec 21, 2011 at 04:18:11PM -0800, john stultz wrote: On Wed, 2011-12-21 at 15:19 -0800, Greg KH wrote: What way is it superior? Again, why not extend syslog? Why not fix syslog if this really is a superior thing? How does this tie into Kay and Lennard's proposal for work in this area? There is also some overlap functionality wise with pstore as well, as I believe the logger is used as a known location in memory where messages can be fetched from after a kernel panic or crash. That sounds like a major overlap, can't pstore be used for this functionality today instead of the logger code? It may be able to suffice for some portion of the functionality, it was suggested that we consider doing something like try to make logger use pstore as a backend. I'm not really familiar with pstore, but I don't know if its able to preserve messages from userland, so it may not really be sufficient in of itself. Also, if I understand correctly, logger also is able to store messages from hardware modems or other hardware that run their own micro-os in a central location. This may be closer to the pstore mce bits, but its not clear yet to me how much overlap is there. But I've not yet really spent too much time looking at logger yet, so I can't really comment too strongly here. Just wanted to make sure folks have looked at all the other similar work. thanks -john -- To unsubscribe from this list: send the line unsubscribe linux-embedded in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: RFC: android logger feedback request
On Wed, 21 Dec 2011 16:36:21 -0800 Tim Bird tim.b...@am.sony.com wrote: On 12/21/2011 03:19 PM, Greg KH wrote: That all describes the current code, but you haven't described what's wrong with the existing syslog interface that requires this new driver to be written. And why can't the existing interface be fixed to address these (potential) shortcomings? One specific question I have is where is the most appropriate place for this code to live, in the kernel source tree? Other embedded systems might want to use this system (it is simpler than syslog, and superior in some ways), so I don't think it should remain in an android-specific directory. What way is it superior? Here are some ways that this code is superior to syslog: It is certainly nice and simple. It really looks more like a filesystem than a char device though... though they aren't really files so much as lossy pipes. I don't think that's a problem though, lots of things in filesystems don't behave exactly like files. If you created a 'logbuf' filesystem that used libfs to provide a single directory in which privileged processes could create files then you wouldn't need the kernel to know the allowed logs: radio, events, main, system. The size could be set by ftruncate() (by privileged used again) rather than being hardcoded. You would defined 'read' and 'write' much like you currently do to create a list of datagrams in a circular buffer and replace the ioctls by more standard interfaces: LOGGER_GET_LOG_BUG_SIZE would use 'stat' and the st_blocks field LOGGER_GET_LOG_LEN would use 'stat' and the st_size field LOGGER_GET_NEXT_ENTRY_LEN could use the FIONREAD ioctl LOGGER_FLUSH_LOG could use ftruncate The result would be much the same amount of code, but an interface which has fewer details hard-coded and is generally more versatile and accessible. NeilBrown signature.asc Description: PGP signature
Re: RFC: android logger feedback request
On 12/21/2011 04:51 PM, Greg KH wrote: On Wed, Dec 21, 2011 at 04:36:21PM -0800, Tim Bird wrote: On 12/21/2011 03:19 PM, Greg KH wrote: Huh, I'm not talking about syslogd, I'm talking about the syslog(2) syscall we have. OK - switching gears. Since the kernel log buffer isn't normally used to store use-space messages, I thought you were referring to syslog(3) and the associated (logger(1) and syslogd(8)). This character interface seems very close to the syslog(2) api, but just done in a character interface, with ioctls, which also require userspace tools to manage properly, so I fail to see the big gain here. What am I missing? syslog(2) would more aptly be named klogctrl() (and it is in glibc) There's currently no operation in sys_sylog (the kernel function implementing syslog(2)) for writing to the log. The write operation to the kernel log buffer is also done via a character interface /dev/kmsg (via code in drivers/char/mem.c) This is actually very similar to what the Android logger code does. But while the kernel log buffer has lots of similarities to the Android logger there are some key differences which I think are important to isolate from a user-space logging system. Here's a stream-of-consciousness dump of the differences: The printk interface in the kernel is almost always automatically drained to the device console, at the time of the printk (after the message is dropped into the log buffer itself). This extra operation is not needed for most application-level messages that go into the log, and incurs extra overhead in the log buffer code. The printk code is especially designed to be called from within any kernel context (including interrupt code), and so has some locking avoidance code paths and complexity that are not needed for code which handles strictly user-space messages. Oddly enough, the printk code paths in the kernel can end up doing a fair amount of print formatting, which can be time-consuming. The code path in kmsg_writev() contains at least one kmalloc, which could fail when running out of memory. The code path in the logger is much simpler, consisting really of only a data copy. Timestamping is not automatically appended to messages going into the kernel log buffer (but they can be optionally pre-pended, with control configurable at runtime). They are represented as ASCII text, which consumes a little more than twice the overhead of a 32-bit binary field. PID and TID are not automatically preserved in the log. The kernel keeps it's priority in text also, and has no convention for contextual tagging. I'm not sure that we should change the kernel log buffer to support structured binary data, in addition to the free-form ASCII data that the kernel uses now. The kernel log buffer does not support separate channels for different classes of log messages (indeed, there is only one channel, and it has kernel messages). A new system call (or some backwards-compatible tweak to the existing syslog(2) call would have to be added to support a channel ID in order to support this. There *are* some benefits to intermingling the kernel log messages and the user-space log messages, but I think they are outweighed by the value in keeping these systems separate. There might be the opportunity for code reuse, but I suspect we'd end up with about the same amount of code increase overall (and possibly an additional syscall), and add some unneeded complexity to the prink code path to accomplish it. I just read Neil Brown's suggestion for doing this via a filesystem rather than a char device, and it's interesting. I'll respond to that separately. -- Tim = Tim Bird Architecture Group Chair, CE Workgroup of the Linux Foundation Senior Staff Engineer, Sony Network Entertainment = -- To unsubscribe from this list: send the line unsubscribe linux-embedded in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: RFC: android logger feedback request
On Wed, Dec 21, 2011 at 05:32:36PM -0800, Tim Bird wrote: On 12/21/2011 04:51 PM, Greg KH wrote: On Wed, Dec 21, 2011 at 04:36:21PM -0800, Tim Bird wrote: On 12/21/2011 03:19 PM, Greg KH wrote: Huh, I'm not talking about syslogd, I'm talking about the syslog(2) syscall we have. OK - switching gears. Since the kernel log buffer isn't normally used to store use-space messages, I thought you were referring to syslog(3) and the associated (logger(1) and syslogd(8)). The kernel log buffer has been storing userspace messages for a while now, look at your boot log on the latest Fedora and openSUSE releases (or any other distro using systemd for booting). This character interface seems very close to the syslog(2) api, but just done in a character interface, with ioctls, which also require userspace tools to manage properly, so I fail to see the big gain here. What am I missing? syslog(2) would more aptly be named klogctrl() (and it is in glibc) Maybe, but this is a kernel mailing list, we don't worry about glibc much here :) There's currently no operation in sys_sylog (the kernel function implementing syslog(2)) for writing to the log. The write operation to the kernel log buffer is also done via a character interface /dev/kmsg (via code in drivers/char/mem.c) This is actually very similar to what the Android logger code does. Again, see above, this has been in the kernel for quite a while now... But while the kernel log buffer has lots of similarities to the Android logger there are some key differences which I think are important to isolate from a user-space logging system. Here's a stream-of-consciousness dump of the differences: The printk interface in the kernel is almost always automatically drained to the device console, at the time of the printk (after the message is dropped into the log buffer itself). This extra operation is not needed for most application-level messages that go into the log, and incurs extra overhead in the log buffer code. The printk code is especially designed to be called from within any kernel context (including interrupt code), and so has some locking avoidance code paths and complexity that are not needed for code which handles strictly user-space messages. Oddly enough, the printk code paths in the kernel can end up doing a fair amount of print formatting, which can be time-consuming. The code path in kmsg_writev() contains at least one kmalloc, which could fail when running out of memory. The code path in the logger is much simpler, consisting really of only a data copy. Timestamping is not automatically appended to messages going into the kernel log buffer (but they can be optionally pre-pended, with control configurable at runtime). They are represented as ASCII text, which consumes a little more than twice the overhead of a 32-bit binary field. PID and TID are not automatically preserved in the log. The kernel keeps it's priority in text also, and has no convention for contextual tagging. I'm not sure that we should change the kernel log buffer to support structured binary data, in addition to the free-form ASCII data that the kernel uses now. The kernel log buffer does not support separate channels for different classes of log messages (indeed, there is only one channel, and it has kernel messages). A new system call (or some backwards-compatible tweak to the existing syslog(2) call would have to be added to support a channel ID in order to support this. There *are* some benefits to intermingling the kernel log messages and the user-space log messages, but I think they are outweighed by the value in keeping these systems separate. There might be the opportunity for code reuse, but I suspect we'd end up with about the same amount of code increase overall (and possibly an additional syscall), and add some unneeded complexity to the prink code path to accomplish it. Again, please see what we are already doing in the kernel and userspace, I think a lot of the above is already implemented. Which brings me back to my original question, what does this code do, that is not already present in the kernel, and why is a totally new interface being proposed for it? thanks, greg k-h -- To unsubscribe from this list: send the line unsubscribe linux-embedded in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: RFC: android logger feedback request
On Wed, Dec 21, 2011 at 06:12:39PM -0800, Tim Bird wrote: Secondly, this is not a particularly new or radical interface. It is new to legacy Linux, but it's been in the Android Linux kernel for some years now, and it has worked well. Sorry, but you know full well that is not a justification for _any_ chunk of code to be included in the main kernel tree. Staging is fine, but to add a new kernel/user api that looks at first glance, to merely duplicate the existing in-kernel apis needs a very good explaination for why it needs to be included. greg k-h -- To unsubscribe from this list: send the line unsubscribe linux-embedded in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: RFC: android logger feedback request
On Wed, Dec 21, 2011 at 06:12:39PM -0800, Tim Bird wrote: Again, please see what we are already doing in the kernel and userspace, I think a lot of the above is already implemented. I don't know what systemd has got going on in user-space. I'm looking at a very recent kernel, and I see no support for multiple log channels, or an optimized open/write path. How is the existing syslog read path not optimized? What kind of speed and numbers are we talking about here? Oh, and you do know about the userspace printk tty driver, right? What about using that combined with the existing syslog system call? systemd doesn't use it, but I know of a few embedded systems that are already using it today, and I think it might solve the android issues already. thanks, greg k-h -- To unsubscribe from this list: send the line unsubscribe linux-embedded in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: RFC: android logger feedback request
On Wed, Dec 21, 2011 at 8:12 PM, Kay Sievers kay.siev...@vrfy.org wrote: We would like to have for the general Linux use case is a kmsg, that supports structured data, but looks the same as it looks today to current and legacy tools. We would like to have every printk that does not start with a KERN_CONT to create a record in the kmsg buffer. This record would carry directly a timestamp, the log facility and the log level and a reference to the usual human readable string. The current kmsg prefixing of the timestamp would be a runtime switch, and just added when the records are traversed, instead of the current buffer printing. In addition to this common data, every record can carry a 'dictionary', which basically looks like a process environment, means KEY=value pairs. The additional data can carry all the stuff that is needed for structured logging and provides the base for machine readable log information. We don't want to separate multiple stores to avoid ordering problems when the messages need to be merged from the multiple streams on the receiver side. I think this is theoretically all possible with the current kmsg, without breaking anything. The current interfaces would iterate over the record list and print out the human readable parts and skip the metadata, instead of just copying a linear byte stream. To tools, it could look the same as it is currently. It's just that the ring buffer is not a byte- but a record-buffer. Advanced tools could get a new interface or switch the old interface to pipe out the structured data too. In the systemd journal, we have an ASCII-like stream format that carries out structured data and is binary data capable. If we want to think about any next generation logging, I'm convinced, we need to support records, structured data and binary data; anything else is unlikely worth to change the current stuff. Any thoughts as to how one could allow N userspace agents to log to a single shared buffer without one agent, if buggy or malicious, spamming out all the other contents of the log? This is one of the main reasons we maintain separate logs in Android today, and are likely to continue doing so until we figure out how resolve the issue. Having everything in one unified log definitely has benefits (it certainly makes keeping everything ordered and reading everything back simpler), however we can't control third party applications (by design), so systems that require apps to just don't do that are not really solving that problem for us. Also, as I mentioned earlier, from a security standpoint it is highly desirable to be able to restrict which records certain readers can read. Convincing third party app developers to not log sensitive or PII data remains a challenge -- providing the ability for filtered read access allows some mitigation (app can retrieve it's own logs for a bug report but can't sift through all logs looking for interesting tidbits). Brian -- To unsubscribe from this list: send the line unsubscribe linux-embedded in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: RFC: android logger feedback request
On Thu, Dec 22, 2011 at 04:49, Greg KH gre...@suse.de wrote: On Wed, Dec 21, 2011 at 06:34:08PM -0800, Brian Swetland wrote: The goals behind the logger driver have been: - keep userland and kernel logging separate (so that spammy userland logging doesn't make us lose critical kernel logs or the other way round) Wouldn't a simple userspace daemon solve this, writing the data to a ramfs file? That's what the systemd journal does in /run when /var is not writable or the /var/log/journal directory does not exist. All just lands on a tmpfs until the next reboot. The journal database just acts as a ringbuffer on a filesystem. - make log writing very inexpensive -- avoid having to pass messages between processes (more critical on ARM9 platforms where this implied extra cache flushing), avoid having to make several syscalls to write a log message (getting time of day, etc), and so on ramfs? The receiver gets almost all of this from the socket with SCM_TIMESTAP, and SCM_CREDENTIALS. Needed stuff can be added in a similar way. This is trusted, reliable and pretty cheap to retrieve data. - make log writing reliable -- don't trust userland to report its timestamp, PID, or to correctly format the datagrams, etc existing userspace printk tty driver? Sockets provide all that today already in a trusted way. - allow a log watching process (logcat) to easily pull data from all logs at once what do you mean by at once? /proc/kmsg provides that today, which is basically just a file wrapper around the kernel's (misleadingly named) syslog() call. - avoid committing a vast amount of memory to logging memory where, in code or in the data being logged? - try to prevent clients from spamming each other out of log space (only successful on a coarse granularity right now with the main/system/radio/events logs) - ensure logs are not lost at the moment an app crashes Which logs? Things need rate limits and quota based on the remaining buffer or disk space. The systemd journal maintains a separate journal for every login-uid, so every logged in user gets his own journal database, which can not overflow the systems database or the databases of the other users. Every user can log to the journal, but only the journal daemon can write to the databases. The user can directly read back its own journal database, but not write it. On one hand, having each app (per PID) be able to create their own logs up to a specified size limit could be really useful and is something we've kicked around -- for one it would allow us to avoid the ever present request from userspace developers to increase the log size because of too much log spam (reduce log spam never seems to be an answer that makes them happy) -- but we haven't come up with a reasonable plan for dealing with well if we allow 16KB of log per app and the user installs 100 apps, they may be pinning up to 1.6MB of ram worst case, and so on. I think the userspace printk and syslog might already handle most of this today. Tim, care to look into that and see if it does or not? It's very simple what we have here in the kernel, and if syslog facility prefixes are used, it can probably replace most of the Android log solution just fine. We can not really protect one store to flush the entire buffer though, that's what the 4 separated stores can easily provide. But as mentioned, I think we want the ability to work with structured data, which both kernel tools are incapable of. Kay -- To unsubscribe from this list: send the line unsubscribe linux-embedded in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: RFC: android logger feedback request
On Thu, 22 Dec 2011, Kay Sievers wrote: If we want to think about any next generation logging, I'm convinced, we need to support records, structured data and binary data; anything else is unlikely worth to change the current stuff. note that current generation syslog supports records, structured data, metadata capture of the sending process, everything you listed except unencoded binary data. David Lang -- To unsubscribe from this list: send the line unsubscribe linux-embedded in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: RFC: android logger feedback request
On Wed, Dec 21, 2011 at 8:47 PM, da...@lang.hm wrote: On Wed, 21 Dec 2011, Brian Swetland wrote: Any thoughts as to how one could allow N userspace agents to log to a single shared buffer without one agent, if buggy or malicious, spamming out all the other contents of the log? This is one of the main reasons we maintain separate logs in Android today, and are likely to continue doing so until we figure out how resolve the issue. note that you still don't prevent one app from blowing out the logs, all you do is separate the logs into a handfulof categories and limit the app (through standard filesystem permissions) to only blowing out one category of logs. or do I misunderstand this and you actually keep a separate queue for each pid (which would seem to be a problem as you no longer know how much space you need) You're correct -- that's the model at present. We'd like to move to a model where we have better control and are kicking around ideas for how to do this for the next platform version. The rate at which apps push data into logs is pretty amazing at times. The system booting (maybe 10-20 services and 10+ apps starting up) can blow through 256K of ringbuffer in seconds. Having enough context available when something crashes to be able to diagnose what went wrong, especially when problems could involve interaction between multiple agents in multiple processes is important for the frameworks and apps teams. Trying to get individual developers to be frugal with their logging has turned out to be mostly a losing battle. Brian -- To unsubscribe from this list: send the line unsubscribe linux-embedded in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: RFC: android logger feedback request
On Wed, 21 Dec 2011, Brian Swetland wrote: On Wed, Dec 21, 2011 at 5:20 PM, NeilBrown ne...@suse.de wrote: On Wed, 21 Dec 2011 16:36:21 -0800 Tim Bird tim.b...@am.sony.com wrote: On 12/21/2011 03:19 PM, Greg KH wrote: That all describes the current code, but you haven't described what's wrong with the existing syslog interface that requires this new driver to be written. And why can't the existing interface be fixed to address these (potential) shortcomings? One specific question I have is where is the most appropriate place for this code to live, in the kernel source tree? Other embedded systems might want to use this system (it is simpler than syslog, and superior in some ways), so I don't think it should remain in an android-specific directory. What way is it superior? Here are some ways that this code is superior to syslog: It is certainly nice and simple. It really looks more like a filesystem than a char device though... though they aren't really files so much as lossy pipes. I don't think that's a problem though, lots of things in filesystems don't behave exactly like files. If you created a 'logbuf' filesystem that used libfs to provide a single directory in which privileged processes could create files then you wouldn't need the kernel to know the allowed logs: radio, events, main, system. The size could be set by ftruncate() (by privileged used again) rather than being hardcoded. SNIP The result would be much the same amount of code, but an interface which has fewer details hard-coded and is generally more versatile and accessible. Moving away from hard coding the names/sizes of the logs in the driver is something that has been on the todo list for a while. One thing we'd likely want to accomplish there is avoid creating a vector for consuming large amounts of memory by creating new logs. One planned change (likely to happen in the Android J release timeframe) is to adjust permissions such that any process can write messages, but unless they belong to the correct group they can only read back messages written by their own PID. This is to allow apps to grab their own log output after a crash or during a user problem report without needing to grant them the ability to read all log messages. Currently the logger driver does not provide a mechanism for allowing logs to survive a reboot (unlike the ramconsole), but this is functionality that we've thought about adding. Generally the kernel logs are most interesting after an unexpected panic/reboot, but getting a picture of what userspace has been up to can be useful too. The goals behind the logger driver have been: - keep userland and kernel logging separate (so that spammy userland logging doesn't make us lose critical kernel logs or the other way round) - make log writing very inexpensive -- avoid having to pass messages between processes (more critical on ARM9 platforms where this implied extra cache flushing), avoid having to make several syscalls to write a log message (getting time of day, etc), and so on - make log writing reliable -- don't trust userland to report its timestamp, PID, or to correctly format the datagrams, etc - allow a log watching process (logcat) to easily pull data from all logs at once - avoid committing a vast amount of memory to logging - try to prevent clients from spamming each other out of log space (only successful on a coarse granularity right now with the main/system/radio/events logs) - ensure logs are not lost at the moment an app crashes On one hand, having each app (per PID) be able to create their own logs up to a specified size limit could be really useful and is something we've kicked around -- for one it would allow us to avoid the ever present request from userspace developers to increase the log size because of too much log spam (reduce log spam never seems to be an answer that makes them happy) -- but we haven't come up with a reasonable plan for dealing with well if we allow 16KB of log per app and the user installs 100 apps, they may be pinning up to 1.6MB of ram worst case, and so on. At this point you are starting to sound like something much closer to a traditional syslog daemon. you are adding so many variations, persistant storage, etc that you really don't want to have to have all this in the kernel, make these be interfaces into a userspace logging tool (ideally syslog compatible), and you have the option to easily have different policies, consuming different amounts of space, depending on the device and how resource limited it is. David Lang
Re: RFC: android logger feedback request
On Wed, 21 Dec 2011, Brian Swetland wrote: On Wed, Dec 21, 2011 at 8:47 PM, da...@lang.hm wrote: On Wed, 21 Dec 2011, Brian Swetland wrote: Any thoughts as to how one could allow N userspace agents to log to a single shared buffer without one agent, if buggy or malicious, spamming out all the other contents of the log? This is one of the main reasons we maintain separate logs in Android today, and are likely to continue doing so until we figure out how resolve the issue. note that you still don't prevent one app from blowing out the logs, all you do is separate the logs into a handfulof categories and limit the app (through standard filesystem permissions) to only blowing out one category of logs. or do I misunderstand this and you actually keep a separate queue for each pid (which would seem to be a problem as you no longer know how much space you need) You're correct -- that's the model at present. We'd like to move to a model where we have better control and are kicking around ideas for how to do this for the next platform version. The rate at which apps push data into logs is pretty amazing at times. The system booting (maybe 10-20 services and 10+ apps starting up) can blow through 256K of ringbuffer in seconds. Having enough context available when something crashes to be able to diagnose what went wrong, especially when problems could involve interaction between multiple agents in multiple processes is important for the frameworks and apps teams. this makes it sound more like this belongs in userspace instead of the kernel. if something crashes and they can flip a flag in userspace to say 'keep more logs now' and try to duplicate the crash is going to be far easier than having to compile a new kernel to extend the size of the log. Trying to get individual developers to be frugal with their logging has turned out to be mostly a losing battle. that's for sure, anything that's success depends on developers doing things right that aren't visible to the user are pretty much doomed. David Lang
Re: RFC: android logger feedback request
On Wed, 21 Dec 2011, Brian Swetland wrote: The result would be much the same amount of code, but an interface which has fewer details hard-coded and is generally more versatile and accessible. That sounds better than what has been done in android, but it is still _far_ more limited than doing something that could be replaced by a fairly standard syslog daemon. We're really not interested in adding another daemon to the platform -- the logging system we have has served us well, is integrated with our existing development tools, and we're definitely interested in improving it, but throwing it out and replacing it with a userspace solution is not interesting to us right now. Think very hard before you reject any possibility of doing this in userspace, especially with all the things that you are talking about doing with logging in the future. I really don't think aht a lot of your long-0term plans for logging are going to sit well with the kernel developers, and if your justification is you don't want to change your build process, I really doubt that you will get this upstream. It should be possible to do this without having to change the tools writing the logs, any change in logging will change what it takes to read the logs. I am not saying that you need to have a logging daemon as heavyweight as syslog-ng or rsyslog for the low-end phones, but I do think that as android moves up the stack a bit into talets and netbooks (especially in applications where it will have wifi connectivity almost all the time) having the capability to move to a full-blown syslog daemon will make a huge amount of sense, so you should look at how big a minimalist daemon would be, and what sort of performance it would have. David Lang -- To unsubscribe from this list: send the line unsubscribe linux-embedded in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: RFC: android logger feedback request
On Wed, Dec 21, 2011 at 9:14 PM, da...@lang.hm wrote: On Wed, 21 Dec 2011, Brian Swetland wrote: We're really not interested in adding another daemon to the platform -- the logging system we have has served us well, is integrated with our existing development tools, and we're definitely interested in improving it, but throwing it out and replacing it with a userspace solution is not interesting to us right now. Think very hard before you reject any possibility of doing this in userspace, especially with all the things that you are talking about doing with logging in the future. I really don't think aht a lot of your long-0term plans for logging are going to sit well with the kernel developers, and if your justification is you don't want to change your build process, I really doubt that you will get this upstream. It should be possible to do this without having to change the tools writing the logs, any change in logging will change what it takes to read the logs. I am not saying that you need to have a logging daemon as heavyweight as syslog-ng or rsyslog for the low-end phones, but I do think that as android moves up the stack a bit into talets and netbooks (especially in applications where it will have wifi connectivity almost all the time) having the capability to move to a full-blown syslog daemon will make a huge amount of sense, so you should look at how big a minimalist daemon would be, and what sort of performance it would have. Long term things could certainly move around, but *right now* we really don't see the value in gutting what we've got in favor of writing a new userspace daemon, which is going to require somebody to do that work, maintain it, chase down any behavioural quirks introduced by the changes, etc. Or to throw it out in favor of trying to bolt our logging infrastructure on top of existing syslogds, etc. We're happy to maintain the logger driver out of tree as we have for the past four years. If all discussions of bringing Android kernel support to mainline end up as another round of you should just rewrite it all in userspace or you should use this other subsystem that doesn't actually solve your problem but we think it's close enough, then there's not a lot of point to having the discussions in the first place. If somebody wants to go write a complete compatible replacement that just drops in, we certainly could take a look at it and see how it works, but given that the benefits are not clear to us, we're not interested in going off and doing that work ourselves. Brian -- To unsubscribe from this list: send the line unsubscribe linux-embedded in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: RFC: android logger feedback request
On Wed, Dec 21, 2011 at 09:25:52PM -0800, Brian Swetland wrote: Long term things could certainly move around, but *right now* we really don't see the value in gutting what we've got in favor of writing a new userspace daemon, which is going to require somebody to do that work, maintain it, chase down any behavioural quirks introduced by the changes, etc. Or to throw it out in favor of trying to bolt our logging infrastructure on top of existing syslogds, etc. We're happy to maintain the logger driver out of tree as we have for the past four years. This all came about because Tim asked to merge this to the main tree as-is, and so, people rightfully pointed out different options that might be considered if it were to be merged that way. I have no problem leaving the logger driver in staging, but it seems that Tim is taking on the task to do the harder thing here, which probably would entail work on both sides, which as a openhandset alliance company member, he might have a change that someone like me might not :) Tim, it's now in your court it seems, have a great holiday, greg k-h -- To unsubscribe from this list: send the line unsubscribe linux-embedded in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html