Re: RFC: android logger feedback request

2012-01-06 Thread Tim Bird
I'm back from vacation - sorry for the long delay in responding.

First, let me say thanks to all those who have responded with ideas
and suggestions.  I'll be following up on many of them in the new
few weeks.  This is a background task for me, so it will likely
go relatively slowly.  I have volunteer offers of assistance,
and some resources available for contract work, but it might take time
to set that up.  In general, though, things should be able to move forward.

I could respond to a lot of messages individually, but
to conserve space I'll focus on this one, and present an overall
idea for how I plan to proceed.

On 12/21/2011 11:05 PM, NeilBrown wrote:
 Possibly it would be useful to be clear what we all *are* really interested
 in, because I suspect there is a lot of variety.
 
 You appear to be interested in providing a great platform for a phone, in
 minimising unnecessary churn in that platform, and in having the freedom to
 optimise various aspects however you see fit.  You appear to have little
 problem with maintaining some components out-of-mainline. This is all
 perfectly valid and very much in the spirit of freedom that binds us
 together.
 
 Others want to be able to run a main-line kernel underneath the Android
 user-space - and that is perfectly reasonable as well.
 
 I don't much care about either of those, but I want Linux to be high
 quality (according to my own personal standards of course) which means
 keeping bad stuff out (though you could argue that the horse has well and
 truly bolted there) and including good and useful stuff in.  And I think
 Android has something to offer on both sides there :-)

Thanks very much for this.  I think it *is* important to take a step
back and compare goals, to see how we can achieve the best result for
all parties.

 
 Weighing all that up, I don't think it is useful to set our goal on getting
 Android to use a mainline kernel - that isn't going to happen.
 Rather we should focus primarily on making it *possible* to run android
 user-space on mainline.
 
 That could involve two things
 1/ Making sure the interfaces that Android uses are abstracted at a suitable
   level so that when running a mainline kernel you can just slip in an
   alternate shared library with compatible functionality.
 2/ Adding functionality to Linux so that it is possible to provide the
   functionality that Android needs.

Agreed.

 Android should not *have* to use the interface that the mainline
 community decides is preferred, but nor should mainline be required to
 include the drivers that Android wants to use.  History shows us that isn't
 going to happen.

I agree with the sentiment in the first sentence, but I would like to
make a few observations about this code, and the problem in general,
for consideration.  I'll do this below because it's a bit philosophical,
and I'd rather focus on the way forward first.

 But if there was a fairly low-level API that Android used, then those in the
 community could who want Android on a mainline kernel could work to implement
 that API with whatever mixture of user-space and kernel-space that achieves
 consensus.
 Android could of course change to use the community version eventually if
 they found that was a good thing, or could keep using their own.
 
 So: bringing this back to the android logger...
   What I would like to see is a low-level API which is used to access logging
   for which alternate implementations could be written.  Ideally this would
   be embodied in a single .so, but we have the source so that doesn't need to
   be a barrier.
   Then we could argue to our heart's content about how best to implement that
   API - Journal and nsyslogd and rsyslogd could all implement it in different
   ways and we could be one step closer to running Android on a mainline
   kernel, but the Android developers don't need to be involved (but can if
   they want to of course).
 
   I would be important that the API is specified clearly - neither under
   specified nor over-specified. That means that the Android implementation
   would need to explicitly forbid anything that isn't explicit permitted.
   This is because most testing will happen on the Android platform so it's
   actual behaviour will become the defacto standard.
 
   Could that be a possible way forward?

I'd like to pursue this, as well as some of the minor code cleanups
suggested by Andrew Morton.  Here's what I'm thinking:

I'd like to implement Neil's file-system based solution as a test case,
and compare that with the existing code.  I like the elegance of using
existing filesystem semantics, and the removal of some hard-coded limits
and policy from the kernel.  A lower priority would be to also try
a user-space-only solution, as described by Arnd.  Optimally, it would
be nice to have all three systems to compare (char dev, log fs, and
ram fs) against each other.

I'll try to use these under the existing logger library API to see if the
current semantics 

Re: RFC: android logger feedback request

2012-01-06 Thread Greg KH
On Fri, Jan 06, 2012 at 12:56:27PM -0800, Tim Bird wrote:
 Now, having said all that, let me go off into some philosophical weeds...

Ah, this should be fun :)

 This code is only about 700 lines long, and specializes the kernel for
 an extremely popular (by usage) user space stack.  Code of
 lower quality which specializes the kernel for much less-used hardware
 has been admitted into the kernel with significantly less fuss than
 this code has received.  That's not an argument to accept the code
 as is, it's just an observation about the relative hurdle that this
 code faces compared to lots of other code that goes into the kernel.
 (And please don't interpret this as dissatisfaction with the feedback
 received.)
 
 If this code were a character driver for an obscure serial port
 on a lesser-known chip architecture, I don't think it would get
 any attention at all.  As it is, it's looking like at least a few
 man months of work will be required, as well as some relatively
 unneeded changes to Android user space, to get this feature into
 a permanently acceptable state.  I wouldn't be surprised to see
 this stretch into a few calendar years.
 
 Code that specializes the kernel in weird ways is accepted into
 the kernel all the time, and I've tried to figure out why this
 particular bit of code is treated differently.  Especially since
 this code is self-contained, configurable, and imposes no
 perceivable long-term maintenance burden.  (That's not true of
 all Android patches, but I believe it's true of this one).
 
 I have a few theories:
 1) this is not tied to hardware, and as such represents a general
 feature (but people are not at all required to treat it as such,
 just as they are not required to use other people's weird drivers).

That is exactly true.

 2) people want to avoid duplication with other similar features
 (again, since it's self-contained and configurable, I don't know
 why it would bother people if this existed in tandem with other
 solutions - especially since it's so small)

Again, very true in the duplication sense.

It is general code, not tied to hardware, so it better justify itself
somehow.

 3) there is really no maintainer for this feature category, so
 discussions get bogged down as varying requirements and solutions
 are suggested, which can not easily be compared against each other
 (especially for non-existent implementations)  In particular, it's
 unclear who I have to get the approval of for this code or some
 derivative of it to be accepted.  That makes the development task
 a very open-ended one.

No, I don't think this is a problem, you have bigger ones to deal with
:)

 4) this is for a popular use case, as opposed to some minor
 outlying thing, and so people perceive the need to get it
 exactly right.  In this sense, the code would be a victim of
 it's own success.

It's not the code's success here, it's the this solves a general
problem in a very specific way issue, that just happens to be a use
case a lot of different people want to see addressed.  And since it is
only solving it in a narrow way, that's a problem for all of those other
people.

 5) blocking this is perceived to be a way to accomplish a
 larger, related goal (if this is true then it has lots of
 interesting implications for the economics of open source
 work)

No, I don't think that's the issue here.

 In general, there is a tension between the normal nature of adapting
 the kernel to the most general use cases, and the specialization
 that is performed in developing an embedded product.  Often
 times, solutions to embedded requirements are very finely tuned
 to a particular field of use or situation, and don't lend themselves
 easily to the type of generalization that mainlining usually requires.

Why do people get hung up on the embedded is special case here.  Fact,
it's not.  Do you somehow think that the work done for the HUGE
multiprocessor machines 10 years ago would have gotten away with hacks
saying, this is a special case, only we care about this, when they
were dealing with 4 and 8 way machines?  No, they didn't, and as such,
there's nothing special with embedded here either.  Matter of fact, your
machines are now more powerful than those old 10 year old machines, and
are really general purpose computers, just used in a single/limited
manner, again, just like those original big SMP machines were.

So, I've said it many times before, and I'll say it again:

  Yes, you are special and unique, just like everyone else.

The next person who says the embedded is different phrase again, owes
me a beer of my choice.

 Which brings me to my last point and a question.
 Is it inconceivable for there to be a category of code in the
 Linux kernel which supports ONLY Android user space, and no
 other?  That is, must every Android patch be generalized in
 some manner to a broader use case?

No, that's fine, I have no problem with a drivers/android/ directory for
android only stuff, because:

 I 

Re: RFC: android logger feedback request

2012-01-06 Thread Tim Bird
On 01/06/2012 01:20 PM, Greg KH wrote:
 On Fri, Jan 06, 2012 at 12:56:27PM -0800, Tim Bird wrote:
 4) this is for a popular use case, as opposed to some minor
 outlying thing, and so people perceive the need to get it
 exactly right.  In this sense, the code would be a victim of
 it's own success.
 
 It's not the code's success here, it's the this solves a general
 problem in a very specific way issue, that just happens to be a use
 case a lot of different people want to see addressed.  And since it is
 only solving it in a narrow way, that's a problem for all of those other
 people.

It's only a problem in the sense that those different people
don't get something for free.  The presence of an isolated,
configurable driver somewhere in the kernel source, that
addresses one group's issues and not another's doesn't impose
any measurable burden on that other group.

It's more of an opportunity cost thing.  The opportunity
to develop something more general can be reduced when a narrow
solution is accepted.  But this is exactly related to
encouraging people to develop more general solutions, when
narrow ones accomplish what they need (see point 5).

I understand that the whole model is built on people contributing
code that not only addresses their own needs, but the needs of
others.  But of course there should be balance.  For example
I have concerns about integrating this into the printk code path,
because I think that code path is complex enough as it is,
and that combining this functionality with that is likely to
create more maintenance problems rather than less in the
long run.  (But I'll still plan to look at this option ...)

Sometimes, features should just remain separate.

 
 5) blocking this is perceived to be a way to accomplish a
 larger, related goal (if this is true then it has lots of
 interesting implications for the economics of open source
 work)
 
 No, I don't think that's the issue here.

I don't think anyone is thinking: If I block this
I can get Tim to do x.  But there *is* a desire to
create a more general logging solution.  And not
accepting this as is, is a way to encourage people to
work on that.

For some situations, I would object to that.  But in
this case I actually agree that there are more potential users
of this, and it would be nice to solve more problems than
just the Android case, with a consolidated code base.  That's
why I'm willing to invest in researching other loggers and
doing some experimental development.

 
 In general, there is a tension between the normal nature of adapting
 the kernel to the most general use cases, and the specialization
 that is performed in developing an embedded product.  Often
 times, solutions to embedded requirements are very finely tuned
 to a particular field of use or situation, and don't lend themselves
 easily to the type of generalization that mainlining usually requires.

I should have left off embedded product for this thread.  There is
a tension in general between generalization and specialization.  And
that tension shows up in a lot of discussions about mainlining code.
My goal here is to determine where people think the balance lies
for this particular code.

 Why do people get hung up on the embedded is special case here.  Fact,
 it's not.  Do you somehow think that the work done for the HUGE
 multiprocessor machines 10 years ago would have gotten away with hacks
 saying, this is a special case, only we care about this, when they
 were dealing with 4 and 8 way machines?  No, they didn't, and as such,
 there's nothing special with embedded here either.  Matter of fact, your
 machines are now more powerful than those old 10 year old machines, and
 are really general purpose computers, just used in a single/limited
 manner, again, just like those original big SMP machines were.
 
 So, I've said it many times before, and I'll say it again:
 
   Yes, you are special and unique, just like everyone else.
 
 The next person who says the embedded is different phrase again, owes
 me a beer of my choice.

Consider this your free beer token... :-)

I believe embedded is different because we throw away so much code,
that I just don't think desktop and enterprise do.  The number of
non-mainlined BSPs that have gone straight from product release to
bit bucket is truly staggering.

You're only looking at the Android case, where the machines are as
powerful as they were 10 years ago.  Broad swaths of embedded
are not on that kind of hardware.  It's different (there it is again,
is that two beers?) being at the bottom side of the scalability
spectrum rather than at the top.  Everyone expected that eventually
there would be more SMP machines in the world.  This is not true of
all embedded processors.  We're used to developing code that we'll
throw away after a single release.

But I think this is beside the point for this particular code.

I agree that it's worth some effort to try to generalize this
logger code. Indeed, when I first saw the 

Re: RFC: android logger feedback request

2012-01-06 Thread Greg KH
On Fri, Jan 06, 2012 at 02:41:52PM -0800, Tim Bird wrote:
 On 01/06/2012 01:20 PM, Greg KH wrote:
  On Fri, Jan 06, 2012 at 12:56:27PM -0800, Tim Bird wrote:
  4) this is for a popular use case, as opposed to some minor
  outlying thing, and so people perceive the need to get it
  exactly right.  In this sense, the code would be a victim of
  it's own success.
  
  It's not the code's success here, it's the this solves a general
  problem in a very specific way issue, that just happens to be a use
  case a lot of different people want to see addressed.  And since it is
  only solving it in a narrow way, that's a problem for all of those other
  people.
 
 It's only a problem in the sense that those different people
 don't get something for free.  The presence of an isolated,
 configurable driver somewhere in the kernel source, that
 addresses one group's issues and not another's doesn't impose
 any measurable burden on that other group.

That's been proven incorrect over the long-run as we all must now
maintain that user/kernel API for forever.

You could say the same thing for adding new ioctls for individual
drivers instead of creating a new syscall.  Sometimes, in rare cases, it
does make sense to create a new ioctl.  But that decision has to be
taken very carefully, as again, it needs to be standardized with all
other drivers of that time, and maintained for forever.

 It's more of an opportunity cost thing.  The opportunity
 to develop something more general can be reduced when a narrow
 solution is accepted.  But this is exactly related to
 encouraging people to develop more general solutions, when
 narrow ones accomplish what they need (see point 5).

When people work together on their narrow goals, they end up creating
something that works for way more than just themselves, and in fact, it
turns out it works better for people who originally wasn't even aware of
the issues involved.

Again, this has been proven over time with the Linux kernel development
process.  I wrote a whole chapter on this very topic in the book,
Beautiful Code, if you want to hear more about it.

In fact, I would argue, that is one of the top reasons why Linux has
succeeded so well.  Again, you really aren't unique, others will have
the same issues/problems that you do, and if you solve them in a case
that works for more than one, it makes everything better.

 I understand that the whole model is built on people contributing
 code that not only addresses their own needs, but the needs of
 others.  But of course there should be balance.  For example
 I have concerns about integrating this into the printk code path,
 because I think that code path is complex enough as it is,
 and that combining this functionality with that is likely to
 create more maintenance problems rather than less in the
 long run.  (But I'll still plan to look at this option ...)

Complex code paths are tough, I will grant you that.  But if it was
easy, then, you wouldn't be the person for the job :)

 Sometimes, features should just remain separate.

When they do different things, of course.  When they do the same thing,
of course not.

  Why do people get hung up on the embedded is special case here.  Fact,
  it's not.  Do you somehow think that the work done for the HUGE
  multiprocessor machines 10 years ago would have gotten away with hacks
  saying, this is a special case, only we care about this, when they
  were dealing with 4 and 8 way machines?  No, they didn't, and as such,
  there's nothing special with embedded here either.  Matter of fact, your
  machines are now more powerful than those old 10 year old machines, and
  are really general purpose computers, just used in a single/limited
  manner, again, just like those original big SMP machines were.
  
  So, I've said it many times before, and I'll say it again:
  
Yes, you are special and unique, just like everyone else.
  
  The next person who says the embedded is different phrase again, owes
  me a beer of my choice.
 
 Consider this your free beer token... :-)

Oooh, nice, I'll take you up on that at ELC in a few weeks :)

 I believe embedded is different because we throw away so much code,
 that I just don't think desktop and enterprise do.  The number of
 non-mainlined BSPs that have gone straight from product release to
 bit bucket is truly staggering.

Yes, and that's sad.  And maybe, if embedded feels that they are so
special, and that contributing to mainline doesn't help or matter, then
they shouldn't.

Yes, I mean it, why push stuff upstream if you aren't going to maintain
it and no one will ever build on top of it.

Oh wait, you will end up building on it with your next iteration, so if
you throw it away, oops, you just wasted time and money.

And again, this mirrors the super computer case EXACTLY!  In fact, I
would argue that there are less machines shipped for those types of
systems, than embedded, and do you see those developers saying that they
should just keep their 

Re: RFC: android logger feedback request

2012-01-04 Thread Geunsik Lim
On Thu, Dec 29, 2011 at 9:39 AM, Andrew Morton
a...@linux-foundation.org wrote:
 On Wed, 21 Dec 2011 14:59:15 -0800
 Tim Bird tim.b...@am.sony.com wrote:

 Hi all,

 I'm looking for feedback on the Android logger code,

 The code looks nice.


 ...

 +/* logger_offset - returns index 'n' into the log via (optimized) modulus */
 +#define logger_offset(n)     ((n)  (log-size - 1))
 +

 This macro depends upon the presence of a local variable called log
 and is therefore all stinky.  Pass log in as an arg and convert it to
 a regular C function.


 ...

 +/*
 + * get_entry_len - Grabs the length of the payload of the next entry 
 starting
 + * from 'off'.
 + *
 + * Caller needs to hold log-mutex.
 + */
 +static __u32 get_entry_len(struct logger_log *log, size_t off)
 +{
 +     __u16 val;
 +
 +     switch (log-size - off) {
 +     case 1:
 +             memcpy(val, log-buffer + off, 1);
 +             memcpy(((char *) val) + 1, log-buffer, 1);

 So numbers in the buffer are in host endian order.  That's worth
 capturing in a comment somewhere.

 There must be a way of doing the above more neatly, using
 log-buffer[off] and log-buffer[0].  Perhaps using
 include/linux/unaligned/packed_struct.h.

 +             break;
 +     default:
 +             memcpy(val, log-buffer + off, 2);

 get_unaligned()

 +     }
 +
 +     return sizeof(struct logger_entry) + val;
 +}
 +

 ...

 +static ssize_t logger_read(struct file *file, char __user *buf,
 +                        size_t count, loff_t *pos)
 +{
 +     struct logger_reader *reader = file-private_data;
 +     struct logger_log *log = reader-log;
 +     ssize_t ret;
 +     DEFINE_WAIT(wait);
 +
 +start:
 +     while (1) {
 +             prepare_to_wait(log-wq, wait, TASK_INTERRUPTIBLE);
 +
 +             mutex_lock(log-mutex);

 If mutex_lock() had to wait for the mutex, it will return in state
 TASK_RUNNING, thus rubbing out the effects of prepare_to_wait().  We'll
 just loop again so this is a benign buglet.

 Could we lose a wakeup by running prepaer_to_wait() outside the lock?
 I don't think so, but I stopped thinking about it when I saw the above
 bug.  These two lines should be switched around.

 +             ret = (log-w_off == reader-r_off);
 +             mutex_unlock(log-mutex);
 +             if (!ret)
 +                     break;
 +
 +             if (file-f_flags  O_NONBLOCK) {
 +                     ret = -EAGAIN;
 +                     break;
 +             }
 +
 +             if (signal_pending(current)) {
 +                     ret = -EINTR;
 +                     break;
 +             }
 +
 +             schedule();
 +     }
 +
 +     finish_wait(log-wq, wait);
 +     if (ret)
 +             return ret;
 +
 +     mutex_lock(log-mutex);
 +
 +     /* is there still something to read or did we race? */
 +     if (unlikely(log-w_off == reader-r_off)) {
 +             mutex_unlock(log-mutex);
 +             goto start;
 +     }
 +
 +     /* get the size of the next entry */
 +     ret = get_entry_len(log, reader-r_off);
 +     if (count  ret) {
 +             ret = -EINVAL;
 +             goto out;
 +     }
 +
 +     /* get exactly one entry from the log */
 +     ret = do_read_log_to_user(log, reader, buf, ret);
 +
 +out:
 +     mutex_unlock(log-mutex);
 +
 +     return ret;
 +}
 +

 ...

 +/*
 + * clock_interval - is a  c  b in mod-space? Put another way, does the 
 line
 + * from a to b cross c?
 + */

 This is where my little brain started to hurt.  I assume it's correct ;)

 +static inline int clock_interval(size_t a, size_t b, size_t c)
 +{
 +     if (b  a) {
 +             if (a  c || b = c)
 +                     return 1;
 +     } else {
 +             if (a  c  b = c)
 +                     return 1;
 +     }
 +
 +     return 0;
 +}

 The explicit inline(s) are increaseingly old-fashioned.  gcc is good
 about this now.


 ...

 +static ssize_t do_write_log_from_user(struct logger_log *log,
 +                                   const void __user *buf, size_t count)
 +{
 +     size_t len;
 +
 +     len = min(count, log-size - log-w_off);
 +     if (len  copy_from_user(log-buffer + log-w_off, buf, len))
 +             return -EFAULT;
 +
 +     if (count != len)
 +             if (copy_from_user(log-buffer, buf + len, count - len))
 +                     return -EFAULT;

 Is it correct to simply return here without updating -w_off?
 We've just copied len bytes in from userspace.

 +     log-w_off = logger_offset(log-w_off + count);
 +
 +     return count;
 +}
 +
 +/*
 + * logger_aio_write - our write method, implementing support for write(),
 + * writev(), and aio_write(). Writes are our fast path, and we try to 
 optimize
 + * them above all else.
 + */
 +ssize_t logger_aio_write(struct kiocb *iocb, const struct iovec *iov,
 +                      unsigned long nr_segs, loff_t ppos)
 +{
 +     struct logger_log *log = file_get_log(iocb-ki_filp);
 +     size_t orig = log-w_off;
 +     struct logger_entry header;
 +     struct timespec now;
 +   

Re: RFC: android logger feedback request

2011-12-28 Thread Andrew Morton
On Wed, 21 Dec 2011 14:59:15 -0800
Tim Bird tim.b...@am.sony.com wrote:

 Hi all,
 
 I'm looking for feedback on the Android logger code,

The code looks nice.


 ...

 +/* logger_offset - returns index 'n' into the log via (optimized) modulus */
 +#define logger_offset(n) ((n)  (log-size - 1))
 +

This macro depends upon the presence of a local variable called log
and is therefore all stinky.  Pass log in as an arg and convert it to
a regular C function.


 ...

 +/*
 + * get_entry_len - Grabs the length of the payload of the next entry starting
 + * from 'off'.
 + *
 + * Caller needs to hold log-mutex.
 + */
 +static __u32 get_entry_len(struct logger_log *log, size_t off)
 +{
 + __u16 val;
 +
 + switch (log-size - off) {
 + case 1:
 + memcpy(val, log-buffer + off, 1);
 + memcpy(((char *) val) + 1, log-buffer, 1);

So numbers in the buffer are in host endian order.  That's worth
capturing in a comment somewhere.

There must be a way of doing the above more neatly, using
log-buffer[off] and log-buffer[0].  Perhaps using
include/linux/unaligned/packed_struct.h.

 + break;
 + default:
 + memcpy(val, log-buffer + off, 2);

get_unaligned()

 + }
 +
 + return sizeof(struct logger_entry) + val;
 +}
 +

 ...

 +static ssize_t logger_read(struct file *file, char __user *buf,
 +size_t count, loff_t *pos)
 +{
 + struct logger_reader *reader = file-private_data;
 + struct logger_log *log = reader-log;
 + ssize_t ret;
 + DEFINE_WAIT(wait);
 +
 +start:
 + while (1) {
 + prepare_to_wait(log-wq, wait, TASK_INTERRUPTIBLE);
 +
 + mutex_lock(log-mutex);

If mutex_lock() had to wait for the mutex, it will return in state
TASK_RUNNING, thus rubbing out the effects of prepare_to_wait().  We'll
just loop again so this is a benign buglet.

Could we lose a wakeup by running prepaer_to_wait() outside the lock? 
I don't think so, but I stopped thinking about it when I saw the above
bug.  These two lines should be switched around.

 + ret = (log-w_off == reader-r_off);
 + mutex_unlock(log-mutex);
 + if (!ret)
 + break;
 +
 + if (file-f_flags  O_NONBLOCK) {
 + ret = -EAGAIN;
 + break;
 + }
 +
 + if (signal_pending(current)) {
 + ret = -EINTR;
 + break;
 + }
 +
 + schedule();
 + }
 +
 + finish_wait(log-wq, wait);
 + if (ret)
 + return ret;
 +
 + mutex_lock(log-mutex);
 +
 + /* is there still something to read or did we race? */
 + if (unlikely(log-w_off == reader-r_off)) {
 + mutex_unlock(log-mutex);
 + goto start;
 + }
 +
 + /* get the size of the next entry */
 + ret = get_entry_len(log, reader-r_off);
 + if (count  ret) {
 + ret = -EINVAL;
 + goto out;
 + }
 +
 + /* get exactly one entry from the log */
 + ret = do_read_log_to_user(log, reader, buf, ret);
 +
 +out:
 + mutex_unlock(log-mutex);
 +
 + return ret;
 +}
 +

 ...

 +/*
 + * clock_interval - is a  c  b in mod-space? Put another way, does the line
 + * from a to b cross c?
 + */

This is where my little brain started to hurt.  I assume it's correct ;)

 +static inline int clock_interval(size_t a, size_t b, size_t c)
 +{
 + if (b  a) {
 + if (a  c || b = c)
 + return 1;
 + } else {
 + if (a  c  b = c)
 + return 1;
 + }
 +
 + return 0;
 +}

The explicit inline(s) are increaseingly old-fashioned.  gcc is good
about this now.


 ...

 +static ssize_t do_write_log_from_user(struct logger_log *log,
 +   const void __user *buf, size_t count)
 +{
 + size_t len;
 +
 + len = min(count, log-size - log-w_off);
 + if (len  copy_from_user(log-buffer + log-w_off, buf, len))
 + return -EFAULT;
 +
 + if (count != len)
 + if (copy_from_user(log-buffer, buf + len, count - len))
 + return -EFAULT;

Is it correct to simply return here without updating -w_off? 
We've just copied len bytes in from userspace.

 + log-w_off = logger_offset(log-w_off + count);
 +
 + return count;
 +}
 +
 +/*
 + * logger_aio_write - our write method, implementing support for write(),
 + * writev(), and aio_write(). Writes are our fast path, and we try to 
 optimize
 + * them above all else.
 + */
 +ssize_t logger_aio_write(struct kiocb *iocb, const struct iovec *iov,
 +  unsigned long nr_segs, loff_t ppos)
 +{
 + struct logger_log *log = file_get_log(iocb-ki_filp);
 + size_t orig = log-w_off;
 + struct logger_entry header;
 + struct timespec now;
 + ssize_t ret = 0;
 +
 + now = current_kernel_time();
 +
 + header.pid = current-tgid;
 + 

Re: RFC: android logger feedback request

2011-12-23 Thread Alan Cox
 I have no problem leaving the logger driver in staging, but it seems
 that Tim is taking on the task to do the harder thing here, which
 probably would entail work on both sides, which as a openhandset
 alliance company member, he might have a change that someone like me
 might not :)

I'd like to see it in staging. There is a general discussion going on
about how such logging services should look (and indeed it goes back
before Android bits), and it looks like a good way to proceed is going to
be to play with the Android one - perhaps adding the fs model to it.

So we need something and the Android interface at the very least is a
prototype that has production proof of working.

Alan
--
To unsubscribe from this list: send the line unsubscribe linux-embedded in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RFC: android logger feedback request

2011-12-23 Thread Greg KH
On Fri, Dec 23, 2011 at 03:22:31PM +, Alan Cox wrote:
  I have no problem leaving the logger driver in staging, but it seems
  that Tim is taking on the task to do the harder thing here, which
  probably would entail work on both sides, which as a openhandset
  alliance company member, he might have a change that someone like me
  might not :)
 
 I'd like to see it in staging. There is a general discussion going on
 about how such logging services should look (and indeed it goes back
 before Android bits), and it looks like a good way to proceed is going to
 be to play with the Android one - perhaps adding the fs model to it.
 
 So we need something and the Android interface at the very least is a
 prototype that has production proof of working.

It is already in staging, look in the staging-next branch of linux-next.

thanks,

greg k-h
--
To unsubscribe from this list: send the line unsubscribe linux-embedded in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RFC: android logger feedback request

2011-12-22 Thread Arnd Bergmann
On Thursday 22 December 2011, NeilBrown wrote:
 If you created a 'logbuf' filesystem that used libfs to provide a single
 directory in which privileged processes could create files then you wouldn't
 need the kernel to know the allowed logs: radio, events, main, system.
 The size could be set by ftruncate() (by privileged used again) rather than
 being hardcoded.
 
 You would defined 'read' and 'write' much like you currently do to create a 
 list of
 datagrams in a circular buffer and replace the ioctls by more standard
 interfaces:
 
 LOGGER_GET_LOG_BUG_SIZE would use 'stat' and the st_blocks field
 LOGGER_GET_LOG_LEN would use 'stat' and the st_size field
 LOGGER_GET_NEXT_ENTRY_LEN could use the FIONREAD ioctl
 LOGGER_FLUSH_LOG could use ftruncate
 
 The result would be much the same amount of code, but an interface which has
 fewer details hard-coded and is generally more versatile and accessible.

I like the idea and was going to suggest something very similar, but I wonder
if we could take the approach even further:

* Remove all kernel code for this and use a user space library together
  with tmpfs
* prepopulate the tmpfs at boot time with all the log buffers in the right
  size, and set the maximum file system size so that they cannot grow further.
* Have minimal formatting in the log buffer: A few bytes header (ring buffer
  start and end)
* Mandate that user space must use mmap and atomic operations to reserve space
  in the log and write to the files.
* Provide a tool to get the log data out of the buffer again in a race-free way.

Since any program that is allowed to write to the buffer can overwrite all
existing information in it anyway, I think we don't actually need any kernel
help in maintaining consistency of the contents either -- the reader will
simply discard any data. The main thing we would not be able to guarantee
without kernel help is proving the origin of individual messages, but I'm
not sure if that is a design goal.

Arnd
--
To unsubscribe from this list: send the line unsubscribe linux-embedded in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RFC: android logger feedback request

2011-12-22 Thread Kay Sievers
On Thu, Dec 22, 2011 at 15:59, Arnd Bergmann a...@arndb.de wrote:

 * Remove all kernel code for this and use a user space library together
  with tmpfs
 * prepopulate the tmpfs at boot time with all the log buffers in the right
  size, and set the maximum file system size so that they cannot grow further.
 * Have minimal formatting in the log buffer: A few bytes header (ring buffer
  start and end)
 * Mandate that user space must use mmap and atomic operations to reserve space
  in the log and write to the files.
 * Provide a tool to get the log data out of the buffer again in a race-free 
 way.

 Since any program that is allowed to write to the buffer can overwrite all
 existing information in it anyway, I think we don't actually need any kernel
 help in maintaining consistency of the contents either -- the reader will
 simply discard any data. The main thing we would not be able to guarantee
 without kernel help is proving the origin of individual messages, but I'm
 not sure if that is a design goal.

I'm very sure you want rate limiting, global sequence numbers and
trusted properties added to the log.

I can imagine that the kernel provides a single character device,
instantiates a tmpfs mountpoint and splits all stuff that travels into
that device into several files (ring buffers) in the tmpfs filesystem.
The files are probably just split by the uid of the incoming message
sender, which maps to an app at Android.

Only the character device would be able to write to the file, but the
user with the matching uid can read its own file back directly from
tmpfs.

I'm very convinced that we should not design a logging system today,
where we give the ownership of the data to the actual untrusted
process that logs.

The current kmsg buffer could just be one additional file in this tmpfs mount.

Kay
--
To unsubscribe from this list: send the line unsubscribe linux-embedded in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RFC: android logger feedback request

2011-12-21 Thread Tim Bird
Hi all,

I'm looking for feedback on the Android logger code, to see what
it would take to make this code acceptable for inclusion in
the mainline kernel.

Information about the features of Android logging system
can be found at: http://elinux.org/Android_Logging_System

This system creates a new system-wide logging service, in
the kernel, for user-space message.  It is more comparable
to syslog than to the kernel log buffer, as it holds only
user-space messages.  It is optimized for write
performance, since most of the time the log is written to
and never read.  It creates multiple log channels, to prevent
an abundance of log messages in one channel from overwriting
messages in another channel.  The log channels have sizes
fixed at kernel compile-time.

Log messages are stored in very simple in-kernel buffers, that
overflow old messages upon wrapping.  A fixed set of attributes
(pid, tid, timestamp and message), is kept for each message.
By convention, Android puts a message priority and context tag
into each message.

In Android, this system uses a fixed set of device nodes with
well-known names: /dev/log/main, /dev/log/events, /dev/log/radio
and /dev/log/system.

Operations on the log are done via a character device, using
standard file operations and some ioctls.

The code for this is below (I've moved it from linux-next
drivers/staging/android for my own testing).

Please let me know what issues you see with this code.

One specific question I have is where is the most appropriate
place for this code to live, in the kernel source tree?
Other embedded systems might want to use this system (it
is simpler than syslog, and superior in some ways), so I don't
think it should remain in an android-specific directory.

Thanks,
 -- Tim

P.S. Sorry to do this right before the holidays.  I'll be
away from my work machine for most of the next few weeks.  I
can answer informational questions, and gather feedback,
but I won't have a lot of time for testing new code until
after the New Year.


Here's the code:
diff --git a/drivers/misc/logger.c b/drivers/misc/logger.c
new file mode 100644
index 000..fa76ce7
--- /dev/null
+++ b/drivers/misc/logger.c
@@ -0,0 +1,616 @@
+/*
+ * drivers/misc/logger.c
+ *
+ * A Logging Subsystem
+ *
+ * Copyright (C) 2007-2008 Google, Inc.
+ *
+ * Robert Love rl...@google.com
+ *
+ * This software is licensed under the terms of the GNU General Public
+ * License version 2, as published by the Free Software Foundation, and
+ * may be copied, distributed, and modified under those terms.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include linux/sched.h
+#include linux/module.h
+#include linux/fs.h
+#include linux/miscdevice.h
+#include linux/uaccess.h
+#include linux/poll.h
+#include linux/slab.h
+#include linux/time.h
+#include logger.h
+
+#include asm/ioctls.h
+
+/*
+ * struct logger_log - represents a specific log, such as 'main' or 'radio'
+ *
+ * This structure lives from module insertion until module removal, so it does
+ * not need additional reference counting. The structure is protected by the
+ * mutex 'mutex'.
+ */
+struct logger_log {
+   unsigned char   *buffer;/* the ring buffer itself */
+   struct miscdevice   misc;   /* misc device representing the log */
+   wait_queue_head_t   wq; /* wait queue for readers */
+   struct list_headreaders; /* this log's readers */
+   struct mutexmutex;  /* mutex protecting buffer */
+   size_t  w_off;  /* current write head offset */
+   size_t  head;   /* new readers start here */
+   size_t  size;   /* size of the log */
+};
+
+/*
+ * struct logger_reader - a logging device open for reading
+ *
+ * This object lives from open to release, so we don't need additional
+ * reference counting. The structure is protected by log-mutex.
+ */
+struct logger_reader {
+   struct logger_log   *log;   /* associated log */
+   struct list_headlist;   /* entry in logger_log's list */
+   size_t  r_off;  /* current read head offset */
+};
+
+/* logger_offset - returns index 'n' into the log via (optimized) modulus */
+#define logger_offset(n)   ((n)  (log-size - 1))
+
+/*
+ * file_get_log - Given a file structure, return the associated log
+ *
+ * This isn't aesthetic. We have several goals:
+ *
+ * 1) Need to quickly obtain the associated log during an I/O operation
+ * 2) Readers need to maintain state (logger_reader)
+ * 3) Writers need to be very fast (open() should be a near no-op)
+ *
+ * In the reader case, we can trivially go file-logger_reader-logger_log.
+ * For a writer, we don't want to maintain a logger_reader, so we just go
+ * file-logger_log. Thus 

Re: RFC: android logger feedback request

2011-12-21 Thread john stultz
On Wed, 2011-12-21 at 15:19 -0800, Greg KH wrote:
 On Wed, Dec 21, 2011 at 02:59:15PM -0800, Tim Bird wrote:
  Hi all,
  
  I'm looking for feedback on the Android logger code, to see what
  it would take to make this code acceptable for inclusion in
  the mainline kernel.
  
  Information about the features of Android logging system
  can be found at: http://elinux.org/Android_Logging_System
  
  This system creates a new system-wide logging service, in
  the kernel, for user-space message.  It is more comparable
  to syslog than to the kernel log buffer, as it holds only
  user-space messages.  It is optimized for write
  performance, since most of the time the log is written to
  and never read.  It creates multiple log channels, to prevent
  an abundance of log messages in one channel from overwriting
  messages in another channel.  The log channels have sizes
  fixed at kernel compile-time.
  
  Log messages are stored in very simple in-kernel buffers, that
  overflow old messages upon wrapping.  A fixed set of attributes
  (pid, tid, timestamp and message), is kept for each message.
  By convention, Android puts a message priority and context tag
  into each message.
  
  In Android, this system uses a fixed set of device nodes with
  well-known names: /dev/log/main, /dev/log/events, /dev/log/radio
  and /dev/log/system.
  
  Operations on the log are done via a character device, using
  standard file operations and some ioctls.
  
  The code for this is below (I've moved it from linux-next
  drivers/staging/android for my own testing).
  
  Please let me know what issues you see with this code.
 
 That all describes the current code, but you haven't described what's
 wrong with the existing syslog interface that requires this new driver
 to be written.  And why can't the existing interface be fixed to address
 these (potential) shortcomings?
 
  One specific question I have is where is the most appropriate
  place for this code to live, in the kernel source tree?
  Other embedded systems might want to use this system (it
  is simpler than syslog, and superior in some ways), so I don't
  think it should remain in an android-specific directory.
 
 What way is it superior?  Again, why not extend syslog?  Why not fix
 syslog if this really is a superior thing?  How does this tie into Kay
 and Lennard's proposal for work in this area?

There is also some overlap functionality wise with pstore as well, as I
believe the logger is used as a known location in memory where messages
can be fetched from after a kernel panic or crash.

thanks
-john

--
To unsubscribe from this list: send the line unsubscribe linux-embedded in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RFC: android logger feedback request

2011-12-21 Thread Greg KH
On Wed, Dec 21, 2011 at 04:18:11PM -0800, john stultz wrote:
 On Wed, 2011-12-21 at 15:19 -0800, Greg KH wrote:
  On Wed, Dec 21, 2011 at 02:59:15PM -0800, Tim Bird wrote:
   Hi all,
   
   I'm looking for feedback on the Android logger code, to see what
   it would take to make this code acceptable for inclusion in
   the mainline kernel.
   
   Information about the features of Android logging system
   can be found at: http://elinux.org/Android_Logging_System
   
   This system creates a new system-wide logging service, in
   the kernel, for user-space message.  It is more comparable
   to syslog than to the kernel log buffer, as it holds only
   user-space messages.  It is optimized for write
   performance, since most of the time the log is written to
   and never read.  It creates multiple log channels, to prevent
   an abundance of log messages in one channel from overwriting
   messages in another channel.  The log channels have sizes
   fixed at kernel compile-time.
   
   Log messages are stored in very simple in-kernel buffers, that
   overflow old messages upon wrapping.  A fixed set of attributes
   (pid, tid, timestamp and message), is kept for each message.
   By convention, Android puts a message priority and context tag
   into each message.
   
   In Android, this system uses a fixed set of device nodes with
   well-known names: /dev/log/main, /dev/log/events, /dev/log/radio
   and /dev/log/system.
   
   Operations on the log are done via a character device, using
   standard file operations and some ioctls.
   
   The code for this is below (I've moved it from linux-next
   drivers/staging/android for my own testing).
   
   Please let me know what issues you see with this code.
  
  That all describes the current code, but you haven't described what's
  wrong with the existing syslog interface that requires this new driver
  to be written.  And why can't the existing interface be fixed to address
  these (potential) shortcomings?
  
   One specific question I have is where is the most appropriate
   place for this code to live, in the kernel source tree?
   Other embedded systems might want to use this system (it
   is simpler than syslog, and superior in some ways), so I don't
   think it should remain in an android-specific directory.
  
  What way is it superior?  Again, why not extend syslog?  Why not fix
  syslog if this really is a superior thing?  How does this tie into Kay
  and Lennard's proposal for work in this area?
 
 There is also some overlap functionality wise with pstore as well, as I
 believe the logger is used as a known location in memory where messages
 can be fetched from after a kernel panic or crash.

That sounds like a major overlap, can't pstore be used for this
functionality today instead of the logger code?

greg k-h
--
To unsubscribe from this list: send the line unsubscribe linux-embedded in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RFC: android logger feedback request

2011-12-21 Thread john stultz
On Wed, 2011-12-21 at 16:27 -0800, Greg KH wrote:
 On Wed, Dec 21, 2011 at 04:18:11PM -0800, john stultz wrote:
  On Wed, 2011-12-21 at 15:19 -0800, Greg KH wrote:
   What way is it superior?  Again, why not extend syslog?  Why not fix
   syslog if this really is a superior thing?  How does this tie into Kay
   and Lennard's proposal for work in this area?
  
  There is also some overlap functionality wise with pstore as well, as I
  believe the logger is used as a known location in memory where messages
  can be fetched from after a kernel panic or crash.
 
 That sounds like a major overlap, can't pstore be used for this
 functionality today instead of the logger code?

It may be able to suffice for some portion of the functionality, it was
suggested that we consider doing something like try to make logger use
pstore as a backend.  I'm not really familiar with pstore, but I don't
know if its able to preserve messages from userland, so it may not
really be sufficient in of itself.

Also, if I understand correctly, logger also is able to store messages
from hardware modems or other hardware that run their own micro-os in a
central location. This may be closer to the pstore mce bits, but its not
clear yet to me how much overlap is there.

But I've not yet really spent too much time looking at logger yet, so I
can't really comment too strongly here. Just wanted to make sure folks
have looked at all the other similar work.

thanks
-john

--
To unsubscribe from this list: send the line unsubscribe linux-embedded in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RFC: android logger feedback request

2011-12-21 Thread NeilBrown
On Wed, 21 Dec 2011 16:36:21 -0800 Tim Bird tim.b...@am.sony.com wrote:

 On 12/21/2011 03:19 PM, Greg KH wrote:
  That all describes the current code, but you haven't described what's
  wrong with the existing syslog interface that requires this new driver
  to be written.  And why can't the existing interface be fixed to address
  these (potential) shortcomings?
 
 
  One specific question I have is where is the most appropriate
  place for this code to live, in the kernel source tree?
  Other embedded systems might want to use this system (it
  is simpler than syslog, and superior in some ways), so I don't
  think it should remain in an android-specific directory.
  
  What way is it superior?
 
 Here are some ways that this code is superior to syslog:

It is certainly nice and simple.  It really looks more like a filesystem than
a char device though...  though they aren't really files so much as lossy
pipes.  I don't think that's a problem though, lots of things in filesystems
don't behave exactly like files.

If you created a 'logbuf' filesystem that used libfs to provide a single
directory in which privileged processes could create files then you wouldn't
need the kernel to know the allowed logs: radio, events, main, system.
The size could be set by ftruncate() (by privileged used again) rather than
being hardcoded.

You would defined 'read' and 'write' much like you currently do to create a 
list of
datagrams in a circular buffer and replace the ioctls by more standard
interfaces:

LOGGER_GET_LOG_BUG_SIZE would use 'stat' and the st_blocks field
LOGGER_GET_LOG_LEN would use 'stat' and the st_size field
LOGGER_GET_NEXT_ENTRY_LEN could use the FIONREAD ioctl
LOGGER_FLUSH_LOG could use ftruncate

The result would be much the same amount of code, but an interface which has
fewer details hard-coded and is generally more versatile and accessible.

NeilBrown



signature.asc
Description: PGP signature


Re: RFC: android logger feedback request

2011-12-21 Thread Tim Bird
On 12/21/2011 04:51 PM, Greg KH wrote:
 On Wed, Dec 21, 2011 at 04:36:21PM -0800, Tim Bird wrote:
 On 12/21/2011 03:19 PM, Greg KH wrote:

 Huh, I'm not talking about syslogd, I'm talking about the syslog(2)
 syscall we have.

OK - switching gears.  Since the kernel log buffer isn't normally
used to store use-space messages, I thought you were referring
to syslog(3) and the associated (logger(1) and syslogd(8)).

 This character interface seems very close to the syslog(2) api, but just
 done in a character interface, with ioctls, which also require userspace
 tools to manage properly, so I fail to see the big gain here.
 
 What am I missing?

syslog(2) would more aptly be named klogctrl() (and it is in glibc)

There's currently no operation in sys_sylog (the kernel function
implementing syslog(2)) for writing to the log.  The write operation
to the kernel log buffer is also done via a character interface
/dev/kmsg (via code in drivers/char/mem.c)  This is actually very
similar to what the Android logger code does.

But while the kernel log buffer has lots of similarities to the Android logger
there are some key differences which I think are important to isolate
from a user-space logging system.

Here's a stream-of-consciousness dump of the differences:

The printk interface in the kernel is almost always automatically drained
to the device console, at the time of the printk (after the message is dropped
into the log buffer itself).  This extra operation is not needed for most
application-level messages that go into the log, and incurs extra overhead
in the log buffer code.

The printk code is especially designed to be called from within any kernel
context (including interrupt code), and so has some locking avoidance code
paths and complexity that are not needed for code which handles strictly
user-space messages.

Oddly enough, the printk code paths in the kernel can end up doing
a fair amount of print formatting, which can be time-consuming.  The code
path in kmsg_writev() contains at least one kmalloc, which could fail
when running out of memory.  The code path in the logger is much simpler,
consisting really of only a data copy.

Timestamping is not automatically appended to messages going into the
kernel log buffer (but they can be optionally pre-pended, with control
configurable at runtime).  They are represented
as ASCII text, which consumes a little more than twice the overhead of
a 32-bit binary field.  PID and TID are not automatically preserved in
the log. The kernel keeps it's priority in text also, and has no convention
for contextual tagging.  I'm not sure that we should change the
kernel log buffer to support structured binary data, in addition to the
free-form ASCII data that the kernel uses now.

The kernel log buffer does not support separate channels for different
classes of log messages (indeed, there is only one channel, and it has
kernel messages).  A new system call (or some backwards-compatible tweak
to the existing syslog(2) call would have to be added to support
a channel ID in order to support this.

There *are* some benefits to intermingling the kernel log messages and the
user-space log messages, but I think they are outweighed by the
value in keeping these systems separate.  There might be the opportunity
for code reuse, but I suspect we'd end up with about the same amount
of code increase overall (and possibly an additional syscall), and add
some unneeded complexity to the prink code path to accomplish it.

I just read Neil Brown's suggestion for doing this via a filesystem rather
than a char device, and it's interesting.  I'll respond to that separately.

 -- Tim

=
Tim Bird
Architecture Group Chair, CE Workgroup of the Linux Foundation
Senior Staff Engineer, Sony Network Entertainment
=

--
To unsubscribe from this list: send the line unsubscribe linux-embedded in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RFC: android logger feedback request

2011-12-21 Thread Greg KH
On Wed, Dec 21, 2011 at 05:32:36PM -0800, Tim Bird wrote:
 On 12/21/2011 04:51 PM, Greg KH wrote:
  On Wed, Dec 21, 2011 at 04:36:21PM -0800, Tim Bird wrote:
  On 12/21/2011 03:19 PM, Greg KH wrote:
 
  Huh, I'm not talking about syslogd, I'm talking about the syslog(2)
  syscall we have.
 
 OK - switching gears.  Since the kernel log buffer isn't normally
 used to store use-space messages, I thought you were referring
 to syslog(3) and the associated (logger(1) and syslogd(8)).

The kernel log buffer has been storing userspace messages for a while
now, look at your boot log on the latest Fedora and openSUSE releases
(or any other distro using systemd for booting).

  This character interface seems very close to the syslog(2) api, but just
  done in a character interface, with ioctls, which also require userspace
  tools to manage properly, so I fail to see the big gain here.
  
  What am I missing?
 
 syslog(2) would more aptly be named klogctrl() (and it is in glibc)

Maybe, but this is a kernel mailing list, we don't worry about glibc
much here :)

 There's currently no operation in sys_sylog (the kernel function
 implementing syslog(2)) for writing to the log.  The write operation
 to the kernel log buffer is also done via a character interface
 /dev/kmsg (via code in drivers/char/mem.c)  This is actually very
 similar to what the Android logger code does.

Again, see above, this has been in the kernel for quite a while now...

 But while the kernel log buffer has lots of similarities to the Android logger
 there are some key differences which I think are important to isolate
 from a user-space logging system.
 
 Here's a stream-of-consciousness dump of the differences:
 
 The printk interface in the kernel is almost always automatically drained
 to the device console, at the time of the printk (after the message is dropped
 into the log buffer itself).  This extra operation is not needed for most
 application-level messages that go into the log, and incurs extra overhead
 in the log buffer code.
 
 The printk code is especially designed to be called from within any kernel
 context (including interrupt code), and so has some locking avoidance code
 paths and complexity that are not needed for code which handles strictly
 user-space messages.
 
 Oddly enough, the printk code paths in the kernel can end up doing
 a fair amount of print formatting, which can be time-consuming.  The code
 path in kmsg_writev() contains at least one kmalloc, which could fail
 when running out of memory.  The code path in the logger is much simpler,
 consisting really of only a data copy.
 
 Timestamping is not automatically appended to messages going into the
 kernel log buffer (but they can be optionally pre-pended, with control
 configurable at runtime).  They are represented
 as ASCII text, which consumes a little more than twice the overhead of
 a 32-bit binary field.  PID and TID are not automatically preserved in
 the log. The kernel keeps it's priority in text also, and has no convention
 for contextual tagging.  I'm not sure that we should change the
 kernel log buffer to support structured binary data, in addition to the
 free-form ASCII data that the kernel uses now.
 
 The kernel log buffer does not support separate channels for different
 classes of log messages (indeed, there is only one channel, and it has
 kernel messages).  A new system call (or some backwards-compatible tweak
 to the existing syslog(2) call would have to be added to support
 a channel ID in order to support this.
 
 There *are* some benefits to intermingling the kernel log messages and the
 user-space log messages, but I think they are outweighed by the
 value in keeping these systems separate.  There might be the opportunity
 for code reuse, but I suspect we'd end up with about the same amount
 of code increase overall (and possibly an additional syscall), and add
 some unneeded complexity to the prink code path to accomplish it.

Again, please see what we are already doing in the kernel and userspace,
I think a lot of the above is already implemented.

Which brings me back to my original question, what does this code do,
that is not already present in the kernel, and why is a totally new
interface being proposed for it?

thanks,

greg k-h
--
To unsubscribe from this list: send the line unsubscribe linux-embedded in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RFC: android logger feedback request

2011-12-21 Thread Greg KH
On Wed, Dec 21, 2011 at 06:12:39PM -0800, Tim Bird wrote:
 Secondly, this is not a particularly new or radical interface. It is new
 to legacy Linux, but it's been in the Android Linux kernel for some years
 now, and it has worked well.

Sorry, but you know full well that is not a justification for _any_
chunk of code to be included in the main kernel tree.  Staging is fine,
but to add a new kernel/user api that looks at first glance, to merely
duplicate the existing in-kernel apis needs a very good explaination for
why it needs to be included.

greg k-h
--
To unsubscribe from this list: send the line unsubscribe linux-embedded in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RFC: android logger feedback request

2011-12-21 Thread Greg KH
On Wed, Dec 21, 2011 at 06:12:39PM -0800, Tim Bird wrote:
  Again, please see what we are already doing in the kernel and userspace,
  I think a lot of the above is already implemented.
 
 I don't know what systemd has got going on in user-space.  I'm looking
 at a very recent kernel, and I see no support for multiple log channels,
 or an optimized open/write path.

How is the existing syslog read path not optimized?  What kind of
speed and numbers are we talking about here?

Oh, and you do know about the userspace printk tty driver, right?  What
about using that combined with the existing syslog system call?

systemd doesn't use it, but I know of a few embedded systems that are
already using it today, and I think it might solve the android issues
already.

thanks,

greg k-h
--
To unsubscribe from this list: send the line unsubscribe linux-embedded in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RFC: android logger feedback request

2011-12-21 Thread Brian Swetland
On Wed, Dec 21, 2011 at 8:12 PM, Kay Sievers kay.siev...@vrfy.org wrote:

 We would like to have for the general Linux use case is a kmsg, that
 supports structured data, but looks the same as it looks today to
 current and legacy tools.

 We would like to have every printk that does not start with a
 KERN_CONT to create a record in the kmsg buffer. This record would
 carry directly a timestamp, the log facility and the log level and a
 reference to the usual human readable string.

 The current kmsg prefixing of the timestamp would be a runtime switch,
 and just added when the records are traversed, instead of the current
 buffer printing.

 In addition to this common data, every record can carry a
 'dictionary', which basically looks like a process environment, means
 KEY=value pairs. The additional data can carry all the stuff that is
 needed for structured logging and provides the base for machine
 readable log information.

 We don't want to separate multiple stores to avoid ordering problems
 when the messages need to be merged from the multiple streams on the
 receiver side.

 I think this is theoretically all possible with the current kmsg,
 without breaking anything. The current interfaces would iterate over
 the record list and print out the human readable parts and skip the
 metadata, instead of just copying a linear byte stream. To tools, it
 could look the same as it is currently. It's just that the ring buffer
 is not a byte- but a record-buffer.

 Advanced tools could get a new interface or switch the old interface
 to pipe out the structured data too.

 In the systemd journal, we have an ASCII-like stream format that
 carries out structured data and is binary data capable.

 If we want to think about any next generation logging, I'm convinced,
 we need to support records, structured data and binary data; anything
 else is unlikely worth to change the current stuff.

Any thoughts as to how one could allow N userspace agents to log to a
single shared buffer without one agent, if buggy or malicious,
spamming out all the other contents of the log?  This is one of the
main reasons we maintain separate logs in Android today, and are
likely to continue doing so until we figure out how resolve the issue.

Having everything in one unified log definitely has benefits (it
certainly makes keeping everything ordered and reading everything back
simpler), however we can't control third party applications (by
design), so systems that require apps to just don't do that are not
really solving that problem for us.

Also, as I mentioned earlier, from a security standpoint it is highly
desirable to be able to restrict which records certain readers can
read.  Convincing third party app developers to not log sensitive or
PII data remains a challenge -- providing the ability for filtered
read access allows some mitigation (app can retrieve it's own logs for
a bug report but can't sift through all logs looking for interesting
tidbits).

Brian
--
To unsubscribe from this list: send the line unsubscribe linux-embedded in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RFC: android logger feedback request

2011-12-21 Thread Kay Sievers
On Thu, Dec 22, 2011 at 04:49, Greg KH gre...@suse.de wrote:
 On Wed, Dec 21, 2011 at 06:34:08PM -0800, Brian Swetland wrote:
 The goals behind the logger driver have been:
 - keep userland and kernel logging separate (so that spammy userland
 logging doesn't make us lose critical kernel logs or the other way
 round)

 Wouldn't a simple userspace daemon solve this, writing the data to a
 ramfs file?

That's what the systemd journal does in /run when /var is not writable
or the /var/log/journal directory does not exist. All just lands on a
tmpfs until the next reboot. The journal database just acts as a
ringbuffer on a filesystem.

 - make log writing very inexpensive -- avoid having to pass messages
 between processes (more critical on ARM9 platforms where this implied
 extra cache flushing), avoid having to make several syscalls to write
 a log message (getting time of day, etc), and so on

 ramfs?

The receiver gets almost all of this from the socket with
SCM_TIMESTAP, and SCM_CREDENTIALS. Needed stuff can be added in a
similar way. This is trusted, reliable and pretty cheap to retrieve
data.

 - make log writing reliable -- don't trust userland to report its
 timestamp, PID, or to correctly format the datagrams, etc

 existing userspace printk tty driver?

Sockets provide all that today already in a trusted way.

 - allow a log watching process (logcat) to easily pull data from all
 logs at once

 what do you mean by at once?

/proc/kmsg provides that today, which is basically just a file wrapper
around the kernel's (misleadingly named) syslog() call.

 - avoid committing a vast amount of memory to logging

 memory where, in code or in the data being logged?

 - try to prevent clients from spamming each other out of log space
 (only successful on a coarse granularity right now with the
 main/system/radio/events logs)
 - ensure logs are not lost at the moment an app crashes

 Which logs?

Things need rate limits and quota based on the remaining buffer or
disk space. The systemd journal maintains a separate journal for every
login-uid, so every logged in user gets his own journal database,
which can not overflow the systems database or the databases of the
other users.

Every user can log to the journal, but only the journal daemon can
write to the databases. The user can directly read back its own
journal database, but not write it.

 On one hand, having each app (per PID) be able to create their own
 logs up to a specified size limit could be really useful and is
 something we've kicked around -- for one it would allow us to avoid
 the ever present request from userspace developers to increase the log
 size because of too much log spam (reduce log spam never seems to be
 an answer that makes them happy) -- but we haven't come up with a
 reasonable plan for dealing with well if we allow 16KB of log per app
 and the user installs 100 apps, they may be pinning up to 1.6MB of ram
 worst case, and so on.

 I think the userspace printk and syslog might already handle most of
 this today.  Tim, care to look into that and see if it does or not?

It's very simple what we have here in the kernel, and if syslog
facility prefixes are used, it can probably replace most of the
Android log solution just fine. We can not really protect one store to
flush the entire buffer though, that's what the 4 separated stores can
easily provide.

But as mentioned, I think we want the ability to work with structured
data, which both kernel tools are incapable of.

Kay
--
To unsubscribe from this list: send the line unsubscribe linux-embedded in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RFC: android logger feedback request

2011-12-21 Thread david

On Thu, 22 Dec 2011, Kay Sievers wrote:


If we want to think about any next generation logging, I'm convinced,
we need to support records, structured data and binary data; anything
else is unlikely worth to change the current stuff.


note that current generation syslog supports records, structured data, 
metadata capture of the sending process, everything you listed except 
unencoded binary data.


David Lang
--
To unsubscribe from this list: send the line unsubscribe linux-embedded in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RFC: android logger feedback request

2011-12-21 Thread Brian Swetland
On Wed, Dec 21, 2011 at 8:47 PM,  da...@lang.hm wrote:
 On Wed, 21 Dec 2011, Brian Swetland wrote:
 Any thoughts as to how one could allow N userspace agents to log to a
 single shared buffer without one agent, if buggy or malicious,
 spamming out all the other contents of the log?  This is one of the
 main reasons we maintain separate logs in Android today, and are
 likely to continue doing so until we figure out how resolve the issue.

 note that you still don't prevent one app from blowing out the logs, all you
 do is separate the logs into a handfulof categories and limit the app
 (through standard filesystem permissions) to only blowing out one category
 of logs.

 or do I misunderstand this and you actually keep a separate queue for each
 pid (which would seem to be a problem as you no longer know how much space
 you need)

You're correct -- that's the model at present.  We'd like to move to a
model where we have better control and are kicking around ideas for
how to do this for the next platform version.

The rate at which apps push data into logs is pretty amazing at times.
 The system booting (maybe 10-20 services and 10+ apps starting up)
can blow through 256K of ringbuffer in seconds.

Having enough context available when something crashes to be able to
diagnose what went wrong, especially when problems could involve
interaction between multiple agents in multiple processes is important
for the frameworks and apps teams.

Trying to get individual developers to be frugal with their logging
has turned out to be mostly a losing battle.

Brian
--
To unsubscribe from this list: send the line unsubscribe linux-embedded in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RFC: android logger feedback request

2011-12-21 Thread david

On Wed, 21 Dec 2011, Brian Swetland wrote:


On Wed, Dec 21, 2011 at 5:20 PM, NeilBrown ne...@suse.de wrote:

On Wed, 21 Dec 2011 16:36:21 -0800 Tim Bird tim.b...@am.sony.com wrote:


On 12/21/2011 03:19 PM, Greg KH wrote:

That all describes the current code, but you haven't described what's
wrong with the existing syslog interface that requires this new driver
to be written.  And why can't the existing interface be fixed to address
these (potential) shortcomings?



One specific question I have is where is the most appropriate
place for this code to live, in the kernel source tree?
Other embedded systems might want to use this system (it
is simpler than syslog, and superior in some ways), so I don't
think it should remain in an android-specific directory.


What way is it superior?


Here are some ways that this code is superior to syslog:


It is certainly nice and simple.  It really looks more like a filesystem than
a char device though...  though they aren't really files so much as lossy
pipes.  I don't think that's a problem though, lots of things in filesystems
don't behave exactly like files.

If you created a 'logbuf' filesystem that used libfs to provide a single
directory in which privileged processes could create files then you wouldn't
need the kernel to know the allowed logs: radio, events, main, system.
The size could be set by ftruncate() (by privileged used again) rather than
being hardcoded.

SNIP

The result would be much the same amount of code, but an interface which has
fewer details hard-coded and is generally more versatile and accessible.


Moving away from hard coding the names/sizes of the logs in the driver
is something that has been on the todo list for a while.  One thing
we'd likely want to accomplish there is avoid creating a vector for
consuming large amounts of memory by creating new logs.

One planned change (likely to happen in the Android J release
timeframe) is to adjust permissions such that any process can write
messages, but unless they belong to the correct group they can only
read back messages written by their own PID.  This is to allow apps to
grab their own log output after a crash or during a user problem
report without needing to grant them the ability to read all log
messages.

Currently the logger driver does not provide a mechanism for allowing
logs to survive a reboot (unlike the ramconsole), but this is
functionality that we've thought about adding.  Generally the kernel
logs are most interesting after an unexpected panic/reboot, but
getting a picture of what userspace has been up to can be useful too.

The goals behind the logger driver have been:
- keep userland and kernel logging separate (so that spammy userland
logging doesn't make us lose critical kernel logs or the other way
round)
- make log writing very inexpensive -- avoid having to pass messages
between processes (more critical on ARM9 platforms where this implied
extra cache flushing), avoid having to make several syscalls to write
a log message (getting time of day, etc), and so on
- make log writing reliable -- don't trust userland to report its
timestamp, PID, or to correctly format the datagrams, etc
- allow a log watching process (logcat) to easily pull data from all
logs at once
- avoid committing a vast amount of memory to logging
- try to prevent clients from spamming each other out of log space
(only successful on a coarse granularity right now with the
main/system/radio/events logs)
- ensure logs are not lost at the moment an app crashes

On one hand, having each app (per PID) be able to create their own
logs up to a specified size limit could be really useful and is
something we've kicked around -- for one it would allow us to avoid
the ever present request from userspace developers to increase the log
size because of too much log spam (reduce log spam never seems to be
an answer that makes them happy) -- but we haven't come up with a
reasonable plan for dealing with well if we allow 16KB of log per app
and the user installs 100 apps, they may be pinning up to 1.6MB of ram
worst case, and so on.


At this point you are starting to sound like something much closer to a 
traditional syslog daemon. you are adding so many variations, persistant 
storage, etc that you really don't want to have to have all this in the 
kernel, make these be interfaces into a userspace logging tool (ideally 
syslog compatible), and you have the option to easily have different 
policies, consuming different amounts of space, depending on the device 
and how resource limited it is.


David Lang

Re: RFC: android logger feedback request

2011-12-21 Thread david

On Wed, 21 Dec 2011, Brian Swetland wrote:


On Wed, Dec 21, 2011 at 8:47 PM,  da...@lang.hm wrote:

On Wed, 21 Dec 2011, Brian Swetland wrote:

Any thoughts as to how one could allow N userspace agents to log to a
single shared buffer without one agent, if buggy or malicious,
spamming out all the other contents of the log?  This is one of the
main reasons we maintain separate logs in Android today, and are
likely to continue doing so until we figure out how resolve the issue.


note that you still don't prevent one app from blowing out the logs, all you
do is separate the logs into a handfulof categories and limit the app
(through standard filesystem permissions) to only blowing out one category
of logs.

or do I misunderstand this and you actually keep a separate queue for each
pid (which would seem to be a problem as you no longer know how much space
you need)


You're correct -- that's the model at present.  We'd like to move to a
model where we have better control and are kicking around ideas for
how to do this for the next platform version.

The rate at which apps push data into logs is pretty amazing at times.
The system booting (maybe 10-20 services and 10+ apps starting up)
can blow through 256K of ringbuffer in seconds.

Having enough context available when something crashes to be able to
diagnose what went wrong, especially when problems could involve
interaction between multiple agents in multiple processes is important
for the frameworks and apps teams.


this makes it sound more like this belongs in userspace instead of the 
kernel. if something crashes and they can flip a flag in userspace to say 
'keep more logs now' and try to duplicate the crash is going to be far 
easier than having to compile a new kernel to extend the size of the log.



Trying to get individual developers to be frugal with their logging
has turned out to be mostly a losing battle.


that's for sure, anything that's success depends on developers doing 
things right that aren't visible to the user are pretty much doomed.


David Lang

Re: RFC: android logger feedback request

2011-12-21 Thread david

On Wed, 21 Dec 2011, Brian Swetland wrote:


The result would be much the same amount of code, but an interface which
has
fewer details hard-coded and is generally more versatile and accessible.


That sounds better than what has been done in android, but it is still _far_
more limited than doing something that could be replaced by a fairly
standard syslog daemon.


We're really not interested in adding another daemon to the platform
-- the logging system we have has served us well, is integrated with
our existing development tools, and we're definitely interested in
improving it, but throwing it out and replacing it with a userspace
solution is not interesting to us right now.


Think very hard before you reject any possibility of doing this in 
userspace, especially with all the things that you are talking about doing 
with logging in the future. I really don't think aht a lot of your 
long-0term plans for logging are going to sit well with the kernel 
developers, and if your justification is you don't want to change your 
build process, I really doubt that you will get this upstream.


It should be possible to do this without having to change the tools 
writing the logs, any change in logging will change what it takes to read 
the logs.


I am not saying that you need to have a logging daemon as heavyweight as 
syslog-ng or rsyslog for the low-end phones, but I do think that as 
android moves up the stack a bit into talets and netbooks (especially in 
applications where it will have wifi connectivity almost all the time) 
having the capability to move to a full-blown syslog daemon will make a 
huge amount of sense, so you should look at how big a minimalist daemon 
would be, and what sort of performance it would have.


David Lang
--
To unsubscribe from this list: send the line unsubscribe linux-embedded in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RFC: android logger feedback request

2011-12-21 Thread Brian Swetland
On Wed, Dec 21, 2011 at 9:14 PM,  da...@lang.hm wrote:
 On Wed, 21 Dec 2011, Brian Swetland wrote:

 We're really not interested in adding another daemon to the platform
 -- the logging system we have has served us well, is integrated with
 our existing development tools, and we're definitely interested in
 improving it, but throwing it out and replacing it with a userspace
 solution is not interesting to us right now.

 Think very hard before you reject any possibility of doing this in
 userspace, especially with all the things that you are talking about doing
 with logging in the future. I really don't think aht a lot of your
 long-0term plans for logging are going to sit well with the kernel
 developers, and if your justification is you don't want to change your
 build process, I really doubt that you will get this upstream.

 It should be possible to do this without having to change the tools writing
 the logs, any change in logging will change what it takes to read the logs.

 I am not saying that you need to have a logging daemon as heavyweight as
 syslog-ng or rsyslog for the low-end phones, but I do think that as android
 moves up the stack a bit into talets and netbooks (especially in
 applications where it will have wifi connectivity almost all the time)
 having the capability to move to a full-blown syslog daemon will make a huge
 amount of sense, so you should look at how big a minimalist daemon would be,
 and what sort of performance it would have.

Long term things could certainly move around, but *right now* we
really don't see the value in gutting what we've got in favor of
writing a new userspace daemon, which is going to require somebody to
do that work, maintain it, chase down any behavioural quirks
introduced by the changes, etc.  Or to throw it out in favor of trying
to bolt our logging infrastructure on top of existing syslogds, etc.

We're happy to maintain the logger driver out of tree as we have for
the past four years.

If all discussions of bringing Android kernel support to mainline end
up as another round of you should just rewrite it all in userspace
or you should use this other subsystem that doesn't actually solve
your problem but we think it's close enough, then there's not a lot
of point to having the discussions in the first place.

If somebody wants to go write a complete compatible replacement that
just drops in, we certainly could take a look at it and see how it
works, but given that the benefits are not clear to us, we're not
interested in going off and doing that work ourselves.

Brian
--
To unsubscribe from this list: send the line unsubscribe linux-embedded in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RFC: android logger feedback request

2011-12-21 Thread Greg KH
On Wed, Dec 21, 2011 at 09:25:52PM -0800, Brian Swetland wrote:
 Long term things could certainly move around, but *right now* we
 really don't see the value in gutting what we've got in favor of
 writing a new userspace daemon, which is going to require somebody to
 do that work, maintain it, chase down any behavioural quirks
 introduced by the changes, etc.  Or to throw it out in favor of trying
 to bolt our logging infrastructure on top of existing syslogds, etc.
 
 We're happy to maintain the logger driver out of tree as we have for
 the past four years.

This all came about because Tim asked to merge this to the main tree
as-is, and so, people rightfully pointed out different options that
might be considered if it were to be merged that way.

I have no problem leaving the logger driver in staging, but it seems
that Tim is taking on the task to do the harder thing here, which
probably would entail work on both sides, which as a openhandset
alliance company member, he might have a change that someone like me
might not :)

Tim, it's now in your court it seems, have a great holiday,

greg k-h
--
To unsubscribe from this list: send the line unsubscribe linux-embedded in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html