Re: Memory pool interface design

2015-05-16 Thread Elazar Leibovich
The question of whether to use a global malloc function, or to use a
function pointer is orthogonal to my question.

My question is, should I support the case of malloc failure. On one hand,
it complicates the API significantly, but on the other hand it might be
useful for some use cases.

It's pretty obvious to me that in a modern Linux userspace program,
supporting malloc failure is not worth the trouble. But are there other
use cases where it's vital?

Another clarification: my code would never call abort. What I was saying is
that the allocator could simply abort the current task if it runs out of
memory.

As a side note, in my experience it is sometimes useful to use
preallocated memory pools[0]. Letting the user choose the memory allocator
is also useful when the library is used in the kernel, since otherwise it
simply won't compile. See for example protobuf-c, which receives an
allocator in its functions:
https://github.com/protobuf-c/protobuf-c/blob/master/protobuf-c/protobuf-c.c#L2019


[0]
http://eli.thegreenplace.net/2008/10/17/memmgr-a-fixed-pool-memory-allocator

On Fri, May 15, 2015 at 9:00 PM, Baruch Even bar...@ev-en.org wrote:

 I would question the need to abstract away the memory allocations of your
 library compared to everything else. If someone cares enough about it he
 can replace malloc and free completely to use a different allocation scheme.

 In most cases where I've cared about memory allocations, I wanted none of
 them at all: only intrusive data structures, with the system running on a
 fixed memory allocation from start to end. That's not always possible in
 a generic library, though.

 If you are writing a library you should never abort inside it; that would
 be very annoying to the user. Give him a NULL and let him crash or handle
 it as he sees fit.

 Baruch

 On Fri, May 15, 2015 at 5:47 PM, Elazar Leibovich elaz...@gmail.com
 wrote:

 I'm writing a small C library that I want to open source.

 I want it to be usable in embedded environments, where memory
 allocation must be controlled.

 Hence, I abstracted away calls to malloc/realloc, and replaced them with

 struct mem_pool {
     void *(*alloc)(void *mem_pool, void *prev_ptr, int size);
 };

 The user would implement

 struct my_mem_pool {
     struct mem_pool pool;
     ...
 };

 struct my_mem_pool pool = { { my_alloc_func }, ... };

 I have two design questions I'm interested in:

 1) Should I support both malloc and realloc?

 I think the performance benefits of supporting a separate malloc instead
 of realloc(NULL, size) are negligible, and not worth complicating the
 interface.
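
A minimal sketch of the single-entry-point idea, assuming the callback follows realloc semantics (a NULL prev_ptr means a fresh allocation). The struct is the one from the post; heap_alloc, heap_pool and dup_and_grow are hypothetical names used only for illustration:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Interface as given in the post: one realloc-style entry point. */
struct mem_pool {
    void *(*alloc)(void *mem_pool, void *prev_ptr, int size);
};

/* Hypothetical heap-backed implementation: realloc(NULL, n) behaves
 * like malloc(n), so no separate malloc entry point is needed. */
static void *heap_alloc(void *mem_pool, void *prev_ptr, int size)
{
    (void)mem_pool;               /* this pool keeps no state */
    if (size <= 0)
        return NULL;              /* reject empty or negative requests */
    return realloc(prev_ptr, (size_t)size);
}

static struct mem_pool heap_pool = { heap_alloc };

/* Hypothetical helper: copy a string into the pool, then grow it. */
static char *dup_and_grow(struct mem_pool *pool, const char *s, int extra)
{
    assert(pool != NULL && pool->alloc != NULL);
    int len = (int)strlen(s) + 1;
    char *p = pool->alloc(pool, NULL, len);       /* acts as malloc */
    if (p == NULL)
        return NULL;
    memcpy(p, s, (size_t)len);
    char *q = pool->alloc(pool, p, len + extra);  /* acts as realloc */
    if (q == NULL) {
        free(p);   /* valid only because this pool is heap-backed */
        return NULL;
    }
    return q;
}
```

Note that this sketch still checks for NULL, so it does not settle question 2 below; it only shows that one callback can cover both malloc and realloc.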

 2) Should the memory pool be allowed to fail?

 On a typical Linux system, where memory overcommit is allowed, checking
 malloc's return value provides little benefit. But is the same true for
 embedded systems?

 My feeling is that an embedded system should predict its memory usage for
 each input size, and refuse to process input that is too large.

 For example, a stack overflow can never be handled; one is expected to
 calculate the maximum stack depth for any input and make sure it does not
 overflow.

 So I think it's still reasonable never to report allocation failure, and
 to expect the memory allocator to raise the relevant abort/panic/exception
 in such a case.

 But I'll be happy to hear other considerations I missed.

 Thanks,

 ___
 Linux-il mailing list
 Linux-il@cs.huji.ac.il
 http://mailman.cs.huji.ac.il/mailman/listinfo/linux-il





Re: Memory pool interface design

2015-05-16 Thread Orna Agmon Ben-Yehuda
Hi Elazar,

I find that malloc failure checking is vital (within the user program) even
on a regular system with gigabytes of memory. One example is a program that
gets into a runaway recursion which allocates memory at each level and then
digs deeper. In other cases, it is useful to check the return value of
malloc when the program input size is unlimited, and it is better to inform
the user about the too-large element of the input than to crash.

I do not understand, however, how memory overcommitment leads you to not
require malloc to fail. Maybe when you say overcommitment you do not mean
what I mean, but in our scenario[1] of memory overcommitment, where memory
balloon drivers are used, we configured the memory allocations to fail when
there was not enough physical memory, so that the guest application would
be able to tell when there is memory pressure. Otherwise, the burden of
handling memory pressure is laid purely on the operating system itself.

[1]  Ginseng: Market-Driven Memory Allocation
http://www.cs.technion.ac.il/~ladypine/vee18-agmon-ben-yehuda.pdf, Orna
Agmon Ben-Yehuda, Eyal Posener, Muli Ben-Yehuda, Assaf Schuster, Ahuva
Mu'alem. In proceedings of VEE 2014.


-- 
Orna Agmon Ben-Yehuda.

Re: Memory pool interface design

2015-05-16 Thread Elazar Leibovich
Thanks Orna,

My understanding is that on a stock Linux kernel, a process that allocates
too much memory is unlikely to receive NULL from malloc. The more likely
scenario is that the whole system starts swapping out pages and the OOM
killer, hopefully, kills the offending process. During that time it is
unlikely that malloc would return NULL.

Indeed, if you look at actual applications written for Linux, you'd see
that many use variants of xmalloc, which simply aborts if malloc returns
NULL. I think that the logic is similar to what I wrote before.
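
For readers unfamiliar with the pattern, here is a minimal sketch of an xmalloc-style wrapper. It is not taken from any particular project; make_counters is a hypothetical caller added only to show that no NULL check follows the call:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* xmalloc-style wrapper: never returns NULL; aborts on failure. */
static void *xmalloc(size_t size)
{
    void *p = malloc(size ? size : 1);  /* malloc(0) may legally return NULL */
    if (p == NULL) {
        fprintf(stderr, "xmalloc: out of memory (%zu bytes)\n", size);
        abort();
    }
    return p;
}

/* Hypothetical caller: no NULL check is needed after xmalloc. */
static int *make_counters(size_t n)
{
    /* guard the multiplication below against overflow */
    assert(n > 0 && n <= (size_t)-1 / sizeof(int));
    int *c = xmalloc(n * sizeof *c);
    for (size_t i = 0; i < n; i++)
        c[i] = 0;
    return c;
}
```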

I thought of giving a list of such real-world programs, but someone has
already written it up much better than I could:

http://eli.thegreenplace.net/2009/10/30/handling-out-of-memory-conditions-in-c

But feel free to correct me if I'm wrong.

PS,
I think we have a miscommunication here, since from what I understand the
linked paper is about memory allocation for a VM, while I'm talking about
memory allocation for a process.


Re: Memory pool interface design

2015-05-16 Thread Oleg Goldshmidt
Elazar Leibovich elaz...@gmail.com writes:

 My question is, should I support the case of malloc failure. On one
 hand, it complicates the API significantly, but on the other hand it
 might be useful for some use cases.

This sounds like "can you guys tell me what my requirements are?" ;-)

If I understand correctly, you want to provide an alternative (to the
standard malloc() and friends) mechanism for memory allocation,
targeting primarily embedded systems. If I am not completely wrong,
consider the following:

1. Is mechanism the operative word? If so, then you should leave
   *policies* - including exception handling - to the client. If you
   intend to restrict your library to a single OOM exception policy you
   should document the restriction. E.g., if your policy is going to be
   "segfault" or "commit a clean(ish) seppuku", you should tell
   potential users, using big bold red letters, "if this doesn't suit
   you, don't use the library". How much this will affect your library's
   usefulness/popularity I don't care to predict.

2. Naively, I cannot imagine *not* letting clients of a
   production-quality library decide what to do, if only to write
   something sensible to a log using the client's preferred format and
   destination. Some 20 years ago I saw popular (numerical) libraries
   whose authors (probably members of the academia) considered abort a
   legitimate way of handling failures. A scientist running a numerical
   application with the ultimate purpose of writing a paper certainly is
   justified in thinking that way. We, however, disqualified those
   libraries for any production use for that reason alone, regardless of
   their other qualities. IIRC we liked one of them enough to find *all*
   the places where it aborted and modify the code (FOSS rules, huh?).

3. There are enough examples of custom allocators. I am sure you can
   find an awful lot of code, say, overriding new/delete in C++. Even
   the standard libraries provide for overriding allocators. Find a few
   reputable examples, see how exceptions are handled, follow the
   pattern? I suspect in most cases it is left to the library clients
   (arguably easier with longjumping exceptions than with C-style error
   propagation, but the point is, library code does not decide,
   usually).

4. What *are* your requirements? If a git client (an example you cited)
   tries to malloc, gets NULL, tries to recover, and then gives up and
   dies writing something to stderr, that's one thing. An embedded
   device just crashing without telling anyone what's wrong? Maybe a
   different kettle of fish altogether. Do you target devices with
   limited - or somewhat limited - resources? May make a difference.

5. You mentioned swapping. That does not mean you are out of memory
   (malloc does not fail when you swap pages). But I am sure you know
   that.

6. The kernel's OOM killer mechanism is also not directly related to
   malloc() failing. It means that *some* process, not necessarily (or
   even likely) the process that is requesting memory at the moment,
   will be killed, according to some policy. No one can decide in
   advance whether killing *something* is a good decision in an
   unspecified embedded system.

7. Why do you say handling failures will complicate the API a lot? It is
   not clear from what you wrote. After all, malloc() is not more
   complex because it can return NULL, is it? So can your alloc() member
   - what's the problem?
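
To make point 7 concrete, here is a rough sketch of an alloc() member that reports exhaustion by returning NULL, and a library function that simply passes that NULL on. The struct mem_pool layout is the one from the original post; fixed_pool, fixed_alloc and pool_strdup are hypothetical names, and the 64-byte buffer is an arbitrary choice:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct mem_pool {
    void *(*alloc)(void *mem_pool, void *prev_ptr, int size);
};

/* Hypothetical fixed-size pool: hands out slices of an embedded buffer
 * and returns NULL once it is exhausted. It never frees or resizes. */
struct fixed_pool {
    struct mem_pool pool;      /* first member, so the pointers convert */
    unsigned char buf[64];
    size_t used;
};

static void *fixed_alloc(void *mem_pool, void *prev_ptr, int size)
{
    struct fixed_pool *fp = mem_pool;
    (void)prev_ptr;            /* growing in place is out of scope here */
    if (size <= 0)
        return NULL;
    size_t need = ((size_t)size + 7u) & ~(size_t)7u;  /* round up to 8 */
    if (need > sizeof fp->buf - fp->used)
        return NULL;           /* out of memory: report it, like malloc() */
    void *p = fp->buf + fp->used;
    fp->used += need;
    return p;
}

/* Hypothetical library function: a NULL result means "allocation
 * failed", and the caller decides what to do about it. */
static char *pool_strdup(struct mem_pool *pool, const char *s)
{
    assert(pool != NULL && pool->alloc != NULL);
    size_t len = strlen(s) + 1;
    char *copy = pool->alloc(pool, NULL, (int)len);
    if (copy == NULL)
        return NULL;
    memcpy(copy, s, len);
    return copy;
}
```

The allocator interface itself is no more complex than malloc(); the cost shows up in every library function that must forward the failure.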

-- 
Oleg Goldshmidt | p...@goldshmidt.org



Re: Memory pool interface design

2015-05-16 Thread Elazar Leibovich
I think that I didn't explain myself correctly.

Let me try again.

I'm writing a C library, and I want it to be useful not only in Linux
userland, but also in other contexts, such as embedded devices, and inside
the Linux kernel.

This library sometimes allocates memory.

If I just allocate memory with malloc, my library won't even compile
for embedded devices. Hence, I'll receive an allocator function from the
user, and use it to allocate memory.

A concrete example: a regular read_line function

char *read_line(struct reader *r) {
    char *rv = malloc(len);
    read_to(rv);
    return rv;
}

A more flexible read_line function:

char *read_line(struct reader *r, struct mem_pool *pool) {
    char *rv = pool->alloc(pool, len);
    read_to(rv);
    return rv;
}

Now I'm fine, because I can define pool->alloc to be kmalloc in the kernel
context, or use preallocated pools on an embedded device.

What should I do if memory allocation fails? I can return an error to the
user, but that makes the API more complicated, since many functions would
have to return an error just to cover memory allocation failure.

Our read_line example would now look like

struct error read_line(struct reader *r, char **line);

And users would have to check the error at each invocation.
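
A rough sketch of what that error-returning variant might look like. The post does not define struct error or struct reader, so both are hypothetical stand-ins here (an integer code, and a cursor over an in-memory string), and plain malloc is used to keep the example short:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; the post does not define these types. */
struct error { int code; };                 /* 0 means success */
struct reader { const char *data; size_t pos; };

static struct error read_line(struct reader *r, char **line)
{
    assert(r != NULL && line != NULL);
    const char *start = r->data + r->pos;
    size_t len = strcspn(start, "\n");      /* length up to newline or end */
    char *rv = malloc(len + 1);
    if (rv == NULL) {
        *line = NULL;
        return (struct error){ ENOMEM };    /* report, do not abort */
    }
    memcpy(rv, start, len);
    rv[len] = '\0';
    r->pos += len + (start[len] == '\n');   /* skip the newline if present */
    *line = rv;
    return (struct error){ 0 };
}
```

Each call site then needs a check along the lines of `if (read_line(&r, &line).code != 0) { ... }`, which is the per-invocation burden described above.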

I think I can avoid that. What do I intend to do?

My library would assume each memory allocation is successful, and the
client that provides the memory allocator would be responsible for failure
handling.

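For example, in a regular Linux userspace program, the mem_pool would be
something like

#include <stdlib.h>
#include <syslog.h>

void *linux_userspace_mem_pool(struct mem_pool *pool, int size) {
   void *rv = malloc(size);
   if (rv == NULL) {
      /* log the failure, then terminate with a failing status */
      syslog(LOG_ERR, "out of memory");
      exit(EXIT_FAILURE);
   }
   return rv;
}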

An embedded client could throw an exception, or longjmp to the main loop,
or reset the system.
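
The longjmp option can be sketched roughly as follows, assuming a single-threaded main loop and a small preallocated arena. All names here (oom_recovery, arena, arena_alloc_or_jump, run_task, run_task_guarded) are hypothetical, and the 32-byte arena is an arbitrary size:

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

static jmp_buf oom_recovery;          /* set by the main loop before each task */
static unsigned char arena[32];       /* hypothetical preallocated pool */
static size_t arena_used;

/* Never returns NULL: on exhaustion it jumps back to the recovery point. */
static void *arena_alloc_or_jump(size_t size)
{
    if (size > sizeof arena - arena_used)
        longjmp(oom_recovery, 1);
    void *p = arena + arena_used;
    arena_used += size;
    return p;
}

/* Hypothetical task that allocates without checking for failure. */
static void run_task(size_t bytes)
{
    unsigned char *p = arena_alloc_or_jump(bytes);
    assert(p != NULL);                /* holds by construction */
    p[0] = 1;
}

/* Returns 0 if the task completed, 1 if it was abandoned on OOM. */
static int run_task_guarded(size_t bytes)
{
    arena_used = 0;                   /* each task starts with an empty arena */
    if (setjmp(oom_recovery) != 0)
        return 1;                     /* arrived here via longjmp */
    run_task(bytes);
    return 0;
}
```

One caveat with this approach: longjmp skips any cleanup in the frames it unwinds, so library code interrupted this way must not hold locks or other resources outside the pool.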

Now my question is whether that is a reasonable behavior that would suit
embedded devices. I do not have enough experience to know. Indeed, since I'm
writing a library that I hope will serve as broad an audience as possible,
it is hard to know the requirements in advance.

Hence, I think 1-4 are already addressed: I always give the user control
over what happens when memory runs out.

Regarding 5-6: what I'm saying is that seeing malloc return NULL in
production is very rare. I have personally never seen it. I think that the
OOM killer would wreak havoc on the system before that happened; hence,
crashing when malloc returns NULL is a reasonable behavior.

Regarding 7: this does not complicate the allocation API, it complicates my
API, since I'll have functions that, generally speaking, cannot fail but do
allocate memory. Those would have to return an error, and the user would
have to check it.

Thanks,



Re: Memory pool interface design

2015-05-16 Thread guy keren


as a rule - if it's a general-purpose library, and it can fail - it
must return an error using the language's natural error mechanism.

in C - this comes as a return status.

i *have* seen malloc returning NULL in some situations.

the application that uses your library may decide to simply terminate 
the service provided by the specific thread, or it can decide to forgo 
freeing memory it is holding in different parts of the code, or.


what you are trying to do is behave in a non-C-like manner. this is 
counter-productive.


on the other hand - this is your library - do whatever you want, and 
we'll see what the users decide to do with it.


--guy
