Re: Memory pool interface design
The question of whether to use a global malloc function or a function pointer is orthogonal to my question. My question is: should I support the case of malloc failure? On one hand, it complicates the API significantly; on the other hand, it might be useful for some use cases.

It's pretty obvious to me that in a modern Linux userspace program, supporting malloc failure is not worth the trouble. But are there other use cases where it's vital?

Another clarification: my code would never call abort. What I was saying is that the malloc could simply abort the current task if it does not have memory.

As a side note, in my experience it is sometimes useful to use preallocated memory pools[0]. Letting the user choose the memory allocator is also useful when using the library in the kernel, since otherwise the library simply won't compile. See for example protobuf-c, which receives an allocator in its functions:
https://github.com/protobuf-c/protobuf-c/blob/master/protobuf-c/protobuf-c.c#L2019

[0] http://eli.thegreenplace.net/2008/10/17/memmgr-a-fixed-pool-memory-allocator

On Fri, May 15, 2015 at 9:00 PM, Baruch Even bar...@ev-en.org wrote:

I would question the need to abstract away the memory allocations of your library compared to everything else. If someone cares enough about it he can replace malloc and free completely to use a different allocation scheme.

In most cases where I've cared about memory allocations, I wanted none of them at all: only intrusive data structures, and running the system with a fixed memory allocation from start to end. It's not always possible in a generic library, though.

If you are writing a library you should never abort inside it; that would be very annoying to the user. Give him a NULL and let him crash or handle it as he sees fit.

Baruch

On Fri, May 15, 2015 at 5:47 PM, Elazar Leibovich elaz...@gmail.com wrote:

I'm writing a small C library that I want to open source. I want it to be usable in embedded environments, where memory allocation must be controlled. Hence, I abstracted away calls to malloc/realloc and replaced them with

    struct mem_pool {
        void *(*alloc)(void *mem_pool, void *prev_ptr, int size);
    };

The user would implement

    struct my_mem_pool {
        struct mem_pool pool;
        ...
    };
    struct my_mem_pool pool = { { my_alloc_func }, ... };

I have two design questions I'm interested in:

1) Should I support both malloc and realloc? I think the performance benefits of a dedicated malloc over realloc(NULL, size) are negligible, and not worth complicating the interface.

2) Should the memory pool be allowed to fail? In a typical Linux system, where memory overcommit is allowed, checking the malloc return value provides little benefit. But is it the same for embedded systems? My feeling is that an embedded system should predict the memory usage for each input size, and avoid processing input which is too large. For example, a stack overflow error can never be handled, and one is expected to calculate the deepest stack usage for any input and make sure it won't overflow.

So I think it's still reasonable never to report allocation failure, and to expect the memory allocator to raise the relevant abort/panic/exception in such a case. But I'll be happy to hear other considerations I missed.

Thanks,

___
Linux-il mailing list
Linux-il@cs.huji.ac.il
http://mailman.cs.huji.ac.il/mailman/listinfo/linux-il
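A minimal sketch of how the single-function interface from the original post could be backed by the standard realloc, so that passing NULL as prev_ptr plays the role of malloc. The calls counter and the body of my_alloc_func are illustrative assumptions, not code from the thread.

```c
#include <stddef.h>
#include <stdlib.h>

/* Interface from the original post: one realloc-style entry point. */
struct mem_pool {
    void *(*alloc)(void *mem_pool, void *prev_ptr, int size);
};

/* Hypothetical user pool: embeds mem_pool as its first member and
 * counts calls, standing in for whatever state a real pool keeps. */
struct my_mem_pool {
    struct mem_pool pool;
    int calls;
};

/* prev_ptr == NULL behaves like malloc, otherwise like realloc.
 * Returns NULL on failure, exactly as realloc does. */
static void *my_alloc_func(void *mem_pool, void *prev_ptr, int size)
{
    struct my_mem_pool *self = mem_pool;

    if (size <= 0)
        return NULL;    /* avoid realloc's implementation-defined size 0 */
    self->calls++;
    return realloc(prev_ptr, (size_t)size);
}
```

With this shape, question 1 costs nothing in the interface: a library that only ever needs malloc simply always passes NULL.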
Re: Memory pool interface design
Hi Elazar,

I find that malloc failure checking is vital (within the user program) even on a regular system with gigabytes of memory, for example when the program gets into a recursive loop which allocates memory and then digs deeper. In other cases, it is useful to check the return value of malloc when the program input size is unlimited, and it is better to inform the user about the too-large element of the input than to crash.

I do not understand, however, how memory overcommitment leads you to not require malloc to fail. Maybe when you say overcommitment you do not mean what I mean, but in our scenario[1] of memory overcommitment, where memory balloon drivers are used, we configured the memory allocations to fail when there was not enough physical memory, so that the guest application would be able to tell when there is memory pressure. Otherwise, the burden of handling memory pressure is laid purely on the operating system itself.

[1] Ginseng: Market-Driven Memory Allocation http://www.cs.technion.ac.il/~ladypine/vee18-agmon-ben-yehuda.pdf, Orna Agmon Ben-Yehuda, Eyal Posener, Muli Ben-Yehuda, Assaf Schuster, Ahuva Mu'alem. In proceedings of VEE 2014.

On Sat, May 16, 2015 at 9:14 PM, Elazar Leibovich elaz...@gmail.com wrote:

[...]

--
Orna Agmon Ben-Yehuda.
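A minimal sketch of the second case described above: an input element of unbounded size is copied only if the allocation succeeds, so the caller can report the oversized element instead of crashing. copy_element is a hypothetical helper introduced here for illustration.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Copies len bytes of src into a fresh NUL-terminated buffer.
 * Returns 0 and sets *out on success; returns -1 and sets *out to
 * NULL if the element is too large to buffer. */
int copy_element(const char *src, size_t len, char **out)
{
    char *buf;

    *out = NULL;
    if (len == SIZE_MAX)        /* len + 1 would overflow */
        return -1;
    buf = malloc(len + 1);
    if (buf == NULL)            /* report, rather than crash */
        return -1;
    memcpy(buf, src, len);
    buf[len] = '\0';
    *out = buf;
    return 0;
}
```

A caller that gets -1 can then print something like "element too large" with the offending length and move on to the next input.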
Re: Memory pool interface design
Thanks Orna,

My understanding is that in a stock Linux kernel, a process that allocates too much memory is unlikely to receive NULL from malloc. The more likely scenario is that the whole system would swap out pages, and the OOM killer would, hopefully, kill the offending process. During that time it is unlikely that malloc would return NULL.

Indeed, if you look at actual applications written for Linux, you'd see that many use variants of xmalloc, which simply aborts if malloc returns NULL. I think that the logic is similar to what I wrote before.

I thought of giving a list of such real-world programs, but someone has already written it up much better than I could:
http://eli.thegreenplace.net/2009/10/30/handling-out-of-memory-conditions-in-c

But feel free to correct me if I'm wrong.

PS, I think we have a miscommunication here, since from what I understand the linked paper is about memory allocation for a VM, while I'm talking about memory allocation for a process.

On Sat, May 16, 2015 at 10:10 PM, Orna Agmon Ben-Yehuda ladyp...@gmail.com wrote:

[...]
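For readers who have not met the xmalloc pattern mentioned above, a minimal sketch follows. It is a generic illustration of the idea, not the implementation used by any particular program; the message text and the choice of abort() over exit() are assumptions.

```c
#include <stdio.h>
#include <stdlib.h>

/* malloc wrapper that never returns NULL: on failure it prints a
 * message and aborts, so callers do not check the result. */
void *xmalloc(size_t size)
{
    /* Ask for at least one byte so a NULL result always means failure. */
    void *p = malloc(size ? size : 1);

    if (p == NULL) {
        fputs("fatal: out of memory\n", stderr);
        abort();
    }
    return p;
}
```

The policy (abort) is hard-wired here, which is acceptable inside an application but is exactly what the thread argues a library should not impose.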
Re: Memory pool interface design
Elazar Leibovich elaz...@gmail.com writes:

My question is, should I support the case of malloc failure. On one hand, it complicates the API significantly, but on the other hand it might be useful for some use cases.

This sounds like, "can you guys tell me what my requirements are?" ;-)

If I understand correctly, you want to provide an alternative (to the standard malloc() and friends) mechanism for memory allocation, targeting primarily embedded systems. If I am not completely wrong, consider the following:

1. Is "mechanism" the operative word? If so, then you should leave *policies* - including exception handling - to the client. If you intend to restrict your library to a single OOM exception policy you should document the restriction. E.g., if your policy is going to be "segfault" or "commit a clean(ish) seppuku", you should tell potential users, using big bold red letters, "if this doesn't suit you don't use the library". How much this will affect your library's usefulness/popularity I don't care to predict.

2. Naively, I cannot imagine *not* letting clients of a production-quality library decide what to do, if only to write something sensible to a log using the client's preferred format and destination. Some 20 years ago I saw popular (numerical) libraries whose authors (probably members of academia) considered abort a legitimate way of handling failures. A scientist running a numerical application with the ultimate purpose of writing a paper certainly is justified in thinking that way. We, however, disqualified those libraries for any production use for that reason alone, regardless of their other qualities. IIRC we liked one of them enough to find *all* the places where it aborted and modify the code (FOSS rules, huh?).

3. There are enough examples of custom allocators. I am sure you can find an awful lot of code, say, overriding new/delete in C++. Even the standard libraries provide for overriding allocators. Find a few reputable examples, see how exceptions are handled, follow the pattern? I suspect in most cases it is left to the library clients (arguably easier with longjumping exceptions than with C-style error propagation, but the point is, library code does not decide, usually).

4. What *are* your requirements? If a git client (an example you cited) tries to malloc, gets NULL, tries to recover, and then gives up and dies writing something to stderr, that's one thing. An embedded device just crashing without telling anyone what's wrong? Maybe a different kettle of fish altogether. Do you target devices with limited - or somewhat limited - resources? May make a difference.

5. You mentioned swapping. That does not mean you are out of memory (malloc does not fail when you swap pages). But I am sure you know that.

6. The kernel's OOM killer mechanism is also not directly related to malloc() failing. It means that *some* process, not necessarily (or even likely) the process that is requesting memory at the moment, will be killed, according to some policy. No one can decide in advance whether killing *something* is a good decision in an unspecified embedded system.

7. Why do you say handling failures will complicate the API a lot? It is not clear from what you wrote. After all, malloc() is not more complex because it can return NULL, is it? So can your alloc() member - what's the problem?

--
Oleg Goldshmidt | p...@goldshmidt.org
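To make points 1 and 7 concrete, here is a minimal sketch of an allocator that is pure mechanism: a fixed preallocated pool whose alloc member returns NULL when exhausted and leaves the policy to the client. The struct fixed_pool layout, the two-argument alloc signature, and the 16-byte rounding are assumptions made for this illustration.

```c
#include <stddef.h>

#define POOL_ALIGN ((size_t)16)   /* assumed alignment granule */

struct mem_pool {
    void *(*alloc)(struct mem_pool *pool, int size);
};

/* Hypothetical bump allocator over a caller-supplied buffer.
 * The caller is responsible for the buffer's own alignment. */
struct fixed_pool {
    struct mem_pool pool;   /* must be the first member */
    unsigned char *buf;
    size_t cap;
    size_t used;
};

/* Returns NULL when the pool is exhausted; no logging, no abort.
 * What to do about it is the client's decision. */
static void *fixed_alloc(struct mem_pool *pool, int size)
{
    struct fixed_pool *fp = (struct fixed_pool *)pool;
    size_t n;
    void *p;

    if (size <= 0)
        return NULL;
    /* Round the request up to a multiple of POOL_ALIGN. */
    n = ((size_t)size + POOL_ALIGN - 1) / POOL_ALIGN * POOL_ALIGN;
    if (n > fp->cap - fp->used)
        return NULL;
    p = fp->buf + fp->used;
    fp->used += n;
    return p;
}
```

Nothing in the interface got more complex by allowing NULL; the complexity question is only about what the library's own functions do with it.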
Re: Memory pool interface design
I think that I didn't explain myself correctly. Let me try again.

I'm writing a C library, and I want it to be useful not only in Linux userland, but also in other contexts, such as embedded devices and inside the Linux kernel. This library sometimes allocates memory. If I just allocate memory with malloc, my library wouldn't even compile on embedded devices. Hence, I'll receive an allocator function from the user, and use it to allocate memory.

A concrete example, a regular read_line function:

    char *read_line(struct reader *r) {
        char *rv = malloc(len);
        read_to(rv);
        return rv;
    }

A more flexible read_line function:

    char *read_line(struct reader *r, struct mem_pool *pool) {
        char *rv = pool->alloc(pool, len);
        read_to(rv);
        return rv;
    }

Now I'm fine, because I can define pool->alloc to be kmalloc in the kernel context, or use preallocated pools on an embedded device.

What should I do if memory allocation fails? I can return an error to the user, but it makes the API more complicated, since many functions would have to return an error just in case of memory allocation failure. Our read_line example would now look like

    struct error read_line(struct reader *r, char **line);

and users would have to check the error at each invocation. I think I can avoid that.

What do I intend to do? My library would assume each memory allocation is successful, and the client that provides the memory allocator would be responsible for failure handling. For example, in a regular Linux userspace program, the mem_pool would be something like

    void *linux_userspace_mem_pool(struct mem_pool *pool, int size) {
        void *rv = malloc(size);
        if (rv == NULL) {
            /* policy chosen by the client: log and terminate */
            syslog(LOG_ERR, "out of memory");
            exit(EXIT_FAILURE);
        }
        return rv;
    }

An embedded client could throw an exception, or longjmp to the main loop, or reset the system.

Now my question is: is that a reasonable behavior that would suit embedded devices? I do not have enough experience to know that.

Indeed, since I'm writing a library that I hope would serve as broad an audience as possible, it is hard to know the requirements in advance.

Hence, I think 1-4 are already addressed: I always give the user control over what happens when he's out of memory.

Regarding 5-6: what I'm saying is that seeing malloc return NULL in production is very rare. I have personally never seen it. I think that the OOM killer would wreak havoc on the system before that happens; hence, crashing when malloc returns NULL is a reasonable behavior.

Regarding 7: this does not complicate the allocation API, it complicates my API, since I'll have functions that cannot fail, generally speaking, but that allocate memory. Those would have to return an error, and the user would have to check the error.

Thanks,

On Sat, May 16, 2015 at 11:18 PM, Oleg Goldshmidt p...@goldshmidt.org wrote:

[...]
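A minimal sketch of the "longjmp to the main loop" policy mentioned in this message, where the client's allocator never returns NULL to the library. The names jmp_pool, jmp_alloc, make_buffer and run_task, and the byte budget used to make failure reproducible, are hypothetical and not from the thread.

```c
#include <setjmp.h>
#include <stddef.h>
#include <stdlib.h>

struct mem_pool {
    void *(*alloc)(struct mem_pool *pool, int size);
};

/* Hypothetical client pool: on failure it jumps back to on_oom. */
struct jmp_pool {
    struct mem_pool pool;   /* must be the first member */
    jmp_buf on_oom;
    int budget;             /* bytes the task may still allocate */
};

static void *jmp_alloc(struct mem_pool *pool, int size)
{
    struct jmp_pool *jp = (struct jmp_pool *)pool;
    void *p;

    if (size <= 0 || size > jp->budget)
        longjmp(jp->on_oom, 1);         /* never returns to the library */
    p = malloc((size_t)size);
    if (p == NULL)
        longjmp(jp->on_oom, 1);
    jp->budget -= size;
    return p;
}

/* Library-style function: assumes allocation succeeds, returns no error. */
static char *make_buffer(struct mem_pool *pool, int len)
{
    char *rv = pool->alloc(pool, len);

    rv[0] = '\0';
    return rv;
}

/* Client "main loop" step: 0 if the task ran, -1 if it was abandoned. */
static int run_task(struct jmp_pool *jp, int len)
{
    char *buf;

    if (setjmp(jp->on_oom) != 0)
        return -1;
    buf = make_buffer(&jp->pool, len);
    free(buf);
    return 0;
}
```

One caveat this sketch makes visible: anything the library allocated before the jump is leaked unless the pool itself tracks and releases it, so this policy fits best with pools that can be reset wholesale.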
Re: Memory pool interface design
as a rule - if it's a general-purpose library, and it can fail - it must return an error using the language's natural error mechanism. in C - this comes as a return status.

i *have* seen malloc returning NULL in some situations. the application that uses your library may decide to simply terminate the service provided by the specific thread, or it can decide to forgo freeing memory it is holding in different parts of the code, or ...

what you are trying to do is behave in a non-C-like manner. this is counter-productive.

on the other hand - this is your library - do whatever you want, and we'll see what the users decide to do with it.

--guy

On 05/17/2015 12:04 AM, Elazar Leibovich wrote:

[...]
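A minimal sketch of the return-status style argued for here, applied to the read_line example from the thread. The in-memory struct reader, the int status in place of struct error, and the helper allocators are assumptions for illustration; ENOMEM comes from errno.h as provided by POSIX systems.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

struct mem_pool {
    void *(*alloc)(struct mem_pool *pool, int size);
};

/* Hypothetical reader over an in-memory, NUL-terminated buffer. */
struct reader {
    const char *data;
    size_t pos;
};

/* Returns 0 and sets *line on success; returns -ENOMEM and sets *line
 * to NULL if the pool cannot supply memory. The reader position only
 * advances on success, so the caller may retry. */
int read_line(struct reader *r, struct mem_pool *pool, char **line)
{
    const char *start = r->data + r->pos;
    size_t len = strcspn(start, "\n");
    char *rv;

    *line = NULL;
    if (len >= (size_t)INT_MAX)         /* size argument is an int */
        return -ENOMEM;
    rv = pool->alloc(pool, (int)len + 1);
    if (rv == NULL)
        return -ENOMEM;
    memcpy(rv, start, len);
    rv[len] = '\0';
    r->pos += len + (start[len] == '\n');   /* skip the newline if present */
    *line = rv;
    return 0;
}

/* Two example pools: one backed by malloc, one that always fails. */
static void *std_alloc(struct mem_pool *pool, int size)
{
    (void)pool;
    return malloc((size_t)size);
}

static void *failing_alloc(struct mem_pool *pool, int size)
{
    (void)pool;
    (void)size;
    return NULL;
}
```

The cost is the one Elazar names: every caller checks a status. The benefit is that the application, not the library or the allocator, decides whether to drop one request, free caches and retry, or terminate.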