Date: Thu, 12 Sep 2019 15:12:25 +0200 From: Kamil Rytarowski <n...@gmx.com> Message-ID: <2a6e1fb2-cedc-4a57-750b-45f101be9...@gmx.com>
| This proposal is practically equivalent of disabling leak detection at | all and removes the whole purpose. No it isn't. Or rather, it might be, the way you describe LSan to work, but if that's what it is doing, then it is close to useless: | Leak detection works simply by scanning memory (TLS, stacks, heap) for | pointers to allocations. If there is no longer a reference, than there | is a leak. So the following code struct list { struct list *prev; struct list *next; int age; /* other stuff not relevant here */ }; struct list * new_list(int age1, int age2) { strust list *l1, *l2; l1 = malloc(sizeof *l1); /* check for NULL elided here */ l2 = malloc(sizeof *l2); /* ditto */ l1->prev = NULL; l1->next = l2; l1->age = age1; l2->prev = l1; l2->next = NULL; l2->age = age2; if (age1 > age2) return NULL; return l1; } has no leak (in the case age1 > age2) ? Scanning memory would see a pointer to *l1 (in *l2), and a pointer to *l2 (in *l1). Both blocks of memory are referenced according to that trivial analysis. Useless. The "close to" just above came from there being cases that it can find. Because of this, I also looked at the change you pade to "fix" ps. You added free(pinfo); But *pinfo (a struct pinfo) contains 3 pointers, each of which may refer to other malloc'd blocks. Further, pinfo points to an array of those structs. Because of the memory scan, those pointers (in each element of the array) will still have been seen, but when you free(pinfo) then those pointers can no longer be accessed (depending upon what the free() in use while the sanatiser is running does, they may not still exist in memory, or they might, just in a block that is now free). Anyway the struct kinfo_proc2 struct kinfo_lwp and char * (prefix) entries in all the struct pinfo's are not being freed. There was no leak before, but your "fix" introduced one (or three * N). I'd suggest reverting the change you made, and instead make pinfo a global instead of a local in main - then its value will be seen still existing by the memory scan (LSan will be happy) and ps won't be wasting (a trivial amount of) time doing an incorrect (as it is now) free(), and we'll all be happier. Incidentally, what happens with code like: struct foo { int a, b, c; }; int * make_foo(int a, int b, int c) { struct foo *p; p = malloc(sizeof *p); /* again NULL check elided here */ p->a = a; p->b = b; p->c = c; return (&p->b); } int * global_ptr; main() { global_ptr = make_foo(1,2,3); /* ... */ { struct foo *p; p = (struct foo *)((intptr_t)global_ptr - offsetof(struct foo, b)); p->a++; } /* ... */ } Ugly (and stupid) but not illegal I think. The more common case is a list where the back pointers point at the forward pointer, rather than at the struct itself, so one can do if ((*(p->back) = p->next) != NULL) p->next->back = p->back free(p); to delete an element from the middle of a list easily - which works even if the element to be deleted is first or last. There if the next field is the first in the struct, the back field will refer to the malloc's memory block (by accident) but if it isn't, it won't. | I have not investigated where are the sh(1) leaks, As far as I know it doesn't. What I said is that it does not free memory it allocates before it exits. That is not a leak. | In sanitizers we cannot overload or do anything special with main() | (such as replace it with our own copy). That wasn't the suggestion. However, what was assumed a more rational leak detection algorithm than appears to be in use - I agree, by simply scanning memory you cannot do anything special with main() as when the code runs, apparently, main's stack frame has already gone. But were this done a different way -- the right way is to annotate, in the code, all assignments to pointers , and add a ref counter to all allocated memory blocks - when a pointer is assigned, and it points to something allocated, then incr the ref count. Whan the value of a pointer is overwritten, and it used to point to allocated memory, decr the ref count. If that decrease drops it to 0, then memory has been leaked. In the exit code for every function (except main, which can be recognised, we know its name is "main", in this scenario) put code that checks every local var that is a pointer, for each which points at allocated memory, decrease the ref count. If it reaches 0, that's a leak. This will still give false errors, eg in main() { init(); read_config(); return do_the_work(); } assuming that none of init() read_config() or do_the_work() are called anywhere else, then memory they allocate (and particularly do_the_work()) which is not freed is not a leak. This method still has problems with doubly linked lists, handling those properly needs code to recognise what is going on, and when analysing a list pointer drref, instead of checking against 0, scan the list and find out how many internal pointers ref the list elements, and ensure that (at least) one element on the list (ahead of the one being deleted, for singly linked lists) has a higher ref count than can be attributed to pointers inside the list itself. That's a lot of work for big lists (and trees, etc) especially ones that are constantly being updated. | main() is the same entity as any regular fun(). Unless we decorate it | with some attributes in .c code, it is not distinguishable. Decoration is not needed, its name is its decoration. | Globals are treated specially in sanitizers as sanitizers generate | constructors and destructors to keep the track of them. Sorry, that (by itself) means nothing to me, What do those constructors and destructors do? What we have here is a global pointer, which points to allocated memory, and which is still valid when the program exits (the memory is still allocated, and the global var still points to it). | I have benchmarked in the past that we execute for true(1) or similar | applications at least 1 million instructions on CPU. | | | $ ./singlestepper true | Total count: 1639915 True is a chell script. You're measuring the startup (including dynamic linking) of a fairly large program. Try main() { _exit(0); } and cc -static and then measure that, and see how many instructions are executed (if we wanted an executable "true" - that is, if that were ever actually (significantly) used outside shell scripts, where it is a builtin, that is how we would write it.) [Aside: for some reason, probably related to crt0 or something, the result of this is ridiculously large, and seems to have a good fraction of libc linked into it. That could be fixed...] | $ ./singlestepper echo | Total count: 1039124 Again, that will almost all be the dynamic linker. | Calling one 1 dummy function in libc in comparison to that is negligible. Sure, if one function call is all it takes - but that would have to be "turn off the memory scan" - ie: do no leak detection at all. That isn't what is desired, that would truly make the whole thing useless. leak detection that finds bugs like the one I invented earlier in this message is a good thing to have - it is very easy to forget to free everything that has been allocated in a complicated function that is bailing out in an error case. Finding those bugs is definitely worthwhile. But insisting that all memory that is ever allocated is explicitly freed, just so the sanitiser doesn't claim it has "leaked" is absurd. If it is possible to write code to run just before exit() which frees memory, that memory, by definition, has not been leaked - only memory that we no longer have a reference to, and so cannot possibly free() can be rationally considered leaked. | We are talking here about simple programs that are expected to get some | measurable optimization by skipping free(3) calls. Such as sh(1) or ps(1). I don't know the insides of ps, I don't know that there is much that needs to be freed which isn't already. None of the things that the pinfo points to contain pointers themselves, so all that would be needed is 3*N + 1 free() calls (assuming this is all there is). Certainly I did not care about the one free(psinfo) call that was added - that (if it were not creating a leak) would be irrelevant to anyone. It was the general principle of requiring memory to be freed - actually writing code to do that, and then immediately calling exit() which frees everything, always, that I was objecting to. The only reason for ever doing that extra work is to make it easier for a half baked sanitiser implementation to not report non-problems as errors. | In all other cases I consider this style of programming as buggy. You can consider it however you like, but it isn't. The interface specs for the OS guarantee that when a process exits, all of its resources (memory, fds, ...) are released. Programs (and programmers) are entitled to rely upon this, and allow it to work for them. All kinds of programs build all kinds of data structs as they work, and then they're done, they exit() - tearing it all down for no reason is lunacy. | If there are thousands of leaked memory objects than it is not just | simple optimization, but really wrong style of programming. Not wrong, just not the style you (apparently) like. Incidentally, do you close(0) close(1) close(2) in all your programs before you exit()? (fclose(stdin) etc ae as good when stdio is being used). If not, your program, according to you, is leaking file descriptors. The kernel closes them for you during the exit processing of course, but in your apparent philosophy that's not good enough... | One example | of that is ATF, it randomly picks one set of pointers to free before | exit, and skips others leaking the memory. It is bug. If it is actually doing just what you say, then no, it is not a bug. If it is losing memory while running, so as it runs more and more tests (or subtests) it keeps getting bigger and bigger as (some of) the memory for a previous (sub-)test hasn't been freed, but is no longer needed, or accessible - then that would be a bug, and should be fixed. But if it is just doing what you say, and not bothering to free memory that it has been using (for example) for accumulating stats on the number of tests run, how many failed, the names of those that failed, ... and at the end it prints all that and exits without freeing that memory, that is 100% perfectly OK. | As I have stated originally, we could remove free(3) from almost every | basesystem program and not many would notice. I know what you're saying, but people would notice, as ... | There is even a C compiler | that performs only allocations and never frees the memory. It is still a | correct UNIX application, regardless of taking like 2GB for building a | hello world application. people observe that kind of thing, programs that are huge when they should not be. Certainly with the huge (comparatively) VM space we have available today it is easy to let many of these kinds of things slide as not being all that important. But that isn't the case we're talking about - rather memory that *cannot* be freed until the program exits (the data will still be needed as long as the program remains running) but which is clearly useless after the program has finished. exit() (inside the kernel, not the libc function) frees all of that for us. Using it is not incorrect, not even slightly. kre ps: an alternative shcnge for ps.c would be to alter return eval; at the end of main() to be fflush(stdout); _exit(eval); ps doesn't need atexit handlers (other than to flush its stdout, in case it has been buffered), so we do the flush manually, and then _exit sends back the exit code. No atexit() handlers ever get run, and LSan won't get a chance to object to the pinfo value not being freed. I might start altering all programs I work on to use that technique. (Not really!)