Hi Guy, I liked the tutorial a lot.
Here are my comments on it. The comments range from mere typos (most of
them) to (IMO) some more advanced ones. Unfortunately they are all mixed
together in this large message.

I thought quite a bit about whether to send the comments in private or
to the lists. However, I think that sending to the lists is better,
because other people might be interested in them and might also correct
me if I am wrong. For those of you who are not interested in the
comments, the delete command is always there.

Once again, I want to congratulate you for the great job!

Emil

Subsection: Machine Architectures - Memory, Cache, Registers
2nd paragraph:
    Usually, the cache is hidden from our our programs
                                          ^^^----------repeated

Subsection: Machine Architectures - Memory, Cache, Registers
A suggestion: maybe it would be a good idea to mention that many
machines require at least one of the operands to reside in a register.
More generally, I think it should be stressed that, unlike the cache,
registers are not transparent at the assembly language level. (OUCH, I
already introduced `assembly language' without defining what that is.
I'm sure you will do a better job ;-))

Subsection: Virtual Memory.
"Virtual memory works by the operating system...". Though the idea is
clear, I think the wording is a little bit weird and the sentence
should be rephrased somehow.

Subsection: Memory Protection.
In addition to preventing one program from corrupting the memory of
other programs, the memory protection mechanism makes sure that no
program can corrupt the operating system itself.

Subsection: Run-time Management Of Virtual Memory.
"the CPU identifies a page fault". Replacing `identifies' with
`generates' is better IMO (even though the idea might not be the same).

Subsection: SIGSEGV This! SIGBUS That!
The message displayed during a segmentation violation depends on the
operating system and even the shell.
I remember that some version of sh (on Solaris) would display `Memory
error'. Hmmm... ;-)

IIRC, FreeBSD generates SIGBUS on protection errors. Also, most Linux
kernels do not turn on the Alignment Check flag on ix86, meaning that
pointer misalignment will not generate bus errors there. In short, it's
worth mentioning that various flavors of UNIX behave differently under
these conditions.

NULL on *all* machines is defined as the constant 0 (possibly cast to
void *). It is the compiler's job to translate the constant 0 into the
appropriate bit pattern for the address corresponding to the null
pointer. Indeed, on most machines that bit pattern is also all-bits
zero, but not on all. Why should there be non-all-zero bit patterns for
null pointers? Because on those machines the non-zero bits were supposed
to trigger some access exception. But due to the sheer number of bugs,
most hardware manufacturers had to give in and use all-zero bit patterns
for null pointers. The C language FAQ gives an excellent explanation of
all these issues.

Subsection: Load On Demand.
Another solution to the problem of overwriting the executable while it
is running is locking the executable file against writing. This is the
solution adopted by Windows NT. I think that there are also some UNIX
versions which use this strategy.

Subsection: Read-Only Memory.
As I said before, IIRC, FreeBSD sends SIGBUS for protection errors. In
addition to mmap(), you could also direct the readers to the man page
of mprotect().

Section: Memory managers (introductory paragraph).
    Therefor,..
    ^^^^^^^^----typo, repl. with Therefore

Subsubsection: Allocating A Chunk, 2nd paragraph:
    Thus, Assuming
          ^------------lower case

Subsubsection: Freeing A Chunk
4th paragraph:
    is adjucent
       ^----------typo, repl. with adjacent
last paragraph:
    in actuall
       ^---------typo, repl. with actual

Subsection: Interaction With Unix's Memory Management
end of 1st paragraph:
    this ammount
         ^--------typo, repl.
                  with amount (also at the end of the phrase)

Note that the memory used for storing data whose size is not known in
advance is the dynamically allocated memory. The size of the statically
allocated variables is known in advance, and the appropriate memory is
allocated by the OS during process startup.

3rd and last paragraphs: amount mistyped as ammount.

Subsection: Specialized Memory Managers
    general memry managers are
            ^---------typo, repl. with memory

Subsection: Memory Manager's Interaction With Program's Code
End of 1st paragraph:
    memery manager's functions
    ^------typo, repl. with memory

Section: C Runtime Memory Management
introductory paragraph:
    being a language that works on a low level
                                ^----repl. with `at'

Subsection: The Stack And Local Variables
End of 1st paragraph:
    fucntion 'main'
    ^-------------typo, repl. with function

3rd paragraph. I think you should stress that the local variables are
not *automatically* initialized. Obviously they are if you make an
explicit initialization. You could also contrast this with global (and
static) variables, which are all initialized to zeros (and NULLs).

Last sentence in the final note:
    Ofcourse
      ^-------------missing space.

Subsection: The Essence Of Pointers
    So what realy is a pointer?
            ^----------------------typo, repl. with really

In the figures, the representation of the address is slightly weird. I
think you'd better use big endian, 00001000, or little endian,
00100000. Probably little endian is better, because you talk about ix86
machines. But big endian is probably easier to follow. The choice is
yours. However, I agree that nothing prevents a machine from storing
the addresses in your `middle endian' format.

Another suggestion: you can make the bugs a little more subtle (or
closer to `real life' programming) by using:

    char data[8] = "abcdefgh";
    for (i=0; i<=8; i++) {
        ....

Or even

    for (i=1; i<=8; i++) {

The effect is pretty much the same.
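A minimal sketch of what the two suggested loops do to an 8-element array. It only counts the indices each variant would visit and never performs the out-of-bounds access itself. The names `struct loop_report` and `analyse_inclusive_loop` are hypothetical, introduced here purely for illustration; they do not come from the tutorial.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper type: the effect of a loop of the form
 *     for (i = first; i <= last; i++) data[i] ...
 * on an array of `len` elements. */
struct loop_report {
    size_t skipped;  /* valid leading elements the loop never visits */
    size_t overrun;  /* visited indices that lie past the end of the array */
};

/* Walks the same index range as the loop, but only counts;
 * no array is ever dereferenced. */
struct loop_report analyse_inclusive_loop(size_t first, size_t last, size_t len)
{
    struct loop_report r = { 0, 0 };
    size_t i;

    /* Preconditions assumed by this sketch (the second one keeps
     * the inclusive loop below from wrapping around). */
    assert(first <= last && last < (size_t)-1);

    r.skipped = (first < len) ? first : len;
    for (i = first; i <= last; i++) {
        if (i >= len)
            r.overrun++;
    }
    return r;
}
```

For the 8-element `data` array, `analyse_inclusive_loop(0, 8, 8)` reports one overrun index (`data[8]`), and `analyse_inclusive_loop(1, 8, 8)` reports one skipped element plus the same overrun. The correct loop, `for (i=0; i<8; i++)`, corresponds to `analyse_inclusive_loop(0, 7, 8)` and reports neither.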
This kind of bug is extremely frequent among programmers coming from
languages using 1-based indices, such as BASIC. PASCAL programmers also
tend to make such mistakes.

Subsection: Pointers And Assembly Language
Introductory paragraph: ofcourse is missing a space.

Regarding the example code: perhaps it would be better if the example
code explicitly returned 0 from main (and did not simply call return
without a value). IIRC in C++ this is OK (i.e. it is guaranteed that
main() will return zero, while in C this is not), but I might be wrong
here. Anyway, I think this is ugly and should be avoided.

The paragraph before showing the stack layout:
    Regarsing
    ^---------------typo, repl. with Regarding

Assembly listing: the comment for the `leave' statement is missing.
IIRC, this statement loads esp back with ebp and then pops ebp. But
you'd better check with the Intel manuals. Obviously, the stack has to
be balanced before returning from the function!!!

Subsection: Heap Management With malloc() and free()
Subsubsection: Usage Of malloc()
Example code: IMO, the parentheses in (*p_i) = 4; are superfluous.

Subsubsection: Usage Of free().
For consistency, in the first sentence:
    ...we use the free function
                  ^---------------------------replace with free()

About the second bullet, I don't think this is true. Memory allocators
are free to do what they want with the data that has been freed. Even
in your example memory allocator (several sections before), freeing a
block means writing the pointer to the next free block in its first 4
bytes (assuming a machine with 4-byte pointers). While this does not
exactly mean `clearing' the area, you should probably stress that the
freed area might (totally or partially) be modified during freeing. The
simplest rule is the best. The data contains poison. Don't touch it!

Last sentence in paragraph:
    negligable
    ^------typo, repl.
               with negligible

Subsection: Freeing The Same Chunk Twice
Last paragraph, 1st sentence:
    Another problem that might occure
                               ^-------------typo, repl. with occur

Section: C++ Runtime Memory Management
Subsection: Heap Management with new and delete
Subsubsection: Usage Of new
2nd paragraph:
    rather than a point to char,
                  ^^^^^--------repl. with pointer

In ANSI C, malloc() returns pointers to void and not to char. malloc()
used to return pointers to char in old C implementations which did not
have pointers to void.

In the example class A, I think the 2nd constructor should use an
initialization list:

    A(int num) : i(num) {}

For native types, this constructor behaves the same way as yours (or at
least is supposed to ;-). But if the type of i were not native, being
of type say class T, in my example the constructor of T accepting a
single int parameter would be invoked. In your example, the constructor
of T without parameters would be invoked for i, and after that T's
assignment operator inside the body of A's constructor. Depending on
the actual implementations of the constructors and assignment
operators, this may or may not be equivalent (but it is very ugly if
they are not ;-). Your example generates compile errors if class T
defines only a constructor accepting a single int and does not define a
constructor callable without any parameters.

Last sentence of the last paragraph:
    seperation
    ^-----------typo, repl. with separation

Subsubsection: Usage Of delete
    For native types, delete just freed the
                                  ^-------typo, repl. with frees

Example code comment: ammount is a typo, repl. with amount.

Inside the Note, explaining virtual destructors:
    For example, lets suppose
                 ^^^^------------------typo, repl. with let's

Example code explaining virtual destructors:
    ~Parent { delete p_parent; }
           ^---------------------------Missing ()

Comment before delete p_obj1;
    destrcutor, '~Parent()'
        ^^^^---------------typo, repl. with destructor

After that:
    Lets delete the object now.
    Because we are freed it via
               ^---repl. with have
    (and in the sentence before it, Lets is a typo, repl. with Let's)
    the compiler can not...
                 ^^^^^^^-----------repl. with cannot

Fixed virtual destructor example:
    virtual ~Parent { delete p_parent; }
                   ^------------------------Missing ()

Subsubsection: new Vs. new[]
Last example:
    A a_vector = new A[10];
     ^---------------------------Missing *

Subsubsection: delete Vs. delete[]
"Note that the compiler (or runtime library) cannot know by itself
whether to use the delete or the delete[] operator - we need to tell it
that."

I don't think that this is correct. Nothing prevents the runtime
library from explicitly marking blocks returned via new[] as arrays and
acting `correctly' even if delete has been invoked instead of delete[].
IIRC there used to be C++ runtimes that did that and correctly invoked
delete[] if someone used delete instead. The only reason why the C++
runtime does not behave this way is efficiency.

Regarding the paragraph after the example, note that mixing delete and
delete[] in either direction can lead to memory corruption, assuming
the following implementation of the memory manager.

Blocks allocated with new have the following format:

    +-----+------------------------------------+
    |size | data                               |
    +-----+------------------------------------+
          ^----pointer returned to user

while blocks allocated with new[] have the following format:

    +-----+--------------+---------------------+
    |size |num elements  |data                 |
    +-----+--------------+---------------------+
                         ^--------pointer returned to the user

The number of elements of the array has to be stored, because the
destructors for each element in the array have to be invoked when the
block is freed. Assuming a 32 bit machine, delete will skip 4 bytes
backwards to figure out the block size, and then add the block to the
free list (after it has invoked the destructor).
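The two layouts above can be modelled in a few lines of plain C. This is only a sketch of the hypothetical memory manager assumed in this message (32-bit size fields, header directly in front of the data); real C++ runtimes are free to lay out their blocks differently. All names (`FIELD`, `make_scalar_block`, `make_array_block`, `size_seen_by_scalar_delete`) are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed layouts, as drawn in the diagrams (32-bit fields):
 *   new   : [size][data...]                 user pointer -> data
 *   new[] : [size][num elements][data...]   user pointer -> data
 */
enum { FIELD = sizeof(uint32_t) };

/* Lays out a "new"-style block in `raw`; returns the user pointer. */
unsigned char *make_scalar_block(unsigned char *raw, uint32_t size)
{
    assert(raw != NULL);
    memcpy(raw, &size, FIELD);
    return raw + FIELD;
}

/* Lays out a "new[]"-style block in `raw`; returns the user pointer. */
unsigned char *make_array_block(unsigned char *raw, uint32_t size,
                                uint32_t count)
{
    assert(raw != NULL);
    memcpy(raw, &size, FIELD);
    memcpy(raw + FIELD, &count, FIELD);
    return raw + 2 * FIELD;
}

/* What a scalar "delete" would take to be the block size:
 * the 4 bytes immediately before the user pointer. */
uint32_t size_seen_by_scalar_delete(const unsigned char *user)
{
    uint32_t value;

    assert(user != NULL);
    memcpy(&value, user - FIELD, FIELD);
    return value;
}
```

With a block built by `make_array_block(raw, 48, 10)`, `size_seen_by_scalar_delete()` returns 10 (the element count) rather than 48, and the address it would treat as the start of the block, `user - FIELD`, is `raw + 4` rather than `raw`.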
However, if delete is invoked on a block allocated with new[] (instead
of new), it will skip back 4 bytes and treat num_elements as the block
size, and even if we are so lucky that they are the same (an array of 1
char :-))), the freed block will start *after* the real block size
field. I think it's not hard to imagine that now the heap is seriously
corrupted. The other way round (calling delete[] for a block allocated
with new) leads to an equally nasty corruption. So the moral is: no,
calling delete on a block allocated with new[] is no more innocent than
calling delete[] on a block allocated with new.

Subsection: "Inter-mixing C++ And C Memory Managers"
First sentence:
    extention
    ^---------typo, repl. with extension

Also note that, because name mangling schemes are not standardized
across C++ compilers, libraries compiled with one C++ compiler and
distributed in binary form will in general not work with other C++
compilers.

Subsection: The Odds Of References
You should probably stress the fact that, unlike pointers, references
are bound to the referred object *forever* (i.e. until they go out of
scope). You cannot change the association between a reference and an
object after the reference has been initialized. In this respect, a
reference to T behaves more like a T* const (i.e. the pointer itself
cannot be changed) than like a pure T*. Also, note that there are
places in C++ where references cannot be avoided, such as operator
overloading.

Section: Tools For C/C++ Memory Problems Resolving
Subsection: Free Libraries For Locating Memory Leaks
Misspelled Ofcourse (i.e. no space) in the 1st paragraph.

Subsection: "Must-Have" Free Tools For Many Memory Problems Types -
Linux/x86
Last sentence in the 3rd paragraph:
    faster then trying to find the bugs
           ^----typo, repl. with than

On Mon, 30 Dec 2002, guy keren wrote:

>
> On Wed, 25 Dec 2002, Michael Sternberg wrote:
>
> > I'm looking for a good URL that explains
> > memory management in modern Linux kernel.
> >
> > I.e.
> > what happens when I'm typing malloc,
> > how many memory is assigned to a process,
> > what is virtual memory, how swapping works,
> > etc...
>
> ok. this was the last push i needed ;)
> a new tutorial, titled "Unix and C/C++ runtime memory management for
> programmers" tutorial was just released on the LUPG site
> (http://www.actcom.co.il/~choo/lupg/). the tutorial actually tries to
> explain what happens when you're using malloc(), what is virtual memory,
> etc. took me over a year to eventually complete it. comments and fixes are
> welcome, as usual.
>
> the tutorial's direct url is
> http://www.actcom.co.il/~choo/lupg/tutorials/unix-memory/unix-memory.html
>
> hope this helps,
> --
> guy
>
> "For world domination - press 1,
>  or dial 0, and please hold, for the creator." -- nob o. dy
>
> --------------------------------------------------------------------------
> Haifa Linux Club Mailing List (http://www.haifux.org)
> To unsub send an empty message to [EMAIL PROTECTED]