--- In [email protected], Thomas Hruska <[EMAIL PROTECTED]> wrote:
>
> pushkar raj wrote:
> > I am dire straits regarding performance issue...But my
> > project constraints are such that I must solve this
> > problem only through c language.
>
> Read in the first file in its entirety.
>
> Extract each and every line. You only need one malloc() call
> - just point to each entry in the original file and alter its
> contents to have a terminating null '\0' at the end of each
> line. Memory allocation is always a performance chewer.
> Read Safe C++ Design Principles' section on memory allocation
> - applies to C as well.
>
> You should make two passes over the data. The first pass
> determines how many lines are in the first file. Then you
> call malloc(). Then you fill the allocated memory
> with pointers to the data during the second pass.
<snip>

I would like to add that this can be done a bit more simply:

stat() the file first, because you need its size. Then malloc() a
buffer of exactly that size and read the whole file into memory in
one chunk.

The next step is to count the number of lines in the file. Because
the whole file is already in the buffer, this is a pure in-memory
operation and should be very fast (as long as you're careful not to
run an infinite loop while counting the lines).

Only now should you allocate the array of line pointers, fill it,
and then, as Thomas suggested, sort this array of pointers.

Then you loop through file 2 and look up each of its lines (with a
binary search or something similar) in the sorted array of pointers
into your memory buffer. This way you will cut the runtime down to
an easily manageable minimum.

There are faster methods, but I fear they are beyond your current
knowledge (no offense intended).

Regards,
Nico
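
P.S. A rough sketch of the steps above, in case it helps. It is only
one way to do it and rests on a few assumptions of mine: a POSIX
system (for stat()), lines ending in a plain '\n', and "found" meaning
an exact match of the whole line. The function names load_file,
split_lines, cmp_lines and contains_line are made up for this example.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

/* Read a whole file into one malloc()ed buffer; returns NULL on failure.
   One extra byte is allocated so the buffer always ends in '\0'. */
char *load_file(const char *path, size_t *len)
{
    struct stat st;
    FILE *fp;
    char *buf;

    if (stat(path, &st) != 0 || (fp = fopen(path, "rb")) == NULL)
        return NULL;
    buf = malloc((size_t)st.st_size + 1);
    if (buf != NULL) {
        *len = fread(buf, 1, (size_t)st.st_size, fp);
        buf[*len] = '\0';
    }
    fclose(fp);
    return buf;
}

/* Pass 1 counts the lines, then a single malloc() creates the pointer
   array, then pass 2 turns every '\n' into '\0' and records each line
   start. Assumes buf[len] is '\0' (load_file guarantees this).
   Returns the number of lines; *lines_out is NULL if malloc() failed. */
size_t split_lines(char *buf, size_t len, char ***lines_out)
{
    size_t i, n = 0, count = 0;
    char *start = buf;
    char **lines;

    for (i = 0; i < len; i++)
        if (buf[i] == '\n')
            count++;
    if (len > 0 && buf[len - 1] != '\n')
        count++;                 /* last line without a trailing newline */

    lines = malloc((count ? count : 1) * sizeof *lines);
    *lines_out = lines;
    if (lines == NULL)
        return 0;

    for (i = 0; i < len; i++) {
        if (buf[i] == '\n') {
            buf[i] = '\0';
            lines[n++] = start;
            start = buf + i + 1;
        }
    }
    if (start < buf + len)
        lines[n++] = start;
    return n;
}

/* Comparison callback shared by qsort() and bsearch(): both hand us
   pointers to array elements, i.e. pointers to char pointers. */
int cmp_lines(const void *a, const void *b)
{
    return strcmp(*(char *const *)a, *(char *const *)b);
}

/* Binary search in the sorted pointer array; 1 if key is present. */
int contains_line(char **lines, size_t n, const char *key)
{
    char *k = (char *)key;   /* same element type as the array; never written to */
    return bsearch(&k, lines, n, sizeof *lines, cmp_lines) != NULL;
}
```

The intended use would be: load_file() on file 1, split_lines() on the
buffer, qsort(lines, n, sizeof *lines, cmp_lines), and then
contains_line() for every line of file 2. Note that files with "\r\n"
line endings would keep the '\r' at the end of each line, so both
files need the same line-ending convention for the comparison to work.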
