--- In [email protected], Thomas Hruska <[EMAIL PROTECTED]> wrote:
>
> pushkar raj wrote:
> > I am in dire straits regarding a performance issue... But my
> > project constraints are such that I must solve this
> > problem only through c language.
> 
> Read in the first file in its entirety.
> 
> Extract each and every line.  You only need one malloc() call
> - just point to each entry in the original file and alter its
> contents to have a terminating null '\0' at the end of each
> line. Memory allocation is always a performance chewer.
> Read Safe C++ Design Principles' section on memory allocation
> - applies to C as well.
> You should make two passes over the data.  The first pass
> determines how many lines are in the first file.  Then you
> call malloc().  Then you fill the allocated memory 
> with pointers to the data during the second pass.
<snip>

I would like to add that this can be done a bit more simply:
stat() the file; what matters is its size.
Now malloc() a memory buffer the size of the file itself (that's why
you should stat() the file first).
Now you can read the file into memory in one chunk.
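
A minimal sketch of this step, assuming a POSIX system (stat() is not
part of ISO C). The helper name read_whole_file is made up for
illustration, and the one extra byte for a terminating '\0' is an
addition that makes later string handling easier:

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>

/* Hypothetical helper: reads the whole file at `path` into a single
   malloc()ed buffer, appends a '\0', and stores the number of bytes
   read in *len_out. Returns NULL on failure; the caller free()s the
   buffer. */
char *read_whole_file(const char *path, size_t *len_out)
{
    struct stat st;
    FILE *fp;
    char *buf;
    size_t size, got;
    int failed;

    if (stat(path, &st) != 0 || st.st_size < 0)
        return NULL;
    size = (size_t)st.st_size;

    fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;

    buf = malloc(size + 1);          /* +1 for the terminating '\0' */
    if (buf == NULL) {
        fclose(fp);
        return NULL;
    }

    got = fread(buf, 1, size, fp);   /* one read for the whole file */
    failed = ferror(fp);
    fclose(fp);
    if (failed) {
        free(buf);
        return NULL;
    }

    buf[got] = '\0';
    *len_out = got;
    return buf;
}
```

This costs exactly one malloc() and one fread() for the whole file, no
matter how many lines it contains.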

The next step is to count the number of lines in this file. Because you
have read the whole file into a memory buffer, this is an in-memory
operation and should be very fast (as long as you're careful not to
end up in an infinite loop while counting the lines).
Only now should you allocate the array of line pointers, fill it, and
then (as suggested by Thomas) sort this array of pointers.
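
One way this could look, as a sketch only. It assumes the buffer holds
len bytes of file data followed by a '\0', that lines end in '\n' (no
"\r\n" handling), and that the file contains no embedded '\0' bytes.
The names index_lines and cmp_line are made up for illustration:

```c
#include <stdlib.h>
#include <string.h>

/* qsort() comparator for an array of char pointers */
static int cmp_line(const void *a, const void *b)
{
    return strcmp(*(char *const *)a, *(char *const *)b);
}

/* Hypothetical helper: splits buf into lines in place (each '\n' is
   overwritten with '\0') and returns a malloc()ed, sorted array of
   pointers into buf. The number of lines is stored in *count_out.
   Returns NULL if malloc() fails. */
char **index_lines(char *buf, size_t len, size_t *count_out)
{
    size_t count = 0, i, k = 0;
    char **lines;
    char *start = buf;

    /* Pass 1: count the lines. */
    for (i = 0; i < len; i++)
        if (buf[i] == '\n')
            count++;
    if (len > 0 && buf[len - 1] != '\n')
        count++;                     /* last line has no newline */

    /* The only allocation: one pointer per line. */
    lines = malloc((count ? count : 1) * sizeof *lines);
    if (lines == NULL)
        return NULL;

    /* Pass 2: terminate each line and record where it starts. */
    for (i = 0; i < len; i++) {
        if (buf[i] == '\n') {
            buf[i] = '\0';
            lines[k++] = start;
            start = buf + i + 1;
        }
    }
    if (start < buf + len)
        lines[k++] = start;

    qsort(lines, count, sizeof *lines, cmp_line);
    *count_out = count;
    return lines;
}
```

Both loops are bounded by len, so they cannot run forever, and the
line text itself is never copied: the pointers refer straight into the
file buffer.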

Then you loop through file 2 and look up each of its lines (with a
binary search or something similar) among the sorted lines in your
memory buffer. This way you will cut the runtime down to an easily
manageable minimum. There are faster methods, but I fear they are
beyond the scope of your current knowledge (no offense intended).
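
A sketch of the lookup, assuming you already have a sorted array of
line pointers for file 1 and that no line in file 2 is longer than the
fixed buffer used here. The names line_exists and count_matches are
made up, and simply counting the matching lines stands in for whatever
your project actually has to do with a match:

```c
#include <stdio.h>
#include <string.h>

/* Binary search: returns 1 if key equals one of the `count` strings
   in the sorted array `lines`, otherwise 0. */
int line_exists(char *const *lines, size_t count, const char *key)
{
    size_t lo = 0, hi = count;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        int c = strcmp(key, lines[mid]);

        if (c == 0)
            return 1;
        if (c < 0)
            hi = mid;        /* key sorts before lines[mid] */
        else
            lo = mid + 1;    /* key sorts after lines[mid] */
    }
    return 0;
}

/* Reads file 2 line by line from fp and counts how many of its lines
   also occur in the sorted array. Assumes each line fits in the
   buffer below. */
size_t count_matches(FILE *fp, char *const *lines, size_t count)
{
    char line[4096];
    size_t matches = 0;

    while (fgets(line, sizeof line, fp) != NULL) {
        line[strcspn(line, "\n")] = '\0';   /* strip the newline */
        if (line_exists(lines, count, line))
            matches++;
    }
    return matches;
}
```

Each lookup needs about log2(n) string comparisons instead of n, so
for a million lines in file 1 that is roughly 20 comparisons per line
of file 2.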

Regards,
Nico
