i've got some C code that reads an 800 MB CSV file and allocates memory for an array to store the data in. it reads the CSV file line by line and reallocs additional space with each line read. having timed this and found realloc to be slow when the array is large, i am aiming to make this faster but am not sure about the best way to proceed.
the current code uses realloc in the manner suggested by the manpage:

    newsize = size + 1;
    time(&t1);  // start timing realloc
    if ((newap = (int *)realloc(ap, newsize*sizeof(int))) == NULL) {
        free(ap);
        ap = NULL;
        size = 0;
        return (NULL);
    }
    time(&t2);  // stop timing realloc; start timing fscanf

as the size of ap grows, so does the time it takes to realloc the space. an alternative to this procedure would be to scan through the CSV file to determine how many array entries i would need, realloc it all at once, then go back through the CSV file again to read the data into the array.

i'm not confident this is the only way to do this and would appreciate any suggestions for speeding up this procedure.

cheers,
jake
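
p.s. for reference, here is a rough sketch of the two-pass alternative described above (count the entries, allocate once, rewind, then read). it is not my actual code: the names read_ints_two_pass and next_int are made up for illustration, and it assumes the file holds plain integers separated by commas and newlines.

```c
#include <stdio.h>
#include <stdlib.h>

/* hypothetical helper: read the next integer, skipping commas and whitespace.
   returns 1 on success, 0 at end of file or on malformed input. */
static int next_int(FILE *fp, int *out)
{
    int c;

    /* skip separators in front of the number */
    while ((c = fgetc(fp)) != EOF &&
           (c == ',' || c == ' ' || c == '\t' || c == '\r' || c == '\n'))
        ;
    if (c == EOF)
        return 0;
    ungetc(c, fp);               /* put back the first non-separator char */
    return fscanf(fp, "%d", out) == 1;
}

/* hypothetical function: returns a malloc'd array holding every integer in
   the file and stores the number of entries in *count. returns NULL (with
   *count == 0) if the file cannot be opened, is empty, or malloc fails. */
int *read_ints_two_pass(const char *path, size_t *count)
{
    FILE *fp = fopen(path, "r");
    int *ap = NULL;
    size_t n = 0, i = 0;
    int v;

    *count = 0;
    if (fp == NULL)
        return NULL;

    /* pass 1: count the entries without storing them */
    while (next_int(fp, &v))
        n++;

    /* single allocation, then pass 2: read the data into the array */
    if (n > 0 && (ap = malloc(n * sizeof *ap)) != NULL) {
        rewind(fp);
        while (i < n && next_int(fp, &v))
            ap[i++] = v;
        *count = i;
    }

    fclose(fp);
    return ap;
}
```

this trades the repeated realloc calls for reading the file twice, so whether it is actually faster on the 800 MB file would need to be timed.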