i've got some C code that is reading from a 800 MB CSV file and allocates memory
for an array to store the data in. the method used is to read the CSV file
line-by-line and realloc additional space with each line read. having timed this
and found the realloc speed to be low when the array is large, i am aiming to
make this faster but am not sure about the best way to proceed.

the current code uses realloc in the manner suggested by the manpage:

    newsize = size + 1;
    time(&t1);                  // start timing realloc
    if ((newap = (int *)realloc(ap, newsize*sizeof(int))) == NULL) {
        free(ap);
        ap = NULL;
        size = 0;
        return (NULL);
    }
    time(&t2);                  // stop timing realloc; start timing fscanf

as the size of ap grows, so does the time it takes to realloc the space.

an alternative to this procedure would be to scan through the CSV file to
determine how many array entries i would need, realloc it all at once, then go
back through the CSV file again to read the data into the array. i'm not
confident this is the only way to do this and would appreciate any suggestions
for speeding up this procedure.

cheers,
jake 

Reply via email to