On Jan 17, 2008, at 11:22 PM, Till Bandi wrote:

I guess that would be a good solution - my only problem is that the file is 5.6 gigabytes, so I can't put it all in memory.

My real problem is here:

    put 1 into vStartwert
    set the cursor to watch
    open file tFileIn for read
    open file tFileOut for write

    repeat forever
        read from file tFileIn at vStartwert for 1000000 lines
        if it is empty then exit repeat -- stop once the whole file is read
        put it into vChunk
        replace ";" with tab in vChunk
        replace "," with "." in vChunk
        write vChunk to file tFileOut at end
        add the number of chars of vChunk to vStartwert
    end repeat
    close file tFileIn
    close file tFileOut

This works until the output file reaches 2 gigabytes. So I think vStartwert is getting too big and no longer works. That's why I wanted to delete the lines I had already treated from the input file, so I could always start at the first line.

Maybe there is a better solution?

As you know, you can read a large file in chunks, in sequence, and write them out to a second file without ever having the whole file in memory at once (which is the main drawback of using URLs as opposed to file read/writes). What if you broke the file up into manageable chunks before doing any processing? Something like this (pseudocode):

read in 1000000 lines from bigFile
write out 1000000 lines to destFile1
read in next 1000000 lines from bigFile
write out 1000000 lines to destFile2
etc.

Then process each destFileN separately, and recombine them when you're finished?
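Here's a minimal sketch of that splitting pass in Revolution script, assuming made-up names tBigFile and tDestStem for the source file and the output file stem. Note that reading sequentially, without an "at" position, lets the engine keep track of the file pointer itself, so no variable ever has to hold a multi-gigabyte byte offset:

    put "bigFile.txt" into tBigFile
    put "destFile" into tDestStem
    put 0 into tPartNum
    open file tBigFile for read
    repeat forever
        -- with no "at" position, each read continues where the last one stopped
        read from file tBigFile for 1000000 lines
        put it into tChunk
        if tChunk is empty then exit repeat
        add 1 to tPartNum
        open file (tDestStem & tPartNum) for write
        write tChunk to file (tDestStem & tPartNum)
        close file (tDestStem & tPartNum)
    end repeat
    close file tBigFile

Each destFileN should then be small enough to run your replace loop on directly, and the same open/read/write pattern can append the processed pieces back into one file at the end.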

Devin

Devin Asay
Humanities Technology and Research Support Center
Brigham Young University
