Calling readLine pretty much makes any attempt at controlling the page size 
pointless: the amount read on each call is dictated by the length of that line 
in the file, and the lines in the example given in the link are pretty short.

Back in my uni days I did an interesting exercise: writing a C program that 
read from a file and timed how long it took with different buffer sizes. And 
yes, the times varied a lot between sizes. However, to make this work we had to 
use only low-level file operations. At the end we did the same thing using 
stdio and found that it always matched the best times we had managed by 
manually tweaking buffer sizes.
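
For anyone who wants to repeat that kind of experiment, a minimal sketch might look like the following. It is not the original coursework: it assumes a POSIX system (open/read/clock_gettime), and the function names read_unbuffered, read_stdio, now_seconds and compare_buffer_sizes are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L  /* for clock_gettime and ssize_t */
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

/* Read the whole file with raw read(2) calls of buf_size bytes each.
   Returns the number of bytes read, or -1 on error. */
long read_unbuffered(const char *path, size_t buf_size)
{
    assert(buf_size > 0);
    char *buf = malloc(buf_size);
    if (buf == NULL) return -1;
    int fd = open(path, O_RDONLY);
    if (fd < 0) { free(buf); return -1; }

    long total = 0;
    ssize_t n;
    while ((n = read(fd, buf, buf_size)) > 0)
        total += n;

    close(fd);
    free(buf);
    return n < 0 ? -1 : total;
}

/* Same job through stdio, one character at a time. stdio's own buffer
   decides how much is fetched per system call. */
long read_stdio(const char *path)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL) return -1;
    long total = 0;
    while (getc(f) != EOF)
        total++;
    int failed = ferror(f);
    fclose(f);
    return failed ? -1 : total;
}

/* Seconds from a monotonic clock, for measuring elapsed time. */
double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

/* Print one timing line per buffer size, then one for stdio. */
void compare_buffer_sizes(const char *path)
{
    static const size_t sizes[] = {1, 64, 512, 4096, 65536};
    for (size_t i = 0; i < sizeof sizes / sizeof sizes[0]; i++) {
        double start = now_seconds();
        long bytes = read_unbuffered(path, sizes[i]);
        printf("read(2) %6zu-byte buffer: %ld bytes in %.6f s\n",
               sizes[i], bytes, now_seconds() - start);
    }
    double start = now_seconds();
    long bytes = read_stdio(path);
    printf("stdio (default buffer):    %ld bytes in %.6f s\n",
           bytes, now_seconds() - start);
}
```

Calling compare_buffer_sizes on a reasonably large file from your own main prints one line per buffer size. The numbers depend heavily on the OS file cache, so run it a few times before drawing conclusions.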

As others have pointed out, this exercise is probably not the best for playing 
with I/O optimisations, because by far the most expensive part is manipulating 
the CSV strings.

Also, the Python example you gave is not quite equivalent: it uses built-in 
operations to modify the field and rebuild the output line, whereas your Nim 
program simply iterates over the fields. I came up with the following as a more 
equivalent Nim example. Compiled as a release build (-d:release), it was faster 
than the Python one (though not by much, a testament to the optimisations done 
in the Python libraries).
    
    
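    import os, strutils
    
    # Usage: <program> <input.csv> <column name> <new value> <output.csv>
    proc main() =
      const pageSize = 4096  # buffer size passed to open()
    
      let
        toChange = paramStr(2)    # header name of the column to overwrite
        changeWith = paramStr(3)  # value written into that column
    
      var
        input = open(paramStr(1), fmRead, pageSize)
        output = open(paramStr(4), fmWrite, pageSize)
        changeAt = 0  # falls back to the first column if no header matches
    
      # Locate the target column in the header row. The header itself is
      # consumed here and is not copied to the output.
      var i = 0
      for column in input.readLine().split(','):
        if column == toChange:
          changeAt = i
          break
        inc i
    
      # Overwrite the chosen field in every remaining row.
      for row in input.lines():
        var fields = row.split(',')
        fields[changeAt] = changeWith
        output.writeLine fields.join(",")
    
      input.close()
      output.close()
    
    main()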
    
