Digging Huge Files

Sivakatirswami Mon, 03 Aug 2009 20:11:48 -0700

I'm wanting to process both error logs and access logs on our web sites.

I have logs rotating out weekly, so that means we have 7 access_log files, each ranging from about 250 to 300MB in size.


It is my understanding that

put url "file:/someGiantFile.txt" into tAccessLog

will load the entire file into memory. I might try that on one of our G5 quad towers, but I'd prefer to work on my PowerBook.


So I assume that one could use the read command

This works:

on mouseUp
  put fld "path" into tPath # path to local 245MB access_log.1 file
  open file tPath for read

  put 1 into tStart
  put 20000 into tStep
  put 1000 into tChunkSize

  repeat forever
    # read the next tChunkSize lines; "it" comes back empty at end of file
    read from file tPath for tChunkSize lines
    put it into tAccessLogFileChunk
    if tAccessLogFileChunk is empty then exit repeat

    put processLogs(tAccessLogFileChunk) & cr after tOutput
    # running counter written into the output as a progress marker
    put (tStart + tStep) into tStart
    put tStart & cr after tOutput
  end repeat
  close file tPath

  ask file "Where should we save this?" with "LogResults.txt"
  put it into tURL
  put tOutput into url ("file:" & tURL)
end mouseUp

function processLogs tAccessLogFileChunk
  put empty into tFoundLines
  repeat for each line x in tAccessLogFileChunk
    if x contains "revolution" then
      put x & cr after tFoundLines
    end if
  end repeat
  return tFoundLines
end processLogs



Any comments on optimizing this?

Actually it was pretty speedy, under a minute to get to saving the results out... I could add paths to all six access files for a week and probably get all the results for a search out in under 5 minutes for 1.5Gig... So, that's really not too bad.
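
One variation I have not timed: instead of collecting everything in tOutput and saving at the end, write each chunk's matches straight to the results file, so memory use stays flat however many matches turn up. This is only a sketch under assumptions. It reuses fld "path" and processLogs from the handler above; tInPath, tOutPath, tChunk and tFound are names made up for the example, and the 1000-line chunk size is simply carried over. Whether it is actually faster would need measuring.

```
on mouseUp
  # Assumption: fld "path" holds the log path, as in the handler above
  put fld "path" into tInPath

  # Ask for the destination first so results can be written as we go
  ask file "Where should we save this?" with "LogResults.txt"
  if it is empty then exit mouseUp # dialog was cancelled
  put it into tOutPath

  open file tInPath for read
  open file tOutPath for write # creates the file, or replaces an existing one

  repeat forever
    read from file tInPath for 1000 lines
    put it into tChunk
    if tChunk is empty then exit repeat # nothing left to read

    # processLogs returns the matching lines, each already ending in cr
    put processLogs(tChunk) into tFound
    if tFound is not empty then write tFound to file tOutPath
  end repeat

  close file tInPath
  close file tOutPath
end mouseUp
```

This version leaves out the progress counter, since the results go directly to disk rather than into a variable that can be inspected afterwards.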


Sivakatirswami





_______________________________________________
use-revolution mailing list
[email protected]
Please visit this url to subscribe, unsubscribe and manage your subscription 
preferences:
http://lists.runrev.com/mailman/listinfo/use-revolution
