It started with some friendly ribbing. I have a friend who thinks that the
answer to all things is "perl," and he goes on and on about its benefits,
speed, flexibility, etc., etc. Meanwhile, perl code looks to me like
something my cat created when he walked across the keyboard.
Having had enough, and with an actual, simple project to test, I created
equivalent perl and RB apps to process a 100 MB text file. The processing
was fairly simple: if a line was a pure number, a constant string was added
to the beginning of it; otherwise, the text was wrapped in another constant
string and each quote character in it was doubled.
For example, this text:
1
Text 1
2
Text 2
3
Text "3"
Became this text:
case 1
r = "Text 1"
case 2
r = "Text 2"
case 3
r = "Text ""3"""
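For anyone who wants the rule spelled out independently of either program, here is a minimal sketch in Python (deliberately neither of the two languages being compared). The function names are made up for illustration, and it assumes "pure number" means ASCII digits only, which is what the perl regex checks; IsNumeric in RB may accept more than that.

```python
import re

# Assumption: a "pure number" line consists of ASCII digits only.
PURE_NUMBER = re.compile(r"[0-9]+")


def convert_line(line: str) -> str:
    """Apply the rule described above to one line (without its newline)."""
    if PURE_NUMBER.fullmatch(line):
        # Numeric line: prefix with a constant string.
        return "case " + line
    # Text line: double every quote character, then wrap in r = "...".
    return 'r = "' + line.replace('"', '""') + '"'


def convert_lines(lines):
    """Convert a sequence of lines, preserving their order."""
    return [convert_line(line) for line in lines]
```

Feeding it the six sample lines above should give back the six converted lines shown.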
Imagine my surprise when it took RB a minute to do what perl did in 15
seconds. I'm not surprised that perl is faster, only by HOW much faster it
is. My argument was going to be, "See, the difference isn't that much, and
RB is much easier to read," but these results make it no contest.
So my question is: Why is RB so much slower? What is it doing that's making
the difference? And could I do something to close the gap?
The code I used for each is as follows:
perl -lne 'chomp ; if(/^\d+$/){print ('\''case '\'' , $_)}
else {s|\"|\"\"|g ; print("r = \"" , $_ , "\"\n")}' test.txt > perltest.txt
And for REALbasic:
#pragma BackgroundTasks false
#pragma BoundsChecking false
dim fIn, fOut as FolderItem
dim t as Double = microseconds
fIn = DesktopFolder.Child( "test.txt" )
fOut = DesktopFolder.Child( "output.txt" )
dim tIn as TextInputStream = fIn.OpenAsTextFile
dim tOut as TextOutputStream = fOut.CreateTextFile
dim line as string
while not tIn.EOF
  line = tIn.ReadLine
  if IsNumeric( line ) then
    tOut.WriteLine( "case " + line )
  else
    line = ReplaceAllB( line, """", """""" )
    tOut.WriteLine( "r = """ + line + """" + chr( 13 ) )
  end if
wend
tIn.Close
tOut.Close
t = microseconds - t
t = t / 1000000
MsgBox( format( t, "#,0.000" ) + " seconds" )
BTW, I also created a version that loaded the whole file at once, split it
into lines, modified them without testing each one (since I knew that every
odd line was a number and every even line was text), and dumped the result
back to a file. That took 24 seconds!
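To make that whole-file approach concrete, here is a rough sketch of the idea in Python rather than RB. It is not the code I actually ran; the function name is hypothetical, and it assumes the input really does alternate number, text, number, text with newline-separated lines.

```python
def convert_text_bulk(text: str) -> str:
    """Convert the whole file contents at once, trusting line position.

    Assumption: lines alternate number, text, number, text, ... so no
    per-line numeric test is performed.
    """
    lines = text.split("\n")
    # A trailing newline leaves one empty element at the end; drop it.
    if lines and lines[-1] == "":
        lines.pop()

    out = []
    for index, line in enumerate(lines):
        if index % 2 == 0:
            # 1st, 3rd, 5th, ... line: assumed to be a number.
            out.append("case " + line)
        else:
            # 2nd, 4th, 6th, ... line: assumed to be text.
            out.append('r = "' + line.replace('"', '""') + '"')

    # Rebuild the output in one pass instead of writing line by line.
    return "".join(converted + "\n" for converted in out)
```

The point of the design is that the file is read and written in one go each, and the only per-line work left is the string concatenation and the quote doubling.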
__________________________________________________________________________
Kem Tekinay (212) 201-1465
MacTechnologies Consulting Fax (914) 242-7294
http://www.mactechnologies.com Pager (917) 491-5546