This is the sort of thing for which Perl is designed. I think your REALbasic code could be made a little faster, but you're probably still not going to beat Perl here. In fact, you might not be able to beat it even with a C program without some work.

Charles Yeomans


On Dec 8, 2006, at 11:07 AM, Kem Tekinay wrote:

It started with some friendly ribbing. I have a friend who thinks that the answer to all things is "perl," and he goes on and on about its benefits, speed, flexibility, etc., etc. Meanwhile, perl code looks to me like something my cat created when he walked across the keyboard.

Having had enough, and with an actual, simple project to test, I created equivalent perl and RB apps to process a 100 MB text file. The processing was fairly simple: if a line was a pure number, a constant string was added to the beginning of it; otherwise, the text was wrapped in another constant string and each quote character in it was doubled.

For example, this text:

 1
 Text 1
 2
 Text 2
 3
 Text "3"

Became this text:

 case 1
 r = "Text 1"
 case 2
 r = "Text 2"
 case 3
 r = "Text ""3"""
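The rule described above can also be sketched in Python, purely as an illustration; this is not one of the two programs that were timed. The names `convert_line` and `convert` are hypothetical, and "pure number" is assumed to mean a line made up only of the digits 0-9.

```python
import re

# Assumption: a "pure number" is one or more ASCII digits and nothing else.
PURE_NUMBER = re.compile(r"[0-9]+")


def convert_line(line: str) -> str:
    """Apply the rule to one line (without its line ending)."""
    if PURE_NUMBER.fullmatch(line):
        # Numeric line: prefix the constant string.
        return "case " + line
    # Any other line: double each quote, then wrap in r = "...".
    return 'r = "' + line.replace('"', '""') + '"'


def convert(lines):
    """Convert a sequence of lines, preserving their order."""
    return [convert_line(line) for line in lines]


# The sample input from the message above.
sample = ["1", "Text 1", "2", "Text 2", "3", 'Text "3"']
for converted in convert(sample):
    print(converted)
```

Run on the six sample lines, this prints the six output lines shown above.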

Imagine my surprise when it took RB a minute to do what perl did in 15 seconds. I'm not surprised that perl is faster, only by HOW much. My argument was going to be, "see, the difference wasn't that much, and RB is much easier to read," but these results make it no contest.

So my question is: Why is RB so much slower? What is it doing that's making
the difference? And could I do something to close the gap?

The code I used for each is as follows:

 perl -lne 'chomp ; if(/^\d+$/){print ('\''case '\'' , $_)} else {s|\"|\"\"|g ; print("r = \"" , $_ , "\"\n")}' test.txt > perltest.txt

And for REALbasic:

  #pragma BackgroundTasks false
  #pragma BoundsChecking false

  dim fIn, fOut as FolderItem
  dim t as Double = microseconds

  fIn = DesktopFolder.Child( "test.txt" )
  fOut = DesktopFolder.Child( "output.txt" )

  dim tIn as TextInputStream = fIn.OpenAsTextFile
  dim tOut as TextOutputStream = fOut.CreateTextFile

  dim line as string
  while not tIn.EOF
    line =  tIn.ReadLine
    if IsNumeric( line ) then
      tOut.WriteLine( "case " + line )
    else
      line = ReplaceAllB( line, """", """""" )
      tOut.WriteLine( "r = """ + line + """" + chr( 13 ) )
    end if
  wend

  tIn.Close
  tOut.Close

  t = microseconds - t
  t = t / 1000000

  MsgBox( format( t, "#,0.000" ) + " seconds" )

BTW, I also created a version that loaded the whole file at once, split it into lines, modified them without testing (since I knew that every odd line was a number and every even line was text), and dumped the result back to a file. That took 24 seconds!
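For comparison, the whole-file approach just described might look like the following in Python. This is only a sketch of the idea, not the REALbasic code that was timed; the name `convert_alternating` is hypothetical, and it assumes the input strictly alternates number line, text line, with "\n" line endings.

```python
def convert_alternating(text: str) -> str:
    """Convert a whole file's text at once, trusting line position
    instead of testing each line's content."""
    lines = text.split("\n")
    # A trailing newline leaves one empty final element; drop it.
    if lines and lines[-1] == "":
        lines.pop()

    out = []
    for index, line in enumerate(lines):
        if index % 2 == 0:
            # 1st, 3rd, 5th... line (odd when counting from 1): a number.
            out.append("case " + line)
        else:
            # 2nd, 4th, 6th... line: text, with each quote doubled.
            out.append('r = "' + line.replace('"', '""') + '"')
    # Write every converted line back with a newline terminator.
    return "".join(line + "\n" for line in out)
```

The trade-off is that skipping the per-line test saves work, but the output is silently wrong if the input ever breaks the alternating pattern.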

__________________________________________________________________________
Kem Tekinay                                    (212) 201-1465
MacTechnologies Consulting                 Fax (914) 242-7294
http://www.mactechnologies.com           Pager (917) 491-5546

To join the MacTechnologies Consulting mailing list, send an e-mail to:
           [EMAIL PROTECTED]








_______________________________________________
Unsubscribe or switch delivery mode:
<http://www.realsoftware.com/support/listmanager/>

Search the archives of this list here:
<http://support.realsoftware.com/listarchives/lists.html>
