Re: [racket-dev] read-line performance problem

Neil Van Dyke Wed, 02 Nov 2011 17:55:34 -0700

Racket can do this somewhat faster, but I suggest any effort be focused on improvements that are also relevant to substantial programs, and not on trying to compete on Perl one-liners and poor benchmarks.


Details follow...

Trying this 'benchmark' on a 700MB log file (just Linux "dmesg" output, duplicated many times), I saw numbers with Racket 5.x somewhat comparable to those on Stackoverflow. (This was on Linux on an old 2GHz laptop, no swap space, and the kernel had cached the 700MB in RAM buffer, so it was just Racket pegging a CPU core at 100%.)

Using a "regexp-match" was significantly faster than "read-bytes-line", but I'm sure still slower than the other languages mentioned.
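For concreteness, here is a minimal sketch of the two styles being compared. It is not the code from the Stackoverflow question or the exact code timed above; the function names are made up for illustration, and the task (counting lines and discarding them) is an assumption. Note the two differ on a final line with no trailing newline: the "read-bytes-line" loop counts it, the "regexp-match" loop does not.

```racket
#lang racket/base

;; Style 1: allocate one byte string per line, then throw it away.
;; (Hypothetical helper name.)
(define (count-lines/read-bytes-line in)
  (let loop ([n 0])
    (if (eof-object? (read-bytes-line in))
        n
        (loop (add1 n)))))

;; Style 2: let the regexp matcher scan the port for the next newline.
;; Bytes before the match are consumed and discarded by regexp-match.
;; (Hypothetical helper name.)
(define (count-newlines/regexp in)
  (let loop ([n 0])
    (if (regexp-match #rx#"\n" in)
        (loop (add1 n))
        n)))
```

Either can be run over a file with something like (call-with-input-file "dmesg.log" count-newlines/regexp), where the file name is a placeholder.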

The process size stayed at 40MB total (shared libraries and everything). It looked like there were near-constant quick GC cycles. GC tuning might help?

This would be a more useful benchmark if it required actually doing something plausible with the allocations, rather than immediately throwing them away and doing no actual processing. I suspect Racket would perform relatively better on something closer to a real-world task.

Were I writing high-performance I/O code, I might use "read-bytes-avail!", to try to reduce allocations. Of course, sys-admins would not be doing this for quick scripting Perl-like tasks. (Were we to max out what we can do with GC tuning and optimizations, we could always try making a minilanguage for this traditional Perl-like task, which optimized away some allocations, such as by allocating only text that we use.)
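As a rough sketch of the "read-bytes-avail!" idea (not something that was benchmarked for this message): read into one reused buffer and scan it in place, so no per-line byte string is allocated. The function name, the 64KB buffer size, and the choice of counting newlines as the "work" are all assumptions made for illustration.

```racket
#lang racket/base

;; Count newline bytes while reusing a single fixed-size buffer.
;; `count-newlines` and the buffer size are illustrative choices.
(define (count-newlines in)
  (define buf (make-bytes 65536))
  (let loop ([count 0])
    ;; Returns the number of bytes placed in buf, or eof.
    (define n (read-bytes-avail! buf in))
    (cond
      [(eof-object? n) count]
      [(exact-integer? n)
       (loop (+ count
                (for/sum ([b (in-bytes buf 0 n)])
                  (if (eqv? b 10) 1 0))))] ; 10 is the newline byte
      ;; A port with special (non-byte) content yields a procedure; skip it.
      [else (loop count)])))
```

For example, (count-newlines (open-input-bytes #"a\nb\nc")) gives 2. Real line processing on top of this would also have to handle lines that straddle two buffer fills, which is the part a quick script would not want to write by hand.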

Matching Perl I/O performance would be nice, but I'm not disappointed if nobody does. Perl was originally developed for pretty much this exact task (i.e., going through a line-oriented text-ish data file, applying a regexp to each line) and to be fast even on a 16MHz 4MB Sun 3/50 of over 20 years ago.

Also, I think we discussed this a while ago (perhaps when making the few-liner examples for the new Web site), but I think that nobody will win over any Perl programmers by trying to get their language to do 20-year-old Perl one-liners. This program is a handful of characters in Perl, and telling people that they could be typing "lambda" and parentheses and such instead, and wouldn't that be so much better, makes one look like a crazy person. Focus on things that are *not* Perl one-liners, but are substantial programs -- especially ones that benefit from syntactic extension, functional-ish programming, and maintainability -- since that's where Racket becomes a smart tool of smart people, and where Perl becomes a burden of crazy people.

With that in mind, from a PR perspective, if a Perl-type person asks you, "What does this Perl one-liner look like in Racket?", the preferred responses are: (1) "That task looks like what Perl is good at"; (2) do as politicians do, and answer the question that you wish you had been asked; (3) pretend to speak only Swahili and to not understand the question.



Sam Tobin-Hochstadt wrote at 11/02/2011 07:14 PM:

On StackOverflow [1], someone reported that Racket's I/O performance
on large files was substantially worse than other languages for a
simple task.  I haven't yet tried it on a similarly large volume of
data, but I did see a performance difference relative to Chicken for
large but not huge files, and Ryan seems to have gotten similar
results.

[1] http://stackoverflow.com/questions/7946745/i-o-performance-in-mzscheme


--
http://www.neilvandyke.org/
_________________________________________________
 For list-related administrative tasks:
 http://lists.racket-lang.org/listinfo/dev
