On Saturday, January 16, 2016 at 11:26:14 PM UTC-5, Neil Van Dyke wrote:
> Your code is very string-ops-heavy, and I would start looking at that.  
> One thing you could do is look for opportunities to construct fewer 
> intermediate strings, including across multiple procedure calls.  You 
> could also look for operations that are expensive, and use 
> less-expensive ones.  If your strings have no multibyte characters, it'd 
> be easier to make it a lot faster with reused bytes I/O buffers and a 
> ton less copying, but you could also try that with multibyte characters 
> (it's harder and slower, though).
> 
> If you get fancy with the optimization, you might end up with a DSL or 
> mini-language for the formats and/or transformation, to simplify the 
> source code while making its behavior more sophisticated.  But maybe you 
> want to focus on a proof-of-concept for the optimizations first, before 
> you go to the work of implementing the DSL.
> 
> BTW, the below comment, from an aborted response to your previous email, 
> doesn't seem to apply to your code, but I'll just note it for the record:
> 
> >
> > Without tracing through the actual code being profiled (not 
> > volunteering!), it's hard to say what the fix is.
> >
> > One little tip, from glancing at "collects/racket/string.rkt"... If 
> > your code happens to be calling `string-trim` with a large number of 
> > different `sep` arguments (different by `eq?`), it looks like it will 
> > be especially slow.  Or if you're calling `string-trim` a huge number 
> > of times with a nonzero number of non-`#f` `sep` arguments, and you're 
> > GCing.  (The implementation assembles regular expressions for non-`#f` 
> > `sep` arguments, and caches them in a weak hash with `eq?` test.)
> 
> BTW, I wouldn't be surprised if `string-trim` could be made faster for 
> the normal case of `sep` being `#f`, though the code doesn't look bad 
> right now.  (The majority of code that people write is not very 
> efficient, and `string-trim` looks better than the norm.) However, it 
> was written in a generalized way, and I don't know if anyone sat down 
> and hand-optimized for the `#f` case specifically. Or, it might have 
> been written a long time ago, and the VM/compiler or strings have 
> changed quite a bit since then, and so the code could benefit from 
> re-hand-optimizing.  (I've done a bunch of that kind of optimizing as a 
> consultant, and it can easily take a few hours for a single procedure, 
> so tends to only happen as-needed.)
> 
> Neil V.

For this app, the data is actually straight ASCII upper case mainframe data :)

Can you elaborate on "reused bytes I/O buffers"?  One of the things I did 
in the C code was to reuse character arrays a lot and never malloc, but I'm 
less familiar with doing similar things in Racket.
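
As a rough illustration of what reusing a bytes buffer could look like in 
Racket, here is a minimal sketch. It assumes fixed-width ASCII records; the 
8-byte width and the names `record-length` and `count-records-starting-with` 
are made up for the example and do not come from the code under discussion. 
It relies on `read-bytes!`, which fills an existing byte string instead of 
allocating a new one for each read.

```racket
#lang racket/base

;; Hypothetical fixed record width; the real record layout is not given here.
(define record-length 8)

;; One buffer, allocated once and refilled for every record.
(define buffer (make-bytes record-length))

;; Count the records whose first byte equals first-byte.
;; No string or byte string is allocated per record.
(define (count-records-starting-with in first-byte)
  (let loop ([count 0])
    (define n (read-bytes! buffer in))
    (cond
      [(eof-object? n) count]              ; no more input
      [(< n record-length) count]          ; short trailing record: ignored in this sketch
      [(= (bytes-ref buffer 0) first-byte) (loop (add1 count))]
      [else (loop count)])))
```

For example, `(count-records-starting-with (open-input-bytes 
#"AAAAAAAABBBBBBBBAAAAAAAA") (char->integer #\A))` reads three 8-byte records 
through the same buffer and returns 2. This is roughly the Racket analogue of 
reusing a character array in C; whether it helps in practice would need 
profiling.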

I'm not opposed to hand optimizing (the C program was basically one gigantic 
optimization), but to be fair, it seems the same optimizations would need to be 
done to the Ruby code for comparison.

In other words, I have two related, but different goals:

1) I'd like to get to the point of being able to write expressive, "high level" 
code in Racket, in a similar manner as I've been accustomed to with Ruby, but 
with better performance than Ruby. Given Ruby typically trails the pack with 
respect to speed, that doesn't seem unreasonable.

2) I'd also like to get a better idea of practical optimizations that I can use 
with Racket when I need more speed, even if it lessens other aspects of the 
code such as readability, etc. I suppose the string-squeeze function in the 
Racket code is an example of that.
