On Thu, Mar 01, 2012 at 12:04:39AM +0100, Martin Nowak wrote:
[...]
> Mmh, I've retested and you're right dmd's lexer is about 2x faster.
> The main overhead stems from using ranges and enforce.
> 
> Quick profiling shows that 25% is spent in popFront and
> std.utf.stride.  Last time I worked on this I rewrote std.utf.decode
> to be much faster.  But utf characters are still "decoded" twice, once
> for front and then again for popFront. Also stride uses table lookup
> and can't be inlined.
[...]

One way to avoid decoding each character twice is to keep a single-dchar
lookahead buffer in your range: popFront does the decoding once, and
front just returns the cached value. Something like:

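        import std.stdio : File;
        import std.utf : decode;

        struct MyRange {
                private File src;
                private char[] buf;
                private dchar readahead;   // the one decoded character
                private bool done;         // set once input is exhausted

                this(File _src) {
                        src = _src;
                        // ... fill up buf from src here
                        popFront();        // decode the first character
                }

                @property bool empty() const pure {
                        return done;
                }

                @property dchar front() const pure {
                        return readahead;  // no decoding here
                }

                void popFront() {
                        if (buf.length == 0) {
                                done = true;
                                return;
                        }
                        size_t idx = 0;
                        // decode() advances idx past the character
                        readahead = decode(buf, idx);
                        buf = buf[idx..$];
                        // ... fill up buf more if needed
                }
        }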


T

-- 
"A man's wife has more power over him than the state has." -- Ralph Emerson
