I have done another test. Timings, dmd compiler, best of 4, seconds:

  D #1: 5.72
  D #4: 1.84
  D #5: 1.73
  Psy:  1.59
  D #2: 0.55
  D #6: 0.47
  D #3: 0.34
import std.file: read;
import std.c.stdio: printf;

int test(char[] data) {
    int count;
    foreach (i; 0 .. data.length - 3) {
        char[] codon = data[i .. i + 3];
        if ((codon.length == 3 && codon[0] == 'T' && codon[1] == 'A' && codon[2] == 'G') ||
            (codon.length == 3 && codon[0] == 'T' && codon[1] == 'G' && codon[2] == 'A') ||
            (codon.length == 3 && codon[0] == 'T' && codon[1] == 'A' && codon[2] == 'A'))
            count++;
    }
    return count;
}

void main() {
    char[] data0 = cast(char[])read("data.txt");
    int n = 300;
    char[] data = new char[data0.length * n];
    for (size_t pos; pos < data.length; pos += data0.length)
        data[pos .. pos + data0.length] = data0;
    printf("%d\n", test(data));
}

So when a comparison is among strings known at compile time to be small (say, fewer than 6 chars), the comparison should be replaced with inlined single-char comparisons. This makes the code longer, so it increases code cache pressure, but given how much slower the alternative is, I think it's an improvement. (A smart compiler is even able to remove the codon.length == 3 test, because the slice data[i .. i + 3] always has length 3. Mysteriously, if you remove those three length tests here, the program compiled with dmd gets slower.)

Bye,
bearophile