I have done another test. Timings, dmd compiler, best of 4, seconds:

  D #1: 5.72
  D #4: 1.84
  D #5: 1.73
  Psy:  1.59
  D #2: 0.55
  D #6: 0.47
  D #3: 0.34
import std.file: read;
import std.c.stdio: printf;

int test(char[] data) {
    int count;
    foreach (i; 0 .. data.length - 3) {
        char[] codon = data[i .. i + 3];
        if ((codon.length == 3 && codon[0] == 'T' && codon[1] == 'A' && codon[2] == 'G') ||
            (codon.length == 3 && codon[0] == 'T' && codon[1] == 'G' && codon[2] == 'A') ||
            (codon.length == 3 && codon[0] == 'T' && codon[1] == 'A' && codon[2] == 'A'))
            count++;
    }
    return count;
}

void main() {
    char[] data0 = cast(char[])read("data.txt");
    int n = 300;
    char[] data = new char[data0.length * n];
    for (size_t pos; pos < data.length; pos += data0.length)
        data[pos .. pos + data0.length] = data0;
    printf("%d\n", test(data));
}

So when a comparison is among strings known at compile time to be small (say, fewer than 6 chars), the comparison should be replaced with inlined single-char comparisons. This makes the code longer, so it increases code cache pressure, but given how much slower the alternative is, I think it's an improvement. (A smart compiler is even able to remove the codon.length == 3 test, because the slice data[i .. i + 3] always has length 3. Mysteriously, if you remove those three length tests here, the program compiled with dmd gets slower.)

Bye,
bearophile