On 2013-02-04 20:39, monarch_dodra wrote:
AFAIK, he is reading text data that needs to be parsed line by line, so byChunk may not be the best approach. Or at least, not the easiest approach.
He can still read a chunk from the file, or the whole file, and then walk that buffer line by line.
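Something along these lines is what I have in mind. This is only a rough sketch: `eachLine` is a name I made up for illustration, it assumes '\n' line endings, and it assumes the buffer holds whole lines (with byChunk you would have to carry a partial last line over to the next chunk yourself).

```d
/// Calls `sink` once per line of `data`, without the trailing '\n'.
/// Each line is a slice of `data`; nothing is copied or decoded.
/// Hypothetical helper, not part of Phobos.
void eachLine(const(char)[] data,
              scope void delegate(const(char)[]) sink)
{
    size_t start = 0;
    foreach (i; 0 .. data.length)
    {
        // Indexing a char array yields a single code unit, no decoding.
        if (data[i] == '\n')
        {
            sink(data[start .. i]);
            start = i + 1;
        }
    }
    // Last line when the buffer does not end with a newline.
    if (start < data.length)
        sink(data[start .. $]);
}

unittest
{
    size_t count;
    eachLine("@read1\nGATTACA\n+\nIIIIIII\n",
             (const(char)[] line) { ++count; });
    assert(count == 4);
}
```

To feed it a whole file, `cast(const(char)[]) std.file.read(path)` gives you the raw contents without any UTF validation.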
I'm just wondering if maybe the reason the D code is slow is not just because of:
- unicode.
- front + popFront.

Ranges in D are "notorious" for being slow to iterate on text, due to the "double decode". If you are *certain* that the file contains nothing but ASCII (which should be the case for fastq, right?), you can get more bang for your buck if you attempt to iterate over it as an array of bytes, and convert the bytes to char on the fly, bypassing any and all unicode processing.
Depending on what you're doing you can blast through the bytes even if it's Unicode. It will of course not validate the Unicode.
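For example, if all you need is to look for an ASCII character, you can go through the raw code units. A minimal sketch, assuming that is the kind of processing being done; `countByte` is a made-up name, while `std.string.representation` is the Phobos function that reinterprets a char array as ubytes without copying or decoding.

```d
import std.string : representation;

/// Counts occurrences of the ASCII character `needle` in `text`,
/// looking at raw code units only: no UTF-8 decoding, no validation.
/// Hypothetical helper for illustration.
size_t countByte(const(char)[] text, char needle)
{
    size_t n = 0;
    foreach (ubyte b; text.representation)
    {
        if (b == needle)
            ++n;
    }
    return n;
}

unittest
{
    assert(countByte("GATTACA", 'A') == 3);
    // Non-ASCII input is not validated; every UTF-8 code unit
    // is simply compared as one byte.
    assert(countByte("é", 'e') == 0);
}
```

This works on valid UTF-8 as well, since bytes below 0x80 never appear inside a multi-byte sequence, but invalid input will pass through silently.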
-- /Jacob Carlborg