FG wrote:
> On 2013-02-04 15:04, bioinfornatics wrote:
>> I am looking to parse a huge file efficiently, but I think D is lacking for this
>> purpose. To parse 12 GB I need 11 minutes, whereas fastx-toolkit (written in C++)
>> needs 2 min.
>>
>> My code is maybe not easy to follow, as parsing a FASTQ file is not easy, and it is
>> even harder when using a memory-mapped file.
>
> Why are you using mmap? Don't you just go through the file sequentially?
> In that case it should be faster to read in chunks:
>
> foreach (ubyte[] buffer; file.byChunk(chunkSize)) { ... }
I would go even further, and organise the file so that N Data objects fit in one page, then read the file page by page. The page size can easily be obtained from the system. IMHO that would beat fastx-toolkit. :)

--
Dejan Lekic
dejan.lekic (a) gmail.com
http://dejan.lekic.org