Re: How to read fastly files ( I/O operation)

monarch_dodra Wed, 06 Feb 2013 23:30:47 -0800

On Wednesday, 6 February 2013 at 22:55:14 UTC, FG wrote:

On 2013-02-06 21:43, monarch_dodra wrote:

On Wednesday, 6 February 2013 at 19:19:52 UTC, FG wrote:
I have processed the file SRR077487_1.filt.fastq from
ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data/HG00096/sequence_read/
and expect this syntax (no multiline sequences or whitespace).
The file takes up almost 6 GB; processing took 1m45s, twice as fast as the
fastest D solution so far.
Do you mean my solution above? I tried your solution with dmd, with -release -O -inline, and both gave about the same result (69s yours, 67s mine).


Yes. Maybe CPU is the bottleneck on my end.
With DMD32 2.060 on win7-64 compiled with same flags I got:
MD: 4m30 / FG: 1m55s - both using 100% of one core.
Quite similar results with GDC64.

You have timed the same file SRR077487_1.filt.fastq at 67s?

Yes, that file exactly. That said, I'm working on an SSD, so maybe I'm less IO bound than you are?

My attempt was mostly to try and see how fast we could go while doing it only with high level stuff (e.g., no fSomething calls).

Probably, going lower level and parsing the text manually, waiting for magic characters, could yield better results (like what you did).
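
To make "parsing manually, waiting for magic characters" concrete, here is a minimal sketch of the idea. It is not FG's code nor mine from earlier in the thread: the names `FastqStats` and `scanFastq` are made up for illustration, and it assumes strict four-line records ending in '\n' (no multiline sequences, no stray whitespace, no "\r\n").

```d
import std.stdio : writeln;

/// Totals gathered from a FASTQ buffer (hypothetical helper type).
struct FastqStats
{
    size_t records; // number of complete four-line records seen
    size_t bases;   // total length of all sequence lines
}

/// Walks the raw bytes, reacting only to '\n', and tracks which of the
/// four lines of a record it is on. Assumes strict four-line records.
FastqStats scanFastq(const(ubyte)[] data)
{
    FastqStats stats;
    size_t lineNo = 0;    // 0 = header, 1 = sequence, 2 = '+', 3 = quality
    size_t lineStart = 0; // index of the first byte of the current line

    foreach (i, b; data)
    {
        if (b != '\n')
            continue;
        if (lineNo == 1)
            stats.bases += i - lineStart; // length of the sequence line
        if (lineNo == 3)
            ++stats.records;              // quality line closes the record
        lineNo = (lineNo + 1) % 4;
        lineStart = i + 1;
    }
    return stats;
}

void main()
{
    auto sample = cast(const(ubyte)[]) "@r1\nACGT\n+\nIIII\n@r2\nGG\n+\nII\n";
    auto stats = scanFastq(sample);
    writeln(stats.records, " records, ", stats.bases, " bases");
}
```

A final record without a trailing newline is not counted here, and a real reader would also have to carry `lineNo` and `lineStart` across buffer boundaries.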

I'm going to also try playing around with threads: just last week I wrote a program that did exactly this (asynchronous file reads).
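
As a rough idea of what asynchronous reads could look like (this is a sketch under assumptions, not the program mentioned above), one thread can read chunks with `byChunk` and hand them to the processing thread through `std.concurrency`. The names `chunkReader` and `countNewlines`, the demo file name, and the 1 MiB default chunk size are all hypothetical.

```d
import std.concurrency : receive, send, spawn, thisTid, Tid;
import std.path : buildPath;
import std.stdio : File, writeln;
static import std.file;

/// Runs in its own thread: reads the file chunk by chunk and sends
/// immutable copies to the owner, then `true` to signal the end.
void chunkReader(Tid owner, string path, size_t chunkSize)
{
    auto file = File(path, "rb");
    foreach (ubyte[] chunk; file.byChunk(chunkSize))
        send(owner, chunk.idup); // byChunk reuses its buffer, so copy
    send(owner, true);
}

/// Processes chunks while the reader thread keeps reading ahead.
/// Counting '\n' stands in for the real parsing work.
size_t countNewlines(string path, size_t chunkSize = 1 << 20)
{
    spawn(&chunkReader, thisTid, path, chunkSize);
    size_t newlines;
    bool done;
    while (!done)
    {
        receive(
            (immutable(ubyte)[] chunk) {
                foreach (b; chunk)
                    if (b == '\n')
                        ++newlines;
            },
            (bool finished) { done = finished; }
        );
    }
    return newlines;
}

void main()
{
    // Small throwaway file so the sketch can be run as-is.
    auto path = buildPath(std.file.tempDir(), "async_read_demo.fastq");
    std.file.write(path, "@r1\nACGT\n+\nIIII\n");
    scope (exit) std.file.remove(path);
    writeln(countNewlines(path, 4), " newlines");
}
```

Two caveats: each chunk is copied once with `idup`, and the message queue is unbounded by default, so a reader that is much faster than the parser can pile up memory. If the reader thread throws (e.g. the file cannot be opened), this version waits forever, so real code would need error handling.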

That said, I'll be making this priority n°2. I'd like to make the parser work perfectly first, and in a way that is easily upgradeable/usable. Mr. bio made it perfectly clear that he needed support for whitespace and line feeds ;)
