You can try a FASTA version of the file to measure the performance gain:
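    import java.io.File;
    import java.util.LinkedHashMap;
    import org.biojava3.core.sequence.DNASequence; // BioJava 3.x packages
    import org.biojava3.core.sequence.io.FastaReaderHelper;

    File file = new File("filename");
    boolean lazySequenceLoad = true; // index IDs now, read sequence data on demand
    LinkedHashMap<String, DNASequence> sequences =
        FastaReaderHelper.readFastaDNASequence(file, lazySequenceLoad);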
This goes through the file and indexes the accession IDs without loading
any sequence data, so no sequence data is held in memory and the initial
pass is fast. You can then reference a DNASequence by name, and when you
need its sequence data the file index is used to load that specific
sequence from the file. The same approach can be applied to FASTQ files.
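
To make the idea concrete, here is a minimal sketch of lazy indexing using only the JDK. It is not BioJava's implementation; the class name LazyFastaIndex and its methods are hypothetical and exist only for illustration. The constructor scans the file once and records the byte offset where each record's sequence starts, and sequence(id) seeks to that offset and reads just that record.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical illustration of lazy FASTA loading; NOT the BioJava implementation. */
class LazyFastaIndex {
    // accession id -> byte offset of the first sequence line of that record
    private final Map<String, Long> offsets = new LinkedHashMap<>();
    private final Path file;

    /** Scans the file once, keeping only ids and offsets (no sequence data). */
    LazyFastaIndex(Path file) throws IOException {
        this.file = file;
        try (InputStream in = new BufferedInputStream(Files.newInputStream(file))) {
            long pos = 0;             // number of bytes consumed so far
            boolean lineStart = true; // true when the next byte begins a line
            int b;
            while ((b = in.read()) != -1) {
                pos++;
                if (lineStart && b == '>') {
                    StringBuilder header = new StringBuilder();
                    while ((b = in.read()) != -1 && b != '\n') {
                        pos++;
                        header.append((char) b);
                    }
                    if (b == '\n') {
                        pos++;
                    }
                    // Assumption: the id is the first whitespace-delimited token.
                    String id = header.toString().trim().split("\\s+")[0];
                    offsets.put(id, pos);
                    lineStart = true;
                } else {
                    lineStart = (b == '\n');
                }
            }
        }
    }

    /** Ids in file order. */
    Set<String> ids() {
        return offsets.keySet();
    }

    /** Reads one record's sequence from disk, only when asked for. */
    String sequence(String id) throws IOException {
        Long start = offsets.get(id);
        if (start == null) {
            throw new IllegalArgumentException("unknown id: " + id);
        }
        StringBuilder seq = new StringBuilder();
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            raf.seek(start);
            String line;
            // Stop at the next header line or at end of file.
            while ((line = raf.readLine()) != null && !line.startsWith(">")) {
                seq.append(line.trim());
            }
        }
        return seq.toString();
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("demo", ".fasta");
        Files.write(tmp, ">seq1 first\nACGT\nTTAA\n>seq2\nGGCC\n"
                .getBytes(StandardCharsets.US_ASCII));
        LazyFastaIndex index = new LazyFastaIndex(tmp);
        System.out.println(index.ids());            // ids only, nothing loaded yet
        System.out.println(index.sequence("seq2")); // loaded on demand
        Files.delete(tmp);
    }
}
```

The trade-off is one cheap pass up front plus a seek per lookup, which pays off when you only need a small subset of the sequences in a huge file.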
Scooter
On 1/24/12 3:37 AM, "Mic" <[email protected]> wrote:
>Hello,
>I have found the following benchmark
>(http://biostar.stackexchange.com/questions/10376/how-to-efficiently-parse-a-huge-fastq-file/11279#11279)
>and I just wonder whether it is possible to make the Java example even faster?
>
>Thank you in advance.
>_______________________________________________
>Biojava-l mailing list - [email protected]
>http://lists.open-bio.org/mailman/listinfo/biojava-l