Re-directed from R-devel, where I guess it went by accident.


On 11/18/2014 09:00 AM, Cook, Malcolm wrote:
Hi,

I understand ShortRead::FastqStreamer will read chunks in parallel depending
on the value of ShortRead:::.set_omp_threads

I see this discussed here:
https://stat.ethz.ch/pipermail/bioc-devel/2013-May/004355.html and nowhere
else.

It probably should be documented in ShortRead.

yes, it's now documented on the FastqStreamer / Sampler and trim* pages.


Possibly this has already changed, as I am still using R 3.1.0. I thought
I'd check.

Oh, and, in my hands/hardware, the value of this FastqStreamer's use of
srapply's parallelization is negligible, at least if the consumer of
successive yields is in the main process.  I see that the new bpiterate
appears to take advantage of yielding in forked processes, which sounds
promising.  Is that the idea?

Yes, individual instances of FastqStreamer (and Sampler) don't benefit from R-level parallel evaluation; both are 'readers' that iterate sequentially through the entire file. If you were streaming or sampling from several files (as when creating a qa report, where FastqSampler is used 'under the hood'), srapply (or nowadays just BiocParallel::bplapply) would distribute the streaming / sampling of each file to a separate process. This would be an effective way of managing memory while performing parallel evaluation.
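
A minimal sketch of the per-file approach, assuming a directory of FASTQ files and the default registered BiocParallel back-end. The helper name count_reads, the directory "fastq" and the chunk size are hypothetical and only for illustration; each task streams one file sequentially, so memory use per worker is bounded by the chunk size.

```r
library(ShortRead)
library(BiocParallel)

# Count the records in one FASTQ file, keeping at most `chunk_size`
# records in memory at a time. (`count_reads` is a hypothetical helper.)
count_reads <- function(path, chunk_size = 1e6) {
    strm <- ShortRead::FastqStreamer(path, n = chunk_size)
    on.exit(close(strm))
    total <- 0
    repeat {
        fq <- ShortRead::yield(strm)
        if (length(fq) == 0L) break   # nothing left to read
        total <- total + length(fq)
    }
    total
}

# Hypothetical directory of FASTQ files; an empty result is harmless here.
fastq_files <- dir("fastq", pattern = "\\.fastq(\\.gz)?$", full.names = TRUE)

# One file per task; each task streams its own file sequentially.
counts <- bplapply(fastq_files, count_reads)
```

Passing BPPARAM = MulticoreParam(workers = 4), for example, would pick the back-end explicitly instead of relying on the registered default.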

bpiterate could be used effectively with FastqStreamer, if the operation done with the chunk of the file were somehow expensive; when processing several files it is probably more scalable to parallelize over files, using FastqStreamer to manage memory.
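
For the single-file case, something along these lines might work; it is a sketch, not a tested recipe. The wrapper name stream_apply and its arguments are hypothetical. The iterator runs in the main process and hands successive chunks to bpiterate, which applies FUN to each chunk on the chosen back-end (back-end support for bpiterate may depend on the BiocParallel version).

```r
library(ShortRead)
library(BiocParallel)

# Apply FUN to successive chunks of one FASTQ file via bpiterate().
# (`stream_apply` is a hypothetical wrapper, not part of either package.)
stream_apply <- function(path, FUN, chunk_size = 1e5,
                         BPPARAM = BiocParallel::bpparam()) {
    strm <- ShortRead::FastqStreamer(path, n = chunk_size)
    on.exit(close(strm))

    # bpiterate() calls ITER repeatedly; returning NULL signals the end.
    ITER <- function() {
        fq <- ShortRead::yield(strm)
        if (length(fq) == 0L) NULL else fq
    }

    # Returns a list with one element per chunk.
    BiocParallel::bpiterate(ITER, FUN, BPPARAM = BPPARAM)
}
```

A call such as stream_apply("reads.fastq.gz", summarize_chunk), with a hypothetical file name and a hypothetical expensive per-chunk function summarize_chunk, only pays off when the per-chunk work outweighs the cost of shipping each chunk to a worker.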

Martin


Looking forward....

Malcolm Cook




--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel
