Re-directed from R-devel, where I guess it went by accident.
On 11/18/2014 09:00 AM, Cook, Malcolm wrote:
> Hi,
>
> I understand ShortRead::FastqStreamer will read chunks in parallel depending on the value of ShortRead:::.set_omp_threads. I see this discussed here: https://stat.ethz.ch/pipermail/bioc-devel/2013-May/004355.html and nowhere else. It probably should be documented in ShortRead.
Yes, it's now documented on the FastqStreamer / Sampler and trim* pages.
> Possibly this has already changed, since I am still using R 3.1.0. I thought I'd check.
>
> Oh, and, in my hands/hardware, the benefit of FastqStreamer's use of srapply's parallelization is negligible, at least if the consumer of successive yields is in the main process. I see that the new bpiterate appears to take advantage of yielding in forked processes, which sounds promising. Is that the idea?
Yes, individual instances of FastqStreamer (and Sampler) don't benefit from R-level parallel evaluation; both are 'readers' that iterate sequentially through the entire file. If you were streaming or sampling from several files (as when creating a qa report, where FastqSampler is used 'under the hood'), then srapply (or nowadays just BiocParallel::bplapply) would distribute the streaming / sampling of each file to a separate process. This would be an effective way of managing memory while performing parallel evaluation.
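A minimal sketch of the per-file approach, to make the idea concrete. The helper name count_reads, the chunk size, the worker count, and the directory name in the usage comment are all made up for illustration; only FastqStreamer, yield, close, bplapply and MulticoreParam come from ShortRead / BiocParallel.

```r
library(ShortRead)
library(BiocParallel)

# Hypothetical helper: count the reads in one FASTQ file, holding at most
# 'chunk_size' records in memory at a time.
count_reads <- function(path, chunk_size = 1e6) {
    streamer <- ShortRead::FastqStreamer(path, n = chunk_size)
    on.exit(close(streamer))
    total <- 0
    repeat {
        chunk <- ShortRead::yield(streamer)   # next chunk of records
        if (length(chunk) == 0L) break        # zero-length chunk: end of file
        total <- total + length(chunk)
    }
    total
}

# Usage (directory name and worker count are assumptions):
# fastq_files <- dir("fastq_dir", pattern = "fastq", full.names = TRUE)
# counts <- bplapply(fastq_files, count_reads,
#                    BPPARAM = MulticoreParam(workers = 4))
```

Each worker streams through its own file sequentially, so memory use per worker is bounded by the chunk size while the files themselves are processed in parallel.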
bpiterate could be used effectively with FastqStreamer if the operation applied to each chunk of the file were expensive; when processing several files it is probably more scalable to parallelize over files, using FastqStreamer to manage memory.
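For the single-file case, a rough sketch of how bpiterate and FastqStreamer might be combined. The function name summarize_chunks, the base-frequency summary standing in for an "expensive" operation, the chunk size and the worker count are illustrative assumptions, not something taken from the packages.

```r
library(ShortRead)
library(BiocParallel)

# Hypothetical helper: read chunks sequentially in the main process and
# hand each chunk to a worker for processing.
summarize_chunks <- function(path, chunk_size = 1e6, workers = 2L) {
    streamer <- FastqStreamer(path, n = chunk_size)
    on.exit(close(streamer))

    # ITER: called repeatedly by bpiterate; returns NULL when the file
    # is exhausted.
    next_chunk <- function() {
        chunk <- yield(streamer)
        if (length(chunk) == 0L) NULL else chunk
    }

    # FUN: evaluated on a worker; a stand-in for some expensive operation.
    base_counts <- function(chunk) {
        Biostrings::alphabetFrequency(ShortRead::sread(chunk),
                                      baseOnly = TRUE, collapse = TRUE)
    }

    # Returns a list with one element per chunk.
    bpiterate(next_chunk, base_counts,
              BPPARAM = MulticoreParam(workers = workers))
}
```

The reading itself stays sequential here; only the per-chunk work is spread across workers, so this pays off only when that work outweighs the cost of shipping each chunk to a worker.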
Martin
> Looking forward....
> Malcolm Cook
--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel