Hi Michael, True; there are no scores in this BED file. That brings us to Sean's remark.
You may have noticed that all features in the BED file are 36 bases long. Each feature is the position and orientation of a Solexa read. I want the value in each bin in the WIG file to represent the number of tags contained in that bin. So, coverage() is close to what I am looking for but I think that pileup() is perhaps better. Then comes the problem of how to place those values in an object exportable as WIG. Lets not talk about efficiency for now. I think that I can solve this problem myself but given that: 1) there is a package that can read BED and write WIG files and 2) BED to WIG conversion is such a common task I thought that there must have been an existing tool or perhaps an established efficient way of doing this with BioC tools. Why is this important? Well, this myRegions.bed contains 4.3 million records. Loading them with import() used about 1GB of RAM. My real world BED files are never less than 50 million reads, 150 more like. Conclusion: I DO care about an expert's opinion of BED-> WIG conversion within R. Maybe nobody does this with R. That is also acceptable. What tool do you use then? Thank you, Ivan Ivan Gregoretti, PhD National Institute of Diabetes and Digestive and Kidney Diseases National Institutes of Health 5 Memorial Dr, Building 5, Room 205. Bethesda, MD 20892. USA. Phone: 1-301-496-1592 Fax: 1-301-496-9878 On Thu, Sep 17, 2009 at 4:45 PM, Sean Davis <[email protected]> wrote: > > > On Thu, Sep 17, 2009 at 4:43 PM, Michael Lawrence <[email protected]> > wrote: >> >> On Thu, Sep 17, 2009 at 8:14 AM, Ivan Gregoretti <[email protected]> >> wrote: >> >> > Hello everybody >> > >> > How do you convert BED formatted files to WIG files? >> > >> > >> > I tried to do that with rtracklayer but it didn't quite succeed. >> > >> > This is the session's transcript: >> > >> > First, you can download >> > http://dl.getdropbox.com/u/2051155/myRegions.bed, which looks like >> > this >> > >> > chr1 3002444 3002479 + >> > chr1 3002989 3003024 - >> > chr1 3017603 3017638 + >> > chr1 3017879 3017914 - >> > chr1 3018173 3018208 + >> > chr1 3018183 3018218 - >> > chr1 3018183 3018218 - >> > chr1 3019065 3019100 + >> > chr1 3019761 3019796 - >> > chr1 3020044 3020079 - >> > ... >> > >> > Now to R >> > >> > suppressMessages(library(rtracklayer)) >> > myRegions <- import('myRegions.bed') >> > >> > So far so good. Now I try: >> > export(myRegions, 'myRegions.wig', format = 'wig') >> > >> > but I get: >> > Error in export.ucsc(object, con, subformat, ...) : >> > Track not compatible with WIG format: Overlapping features must be >> > on separate strands and every feature width must be positive >> > >> > I seems that the error message is a feature rather than a bug. My >> > interpretation is that export() does not like records like lines 6 and >> > 7: >> > chr1 3018183 3018218 - >> > chr1 3018183 3018218 - >> > >> > So, how do you convert BED to WIG in your everyday work? >> > >> > >> Well, as rtracklayer said, this track is not representable by WIG, which >> is >> meant to communicate a single data value for a given genomic region (to >> generate the bar plots in the UCSC browser, for instance). There aren't >> any >> scores in your file, so why do you want to use WIG? >> > > Just guessing, here, but you may want to calculate coverage() first, Ivan? > > Sean > >> >> > Thank you, >> > >> > Ivan >> > >> > > sessionInfo() >> > R version 2.9.2 (2009-08-24) >> > x86_64-redhat-linux-gnu >> > >> > locale: >> > >> > >> > LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C >> > >> > attached base packages: >> > [1] stats graphics grDevices utils datasets methods base >> > >> > other attached packages: >> > [1] rtracklayer_1.4.0 RCurl_0.94-1 >> > >> > loaded via a namespace (and not attached): >> > [1] Biobase_2.4.0 Biostrings_2.12.0 BSgenome_1.12.0 IRanges_1.2.0 >> > [5] tools_2.9.2 XML_2.3-0 >> > >> > Ivan Gregoretti, PhD >> > National Institute of Diabetes and Digestive and Kidney Diseases >> > National Institutes of Health >> > 5 Memorial Dr, Building 5, Room 205. >> > Bethesda, MD 20892. USA. >> > Phone: 1-301-496-1592 >> > Fax: 1-301-496-9878 >> > >> > _______________________________________________ >> > Bioc-sig-sequencing mailing list >> > [email protected] >> > https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing >> > >> >> [[alternative HTML version deleted]] >> >> _______________________________________________ >> Bioc-sig-sequencing mailing list >> [email protected] >> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing > > _______________________________________________ Bioc-sig-sequencing mailing list [email protected] https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
