On Tue, Oct 20, 2009 at 10:55 AM, Michael Lawrence <[email protected]> wrote:
>
> On Tue, Oct 20, 2009 at 9:11 AM, [email protected] <[email protected]> wrote:
>
>> Dear bioc-sig-sequencing,
>>
>> Can you comment on the following error message I received while using the
>> IRanges package.
>>
>> I tried the following with a gene information file downloaded from UCSC
>> Table Browser (C. elegans, chrIII):
>>
>> The first two lines of the gene information file are:
>>
>> mte...@system76-pc:~/data09/R_working$ head -n 2 celegans_chrIII.txt
>> #name chrom strand txStart txEnd cdsStart cdsEnd exonCount
>> exonStarts exonEnds proteinID
>> cTel54X.1 chrIII - 1270 2917 1270 2917 4
>> 1270,1557,2680,2816, 1507,2167,2764,2917, fbxa-6
>>
>> genetable<-read.table("celegans_chrIII.txt", header=T, sep="\t")
>> > head(genetable, n = 2L)
>> cTel54X.1 chrIII X. X1270 X2917 X1270.1 X2917.1 X4
>> 1 H10E21.2 chrIII + 8855 11940 8855 11940 6
>> 2 H10E21.3a chrIII - 12185 14801 12188 14753 7
>> X1270.1557.2680.2816.
>> 1 8855,9268,9886,10515,10796,11674,
>> 2 12185,12475,12964,13463,13811,14521,14740,
>> X1507.2167.2764.2917. fbxa.6
>> 1 8957,9428,10072,10641,11105,11940, H10E21.2
>> 2 12401,12627,13391,13682,14083,14686,14801, nhr-80
>> >
>>
>> ##############################################################
>> I note: the preceding 'read.table' command made the 1st data line in
>> 'genetable' into sort of a header? The error in the following code line
>> cites a problem in row 1 of the 'genetable' data structure.
>> ################################################################
>>
>>
> Right, the issue is that read.table treats '#' as a comment. So you need to
> either change the file (remove the #) or specify the argument
> comment.char="" to workaround that. Also, you might check out the
> GenomicFeatures package, which has utilities for working with a data.frame
> representation of the UCSC genes table. For example, the 'transcripts'
> function will give you a set of regions, including the promoters you're
> trying to generate below.
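
To make the comment.char point concrete, here is a minimal sketch in base R. It assumes a small stand-in file written to a temporary path rather than the real celegans_chrIII.txt; the shortened column set, the names `path` and `bad`, and the rename of the first column are illustrative choices, not part of the original post.

```r
# Stand-in for the UCSC table: tab-separated, header line starts with '#'.
# (Illustrative subset of the columns shown in the post.)
path <- tempfile(fileext = ".txt")
writeLines(c(
  "#name\tchrom\tstrand\ttxStart\ttxEnd",
  "cTel54X.1\tchrIII\t-\t1270\t2917",
  "H10E21.2\tchrIII\t+\t8855\t11940"
), path)

# Default comment.char = "#": the '#name ...' line is skipped as a comment,
# so with header = TRUE the first data row is consumed as the header.
bad <- read.table(path, header = TRUE, sep = "\t")

# comment.char = "" switches comment handling off, so the '#name ...' line
# is read as the header and both data rows survive.
genetable <- read.table(path, header = TRUE, sep = "\t",
                        comment.char = "", stringsAsFactors = FALSE)

# check.names mangles the leading '#', so give the first column a clean name.
names(genetable)[1] <- "name"

names(bad)        # header built from the first gene's values
names(genetable)  # name, chrom, strand, txStart, txEnd
```

With the default settings the table has one row fewer than the file has genes, and columns such as txStart do not exist under those names, which matches the head(genetable) output quoted above.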
>
>
I also thought I would mention that you could use rtracklayer to do this and
avoid these details. Something like:

session <- browserSession()
genome(session) <- "ce6"
query <- ucscTableQuery(session, "sangerGene")
tab <- getTable(query)

Michael

>
>
>> >
>> promoter<-IRanges(start=genetable$txStart-1000*as.real(genetable$strand=="+"),
>> width=1000)
>> Error in solveUserSEW0(start = start, end = end, width = width) :
>> solving row 1: range cannot be determined from the supplied arguments
>> (too many NAs)
>>
>> > sessionInfo()
>> R version 2.10.0 Under development (unstable) (2009-09-11 r49665)
>> x86_64-unknown-linux-gnu
>>
>> locale:
>> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
>> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
>> [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8
>> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
>> [9] LC_ADDRESS=C LC_TELEPHONE=C
>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] stats graphics grDevices utils datasets methods base
>>
>> other attached packages:
>> [1] ShortRead_1.3.40 lattice_0.17-26 rtracklayer_1.5.23 RCurl_1.2-1
>> [5] bitops_1.0-4.1 BSgenome_1.13.16 Biostrings_2.13.52
>> IRanges_1.3.93
>>
>> loaded via a namespace (and not attached):
>> [1] Biobase_2.5.8 grid_2.10.0 hwriter_1.1 XML_2.6-0
>> >
>>
>> Thanks,
>> P. Terry
>> [email protected]
>>
>> _______________________________________________
>> Bioc-sig-sequencing mailing list
>> [email protected]
>> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>>
>
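
For anyone wondering how the mangled header leads to the IRanges error quoted above, the following base-R sketch reproduces the zero-length `start` vector without needing IRanges itself. The toy data frame `bad` and the helper `has_cols` are hypothetical, and as.numeric() stands in for the as.real() call in the post. That IRanges then fails because it has no start to pair with width=1000 is my reading of the error message, not something the thread confirms.

```r
# Toy version of the mis-parsed table: column names come from the first
# gene's values, so there is no 'txStart' or 'strand' column.
bad <- data.frame(X1270 = c(8855, 12185),
                  X.    = c("+", "-"),
                  stringsAsFactors = FALSE)

# '$' on a missing column returns NULL rather than raising an error.
is.null(bad$txStart)

# The start expression from the post therefore collapses to length zero.
start <- bad$txStart - 1000 * as.numeric(bad$strand == "+")
length(start)

# Hypothetical guard: check the expected columns before building ranges.
has_cols <- function(df, cols) all(cols %in% names(df))
has_cols(bad, c("txStart", "strand"))
```

A check like has_cols() before the IRanges() call would turn the somewhat cryptic "too many NAs" message into an immediate hint that the header was not parsed as expected.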
