On Tue, Oct 20, 2009 at 9:11 AM, [email protected] <
[email protected]> wrote:
> Dear bioc-sig-sequencing,
>
> Can you comment on the following error message I received while using the
> IRanges package.
>
> I tried the following with a gene information file downloaded from UCSC
> Table Browser (C. elegans, chrIII):
>
> The first two lines of the gene information file are:
>
> mte...@system76-pc:~/data09/R_working$ head -n 2 celegans_chrIII.txt
> #name chrom strand txStart txEnd cdsStart cdsEnd exonCount
> exonStarts exonEnds proteinID
> cTel54X.1 chrIII - 1270 2917 1270 2917 4
> 1270,1557,2680,2816, 1507,2167,2764,2917, fbxa-6
>
> genetable<-read.table("celegans_chrIII.txt", header=T, sep="\t")
> > head(genetable, n = 2L)
> cTel54X.1 chrIII X. X1270 X2917 X1270.1 X2917.1 X4
> 1 H10E21.2 chrIII + 8855 11940 8855 11940 6
> 2 H10E21.3a chrIII - 12185 14801 12188 14753 7
> X1270.1557.2680.2816.
> 1 8855,9268,9886,10515,10796,11674,
> 2 12185,12475,12964,13463,13811,14521,14740,
> X1507.2167.2764.2917. fbxa.6
> 1 8957,9428,10072,10641,11105,11940, H10E21.2
> 2 12401,12627,13391,13682,14083,14686,14801, nhr-80
> >
>
> ##############################################################
> I note: the preceding 'read.table' command made the 1st data line in
> 'genetable' into sort of a header? The error in the following code line
> cites a problem in row 1 of the 'genetable' data structure.
> ################################################################
>
>
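Right, the issue is that read.table treats '#' as a comment character by
default. The header line in your file starts with '#name', so it is skipped
and the first data line is used as the header instead. That also means the
table has no txStart or strand column, which is presumably what triggers the
IRanges error below. You need to either change the file (remove the '#') or
pass the argument comment.char="" to work around that. Also, you might check
out the GenomicFeatures package, which has utilities for working with a
data.frame representation of the UCSC genes table. For example, the
'transcripts' function will give you a set of regions, including the
promoters you're trying to generate below.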
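
To illustrate the comment.char point, here is a minimal sketch in base R. It
assumes a cut-down, tab-separated copy of the file (only the first five
columns, written to a temporary file); the names tmp, bad and good are
made up for this example.

```r
# Write a small tab-separated file whose header line starts with '#',
# mimicking the UCSC Table Browser output (values taken from the post).
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "#name\tchrom\tstrand\ttxStart\ttxEnd",
  "cTel54X.1\tchrIII\t-\t1270\t2917",
  "H10E21.2\tchrIII\t+\t8855\t11940"
), tmp)

# Default comment.char = "#": the header line is dropped as a comment,
# so the first data line is promoted to column names.
bad <- read.table(tmp, header = TRUE, sep = "\t")

# comment.char = "" turns comment handling off, so the real header is kept.
good <- read.table(tmp, header = TRUE, sep = "\t", comment.char = "")

nrow(bad)      # one data row has been lost to the header
names(good)    # the leading '#' is altered by check.names; the rest are intact
good$txStart   # the column now exists and is numeric
```

With the header read correctly, genetable$txStart and genetable$strand refer
to real columns, so the start positions passed to IRanges are no longer empty.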
Michael
> >
> promoter<-IRanges(start=genetable$txStart-1000*as.real(genetable$strand=="+"),
> width=1000)
> Error in solveUserSEW0(start = start, end = end, width = width) :
> solving row 1: range cannot be determined from the supplied arguments (too
> many NAs)
>
> > sessionInfo()
> R version 2.10.0 Under development (unstable) (2009-09-11 r49665)
> x86_64-unknown-linux-gnu
>
> locale:
> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
> [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8
> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
> [9] LC_ADDRESS=C LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] ShortRead_1.3.40 lattice_0.17-26 rtracklayer_1.5.23 RCurl_1.2-1
> [5] bitops_1.0-4.1 BSgenome_1.13.16 Biostrings_2.13.52 IRanges_1.3.93
>
> loaded via a namespace (and not attached):
> [1] Biobase_2.5.8 grid_2.10.0 hwriter_1.1 XML_2.6-0
> >
>
> Thanks,
> P. Terry
> [email protected]
>
> _______________________________________________
> Bioc-sig-sequencing mailing list
> [email protected]
> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>