It's not feasible to download a whole genome's worth of mappability data through rtracklayer and the underlying UCSC Table Browser interface. UCSC enforces limits that truncate the response, and rtracklayer has little way of knowing whether the user is requesting too many records. Instead, download the mappability track as a BigWig file via FTP and query that file with rtracklayer.
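Something along these lines should work. It is only a minimal sketch: the download URL is my assumption about where the file sits on the UCSC download server (please check it before relying on it), and fetch_mappability() and query_mappability() are made-up helper names, not part of rtracklayer.

```r
library(rtracklayer)
library(GenomicRanges)

# ASSUMPTION: location of the 100mer mappability BigWig on the UCSC
# download server. Verify the path before use.
mappability_url <- paste0(
  "http://hgdownload.soe.ucsc.edu/goldenPath/hg19/encodeDCC/",
  "wgEncodeMapability/wgEncodeCrgMapabilityAlign100mer.bigWig"
)

# Hypothetical helper: download the BigWig once and reuse the local copy.
fetch_mappability <- function(url = mappability_url, dest = basename(url)) {
  if (!file.exists(dest)) {
    # mode = "wb" keeps the binary file intact on all platforms
    download.file(url, destfile = dest, mode = "wb")
  }
  dest
}

# Hypothetical helper: read only the records overlapping `which`
# from a local BigWig file, without loading the whole genome.
query_mappability <- function(bw_path, which) {
  import(BigWigFile(bw_path), which = which)
}

# Usage (commented out because the download is large):
# bw <- fetch_mappability()
# query_mappability(bw, GRanges("chr1", IRanges(start = 10013, end = 10021)))
```

Because the BigWig file is indexed, the `which` argument lets import() read just the requested ranges, so queries against the local file are not subject to the Table Browser limits.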
On Wed, Oct 9, 2013 at 9:45 AM, laurent jacob <laurent.ja...@gmail.com> wrote:
> Hi everyone,
>
> I'm trying to use the ucscTableQuery function from the rtracklayer package
> to download a mapability table from the ucsc genome browser.
>
> Everything works fine if I restrict the query to a small range, but I get
> an error message when querying the entire genome (at the moment where I
> convert the UCSCTableQuery using track()):
>
> Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
>   scan() expected 'an integer', got 'section'
>
> Here is a short example:
>
> ---------
> library(rtracklayer)
> mySession = browserSession('UCSC')
> genome(mySession) <- 'hg19'
> range <- GRanges('chr1', IRanges(start=10013, end=10021))
> query.range <- ucscTableQuery(mySession, track='wgEncodeMapability',
>                               range=range,
>                               table='wgEncodeCrgMapabilityAlign100mer')
>
> query.full <- ucscTableQuery(mySession, track='wgEncodeMapability',
>                              range='hg19',
>                              table='wgEncodeCrgMapabilityAlign100mer')
>
> ## This works
> track(query.range)
> ## This fails
> track(query.full)
> -----------
>
> Do you have any idea of what may cause this error?
>
> My sessionInfo() and traceback() of the error are given below.
>
> Best,
>
> Laurent
>
> --------------------------------
> > sessionInfo()
> R version 3.0.2 (2013-09-25)
> Platform: x86_64-pc-linux-gnu (64-bit)
>
> locale:
>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] parallel  stats     graphics  grDevices utils     datasets  methods
> [8] base
>
> other attached packages:
> [1] rtracklayer_1.21.12   GenomicRanges_1.13.51 XVector_0.1.4
> [4] IRanges_1.19.38       BiocGenerics_0.7.5
>
> loaded via a namespace (and not attached):
> [1] Biostrings_2.29.19 bitops_1.0-6       BSgenome_1.29.1    RCurl_1.95-4.1
> [5] Rsamtools_1.13.48  stats4_3.0.2       tools_3.0.2        XML_3.98-1.1
> [9] zlibbioc_1.7.0
> ---------------------------------
>
> ---------------------------------
> > traceback()
> 34: scan(file = file, what = what, sep = sep, quote = quote, dec = dec,
>         nmax = nrows, skip = 0, na.strings = na.strings, quiet = TRUE,
>         fill = fill, strip.white = strip.white, blank.lines.skip = blank.lines.skip,
>         multi.line = FALSE, comment.char = comment.char, allowEscapes = allowEscapes,
>         flush = flush, encoding = encoding)
> 33: read.table(con, colClasses = bedClasses, as.is = TRUE, na.strings = ".",
>         comment.char = "")
> 32: DataFrame(read.table(con, colClasses = bedClasses, as.is = TRUE,
>         na.strings = ".", comment.char = ""))
> 31: .local(con, format, text, ...)
> 30: import(FileForFormat(con, format), ...)
> 29: import(FileForFormat(con, format), ...)
> 28: import(text = lines, format = "bedGraph", genome = genome,
>         asRangedData = asRangedData,
>         which = which, seqinfo = seqinfo)
> 27: import(text = lines, format = "bedGraph", genome = genome,
>         asRangedData = asRangedData,
>         which = which, seqinfo = seqinfo)
> 26: .local(con, format, text, ...)
> 25: import(FileForFormat(con, format), ...)
> 24: import(FileForFormat(con, format), ...)
> 23: import(format = subformat, text = text, asRangedData = asRangedData,
>         genome = genome, ...)
> 22: import(format = subformat, text = text, asRangedData = asRangedData,
>         genome = genome, ...)
> 21: FUN(1L[[1L]], ...)
> 20: lapply(seq_along(trackLines), makeTrackSet)
> 19: lapply(seq_along(trackLines), makeTrackSet)
> 18: .local(con, format, text, ...)
> 17: import(FileForFormat(con, format), ...)
> 16: import(FileForFormat(con, format), ...)
> 15: import(con, "ucsc", ...)
> 14: import(con, "ucsc", ...)
> 13: import.ucsc(resource(con), subformat = subformat, ...)
> 12: import.ucsc(resource(con), subformat = subformat, ...)
> 11: .local(con, ...)
> 10: import.ucsc(initialize(file, resource = con), drop = TRUE, trackLine = FALSE,
>         genome = genome, asRangedData = asRangedData, which = which,
>         seqinfo = seqinfo, ...)
> 9: import.ucsc(initialize(file, resource = con), drop = TRUE, trackLine = FALSE,
>         genome = genome, asRangedData = asRangedData, which = which,
>         seqinfo = seqinfo, ...)
> 8: .local(con, format, text, ...)
> 7: import(FileForFormat(con, format), ...)
> 6: import(FileForFormat(con, format), ...)
> 5: import(text = output, format = format, asRangedData = asRangedData,
>         seqinfo = seqinfo(range(object)))
> 4: import(text = output, format = format, asRangedData = asRangedData,
>         seqinfo = seqinfo(range(object)))
> 3: .local(object, ...)
> 2: track(query.full)
> 1: track(query.full)
> --------------------------------------
>
>
> --
> Laurent Jacob
> Laboratoire de Biométrie et Biologie Évolutive
> CNRS/Université Lyon 1
> http://cbio.ensmp.fr/~ljacob
_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel