(I recently had the same problem downloading dbSNP)

It would be an improvement if the parsing of the downloaded data were wrapped
in a try() statement, with a good error message about UCSC possibly truncating
the response.  Also, perhaps mention truncation in the vignette (or make it
more visible, if it is already there).
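
Something along these lines is what I have in mind. This is only a rough
sketch in base R: the helper name parse_ucsc_response and its column layout
are made up for illustration and are not part of rtracklayer, which would
need to do the equivalent inside its own import code.

```r
# Hypothetical helper (not part of rtracklayer): parse bedGraph-like,
# tab-separated text and replace the low-level scan() failure with a
# message that points at possible truncation by UCSC.
parse_ucsc_response <- function(lines) {
  tryCatch(
    read.table(text = lines,
               colClasses = c("character", "integer", "integer", "numeric"),
               col.names = c("chrom", "start", "end", "score"),
               comment.char = ""),
    error = function(e) {
      stop("could not parse the UCSC table browser response; ",
           "UCSC may have truncated it because the query was too large. ",
           "Original error: ", conditionMessage(e),
           call. = FALSE)
    }
  )
}
```

The original scan() message is kept at the end of the new one, so nothing is
lost for debugging.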

I certainly expected to be able to download (big) tables from UCSC. Perhaps
I was naive, but that was my expectation.

Best,
Kasper


On Wed, Oct 9, 2013 at 1:09 PM, Michael Lawrence
<lawrence.mich...@gene.com> wrote:

> It's not feasible to download an entire genome's worth of mappability data
> using rtracklayer and the underlying table browser interface. UCSC has
> limits in place that truncate the response. rtracklayer has little way of
> knowing whether the user is requesting too many records. Just download the
> mappability as a bigwig file via FTP and query that with rtracklayer,
> instead.
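
For the bigWig route described above, a rough sketch could look like the
following. It assumes the mappability bigWig has already been downloaded to a
local file (the file name below is a placeholder) and that rtracklayer and
GenomicRanges are installed; read_mappability is a made-up helper name.

```r
# Hypothetical helper: read only the requested region from a local bigWig
# file instead of going through the table browser.
read_mappability <- function(bw_path, region) {
  rtracklayer::import(rtracklayer::BigWigFile(bw_path), which = region)
}

# Placeholder path: point this at wherever the downloaded file lives.
bw_path <- "wgEncodeCrgMapabilityAlign100mer.bigWig"

# Only run the query if the file is actually there.
if (file.exists(bw_path)) {
  region <- GenomicRanges::GRanges(
    "chr1", IRanges::IRanges(start = 10013, end = 10021))
  print(read_mappability(bw_path, region))
}
```

Because the file is indexed, only the requested region is read, so querying
many regions does not require loading the whole genome into memory.
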
>
>
> On Wed, Oct 9, 2013 at 9:45 AM, laurent jacob <laurent.ja...@gmail.com> wrote:
>
> > Hi everyone,
> >
> > I'm trying to use the ucscTableQuery function from the rtracklayer package
> > to download a mappability table from the UCSC genome browser.
> >
> > Everything works fine if I restrict the query to a small range, but I get
> > an error message when querying the entire genome (at the moment where I
> > convert the UCSCTableQuery using track()):
> >
> > Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
> >   scan() expected 'an integer', got 'section'
> >
> > Here is a short example:
> >
> > ---------
> > library(rtracklayer)
> > mySession = browserSession('UCSC')
> > genome(mySession) <- 'hg19'
> > range <- GRanges('chr1', IRanges(start=10013, end=10021))
> > query.range <- ucscTableQuery(mySession, track='wgEncodeMapability',
> >                               range=range,
> >                               table='wgEncodeCrgMapabilityAlign100mer')
> >
> > query.full <- ucscTableQuery(mySession, track='wgEncodeMapability',
> >                              range='hg19',
> >                              table='wgEncodeCrgMapabilityAlign100mer')
> >
> > ## This works
> > track(query.range)
> > ## This fails
> > track(query.full)
> > -----------
> >
> > Do you have any idea of what may cause this error?
> >
> > My sessionInfo() and traceback() of the error are given below.
> >
> > Best,
> >
> > Laurent
> >
> > --------------------------------
> > > sessionInfo()
> > R version 3.0.2 (2013-09-25)
> > Platform: x86_64-pc-linux-gnu (64-bit)
> >
> > locale:
> >  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
> >  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
> >  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
> >  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
> >  [9] LC_ADDRESS=C               LC_TELEPHONE=C
> > [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
> >
> > attached base packages:
> > [1] parallel  stats     graphics  grDevices utils     datasets  methods
> > [8] base
> >
> > other attached packages:
> > [1] rtracklayer_1.21.12   GenomicRanges_1.13.51 XVector_0.1.4
> > [4] IRanges_1.19.38       BiocGenerics_0.7.5
> >
> > loaded via a namespace (and not attached):
> > [1] Biostrings_2.29.19 bitops_1.0-6       BSgenome_1.29.1    RCurl_1.95-4.1
> > [5] Rsamtools_1.13.48  stats4_3.0.2       tools_3.0.2        XML_3.98-1.1
> > [9] zlibbioc_1.7.0
> > ---------------------------------
> >
> > ---------------------------------
> > > traceback()
> > 34: scan(file = file, what = what, sep = sep, quote = quote, dec = dec,
> >         nmax = nrows, skip = 0, na.strings = na.strings, quiet = TRUE,
> >         fill = fill, strip.white = strip.white,
> >         blank.lines.skip = blank.lines.skip, multi.line = FALSE,
> >         comment.char = comment.char, allowEscapes = allowEscapes,
> >         flush = flush, encoding = encoding)
> > 33: read.table(con, colClasses = bedClasses, as.is = TRUE, na.strings = ".",
> >         comment.char = "")
> > 32: DataFrame(read.table(con, colClasses = bedClasses, as.is = TRUE,
> >         na.strings = ".", comment.char = ""))
> > 31: .local(con, format, text, ...)
> > 30: import(FileForFormat(con, format), ...)
> > 29: import(FileForFormat(con, format), ...)
> > 28: import(text = lines, format = "bedGraph", genome = genome,
> >         asRangedData = asRangedData, which = which, seqinfo = seqinfo)
> > 27: import(text = lines, format = "bedGraph", genome = genome,
> >         asRangedData = asRangedData, which = which, seqinfo = seqinfo)
> > 26: .local(con, format, text, ...)
> > 25: import(FileForFormat(con, format), ...)
> > 24: import(FileForFormat(con, format), ...)
> > 23: import(format = subformat, text = text, asRangedData = asRangedData,
> >         genome = genome, ...)
> > 22: import(format = subformat, text = text, asRangedData = asRangedData,
> >         genome = genome, ...)
> > 21: FUN(1L[[1L]], ...)
> > 20: lapply(seq_along(trackLines), makeTrackSet)
> > 19: lapply(seq_along(trackLines), makeTrackSet)
> > 18: .local(con, format, text, ...)
> > 17: import(FileForFormat(con, format), ...)
> > 16: import(FileForFormat(con, format), ...)
> > 15: import(con, "ucsc", ...)
> > 14: import(con, "ucsc", ...)
> > 13: import.ucsc(resource(con), subformat = subformat, ...)
> > 12: import.ucsc(resource(con), subformat = subformat, ...)
> > 11: .local(con, ...)
> > 10: import.ucsc(initialize(file, resource = con), drop = TRUE,
> >         trackLine = FALSE, genome = genome, asRangedData = asRangedData,
> >         which = which, seqinfo = seqinfo, ...)
> > 9: import.ucsc(initialize(file, resource = con), drop = TRUE,
> >        trackLine = FALSE, genome = genome, asRangedData = asRangedData,
> >        which = which, seqinfo = seqinfo, ...)
> > 8: .local(con, format, text, ...)
> > 7: import(FileForFormat(con, format), ...)
> > 6: import(FileForFormat(con, format), ...)
> > 5: import(text = output, format = format, asRangedData = asRangedData,
> >        seqinfo = seqinfo(range(object)))
> > 4: import(text = output, format = format, asRangedData = asRangedData,
> >        seqinfo = seqinfo(range(object)))
> > 3: .local(object, ...)
> > 2: track(query.full)
> > 1: track(query.full)
> > --------------------------------------
> >
> >
> > --
> > Laurent Jacob
> > Laboratoire de Biométrie et Biologie Évolutive
> > CNRS/Université Lyon 1
> > http://cbio.ensmp.fr/~ljacob
> >
>
> _______________________________________________
> Bioc-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>
>
