Hi Stuart,

If you are able to rsync the Mapability bigWig file from the UCSC downloads server and convert it to BED using their compiled tools (also available on the same server), then the rest should be fairly straightforward.

1 - Load the data into Galaxy using FTP: http://wiki.g2.bx.psu.edu/FTPUpload

2 - Merge the fragmented intervals into ranges that better suit your needs with Galaxy tools in the group "Operate on Genomic Intervals", in particular see the "Merge" and "Cluster" tools.
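
If the converted file turns out to be too large to upload comfortably, the filter-and-merge step could also be done locally first. The following is only a minimal Python sketch under some assumptions: the converted file is four-column bedGraph-style text (chrom, start, end, score) sorted by chromosome and start, and the 0.5 cutoff is the one mentioned in the question below. The function names and the `max_gap` parameter are made up for illustration and are not part of any Galaxy or UCSC tool.

```python
from typing import Iterable, Iterator, Tuple

Interval = Tuple[str, int, int]


def low_mappability_intervals(
    lines: Iterable[str], threshold: float = 0.5, max_gap: int = 0
) -> Iterator[Interval]:
    """Yield merged (chrom, start, end) intervals whose score is below threshold.

    Assumes the input is sorted by chromosome and start position.
    max_gap (hypothetical parameter) is the largest distance in bases
    between two low-score intervals that still get merged; 0 merges only
    intervals that touch or overlap.
    """
    current = None
    for line in lines:
        line = line.strip()
        # Skip blank lines and optional header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start_s, end_s, score_s = line.split()[:4]
        start, end = int(start_s), int(end_s)
        if float(score_s) >= threshold:
            continue  # keep only low-mappability positions
        if current and current[0] == chrom and start - current[2] <= max_gap:
            # Extend the interval that is currently open.
            current = (chrom, current[1], max(current[2], end))
        else:
            if current:
                yield current
            current = (chrom, start, end)
    if current:
        yield current


def write_low_mappability_bed(in_path: str, out_path: str, **kwargs) -> None:
    """Stream in_path line by line and write merged intervals as 3-column BED."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for chrom, start, end in low_mappability_intervals(src, **kwargs):
            dst.write(f"{chrom}\t{start}\t{end}\n")
```

Because the file is streamed one line at a time, memory use stays small even for a multi-GB input. A call such as `write_low_mappability_bed("mm9_50mer.bedGraph", "mm9_low_map.bed")` (both file names are placeholders) should produce a much smaller BED file to load in step 1. Setting `max_gap` above 0 also bridges short stretches of higher mappability, which changes the result, so use it only if that smoothing is what you want.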

This data is large, but the only way to determine if it is too large to run on the public main instance is to try. If you end up with a memory error, then moving to a local or cloud instance would be the recommendation. Full instructions are here: http://usegalaxy.org

Hopefully this simplifies the process for you!

Best,

Jen
Galaxy team

On 5/1/12 9:02 AM, Brown, Stuart wrote:

I want to make an intersection between a few hundreds of genomic intervals 
(predicted translocation sites from SVDetect) and low mappability regions in 
genomes (we are working with mm9 right now).

UCSC has an excellent mappability track that exactly matches our sequencing 
data (50 bp kmers), but it seems very difficult to get that data into Galaxy. I 
want a BED format that summarizes intervals of low mappability (i.e., less than 
0.5 on the scale used by UCSC). The UCSC Table Browser has a limit of 10M 
lines, which seems to give just part of chromosome 1. It will be very messy to 
try to get the whole genome bit by bit using this method and then stitch it 
back together using some sort of concatenation.

UCSC Help suggests downloading the mappability data for the whole genome as a 
bigwig formatted file, then convert to BED. I gave this a try, but we get a 4 
GB file, with intervals of just one or two base pairs. Again, lots of work to 
get back to the nicer BED that I could make with the UCSC tools over smaller 
genomic regions. Also, super-painful to upload this huge file to Galaxy, and 
unhappy trying to write my own parsers to filter and smooth this file.

Any other suggestions? Maybe someone else knows where to find a mappability 
file (for mm9) that has nice intervals in a Galaxy compatible format.

—Stuart Brown



___________________________________________________________
The Galaxy User list should be used for the discussion of
Galaxy analysis and other features on the public server
at usegalaxy.org.  Please keep all replies on the list by
using "reply all" in your mail client.  For discussion of
local Galaxy instances and the Galaxy source code, please
use the Galaxy Development list:

   http://lists.bx.psu.edu/listinfo/galaxy-dev

To manage your subscriptions to this and other Galaxy lists,
please use the interface at:

   http://lists.bx.psu.edu/


--
Jennifer Jackson
http://galaxyproject.org