to chime into this discussion.
I found some inconsistencies during my rsync endeavor, and I'm curious whether
there is any way to contribute to that service.
xenTro3 xenTro3 Frog (Xenopus tropicalis):
ce6 /data/0/ref_genomes/ce6/ce6.2bit is missing from twobit.loc
ce6 has no .fa file under seq/ but in allfasta.loc there is a reference
to it ce6 Caenorhabditis elegans: ce6 /galaxy/data/ce6/seq/ce6.fa
TAIR9 and TAIR10 are not available via rsync
Bowtie2 indices are missing for ce6, xenTro3
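
Checks like the ones above could probably be scripted against a local copy of the rsync data. What follows is only a rough sketch in Python, not anything Galaxy ships: the function names are made up for illustration, and it assumes that a .loc file is tab-separated, that lines starting with "#" are comments, and that the file path (or index prefix) is the last column. Paths are tested exactly as written in the .loc file, so entries pointing at server-side locations such as /galaxy/data/... would first need remapping to the local mirror.

```python
import glob
import os


def loc_paths(loc_file):
    """Yield (line_number, path) for each data line of a tab-separated .loc file."""
    with open(loc_file) as handle:
        for number, line in enumerate(handle, start=1):
            line = line.rstrip("\n")
            if not line.strip() or line.lstrip().startswith("#"):
                continue  # skip blank lines and comments
            # Assumption: the path (or index prefix) is the last column.
            yield number, line.split("\t")[-1].strip()


def missing_on_disk(loc_file):
    """Return (line_number, path) for .loc entries with nothing behind them on disk."""
    missing = []
    for number, path in loc_paths(loc_file):
        # Accept an exact file, or an index prefix such as "ce6" matching "ce6.1.bt2".
        if not (os.path.exists(path) or glob.glob(glob.escape(path) + ".*")):
            missing.append((number, path))
    return missing


def missing_from_loc(loc_file, files):
    """Return the given files that no entry in the .loc file refers to."""
    listed = {path for _, path in loc_paths(loc_file)}
    return [name for name in files if name not in listed]
```

With something like this, missing_on_disk("allfasta.loc") would flag an entry such as the ce6 .fa reference, and missing_from_loc("twobit.loc", [...]) would flag a .2bit file that exists but is not listed.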
> Hi Jennifer,
> Today I was trying to pull some bowtie2 indices from Galaxy rsync server for
> PhiX to run some tests and just got the ones for bowtie1… I'm wondering
> what's the state in regards to this past thread and what we can do to help in
> On 7 Mar 2013, at 20:01, Jennifer Jackson <j...@bx.psu.edu> wrote:
> > Hi Brad (and Roman),
> > The team has talked about this in detail. There are a few wrinkles with
> > just pulling in indexes - Dan is doing some work that could change this
> > later on, but for now, the rsync will continue to point to the same
> > location as Main's genome data source. This means that there are some
> > limits on what we can do immediately. Setting up a submission pipe is one
> > of them - there just aren't the resources to do this right now or a common place
> > distinct from Main to house the data. A few other ideas came up - we can
> > chat later, each had side issues.
> > But I saw your tweet and think that it is great that you are pulling
> > CloudBioLinux data from the rsync now, so let's get as much data in common
> > as possible, so you have data to work with near term.
> > I am in the process of adding bt2 indexes - some are published to
> > Main/rsync server already and some are not, but more will show up over the
> > next week or so (along with more genomes and other indexes). I'll take a
> > look at what you have and pull/match what I can. Genome sort order and
> > variants are my concerns, both require special handling in processing and
> > .locs. If it takes longer to check, I am just going to create here if I
> > haven't already. The GATK-sort hg19 canonical is already on my list - it
> > needed all indexes, not just bt2. When the next distribution goes out, I'll
> > list what is new on the rsync in the News Brief.
> > For the Novoalign indexes, I'm not quite sure what to do about those yet.
> > Or for any indexes associated with tools or genomes not hosted on Main. Do
> > you want to open a card for those and any other cases that are similar? We
> > can discuss a strategy from there, maybe at IUC, if Greg/Dan thinks it is
> > appropriate. Please add me so I can follow.
> > I'll be in touch as I go through the data. Thanks for your patience on this!
> > Jen
> > Galaxy team
> > On 2/21/13 12:43 PM, Brad Chapman wrote:
> >> Hi all;
> >> Is there a way for community members to contribute indexes to the rsync
> >> server? This resource is awesome and I'm working on migrating the
> >> CloudBioLinux retrieval scripts to use this instead of the custom S3
> >> buckets we'd set up previously:
> >> https://github.com/chapmanb/cloudbiolinux/blob/master/cloudbio/biodata/galaxy.py
> >> It's great to have this as a public shared resource and I'd like to be
> >> able to contribute back. From an initial pass, here are the things I'd
> >> like to do:
> >> - Include bowtie2 indexes for more genomes.
> >> - Include novoalign indexes for a number of commonly used genomes.
> >> - Clean up hg19 to include a full canonically sorted hg19, with indexes.
> >> Broad has a nice version prepped so GATK will be happy with it, and
> >> you need to stick with this ordering if you're ever going to use a
> >> GATK tool on it. Right now there is a partial hg19canon (without the
> >> random/haplotype chromosomes) and the structure is a bit complex.
> >> What's the best way to contribute these? Right now I have a lot of the
> >> indexes on S3. For instance, the hg19 indexes are here:
> >> https://s3.amazonaws.com/biodata/genomes/hg19-bowtie.tar.xz
> >> https://s3.amazonaws.com/biodata/genomes/hg19-bowtie2.tar.xz
> >> https://s3.amazonaws.com/biodata/genomes/hg19-bwa.tar.xz
> >> https://s3.amazonaws.com/biodata/genomes/hg19-novoalign.tar.xz
> >> https://s3.amazonaws.com/biodata/genomes/hg19-seq.tar.xz
> >> https://s3.amazonaws.com/biodata/genomes/hg19-ucsc.tar.xz
> >> I'm happy to format these differently or upload somewhere that would
> >> make it easy to include. Thanks again for setting this up, I'm looking
> >> forward to working off a shared repository of data,
> >> Brad
> >> ___________________________________________________________
> >> Please keep all replies on the list by using "reply all"
> >> in your mail client. To manage your subscriptions to this
> >> and other Galaxy lists, please use the interface at:
> >> http://lists.bx.psu.edu/
> > --
> > Jennifer Hillman-Jackson
> > Galaxy Support and Training
> > http://galaxyproject.org