Re: [galaxy-dev] Contributing to genome indexes on rsync server

2013-10-21 Thread Roman Valls Guimera
Hi Jennifer,

Today I was trying to pull some bowtie2 indices for PhiX from the Galaxy rsync 
server to run some tests, and only got the ones for bowtie1… I'm wondering what 
the state is with regard to this earlier thread, and what we can do to help here.

Cheers!
Roman
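
For reference, a pull like the one described above can be sketched in Python as a small helper that assembles the rsync command line. The server URL and the genome/index directory names used here are assumptions for illustration and are not stated in this thread; check the actual layout on the server before relying on them.

```python
def build_rsync_command(genome, index_dir, dest,
                        server="rsync://datacache.g2.bx.psu.edu/indexes"):
    """Return an rsync argv list that mirrors one index directory.

    The default server URL and the <genome>/<index_dir> layout are
    assumptions, not something confirmed in this thread.
    """
    remote = "{0}/{1}/{2}/".format(server.rstrip("/"), genome, index_dir)
    # -a archive mode, -v verbose, -z compress, -P progress + keep partial files
    return ["rsync", "-avzP", remote, dest]


# Hypothetical directory names for the PhiX bowtie2 index.
cmd = build_rsync_command("phiX", "bowtie2_index", "./phiX/bowtie2_index/")
print(" ".join(cmd))
```

The list can be passed to subprocess.check_call(cmd) once the remote path has been confirmed.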

On 7 Mar 2013, at 20:01, Jennifer Jackson j...@bx.psu.edu wrote:

 Hi Brad (and Roman),
 
 The team has talked about this in detail. There are a few wrinkles with just 
 pulling in indexes - Dan is doing some work that could change this later on, 
 but for now, the rsync will continue to point to the same location as Main's 
 genome data source. This means that there are some limits on what we can do 
 immediately. Setting up a submission pipe is one of them - there just aren't 
 the resources to do this right now, or a common place distinct from Main to 
 house the data. A few other ideas came up, each with side issues - we can chat later.
 
 But I saw your tweet and think that it is great that you are pulling 
 CloudBioLinux data from the rsync now, so let's get as much data in common as 
 possible, so you have data to work with near term.
 
 I am in the process of adding bt2 indexes - some are published to Main/rsync 
 server already and some are not, but more will show up over the next week or 
 so (along with more genomes and other indexes). I'll take a look at what you 
 have and pull/match what I can. Genome sort order and variants are my 
 concerns; both require special handling in processing and in the .loc files. If 
 checking takes longer, I am just going to create the indexes here if I haven't 
 already. The GATK-sorted hg19 canonical is already on my list - it needed all 
 indexes, not just bt2. When the next distribution goes out, I'll list what is 
 new on the rsync in the News Brief.
 
 For the Novoalign indexes, I'm not quite sure what to do about those yet. Or 
 for any indexes associated with tools or genomes not hosted on Main. Do you 
 want to open a card for those and any other cases that are similar? We can 
 discuss a strategy from there, maybe at IUC, if Greg/Dan thinks it is 
 appropriate. Please add me so I can follow.
 
 I'll be in touch as I go through the data. Thanks for your patience on this!
 
 Jen
 Galaxy team
 
 On 2/21/13 12:43 PM, Brad Chapman wrote:
 Hi all;
 Is there a way for community members to contribute indexes to the rsync
 server? This resource is awesome and I'm working on migrating the
 CloudBioLinux retrieval scripts to use this instead of the custom S3
 buckets we'd set up previously:
 
 https://github.com/chapmanb/cloudbiolinux/blob/master/cloudbio/biodata/galaxy.py
 
 It's great to have this as a public shared resource and I'd like to be
 able to contribute back. From an initial pass, here are the things I'd
 like to do:
 
 - Include bowtie2 indexes for more genomes.
 
 - Include novoalign indexes for a number of commonly used genomes.
 
 - Clean up hg19 to include a full canonically sorted hg19, with indexes.
   Broad has a nice version prepped so GATK will be happy with it, and
   you need to stick with this ordering if you're ever going to use a
   GATK tool on it. Right now there is a partial hg19canon (without the
   random/haplotype chromosomes) and the structure is a bit complex.
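
The ordering requirement in the last point can be checked mechanically. The following is a minimal sketch that reads sequence names from FASTA header lines and reports the first position where they differ from an expected order. The expected order is supplied by the caller; the toy list below is hypothetical and is not the actual Broad/GATK ordering.

```python
def fasta_contig_order(lines):
    """Return sequence names in the order their '>' headers appear."""
    return [line[1:].split()[0] for line in lines
            if line.startswith(">") and line[1:].strip()]


def first_order_mismatch(observed, expected):
    """Return (position, observed_name, expected_name) for the first
    difference between the two orderings, or None if they are identical."""
    for i, (obs, exp) in enumerate(zip(observed, expected)):
        if obs != exp:
            return (i, obs, exp)
    if len(observed) != len(expected):
        i = min(len(observed), len(expected))
        obs = observed[i] if i < len(observed) else None
        exp = expected[i] if i < len(expected) else None
        return (i, obs, exp)
    return None


# Toy data only: a three-sequence FASTA and a hypothetical expected order.
toy_fasta = [">chr1 toy", "ACGT", ">chr2", "GGCC", ">chrM", "TTAA"]
toy_expected = ["chrM", "chr1", "chr2"]

observed = fasta_contig_order(toy_fasta)
mismatch = first_order_mismatch(observed, toy_expected)
print(observed, mismatch)
```

In practice the lines would come from an open FASTA file handle, and the expected list from the reference whose ordering should be matched.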
 
 What's the best way to contribute these? Right now I have a lot of the
 indexes on S3. For instance, the hg19 indexes are here:
 
 https://s3.amazonaws.com/biodata/genomes/hg19-bowtie.tar.xz
 https://s3.amazonaws.com/biodata/genomes/hg19-bowtie2.tar.xz
 https://s3.amazonaws.com/biodata/genomes/hg19-bwa.tar.xz
 https://s3.amazonaws.com/biodata/genomes/hg19-novoalign.tar.xz
 https://s3.amazonaws.com/biodata/genomes/hg19-seq.tar.xz
 https://s3.amazonaws.com/biodata/genomes/hg19-ucsc.tar.xz
 
 I'm happy to format these differently or upload somewhere that would
 make it easy to include. Thanks again for setting this up, I'm looking
 forward to working off a shared repository of data,
 Brad
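
The six hg19 archive URLs listed in the message above share one naming pattern, which can be sketched as a small helper. The base URL and index names are taken from that list; whether other genomes follow the same pattern is an assumption.

```python
BASE_URL = "https://s3.amazonaws.com/biodata/genomes"
# Index types as they appear in the hg19 URLs listed above.
INDEX_TYPES = ["bowtie", "bowtie2", "bwa", "novoalign", "seq", "ucsc"]


def archive_urls(genome, index_types=INDEX_TYPES, base_url=BASE_URL):
    """Build <base_url>/<genome>-<index>.tar.xz for each index type."""
    return ["{0}/{1}-{2}.tar.xz".format(base_url, genome, idx)
            for idx in index_types]


hg19_urls = archive_urls("hg19")
for url in hg19_urls:
    print(url)
```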
 ___
 Please keep all replies on the list by using reply all
 in your mail client.  To manage your subscriptions to this
 and other Galaxy lists, please use the interface at:
 
   http://lists.bx.psu.edu/
 
 -- 
 Jennifer Hillman-Jackson
 Galaxy Support and Training
 http://galaxyproject.org
 



To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/


Re: [galaxy-dev] Galaxy egg fetching error (Mac OS X 10.8)

2013-07-26 Thread Roman Valls Guimera
Hello Galaxy!

I can confirm this issue on my MacBook Air after fetching a fresh clone of 
galaxy-dist (a few minutes ago) and running:

$ rm -rf eggs/* && ./run.sh
(… many correctly fetched eggs… )
Traceback (most recent call last):
  File "./scripts/fetch_eggs.py", line 37, in <module>
    c.resolve() # Only fetch eggs required by the config
  File "/Users/roman/dev/galaxy-dist/lib/galaxy/eggs/__init__.py", line 345, in resolve
    egg.resolve()
  File "/Users/roman/dev/galaxy-dist/lib/galaxy/eggs/__init__.py", line 195, in resolve
    return self.version_conflict( e.args[0], e.args[1] )
  File "/Users/roman/dev/galaxy-dist/lib/galaxy/eggs/__init__.py", line 226, in version_conflict
    r = pkg_resources.working_set.resolve( ( dist.as_requirement(), ), env, egg.fetch )
  File "/Users/roman/.venvburrito/lib/python/distribute-0.6.49-py2.7.egg/pkg_resources.py", line 596, in resolve
    raise DistributionNotFound(req)
pkg_resources.DistributionNotFound: mercurial==2.2.3
Fetch failed.

Immediately after running the command above, re-running ./run.sh gives:

Some eggs are out of date, attempting to fetch...
Warning: MarkupSafe (a dependent egg of Mako) cannot be fetched
Warning: pycrypto (a dependent egg of Fabric) cannot be fetched
Warning: SQLAlchemy (a dependent egg of sqlalchemy-migrate) cannot be fetched
Warning: simplejson (a dependent egg of WebHelpers) cannot be fetched
Fetched http://eggs.galaxyproject.org/ssh/ssh-1.7.14-py2.7.egg
One of Galaxy's managed eggs depends on something which is missing, this is almost certainly a bug in the egg distribution.
Dependency ssh requires pycrypto>=2.1,!=2.4
Traceback (most recent call last):
  File "./scripts/fetch_eggs.py", line 37, in <module>
    c.resolve() # Only fetch eggs required by the config
  File "/Users/roman/dev/galaxy-dist/lib/galaxy/eggs/__init__.py", line 345, in resolve
    egg.resolve()
  File "/Users/roman/dev/galaxy-dist/lib/galaxy/eggs/__init__.py", line 168, in resolve
    dists = pkg_resources.working_set.resolve( ( self.distribution.as_requirement(), ), env, self.fetch )
  File "/Users/roman/.venvburrito/lib/python/distribute-0.6.49-py2.7.egg/pkg_resources.py", line 600, in resolve
    raise VersionConflict(dist,req) # XXX put more info here
pkg_resources.VersionConflict: (ssh 1.7.14 (/Users/roman/dev/galaxy-dist/eggs/ssh-1.7.14-py2.7.egg), Requirement.parse('pycrypto>=2.1,!=2.4'))
Fetch failed.
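
To make the version constraint in that error concrete, here is a simplified pure-Python sketch of checking a version against a requirement string, assuming the requirement is `pycrypto>=2.1,!=2.4`. It is not the algorithm pkg_resources uses: it handles only plain dotted-integer versions, and the helper names are hypothetical.

```python
import operator
import re

OPS = {">=": operator.ge, "<=": operator.le, "==": operator.eq,
       "!=": operator.ne, ">": operator.gt, "<": operator.lt}


def parse_version(text):
    """Turn '2.0.1' into (2, 0, 1); only dotted integers are supported."""
    return tuple(int(part) for part in text.split("."))


def parse_requirement(text):
    """Split 'name>=1.0,!=1.2' into ('name', [('>=', (1, 0)), ('!=', (1, 2))])."""
    match = re.match(r"\s*([A-Za-z0-9_.\-]+)\s*(.*)$", text)
    if match is None:
        raise ValueError("cannot parse requirement: %r" % text)
    name, rest = match.group(1), match.group(2)
    specs = []
    for clause in (c.strip() for c in rest.split(",")):
        if not clause:
            continue
        m = re.match(r"(>=|<=|==|!=|>|<)\s*([0-9.]+)$", clause)
        if m is None:
            raise ValueError("cannot parse clause: %r" % clause)
        specs.append((m.group(1), parse_version(m.group(2))))
    return name, specs


def satisfies(version, specs):
    """Return True if the version passes every (operator, version) clause."""
    v = parse_version(version)
    for op, wanted in specs:
        # Pad with zeros so that 2.1 and 2.1.0 compare as equal.
        width = max(len(v), len(wanted))
        a = v + (0,) * (width - len(v))
        b = wanted + (0,) * (width - len(wanted))
        if not OPS[op](a, b):
            return False
    return True


name, specs = parse_requirement("pycrypto>=2.1,!=2.4")
for candidate in ("2.0.1", "2.1", "2.4", "2.5"):
    print(name, candidate, satisfies(candidate, specs))
```

Under this reading, a pycrypto egg that could not be fetched at all leaves the ssh requirement unsatisfied regardless of version, which matches the order of the warnings and the error above.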

So I tried upgrading virtualenv-burrito, in case the problem was an outdated 
pkg_resources:

$ virtualenv-burrito update
Everything is up to date.

Then I unset PYTHONPATH to use Mac OS X's base Python installation, but the 
same error appeared:

$ unset PYTHONPATH

Traceback (most recent call last):
  File "./scripts/fetch_eggs.py", line 37, in <module>
    c.resolve() # Only fetch eggs required by the config
  File "/Users/roman/dev/galaxy-dist/lib/galaxy/eggs/__init__.py", line 345, in resolve
    egg.resolve()
  File "/Users/roman/dev/galaxy-dist/lib/galaxy/eggs/__init__.py", line 195, in resolve
    return self.version_conflict( e.args[0], e.args[1] )
  File "/Users/roman/dev/galaxy-dist/lib/galaxy/eggs/__init__.py", line 226, in version_conflict
    r = pkg_resources.working_set.resolve( ( dist.as_requirement(), ), env, egg.fetch )
  File "/Library/Python/2.7/site-packages/distribute-0.6.28-py2.7.egg/pkg_resources.py", line 588, in resolve
    raise DistributionNotFound(req)
pkg_resources.DistributionNotFound: mercurial==2.2.3
Fetch failed.

Apparently the correct eggs are up there in the galaxy egg repo:

http://eggs.galaxyproject.org/mercurial/
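
When comparing what is in that directory with what the fetch script asks for, it can help to split egg filenames into their parts. Egg files are conventionally named name-version-pyX.Y[-platform].egg; the sketch below parses that form. The helper name is hypothetical, and the second filename is a made-up example of a platform-specific egg, not one copied from the repository.

```python
def parse_egg_filename(filename):
    """Split 'name-version-pyX.Y[-platform].egg' into its components."""
    suffix = ".egg"
    if not filename.endswith(suffix):
        raise ValueError("not an egg filename: %r" % filename)
    # At most 3 splits, so hyphens inside the platform tag are preserved.
    parts = filename[:-len(suffix)].split("-", 3)
    if len(parts) < 3 or not parts[2].startswith("py"):
        raise ValueError("unexpected egg filename layout: %r" % filename)
    return {
        "name": parts[0],
        "version": parts[1],
        "python": parts[2][2:],
        "platform": parts[3] if len(parts) == 4 else None,
    }


# Filename taken from the fetch log earlier in this message.
pure = parse_egg_filename("ssh-1.7.14-py2.7.egg")
# Hypothetical platform-specific filename, for illustration only.
binary = parse_egg_filename("mercurial-2.2.3-py2.7-macosx-10.6-intel.egg")
print(pure)
print(binary)
```

A platform value of None indicates a pure-Python egg; anything else has to be compatible with the local interpreter's platform for the egg to be usable.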

I checked the pull requests and bug reports in Trello first, but there doesn't 
seem to be a ticket for this one.

Cheers!
Roman


On 16 Jul 2013, at 20:59, Iry Witham iry.wit...@jax.org wrote:

 Hi Team,
 
 I am attempting to rebuild my local instance of Galaxy on my Mac after being 
 upgraded to Mountain Lion.  I have installed Mercurial, PostgreSQL, and 
 Python 2.7, and I have cloned the latest galaxy-dist.  However, when I attempt 
 to launch Galaxy with sh run.sh, it fails to complete the fetch.  Here is what 
 I get:
 
 milkyway:galaxy-dist itw$ sh run.sh
 Initializing datatypes_conf.xml from datatypes_conf.xml.sample
 Initializing external_service_types_conf.xml from external_service_types_conf.xml.sample
 Initializing migrated_tools_conf.xml from migrated_tools_conf.xml.sample
 Initializing reports_wsgi.ini from reports_wsgi.ini.sample
 Initializing shed_tool_conf.xml from shed_tool_conf.xml.sample
 Initializing tool_conf.xml from tool_conf.xml.sample
 Initializing shed_tool_data_table_conf.xml from shed_tool_data_table_conf.xml.sample
 Initializing tool_data_table_conf.xml from tool_data_table_conf.xml.sample
 Initializing tool_sheds_conf.xml from tool_sheds_conf.xml.sample
 Initializing data_manager_conf.xml from data_manager_conf.xml.sample
 Initializing shed_data_manager_conf.xml from shed_data_manager_conf.xml.sample
 Initializing openid_conf.xml from openid_conf.xml.sample