It took me a couple of extra days to find the time to test this; sorry about
the delay.

The initial run:

WebError 0.8a couldn't be downloaded automatically.  You can try
building it by hand with:
  python scripts/ -e WebError
Fetch failed.

When I patch a traceback.print_exc() into , this is what I get:

Some eggs are out of date, attempting to fetch...
Warning: sqlalchemy (a dependent egg of sqlalchemy-migrate) cannot be
Traceback (most recent call last):
  File "./scripts/", line 36, in <module>
    c.resolve() # Only fetch eggs required by the config
  File "/opt/galaxy/lib/galaxy/eggs/", line 349, in resolve
    raise EggNotFetchable( missing )
EggNotFetchable: <unprintable EggNotFetchable object>
WebError 0.8a couldn't be downloaded automatically.  You can try
building it by hand with:
  python scripts/ -e WebError
Fetch failed.
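
For anyone who wants to reproduce this kind of debugging, here is a minimal sketch of the patch I mean. It is not the actual Galaxy code: the `EggNotFetchable` class below is a stand-in, and `resolve` and `fetch` are hypothetical names. It only shows a caller that would otherwise swallow the exception printing the traceback before reporting the failure, plus an explicit `__str__` so the message stays readable.

```python
import traceback


class EggNotFetchable(Exception):
    """Stand-in for Galaxy's exception; here it just carries a list of egg names."""

    def __init__(self, eggs):
        Exception.__init__(self, eggs)
        self.eggs = eggs

    def __str__(self):
        # Explicit string form so tracebacks and logs show which eggs failed
        return "could not fetch: " + ", ".join(self.eggs)


def resolve(missing):
    """Hypothetical resolver: raises if any required egg is missing."""
    if missing:
        raise EggNotFetchable(missing)
    return True


def fetch(missing):
    """Hypothetical caller that would otherwise hide where the error came from."""
    try:
        resolve(missing)
    except EggNotFetchable as e:
        # The debugging patch: print the full traceback to stderr
        traceback.print_exc()
        return "Fetch failed: %s" % e
    return "Fetch successful."
```

Calling `fetch(["WebError"])` prints the traceback to stderr and returns the failure message, while `fetch([])` reports success.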

If I look into lib/galaxy/eggs , that line is here:

In the current galaxy source, that is here:

As a bit of an aside, I really do have to install this other package in the
same python that galaxy uses, since it's one of the major parts of our fork
-- not part of our custom tools.
I wouldn't consider this a normal use case of course, but I'll bet that
many people will install galaxy with system python, which can lead to
similar circumstances.
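
One way to spot this ahead of time would be to compare what is already installed in the interpreter against the versions Galaxy expects. A rough sketch, assuming Python 3.8+ (for `importlib.metadata`) and a hand-written dict of pins; `find_conflicts` and the `pinned` mapping are hypothetical names of mine, not anything Galaxy provides:

```python
from importlib import metadata


def find_conflicts(pinned):
    """Return {name: (wanted, installed)} for installed packages that differ from the pin."""
    conflicts = {}
    for name, wanted in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Not installed in this interpreter, so nothing to clash with
            continue
        if installed != wanted:
            conflicts[name] = (wanted, installed)
    return conflicts
```

For example, `find_conflicts({"WebError": "0.8a"})` would return an empty dict on a clean interpreter, and a non-empty one if a different WebError is already sitting in site-packages.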

Please let me know if I can provide any other useful information on what
I'm seeing.

Again, thanks for looking into this with me,

On Thu, Feb 5, 2015 at 8:19 AM, Nate Coraor <> wrote:

> Hi Stephen,
> On Wed, Feb 4, 2015 at 8:30 PM, Stephen Rosen <>
> wrote:
>> Hi Nate,
>> Thanks so much for helping me out with this.
>> It seems that I've miscommunicated what I'm doing a little bit.
>> I'm installing a package from that git repo using pip separately from
>> running the galaxy setup script.
>> That is, I want to install a python package and I want to setup galaxy as
>> two separate (and ideally independent) steps in my script.
>> I have no desire to do anything particularly esoteric or clever with
>> Galaxy's eggs or to install them myself.
>> Somehow, doing a pip install of my desired package is breaking the later
>> run of
>> If I understand you correctly, can hit conflicts with
>> site-packages, but is misreporting them as EggNotFetchable.
>> I am aware that we're using a fork of the Galaxy source (
>> ), so it may be that we
>> have an outdated of which carries a bug that has since been
>> patched.
>> Sorry for not mentioning it earlier -- it slipped my mind that we're
>> probably trailing behind the modern Galaxy default head.
>> If I put in the extra legwork now to wrap Galaxy in a virtualenv, this
>> issue will presumably disappear, but that assumes that I don't want to add
>> any packages to the python used for the uwsgi process.
> You can add additional packages to this virtualenv, although there could
> be problems if you install different versions of the same packages that
> Galaxy has as dependencies. For what it's worth, you should not need to
> install additional packages in Galaxy's virtualenv, and tools can be
> configured to use a different python or virtualenv than the one in which
> Galaxy is started.
>> I'll have to ask folks on the Globus Genomics team whether or not that is
>> needed -- I certainly hope not, since it means that there is some packaging
>> conflict we need to resolve.
>> The only reason I haven't done so already is time constraint -- I've been
>> trying to get this install scripting done as quickly as possible without
>> compromising anything truly needful.
>> All the same, I'll take some time tomorrow to provision a fresh server
>> and generate the stacktrace for you.
>> I'll also take some time to test with the latest Galaxy source, to see if
>> I get different behavior.
>> In the best case, this bug no longer exists for the modern source, but in
>> the worst case it bears the attention.
> Thanks for the help with debugging.
>> I don't have any sense of the reliability of and company --
>> my experience has been of a particular bug that presented on my first
>> contact with these scripts, so naturally I have a bias against them.
>> That said, I think the Python community may have finally settled on a
>> package manager, even if we can't seem to agree on a package format or
>> tooling surrounding it.
>> That's just an opinion though -- I have no blog post from Guido to back
>> it up.
> I hope we'll see the necessary improvements within the next year, but I
> think anything we do now will essentially reinvent the wheel (no pun
> intended) with regards to what will (hopefully) be officially implemented
> to fix the problem.
> --nate
>> I was not aware of the UCS2 vs. UCS4 issue -- thanks very much for the
>> citations, very helpful in understanding the problem space.
>> Thanks,
>> -Stephen
>> On Wed, Feb 4, 2015 at 5:41 PM, Nate Coraor <> wrote:
>>> Hi Stephen,
>>> I'll try to reply as in-depth as I can.
>>> On Wed, Feb 4, 2015 at 1:41 PM, Stephen Rosen <>
>>> wrote:
>>>> Hi Galaxy Dev,
>>>> I've been looking at the setup scripts for Galaxy to try to understand
>>>> a problem I recently had provisioning a Galaxy server.
>>>> I will readily admit that I have not read all of the relevant code
>>>> top-to-bottom, but I have at least skimmed all of it and read much of it.
>>>> Sorry if these questions are answered somewhere in Trello, the Wiki, or
>>>> somewhere else, but I was not able to find answers in any public locations.
>>>> As a small bit of probably irrelevant context:
>>>> I'm working with the Globus Genomics group on the DevOps side of things.
>>>> We're using Chef.
>>>> I've only just started working with the group in the past couple of
>>>> weeks (so my expertise with Galaxy itself is limited to nonexistent).
>>>> First, to describe the problem:
>>>> We want to provision a server running Galaxy without explicitly
>>>> wrapping it in a virtualenv.
>>>> Unless I missed something, that means that it's using system python.
>>>> When we use pip to install a package from a git repository before
>>>> running the setup scripts, fetch_eggs fails saying it failed to fetch
>>>> WebError 0.8a
>>>> If we install the same package from git with `pip install --egg ...` we
>>>> get a hunky-dory system where everything seems to work.
>>> A virtualenv is itself just a wrapper around whichever python binary was
>>> used to create it. I'd still suggest using a virtualenv created with the
>>> system python unless you have a really strong reason not to. In fact, I'm
>>> working on Galaxy process management and a command line tool for that
>>> management that will automatically create and use a virtualenv going
>>> forward.
>>> I'm a bit confused at what's happening here - you mention installing a
>>> package from a git repository with pip, but then refer to Galaxy's
>>> fetch_eggs(.py) script, which doesn't use pip or git.
>>>> As far as I can tell, there is no reason that this should be the case.
>>>> Sure, putting the git source directly into site-packages might cause
>>>> issues upon installation, but EggNotFetchable exceptions should only be
>>>> thrown if the egg actually can't be pulled down from
>>>> , right?
>>> EggNotFetchable can be thrown if you happen to be using a platform for
>>> which we do not provide eggs, although those are fairly uncommon. Right now
>>> we should cover x86/x86_64 Linux and any flavor of Intel OS X after 10.5.
>>>> I don't feel comfortable trying to make further progress on my
>>>> provisioning scripts without knowing why this is happening.
>>>> I'd hate to be bitten by this later on in the process.
>>>> Yes, the package in question may have poor behavior (likely it does),
>>>> but that doesn't change the fact that the error is totally misleading.
>>>> Furthermore, it doesn't appear that this poor behavior impedes me from
>>>> doing a pip install of the WebError package or any other packages from 
>>>> PyPI.
>>>> In case someone else wants to test to replicate, this is the command
>>>> being used:
>>>>   pip install --egg git+
>>>> Problems occur if you omit `--egg`.
>>> None of this process is using Galaxy's egg handling, so I am not sure
>>> where the EggNotFetchable is coming from. What command are you running to
>>> get an EggNotFetchable error?
>>>> Second, a question about the rationale for Galaxy's egg handling:
>>>> Why is all of this wrapped up in these scripts in the first place?
>>>> I understand that pip might not be present on every platform, and I
>>>> don't mean to question a decision to support systems without it.
>>>> However, as detailed below, Galaxy does not support any platforms which
>>>> are incapable of running pip.
>>> This isn't the case - Galaxy does not use pip to install the framework
>>> dependencies at all. Some tool dependencies installed from the Tool Shed do
>>> use pip, but that's entirely separate from the dependencies of the Galaxy
>>> application.
>>> The `scripts/` script can be used to automatically build eggs
>>> on platforms which we do not prebuild eggs for. If this is necessary,
>>> `scripts/` should tell you.
>>>> Furthermore, pip is being pushed by the Python maintainers over
>>>> easy_install, so it's not like there isn't a clear choice in terms of which
>>>> one to support.
>>>> Perhaps most importantly, there don't appear to be any clear-cut
>>>> options to do the following, which I would consider a more ordinary
>>>> workflow:
>>>> - Run a galaxy script (like check_eggs) to generate a list of packages
>>>> from for platform (redirect output to
>>>> requirements.txt or similar)
>>>> - `pip install -r requirements.txt`
>>> This is exactly what `scripts/` and `scripts/`
>>> do.
>>> There are 3 reasons for the way we handle eggs in Galaxy:
>>> 1. Galaxy has a huge (and ever-growing) list of dependent python modules
>>> with C extensions. If we did not prebuild and distribute eggs for these,
>>> the initial setup to get Galaxy running would be long and problematic. Some
>>> people who download Galaxy to develop tools may not even have compilers
>>> installed, let alone the multitude of -dev or -devel packages that aren't
>>> part of a default Debian or RHEL installation that would be required to
>>> build all of these packages from source. One of the things that I feel
>>> makes Galaxy so accessible is that you can start using it immediately after
>>> you clone the source. So that ability to clone and start and have it work
>>> as reliably (and quickly) as possible is a high priority.
>>> 2. Galaxy started using eggs in 2005 or 2006. At this time, everything
>>> used distutils. pkg_resources came around, which soon brought setuptools
>>> and easy_install. After this came distutils2, pip and finally, these days,
>>> wheels. Our need for binary dependency packaging predated almost all of
>>> these (in fact, most packages in these days didn't even install .egg-info,
>>> which was the only reliable way to know what version of a module you were
>>> using) and as each new iteration of packaging/management came along it was
>>> never clear that any of them had "won" (and in fact, most of them lost). On
>>> top of this, the Python packaging folks have known for years that Python's
>>> platform detection for binary compatibility is broken[1]. While I was
>>> assured it'd be fixed soon, even with a complete reimplementation of Python
>>> packaging (wheels), they still haven't even made an effort to fix this
>>> problem[2]. In fact, binary wheels for Linux are explicitly not allowed on
>>> PyPI because of this.
>>> 3. We tightly control the versions of all of our dependencies, which is
>>> not always possible with pip if you aren't also controlling the source of
>>> your packages.
>>> [1]
>>> [2]
>>>> The above could be part of a galaxy provisioning script, rather than
>>>> exposed to the administrator.
>>>> That also makes it significantly easier to control and manage the
>>>> virtualenv in which we run Galaxy, since we don't have to worry about
>>>> egg-related logic that we don't control and we know that the virtualenv's
>>>> bin dir will be earlier in the PATH than the system pip's dir.
>>>> Yes, I said above that our setup is presently using system python --
>>>> switching to a virtualenv is one of the many items on my to-do list.
>>>> In fact, I would expect that the default, desired setup for Galaxy
>>>> would be to put it in a virtualenv, rather than using system python, and to
>>>> use pip, rather than and company.
>>> A virtualenv is indeed the strongly preferred setup as I mentioned
>>> above. However, Galaxy does not install its eggs to the virtualenv. The
>>> virtualenv is there to avoid conflicts with things in the default python's
>>> site-packages/dist-packages. Galaxy's eggs are installed to (by default)
>>> the `eggs/` directory in the Galaxy source.
>>> However, the problem, as I now see it, is that you are trying to install
>>> all of Galaxy's dependencies, even at their correct versions, using pip,
>>> rather than letting Galaxy handle its eggs as it does. This is not going to
>>> work, Galaxy is going to insist on using its eggs.
>>>> When I look at the logic being used here, especially at
>>>> , it looks like a solution built exclusively for platforms on which pip is
>>>> not installed.
>>>> According to the wiki, Galaxy support only goes back as far as 2.6, and
>>>> supports 2.6, so there is no way of building a Galaxy server on
>>>> a platform that can't also have pip installed.
>>>> Adding pip to the requirements for Galaxy would not be particularly
>>>> onerous, and may simplify things significantly (no need to bundle
>>>> or similar).
>>> As mentioned above, we don't use or depend on pip, so it's not required.
>>> That said, a lot of our egg fetching logic could likely be replaced with
>>> pip (this code predates pip by a few years). And the eggs could probably be
>>> replaced with prebuilt wheels. However, even if we did use pip/wheels, we'd
>>> need to install them from our own repository, and it'd still require
>>> modifications for binary platform incompatibilities. The egg handling code
>>> we have now works pretty reliably, so I am not sure there is a whole lot to
>>> be gained by changing it until Python finally figures out how to handle
>>> binary compatibility properly.
>>>> A last note about the misleading error from the fetch_eggs script,
>>>> which told me that WebError is "NotFetchable":
>>>> I probably wouldn't have much of a complaint about this if the error
>>>> had been more on target and I hadn't felt the need to do things like patch
>>>> lib/galaxy/eggs to print a stacktrace.
>>>> For example, if the script detected that there was a source installed
>>>> package which was getting underfoot, it should have alerted me or even
>>>> suggested installing my various packages in egg format.
>>> This is a bug - it is supposed to explain that there is a version
>>> conflict. If you have a stack trace, please send it along.
>>>> That kind of error detection is hard to maintain and hard to keep
>>>> accurate since the Galaxy team's priority is to build Galaxy, not a package
>>>> manager.
>>>> This, I suppose, circles back to an earlier question: why isn't Galaxy
>>>> using a python package manager to... manage its packages?
>>>> At the very least something along the lines of
>>>> `./scripts/ --use-pip` should be added.
>>>> It doesn't seem like it would be that hard to implement -- but I feel
>>>> like my lack of knowledge of Galaxy disqualifies me from building the
>>>> changeset reliably.
>>>> Of course, if no one objects, I will readily do so anyway (when I have
>>>> the time) as a proof of concept.
>>> There are modifications to a few of the dependencies, such as psycopg2.
>>> When psycopg2 is scrambled, the scrambling process fetches and compiles a
>>> bit of PostgreSQL's libpq and then statically links to it to provide a
>>> standalone egg that does not depend on the user having installed libpq on
>>> their system. So a pip-based system that allowed building from source would
>>> need to account for this.
>>> --nate
>>>> Barring even that improvement, the Wiki page at
>>>> should definitely be
>>>> updated to include some note on why Galaxy has this complex logic for
>>>> fetching eggs instead of using pip.
>>>> A quick tl;dr and summary:
>>>> - I don't have extensive experience with Galaxy, so I may not know what
>>>> I'm talking about.
>>>> - can be made to raise EggNotFetchable by doing a pip
>>>> install from a VCS without using `--egg`. This is a bug in
>>>> - All supported platforms for Galaxy support pip
>>>> - Galaxy should have an option to use pip to download its packages over
>>>> https from
>>>> - Galaxy should probably default to using pip when it's available,
>>>> since its failure modes are significantly better than a home-brewed package
>>>> manager -- this also leads to good behavior in virtualenvs.
>>>> Best regards, and many thanks for your time and attention,
>>>> -Stephen
>>>> ___________________________________________________________
>>>> Please keep all replies on the list by using "reply all"
>>>> in your mail client.  To manage your subscriptions to this
>>>> and other Galaxy lists, please use the interface at:
>>>> To search Galaxy mailing lists use the unified search at:
