first, best wishes for the new year!
my apologies, best wishes to you (and all others ;)
On 05 Jan, 2013, at 05:18, Stijn De Weirdt wrote:
disclaimer: i was never a big fan of the auto-download/source_urls option in EB
for exactly this reason, ie sometimes EB will appear to be failing due to
external reasons (however, the added user-friendliness far outweighs this)
A common case for an "impatient user" like me is that you parallelize builds
and end up overwriting the downloads (eg. GCC-CLooG fighting with plain GCC
over the same tarball).
I have done it a few times already and it takes a bit to understand what the
issue is.
Whatever solution we may pick, it will likely address nicely this case, too.
i'm not sure, but fixing framework issue #413 will ;)
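The overwrite race described above can be avoided without any mirroring at all, by never writing the final filename until the download is complete. A minimal sketch in Python, assuming a hypothetical `fetch(url, path)` download helper (this is not the EasyBuild API):

```python
# Sketch: two concurrent builds fetching the same tarball must never see a
# half-written file. Download into a uniquely named temp file in the same
# directory, then rename it into place; rename is atomic on POSIX.
# fetch(url, path) is a placeholder for whatever download function is used.
import os
import tempfile

def atomic_fetch(url, dest, fetch):
    """Download url via fetch(url, path); publish at dest only when complete."""
    if os.path.exists(dest):              # another build already finished it
        return dest
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(dest) or ".")
    os.close(fd)
    try:
        fetch(url, tmp)                   # partial data lives under a unique name
        os.rename(tmp, dest)              # atomic publish: all-or-nothing
    finally:
        if os.path.exists(tmp):           # fetch failed before the rename
            os.remove(tmp)
    return dest
```

With this, a parallel build either sees no file (and fetches its own copy) or a complete one, never a truncated tarball.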
For instance, if you try to build DOLFIN from scratch, you will quickly realize
that it is doomed to fail, since the source for MTL4/4.0.8878 is no longer
available.
There are workarounds (eg. I now symlink the newer version, faking it to be
the older one), but these are all inferior to having a copy of the tarball.
please contribute the newer working .eb files so others don't run into this
same issue
Done, along with its deps: easyconfigs/#76
thx!
Now, I understand that not everything will be possible, but I would really
like us to have a mirroring solution, at least for the open source software codes.
i'm not sure what you mean with "open source", but afaik it is mainly the
software license that defines who can distribute what.
Yeah, better to clarify: normally OSS codes, as per the OSI definition, allow
unlimited redistribution;
I am liberally putting all such codes under the OSS label here, but further
clarification may be needed.
For now, let's focus on the codes that pose no problem for building a combined
repo.
(whatever that definition means)
but something tells me that these codes can typically always be found
somewhere, and the Intel example you gave and want solved is not such a
case.
one thing that is missing to automate this is that the license is not
part of the .eb file. if it were, we could certainly create a public
repository with what we have collected. keeping it in sync with other
sites is then a data management issue.
a) Have you heard of any other kind of "open source" registry project,
perhaps something we could piggyback on to register specific tarballs?
most p2p tools offer something along these lines. (but i'm still waiting for a
fuse interface ;)
FUSE and any filesystem will likely give you one more interface with potential
new failure modes;
but we can borrow from the idea and export the data via AFS, your p2p-FUSE code,
and so on ;-)
ie. it could be yet another interface to export the bunch of files, and that's
easily doable.
b) What technology do you think should be deployed? What are your preferences?
(http, ftp, git, rsync, zsync ... whatever you think should be offered)
most have master/slave models, so that might not be what you are looking for
(granted, you can write scripts to keep things in sync).
a better solution would be a distributed filesystem with one (or more) replicas
per site.
i'm aware of a project called REDDnet/L-store that aims for something similar.
i'm not sure if accessing data from a WAN shared filesystem counts as
distributing, so that might be a way out.
Do people prefer to have an exact, complete local copy, or rather a "subset
cache"? What's your stance?
a full namespace cache, and then 2 modes: full cache or cache on access.
(our current source dir is a bit less than 200GB, so at least a few
sites should be able to provide full caches)
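The "cache on access" mode above can be sketched in a few lines; `SourceCache` and its `fetch` callback are hypothetical names for illustration, not an existing tool:

```python
# Sketch of "cache on access": keep the full namespace (the list of known
# source files) locally, but fetch a tarball from the mirror only the first
# time it is requested; later requests hit the local copy.
import os

class SourceCache:
    def __init__(self, cache_dir, namespace, fetch):
        self.cache_dir = cache_dir        # local directory holding tarballs
        self.namespace = set(namespace)   # full list of known source files
        self.fetch = fetch                # fetch(name, dest_path): pull one file

    def get(self, name):
        if name not in self.namespace:
            raise KeyError("unknown source file: %s" % name)
        path = os.path.join(self.cache_dir, name)
        if not os.path.exists(path):      # first access: pull from the mirror
            self.fetch(name, path)
        return path                       # subsequent accesses are local
```

"Full cache" mode is then just calling `get()` on every name in the namespace up front.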
One aspect I like about zsync or git is that hashes make the copy mechanism
irrelevant (if there is an issue, you are going to catch it). With that
property in place, anything goes!
ie. why limit ourselves? every HPC site could pick/introduce the technology it
prefers.
because of the management part of data management. both zsync and git
have master/client issues, so you will need to write quite a bit of code
to keep those hidden from the end-user, or set up a limited number of
"golden" sites.
eg hashes are fine, but how are we going to distribute the hashes in a
secure way? use PKI and trust the golden sites?
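Setting the distribution question aside, the verification half is mechanical once a trusted manifest of hashes has reached a site (however it got there, eg. signed by the golden sites). A sketch, assuming a simple `name sha256` line format for the manifest (an assumption, not an agreed standard):

```python
# Sketch: check a local source cache against a trusted hash manifest.
# Each manifest line is "filename sha256hexdigest"; how the manifest is
# distributed and signed is deliberately out of scope here.
import hashlib

def sha256_of(path):
    """SHA-256 of a file, read in chunks to handle large tarballs."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def verify(manifest_path, get_path):
    """Return the names whose local copy does not match the manifest."""
    bad = []
    with open(manifest_path) as f:
        for line in f:
            if not line.strip():          # skip blank lines
                continue
            name, digest = line.split()
            if sha256_of(get_path(name)) != digest:
                bad.append(name)
    return bad
```

With hashes checked on the receiving end, any transport (http, rsync, a p2p tool) becomes acceptable, which is exactly the "copy mechanism is irrelevant" point above.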
But I agree there are important design criteria to fulfill and the picture is
not 100% clear.
We will likely start with something and fix as we go.
sure.
c) Is your preference to integrate this to easybuild or, perhaps, keep it
orthogonal?
(eg. someone could bootstrap .local/easybuild/source with zsync & then let
it go)
i would keep it out of EB. integration should be provided, but it is a data
management issue, and EB has enough problems to deal with as is. (but i'll be
more than happy to test/contribute ;)
Good. Orthogonality has the advantage of being future-proof.
ps3.
Ubuntu's "LTS" lineage is a good example of a commendable vendor retention
policy.
Somehow, not everybody around understands the universal need for LTS style
solutions...
what do you mean? LTS is only 5 years, hardly the lifetime of any long-running
experiment ;)
IMHO, long running experiments have eventually to face the fate of their
obsolete hardware :-(
Here is a description of good design aspects of LTS: https://wiki.ubuntu.com/LTS
(I am not promoting LTS; RHEL has an even longer life cycle, I just don't have
its commitments at hand).
Anyway, the point above is that software providers are typically not as serious
as OS vendors about backward compatibility, at the expense of user time.
eg. Intel, with a 3-year window, may pull the plug on past compilers/MPI stacks
etc.
If you know of a contradicting story, let it be known...
thanks,
Fotis