Re: [Beowulf] experience with HPC running on OpenStack

2020-06-30 Thread INKozin via Beowulf
Hi Jörg,
you may want to talk to Kenneth Tan @ https://www.sardinasystems.com/
In any case, he should be able to answer your question regarding IB.
Happy to provide his email address.
Best
Igor

On Tue, 30 Jun 2020 at 12:22, Jörg Saßmannshausen <
sassy-w...@sassy.formativ.net> wrote:

> Dear all,
>
> we are currently planning a new cluster and this time around the idea was
> to use OpenStack for the HPC part of the cluster as well.
>
> I was wondering if somebody has some first hand experiences on the list
> here.
> One of the things we currently are not so sure about is InfiniBand (or
> another low latency network connection, but not Ethernet): can you run HPC
> jobs on OpenStack which require more cores than a single box offers? I am
> thinking of programs like CP2K, GROMACS, NWChem (if that sounds familiar
> to you) which utilise these kinds of networks very well.
>
> I came across things like Magic Castle from Compute Canada but as far as I
> understand it, they are not using it for production (yet).
>
> Is anybody on here familiar with this?
>
> All the best from London
>
> Jörg
>
>
>
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf


Re: [Beowulf] Here we go again

2019-12-12 Thread INKozin via Beowulf
All I can say is that AMD has finally released its own build of HPL and
that wasn't the case when Naples came out.

On Thu, 12 Dec 2019, 14:35 Douglas Eadline,  wrote:

>
> Anyone see anything like this with Epyc, i.e. poor AMD performance
> when using Intel compilers or MKL?
>
>
> https://www.pugetsystems.com/labs/hpc/AMD-Ryzen-3900X-vs-Intel-Xeon-2175W-Python-numpy---MKL-vs-OpenBLAS-1560/
>
>
>
> --
> Doug
>
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf


Re: [Beowulf] HPE completes Cray acquisition

2019-09-25 Thread INKozin via Beowulf
Interesting that HPC and AI are in one business unit. Not everyone agrees
that the two are aligned or complementary (inferencing on the edge being
one example).

On Wed, 25 Sep 2019, 17:09 Christopher Samuel,  wrote:

> Cray joins SGI as part of the HPE stable:
>
>
> https://www.hpe.com/us/en/newsroom/press-release/2019/09/hpe-completes-acquisition-of-supercomputing-leader-cray-inc.html
>
>  > As part of the acquisition, Cray president and CEO Peter Ungaro, will
> join HPE as head of the HPC and AI business unit in Hybrid IT.
>
> All the best,
> Chris
> --
>Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf


Re: [Beowulf] Lustre on google cloud

2019-07-26 Thread INKozin via Beowulf
I'm very much in favour of personal or team clusters as Chris has also
mentioned. Then the contract between the user and the cloud is explicit.
The data can be uploaded/pre-staged to S3 in advance (at no cost other
than time) or copied directly as part of the cluster creation process. It
makes no sense to replicate your in-house infrastructure in the cloud.
However, having a solid storage base in-house is good. What you should look
into is the cost of transfer back if you really have to do it. The cost
could be prohibitively high, e.g. if BAM files need to be returned. I'm sure
Tim has an opinion.
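(Pre-staging itself is just an upload loop - a rough sketch with boto3; the
bucket and paths are made up:)

    import os
    import boto3

    s3 = boto3.client("s3")

    def prestage(local_root, bucket, prefix):
        # push a directory tree to S3 ahead of cluster creation;
        # ingress is free, it is the way back out that costs money
        for dirpath, _, files in os.walk(local_root):
            for name in files:
                path = os.path.join(dirpath, name)
                key = prefix + "/" + os.path.relpath(path, local_root)
                s3.upload_file(path, bucket, key)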

On Fri, 26 Jul 2019, 05:01 Joe Landman,  wrote:

>
> On 7/25/19 8:26 PM, Jörg Saßmannshausen wrote:
> > Dear all, dear Chris,
> >
> > thanks for the detailed explanation. We are currently looking into cloud-
> > bursting so your email was very timely for me as I am supposed to look
> > into it.
> >
> > One of the issues I can see with our workload is simply getting data
> > into the cloud and back out again. We are not talking about a few Gigs
> > here, we are talking up to say 1 or more TB. For reference: we got 9 PB
> > of storage (GPFS) of which we are currently using 7 PB and there are
> > around 1000+ users connected to the system. So cloud bursting would only
> > be possible in some cases.
> > Do you happen to have a feeling of how to handle the issue with the file
> > sizes sensibly?
>
> The issue is bursting with large data sets.  You might be able to
> pre-stage some portion of the data set in a public cloud, and then burst
> jobs from there.  Data motion between sites is going to be the hard
> problem in the mix.  Not technically hard, but hard from a cost/time
> perspective.
>
>
> --
> Joe Landman
> e: joe.land...@gmail.com
> t: @hpcjoe
> w: https://scalability.org
> g: https://github.com/joelandman
> l: https://www.linkedin.com/in/joelandman
>
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf


Re: [Beowulf] flatpack

2019-07-23 Thread INKozin via Beowulf
Hi Jonathan,
Thanks, good to know.
I have tried running Krita on CentOS recently but ran into a glibc issue
using AppImage. Might as well try Flatpak, which seems to be available.
Igor

On Mon, 22 Jul 2019, 03:31 Jonathan Engwall, <
engwalljonathanther...@gmail.com> wrote:

> Hello Beowulf,
> Some distros will be glad to know Flatpack will load your software center
> with working downloads. First visit the website:
> https://flatpak.org/setup/ , choose your distro, then enable it for your
> installation.
> After that, your software center, this is for GNOME, will then quadruple,
> at least, in size if your has been lacking. Ubuntu, for instance, has
> always been loaded.
> This works for CentOS, Fedora, also therefore RedHat which I have never
> used. Rasbian and several others are on the page.
> Attached:
> In the screenshot attached below you can see I now have Godot Engine. You
> can see Visual Scripting of a simple 2d ui_left, right type game. It looks
> tedious, but it is so easy to redesign.
> Of course I had a problem, no CallNode. Probably because I enabled 3d
> features. So I tried one thing, then another. I was dragging nodes, piping
> physics, I ran the normalized vector through the update. I stuffed it all
> into Return, many various things.
> I nearly got through it too. At one point a clear error: Size was (1). But
> LoL I didn't watch the demos so I didn't know how I did that either.
> Jonathan Engwall
>
> [Attachment: Screenshot from 2019-07-20 18-32-30.png]
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf


Re: [Beowulf] help for metadata-intensive jobs (imagenet)

2019-06-29 Thread INKozin via Beowulf
Hi John, great to hear from you. I assume you are asking about image
augmentation and pre-processing.
There are more or less standard steps to organise the downloaded images. If
you google you should be able to find suitable scripts. I recall I
followed the ones provided by Soumith Chintala, but he also used bits
provided by someone else. The thing is you do it once and then forget about
it. You can also remove some bad images. I recall there are some which give
a warning on read due to bad EXIF info etc.; these can be overwritten.
Cropping to the relevant area using the bounding boxes might be an
interesting option.
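(For the cleanup step, a rough sketch of what I mean - plain PIL, with read
warnings promoted to errors so the offending files surface; paths are
illustrative:)

    import os, warnings
    from PIL import Image

    def find_bad_images(root):
        # walk the unpacked class folders and flag files that do not decode
        # cleanly; a warning on read (e.g. bad EXIF) becomes an exception here
        bad = []
        for dirpath, _, files in os.walk(root):
            for name in files:
                path = os.path.join(dirpath, name)
                try:
                    with warnings.catch_warnings():
                        warnings.simplefilter("error")
                        Image.open(path).convert("RGB")  # force a full decode
                except Exception as exc:
                    bad.append((path, str(exc)))
        return bad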
Augmentation is more interesting. There are many papers covering the
overall training process from scratch. Reading "Training ImageNet in 1
Hour" (https://arxiv.org/abs/1706.02677) could be one starting point.
Then follow the references on data augmentation and you'll end up with a
few key papers which everyone references.
The ResNet "school" does things slightly differently than VGG.
Horovod provides examples for starters
https://github.com/horovod/horovod/tree/master/examples
What they don't do is random cropping.
Also keep in mind how the final quality of the training is assessed -
random crop, central crop, nine crops + reflection etc.

Thanks for the pointer to the new meetup. I love both HPC and AI. However I
don't see the announcement about the meeting on 21 August. Hope it will
appear later.

On Sat, 29 Jun 2019 at 07:49, John Hearns via Beowulf 
wrote:

> Igor, if there are any papers published on what you are doing with these
> images I would be very interested.
> I went to the new London HPC and AI Meetup on Thursday, one talk was by
> Odin Vision which was excellent.
> Recommend the new Meetup to anyone in the area. Next meeting 21st August.
>
> And a plug to Verne Global - they provided free Icelandic beer.
>
> On Sat, 29 Jun 2019 at 05:43, INKozin via Beowulf 
> wrote:
>
>> Converting the files to TF records or similar would be one obvious
>> approach if you are concerned about metadata. But then I'd understand why
>> some people would not want that (size, augmentation process). I assume you
>> are doing the training in a distributed fashion using MPI via Horovod
>> or similar and it might be tempting to do file partitioning across the
>> nodes. However doing so introduces a bias into minibatches (and custom
>> preprocessing). If you partition carefully by mapping classes to nodes it
>> may work but I also understand why some wouldn't be totally happy with
>> that. I've trained keras/TF/horovod models on ImageNet using up to 6 nodes
>> each with four P100/V100 and it worked reasonably well. As the training
>> still took a few days, copying to local NVMe disks was a good option.
>> Hth
>>
>> On Fri, 28 Jun 2019, 18:47 Mark Hahn,  wrote:
>>
>>> Hi all,
>>> I wonder if anyone has comments on ways to avoid metadata bottlenecks
>>> for certain kinds of small-io-intensive jobs.  For instance, ML on
>>> imagenet,
>>> which seems to be a massive collection of trivial-sized files.
>>>
>>> A good answer is "beef up your MD server, since it helps everyone".
>>> That's a bit naive, though (no money-trees here.)
>>>
>>> How about things like putting the dataset into squashfs or some other
>>> image that can be loop-mounted on demand?  sqlite?  perhaps even a format
>>> that can simply be mmaped as a whole?
>>>
>>> personally, I tend to dislike the approach of having a job stage tons of
>>> stuff onto node storage (when it exists) simply because that guarantees a
>>> waste of cpu/gpu/memory resources for however long the stagein takes...
>>>
>>> thanks, mark hahn.
>>> --
>>> operator may differ from spokesperson.  h...@mcmaster.ca
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf


Re: [Beowulf] help for metadata-intensive jobs (imagenet)

2019-06-28 Thread INKozin via Beowulf
Converting the files to TF records or similar would be one obvious approach
if you are concerned about metadata. But then I'd understand why some
people would not want that (size, augmentation process). I assume you are
doing the training in a distributed fashion using MPI via Horovod or
similar and it might be tempting to do file partitioning across the nodes.
However doing so introduces a bias into minibatches (and custom
preprocessing). If you partition carefully by mapping classes to nodes it
may work but I also understand why some wouldn't be totally happy with
that. I've trained keras/TF/horovod models on ImageNet using up to 6 nodes
each with four P100/V100 and it worked reasonably well. As the training
still took a few days, copying to local NVMe disks was a good option.
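(The conversion itself is tiny - a rough sketch against the TF 1.x API; the
feature key names follow the usual Inception convention and your reader has
to match them:)

    import tensorflow as tf

    def convert(file_list, out_path):
        # pack many small JPEGs into one sequential TFRecord file so the
        # filesystem sees a few large reads instead of millions of stats
        with tf.python_io.TFRecordWriter(out_path) as writer:
            for path, label in file_list:
                with open(path, 'rb') as f:
                    img_bytes = f.read()
                example = tf.train.Example(features=tf.train.Features(feature={
                    'image/encoded': tf.train.Feature(
                        bytes_list=tf.train.BytesList(value=[img_bytes])),
                    'image/class/label': tf.train.Feature(
                        int64_list=tf.train.Int64List(value=[label])),
                }))
                writer.write(example.SerializeToString())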
Hth

On Fri, 28 Jun 2019, 18:47 Mark Hahn,  wrote:

> Hi all,
> I wonder if anyone has comments on ways to avoid metadata bottlenecks
> for certain kinds of small-io-intensive jobs.  For instance, ML on
> imagenet,
> which seems to be a massive collection of trivial-sized files.
>
> A good answer is "beef up your MD server, since it helps everyone".
> That's a bit naive, though (no money-trees here.)
>
> How about things like putting the dataset into squashfs or some other
> image that can be loop-mounted on demand?  sqlite?  perhaps even a format
> that can simply be mmaped as a whole?
>
> personally, I tend to dislike the approach of having a job stage tons of
> stuff onto node storage (when it exists) simply because that guarantees a
> waste of cpu/gpu/memory resources for however long the stagein takes...
>
> thanks, mark hahn.
> --
> operator may differ from spokesperson.  h...@mcmaster.ca
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf


Re: [Beowulf] Containers in HPC

2019-05-26 Thread INKozin via Beowulf
For what it's worth, Singularity worked well for me the last time I tried
it. I think it was shortly after NVIDIA had announced support for it.

On Sun, 26 May 2019 at 11:11, Benjamin Redling 
wrote:

> On 23/05/2019 16.13, Loncaric, Josip via Beowulf wrote:
> > "Charliecloud" is a more secure approach to containers in HPC:
>
> I tried Singularity shortly before and during 2.3 with GPUs -- didn't
> work, documented issue, maybe solved. Stopped caring.
>
> Shortly afterwards I read about Charliecloud and tried it -- didn't
> work, too many issues. Stopped caring.
>
> So, "more secure" on paper (fewer lines of code) doesn't get any work done.
> My advice to anyone with a working setup: try it out if time permits,
> but don't bother too much and definitely don't advertise it to third
> parties beforehand.
>
> Regards,
> Benjamin
> --
> FSU Jena | https://JULIELab.de/Staff/Redling/
> ☎  +49 3641 9 44323
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf


[Beowulf] Elastic Fabric Adapter (EFA)

2019-05-07 Thread INKozin via Beowulf
Hello, it looks like AWS wants to catch up with Azure in the HPC context. I
have found some benchmarks describing good scaling on OpenFOAM etc. but no
raw performance metrics. Has anyone tried it yet? What's the MPI latency
like?
The bandwidth should be close to 100 Gbps.
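(The canonical number would come from osu_latency in the OSU
Micro-Benchmarks, but even a naive mpi4py ping-pong gives a first estimate -
a rough sketch, and mpi4py adds overhead, so treat the result as an upper
bound:)

    # run with: mpirun -np 2 python pingpong.py
    from mpi4py import MPI
    import time

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    buf = bytearray(8)      # tiny message: we want latency, not bandwidth
    reps = 10000

    comm.Barrier()
    t0 = time.time()
    for _ in range(reps):
        if rank == 0:
            comm.Send(buf, dest=1)
            comm.Recv(buf, source=1)
        else:
            comm.Recv(buf, source=0)
            comm.Send(buf, dest=0)
    elapsed = time.time() - t0
    if rank == 0:
        # one-way latency is half the average round trip
        print("latency: %.2f us" % (elapsed / reps / 2 * 1e6))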
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf


Re: [Beowulf] live free SGE descendent (for Centos 7)?

2019-03-05 Thread INKozin via Beowulf
Or you can ping Dave Love who I'm sure will be happy to respond.

On Tue, 5 Mar 2019, 18:43 Alex Chekholko via Beowulf, 
wrote:

> Hi David,
>
> What is your goal?
>
> Anecdotally, every cluster I know has switched from SGE to SLURM.  SchedMD
> has an active user list and bug tracker.
>
> If you are willing to spend money, I hear Univa support is excellent.
>
> Regards,
> Alex
>
>
> On Tue, Mar 5, 2019 at 10:39 AM David Mathog  wrote:
>
>> Are any of the free SGE derived projects still alive?  If so, buildable
>> on Centos 7?
>>
>> Son of grid engine, for instance, has not had a release since 2016
>>
>> https://arc.liv.ac.uk/downloads/SGE/releases/8.1.9/
>>
>> and there isn't one for Centos 7.  The last Rocks release still has an
>> SGE
>> option, and that is based on Centos 7, so some version can be built on
>> that platform.  Anybody know off hand which one they used?
>>
>> The Univa version still seems to be kicking, but that is commercial.
>>
>> I have an old version running on one Centos 7 machine, but it was not
>> built there.  It is a 32-bit binary made long ago (Mandriva 2010 or Mageia
>> 3?)  and still uses
>>
>> /etc/rc.d/init.d/sgemaster
>>
>> to start/stop rather than a systemd method.
>>
>> Thanks,
>>
>> David Mathog
>> mat...@caltech.edu
>> Manager, Sequence Analysis Facility, Biology Division, Caltech
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] HPC workflows

2018-12-09 Thread INKozin via Beowulf
While I agree with many points made so far I want to add that one aspect
which used to separate a typical HPC setup from some IT infrastructure is
complexity. And I don't mean technological complexity (because
technologically HPC can be fairly complex) but the diversity and the
interrelationships between various things. Typically HPC is relatively
homogeneous and straightforward. But everything is changing, including HPC,
so modularisation is a natural approach to making systems more manageable;
containers, conda, Kubernetes etc. are solutions to fight complexity. Yes,
these solutions can be fairly complex too but the impact is generally
intentionally restricted. For example, a conda environment can be rather
bloated but then flexibility for size is a reasonable trade-off.
One of the points Werner Vogels, Amazon's CTO, kept coming back to over and
over again in his keynote at the recent re:Invent is modular (cellular)
architecture at different levels (lambdas, Firecracker, containers, VMs and
up) because working with redundant, replaceable modules makes services
scalable and resilient.
And I'm pretty sure the industry will continue on its path to embrace
microVMs as it did containers before that.
This modular approach may work quite well for on-prem IT, cloud or HTC
(High Throughput Computing) but may still be a challenge for HPC because
you can argue that a true HPC system must be tightly coupled (e.g. remember
OS jitter?).
As for ML and more specifically deep learning, it depends on what you do.
If you are doing inferencing, i.e. a production setup, i.e. more like HTC,
then everything works fine. But if you want to train a model on ImageNet or
larger and do it very quickly (hours) then you will benefit from a tightly
coupled setup (although there are tricks such as asynchronous parameter
updates to alleviate latency).
Two cases in point here: Kubeflow, whose scaling seems somewhat deficient,
and the Horovod library, which made many people rather excited because it
allows using Tensorflow and MPI.
While Docker and Singularity can be used with MPI, you'd probably want to
trim as much as you can if you want to push the scaling limit. But I think
we've already discussed many times on this list the topic of "heroic" HPC
vs "democratic" HPC (top vs tail).

Just one last thing regarding GPUs in the cloud. Last time I checked, even
the spot instances were so expensive you'd be much better off buying your
own even if only for a month. Obviously only if you have a place to host
them. And obviously in your DC you can use a decent network for faster
training.
As for ML services provided by AWS and others, my experience is rather
limited. I helped one of our students with an ML service on AWS. Initially
he was excited that he could just throw his data set at it and get something
out. Alas, he quickly found out that he needed to do quite a bit more, so
back to our HPC. Perhaps AutoML will be significantly improved in the
coming years but for now just expecting to get something good without an
effort is probably premature.


On Sun, 9 Dec 2018 at 15:26, Gerald Henriksen  wrote:

> On Fri, 7 Dec 2018 16:19:30 +0100, you wrote:
>
> >Perhaps for another thread:
> >Actually I went to the AWS User Group in the UK on Wednesday. Very
> >impressive, and there are the new Lustre filesystems and MPI networking.
> >I guess the HPC World will see the same philosophy of building your setup
> >using the AWS toolkit as Uber etc. etc. do today.
> >Also a lot of noise is being made at the moment about the convergence of
> >HPC and Machine Learning workloads.
> >Are we going to see the Machine Learning folks adapting their workflows to
> >run on HPC on-premise bare metal clusters?
> >Or are we going to see them go off and use AWS (Azure, Google ?)
>
> I suspect that ML will not go for on-premise for a number of reasons.
>
> First, ignoring cost, companies like Google, Amazon and Microsoft are
> very good at ML because not only are they driving the research but
> they need it for their business.  So they have the in house expertise
> not only to implement cloud systems that are ideal for ML, but to
> implement custom hardware - see Google's Tensor Processor Unit.
>
> Second, setting up a new cluster isn't going to be easy.  Finding
> physical space, making sure enough utilities can be supplied to
> support the hardware, staffing up, etc.  are not only going to be
> difficult but inherently takes time when instead you can simply sign
> up to a cloud provider and have the project running within 24 hours.
> Would HPC exist today as we know it if the ability to instantly turn
> on a cluster existed at the beginning?
>
> Third, albeit this is very speculative.  I suspect ML is
> heading towards using custom hardware.  It has had a very good run
> using GPUs, and a GPU will likely always be the entry point for
> desktop ML, but unless Nvidia is holding back due to a lack of
> competition it does appear the GPU is reaching an end to its
> development much like CPUs have.

Re: [Beowulf] HPC workflows

2018-11-28 Thread INKozin via Beowulf
On Wed, 28 Nov 2018 at 11:33, Bogdan Costescu  wrote:

> On Mon, Nov 26, 2018 at 4:27 PM John Hearns via Beowulf <
> beowulf@beowulf.org> wrote:
>
>> I have come across this question in a few locations. Being specific, I am
>> a fan of the Julia language. On the Julia forum a respected developer
>> recently asked what the options were for keeping code developed on a laptop
>> in sync with code being deployed on an HPC system.
>>
>


> I think out loud that many HPC codes depend crucially on a $HOME directory
>> being present on the compute nodes as the codes look for dot files etc. in
>> $HOME. I guess this can be dealt with by fake $HOMES which again sync back
>> to the Repo.
>>
>
> I don't follow you here... $HOME, dot files, repo, syncing back? And why
> "Repo" with capital letter, is it supposed to be a name or something
> special?
>

I think John is talking here about doing version control on whole HOME
directories but trying to be mindful of dot files such as .bashrc and
others which can be application or system specific. The first thing which
comes to mind is to use branches for different cluster systems. However
this also taps into backup (which is another important topic since HOME
dirs are not necessarily backed up). There could be a working solution
which makes use of recursive repos and git lfs support but pruning old
history could still be desirable. Git would minimize the amount of storage
because it's hash based. While this could make it possible to replicate
your environment "wherever you go", a/ you would drag a lot history around
and b/ a significantly different mindset is required to manage the whole
thing. A typical HPC user may know git clone but generally is not a git
adept. Developers are different and, who knows John, maybe someone will
pick up your idea.

Is gitfs any popular?
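(The branch-per-cluster idea is a one-minute experiment - a toy sketch with
GitPython, my choice rather than John's, and plain git commands do the same;
the paths are made up and the dot files must already exist:)

    from git import Repo

    # one dotfiles repo, one branch per cluster system
    repo = Repo.init("/home/igor/dotfiles")
    repo.index.add([".bashrc", ".ssh/config"])   # the common baseline
    repo.index.commit("common baseline")
    for cluster in ("clusterA", "clusterB"):
        # per-system tweaks then diverge on their own branch
        repo.create_head(cluster)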

> In my HPC universe, people actually not only need code, but also data -
> usually LOTS of data. Replicating the code (for scripting languages) or the
> binaries (for compiled stuff) would be trivial, replicating the data would
> not. Also pulling the data in or pushing it out (f.e. to/from AWS) on the
> fly whenever the instance is brought up would be slow and costly. And by
> the way this is in no way a new idea - queueing systems have for a long
> time the concept of "pre" and "post" job stages, which could be used to
> pull in code and/or data to the node(s) on which the job would be running
> and clean up afterwards.
>
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Oh.. IBM eats Red Hat

2018-10-30 Thread INKozin via Beowulf
Will Red Hat come out Blue Hat after IBM's blue-washing?

On Tue, 30 Oct 2018 at 16:21,  wrote:

> Cringely has some interesting observations...
>
> "The deal is a good fit for many reasons explained below. And remember Red
> Hat is just down the road from IBM’s huge operation in Raleigh, NC.
>
> "Will Amazon, Google, and Microsoft now run out and buy SUSE, Ubuntu,
> Apache, etc?  Yes.
>
> "Will there be a mad rush to create new Linux distros? No. I think that
> boat has already sailed and further Linux branding won’t happen, at least
> not for traditional business reasons.
>
> [SNIP]
>
> "These big questions have yet to be answered, of course. Only time will
> tell. But we’ll shortly begin to see hints. What happens to Red Hat
> management, for example? There are those who think Red Hat will, in many
> ways, become the surviving corporate culture here — that is if Red Hat’s
> Jim Whitehurst gets Ginni Rometty’s IBM CEO job as part of the deal. That’s
> what I am predicting will happen. Ginni is overdue for retirement, this
> acquisition will not only qualify her for a huge retirement package, it
> will do so in a way that won’t be clearly successful or unsuccessful for
> years to come, so no clawbacks. And yet the market will (eventually) love
> it, IBM shares will soar, and Ginni will depart looking like a genius.
>
> [SNIP]
>
> "In the end the C-suite of IBM may be finally admitting to themselves what
> you and I have known for several years — that their strategic imperatives
> are not doing as well as they promised.  They also know they’ve invested
> way too much in stock repurchases and way too little in the business.  So
> with this Red Hat deal they’ve basically bet the farm to get themselves
> back in the game.
>
> "With Whitehurst at the top of IBM, the company will not only have an
> outsider like Gerstner was, it will have its first CEO ever who won’t be
> coming with a sales background. This is very good, because IBM will have a
> technical leader finally running the show.
>
> "Let’s review:
>
> "Ginni Rometty is past the age where IBM likes to retire CEO’s, which is
> 60.
>
> "Jim Whitehurst is 51, the age when IBM likes to hire new CEO’s.
>
> "I don’t see Whitehurst moving to Armonk, I do see IBM moving to Raleigh.
>
> "I do see Whitehurst as CEO of IBM in six months or less.
>
> "The Red Hat team will expand their products into new areas. IBM
> executives will retire in droves because they can’t compete and will resist
> learning something new.
>
>
> *Red Hat takes over IBM*
> https://www.cringely.com/2018/10/29/red-hat-takes-over-ibm/
>
>
>
> Chuck Petras, PE**
> Schweitzer Engineering Laboratories, Inc
> Pullman, WA  99163  USA
> http://www.selinc.com
>
> SEL Synchrophasors - A New View of the Power System
> <http://synchrophasor.selinc.com>
>
> Making Electric Power Safer, More Reliable, and More Economical (R)
>
> ** Registered in Oregon.
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Oh.. IBM eats Red Hat

2018-10-29 Thread INKozin via Beowulf
Oh yes, and forget being able to find anything ever unless the pages are
externally accessible and indexed by Google.

On Mon, 29 Oct 2018 at 17:06, John Hearns via Beowulf 
wrote:

> I just realised...  I will now need an account on the IBM Support Site, a
> SiteID AND an Entitlement to file bugs on any Redhat packages.
>
> For those who don't know the system - every site (University, company,
> Laboratory etc) has a SiteID number.
> You had better know that number - and if someone leaves or retires you had
> BETTER get than number from them.
> (I handled a support case once where a customer had someone retire - and
> not pass on the site ID- we had to get a high up in IBM UK invoplved);.
>
> One person on site then has the ability to allow others on the site to
> open support issues.
> You just cannot decide to open a support issue - you must have the rights
> to ask for support for that product.
>
>
>
>
>
>
>
> On Mon, 29 Oct 2018 at 16:55, Joe Landman  wrote:
>
>>
>> On 10/29/18 12:44 PM, David Mathog wrote:
>>
>> [...]
>>
>> > It turns out that getting up to date compilers and libraries has become
>> >> quite important for those working on large distributed code bases.
>> >
>> > Libraries are harder.  Try to build a newer one than ships with CentOS
>> > and it is not uncommon to end up having to build many other libraries
>> > (recursive dependencies) or to hit a brick wall when a kernel
>> > dependency surfaces.
>>
>>
>> This was my point about building things in a different tree.  I do this
>> with tools I use in https://github.com/joelandman/nlytiq-base , which
>> gives me a consistent set of tools regardless of the platform.
>>
>> Unfortunately, some of the software integrates Conda, which makes it
>> actually harder to integrate what you need.  Julia, for all its
>> benefits, is actually hard to build packages for such that they don't
>> use Conda.
>>
>>
>> > In biology apps of late there is a distressing tendency for software
>> > to only be supported in a distribution form which is essentially an
>> > entire OS worth of libraries packaged with the one (often very small)
>> > program I actually want to run.  (See "bioconda".)  Most of these
>> > programs will build just fine from source even on CentOS 6, but often
>> > the only way to download a binary for them is to accept an additional
>> > 1Gb (or more) of other stuff.
>>
>>
>> Yeah, this has become common across many fields.  Containers become the
>> new binaries, so you don't have to live with/accept the platform based
>> restrictions.  This was another point of mine.  And Greg K @Sylabs is
>> getting free exposure here :D
>>
>>
>> --
>> Joe Landman
>> e: joe.land...@gmail.com
>> t: @hpcjoe
>> w: https://scalability.org
>> g: https://github.com/joelandman
>> l: https://www.linkedin.com/in/joelandman
>>
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Oh.. IBM eats Red Hat

2018-10-29 Thread INKozin via Beowulf
Oh, but RH's role is so much more nowadays than just a paid-for
distribution - hence the acquisition, which is not about that. But the
whole ecosystem can suffer as a result.

On Mon, 29 Oct 2018 at 15:29, Prentice Bisbal via Beowulf <
beowulf@beowulf.org> wrote:

>
> On 10/29/2018 06:54 AM, INKozin via Beowulf wrote:
> >
> > what would be an alternative to RH?
>
> Ubuntu
> ___
> Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
>
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Oh.. IBM eats Red Hat

2018-10-29 Thread INKozin via Beowulf
Likewise!
The topic proves to be irresistible. Keep them coming

On Mon, 29 Oct 2018 at 15:10, Joe Landman  wrote:

>
> On 10/29/18 11:04 AM, Robert G. Brown wrote:
> > On Mon, 29 Oct 2018, Tony Brian Albers wrote:
> >
> >> I've worked for Big Blue, and I'm not sure the company cultures are
> >> compatible to say the least.
> >
> > I think it will be all right (and yes, look, I'm alive, I'm alive!).
>
>
> Glad to see that!
>
>
> [...]
>
>
> > Robert G. Brown   http://www.phy.duke.edu/~rgb/
> > Duke University Dept. of Physics, Box 90305
> > Durham, N.C. 27708-0305
> > Phone: 1-919-660-2567  Fax: 919-660-2525 email: r...@phy.duke.edu
> --
>
> Joe Landman
> e: joe.land...@gmail.com
> t: @hpcjoe
> w: https://scalability.org
> g: https://github.com/joelandman
> l: https://www.linkedin.com/in/joelandman
>
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Oh.. IBM eats Red Hat

2018-10-29 Thread INKozin via Beowulf
Exactly my thoughts (even though I have not worked there, talking to its
employees was enough).
Its attitude towards open source is not exactly promising.
The recent GitHub deal comes to mind, but at least MS is declaring itself
to be more open towards open source.
And at least there is an alternative in that case - GitLab.
What would be an alternative to RH? Certainly not a single one.

On Mon, 29 Oct 2018 at 07:43, Tony Brian Albers  wrote:

> https://www.reuters.com/article/us-red-hat-m-a-ibm/ibm-to-acquire-software-company-red-hat-for-34-billion-idUSKCN1N20N3
>
> I wonder where that places us in the not too distant future..
>
> I've worked for Big Blue, and I'm not sure the company cultures are
> compatible to say the least.
>
> /tony
>
> --
> --
> Tony Albers
> Systems Architect
> Systems Director, National Cultural Heritage Cluster
> Royal Danish Library, Victor Albecks Vej 1, 8000 Aarhus C, Denmark.
> Tel: +45 2566 2383 / +45 8946 2316
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


[Beowulf] singularity configuration

2018-03-15 Thread INKozin via Beowulf
Hello,
Singularity has been mentioned a few times on this list. Now on
version 2.4.2, it feels mature for production and converting a custom
Docker image into a Singularity image is very easy. The defaults seem
sensible although I didn't have time to explore everything in detail.
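("Very easy" in practice means a single pull with the 2.x CLI - a rough
sketch, driven via subprocess, and the image name is arbitrary:)

    import subprocess

    # pull a Docker image and convert it to a Singularity image (2.x syntax);
    # produces e.g. ubuntu-16.04.simg in the current directory
    subprocess.check_call(["singularity", "pull", "docker://ubuntu:16.04"])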
Are there potential exploits we should be aware of before making it
generally available on our clusters?
Do people significantly change the default configuration?
Thank you
Igor
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] nVidia revealed as evil

2018-01-06 Thread INKozin via Beowulf
Indeed, it was the reverse until recently, but P100 is compute
capability 6.0, the latest GTX cards are 6.1, and V100 is already 7.0.
Performance-wise P100 is reasonably close to commodity cards on the
Tensorflow benchmarks (especially on 1 or 2 cards).
I don't have figures at hand but currently V100 is not too far from
them either as far as Deep Learning tasks are concerned.
This may change when TF moves to CUDA 9 and cuDNN 7 in the next release.
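(If you want to check what a card reports, a minimal sketch assuming numba
is installed; deviceQuery from the CUDA samples tells you the same:)

    from numba import cuda

    dev = cuda.get_current_device()
    # compute_capability is a (major, minor) tuple,
    # e.g. (6, 0) for P100, (6, 1) for the latest GTX cards, (7, 0) for V100
    print(dev.name, dev.compute_capability)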

On 6 January 2018 at 00:43, Gerald Henriksen  wrote:
> On Fri, 5 Jan 2018 22:52:19 +, you wrote:
>
>>There has been a conversation going on on the AMBER mailing list for some 
>>time, related to this and specifically to the Volta card in some way, since 
>>AMBER performs best on the consumer grade stuff and doesn’t require the 
>>enterprise class features (I guess the reason for the performance difference 
>>is that the next gen consumer cards come out first?).
>
> Actually, the consumer cards (at least for the last generation or so)
> have been released last.
>
> Volta has not yet been released in a consumer form - the cheapest
> available so far is the Titan V (direct from Nvidia only) at $3k I
> believe.  Speculation is that consumer Volta may come out this year,
> but there is no pressure on Nvidia given AMD's troubles in the GPU
> market.
>
> The big difference is that of course most of the work is being done on
> consumer hardware because that's what the developers and researchers
> can afford.  Secondarily, the new feature Volta offers is still too
> recent to be properly supported (and some software may get no benefit
> from it) because affordable hardware isn't yet available to allow
> developer access to the new Tensor cores that Volta offers.
>
>>Anyhow, I spoke to an NVIDIA rep about it at SC17 and he kind of said “we are 
>>happy you’re buying whatever chip of ours). I said, sure, maybe, but I 
>>understand you’re putting the screws to the systems vendors so how are we 
>>supposed to buy them. Didn’t get a real concrete answer.
>
> I would think if Nvidia isn't careful this could provide an opening
> for AMD.
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf