Re: [Beowulf] Six card GPU node.

2016-09-18 Thread Christopher Samuel
On 14/09/16 05:03, Darren Wise wrote:

> By all means I would love to employ a 4U node decked out with Xeon Phi
> cards, in fact a complete rack full of them, but the cheapest I can get
> my hands on a Knights Landing card is coming in at over £500, and I
> reckon these six GPU cards will be much faster and also easier to work
> with. It's nice that the Xeon Phi cards have an inbuilt Linux OS
> subsystem, but it's a bit of a bummer sometimes, and to really eke out
> the extra horsepower you need some serious code tailoring to get you
> there.
> 
> I find that in itself is the major drawback for folks buying into the
> Xeon Phi family for coprocessing or offloading needs.

It's worth keeping in mind that while Knights Corner was effectively a
Linux system on a PCIe board, the Knights Landing nodes being built now
are nodes in their own right: fully self-hosted systems with the Knights
Landing CPU on the motherboard itself.

> I do hope someone comes along with an idea I had many years ago to use
> SoC technology. I even have a 20x20mm ARM big.LITTLE SoC sat on my desk
> with 64 inbuilt ALU/crypto/GPU cores, and I think an array of these on
> a single PCI-E card would be a real game changer in the coprocessing
> market. (anyone want to invest in my PCB? {shameless plug})

:-)

> Anyway, my home lab Beowulf cluster experience and experiments are
> doing well regardless of the hiccups and the time wasted on some areas.
> It's all going nicely indeed :)

Great to hear!

> I wonder if everyone would like to post links to the software they use
> with regard to Beowulf clustering, so that I can catalogue all these
> links and put them in an HTML document on my company server; it might
> help us all out now or in the future :D

OK, some quick samples from what we use:

Slurm (job scheduling): http://slurm.schedmd.com/
xCAT (cluster management): http://xcat.org/
EasyBuild: https://hpcugent.github.io/easybuild/
Open-MPI (our preferred MPI stack): https://www.open-mpi.org/
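
(For illustration, a minimal sketch of how a couple of those pieces
typically fit together on a GPU node like the one described further down
the thread: each MPI rank, launched with Open MPI's mpirun or with srun
under Slurm, reports its host name and how many CUDA devices it can see.
The build and launch commands in the comments are common conventions, not
something taken from this thread.)

// mpi_gpu_hello.cu -- each MPI rank reports its host and visible GPUs.
// Assumes Open MPI and the CUDA runtime are installed. Build e.g. with:
//   nvcc -ccbin mpicxx mpi_gpu_hello.cu -o mpi_gpu_hello
// and launch e.g. with:
//   mpirun -np 8 ./mpi_gpu_hello
#include <cstdio>
#include <mpi.h>
#include <cuda_runtime.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank = 0, size = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    char host[MPI_MAX_PROCESSOR_NAME];
    int len = 0;
    MPI_Get_processor_name(host, &len);

    int ndev = 0;
    if (cudaGetDeviceCount(&ndev) != cudaSuccess)
        ndev = 0;   // no usable CUDA driver/devices on this node

    printf("rank %d of %d on %s sees %d CUDA device(s)\n",
           rank, size, host, ndev);

    MPI_Finalize();
    return 0;
}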

All the best.
Chris
-- 
 Christopher Samuel        Senior Systems Administrator
 VLSCI - Victorian Life Sciences Computation Initiative
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
 http://www.vlsci.org.au/  http://twitter.com/vlsci
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Six card GPU node.

2016-09-14 Thread John Hearns
Darren,
You have a standing invitation to come in and see us, if you are ever down 
South.

Talking about PCIe extenders, we still have one of the original Nvidia GPU
supercomputer chassis in store, the one where external PCIe extender cables
were used to link to four (?) GPU cards. I have never seen it fired up!

Good plan on buying up GPUs from Computer Exchange though.


From: Beowulf [mailto:beowulf-boun...@beowulf.org] On Behalf Of Darren Wise
Sent: 13 September 2016 21:03
To: Beowulf@beowulf.org
Subject: [Beowulf] Six card GPU node.

Heya guys,

Well, my Beowulf is still building, with 6 nodes and a master server
containing a total of 56 2.10 GHz AMD 2373 EE cores (dual-socket,
quad-core), and I'm working this up to around 12 nodes and a master, for a
total of 218.4 GHz of aggregate CPU (13 machines x 8 cores x 2.1 GHz).

I've run into a right issue. I wanted to add some GPU hardware, just as a
little boost, using an old GTX 570 in the spare PCI-E x16 slot within each
node, but sadly this has given me some serious heartburn. I already knew
the card itself would not even entertain being seated in a 1U node, but
PCI-E ribbon extenders might have solved that.

Even more sadly, haha, even though the motherboard accepts the PCI-E 3.0
standard and the GTX 570 is fine with PCI-E 2.0 (and they're mostly
backwards compatible), it would not even get past the BIOS; in fact I
reckon it was halting even prior to that, no matter how much juice I fed
the card or which BIOS settings I tried.

So, neglecting all of this, I took some old skills and knowledge from the
Bitcoin days, purchased an ASRock H81 Pro board (6 PCI-E slots, see), and
I'm steadily collecting GTX 570 cards online from CEX stores around the
country to build a single GPU node instead.

Now I know what you're going to say: why use these old, boring GTX 570
cards with only 480 CUDA cores each? Firstly, each card is only £35, with
a two-year warranty to boot. So for £210 I can get myself 2,880 CUDA
cores, and it will give me time to save up to upgrade these to GTX 1080
cards as the price falls over the course of another year.
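
(For illustration, a minimal sketch of the sort of device query a
six-card node like this allows: it enumerates the installed GPUs via the
CUDA runtime and estimates CUDA core counts. The 32 cores-per-SM figure is
an assumption that holds for Fermi-class cards such as the GTX 570; other
generations use different factors.)

// gpu_query.cu -- enumerate the GPUs in the node and estimate core counts.
// Build e.g. with: nvcc gpu_query.cu -o gpu_query
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    int ndev = 0;
    if (cudaGetDeviceCount(&ndev) != cudaSuccess || ndev == 0) {
        fprintf(stderr, "no CUDA devices found\n");
        return 1;
    }

    int total_cores = 0;
    for (int i = 0; i < ndev; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);

        int cores_per_sm = 32;   // assumption: Fermi (compute 2.x) parts
        int cores = prop.multiProcessorCount * cores_per_sm;
        total_cores += cores;

        printf("GPU %d: %s, cc %d.%d, %d SMs (~%d CUDA cores), %.1f GiB\n",
               i, prop.name, prop.major, prop.minor,
               prop.multiProcessorCount, cores,
               prop.totalGlobalMem / (1024.0 * 1024.0 * 1024.0));
    }
    printf("total: %d GPU(s), ~%d CUDA cores\n", ndev, total_cores);
    return 0;
}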

It will also help me get used to managing a multi-card GPU cluster node,
albeit with second-hand parts, but you have to start somewhere, don't you?
It also puts a bit more life into some older, cheap, used cards!

By all means I would love to employ a 4U node decked out with Xeon Phi
cards, in fact a complete rack full of them, but the cheapest I can get my
hands on a Knights Landing card is coming in at over £500, and I reckon
these six GPU cards will be much faster and also easier to work with. It's
nice that the Xeon Phi cards have an inbuilt Linux OS subsystem, but it's
a bit of a bummer sometimes, and to really eke out the extra horsepower
you need some serious code tailoring to get you there.

I find that in itself is the major drawback for folks buying into the Xeon
Phi family for coprocessing or offloading needs.

I do hope someone comes along with an idea I had many years ago to use SoC
technology. I even have a 20x20mm ARM big.LITTLE SoC sat on my desk with
64 inbuilt ALU/crypto/GPU cores, and I think an array of these on a single
PCI-E card would be a real game changer in the coprocessing market.
(anyone want to invest in my PCB? {shameless plug})

Anyway, my home lab Beowulf cluster experience and experiments are doing
well regardless of the hiccups and the time wasted on some areas. It's all
going nicely indeed :)


I wonder if everyone would like to post links to the software they use
with regard to Beowulf clustering, so that I can catalogue all these links
and put them in an HTML document on my company server; it might help us
all out now or in the future :D


> Kind regards,
> Darren Wise Esq,
> B.Sc, HND, GNVQ, City & Guilds.
>
> Managing Director (MD)
> Art Director (AD)
> Chief Architect/Analyst (CA/A)
> Chief Technical Officer (CTO)
>
> www.wisecorp.co.uk
> www.wisecorp.co.uk/babywise
> www.darrenwise.co.uk
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf