On Thu, Apr 10, 2014 at 01:44:30AM -0400, Mark Hahn wrote:
> >I'm considering proof of concept Beowulf cluster build for machine
> >learning purposes.
>
> you can't go wrong using cheap/PC/commodity parts.  you'll also get the
> easiest access to tools/distros/etc.
>
I'm concerned about cluster size; I would like to keep it as small as
possible. Probably some Mini/Nano-ITX board would be good enough to beat
the Jetson TK1. I wonder about the price of the whole setup and how it
compares with the Jetson.

> >In short I need as good as possible double precision matrix
> >multiplication performance with small power consumption and size.
>
> TK1 appears to be SP-oriented (not surprisingly).  it's a little unclear
> what its power dissipation is - I'd guess something in the 20W range for
> linpack.
>
> >Taking matrix multiplication into consideration I thought that GPU is
> >natural choice.
>
> well, maybe.  you always save power by operating more units at lower clock,
> and GPU tends to embrace this approach.  it's not like GPUs have some
> magically more efficient circuits otherwise.  but it's probably worth
> looking at the gpu-linpack performance/watt from AMD's APU options.  (though
> they contain higher-performance CPU and memory support than TK1.)
>

Very good point! Following your AMD APU advice I found this article:
http://www.anandtech.com/show/7711/floating-point-peak-performance-of-kaveri-and-other-recent-amd-and-intel-chips

I will try to rethink my configuration using an AMD APU + Mini/Nano-ITX
board and see if I can get a better result in terms of the
performance/price ratio.

> >curious about your professional opinion on this build.
>
> my professional opinion is that when people use the phrase "build"
> as a noun, they're coming from the PC/gamer world ;)
>
> sorry! :)

More PC than gamer; maybe my English is not good enough.

> >Questions that already came to my mind:
> >1. What are the most used diagnostic software for keeping cluster up and
> >running.
>
> what failure modes are you thinking about?  I use IPMI on my clusters,
> and wouldn't build a cluster without it.
>

I mean board power-on failures, bad blocks, overheating and other
hardware issues. I don't know of any development board with a BMC; AFAIK
this is typically a server component.
I agree that remote management ability is very important.

> >4. Theoretical max for this platform is 326 SP GFLOPS, I was able to
> >confirm that DP/SP ratio is 1/24 so theoretical max for DP is 13 GFLOPS.
> >Can someone elaborate or point me to documentation how hard will be to
> >utilize this power assuming CUDA and MPI usage.
>
> "utilize"?  it's pretty low flops, so the onboard 2G will be plenty
> to keep it busy.  otoh, the memory is only 64b wide (no mention of memory
> clock I've seen), so probably fairly low-bandwidth.
>

The spec lists the memory as DDR3L FBGA96, 256Mbit x 16, 933MHz, Hynix
H5TC4G63AFR-RDA.

> >I'm open to any suggestions, even if it means changing everything in
> >this build :)
>
> IMO, you can learn everything you need to learn from 4-8 low-end PCs.
> there are certainly power differences versus an arm+low-end-gpu board
> like this, but since this device delivers pretty much token gflops,
> you might consider just using a raspberry pi or beaglebone if you have your
> heart set on avoiding the PC market.

I considered RPi and BeagleBone. I measured performance on the RPi and
got 68 DP MFLOPS after overclocking. There is the untapped performance of
the VideoCore IV GPU (24 SP GFLOPS), but there is no C compiler for it
(only reverse-engineered assembly). BeagleBone MX seems to deliver about
50-60 MFLOPS according to this:
http://www.vesperix.com/arm/atlas-arm/bench/gcc-a8/index.html

So these boards are not comparable with the Jetson. I will take a look at
the Mini/Nano-ITX PC market.

I appreciate your reply, Mark. Thanks.

Regards,
Piotr Król
_______________________________________________
Beowulf mailing list, [email protected] sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
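
P.S. As a rough sanity check on the Jetson TK1 figures discussed in this thread, here is a small sketch that derives the theoretical DP peak and a peak memory bandwidth estimate. The 326 SP GFLOPS, the 1/24 DP/SP ratio, the 64-bit bus and the 933MHz clock come from the messages above. Treating 933MHz as the I/O clock with two transfers per clock (double data rate) is my assumption, and the function names are just illustrative.

```python
# Back-of-the-envelope peak figures for the Jetson TK1 numbers in this thread.
SP_PEAK_GFLOPS = 326.0     # theoretical SP peak, quoted in the thread
DP_SP_RATIO = 1.0 / 24.0   # DP/SP ratio, quoted in the thread
BUS_WIDTH_BITS = 64        # memory bus width, quoted in the thread
MEM_CLOCK_MHZ = 933.0      # DDR3L clock, quoted in the thread
TRANSFERS_PER_CLOCK = 2    # ASSUMPTION: 933 MHz is the I/O clock, double data rate


def dp_peak_gflops(sp_peak=SP_PEAK_GFLOPS, ratio=DP_SP_RATIO):
    """Theoretical double precision peak in GFLOPS."""
    return sp_peak * ratio


def mem_bandwidth_gbs(bus_bits=BUS_WIDTH_BITS,
                      clock_mhz=MEM_CLOCK_MHZ,
                      per_clock=TRANSFERS_PER_CLOCK):
    """Theoretical peak memory bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_transfer = bus_bits / 8
    return bytes_per_transfer * clock_mhz * 1e6 * per_clock / 1e9


def machine_balance(dp_gflops, bw_gbs):
    """DP flops the chip can issue per byte moved from memory at peak."""
    return dp_gflops / bw_gbs


if __name__ == "__main__":
    dp = dp_peak_gflops()
    bw = mem_bandwidth_gbs()
    print(f"DP peak:          {dp:.1f} GFLOPS")
    print(f"memory bandwidth: {bw:.1f} GB/s")
    print(f"machine balance:  {machine_balance(dp, bw):.2f} flop/byte")
```

Under these assumptions the DP peak is about 13.6 GFLOPS and the bandwidth about 14.9 GB/s, i.e. a bit less than one DP flop per byte. A blocked matrix multiplication reuses each loaded element many times, so on paper it should be limited by the low DP rate rather than by memory, which fits Mark's remark about keeping the device busy. Real numbers will of course be lower than these peaks.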
