Re: VCL Expansion

2009-09-10 Thread Aaron Peeler

Hi,

I believe this would be very valuable. The SuraGrid community
(http://www.sura.org/programs/sura_grid.html) has expressed interest in
integrating Apache VCL with Globus and Condor, among other things.


Having something like what you describe would definitely be beneficial.

Aaron


--On September 11, 2009 1:58:19 AM +1000 Sengor seng...@gmail.com wrote:


Hi all,

I'm contemplating integrating VCL with the high-throughput computing
meta-scheduling application Nimrod/G
(http://messagelab.monash.edu.au/NimrodG) as part of an upcoming
university research project.

What I've currently got in mind is:
- Expansion of Nimrod/G's compatibility with VCL (via APIs)
- Expansion of VCL's management server module capabilities to handle
Solaris 10 Zones and/or Solaris provisioning in general

If anyone has further ideas about what the VCL project itself would
benefit from, or general suggestions/comments, perhaps they can be
folded back into this project.

Thank you.

--
sengork




Aaron Peeler
OIT Advanced Computing
College of Engineering-NCSU
919.513.4571
http://vcl.ncsu.edu


Re: VCL scaleup limits

2009-09-10 Thread Brian Bouterse
Thanks for the scalability info.  I had one more question: does the  
vcld component benefit from multiple cores?  In other words, is the  
vcld component multithreaded?


Thanks!
Brian


Brian Bouterse
NEXT Services
919.698.8796

On Sep 9, 2009, at 12:51 PM, Aaron Peeler wrote:



Good question.

It's hard to put an exact number on it because of the different
types of resources that can be made available (VM, bare-metal, lab
machines), but we should be able to get close in theory.


Correct - multiple management nodes are an important part of
scaling. So the first question is how many resources a single
management node can support.


Additional things to consider are how the nodes get provisioned and
the usage profile (asynchronous or synchronous). This also assumes
one has a robust web server, database, networking, storage and
management nodes. The network, storage and management nodes are
a big factor.



Provisioning options -
bare-metal vs. hypervised vs. stand-alone lab machines

*bare-metal - installing an image to disk using xCAT:
typically we have seen anywhere from 150 to 200 blades per
management node for asynchronous use.


*hypervisor - using VMware free server vs. ESX/ESXi, in either
persistent or non-persistent mode.


VMware ESX with network datastores for the VMs, running in
non-persistent mode - one could probably get 1000+ VMs per management
node, maybe more. Again, this assumes fast access/networking to
storage (10G ethernet or fibre), a nice storage array and running
20+ VMs per ESX server.


VMware free server - typically 5-10 VMs per server

*stand-alone machines:
At NCSU we also use traditional lab machines when the university
labs are closed. This mode only brokers remote access to nodes, so
no image loading takes place. One could probably get 700+
machines per management node.
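
The rough figures above can be turned into a back-of-the-envelope
sizing estimate. The following is a minimal Python sketch, not part of
VCL: the function and constant names are hypothetical, and the
capacities are simply the low ends of the ranges quoted in this
message (educated guesses, not measured limits).

```python
import math

# Rough per-management-node capacities taken from the message above.
# Assumption: we use the low end of each quoted range to stay conservative.
PER_NODE_CAPACITY = {
    "bare_metal_blades": 150,  # "150-200 blades" via xCAT, asynchronous use
    "esx_vms": 1000,           # "1000+ vms" on ESX, non-persistent mode
    "lab_machines": 700,       # "700+ machines", remote access brokering only
}

# "20+ vms per esx server" from the message above.
VMS_PER_ESX_HOST = 20


def management_nodes_needed(resource_type, count):
    """Estimate how many management nodes the rough figures imply."""
    if count < 0:
        raise ValueError("count must be non-negative")
    return math.ceil(count / PER_NODE_CAPACITY[resource_type])


def esx_hosts_needed(vm_count):
    """Estimate how many ESX servers are needed at 20 VMs per server."""
    if vm_count < 0:
        raise ValueError("vm_count must be non-negative")
    return math.ceil(vm_count / VMS_PER_ESX_HOST)


if __name__ == "__main__":
    for kind, count in [("bare_metal_blades", 400),
                        ("esx_vms", 2500),
                        ("lab_machines", 700)]:
        print(kind, count, "->", management_nodes_needed(kind, count),
              "management node(s)")
    print("ESX servers for 1000 VMs:", esx_hosts_needed(1000))
```

These estimates only cover the asynchronous case described for
bare-metal; a synchronous block allocation would likely need more
headroom than this simple division suggests.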


Usage Profile: asynchronous vs synchronous
synchronous usage - the most demanding kind: block
allocations/provisioning for a class, workshop or some other event
that needs many nodes at the same time.


asynchronous usage - users independently request nodes at any given
time. This is spread out over time and puts a lower provisioning load
on the management node(s).



Since we (NCSU) don't have the infrastructure to confirm these
numbers, this is just an educated guess based on past experience with
what we do have. I would feel comfortable saying that with multiple
management nodes and an ideal hardware setup (beefy multi-core,
extended-memory blades, high-end storage, fat pipes, etc.), VCL could
conservatively support several thousand nodes.


Aaron


--On September 10, 2009 12:20:39 AM +1000 Sengor seng...@gmail.com wrote:



Hi Brian,

I'm not certain of the exact numbers, however I do believe support
for multiple management nodes (vcld's) is an intentional scale-out
approach.


Perhaps some of the guys @ NCSU know this one; I believe their VCL
instance is currently the largest one in production.


On Wed, Sep 9, 2009 at 11:55 PM, Brian Bouterse bmbou...@gmail.com wrote:


Hi All,

I'm wondering what the scalability limits of VCL are.  What are the
limits of the vcld component, and why do we believe they exist?  Does
the frontend have any scalability limits?

I'm trying to figure out what a reasonable number of VMs is for a
single vcld installation to support concurrently.  What do you think
and why?

Best,
Brian







--
sengork



