CS Faculty Candidate Colloquium

 

Monday                      **Special Time and Location**
March 10
10:45 - 11:50 AM 
Kelley 1007

 

Dorian Arnold 
EECS Colloquium: Computer Science Faculty Candidate
Doctoral Candidate and Intel Foundation Ph.D. fellow
Computer Sciences Department, University of Wisconsin

Tree-based Overlay Networks for Scalable, Reliable Tools and
Applications 

HPC systems continue to grow in size and complexity, making the
development of scalable software systems increasingly difficult. As a
result, very few tools and applications run effectively, or at all, at
today's largest scales (tens and hundreds of thousands of processors).
To make matters worse, million-processor systems are scheduled for
availability within the next two to four years. 

Tree-based Overlay Networks (TBONs) have proven to be an effective
computational model for scalable distributed tools and applications. A
TBON is a network of hierarchically organized processes that exploits
the logarithmic scaling properties of trees to provide scalable data
multicast, gather, and in-network aggregation. In this talk, I will
describe the TBON model, demonstrating its power and flexibility with
unprecedented scalability results from a variety of application domains.
I will also describe our novel TBON failure recovery model, state
compensation, which relies on inherent information redundancies amongst
TBON processes. State compensation features fast, decentralized tree
reconstruction and state recovery protocols involving a small subset of
the tree and no process coordination. The protocols are scalable because
their performance is a function of the tree's fan-out, not total size. A
tree with a fan-out of 64 recovers from failures in milliseconds; with
only four levels, such a tree supports over 16,000,000 processes! 
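
The capacity figure above follows from simple arithmetic: a complete tree
with fan-out 64 and four levels has 64^4 = 16,777,216 leaves. The short
Python sketch below checks that number and illustrates level-by-level
aggregation in the spirit of a TBON. It is only an illustration under
stated assumptions: the function names (leaf_capacity, tree_reduce) are
hypothetical, the "processes" are plain list entries in one program, and
nothing here reflects the speaker's actual implementation.

```python
from typing import Callable, List, Tuple


def leaf_capacity(fan_out: int, levels: int) -> int:
    """Leaves in a complete tree where every parent has `fan_out` children."""
    return fan_out ** levels


def tree_reduce(
    values: List[int],
    fan_out: int,
    combine: Callable[[List[int]], int],
) -> Tuple[int, int]:
    """Aggregate leaf values level by level (hypothetical helper).

    Each pass groups up to `fan_out` values under one simulated parent
    and combines them. Returns (final result, number of levels used).
    """
    if not values:
        raise ValueError("need at least one leaf value")
    if fan_out < 2:
        raise ValueError("fan_out must be at least 2")
    levels = 0
    while len(values) > 1:
        values = [
            combine(values[i:i + fan_out])
            for i in range(0, len(values), fan_out)
        ]
        levels += 1
    return values[0], levels


if __name__ == "__main__":
    print(leaf_capacity(64, 4))
    total, depth = tree_reduce(list(range(4096)), 64, sum)
    print(total, depth)
```

With 4,096 simulated leaves and a fan-out of 64, the reduction finishes in
two passes, which is the logarithmic behavior the abstract refers to.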

Biography

Dorian Arnold is a doctoral candidate and Intel Foundation Ph.D. fellow
in the Computer Sciences Department at the University of Wisconsin. He
holds an M.S. degree in Computer Science from the University of Tennessee
and a B.S. degree in Mathematics and Computer Science from Regis
University (Denver, CO). From 1999 to 2001, Dorian served as technical
lead of the NetSolve project at the University of Tennessee's Innovative
Computing Laboratory, and in 2006 he was a technical scholar at Lawrence
Livermore National Laboratory. His research focuses on the performance
and scalability issues of large distributed systems, including efficient
communication, runtime data analysis, fault tolerance, and system
deployment.

 

