https://www.coursera.org/course/hetero 

Heterogeneous Parallel Programming
Wen-mei W. Hwu

This course teaches the use of CUDA/OpenCL, OpenACC, and MPI for programming 
heterogeneous parallel computing systems. It is application oriented and only 
introduces necessary technological knowledge to solidify understanding.
Current Session:
Nov 28th 2012 (6 weeks long)
Workload: 6-8 hours/week 
 
About the Course
All computing systems, from mobile to supercomputers, are becoming 
heterogeneous parallel computers using both multi-core CPUs and many-thread 
GPUs for higher power efficiency and computation throughput. While the 
computing community is racing to build tools and libraries to ease the use of 
these heterogeneous parallel computing systems, effective and confident use of 
these systems will always require knowledge about the low-level programming 
interfaces in these systems. This course is designed for students in all 
disciplines to learn the essence of these programming interfaces (CUDA/OpenCL, 
OpenMP, and MPI) and how they should orchestrate the use of these interfaces to 
achieve application goals.

The course is unique in that it is application oriented and only introduces the 
necessary underlying computer science and computer engineering knowledge needed 
for understanding. It covers data parallel execution model, memory models for 
locality, parallel algorithm patterns, overlapping computation with 
communication, and scalable programming using joint MPI-CUDA in large scale 
computing clusters. It has been offered as a one-week intensive summer school 
for the past four years. In the past two years, there have been ten 
video-linked academic sites with a total of more than two hundred students each 
year.
About the Instructor(s)
Wen-mei W. Hwu is a Professor and holds the Sanders-AMD Endowed Chair in the 
Department of Electrical and Computer Engineering, University of Illinois at 
Urbana-Champaign. His research interests are in the area of architecture, 
implementation, compilation, and algorithms for parallel computing. He is the 
chief scientist of Parallel Computing Institute and director of the IMPACT 
research group (impact.crhc.illinois.edu). He is a co-founder and CTO of 
MulticoreWare. For his contributions in research and teaching, he received the 
ACM SIGARCH Maurice Wilkes Award, the ACM Grace Murray Hopper Award, the Eta 
Kappa Nu Holmes MacDonald Outstanding Teaching Award, the ACM/IEEE ISCA 
Influential Paper Award, and the Distinguished Alumni Award in Computer Science 
of the University of California, Berkeley. He is a fellow of IEEE and ACM. He 
directs the UIUC CUDA Center of Excellence and serves as one of the principal 
investigators of the $208M NSF Blue Waters Petascale computer project. Dr. Hwu 
received his Ph.D. degree in Computer Science from the University of 
California, Berkeley.

In 2007, Hwu teamed up with then NVIDIA Chief Scientist David Kirk to create a 
course called Programming Massively Parallel Processors 
(https://ece408.hwu.crhc.illinois.edu). Thousands of students worldwide follow 
the course through the web site each semester. The course material has also 
been used by numerous universities including MIT, Stanford, and Georgia Tech. 
Hwu and Kirk have also been teaching a VSCSE Summer School version to science 
and engineering graduate students from all over the world. In 2008, 50 
graduates from 17 countries and three continents attended the summer school 
(http://www.greatlakesconsortium.org/events/GPUMulticore/) in Urbana with 
another 60 participating remotely. Students in the summer school come from 
diverse disciplines. In 2009, the summer school was again fully subscribed with 
160 students from multiple continents. The 2010 offering was attended by 220 
students at four sites linked with HD video. The 2011 attendance further 
increased to 280 at 10 sites across the U.S. The 2012 summer school is 
projected to have more than 320 students at 10 linked sites. Due to popular 
demand, Hwu and Kirk have also been teaching abbreviated versions of their 
course globally, most recently at the Chinese Academy of Sciences in 2008, 
Berkeley in 2010, Braga, Portugal in 2010, and Chile in 2011. They have also 
collaborated with UPC/Barcelona Supercomputing Center to offer an EU PUMPS 
summer school in Barcelona every year since 2010. In February 2010, Hwu and 
Kirk published a textbook on programming massively parallel processors. The 
book has been extremely popular, with more than 10,000 copies sold to date. 
International editions and translations are available in China, India, Japan, 
Russia, Spain, Portugal and Latin America. The second edition is already in 
production.
Course Syllabus

    Week One: Introduction to Heterogeneous Computing and a Quick Overview of 
CUDA C and MPI, with lab setup and programming assignment of vector addition in 
CUDA C.
    Week Two: Kernel-Based Data Parallel Programming and Memory Model for 
Locality, with programming assignment of simple and tiled matrix multiplication.
    Week Three: Performance Considerations and Task Parallelism Model, with 
programming assignment in performance tuning.
    Week Four: Parallel Algorithm Patterns – reduction/scan, stencil 
computation, and sparse computation, with programming assignment of a reduction 
tree.
    Week Five: MPI in a Heterogeneous Computing Cluster: domain partitioning, 
data distribution, data exchange, and using heterogeneous computing nodes, with 
programming assignment of an MPI-CUDA application.
    Week Six: Related Programming Models – OpenACC, CUDA FORTRAN, C++AMP, 
Thrust, and important trends in heterogeneous parallel computing, with final 
exam.

Recommended Background
Programming experience in C/C++.
Suggested Readings

Although the class is designed to be self-contained, students wanting to expand 
their knowledge beyond what we can cover in a one-quarter class can find a much 
more extensive coverage of this topic in the book Programming Massively 
Parallel Processors: A Hands-on Approach (Applications of GPU Computing 
Series), by David Kirk and Wen-mei Hwu, published by Morgan Kaufmann 
(Elsevier), ISBN 0123814723.

In December 2012, Morgan Kaufmann will publish an updated second edition of 
Programming Massively Parallel Processors with new material tied to the course 
syllabus including new chapters on parallel patterns and the latest CUDA 5.0 
features. Get exclusive preview access to selected chapters from the revised 
edition when you pre-order the second edition today. You’ll save up to 40% on 
the second edition and a selection of related titles. You must order using the 
promotion codes at this page in order to receive the preview chapters. Register 
for the course and visit the forum for further details.
Course Format
The class will consist of lecture videos, each between 15 and 20 minutes in 
length. There will also be standalone homework assignments that are not part of 
the video lectures, optional programming assignments, and a required final exam.
FAQ

    Will I get a certificate for this course?

    Yes. Students who successfully complete the class will receive a 
certificate signed by the instructor.

    What resources will I need for this class?

    A laptop or desktop computer, preferably with a CUDA-enabled GPU. There are 
more than 200 million laptops and desktops with CUDA-enabled GPUs. However, we 
will provide a teaching kit to allow you to take the class if you don’t have 
one.

    What is the coolest thing I'll learn if I take this class?

    You will learn how to unleash the massive computing power of systems 
ranging from mobile processors to supercomputers.


Categories:
Computer Science: Systems, Security, Networking
Computer Science: Programming & Software Engineering