I'd like to announce the 'Parallel Processing beyond MapReduce' workshop,
which will take place directly after the Berlin Buzzwords conference.
This workshop will discuss novel paradigms for parallel processing
beyond the traditional MapReduce paradigm offered by Apache Hadoop.
The workshop will introduce two new systems:
Apache Giraph aims at processing large graphs, runs on standard Hadoop
infrastructure and is a loose port of Google's Pregel system. Giraph
follows the bulk-synchronous parallel model applied to graphs, in which
vertices can send messages to other vertices during a given superstep.
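
To make the superstep idea concrete, here is a minimal sketch in plain
Python. It does not use Giraph's API; the function name run_supersteps and
the toy graph are made up for illustration. It assumes the usual Pregel
convention that a message sent in one superstep is delivered in the next,
and it propagates the largest vertex value through the graph.

```python
from collections import defaultdict


def run_supersteps(edges, values, max_supersteps=30):
    """Propagate the maximum vertex value, one superstep at a time.

    edges:  dict mapping each vertex to a list of out-neighbours
    values: dict mapping every vertex to its initial value
    Hypothetical helper for illustration only, not part of Giraph.
    """
    values = dict(values)

    # Superstep 0: every vertex sends its value to its out-neighbours.
    inbox = defaultdict(list)
    for vertex, value in values.items():
        for neighbour in edges.get(vertex, []):
            inbox[neighbour].append(value)
    supersteps = 1

    # Later supersteps: only vertices that received messages do any work.
    while inbox and supersteps < max_supersteps:
        outbox = defaultdict(list)
        for vertex, messages in inbox.items():
            best = max(messages)
            if best > values[vertex]:
                values[vertex] = best
                for neighbour in edges.get(vertex, []):
                    outbox[neighbour].append(best)
            # A vertex that learned nothing new sends nothing (it halts).
        inbox = outbox  # barrier: messages become visible in the next superstep
        supersteps += 1

    return values, supersteps


edges = {"a": ["b"], "b": ["a", "c"], "c": ["d"], "d": ["c"]}
initial = {"a": 3, "b": 6, "c": 2, "d": 1}
final_values, used = run_supersteps(edges, initial)
print(final_values)  # every vertex ends with value 6
```

The computation stops once a superstep produces no messages, which is the
sketch's stand-in for all vertices having voted to halt.
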
Stratosphere (http://www.stratosphere.eu) is a system that is developed
in a joint research project by Technische Universität Berlin, Humboldt
Universität zu Berlin and the Hasso-Plattner-Institut in Potsdam. It is
a database-inspired, large-scale data processor based on concepts of
robust and adaptive execution. Stratosphere offers the PACT programming
model that extends the MapReduce programming model with additional
second-order functions. As its execution platform, it uses the Nephele
system, a massively parallel data flow engine that is also being
researched and developed in the project.
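
As a rough illustration of what a "second-order function" means here, the
sketch below models three of them in plain Python: each one takes a
user-defined function plus input records and decides how the records are
handed to that function. The names map_so, reduce_so and match_so are
hypothetical and this is not Stratosphere's API; the two-input match_so is
only loosely modelled on the kind of key-based join contract that PACT adds
on top of Map and Reduce.

```python
from collections import defaultdict


def map_so(records, udf):
    """Call udf once per record, independently of all other records."""
    return [out for record in records for out in udf(record)]


def reduce_so(records, udf):
    """Group (key, value) records by key and call udf once per group."""
    groups = defaultdict(list)
    for key, value in records:
        groups[key].append(value)
    return [out for key, vals in groups.items() for out in udf(key, vals)]


def match_so(left, right, udf):
    """Two inputs: call udf for every pair of records with equal keys."""
    index = defaultdict(list)
    for key, value in right:
        index[key].append(value)
    return [
        out
        for key, left_value in left
        for right_value in index.get(key, [])
        for out in udf(key, left_value, right_value)
    ]


users = [(1, "ann"), (2, "bob")]
orders = [(1, 30), (1, 12), (2, 5)]

# Join users with their orders, then sum the order amounts per user.
joined = match_so(users, orders, lambda key, name, amount: [(name, amount)])
totals = reduce_so(joined, lambda name, amounts: [(name, sum(amounts))])
print(totals)  # [('ann', 42), ('bob', 5)]
```

With plain MapReduce the join would have to be hand-coded inside the map
and reduce functions; expressing it as its own second-order function leaves
the choice of join strategy to the system.
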
Attendees will hear about the new possibilities of Hadoop's NextGen
MapReduce architecture (YARN) and get a detailed introduction to the
Apache Giraph and Stratosphere systems. After that there will be plenty
of time for questions, discussions and diving into source code.
As a prerequisite, attendees have to bring a notebook with:
- a copy of Giraph downloaded with source
- a local copy of the Hadoop 0.23+ source tree and JARs
- a copy of Stratosphere with source
- an IDE of their choice
The workshop will take place on the 6th and 7th of June and is limited
to 15 attendees. Please register by sending an email to sebastian [DOT]
schelter [AT] tu-berlin [DOT] de