Begin forwarded message:
From: Evert Lammerts <[email protected]>
Date: November 3, 2010 9:13:08 AM PDT
To: Evert Lammerts <[email protected]>
Subject: Hadoop workshop at SARA, December 7
Reply-To: "[email protected]" <[email protected]>
What: SARA Apache Hadoop computing workshop
When: Tuesday, December 7th
Where: SARA, Science Park 121 Amsterdam, room TBA
Dear readers,
You are all invited to participate in the first Apache Hadoop
"hackathon" in The Netherlands on Tuesday, December 7th. This event is
also the kick-off of the SARA Proof-of-Concept Hadoop service. More
information on the event and the technology can be found below.
If you are interested in participating, please register as soon as
possible, and no later than December 1st, by sending an email to
[email protected].
Best regards,
Evert Lammerts
===================================================
SARA Apache Hadoop "Hackathon"
Back in 2004, driven by the need to process extreme amounts of data,
Google developed a system that consists of the Google File System and
MapReduce. Based on two papers that Google published on the
topic[1,2], Doug Cutting[3] created what is now the Apache Hadoop
software stack. Apache Hadoop is used in several capacities by a number
of internet giants (e.g. Yahoo!, IBM, eBay, Facebook, Last.fm, LinkedIn,
Twitter and many more[4]). Its ability to store and process extremely
large datasets makes it particularly beneficial for sciences such as
Natural Language Processing, BioInformatics, Social Sciences and
Humanities. SARA provides a Proof-of-Concept Apache Hadoop storage and
computing service for scientific use[5].
The "hackathon" will be a hands-on introduction to Hadoop. You are
invited to come spend a day at SARA to play with the system, with the
support of two experienced users, Edgar Meij (UvA) and Djoerd Hiemstra
(UT). You can work on your own or in groups and analyze your own
dataset, or play with, for example, the Wikipedia, ENRON or White House
access records datasets. Programming skills will be assumed, and you
need to bring your own laptop.
Coffee and soft drinks will be provided throughout the day, and lunch
will be served at 12:00 noon.
Program:
09:30 - 09:35 Introduction to the day
09:35 - 10:15 Examples of data analysis with Hadoop
10:30 - 17:00 Hackathon
17:00 - end Presentation of results
[1] http://labs.google.com/papers/gfs.html
[2] http://labs.google.com/papers/mapreduce.html
[3] http://en.wikipedia.org/wiki/Doug_Cutting
[4] http://wiki.apache.org/hadoop/PoweredBy
[5] http://www.sara.nl/news/recent/20101103/Hadoop_proof-of-concept.html
Evert Lammerts
Consultant eScience Support
SARA Computing & Network Services
High Performance Computing & Visualization
Phone: +31 20 888 4101
Email: [email protected]
http://www.sara.nl