Dmitriy,
Congrats on all the good work. The slides are all very
informative and look great :-).
One minor thing I would like to bring to your notice is on slide #19,
on multi-query optimization.
Point #1 in the slide should be modified to
"Multiplexer tags records with pipeline ID they belong to in the map stage"
Point #3 in the slide says
"Multiplexer in reduce sends records to the appropriate pipeline"
This should be replaced with
"De-multiplexer in reduce stage sends records to the appropriate pipeline"
These changes would be in accordance with the true mux-demux meaning:
Multiplexing - Sending multiple signals (tuples in this context) through a
single pipeline (group in this context).
De-multiplexing - Extracting multiple signals from a single pipeline.
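To make the distinction concrete, here is a minimal Python sketch of the idea. It is not Pig's actual implementation: the function names (mux_map, shuffle, demux_reduce), the integer pipeline IDs, and the two toy pipelines are all made up for illustration. The map side tags each record with the pipeline ID it belongs to (mux), everything travels through one shared group, and the reduce side hands each record back to its own pipeline (demux).

```python
from collections import defaultdict

# Hypothetical sketch, not Pig code: every name here is an assumption.

def mux_map(record, pipeline_ids):
    """Map stage (mux): tag the record with each pipeline ID it belongs to."""
    key, value = record
    return [(key, (pid, value)) for pid in pipeline_ids]

def shuffle(tagged_records):
    """Stand-in for the MapReduce shuffle: one shared group per key."""
    groups = defaultdict(list)
    for key, tagged_value in tagged_records:
        groups[key].append(tagged_value)
    return groups

def demux_reduce(key, tagged_values, pipelines):
    """Reduce stage (demux): send each value to the pipeline named by its tag."""
    per_pipeline = defaultdict(list)
    for pid, value in tagged_values:
        per_pipeline[pid].append(value)
    return {pid: pipelines[pid](key, values) for pid, values in per_pipeline.items()}

# Two toy pipelines sharing one group-by: pipeline 0 counts, pipeline 1 sums.
pipelines = {
    0: lambda key, values: len(values),
    1: lambda key, values: sum(values),
}

records = [("a", 1), ("b", 2), ("a", 3)]

tagged = []
for record in records:
    tagged.extend(mux_map(record, pipeline_ids=[0, 1]))

results = {
    key: demux_reduce(key, values, pipelines)
    for key, values in shuffle(tagged).items()
}
```

In this sketch, results holds one entry per key, each mapping a pipeline ID to that pipeline's output for the group.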
-...@nkur
11/4/09 9:05 AM, "Dmitriy Ryaboy" <[email protected]> wrote:
We presented on Pig tonight at the Pittsburgh HUG.
Here are the slides:
http://squarecog.wordpress.com/2009/11/03/apache-pig-apittsburgh-hadoop-user-group/
The presentation takes a brief romp through "why a new language",
followed by a summary of what various joins do and how they work, some
highlights of what's being worked on, and a few ideas for potential
research directions (the Pittsburgh HUG leans heavily towards
academia).
Enjoy, feedback welcome.
-Dmitriy