Hi,
I'm a Hadoop 0.17 user who is doing research with Prof. Magda Balazinska
at the University of Washington on an improved progress indicator for
Pig Latin. We have a question regarding how Hadoop schedules Pig Latin
queries with JOIN operators. Does Hadoop schedule all MapReduce jobs in
a script sequentially, or does it ever schedule two MapReduce jobs in
parallel? For example, if the output of two MapReduce jobs is later
joined and each of these jobs only needs a subset of the cluster
resources, would they be scheduled in parallel or in series?
I apologize if I sent this to the wrong list; if so, please let me know
which list is most appropriate for this type of question.
Thanks,
Kristi