Chao Shi created CRUNCH-352:
-------------------------------
Summary: Share library jars between MR stages
Key: CRUNCH-352
URL: https://issues.apache.org/jira/browse/CRUNCH-352
Project: Crunch
Issue Type: Improvement
Reporter: Chao Shi
Currently, library jars are copied to the staging directory every time an
MR job is submitted. This is time-consuming when a pipeline has tens of
stages. To make matters worse, the job client may run on a network far from the
cluster.
I found that Hive and Pig have, or will have, this optimization (HIVE-860 and
PIG-2672). YARN also has a similar plan (YARN-1492).
Although this is better done at the YARN/MR level, we can still implement a
client-side solution to benefit users who cannot upgrade to the latest YARN or
have to use legacy MRv1.
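
One possible shape for such a client-side solution is a content-addressed jar
cache: each library jar is stored once under a name derived from its checksum,
and later stages reuse the stored copy instead of uploading it again. The
sketch below is illustrative only and is not Crunch's implementation. The class
and method names (JarCacheSketch, ensureCached) are hypothetical, and a local
directory stands in for a shared directory on the cluster file system so that
the example runs without Hadoop.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch: a content-addressed cache for library jars.
// A local directory stands in for a shared cluster directory (assumption).
final class JarCacheSketch {

    // Returns the hex-encoded SHA-256 digest of the file's contents.
    static String checksum(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // Copies the jar into cacheDir only if no copy with the same contents
    // is already there, and returns the cached path either way.
    static Path ensureCached(Path localJar, Path cacheDir)
            throws IOException, NoSuchAlgorithmException {
        Files.createDirectories(cacheDir);
        String name = checksum(localJar) + "-" + localJar.getFileName();
        Path target = cacheDir.resolve(name);
        if (!Files.exists(target)) {
            // Write to a temporary file first so that a partially copied
            // jar is never visible under the final name.
            Path tmp = Files.createTempFile(cacheDir, "upload", ".tmp");
            Files.copy(localJar, tmp, StandardCopyOption.REPLACE_EXISTING);
            Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
        }
        return target;
    }
}
```

With this approach, each stage of a pipeline would call ensureCached for every
library jar and put the returned path on the job classpath, so only the first
stage pays the copy cost. Keying on the checksum rather than the file name
means a rebuilt jar with the same name is uploaded again instead of being
confused with the stale copy.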