[
https://issues.apache.org/jira/browse/BIGTOP-1179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883218#comment-13883218
]
Hitesh Shah commented on BIGTOP-1179:
-------------------------------------
[~gkesavan] asked me to chime in on this jira.
Tez usage:
- Tez can be used as a standalone library, similar to MapReduce, or as a
library that other ecosystem projects like Hive/Pig build their applications
on.
- For the former, Tez already has a built-in MR compatibility layer, so anyone
can run MR jobs on the Tez execution engine. Compatibility is not 100%, and
there are likely to be some undiscovered bugs, but for the most part we have
tested most of the basic MR jobs against Tez and seen no issues. Users can also
convert their existing MR jobs (or chains of MR jobs) to a native Tez DAG by
directly using the Tez APIs. The benefits of using the Tez execution engine for
existing MR jobs include container re-use (I believe there was some level of
JVM re-use support in hadoop-1.x but not in 2.x), the option of using more
optimized sorters to improve shuffle performance, etc. When directly using the
native APIs, the benefits are even greater.
- Hive already has support for running natively against Tez in trunk. I
believe the hive-0.13 release will be the first release with Tez support.
Likewise, Pig already has a highly active developer branch for running Pig
jobs natively against Tez. I believe they plan to do an early preview
from the branch itself at some point.
- Even the previous releases of Hive and Pig can work against Tez. Both Hive
and Pig work against MR, so they can simply be switched to the Tez execution
engine via a config switch.
For tests, you can use Tez's native jobs such as OrderedWordCount or MRRSleep
to do basic testing. Also, you can run any MR job against Tez by changing the
MR config property mapreduce.framework.name.
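
To make the config switch concrete, here is a minimal sketch (not part of the
original comment) that rewrites mapreduce.framework.name in a mapred-site.xml
style document. It assumes the value "yarn-tez", which is what the Tez install
notes use to route MR jobs to Tez; the helper name set_property and the sample
XML are hypothetical and only for illustration.

```python
import xml.etree.ElementTree as ET


def set_property(site_xml: str, name: str, value: str) -> str:
    """Return site_xml with the Hadoop-style property `name` set to `value`.

    Hypothetical helper: updates the property if present, appends it otherwise.
    """
    root = ET.fromstring(site_xml)
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            val = prop.find("value")
            if val is None:
                val = ET.SubElement(prop, "value")
            val.text = value
            break
    else:
        # Property not found: add a new <property> entry.
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# Sample client-side config (illustrative only).
MAPRED_SITE = """<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>"""

# Assumption: "yarn-tez" selects the Tez execution engine for MR jobs.
updated = set_property(MAPRED_SITE, "mapreduce.framework.name", "yarn-tez")
print(updated)
```

Since Tez is a client-side install, a change like this would normally only be
needed in the config of the client submitting the job. The same property can
usually be passed per job (for example with -D on the command line for jobs
that use ToolRunner) instead of editing the file.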
From a deployment point of view, Tez is very simple to deploy. It is mainly a
client-side install with no requirement of jars being deployed to all nodes in
the cluster. This has an additional benefit of supporting multiple versions of
Tez on a single cluster.
Feel free to send a mail to [email protected] if you have more
questions.
> Add Apache Tez to bigtop
> ------------------------
>
> Key: BIGTOP-1179
> URL: https://issues.apache.org/jira/browse/BIGTOP-1179
> Project: Bigtop
> Issue Type: Bug
> Affects Versions: 0.8.0
> Reporter: Giridharan Kesavan
> Assignee: Giridharan Kesavan
> Attachments: 0001-BIGTOP-1179.-Add-Apache-Tez-to-bigtop.patch,
> 0002-BIGTOP-1179.-Add-Apache-Tez-to-bigtop.patch
>
>
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)