[ 
https://issues.apache.org/jira/browse/BIGTOP-1179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883218#comment-13883218
 ] 

Hitesh Shah commented on BIGTOP-1179:
-------------------------------------

[~gkesavan] asked me to chime in on this jira. 

Tez usage:
   - Tez can be used as a standalone library, similar to MapReduce, or as a 
library that other ecosystem projects such as Hive and Pig build their 
applications on. 
   - For the former, Tez already has a built-in MR compatibility layer that 
anyone can use to run MR jobs on the Tez execution engine. Compatibility is 
not 100%, and there are likely to be some undiscovered bugs, but we have 
tested most of the basic MR jobs against Tez and seen no issues. Users can 
also convert their existing MR jobs (or chains of MR jobs) to a native Tez 
DAG by using the Tez APIs directly. The benefits of running existing MR jobs 
on the Tez execution engine include container re-use (I believe there was 
some level of JVM re-use support in hadoop-1.x but not in 2.x) and the option 
of using more optimized sorters to improve shuffle performance. When using 
the native APIs directly, the benefits are even greater. 
   - Hive trunk already supports running natively against Tez. I believe the 
hive-0.13 release will be the first release with Tez support. Likewise, Pig 
has a highly active development branch for running Pig jobs natively against 
Tez. I believe they plan to do an early preview from the branch itself at 
some point. 
   - Even previous releases of Hive and Pig can work against Tez. Both run on 
MR, so they can be moved to the Tez execution engine with a config change.

For basic testing, you can use Tez's native jobs such as OrderedWordCount or 
MRRSleep. You can also run any MR job against Tez by changing the MR config 
property mapreduce.framework.name.
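
As a rough illustration of that config change, here is a minimal sketch in 
Python that rewrites the property in a Hadoop-style configuration file. The 
value "yarn-tez" is my assumption of the usual setting for the Tez engine (it 
is not stated above), and the set_property helper and the sample 
mapred-site.xml content are hypothetical, made up for this example.

```python
import xml.etree.ElementTree as ET


def set_property(conf_xml: str, name: str, value: str) -> str:
    """Return conf_xml with property `name` set to `value` (added if absent)."""
    root = ET.fromstring(conf_xml)
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            val = prop.find("value")
            if val is None:
                val = ET.SubElement(prop, "value")
            val.text = value
            break
    else:
        # Property not present yet: append a new <property> element.
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical starting mapred-site.xml that runs MR jobs on plain YARN.
MAPRED_SITE = """<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>"""

# Assumption: "yarn-tez" selects the Tez execution engine for MR jobs.
updated = set_property(MAPRED_SITE, "mapreduce.framework.name", "yarn-tez")
print(updated)
```

The same property can usually be overridden per job on the command line 
instead of editing the file, which keeps the cluster default untouched while 
testing.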

From a deployment point of view, Tez is very simple to deploy. It is mainly a 
client-side install with no requirement of jars being deployed to all nodes in 
the cluster. This has an additional benefit of supporting multiple versions of 
Tez on a single cluster. 

Feel free to send a mail to [email protected] if you have more 
questions. 
 

> Add Apache Tez to bigtop
> ------------------------
>
>                 Key: BIGTOP-1179
>                 URL: https://issues.apache.org/jira/browse/BIGTOP-1179
>             Project: Bigtop
>          Issue Type: Bug
>    Affects Versions: 0.8.0
>            Reporter: Giridharan Kesavan
>            Assignee: Giridharan Kesavan
>         Attachments: 0001-BIGTOP-1179.-Add-Apache-Tez-to-bigtop.patch, 
> 0002-BIGTOP-1179.-Add-Apache-Tez-to-bigtop.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
