Thanks Gopal, Bikas and Hitesh for sharing your thoughts.

Hi Gopal,

One follow-up question: as you advised, to avoid these errors during rolling
upgrades, the best place to add the Tez jars to HADOOP_CLASSPATH for Hive is
hive-config.sh. Could you also suggest the best way to add the Tez jars to
HADOOP_CLASSPATH for MapReduce programs and for non-CLI Hive sessions
(through HiveServer2, et al.)?

--Bala G.


On Mon, Jul 7, 2014 at 7:30 PM, Gopal V <[email protected]> wrote:

> On 7/7/14, 5:50 PM, Bala Krishna Gangisetty wrote:
>
>> Thanks Hitesh for your inputs. I've not come across any issues yet. So, I
>> can safely assume that putting Tez jars in the Hadoop classpath will not
>> cause MapReduce programs to use the Tez framework unless it is enabled.
>> Let me know if my understanding is not correct.
>>
>
> Your assumptions are correct.
>
> But this is not advised because it will break rolling upgrades.
>
> The main issue early adopters have run into is installing a tez built
> against hadoop-2.4.x into a cluster running hadoop-2.2.x.
>
> As Hitesh/Bikas mentioned, that would cause errors at runtime even for MR
> jobs.
>
> The errors you will get in that case are similar to the errors you get
> during a rolling upgrade between versions.
>
> There is no real reason to include tez jars for any hadoop daemons
> (datanode, nodemanager) you run in your cluster because they might error
> out while replacing those files.
>
> The correct solution for this is to install Tez in its own versioned
> directory.
>
> And for Hive, do the following within your hive-config.sh.
>
> export HADOOP_CLASSPATH=/opt/tez/current/*:/opt/tez/current/lib/*:/etc/tez/conf/:/usr/share/java/*:$HADOOP_CLASSPATH
>
> This setup with symlinks from
>
> /etc/tez/conf -> /opt/tez/current/conf
> /opt/tez/current -> /opt/tez/0.4.1
>
> will ensure that you are ready to do rolling upgrades from day #1.
>
> After the symlinks point to a new version, the only daemon to restart
> would be hive-server2.
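
For reference, the versioned-directory layout described above could be set up
roughly as follows. This is a minimal sketch, not something taken from this
thread: TEZ_ROOT and switch_tez_version are hypothetical names, a scratch
directory stands in for /opt/tez, the /etc/tez/conf link is left out, and
"ln -sfn" assumes GNU or BSD ln.

```shell
#!/bin/sh
# Sketch only: TEZ_ROOT and switch_tez_version are hypothetical names.
set -eu

# Stand-in for /opt/tez; defaults to a scratch directory so nothing real is touched.
TEZ_ROOT="${TEZ_ROOT:-$(mktemp -d)}"

# Point $TEZ_ROOT/current at $TEZ_ROOT/<version>.
switch_tez_version() {
    version="$1"
    [ -d "$TEZ_ROOT/$version" ] || { echo "no such version: $version" >&2; return 1; }
    # -n stops ln from following the existing 'current' symlink into the old version.
    ln -sfn "$TEZ_ROOT/$version" "$TEZ_ROOT/current"
}

# Each Tez release lives in its own versioned directory.
mkdir -p "$TEZ_ROOT/0.4.1/lib" "$TEZ_ROOT/0.4.1/conf"
switch_tez_version 0.4.1

# The classpath only names 'current', so it does not change when the link moves.
HADOOP_CLASSPATH="$TEZ_ROOT/current/*:$TEZ_ROOT/current/lib/*:${HADOOP_CLASSPATH:-}"
export HADOOP_CLASSPATH
```

An upgrade would then amount to unpacking the new release next to the old one
and calling switch_tez_version with the new version number.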
>
> Cheers,
> Gopal
>
>
>  On Mon, Jul 7, 2014 at 4:10 PM, Hitesh Shah <[email protected]> wrote:
>>
>>  Hi
>>>
>>> For the most part, there should be no issues as most dependencies that
>>> Tez pulls in are compatible with the hadoop version that it is compiled
>>> with (2.2 or higher). The major issue to be aware of is that you should
>>> compile Tez against the same version of hadoop/mapreduce that is deployed
>>> on your cluster. The tez dependency jars contain both 3rd party deps as
>>> well as hadoop jars (hdfs, common, yarn client-side and mapreduce
>>> client-side) - if there is a version mismatch, this may cause a problem
>>> when the tez directory is added to the hadoop classpath.
>>>
>>> Have you seen any issues? If yes, could you provide more details?
>>>
>>> thanks
>>> — Hitesh
>>>
>>>
>>> On Jul 7, 2014, at 3:44 PM, Bala Krishna Gangisetty <[email protected]>
>>> wrote:
>>>
>>> > I'm wondering, from an operational point of view, are there any specifics
>>> > that need special attention to make the MRv2 and Tez frameworks coexist in
>>> > harmony? I heard that putting Tez jars in the Hadoop classpath would affect
>>> > mapred behavior even when Tez is not enabled (either through
>>> > mapred-site.xml or Hive). Could someone shed more light on this?
>>> >
>>> > --Bala G.
>>>
>>>
>>>
>>
>>
>
