Actually that is correct with regard to MR. The standard MapReduce, as I
understand it, reads the data from HDFS, applies the map-reduce algorithm
and writes the results back to HDFS. If there are many iterations of
map-reduce, there will be many intermediate writes to HDFS, all of them
serial writes to disk. Each map-reduce step is completely independent of
the others, and the execution engine has no global knowledge of which
map-reduce steps will follow each one.

However, having said that, I thought Tez was essentially MR with a DAG,
and it sounds like it still has some issues.

I wonder whether hive.support.concurrency is set to true (with ZooKeeper
running) and hive.lock.manager is set to
org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager.
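For reference, a minimal hive-site.xml fragment for those two settings might look like the one below. The hive.zookeeper.quorum hosts are placeholders for your own ZooKeeper ensemble, not values from this thread:

```xml
<!-- Enable the ZooKeeper-backed lock manager for concurrent queries -->
<property>
  <name>hive.support.concurrency</name>
  <value>true</value>
</property>
<property>
  <name>hive.lock.manager</name>
  <value>org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager</value>
</property>
<!-- Placeholder hosts: replace with your ZooKeeper ensemble -->
<property>
  <name>hive.zookeeper.quorum</name>
  <value>zk1:2181,zk2:2181,zk3:2181</value>
</property>
```

You can check the effective values from a beeline session with `SET hive.support.concurrency;` and `SET hive.lock.manager;`.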

HTH

Dr Mich Talebzadeh



LinkedIn
https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com


*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.



On 5 August 2016 at 23:46, Gopal Vijayaraghavan <gop...@apache.org> wrote:

>
> > Depends on how you configured scheduling in yarn ...
> ...
>
> >> you won't have this problem if you use Spark as the execution engine?
> >>That handles concurrency OK
>
> If I read this right, it is unlikely to be related to YARN configs.
>
> The Hue issue is directly related to how many Tez/Spark sessions are
> supported per-connection-handle.
>
> hive.server2.parallel.ops.in.session
>
> I would guess that this is queuing up in the
> getSessionManager().submitBackgroundOperation() call talking to
> SparkSessionManagerImpl/TezSessionPoolManager.
>
>
> MR has no equivalent of a "session" in relation to the cluster, because as
> a non-DAG engine it has to have the "parallel job queue" built on top to
> support parallel stages.
>
>
> Cheers,
> Gopal
>
>
>
