Actually, that is correct with regard to MR. Standard MapReduce, as I understand it, reads the data from HDFS, applies the map-reduce algorithm and writes the result back to HDFS. If there are many iterations of map-reduce, there will be many intermediate writes to HDFS, and these are all serial writes to disk. Each map-reduce step is completely independent of the other steps, and the execution engine has no global knowledge of which map-reduce steps will follow the current one.
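
To make the point about intermediate writes concrete, here is a rough toy sketch in Python. It is not Hadoop code: a temporary local directory stands in for HDFS, and the names run_mr_job and run_chain, as well as the two example jobs, are made up for illustration. Each job reads its input from a file and writes its output to a file, and the driver loop only ever hands the next job a path, so nothing is planned across steps.

```python
import json
import tempfile
from collections import defaultdict
from pathlib import Path


def run_mr_job(in_path, out_path, mapper, reducer):
    """One independent job: read from disk, map, group by key, reduce, write to disk."""
    records = json.loads(Path(in_path).read_text())
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    output = [reducer(key, values) for key, values in sorted(groups.items())]
    Path(out_path).write_text(json.dumps(output))


def run_chain(records, jobs, workdir):
    """Run jobs serially; each job only sees the file written by the previous one."""
    workdir = Path(workdir)
    current = workdir / "step0.json"
    current.write_text(json.dumps(records))  # initial input, standing in for data already on HDFS
    job_writes = 0
    for i, (mapper, reducer) in enumerate(jobs, start=1):
        nxt = workdir / f"step{i}.json"
        run_mr_job(current, nxt, mapper, reducer)  # full write to disk after every job
        job_writes += 1
        current = nxt
    return json.loads(current.read_text()), job_writes


# Hypothetical job 1: word count over lines of text.
def wc_map(line):
    for word in line.split():
        yield word, 1


def wc_reduce(word, ones):
    return [word, sum(ones)]


# Hypothetical job 2: how many words occur with each count.
def freq_map(pair):
    yield pair[1], 1


def freq_reduce(count, ones):
    return [count, sum(ones)]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        jobs = [(wc_map, wc_reduce), (freq_map, freq_reduce)]
        print(run_chain(["a b a", "b c"], jobs, d))
```

With two chained jobs there are two full output writes, one per job, which is the cost a DAG engine tries to avoid by knowing the whole plan up front.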
However, having said that, I thought Tez was essentially MR with a DAG, and it sounds like it still has some issues.

I wonder whether hive.support.concurrency is set to true, with ZooKeeper running and hive.lock.manager set to org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager.

HTH

Dr Mich Talebzadeh

LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
http://talebzadehmich.wordpress.com

On 5 August 2016 at 23:46, Gopal Vijayaraghavan <gop...@apache.org> wrote:

> > Depends on how you configured scheduling in yarn ...
> ...
>
> >> you won't have this problem if you use Spark as the execution engine?
> >> That handles concurrency OK
>
> If I read this right, it is unlikely to be related to YARN configs.
>
> The Hue issue is directly related to how many Tez/Spark sessions are
> supported per-connection-handle.
>
> hive.server2.parallel.ops.in.session
>
> I would guess that this is queuing up in the
> getSessionManager().submitBackgroundOperation() call talking to
> SparkSessionManagerImpl/TezSessionPoolManager.
>
> MR has no equivalent of a "session" in relation to the cluster, because as
> a non-DAG engine it has to have the "parallel job queue" built over to
> support parallel stages.
>
> Cheers,
> Gopal
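
For reference, the two lock-related settings asked about in the reply (hive.support.concurrency and hive.lock.manager) would normally live in hive-site.xml. The small Python sketch below just renders them in the usual Hadoop-style configuration/property layout so they can be compared against an existing file. The helper name to_hive_site_fragment is made up for illustration, and the ZooKeeper connection settings are deliberately left out because the thread does not give them.

```python
import xml.etree.ElementTree as ET

# Only the two properties named in this thread; values as suggested in the reply.
LOCK_SETTINGS = {
    "hive.support.concurrency": "true",
    "hive.lock.manager": "org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager",
}


def to_hive_site_fragment(settings):
    """Render a dict of settings as a <configuration> fragment (hypothetical helper)."""
    root = ET.Element("configuration")
    for name, value in settings.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(to_hive_site_fragment(LOCK_SETTINGS))
```

This only produces text to compare or paste; it does not touch a running HiveServer2, which would need a restart to pick up changes to hive-site.xml.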