Hey guys,

Here's a scenario:
Our cluster allows a maximum of 90 mappers and 90 reducers.

1) We submit a large job, which immediately takes all 90 mappers and all 90 reducers.
2) Ten minutes later, we submit a second job.

Eventually the cluster shares the map slots, so the map phases of both jobs run concurrently. However, the first job hogs all of the reducer slots and never "lets go" of them, so the second job's reducers never get a chance to run.

Any idea how to overcome this? Is there a way to tell Hive or Hadoop to "let go" of reducers that are currently running? Or should I cap the number of reducers a single job can use? If so, how?

Thanks,
Ryan
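P.S. For concreteness, here's the kind of per-job cap I have in mind. I'm assuming hive.exec.reducers.max and mapred.reduce.tasks are the right knobs (and that reducer slowstart might help too), but I haven't confirmed that:

    -- set in the Hive session before running the big query
    SET hive.exec.reducers.max=45;                    -- cap how many reducers Hive will request
    SET mapred.reduce.tasks=45;                       -- or force an explicit reducer count
    SET mapred.reduce.slowstart.completed.maps=0.80;  -- delay reducer launch until 80% of maps finish

Is something like that the right approach, or is this better handled at the scheduler level (e.g. Fair Scheduler with preemption)?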
