Hey guys,

Here's a scenario:
Our cluster allows a maximum of 90 mappers and 90 reducers.

1) We submit a large job, which immediately takes all 90 mappers and all 90 reducers.
2) Ten minutes later, we submit a second job.

Eventually the cluster shares the map slots, so the map phases of both jobs run concurrently. However, the first job hogs all of the reducer slots and never "lets go" of them, so the second job's reducers never get a chance to run.

Any idea how to overcome this? Is there a way to tell Hive or Hadoop to "let go" of reducers that are currently running? Or should I cap the number of reducers a single job can use? If so, how?

Thanks,
Ryan
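P.S. For concreteness, here's the kind of per-job cap I have in mind. I'm assuming hive.exec.reducers.max and mapred.reduce.tasks are the right knobs (and that reducer slowstart might help too), but I haven't confirmed that:

    -- set in the Hive session before running the big query
    SET hive.exec.reducers.max=45;                    -- cap how many reducers Hive will request
    SET mapred.reduce.tasks=45;                       -- or force an explicit reducer count
    SET mapred.reduce.slowstart.completed.maps=0.80;  -- delay reducer launch until 80% of maps finish

Is something like that the right approach, or is this better handled at the scheduler level (e.g. Fair Scheduler with preemption)?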
