I would like to control the maximum number of reducers a Hive query can use. I have seen cases of Hive using up to 999 reducers, which seems inefficient (given the overhead of starting and stopping each individual reducer), and I'd also like to cap the resources Hive uses on the cluster. (I'm investigating the Hadoop fair scheduler as well, which will hopefully help.)
I haven't found any conclusive settings in the documentation, so what options are there to throttle Hive (I'm using Hive 0.4 at the moment)? hive.exec.reducers.max is mentioned in a JIRA item, but not in the Hive documentation. Does it work? Thanks.
