You're definitely able to scale down core nodes on EMR. I just tried through the console on emr-5.3.1 to confirm in case that changed :)
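The same resize can also be scripted rather than done in the console. Below is a minimal sketch, assuming boto3's EMR client and its `list_instance_groups` / `modify_instance_groups` calls. It is not what was run for the console test above: the helper name `shrink_core_group` and the cluster ID in the usage comment are hypothetical, and it assumes the cluster uses instance groups (not instance fleets).

```python
def shrink_core_group(emr, cluster_id, new_count):
    """Request a smaller CORE instance group and return the group's ID.

    `emr` is expected to behave like boto3.client("emr").
    Hypothetical helper for illustration only.
    """
    if new_count < 1:
        raise ValueError("this sketch keeps at least one core node")

    # Find the CORE instance group of the cluster.
    groups = emr.list_instance_groups(ClusterId=cluster_id)["InstanceGroups"]
    core = next(g for g in groups if g["InstanceGroupType"] == "CORE")

    # Only allow a shrink; growing the group is a different operation.
    current = core["RequestedInstanceCount"]
    if new_count >= current:
        raise ValueError(f"core group already at {current}; nothing to shrink")

    # EMR then tries to decommission HDFS on the nodes it removes.
    emr.modify_instance_groups(
        ClusterId=cluster_id,
        InstanceGroups=[
            {"InstanceGroupId": core["Id"], "InstanceCount": new_count}
        ],
    )
    return core["Id"]


# Usage (assumes boto3 is installed and AWS credentials are configured;
# the cluster ID is a placeholder):
#   import boto3
#   shrink_core_group(boto3.client("emr"), "j-XXXXXXXXXXXXX", 9)
```

As with the console, it is worth reducing HDFS writes and checking free capacity on the remaining core nodes before calling this.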
The documentation does seem to be inconsistent or old. I did see one page that I think you found, but other more recent documentation states that you can resize core. You can clearly resize / scale down core if you play around in the console (you can even scale up/down on core and task in EMR with autoscaling).

If you're scaling down core nodes, you need to make sure you know what you're doing because you're directly impacting HDFS and essentially trying to do a graceful termination of a node. Per the console when I tried:

"You are shrinking your core group from 10 running instances to 9. Note: To minimize the risk of data loss, EMR attempts to decommission HDFS on the core nodes; the data present on the instances marked for removal is replicated to other running instances in the core instance group. We recommend that you minimize HDFS write I/O before shrinking the cluster and ensure that the remaining instances in the core group have free storage capacity."

You can watch the node be decommissioned and blocks get replicated if you look in /var/log/hadoop-hdfs on the master node after scaling down on core.

On Fri, Feb 10, 2017 at 12:19 AM, Devi Sunil Kumar Shegu <[email protected]> wrote:

> Hey Anthony,
>
> AWS EMR documentation says
> 1) Core Nodes can be added but cannot be removed
> 2) Task Nodes can be added and removed
>
> How are you able to downscale the Core Nodes?
>
> Thanks
>
> On Fri, Feb 10, 2017 at 10:44 AM, Anthony Nguyen <[email protected]> wrote:
>
>> Hey Devi,
>>
>> I'm able to successfully scale my HBase clusters up and down (core and
>> task) on EMR. Can you please provide the logs in pastebin so that we can
>> help?
>>
>> On Feb 10, 2017 12:10 AM, "Devi Sunil Kumar Shegu" <[email protected]> wrote:
>>
>> Hey Ted,
>>
>> Thanks for the reply.
>>
>> The problem with AWS EMR which doesn't allow downscaling of the slave(core)
>> nodes.
>>
>> On Thu, Feb 9, 2017 at 7:17 PM, Ted Yu <[email protected]> wrote:
>>
>> > Can you be specific about how the table didn't work ?
>> > Were some of its regions in transition or offline ?
>> >
>> > Which hbase release are you using ?
>> >
>> > Please pastebin relevant master log / region server log.
>> >
>> > Thanks
>> >
>> > > On Feb 9, 2017, at 3:50 AM, Devi Sunil Kumar Shegu <[email protected]> wrote:
>> > >
>> > > Hi,
>> > >
>> > > Please look into the following issue
>> > >
>> > > Scenario:
>> > > 1) Created an AWS EMR HBASE cluster with 1 Master node and 2 Core(Slave) nodes
>> > > 2) Created HBase table with 20 regions auto split across the 2 Core nodes
>> > > 3) I downscaled my cluster to 1 Core node
>> > > 4) The table doesn't seem to work
>> > >
>> > > Questions:
>> > > Is it wrong to downscale HBase cluster as my table doesn't seem to be
>> > > working?
>> > >
>> > > Thanks
