You are way underpowered. I don't think you are going to get reasonable performance out of this hardware with so many processes running on it (especially memory-heavy processes like HBase); obviously the severity depends on your use case.

I would say you can decrease the memory allocation to the namenode/datanode/secondary namenode/HBase master/ZooKeeper and increase the allocation to the region server.
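For example, something along these lines in hadoop-env.sh and hbase-env.sh (the numbers are purely illustrative and the exact *_OPTS variable names vary a bit between Hadoop/HBase versions, so treat this as a sketch rather than a recommendation):

    # hadoop-env.sh: cap the HDFS daemons below the 1000 MB HADOOP_HEAPSIZE default
    export HADOOP_NAMENODE_OPTS="-Xmx256m $HADOOP_NAMENODE_OPTS"
    export HADOOP_SECONDARYNAMENODE_OPTS="-Xmx256m $HADOOP_SECONDARYNAMENODE_OPTS"
    export HADOOP_DATANODE_OPTS="-Xmx512m $HADOOP_DATANODE_OPTS"

    # hbase-env.sh: keep master and zookeeper small, give the freed memory to the region server
    export HBASE_MASTER_OPTS="-Xmx256m $HBASE_MASTER_OPTS"
    export HBASE_ZOOKEEPER_OPTS="-Xmx256m $HBASE_ZOOKEEPER_OPTS"
    export HBASE_REGIONSERVER_OPTS="-Xmx2560m $HBASE_REGIONSERVER_OPTS"

In the stock start scripts these per-daemon opts are appended after the generic -Xmx from HADOOP_HEAPSIZE/HBASE_HEAPSIZE, so the later -Xmx is the one that takes effect.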
Regards,
Dhaval

________________________________
From: Vimal Jain <[email protected]>
To: [email protected]
Sent: Wednesday, 7 August 2013 12:47 PM
Subject: Re: Memory distribution for Hadoop/Hbase processes

Hi Ted,
I am using CentOS. I could not get the output of "ps aux | grep pid" as HBase/Hadoop is currently down in production due to some internal reasons.
Can you please help me figure out the memory distribution for my single node cluster (pseudo-distributed mode)? Currently it has just 4 GB RAM; I can also try to take it up to 6 GB.
I have come up with the following distribution:

Name node - 512 MB
Data node - 1024 MB
Secondary Name node - 512 MB
HMaster - 512 MB
HRegion - 2048 MB
Zookeeper - 512 MB

So the total memory allocation is 5 GB and I still have 1 GB left for the OS.

1) Is it fine to go ahead with this configuration in production? (I am asking because I had "long GC pause" problems in the past when I did not change the JVM memory allocation configuration in hbase-env.sh and hadoop-env.sh, so each of the 6 processes was taking the default value of 1 GB, i.e. 6 GB in total, while I had only 4 GB of RAM. After that I assigned 1.5 GB to HRegion and 512 MB each to HMaster and Zookeeper, but forgot to change it for the Hadoop processes. I also changed the kernel parameter vm.swappiness to 0. After this it was working fine.)
2) Currently I am running in pseudo-distributed mode as my data size is at most 10-15 GB at present. How easy is it to migrate from pseudo-distributed mode to fully distributed mode in the future if my data size increases? (which will be the case for sure)

Thanks for your help. Really appreciate it.

On Sun, Aug 4, 2013 at 8:12 PM, Kevin O'dell <[email protected]> wrote:

> My questions are :
> 1) How this thing is working ?
>
> It is working because Java can over-allocate memory. You will know you are
> using too much memory when the kernel starts killing processes.
>
> 2) I just have one table whose size at present is about 10-15 GB , so what
> should be the ideal memory distribution ?
>
> Really you should get a box with more memory. You can currently only hold
> about ~400 MB in memory.
>
> On Aug 4, 2013 9:58 AM, "Ted Yu" <[email protected]> wrote:
> >
> > What OS are you using ?
> >
> > What is the output from the following command ?
> > ps aux | grep pid
> > where pid is the process id for the Namenode, Datanode, etc.
> >
> > Cheers
> >
> > On Sun, Aug 4, 2013 at 6:33 AM, Vimal Jain <[email protected]> wrote:
> > >
> > > Hi,
> > > I have configured HBase in pseudo-distributed mode with HDFS as the
> > > underlying storage. I am not using the MapReduce framework as of now.
> > > I have 4 GB RAM.
> > > Currently I have the following distribution of memory:
> > >
> > > Data Node, Name Node, Secondary Name Node: 1000 MB each (the default
> > > HADOOP_HEAPSIZE)
> > >
> > > HMaster - 512 MB
> > > HRegion - 1536 MB
> > > Zookeeper - 512 MB
> > >
> > > So the total heap allocation comes to 5.5 GB, which is absurd as my
> > > total RAM is only 4 GB, but the setup is still working fine in
> > > production. :-O
> > >
> > > My questions are :
> > > 1) How is this thing working ?
> > > 2) I just have one table whose size at present is about 10-15 GB, so
> > > what should be the ideal memory distribution ?
> > >
> > > --
> > > Thanks and Regards,
> > > Vimal Jain
> >

--
Thanks and Regards,
Vimal Jain
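For reference, the vm.swappiness change Vimal mentions above is typically applied like this on a CentOS-style box (a sketch of the usual sysctl steps, not commands taken from the thread itself):

    # take effect immediately, without a reboot
    sysctl -w vm.swappiness=0
    # make the setting persistent across reboots
    echo "vm.swappiness = 0" >> /etc/sysctl.conf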
