It can be lots of things, Manjeet. You've gotta do a bit of troubleshooting yourself first; a long dump of your machine specs doesn't change that.
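As a concrete first step in that kind of self-troubleshooting, one thing worth checking is whether the services the master depends on are even reachable. A minimal sketch in Python (the hostnames and ports below are assumptions, not taken from the thread — substitute your own ZooKeeper quorum, NameNode, and master addresses; 2181/8020/16000 are just common defaults):

```python
import socket

def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts, and DNS failures
        return False

if __name__ == "__main__":
    # Hypothetical hosts/ports for illustration -- replace with your cluster's.
    checks = {
        "zookeeper  (node1:2181)": ("node1", 2181),
        "namenode   (node1:8020)": ("node1", 8020),
        "hbase master (node1:16000)": ("node1", 16000),
    }
    for name, (host, port) in checks.items():
        state = "reachable" if port_open(host, port) else "NOT reachable"
        print(f"{name}: {state}")
```

If ZooKeeper or the NameNode is not reachable, the master has no chance of finishing startup, and "Server is not running yet" is exactly what its RPC endpoint will report in the meantime.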
Can you describe what happened before/after the node went down? The log just says the server isn't running, so we can't tell much from that alone.

-Dima

On Wed, Oct 19, 2016 at 10:53 PM, Manjeet Singh <[email protected]> wrote:
> I want to add a few more points.
>
> Below is my cluster configuration:
>
> Node 1 (Name Node)
>   CPU: 2x6 core (12 cores total)
>   Disks: 6x300 GB (OS: 300 GB RAID-1; data: single 900 GB RAID-10)
>   RAM: 96 GB
>   Components: HBase Master, HDFS NameNode, ZooKeeper Server, Spark History
>     Server, Phoenix, HDFS Balancer, Spark Gateway, MySQL
>   YARN (MR2 included): JobHistory Server
>     <http://192.168.129.121:7180/cmf/services/10/instances/25/status>,
>     ResourceManager
>     <http://192.168.129.121:7180/cmf/services/10/instances/26/status>
>
> Node 2 (Data Node, Spark Node)
>   CPU: 2x6 core (12 cores total)
>   Disks: 6x300 GB (OS: 300 GB; data: 6 x 300 GB individual RAID-0)
>   RAM: 80 GB
>   Components: HDFS DataNode, HBase RegionServer, ZooKeeper Server, Spark,
>     HBase Master
>   YARN (MR2 included): NodeManager
>
> Node 3 (Data Node, Spark Node)
>   CPU: 2x6 core (12 cores total)
>   Disks: 6x300 GB (OS: 300 GB; data: 6 x 300 GB individual RAID-0)
>   RAM: 80 GB
>   Components: HDFS DataNode, HBase RegionServer, ZooKeeper Server, Spark
>   YARN (MR2 included): NodeManager
>
> Node 4 (Data Node, Spark Node)
>   CPU: 2x6 core (12 cores total)
>   Disks: 8x300 GB (OS: 300 GB; data: 6 x 300 GB individual RAID-0)
>   RAM: 80 GB
>   Components: HDFS DataNode, HBase RegionServer, Spark
>   YARN (MR2 included): NodeManager
>
> I noticed that HBase was taking more time on reads, so I changed the
> properties below to improve read performance:
>
>   Property                                  Original value  Changed value
>   hfile.block.cache.size                    0.4             0.6
>   hbase.regionserver.global.memstore.size   0.4             0.2
>
> Below is some more information. I have a Spark ETL job on the same
> cluster, and the parameters below are from running this job:
>
>   Number of pipelines:               2 (Kafka)
>   Raw size of Kafka messages:        21 GB
>   Data rate:                         1 MB/sec per pipeline
>   Size of aggregated data in HBase:  2.6 GB (with Snappy and major compaction)
>   Batch duration:                    30 sec
>   Sliding window (window duration):  900 sec [15 minutes]
>   CPU utilization:                   63.2%
>   Number of executors:               3 per pipeline
>   Allocated RAM:                     3 GB per pipeline
>   Cluster network I/O:               3.2 MB/sec
>   Cluster disk I/O:                  3.5 MB/sec
>   Max time (highest peak) taken by Spark ETL to process 900 MB of data
>     for Domain:                      2 hours
>   Max time (highest peak) taken by Spark ETL to process 900 MB of data
>     for Application:                 30 minutes
>   Total time taken by the Kafka simulator to push the data into Kafka: 6 h
>   Total time taken by Spark ETL to process all the data:               7 h
>   Number of SQL queries:             10
>   Number of profiles:                9
>   Number of rows in HBase:           11,015,719
>
> Thanks
> Manjeet
>
> On Thu, Oct 20, 2016 at 10:45 AM, Manjeet Singh
> <[email protected]> wrote:
> > Hi All
> >
> > Can anyone help me figure out the root cause? I have a 4-node cluster,
> > one data node went down, and I don't understand why my HBase Master is
> > not able to come up.
> >
> > I have the below log:
> >
> > ERROR: org.apache.hadoop.hbase.ipc.ServerNotRunningYetException: Server is not running yet
> >         at org.apache.hadoop.hbase.master.HMaster.checkServiceStarted(HMaster.java:2296)
> >         at org.apache.hadoop.hbase.master.MasterRpcServices.isMasterRunning(MasterRpcServices.java:936)
> >         at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:55654)
> >         at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2170)
> >         at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:109)
> >         at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
> >         at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
> >         at java.lang.Thread.run(Thread.java:745)
> >
> > Thanks
> > Manjeet
> >
> > --
> > luv all
>
> --
> luv all
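For reference, the block-cache/memstore change described in the thread is the kind of thing that ends up in hbase-site.xml. A sketch with the values from the thread (on a Cloudera Manager cluster like this one the change is normally made through the configuration UI rather than by editing the file by hand):

```xml
<!-- Sketch of the read-tuning change described in the thread:
     more RegionServer heap for the read block cache, less for write memstores. -->
<property>
  <name>hfile.block.cache.size</name>
  <value>0.6</value> <!-- fraction of heap for the block cache (was 0.4) -->
</property>
<property>
  <name>hbase.regionserver.global.memstore.size</name>
  <value>0.2</value> <!-- fraction of heap for all memstores combined (was 0.4) -->
</property>
```

One caution: HBase rejects configurations where these two fractions sum to more than 0.8 (the RegionServer refuses to start), and 0.6 + 0.2 sits exactly at that limit, leaving no slack for the rest of the heap — worth keeping in mind when a server will not come up after retuning.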
