Hi Bala,

Please take a look at http://predictionio.apache.org/resources/faq/#running-hbase, specifically "Q: How to fix HBase issues after cleaning up a disk that was full?".
Regards,
Donald

On Fri, Apr 13, 2018 at 9:34 AM, Pat Ferrel <[email protected]> wrote:

> This may seem unhelpful now but for others it might be useful to mention
> some minimum PIO in production best practices:
>
> 1) PIO should IMO never be run in production on a single node. When all
> services share the same memory, cpu, and disk, it is very difficult to find
> the root cause to a problem.
> 2) backup data with pio export periodically
> 3) install monitoring for disk used, as well as response times and other
> factors so you get warnings before you get wedged.
> 4) PIO will store data forever. It is designed as an input only system.
> Nothing is dropped ever. This is clearly unworkable in real life so a
> feature was added to trim the event stream in a safe way in PIO 0.12.0.
> There is a separate Template for trimming the DB and doing other things
> like deduplication and other compression on some schedule that can and
> should be different than training. Do not use this template until you
> upgrade and make sure it is compatible with your template:
> https://github.com/actionml/db-cleaner
>
>
> From: bala vivek <[email protected]> <[email protected]>
> Reply: [email protected] <[email protected]>
> <[email protected]>
> Date: April 13, 2018 at 2:50:26 AM
> To: [email protected] <[email protected]>
> <[email protected]>
> Subject: Re: Hbase issue
>
> Hi Donald,
>
> Yes, I'm running on the single machine. PIO, hbase , elasticsearch, spark
> everything works on the same server. Let me know which file I need to
> remove because I have client data present in PIO.
>
> I have tried adding the entries in hbase-site.xml using the following
> link, after which I can see the Hmaster seems active but still, the error
> remains the same.
>
> https://medium.com/@tjosepraveen/cant-get-connection-to-zookeeper-keepererrorcode-connectionloss-for-hbase-63746fbcdbe7
>
>
> Hbase Error logs :- ( I have commented the server name)
>
> 2018-04-13 04:31:28,246 INFO [RS:0;VD500042:49584-SendThread(localhost:2182)] zookeeper.ClientCnxn: Opening socket connection to server localhost/0:0:0:0:0:0:0:1:2182. Will not attempt to authenticate using SASL (unknown error)
> 2018-04-13 04:31:28,247 WARN [RS:0;XXXXXX:49584-SendThread(localhost:2182)] zookeeper.ClientCnxn: Session 0x162be5554b90003 for server null, unexpected error, closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
>     at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
>     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
> 2018-04-13 04:31:28,553 ERROR [main] master.HMasterCommandLine: Master exiting
> java.lang.RuntimeException: Master not initialized after 200000ms seconds
>     at org.apache.hadoop.hbase.util.JVMClusterUtil.startup(JVMClusterUtil.java:225)
>     at org.apache.hadoop.hbase.LocalHBaseCluster.startup(LocalHBaseCluster.java:449)
>     at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:225)
>     at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:137)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>     at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
>     at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2436)
> (END)
>
> I have tried multiple time pio-stop-all and pio-start-all but no luck the
> service is not up.
> If I install the hbase alone in the existing setup let me know what things
> I should consider.
> If anyone faced this issue please provide me the
> solution steps.
>
> On Thu, Apr 12, 2018 at 9:13 PM, Donald Szeto <[email protected]> wrote:
>
>> Hi Bala,
>>
>> Are you running a single-machine HBase setup? The ZooKeeper embedded in
>> such a setup is pretty fragile to disk space issue and your ZNode might
>> have corrupted.
>>
>> If that’s indeed your setup, please take a look at HBase log files,
>> specifically on messages from ZooKeeper. In this situation, one way to
>> recover is to remove ZooKeeper files and let HBase recreate them, assuming
>> from your log output that you don’t have other services depend on the same
>> ZK.
>>
>> Regards,
>> Donald
>>
>> On Thu, Apr 12, 2018 at 5:34 AM bala vivek <[email protected]>
>> wrote:
>>
>>> Hi,
>>>
>>> I use PIO 0.10.0 version and hbase 1.2.4. The setup was working fine
>>> till today morning. I saw PIO was down as the mount space issue was present
>>> on the server and cleared the unwanted files.
>>>
>>> After doing a pio-stop-all and pio-start-all the HMaster service is not
>>> working. I tried multiple times the pio restart.
>>>
>>> I can see whenever I do a pio-stop-all and check the service using jps,
>>> the Hmaster seems running. Similarly I tried to run the ./start-hbase.sh
>>> script but still pio status is not showing as success.
>>>
>>> pio error log :
>>>
>>> [INFO] [Console$] Inspecting PredictionIO...
>>> [INFO] [Console$] PredictionIO 0.10.0-incubating is installed at /opt/tools/PredictionIO-0.10.0-incubating
>>> [INFO] [Console$] Inspecting Apache Spark...
>>> [INFO] [Console$] Apache Spark is installed at /opt/tools/PredictionIO-0.10.0-incubating/vendors/spark-1.6.3-bin-hadoop2.6
>>> [INFO] [Console$] Apache Spark 1.6.3 detected (meets minimum requirement of 1.3.0)
>>> [INFO] [Console$] Inspecting storage backend connections...
>>> [INFO] [Storage$] Verifying Meta Data Backend (Source: ELASTICSEARCH)...
>>> [INFO] [Storage$] Verifying Model Data Backend (Source: LOCALFS)...
>>> [INFO] [Storage$] Verifying Event Data Backend (Source: HBASE)...
>>> [ERROR] [RecoverableZooKeeper] ZooKeeper exists failed after 1 attempts
>>> [ERROR] [ZooKeeperWatcher] hconnection-0x7c891ba7, quorum=localhost:2181, baseZNode=/hbase Received unexpected KeeperException, re-throwing exception
>>> [WARN] [ZooKeeperRegistry] Can't retrieve clusterId from Zookeeper
>>> [ERROR] [StorageClient] Cannot connect to ZooKeeper (ZooKeeper ensemble: localhost). Please make sure that the configuration is pointing at the correct ZooKeeper ensemble. By default, HBase manages its own ZooKeeper, so if you have not configured HBase to use an external ZooKeeper, that means your HBase is not started or configured properly.
>>> [ERROR] [Storage$] Error initializing storage client for source HBASE
>>> [ERROR] [Console$] Unable to connect to all storage backends successfully. The following shows the error message from the storage backend.
>>> [ERROR] [Console$] Data source HBASE was not properly initialized. (org.apache.predictionio.data.storage.StorageClientException)
>>> [ERROR] [Console$] Dumping configuration of initialized storage backend sources. Please make sure they are correct.
>>> [ERROR] [Console$] Source Name: ELASTICSEARCH; Type: elasticsearch; Configuration: TYPE -> elasticsearch, HOME -> /opt/tools/PredictionIO-0.10.0-incubating/vendors/elasticsearch-1.7.3
>>> [ERROR] [Console$] Source Name: LOCALFS; Type: localfs; Configuration: PATH -> /root/.pio_store/models, TYPE -> localfs
>>> [ERROR] [Console$] Source Name: HBASE; Type: (error); Configuration: (error)
>>>
>>>
>>> Regards,
>>> Bala
>>>
>>
>
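
Pat's third recommendation in the thread above (monitor disk usage so a warning arrives before the node gets wedged) could be sketched roughly as follows. This is only a minimal illustration using the Python standard library, not part of PredictionIO. The function names, the 80% threshold, and the path being checked are all assumptions made for the example; a production setup would more likely rely on a proper monitoring system.

```python
import shutil


def disk_usage_fraction(path="."):
    """Return the fraction (0.0 to 1.0) of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def check_disk(path=".", warn_at=0.80):
    """Return a warning string if usage is at or above `warn_at`, otherwise None.

    `warn_at` is a hypothetical threshold chosen for illustration only.
    """
    fraction = disk_usage_fraction(path)
    if fraction >= warn_at:
        return f"WARNING: {path} is {fraction:.0%} full"
    return None


if __name__ == "__main__":
    # Point this at the mount that holds the HBase and ZooKeeper data (assumed here to be "/").
    message = check_disk("/", warn_at=0.80)
    print(message or "disk usage OK")
```

Run from cron or a similar scheduler, a check like this would flag a filling mount before HBase and its embedded ZooKeeper run out of space.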
