This may seem unhelpful now, but for others it might be useful to mention some minimum best practices for running PIO in production:
1) PIO should IMO never be run in production on a single node. When all services share the same memory, CPU, and disk, it is very difficult to find the root cause of a problem.
2) Back up data with pio export periodically.
3) Install monitoring for disk usage, as well as response times and other factors, so you get warnings before you get wedged.
4) PIO will store data forever. It is designed as an input-only system. Nothing is ever dropped. This is clearly unworkable in real life, so a feature was added in PIO 0.12.0 to trim the event stream in a safe way. There is a separate Template for trimming the DB and doing other things like deduplication and other compression on some schedule that can and should be different than training. Do not use this template until you upgrade and make sure it is compatible with your template: https://github.com/actionml/db-cleaner

From: bala vivek <[email protected]>
Reply: [email protected] <[email protected]>
Date: April 13, 2018 at 2:50:26 AM
To: [email protected] <[email protected]>
Subject: Re: Hbase issue

Hi Donald,

Yes, I'm running on a single machine. PIO, HBase, Elasticsearch, and Spark all run on the same server. Let me know which files I need to remove, because I have client data present in PIO.

I tried adding the entries in hbase-site.xml using the following link, after which the HMaster seems active, but the error remains the same.

https://medium.com/@tjosepraveen/cant-get-connection-to-zookeeper-keepererrorcode-connectionloss-for-hbase-63746fbcdbe7

HBase error logs (I have masked the server name):

2018-04-13 04:31:28,246 INFO [RS:0;VD500042:49584-SendThread(localhost:2182)] zookeeper.ClientCnxn: Opening socket connection to server localhost/0:0:0:0:0:0:0:1:2182. Will not attempt to authenticate using SASL (unknown error)
2018-04-13 04:31:28,247 WARN [RS:0;XXXXXX:49584-SendThread(localhost:2182)] zookeeper.ClientCnxn: Session 0x162be5554b90003 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
2018-04-13 04:31:28,553 ERROR [main] master.HMasterCommandLine: Master exiting
java.lang.RuntimeException: Master not initialized after 200000ms seconds
    at org.apache.hadoop.hbase.util.JVMClusterUtil.startup(JVMClusterUtil.java:225)
    at org.apache.hadoop.hbase.LocalHBaseCluster.startup(LocalHBaseCluster.java:449)
    at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:225)
    at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:137)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
    at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2436)
(END)

I have tried pio-stop-all and pio-start-all multiple times, but no luck; the service is not up. If I install HBase alone in the existing setup, let me know what I should consider. If anyone has faced this issue, please provide the solution steps.

On Thu, Apr 12, 2018 at 9:13 PM, Donald Szeto <[email protected]> wrote:

Hi Bala,

Are you running a single-machine HBase setup? The ZooKeeper embedded in such a setup is pretty fragile to disk space issues, and your ZNode might have been corrupted. If that’s indeed your setup, please take a look at the HBase log files, specifically at messages from ZooKeeper. In this situation, one way to recover is to remove the ZooKeeper files and let HBase recreate them, assuming from your log output that you don’t have other services that depend on the same ZK.

Regards,
Donald

On Thu, Apr 12, 2018 at 5:34 AM bala vivek <[email protected]> wrote:

Hi,

I use PIO 0.10.0 and HBase 1.2.4. The setup was working fine until this morning. I saw that PIO was down because the mount on the server had run out of space, so I cleared the unwanted files. After doing a pio-stop-all and pio-start-all, the HMaster service is not working. I tried restarting PIO multiple times. Whenever I do a pio-stop-all and check the services using jps, the HMaster still seems to be running. Similarly, I tried running the ./start-hbase.sh script, but pio status still does not show success.

pio error log:

[INFO] [Console$] Inspecting PredictionIO...
[INFO] [Console$] PredictionIO 0.10.0-incubating is installed at /opt/tools/PredictionIO-0.10.0-incubating
[INFO] [Console$] Inspecting Apache Spark...
[INFO] [Console$] Apache Spark is installed at /opt/tools/PredictionIO-0.10.0-incubating/vendors/spark-1.6.3-bin-hadoop2.6
[INFO] [Console$] Apache Spark 1.6.3 detected (meets minimum requirement of 1.3.0)
[INFO] [Console$] Inspecting storage backend connections...
[INFO] [Storage$] Verifying Meta Data Backend (Source: ELASTICSEARCH)...
[INFO] [Storage$] Verifying Model Data Backend (Source: LOCALFS)...
[INFO] [Storage$] Verifying Event Data Backend (Source: HBASE)...
[ERROR] [RecoverableZooKeeper] ZooKeeper exists failed after 1 attempts
[ERROR] [ZooKeeperWatcher] hconnection-0x7c891ba7, quorum=localhost:2181, baseZNode=/hbase Received unexpected KeeperException, re-throwing exception
[WARN] [ZooKeeperRegistry] Can't retrieve clusterId from Zookeeper
[ERROR] [StorageClient] Cannot connect to ZooKeeper (ZooKeeper ensemble: localhost). Please make sure that the configuration is pointing at the correct ZooKeeper ensemble. By default, HBase manages its own ZooKeeper, so if you have not configured HBase to use an external ZooKeeper, that means your HBase is not started or configured properly.
[ERROR] [Storage$] Error initializing storage client for source HBASE
[ERROR] [Console$] Unable to connect to all storage backends successfully. The following shows the error message from the storage backend.
[ERROR] [Console$] Data source HBASE was not properly initialized. (org.apache.predictionio.data.storage.StorageClientException)
[ERROR] [Console$] Dumping configuration of initialized storage backend sources. Please make sure they are correct.
[ERROR] [Console$] Source Name: ELASTICSEARCH; Type: elasticsearch; Configuration: TYPE -> elasticsearch, HOME -> /opt/tools/PredictionIO-0.10.0-incubating/vendors/elasticsearch-1.7.3
[ERROR] [Console$] Source Name: LOCALFS; Type: localfs; Configuration: PATH -> /root/.pio_store/models, TYPE -> localfs
[ERROR] [Console$] Source Name: HBASE; Type: (error); Configuration: (error)

Regards,
Bala
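
For anyone landing on this thread later: the outage above started with a full mount, which is what point 3 of the best practices at the top is meant to catch. Below is a minimal sketch of such a disk-usage check in Python, using only the standard library. The watched path, the 80% threshold, and the function names are assumptions for illustration and are not part of PIO or HBase; a real setup would more likely use an existing monitoring system and point it at the mounts that hold the HBase and ZooKeeper data.

```python
import shutil

# Assumption: replace with the mount points that hold your HBase,
# ZooKeeper, and Elasticsearch data. "." is only a placeholder.
PATHS_TO_WATCH = ["."]

# Assumption: warn once a filesystem is 80% full.
WARN_AT = 0.80


def fraction_used(used_bytes: int, total_bytes: int) -> float:
    """Return used/total as a fraction, or 0.0 for an empty filesystem."""
    if total_bytes <= 0:
        return 0.0
    return used_bytes / total_bytes


def is_over_threshold(used_bytes: int, total_bytes: int, warn_at: float) -> bool:
    """True when the used fraction has reached the warning threshold."""
    return fraction_used(used_bytes, total_bytes) >= warn_at


def check_path(path: str, warn_at: float = WARN_AT) -> str:
    """Build a one-line status message for the filesystem holding `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    level = "WARN" if is_over_threshold(usage.used, usage.total, warn_at) else "OK"
    percent = 100 * fraction_used(usage.used, usage.total)
    return f"[{level}] {path}: {percent:.1f}% used"


if __name__ == "__main__":
    for watched in PATHS_TO_WATCH:
        print(check_path(watched))
```

Run from cron or a similar scheduler, a check like this gives a warning while there is still room to act, rather than after ZooKeeper has already failed to write to disk.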
