Hello again,

I solved the earlier problem by following https://issues.apache.org/jira/browse/ZOOKEEPER-1621, and `pio status` now returns a normal result, which seems great. However, I now receive a 500 (Internal Server Error) response with the message "The server was not able to produce a timely response to your request."

Also, when I run `pio train`, it fails with the following message:

Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=35, exceptions:
Sat Mar 11 14:00:10 CST 2017, org.apache.hadoop.hbase.client.RpcRetryingCaller@7dfeb08d, java.net.ConnectException: Connection refused
Sat Mar 11 14:00:10 CST 2017, org.apache.hadoop.hbase.client.RpcRetryingCaller@7dfeb08d, org.apache.hadoop.hbase.ipc.RpcClient$FailedServerException: This server is in the failed servers list: PredictIO3.ucf.com/10.1.3.153:37708
Sat Mar 11 14:00:11 CST 2017, org.apache.hadoop.hbase.client.RpcRetryingCaller@7dfeb08d, org.apache.hadoop.hbase.ipc.RpcClient$FailedServerException: This server is in the failed servers list: PredictIO3.ucf.com/10.1.3.153:37708
Sat Mar 11 14:00:12 CST 2017, org.apache.hadoop.hbase.client.RpcRetryingCaller@7dfeb08d, org.apache.hadoop.hbase.ipc.RpcClient$FailedServerException: This server is in the failed servers list: PredictIO3.ucf.com/10.1.3.153:37708
Sat Mar 11 14:00:14 CST 2017, org.apache.hadoop.hbase.client.RpcRetryingCaller@7dfeb08d, java.net.ConnectException: Connection refused
Sat Mar 11 14:00:18 CST 2017, org.apache.hadoop.hbase.client.RpcRetryingCaller@7dfeb08d, java.net.ConnectException: Connection refused
Sat Mar 11 14:00:28 CST 2017, org.apache.hadoop.hbase.client.RpcRetryingCaller@7dfeb08d, java.net.ConnectException: Connection refused
Sat Mar 11 14:00:38 CST 2017, org.apache.hadoop.hbase.client.RpcRetryingCaller@7dfeb08d, java.net.ConnectException: Connection refused
Sat Mar 11 14:00:48 CST 2017, org.apache.hadoop.hbase.client.RpcRetryingCaller@7dfeb08d, java.net.ConnectException: Connection refused
Sat Mar 11 14:00:58 CST 2017, org.apache.hadoop.hbase.client.RpcRetryingCaller@7dfeb08d, java.net.ConnectException: Connection refused
Sat Mar 11 14:01:18 CST 2017, org.apache.hadoop.hbase.client.RpcRetryingCaller@7dfeb08d, java.net.ConnectException: Connection refused
Sat Mar 11 14:01:38 CST 2017, org.apache.hadoop.hbase.client.RpcRetryingCaller@7dfeb08d, java.net.ConnectException: Connection refused
Sat Mar 11 14:01:58 CST 2017, org.apache.hadoop.hbase.client.RpcRetryingCaller@7dfeb08d, java.net.ConnectException: Connection refused
Sat Mar 11 14:02:18 CST 2017, org.apache.hadoop.hbase.client.RpcRetryingCaller@7dfeb08d, java.net.ConnectException: Connection refused
Sat Mar 11 14:02:39 CST 2017, org.apache.hadoop.hbase.client.RpcRetryingCaller@7dfeb08d, java.net.ConnectException: Connection refused
Sat Mar 11 14:02:59 CST 2017, org.apache.hadoop.hbase.client.RpcRetryingCaller@7dfeb08d, java.net.ConnectException: Connection refused
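
A "Connection refused" in the trace above usually means that nothing was accepting TCP connections on the target port at that moment. The following is only a minimal sketch for checking that from the affected host, assuming Python 3 is available there. The helper name `port_open` is hypothetical; the endpoints are copied from this thread (the ZooKeeper quorum localhost:2181 and the server address 10.1.3.153:37708), and a successful connect says nothing about whether HBase itself is healthy.

```python
import socket


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # Endpoints taken from the log output in this thread (assumed still current).
    targets = [("localhost", 2181), ("10.1.3.153", 37708)]
    for host, port in targets:
        state = "open" if port_open(host, port) else "refused or unreachable"
        print(f"{host}:{port} -> {state}")
```

If the ZooKeeper port is open but the second port is not, that would be consistent with the server process on that port not running, though it does not by itself explain why.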

I have tried deleting everything inside /hbase/zookeeper, following some advice I found online, but the issue remains. Has anyone run into this failure and solved it?

Thank you, I appreciate any help!

Best regards,
Amy

Lin Amy <[email protected]> wrote on Saturday, March 11, 2017 at 10:28 AM:

> Hello,
>
> Yesterday I found that the disk was full, which led to an HBase failure:
>
> stopping hbase/home/crs/PredictionIO-0.10.0-incubating/vendors/hbase-1.0.0/bin/stop-hbase.sh: line 50: echo: write error: No space left on device
> Java HotSpot(TM) 64-Bit Server VM warning: Insufficient space for shared memory file:
>    853
> Try using the -Djava.io.tmpdir= option to select an alternate temp location.
>
> So I freed up a lot of disk space and ran `pio-stop-all` and `pio-start-all`. Then `pio status` gave me this error:
> -----------------------------------------------------
> [INFO] [Console$] Inspecting PredictionIO...
> [INFO] [Console$] PredictionIO 0.10.0-incubating is installed at /home/crs/PredictionIO-0.10.0-incubating
> [INFO] [Console$] Inspecting Apache Spark...
> [INFO] [Console$] Apache Spark is installed at /home/crs/PredictionIO-0.10.0-incubating/vendors/spark-1.6.2-bin-hadoop2.6
> [INFO] [Console$] Apache Spark 1.6.2 detected (meets minimum requirement of 1.3.0)
> [INFO] [Console$] Inspecting storage backend connections...
> [INFO] [Storage$] Verifying Meta Data Backend (Source: ELASTICSEARCH)...
> [INFO] [Storage$] Verifying Model Data Backend (Source: LOCALFS)...
> [INFO] [Storage$] Verifying Event Data Backend (Source: HBASE)...
> [ERROR] [RecoverableZooKeeper] ZooKeeper exists failed after 1 attempts
> [ERROR] [ZooKeeperWatcher] hconnection-0x3fc05ea2, quorum=localhost:2181, baseZNode=/hbase Received unexpected KeeperException, re-throwing exception
> [WARN] [ZooKeeperRegistry] Can't retrieve clusterId from Zookeeper
> [ERROR] [StorageClient] Cannot connect to ZooKeeper (ZooKeeper ensemble: localhost). Please make sure that the configuration is pointing at the correct ZooKeeper ensemble. By default, HBase manages its own ZooKeeper, so if you have not configured HBase to use an external ZooKeeper, that means your HBase is not started or configured properly.
> [ERROR] [Storage$] Error initializing storage client for source HBASE
> [ERROR] [Console$] Unable to connect to all storage backends successfully. The following shows the error message from the storage backend.
> [ERROR] [Console$] Data source HBASE was not properly initialized. (org.apache.predictionio.data.storage.StorageClientException)
> [ERROR] [Console$] Dumping configuration of initialized storage backend sources. Please make sure they are correct.
> [ERROR] [Console$] Source Name: ELASTICSEARCH; Type: elasticsearch; Configuration: HOME -> /home/crs/PredictionIO-0.10.0-incubating/vendors/elasticsearch-1.7.5, HOSTS -> Slave2,PredictIO3, PORTS -> 9300,9320, CLUSTERNAME -> CRS, TYPE -> elasticsearch
> [ERROR] [Console$] Source Name: LOCALFS; Type: localfs; Configuration: PATH -> /home/crs/.pio_store/models, TYPE -> localfs
> [ERROR] [Console$] Source Name: HBASE; Type: (error); Configuration: (error)
> ------------------------------------------------------
>
> My guess is that it fails whenever it tries to restart ZooKeeper.
>
> My pio-env.sh and some errors from `hbase-crs-master-PredictIO3.log` are also attached.
>
> Thank you!!!!
>
> Best regards,
> Amy
>
