Hi Donald,

The link was helpful, but here is what I observe:

 > I don't see a separate ZooKeeper folder inside HBase, so the Unix "find"
command returns many results for the terms 'snapshot' and 'log'.
 > The hbase hbck -repair and hbase hbck -repairHoles commands run without
any error, but when I run 'hbase hbck' I see the following errors.

2018-04-15 12:29:04,158 ERROR [main]
client.ConnectionManager$HConnectionImplementation: Can't get connection to
ZooKeeper: KeeperErrorCode = ConnectionLoss for /hbase
2018-04-15 12:29:04,158 WARN  [main-SendThread(localhost:2182)]
zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error,
closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
   at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
   at
org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
2018-04-15 12:29:04,158 INFO  [main] client.RpcRetryingCaller: Call
exception, tries=18, retries=35, started=518330 ms ago, cancelled=false,
msg=
2018-04-15 12:29:05,259 INFO  [main-SendThread(localhost:2182)]
zookeeper.ClientCnxn: Opening socket connection to server
localhost/127.0.0.1:2182. Will not attempt to authenticate using SASL
 (unknown error)
2018-04-15 12:29:05,259 WARN  [main-SendThread(localhost:2182)]
zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error,
closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
   at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
   at
org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
2018-04-15 12:29:05,359 INFO  [main-SendThread(localhost:2182)]
zookeeper.ClientCnxn: Opening socket connection to server
localhost/0:0:0:0:0:0:0:1:2182. Will not attempt to authenticate using SASL
 (unknown error)
2018-04-15 12:29:05,360 WARN  [main-SendThread(localhost:2182)]
zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error,
closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
   at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
   at
org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
^C2018-04-15 12:29:05,455 INFO  [Thread-3]
client.ConnectionManager$HConnectionImplementation: Closing zookeeper
sessionid=0x0
2018-04-15 12:29:05,461 INFO  [Thread-3] zookeeper.ZooKeeper: Session: 0x0
closed
2018-04-15 12:29:05,461 INFO  [main-EventThread] zookeeper.ClientCnxn:
EventThread shut down


I also see a few errors like this:

2018-04-15 12:29:02,954 INFO  [main-SendThread(localhost:2182)]
zookeeper.ClientCnxn: Opening socket connection to server
localhost/0:0:0:0:0:0:0:1:2182. Will not attempt to authenticate using SASL
 (unknown error)
2018-04-15 12:29:02,954 WARN  [main-SendThread(localhost:2182)]
zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error,
closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
   at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
   at
org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)


I can see that the hostname/IP is correctly configured on the server where
I run PIO.
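
One way to confirm whether anything is actually listening on the ZooKeeper port is a plain TCP probe. The sketch below is only an illustration and is not part of PIO or HBase: the helper name port_open is made up, and the two ports are simply the ones that show up in the logs in this thread (2182 in the hbck output above, 2181 in the pio status output quoted further down).

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds.

    Hypothetical helper for illustration; it only tests reachability,
    it does not speak the ZooKeeper protocol.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts, and name-resolution errors.
        return False


if __name__ == "__main__":
    # Ports taken from the logs in this thread; adjust to your own setup.
    for port in (2181, 2182):
        if port_open("localhost", port):
            print(f"localhost:{port} is accepting connections")
        else:
            print(f"localhost:{port} is refused or unreachable")
```

If neither port accepts connections, the embedded ZooKeeper is most likely not running at all. If only one does, it may be worth checking that the client and HBase are configured with the same port.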

Regards,

Bala


On Sat, Apr 14, 2018 at 2:09 AM, Donald Szeto <don...@apache.org> wrote:

> Hi Bala,
>
> Please take a look at
> http://predictionio.apache.org/resources/faq/#running-hbase, specifically
> "Q: How to fix HBase issues after cleaning up a disk that was full?".
>
> Regards,
> Donald
>
> On Fri, Apr 13, 2018 at 9:34 AM, Pat Ferrel <p...@occamsmachete.com> wrote:
>
>> This may seem unhelpful now but for others it might be useful to mention
>> some minimum PIO in production best practices:
>>
>> 1) PIO should IMO never be run in production on a single node. When all
>> services share the same memory, CPU, and disk, it is very difficult to find
>> the root cause of a problem.
>> 2) Back up data with pio export periodically.
>> 3) Install monitoring for disk usage, as well as response times and other
>> factors, so you get warnings before you get wedged.
>> 4) PIO will store data forever. It is designed as an input only system.
>> Nothing is dropped ever. This is clearly unworkable in real life so a
>> feature was added to trim the event stream in a safe way in PIO 0.12.0.
>> There is a separate Template for trimming the DB and doing other things
>> like deduplication and other compression on some schedule that can and
>> should be different than training. Do not use this template until you
>> upgrade and make sure it is compatible with your template:
>> https://github.com/actionml/db-cleaner
>>
>>
>> From: bala vivek <bala.vivek...@gmail.com> <bala.vivek...@gmail.com>
>> Reply: user@predictionio.apache.org <user@predictionio.apache.org>
>> <user@predictionio.apache.org>
>> Date: April 13, 2018 at 2:50:26 AM
>> To: user@predictionio.apache.org <user@predictionio.apache.org>
>> <user@predictionio.apache.org>
>> Subject:  Re: Hbase issue
>>
>> Hi Donald,
>>
>> Yes, I'm running on a single machine. PIO, HBase, Elasticsearch, and Spark
>> all run on the same server. Let me know which files I need to
>> remove, because I have client data present in PIO.
>>
>> I have tried adding the entries to hbase-site.xml using the following
>> link, after which HMaster seems to be active, but the error
>> remains the same.
>>
>> https://medium.com/@tjosepraveen/cant-get-connection-to-zookeeper-keepererrorcode-connectionloss-for-hbase-63746fbcdbe7
>>
>>
>> HBase error logs (I have masked the server name):
>>
>> 2018-04-13 04:31:28,246 INFO  
>> [RS:0;VD500042:49584-SendThread(localhost:2182)]
>> zookeeper.ClientCnxn: Opening socket connection to server
>> localhost/0:0:0:0:0:0:0:1:2182. Will not attempt to authenticate using
>> SASL (unknown error)
>> 2018-04-13 04:31:28,247 WARN  [RS:0;XXXXXX:49584-SendThread(localhost:2182)]
>> zookeeper.ClientCnxn: Session 0x162be5554b90003 for server null, unexpected
>> error, closing socket connection and attempting reconnect
>> java.net.ConnectException: Connection refused
>>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
>>         at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
>>         at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
>> 2018-04-13 04:31:28,553 ERROR [main] master.HMasterCommandLine: Master
>> exiting
>> java.lang.RuntimeException: Master not initialized after 200000ms seconds
>>         at org.apache.hadoop.hbase.util.JVMClusterUtil.startup(JVMClusterUtil.java:225)
>>         at org.apache.hadoop.hbase.LocalHBaseCluster.startup(LocalHBaseCluster.java:449)
>>         at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:225)
>>         at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:137)
>>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>>         at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
>>         at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2436)
>> (END)
>>
>> I have tried pio-stop-all and pio-start-all multiple times, but no luck;
>> the service is still not up.
>> If I install HBase alone in the existing setup, let me know what
>> things I should consider. If anyone has faced this issue, please provide
>> me the solution steps.
>>
>> On Thu, Apr 12, 2018 at 9:13 PM, Donald Szeto <don...@apache.org> wrote:
>>
>>> Hi Bala,
>>>
>>> Are you running a single-machine HBase setup? The ZooKeeper embedded in
>>> such a setup is pretty fragile when disk space runs out, and your ZNode
>>> might have been corrupted.
>>>
>>> If that’s indeed your setup, please take a look at the HBase log files,
>>> specifically at messages from ZooKeeper. In this situation, one way to
>>> recover is to remove the ZooKeeper files and let HBase recreate them, assuming
>>> from your log output that you don’t have other services depending on the same
>>> ZK.
>>>
>>> Regards,
>>> Donald
>>>
>>> On Thu, Apr 12, 2018 at 5:34 AM bala vivek <bala.vivek...@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I use PIO 0.10.0 and HBase 1.2.4. The setup was working fine
>>>> until this morning. I saw that PIO was down because the mount had run out
>>>> of space on the server, so I cleared the unwanted files.
>>>>
>>>> After doing a pio-stop-all and pio-start-all, the HMaster service is not
>>>> working. I tried restarting PIO multiple times.
>>>>
>>>> Whenever I do a pio-stop-all and check the services using jps,
>>>> HMaster still seems to be running. Similarly, I tried running the
>>>> ./start-hbase.sh script, but pio status still does not report success.
>>>>
>>>> pio error log :
>>>>
>>>> [INFO] [Console$] Inspecting PredictionIO...
>>>> [INFO] [Console$] PredictionIO 0.10.0-incubating is installed at
>>>> /opt/tools/PredictionIO-0.10.0-incubating
>>>> [INFO] [Console$] Inspecting Apache Spark...
>>>> [INFO] [Console$] Apache Spark is installed at
>>>> /opt/tools/PredictionIO-0.10.0-incubating/vendors/spark-1.6.3-bin-hadoop2.6
>>>> [INFO] [Console$] Apache Spark 1.6.3 detected (meets minimum
>>>> requirement of 1.3.0)
>>>> [INFO] [Console$] Inspecting storage backend connections...
>>>> [INFO] [Storage$] Verifying Meta Data Backend (Source: ELASTICSEARCH)...
>>>> [INFO] [Storage$] Verifying Model Data Backend (Source: LOCALFS)...
>>>> [INFO] [Storage$] Verifying Event Data Backend (Source: HBASE)...
>>>> [ERROR] [RecoverableZooKeeper] ZooKeeper exists failed after 1 attempts
>>>> [ERROR] [ZooKeeperWatcher] hconnection-0x7c891ba7,
>>>> quorum=localhost:2181, baseZNode=/hbase Received unexpected
>>>> KeeperException, re-throwing exception
>>>> [WARN] [ZooKeeperRegistry] Can't retrieve clusterId from Zookeeper
>>>> [ERROR] [StorageClient] Cannot connect to ZooKeeper (ZooKeeper
>>>> ensemble: localhost). Please make sure that the configuration is pointing
>>>> at the correct ZooKeeper ensemble. By default, HBase manages its own
>>>> ZooKeeper, so if you have not configured HBase to use an external
>>>> ZooKeeper, that means your HBase is not started or configured properly.
>>>> [ERROR] [Storage$] Error initializing storage client for source HBASE
>>>> [ERROR] [Console$] Unable to connect to all storage backends
>>>> successfully. The following shows the error message from the storage
>>>> backend.
>>>> [ERROR] [Console$] Data source HBASE was not properly initialized.
>>>> (org.apache.predictionio.data.storage.StorageClientException)
>>>> [ERROR] [Console$] Dumping configuration of initialized storage backend
>>>> sources. Please make sure they are correct.
>>>> [ERROR] [Console$] Source Name: ELASTICSEARCH; Type: elasticsearch;
>>>> Configuration: TYPE -> elasticsearch, HOME ->
>>>> /opt/tools/PredictionIO-0.10.0-incubating/vendors/elasticsearch-1.7.3
>>>> [ERROR] [Console$] Source Name: LOCALFS; Type: localfs; Configuration:
>>>> PATH -> /root/.pio_store/models, TYPE -> localfs
>>>> [ERROR] [Console$] Source Name: HBASE; Type: (error); Configuration:
>>>> (error)
>>>>
>>>>
>>>> Regards,
>>>> Bala
>>>>
>>>
>>
>
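
Pat's third point in the quoted thread above (monitor disk usage so that a warning arrives before the disk fills up) could be sketched roughly as follows. This is a minimal illustration using only the Python standard library; the function names, the 90% threshold, and the checked path are assumptions for the example, not anything PIO provides.

```python
import shutil


def used_fraction(path: str) -> float:
    """Fraction (0.0 to 1.0) of the filesystem containing `path` that is in use."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def disk_warning(path: str, threshold: float = 0.90):
    """Return a warning string if usage is at or above `threshold`, else None.

    The 0.90 default is an arbitrary example value.
    """
    frac = used_fraction(path)
    if frac >= threshold:
        return f"WARNING: {path} is {frac:.0%} full"
    return None


if __name__ == "__main__":
    # "/" is an assumed path; point this at the mount that holds the HBase data.
    print(disk_warning("/") or "disk usage OK")
```

Run from cron or a similar scheduler, a check like this gives a chance to free space or export data before HBase and its embedded ZooKeeper run out of disk.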
