Hi Asaf,
Thanks for the info. I tried this, but it didn't work for me (the region
servers never shut down). Any idea how long it should take to pick it up? I
let it sit several minutes, and all I saw in the RS logs was:
2013-08-08 13:41:55,303 INFO
Hi J-D,
Thanks for the help.
I tried your suggestion (hbase-daemon.sh stop master), and this leaves
all the region servers running. This seems the same as the problematic case
I was in when I was stopping only the HMaster, and not the region servers,
and then bouncing HDFS. It seems like I want
Yep. That's a confusing one.
When running a master stop, it sets the shutdown flag in ZK. The region
servers listen in on this flag, and once they see it set, they shut
themselves down. Once they are all down, the master goes down as well.
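For the curious, you can watch that flag directly in ZooKeeper. This is just a sketch: it assumes the default `zookeeper.znode.parent` of `/hbase`, and the exact znode name and semantics vary by HBase version (in some versions the master deletes the node rather than setting a value), so treat the paths as illustrative.

```shell
# Inspect the cluster-up/shutdown znode that region servers watch.
# zkCli.sh ships with ZooKeeper; zk-host:2181 is a placeholder.
zkCli.sh -server zk-host:2181 <<'EOF'
ls /hbase
get /hbase/shutdown
EOF
```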
On Saturday, August 3, 2013, Jean-Daniel Cryans wrote:
Ah then
Doing a bin/stop-hbase.sh is the way to go, then on the Hadoop side
you do stop-all.sh. I think your ordering is correct but I'm not sure
you are using the right commands.
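Spelled out, the ordering J-D describes is HBase down first, then Hadoop (a sketch; run each script from the appropriate install directory):

```shell
# Full-cluster shutdown, HBase before HDFS:
bin/stop-hbase.sh   # from the HBase master; reaches the RSes over SSH
stop-all.sh         # from the Hadoop side (in Hadoop 2 this is split
                    # into stop-dfs.sh and stop-yarn.sh)
```

Startup is the reverse: Hadoop first, then `bin/start-hbase.sh`.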
J-D
On Fri, Aug 2, 2013 at 8:27 AM, Patrick Schless
patrick.schl...@gmail.com wrote:
Ah, I bet the issue is that I'm
Doesn't stop-hbase.sh (and its ilk) require the server to be able to manage
the clients (using passwordless SSH keys, for instance)? I don't have that
set up (for security reasons). I use capistrano for all these sorts of
coordination tasks.
On Fri, Aug 2, 2013 at 12:07 PM, Jean-Daniel Cryans
Ah, then doing bin/hbase-daemon.sh stop master on the master node is
the equivalent, but don't stop the region servers themselves, as the
master will take care of it. Doing a stop on both the master and the
region servers will screw things up.
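Under that constraint (no passwordless SSH), the per-node equivalent is a single command on one host:

```shell
# On the master node ONLY -- this triggers the cluster shutdown,
# and the region servers stop themselves once they see the ZK flag.
bin/hbase-daemon.sh stop master

# Do NOT also run "bin/hbase-daemon.sh stop regionserver" on the RS
# nodes while this is in flight; per J-D, stopping both the master
# and the region servers at once will screw things up.
```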
J-D
On Fri, Aug 2, 2013 at 3:28 PM, Patrick Schless
I'm running:
CDH4.1.2
HBase 0.92.1
Hadoop 2.0.0
Is there an issue with restarting a standby cluster with replication
running? I am doing the following on the standby cluster:
- stop hmaster
- stop name_node
- start name_node
- start hmaster
When the name node comes back up, it's reliably
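For reference, the four steps above might look like the following on the standby cluster (a sketch; the hadoop-daemon.sh path and the assumption that each command runs on its own host are based on a typical CDH4 layout):

```shell
bin/hbase-daemon.sh stop master        # on the HBase master node
sbin/hadoop-daemon.sh stop namenode    # on the NameNode host
sbin/hadoop-daemon.sh start namenode
bin/hbase-daemon.sh start master
```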
I can't think of a way how your missing blocks would be related to
HBase replication, there's something else going on. Are all the
datanodes checking back in?
J-D
On Thu, Aug 1, 2013 at 2:17 PM, Patrick Schless
patrick.schl...@gmail.com wrote:
I'm running:
CDH4.1.2
HBase 0.92.1
Hadoop 2.0.0
Yup, 14 datanodes, all check back in. However, all of the corrupt files
seem to be splitlogs from data05. This is true even though I've done
several restarts (each restart adding a few missing blocks). There's
nothing special about data05, and it seems to be in the cluster, the same
as anyone
Can you follow the life of one of those blocks through the NameNode and
DataNode logs? I'd suggest you start by doing an fsck on one of those
files with the option that gives the block locations first.
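Concretely, that fsck invocation might look like this (the file path and block ID are placeholders; substitute one of the corrupt splitlog files reported for data05):

```shell
# Show per-block status and DataNode locations for one corrupt file:
hdfs fsck /hbase/.logs/some-splitlog-file -files -blocks -locations

# Then follow one of the reported block IDs through the logs, e.g.:
grep blk_1234567890 /var/log/hadoop-hdfs/*namenode*.log
```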
By the way why do you have split logs? Are region servers dying every
time you try out something?