Splitlog Replication

2014-01-07 Thread Patrick Schless
I need to run through some server maintenance on my data nodes, including a reboot. My splitlogs, though, only seem to have a replication factor of 1 (when a data node is taken offline, I sometimes have missing blocks for them). I know I can decommission data nodes with the exclude.dfs file, but

Re: 3-Hour Periodic Network/CPU/Disk/Latency Spikes

2013-12-16 Thread Patrick Schless
yourself. Best regards, Vladimir Rodionov Principal Platform Engineer Carrier IQ, www.carrieriq.com e-mail: vrodio...@carrieriq.com From: Patrick Schless [patrick.schl...@gmail.com] Sent: Friday, December 13, 2013 4:36 PM To: user Subject: Re: 3

Re: 3-Hour Periodic Network/CPU/Disk/Latency Spikes

2013-12-13 Thread Patrick Schless
hours enough store files build up to require compactions. There's nothing else automated in HDFS or HBase that I could see causing this. On Fri, Dec 13, 2013 at 3:07 PM, Patrick Schless patrick.schl...@gmail.com wrote: CDH4.1.2 HBase 0.92.1 HDFS 2.0.0 Every 3 hours, our production

Re: 3-Hour Periodic Network/CPU/Disk/Latency Spikes

2013-12-13 Thread Patrick Schless
[yuzhih...@gmail.com] Sent: Friday, December 13, 2013 3:33 PM To: user@hbase.apache.org Cc: user Subject: Re: 3-Hour Periodic Network/CPU/Disk/Latency Spikes Patrick: Attachment didn't go through. Cheers On Dec 13, 2013, at 3:18 PM, Patrick Schless patrick.schl...@gmail.com wrote: Very

ageOfLastAppliedOp, ageOfLastShippedOp

2013-08-23 Thread Patrick Schless
We run two hbase clusters, and one (master) replicates to the other (standby). We did some maintenance last night which involved bringing all of hbase down while we made changes to HDFS. After bringing things back up, our ageOfLastShippedOp on a few of the master region servers jumped to around -9

Re: How to recover data from hadoop/hbase cluster

2013-08-09 Thread Patrick Schless
Are you missing data? Replication/recovery of blocks is automatic, and there isn't a manual process to it. FWIW, for something like changing the hard drive configs on the box, it would be a better idea to unbalance the node ahead of time and then rebalance it. On Fri, Aug 9, 2013 at 7:18 AM, oc

Fix Borked Replication

2013-08-08 Thread Patrick Schless
I run HBase replication, and while improperly restarting my standby cluster I lost a few splitlog blocks in my replicated table (on the standby cluster). I'm thinking that my standby table is possibly borked now (I can't use the VerifyRep job because I use the increment API). Is it reasonable to

Re: HDFS Restart with Replication

2013-08-08 Thread Patrick Schless
. J-D On Fri, Aug 2, 2013 at 3:28 PM, Patrick Schless patrick.schl...@gmail.com wrote: Doesn't stop-hbase.sh (and its ilk) require the server to be able to manage the clients (using unpassworded SSH keys, for instance)? I don't have that set up (for security reasons). I use

Re: HDFS Restart with Replication

2013-08-06 Thread Patrick Schless
will screw things up. J-D On Fri, Aug 2, 2013 at 3:28 PM, Patrick Schless patrick.schl...@gmail.com wrote: Doesn't stop-hbase.sh (and its ilk) require the server to be able to manage the clients (using unpassworded SSH keys, for instance)? I don't have that set up (for security reasons

Cleaning up after failed table rename

2013-08-02 Thread Patrick Schless
I was testing an hbase table rename script I found in a JIRA, and it didn't work for me. Not a huge deal (I went with a different solution), but it left some data I want to clean up. I was trying to rename a table from t1 to t1.renamed. Now in HBase, 'list' shows 't1.renamed'. In HDFS, I have

Re: HDFS Restart with Replication

2013-08-02 Thread Patrick Schless
jdcry...@apache.org wrote: Doing a bin/stop-hbase.sh is the way to go, then on the Hadoop side you do stop-all.sh. I think your ordering is correct but I'm not sure you are using the right commands. J-D On Fri, Aug 2, 2013 at 8:27 AM, Patrick Schless patrick.schl...@gmail.com wrote: Ah, I bet

Re: Cleaning up after failed table rename

2013-08-02 Thread Patrick Schless
) at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:3205) On Fri, Aug 2, 2013 at 11:31 AM, Ted Yu yuzhih...@gmail.com wrote: Can you try running: hbck -repair On Fri, Aug 2, 2013 at 9:28 AM, Patrick Schless patrick.schl...@gmail.com wrote: I was testing an hbase table rename script I found in a JIRA

HDFS Restart with Replication

2013-08-01 Thread Patrick Schless
I'm running: CDH4.1.2 HBase 0.92.1 Hadoop 2.0.0 Is there an issue with restarting a standby cluster with replication running? I am doing the following on the standby cluster: - stop hmaster - stop name_node - start name_node - start hmaster When the name node comes back up, it's reliably

Reload configs

2013-08-01 Thread Patrick Schless
Is there a way to reload the HBase configs without restarting the whole system (in other words, without an interruption of service)? I'm on: CDH4.1.2 HBase 0.92.1 Hadoop 2.0.0 Thanks, Patrick

Re: HDFS Restart with Replication

2013-08-01 Thread Patrick Schless
. On Thu, Aug 1, 2013 at 5:04 PM, Jean-Daniel Cryans jdcry...@apache.org wrote: I can't think of a way how your missing blocks would be related to HBase replication, there's something else going on. Are all the datanodes checking back in? J-D On Thu, Aug 1, 2013 at 2:17 PM, Patrick Schless

Re: VerifyRep - Replication needs to be enabled to verify it.

2013-07-11 Thread Patrick Schless
that configuration value is false. Below is the relevant code: if (!conf.getBoolean(HConstants.REPLICATION_ENABLE_KEY, false)) { throw new IOException("Replication needs to be enabled to verify it."); } Jieshan -Original Message- From: Patrick Schless [mailto:patrick.schl
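The check quoted in this snippet reads HConstants.REPLICATION_ENABLE_KEY with a default of false, so the job fails unless the configuration it loads sets that key. As a minimal sketch, assuming the constant maps to the hbase.replication property (as it does in HBase releases of this era) and that the fragment sits in an hbase-site.xml on the classpath of the machine submitting the job (an assumption about the setup, not something the thread confirms):

```xml
<!-- hbase-site.xml fragment (sketch): satisfies the
     conf.getBoolean(HConstants.REPLICATION_ENABLE_KEY, false) check,
     assuming the key is hbase.replication -->
<property>
  <name>hbase.replication</name>
  <value>true</value>
</property>
```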

Re: VerifyRep - Replication needs to be enabled to verify it.

2013-07-11 Thread Patrick Schless
11, 2013 at 8:46 AM, Patrick Schless patrick.schl...@gmail.com wrote: Yes [1], I set that in hbase-site.xml when I turned on replication. This box is solely my job-tracker, so maybe it doesn't pick up the hbase-site.xml? Trying this job from the HMaster didn't work, because it doesn't have

hbase.client.scanner.caching - default 1, not 100

2013-07-11 Thread Patrick Schless
In 0.94 I noticed (in the Job File) my VerifyRep job was running with hbase.client.scanner.caching set to 1, even though the hbase docs [1] say it defaults to 100. I didn't have that property being set in any of my configs. I added the properties to hbase-site.xml (set to 100), and now that
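The override described in this post can be sketched as the hbase-site.xml fragment below. The property name and the value 100 are the ones the poster reports; where the file lives, and whether the job's client configuration actually reads it, are assumptions:

```xml
<!-- hbase-site.xml fragment (sketch): explicit scanner caching override,
     using the value the poster reports setting -->
<property>
  <name>hbase.client.scanner.caching</name>
  <value>100</value>
</property>
```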

Re: hbase.client.scanner.caching - default 1, not 100

2013-07-11 Thread Patrick Schless
In section 2.3.1, you would see that its value is 1. Cheers On Thu, Jul 11, 2013 at 9:28 AM, Patrick Schless patrick.schl...@gmail.com wrote: In 0.94 I noticed (in the Job File) my VerifyRep job was running with hbase.client.scanner.caching set to 1, even though the hbase docs [1] say

Replication - some timestamps off by 1 ms

2013-07-11 Thread Patrick Schless
I have had replication running for about a week now, and have had a lot of data flowing to our slave cluster over that time. Now, I'm running the verifyrep MR job over a 1-hour period a couple days ago (which should be fully replicated), and I'm seeing a small number of BADROWS. Spot-checking a

Re: Replication - some timestamps off by 1 ms

2013-07-11 Thread Patrick Schless
:53 PM, Jean-Daniel Cryans jdcry...@apache.org wrote: Are those incremented cells? J-D On Thu, Jul 11, 2013 at 10:23 AM, Patrick Schless patrick.schl...@gmail.com wrote: I have had replication running for about a week now, and have had a lot of data flowing to our slave cluster over

VerifyRep - Replication needs to be enabled to verify it.

2013-07-10 Thread Patrick Schless
On 0.92.1, I have (recently) enabled replication, and I'm trying to verify that it's working correctly. I am getting an error saying that replication needs to be enabled, but replication *is* enabled, so I assume I'm doing something wrong. Looking at the age of the last shipped op (on the master

Re: HBase replication - EOF while reading

2013-07-03 Thread Patrick Schless
/jira/browse/HBASE-7122 Thanks, Himanshu On Tue, Jul 2, 2013 at 3:09 PM, Patrick Schless patrick.schl...@gmail.com wrote: I've just enabled replication (to 1 peer), and I'm seeing a bunch of errors, along the lines of [1]. Replication does seem to work, though (data is showing up

HBase replication - EOF while reading

2013-07-02 Thread Patrick Schless
I've just enabled replication (to 1 peer), and I'm seeing a bunch of errors, along the lines of [1]. Replication does seem to work, though (data is showing up in the standby cluster). The file exists (I can see it in the HDFS web GUI), but it seems to be empty. Is this an error I need to worry

stop_replication dangerous?

2013-07-01 Thread Patrick Schless
The first two tutorials for enabling replication that google gives me [1], [2] take very different tones with regard to stop_replication. The HBase docs [1] make it sound fine to start and stop replication as desired. The Cloudera docs [2] say it may cause data loss. Which is true? If data loss

Re: stop_replication dangerous?

2013-07-01 Thread Patrick Schless
sure thing: https://issues.apache.org/jira/browse/HBASE-8844 On Mon, Jul 1, 2013 at 3:59 PM, Jean-Daniel Cryans jdcry...@apache.org wrote: Yeah that package documentation ought to be changed. Mind opening a jira? Thx, J-D On Mon, Jul 1, 2013 at 1:51 PM, Patrick Schless patrick.schl

Re: CopyTable

2013-06-20 Thread Patrick Schless
you've to disable your table first. Matteo On Wed, Jun 19, 2013 at 6:19 PM, Patrick Schless patrick.schl...@gmail.com wrote: Unfortunately, I'm on 0.92.1, and the snapshot approach you linked isn't available until 0.94. Bummer, looked cool. Anybody have any insight into the question

Re: Replication - ports/hosts

2013-06-19 Thread Patrick Schless
On Wed, Jun 19, 2013 at 12:41 AM, Stack st...@duboce.net wrote: On Mon, Jun 17, 2013 at 12:06 PM, Patrick Schless patrick.schl...@gmail.com wrote: Working on setting up HBase replication across a VPN tunnel, and following the docs here: [1] (and here: [2]). Two questions, regarding

Re: CopyTable

2013-06-19 Thread Patrick Schless
Have you looked at http://hbase.apache.org/book.html#table.rename ? Cheers On Mon, Jun 17, 2013 at 12:20 PM, Patrick Schless patrick.schl...@gmail.com wrote: Context: I'm working on getting replication set up, and a prerequisite for me is to rename the table (since you have

Replication - ports/hosts

2013-06-17 Thread Patrick Schless
Working on setting up HBase replication across a VPN tunnel, and following the docs here: [1] (and here: [2]). Two questions, regarding firewall allowances required: 1) The docs say that the zookeeper clusters must be able to reach each other. I don't see any docs on why this is (the high-level

CopyTable

2013-06-17 Thread Patrick Schless
Context: I'm working on getting replication set up, and a prerequisite for me is to rename the table (since you have to replicate to the same name as the source). For this, I'm testing a CopyTable strategy, since there doesn't seem to be a good way to rename a table (please correct me if I'm

Re: CopyTable

2013-06-17 Thread Patrick Schless
/book.html#table.rename ? Cheers On Mon, Jun 17, 2013 at 12:20 PM, Patrick Schless patrick.schl...@gmail.com wrote: Context: I'm working on getting replication set up, and a prerequisite for me is to rename the table (since you have to replicate to the same name as the source

Web Admin Pages SSL

2012-07-30 Thread Patrick Schless
I like having access to the web admin pages that HBase, HDFS, etc provide. I can't find a way to put them behind SSL, though. For the HMaster it's easy enough (nginx+SSL as a reverse proxy), but the HMaster generates links like data01.company.com:60030. Is there a way to change the scheme and port

CellCounter -- Exceeded limits on number of counters

2012-07-11 Thread Patrick Schless
I am trying to find out the number of data points (cells) in a table with hbase org.apache.hadoop.hbase.mapreduce.CellCounter table output. On a very small table (3 cells), it works fine. On a table with a couple thousand cells, I get this error (4 times):

Re: Migrating Clusters - Broken Metadata

2012-07-06 Thread Patrick Schless
to the old nodes), I was able to remove the /etc/hosts entries and bounce hbase without any problem. I still have no idea where the new hbase is getting the references to the old nodes. Filed a bug report: https://issues.apache.org/jira/browse/HBASE-6343 On Thu, Jul 5, 2012 at 6:44 PM, Patrick