ClusterId read in ZooKeeper is null

2013-07-09 Thread Brian Jeltema
I'm new to HBase, and need a little guidance. I've set up a 6-node cluster, with 3 nodes running the ZooKeeper server. The database seems to be working from the hbase shell; I can create tables, insert, scan, etc. But when I try to perform operations in a Java app, I hang at: 13/07/09 12:40:34

Re: ClusterId read in ZooKeeper is null

2013-07-10 Thread Brian Jeltema
value for zookeeper.znode.parent in your cluster configuration, but not set this in your client code? On Tue, Jul 9, 2013 at 10:05 AM, Brian Jeltema brian.jelt...@digitalenvoy.net wrote: I'm new to HBase, and need a little guidance. I've set up a 6-node cluster, with 3 nodes running

Re: ClusterId read in ZooKeeper is null

2013-07-11 Thread Brian Jeltema
This issue has been resolved. It was caused by version skew between the client library and the running service. On Jul 10, 2013, at 11:47 AM, Brian Jeltema wrote: As far as I can tell the HMaster process is running correctly. There are no obvious problems in the logs. As suggested, I

ExportSnapshot using webhdfs

2014-03-21 Thread Brian Jeltema
Is it possible to use webhdfs to export a snapshot to another cluster? If so, what would the command look like? TIA Brian

Re: ExportSnapshot using webhdfs

2014-03-21 Thread Brian Jeltema
? Regards, Shahab On Fri, Mar 21, 2014 at 8:14 AM, Matteo Bertozzi theo.berto...@gmail.com wrote: ExportSnapshot uses the FileSystem API so you'll probably be able to say: -copy-to: webhdfs://host/path Matteo On Fri, Mar 21, 2014 at 12:09 PM, Brian Jeltema brian.jelt

Re: ExportSnapshot using webhdfs

2014-03-21 Thread Brian Jeltema
, 2014 at 4:27 PM, Brian Jeltema brian.jelt...@digitalenvoy.net wrote: Exporting across versions was why I tried webhdfs. I have a cluster running HBase 0.94 and wanted to export a table to a different cluster running HBase 0.96. I got the export to work, but attempting to do

Re: ExportSnapshot using webhdfs

2014-03-21 Thread Brian Jeltema
and the full stack trace Matteo On Fri, Mar 21, 2014 at 5:22 PM, Brian Jeltema brian.jelt...@digitalenvoy.net wrote: My paths were different, and it was .tabledesc rather than .tableinfo, but I got past the problem. Now the restore_snapshot seems to be hung, and I'm seeing many warnings

Re: Lease exception when I execute large scan with filters.

2014-04-12 Thread Brian Jeltema
I don't want to be argumentative here, but by definition it's not an internal feature because it's part of the public API. We use versioning in a way that makes me somewhat uncomfortable, but it's been quite useful. I'd like to see a clear explanation of why it exists and what use cases it was

how to get source table from MultiTableInputFormat

2014-04-23 Thread Brian Jeltema
If I’m using MultiTableInputFormat to process input from several tables in a map/reduce job, is there any way in the mapper to determine which table a given Result is coming from? Brian

Re: Introducing Project Trafodion

2014-06-11 Thread Brian Jeltema
I'm sure you saw this, but just in case, here's another SQL interface to HBase. Brian On Jun 11, 2014, at 5:57 PM, Birdsall, Dave dave.birds...@hp.com wrote: Hi, The cat is already out of the bag on Trafodion on this dlist (and we are very happy about it actually). But, here's an official

determining which region a mapper processed

2014-07-20 Thread Brian Jeltema
When a MapReduce job is run against HBase, a mapper is created for each region (using TableInputFormat). Is there a way to look at the history and determine which region a given mapper ran against? Or can I rely on the mapper number being the same as the region number? TIA Brian

Re: snapshot timeout problem

2014-07-21 Thread Brian Jeltema
server which is slow in completing its part of the snapshot procedure. Have you looked at region server logs ? Feel free to pastebin relevant portion. Thanks On Jul 21, 2014, at 4:03 AM, Brian Jeltema brian.jelt...@digitalenvoy.net wrote: I’m running HBase 0.98. I’m trying to snapshot

Re: snapshot timeout problem

2014-07-22 Thread Brian Jeltema
I ran the balancer from hbase shell, but don’t see any change. Is there a way to balance a specific table? bq. One RegionServer has 69 regions Can you run load balancer so that your regions are better balanced ? Cheers On Mon, Jul 21, 2014 at 6:56 AM, Brian Jeltema brian.jelt

Re: snapshot timeout problem

2014-07-22 Thread Brian Jeltema
, 2014 at 6:56 AM, Brian Jeltema brian.jelt...@digitalenvoy.net wrote: There are 174 regions, not well balanced. One RegionServer has 69 regions. That RegionServer generates a series of log entries (modified and shown below), one for each region, at roughly 1 to 2 second intervals. The timeout

directory usage question

2014-09-06 Thread Brian Jeltema
I'm trying to track down a problem I'm having running map/reduce jobs against snapshots. Can someone explain the difference between files stored in: /apps/hbase/data/archive/data/default and files stored in /apps/hbase/data/data/default (Hadoop 2.4, HBase 0.98) Thanks

Re: directory usage question

2014-09-07 Thread Brian Jeltema
probably just something I'm doing wrong. Brian Cheers On Sat, Sep 6, 2014 at 6:09 AM, Brian Jeltema brian.jelt...@digitalenvoy.net wrote: I'm trying to track down a problem I'm having running map/reduce jobs against snapshots. Can someone explain the difference between files stored

Re: directory usage question

2014-09-07 Thread Brian Jeltema
: The files under archive directory are referenced by snapshots. Please don't delete them manually. You can delete unused snapshots. Cheers On Sep 7, 2014, at 4:08 AM, Brian Jeltema brian.jelt...@digitalenvoy.net wrote: On Sep 6, 2014, at 9:32 AM, Ted Yu yuzhih...@gmail.com wrote

Re: directory usage question

2014-09-07 Thread Brian Jeltema
Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) Cheers On Sun, Sep 7, 2014 at 5:48 AM, Brian Jeltema

need help understand log output

2014-09-07 Thread Brian Jeltema
I have a map/reduce job that is consistently failing with timeouts. The failing mapper log files contain a series of records similar to those below. When I look at the hbase and hdfs logs (on foo.net in this case) I don’t see anything obvious at these timestamps. The mapper task times out

Re: need help understand log output

2014-09-08 Thread Brian Jeltema
-hadoop2 The MR job is reading from an HBase snapshot, if that’s relevant. Cheers On Sun, Sep 7, 2014 at 8:50 AM, Brian Jeltema brian.jelt...@digitalenvoy.net wrote: I have a map/reduce job that is consistently failing with timeouts. The failing mapper log files contain a series

Re: directory usage question

2014-09-08 Thread Brian Jeltema
of initTableSnapshotMapperJob that I didn’t expect, and I’m just trying to understand what it’s doing. BTW in tip of 0.98, with HBASE-11742, related code looks a bit different. Cheers On Sun, Sep 7, 2014 at 8:27 AM, Brian Jeltema brian.jelt...@digitalenvoy.net wrote: Eclipse doesn't show

Re: need help understand log output

2014-09-08 Thread Brian Jeltema
@,1400624237999.5bb6bd41597ddd8dd7ca03e78f3a3e65. after a delay of 12420 a log entry being generated every 10 seconds starting about 4 days ago. I presume these problems are related. On Sep 8, 2014, at 7:10 AM, Brian Jeltema brian.jelt...@digitalenvoy.net wrote: When number of attempts is greater than the value

Re: need help understand log output

2014-09-08 Thread Brian Jeltema
I’ve resolved these problems by restarting the region server that owned the region in question. I don’t know what the underlying issue was, but at this point it’s not worth pursuing. Thanks for responding. Brian On Sep 8, 2014, at 11:06 AM, Brian Jeltema brian.jelt...@digitalenvoy.net wrote

Re: need help understand log output

2014-09-10 Thread Brian Jeltema
out of curiosity, did you see below messages in RS log? LOG.warn("Snapshot called again without clearing previous. " + "Doing nothing. Another ongoing flush or did we fail last attempt?"); Nope thanks. On Tue, Sep 9, 2014 at 2:15 AM, Brian Jeltema brian.jelt

Re: need help understand log output

2014-09-11 Thread Brian Jeltema
, but revisiting the code, the flush every 10s in the RS log actually comes from HRegion#shouldFlush, so there is sth triggered.. could you pastebin the RS log? On Wed, Sep 10, 2014 at 6:59 PM, Brian Jeltema brian.jelt...@digitalenvoy.net wrote: out of curiosity, did you see below messages

Re: need help understand log output

2014-09-12 Thread Brian Jeltema
first began to appear, but nothing jumped out at me; a write-heavy MR job was running at the time, so there might be something buried in the noise (a lot of noise). On 2014-9-11, at 19:08, Brian Jeltema brian.jelt...@digitalenvoy.net wrote: the RS log is huge. What do you want to see other than what

Re: Copying data from 94 to 98 ..

2014-09-16 Thread Brian Jeltema
I’ve been successfully moving snapshots from 94 to 98 using webhdfs. On the 94 cluster: hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot snappy -copy-to webhdfs://host-on-98-cluster/apps/hbase/data -mappers 12 and then manually fixing the file system layout. On Sep 16,
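The export command quoted in this message can be wrapped in a small shell helper. What follows is only a sketch under stated assumptions: the function name `build_export_cmd` is hypothetical, the helper prints the command instead of running it, and the snapshot name, target URI, and mapper count are the ones quoted in the message. The manual file system layout fix mentioned above is not covered.

```shell
#!/bin/sh
# Hypothetical helper: assembles the ExportSnapshot invocation quoted above.
# Dry run only: the command is printed, nothing is executed against a cluster.
build_export_cmd() {
    snapshot="$1"   # snapshot name on the source (0.94) cluster
    target="$2"     # destination URI, e.g. a webhdfs:// path on the 0.98 cluster
    mappers="$3"    # number of map tasks used for the copy
    printf 'hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot %s -copy-to %s -mappers %s\n' \
        "$snapshot" "$target" "$mappers"
}

build_export_cmd snappy webhdfs://host-on-98-cluster/apps/hbase/data 12
```

Printing the command first makes it easy to review the target path before launching a long-running copy job.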

problem restoring snapshot

2014-09-25 Thread Brian Jeltema
I exported a snapshot to another cluster, same version of all software. A restore_snapshot on the target system hung and eventually timed out, I think due to file ownership issues. I restored hbase ownership to everything in /apps/hbase and tried the restore_snapshot again. It’s still hanging,

Re: problem restoring snapshot

2014-09-25 Thread Brian Jeltema
: Preconditions.checkArgument(!metaChanges.hasRegionsToRestore(), "A clone should not have regions to restore"); Was there region split prior to snapshot restore action ? Cheers On Thu, Sep 25, 2014 at 9:19 AM, Brian Jeltema brian.jelt...@digitalenvoy.net wrote: I exported a snapshot

Re: problem restoring snapshot

2014-09-25 Thread Brian Jeltema
, Brian Jeltema brian.jelt...@digitalenvoy.net wrote: The table did not exist on the target cluster when I tried the first restore_clone. Is there some way I can delete all traces of the table and start over? On Sep 25, 2014, at 12:25 PM, Ted Yu yuzhih...@gmail.com wrote: It is from

Re: problem restoring snapshot

2014-09-25 Thread Brian Jeltema
Does hbck report any inconsistency ? Not for the table in question. There are inconsistencies in an unrelated table. I do see related content in: /apps/hbase/data/.tmp/data/default/Foo is it safe to delete that stuff? Cheers On Thu, Sep 25, 2014 at 9:52 AM, Brian Jeltema

Re: problem restoring snapshot

2014-09-25 Thread Brian Jeltema
Deleting the contents of /apps/hbase/data/.tmp fixed the problem On Sep 25, 2014, at 1:48 PM, Ted Yu yuzhih...@gmail.com wrote: bq. is it safe to delete that stuff? Yes. You have the exported snapshot as source of truth. On Thu, Sep 25, 2014 at 10:43 AM, Brian Jeltema brian.jelt

ExportSnapshot webhdfs problems

2014-09-30 Thread Brian Jeltema
I’m trying to use ExportSnapshot to copy a snapshot from a Hadoop 1 to a Hadoop 2 cluster using the webhdfs protocol. I’ve done this successfully before, though there are always mapper failures and retries in the job log. However, I’m not having success with a rather large table due to an

snapshot timeouts

2014-10-08 Thread Brian Jeltema
I’m trying to snapshot a moderately large table (3 billion rows, but not a huge amount of data per row). Those snapshots have been timing out, so I set the following parameters to relatively large values: hbase.snapshot.master.timeoutMillis hbase.snapshot.region.timeout

Re: snapshot timeouts

2014-10-08 Thread Brian Jeltema
more information : the release of hbase you're using value for hbase.rpc.timeout (looks like you leave it @ default) more of the error (please include stack trace if possible) Cheers On Wed, Oct 8, 2014 at 12:09 PM, Brian Jeltema brian.jelt...@foo.net wrote: I’m trying to snapshot

Re: snapshot timeouts

2014-10-08 Thread Brian Jeltema
is screwed up. So I’ll clean up the mess and try again tomorrow. Regrets for the possible false alarm Brian On Oct 8, 2014, at 3:25 PM, Brian Jeltema brian.jelt...@digitalenvoy.net wrote: Sorry, I usually include that info. HBase version is 0.98. hbase.rpc.timeout is the default. When

what can cause RegionTooBusyException?

2014-11-10 Thread Brian Jeltema
I’m running a map/reduce job against a table that is performing a large number of writes (probably updating every row). The job is failing with the exception below. This is a solid failure; it dies at the same point in the application, and at the same row in the table. So I doubt it’s a conflict

Re: what can cause RegionTooBusyException?

2014-11-10 Thread Brian Jeltema
using ? Cheers On Mon, Nov 10, 2014 at 11:10 AM, Brian Jeltema brian.jelt...@digitalenvoy.net wrote: I’m running a map/reduce job against a table that is performing a large number of writes (probably updating every row). The job is failing with the exception below. This is a solid

Re: what can cause RegionTooBusyException?

2014-11-10 Thread Brian Jeltema
with monitoring tool) what memstore pressure was ? Thanks On Nov 10, 2014, at 11:34 AM, Brian Jeltema brian.jelt...@digitalenvoy.net wrote: How many tasks may write to this row concurrently ? only 1 mapper should be writing to this row. Is there a way to check which locks are being held

Re: what can cause RegionTooBusyException?

2014-11-11 Thread Brian Jeltema
Request Count. You can monitor the value for the underlying region to see if it receives above-normal writes. Cheers On Mon, Nov 10, 2014 at 4:06 PM, Brian Jeltema bdjelt...@gmail.com wrote: Was the region containing this row hot around the time of failure ? How do I measure

Re: 0.94 going forward

2014-12-16 Thread Brian Jeltema
I have been able to export snapshots from 0.94 to 0.98. I’ve pasted the instructions that I developed and published on our internal wiki. I also had to significantly increase retry count parameters due to a high number of timeout failures during the export. Cross-cluster transfers To export

Re: http://stackoverflow.com/questions/28350940/cannot-start-standalone-instance-of-hbase

2015-02-06 Thread Brian Jeltema
You’re running on Windows. Did you follow this: http://hbase.apache.org/cygwin.html On Feb 6, 2015, at 2:39 AM, sibtain sibtain_ab...@ymail.com wrote: Please follow this link http://stackoverflow.com/questions/28350940/cannot-start-standalone-instance-of-hbase for my question. I'm

unexpected replication on export

2015-03-09 Thread Brian Jeltema
I used ExportSnapshot to copy a snapshot from a cluster with a default replication factor of 3 to a smaller development cluster with a default replication factor of 1. The resulting table appears to have been created with a replication of 3, ignoring the default setting. Is this expected? Is

periodicFlusher get stuck

2015-02-24 Thread Brian Jeltema
I’m seeing occasional HBase log output similar to the output shown below. It appears there is a request to flush a region, repeated every 10 seconds, that apparently is never being performed. It’s causing MR jobs to time out because they cannot write to this region. Is this a known problem?

Re: periodicFlusher get stuck

2015-02-24 Thread Brian Jeltema
/configuration On Feb 24, 2015, at 11:28 AM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: Interesting... Can you share you hbase-site.xml? Have you setup hbase.regionserver.optionalcacheflushinterval? Can you hadoop fs -ls -R this region folder? 2015-02-24 11:15 GMT-05:00 Brian Jeltema

Re: periodicFlusher get stuck

2015-02-24 Thread Brian Jeltema
, Brian Jeltema brian.jelt...@digitalenvoy.net wrote: I’m seeing occasional HBase log output similar to the output shown below. It appears there is a request to flush a region, repeated every 10 seconds, that apparently is never being performed. It’s causing MR jobs to time out because

Re: regions in transition

2015-12-22 Thread Brian Jeltema
safe to stop HBase and delete the ZK node? > > Thanks > > On Mon, Dec 21, 2015 at 3:54 PM, Brian Jeltema <bdjelt...@gmail.com> wrote: > >> I am doing a cluster upgrade to the HDP 2.2 stack. For some reason, after >> the upgrade HBase >> cannot find any regions for

Re: regions in transition

2015-12-23 Thread Brian Jeltema
I appear to have resolved the OOM error by greatly increasing the max process limit (to 64K). Using HDP 2.1 a limit of 1024 seemed to be working OK. I’m surprised I had to make a change of this magnitude. Brian > On Dec 23, 2015, at 7:20 AM, Brian Jeltema <bdjelt...@gmail.com>

constantly adding snapshot info

2015-12-27 Thread Brian Jeltema
I recently upgraded to Hadoop 2.7.1 and HBase 1.1.2. Since the upgrade, the HBase master logs have been filling and rolling over about every 5 minutes, filled with variations of the following (modified to obscure internal details): 2015-12-27 12:24:52,481 DEBUG

Re: constantly adding snapshot info

2015-12-27 Thread Brian Jeltema
in the previous post. Brian > > Regards > Samir > > On Sun, Dec 27, 2015 at 6:34 PM, Brian Jeltema <bdjelt...@gmail.com> wrote: > >> I recently upgraded to Hadoop 2.7.1 and HBase 1.1.2. >> >> Since the upgrade, the HBase master logs have be

regions in transition

2015-12-21 Thread Brian Jeltema
I am doing a cluster upgrade to the HDP 2.2 stack. For some reason, after the upgrade HBase cannot find any regions for existing tables. I believe the HDFS file system is OK. But looking at the ZooKeeper nodes, I noticed that many (maybe all) of the regions were listed in the ZooKeeper

Re: regions in transition

2015-12-22 Thread Brian Jeltema
(onlineRegions != null && onlineRegions.size() > 0) %> >> ... >> <%else> >>Not serving regions >> >> >> The message means that there was no region online on the underlying server. >> >> FYI >> >> On Tue, Dec

Re: regions in transition

2015-12-22 Thread Brian Jeltema
doesn’t have any regions to serve. > On Dec 22, 2015, at 6:19 AM, Brian Jeltema <bdjelt...@gmail.com> wrote: > >> >> Can you pick a few regions stuck in transition and check related region >> server logs to see why they couldn't be assigned ? > > I don’t see

Re: regions in transition

2015-12-22 Thread Brian Jeltema
ns > > > The message means that there was no region online on the underlying server. > > FYI > > On Tue, Dec 22, 2015 at 7:18 AM, Brian Jeltema <bdjelt...@gmail.com> wrote: > >> Following up, if I look at the HBase Master UI in the Ambari console I see >

Re: regions in transition

2015-12-23 Thread Brian Jeltema
be causing an OOM error? Thanks Brian > On Dec 22, 2015, at 12:46 PM, Brian Jeltema <bdjelt...@gmail.com> wrote: > >> >> You should really find out where your HMaster UI lives (there is a master UI >> for every node provided by the apache project) because it gives y

Re: `hbase classpath` command causes “File name too long” error

2016-06-17 Thread Brian Jeltema
It’s the first ‘$’ that’s killing you. It should be: export HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase classpath` > On Jun 17, 2016, at 6:41 AM, Mahesha999 wrote: > > I am trying out HBase bulk loading. The command looks like this: > >
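The fix in this reply relies on shell command substitution: the backticks are replaced by the output of the command they enclose. A minimal sketch of that mechanism follows. It assumes a stand-in function, `fake_hbase_classpath` (hypothetical, with placeholder paths), in place of `${HBASE_HOME}/bin/hbase classpath`, so that it runs without an HBase installation.

```shell
#!/bin/sh
# Stand-in for `${HBASE_HOME}/bin/hbase classpath` (hypothetical function and
# placeholder paths), so the sketch runs without an HBase installation.
fake_hbase_classpath() {
    echo "/opt/hbase/conf:/opt/hbase/lib/a.jar:/opt/hbase/lib/b.jar"
}

# Command substitution: the backticks are replaced by the command's output.
HADOOP_CLASSPATH=`fake_hbase_classpath`
export HADOOP_CLASSPATH

# $(...) is an equivalent spelling of the same substitution.
ALT_CLASSPATH=$(fake_hbase_classpath)

echo "$HADOOP_CLASSPATH"
```

On a real cluster the stand-in would be replaced by the `hbase classpath` call quoted in the reply; both spellings of the substitution yield the same value.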

Re: Column families

2017-06-22 Thread Brian Jeltema
One use-case that applies to my tables is that I have a table with a set of columns that have data that is always processed with MR jobs, but other rather large columns that are generally only accessed through a UI. By separating those into two column families, MR jobs that do a full table scan