Re: [ANNOUNCE] Chunhui Shen joins the Apache HBase PMC

2017-07-06 Thread Enis Söztutar
Congrats!

Enis

On Thu, Jul 6, 2017 at 10:57 AM, Devaraj Das  wrote:

> Congratulations, Chunhui!
> 
> From: Yu Li 
> Sent: Monday, July 03, 2017 10:24 PM
> To: d...@hbase.apache.org; Hbase-User
> Subject: [ANNOUNCE] Chunhui Shen joins the Apache HBase PMC
>
> On behalf of the Apache HBase PMC I am pleased to announce that Chunhui
> Shen
> has accepted our invitation to become a PMC member on the Apache
> HBase project. He has been an active contributor to HBase for many
> years. Looking forward to many more contributions from him.
>
> Please join me in welcoming Chunhui to the HBase PMC!
>
> Best Regards,
> Yu
>
>
>


Re: is manually triggered major-compaction forcing a memstore flush

2017-05-30 Thread Enis Söztutar
Major compaction will not trigger flush. You have to issue that manually.
BTW, you can look at the settings related to "cache on flush" so that when
the flusher writes the files, the blocks can go directly to the block
cache.
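
For example, a minimal sketch with the 1.x Java Admin API (the table name
"mytable" is just an illustration):

  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.TableName;
  import org.apache.hadoop.hbase.client.*;

  try (Connection conn =
           ConnectionFactory.createConnection(HBaseConfiguration.create());
       Admin admin = conn.getAdmin()) {
    TableName table = TableName.valueOf("mytable");
    admin.flush(table);         // write memstore contents out to HFiles
    admin.majorCompact(table);  // then request a major compaction (async)
  }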

Enis

On Mon, May 29, 2017 at 7:45 PM, Eric Owhadi  wrote:

> Hi Hbaseers,
> I am working on a project where we are preparing data on a daily basis, and
> then have it in a form fast to read.
> So I am considering moving the data away from mem-store down to block
> cache.
> So I need to flush and major compact. But I was wondering if I need to
> issue both commands, or if manually triggered major compaction includes a
> silent flush before doing the actual major compaction?
> Thanks in advance for the help,
> Eric Owhadi
>


Re: What is Dead Region Servers and how to clear them up?

2017-05-26 Thread Enis Söztutar
In general if there are no regions in transition, the WAL recovery has
already finished. You can watch the master's log4j log for those entries,
but the lack of regions in transition is the easiest way to tell.
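
If you want to check programmatically, a small sketch against the 1.x Admin
API (the shell's status 'detailed' shows the same information):

  import java.util.Map;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.*;
  import org.apache.hadoop.hbase.master.RegionState;

  try (Connection conn =
           ConnectionFactory.createConnection(HBaseConfiguration.create());
       Admin admin = conn.getAdmin()) {
    // an empty map means no regions in transition, i.e. recovery is done
    Map<String, RegionState> rit =
        admin.getClusterStatus().getRegionsInTransition();
    System.out.println("Regions in transition: " + rit.size());
  }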

Enis

On Fri, May 26, 2017 at 12:14 PM, jeff saremi <jeffsar...@hotmail.com>
wrote:

> thanks Enis
>
> I apologize for earlier
>
> This looks very close to our issue
> When you say: "there is no "WAL" recovery is happening", how could i make
> sure of that? Thanks
>
> Jeff
>
>
> 
> From: Enis Söztutar <enis@gmail.com>
> Sent: Friday, May 26, 2017 11:47:11 AM
> To: d...@hbase.apache.org
> Cc: hbase-user
> Subject: Re: What is Dead Region Servers and how to clear them up?
>
> Jeff, please be respectful to the people who are trying to help you. This is
> not acceptable behavior and will result in consequences next time.
>
> On the specific issue that you are seeing, it is highly likely that you are
> seeing this: https://issues.apache.org/jira/browse/HBASE-14223. Having
> those servers in the dead servers list will not hurt operations, or
> runtimes or anything else. Possibly for those servers, there is no new
> instance of the regionserver running on the same host and port.
>
> If you want to manually clean out these, you can follow these steps:
>  - Manually move these directories from the file system:
> <hbase rootdir>/WALs/<dead server name>-splitting
>  - ONLY do this if you are sure that no WAL recovery is
> happening, and there are only WAL files with names containing ".meta."
>  - Restart HBase master.
>
> Upon restart, you can see that these do not show up anymore. For more
> technical details, please refer to the jira link.
>
> Enis
>
> On Fri, May 26, 2017 at 11:03 AM, jeff saremi <jeffsar...@hotmail.com>
> wrote:
>
> > Thank you for the GFY answer
> >
> > And i guess to figure out how to fix these I can always go through the
> > HBase source code.
> >
> >
> > 
> > From: Dima Spivak <dimaspi...@apache.org>
> > Sent: Friday, May 26, 2017 9:58:00 AM
> > To: hbase-user
> > Subject: Re: What is Dead Region Servers and how to clear them up?
> >
> > Sending this back to the user mailing list.
> >
> > RegionServers can die for many reasons. Looking at your RegionServer log
> > files should give hints as to why it's happening.
> >
> >
> > -Dima
> >
> > On Fri, May 26, 2017 at 9:48 AM, jeff saremi <jeffsar...@hotmail.com>
> > wrote:
> >
> > > I had posted this to the user mailing list and I have not got any
> direct
> > > answer to my question.
> > >
> > > Where do dead RS's come from and how can they be cleaned up? Someone in
> > > the midst of developers should know this.
> > >
> > > thanks
> > >
> > > Jeff
> > >
> > > 
> > > From: jeff saremi <jeffsar...@hotmail.com>
> > > Sent: Thursday, May 25, 2017 10:23:17 AM
> > > To: user@hbase.apache.org
> > > Subject: Re: What is Dead Region Servers and how to clear them up?
> > >
> > > I'm still looking to get hints on how to remove the dead regions.
> thanks
> > >
> > > 
> > > From: jeff saremi <jeffsar...@hotmail.com>
> > > Sent: Wednesday, May 24, 2017 12:27:06 PM
> > > To: user@hbase.apache.org
> > > Subject: Re: What is Dead Region Servers and how to clear them up?
> > >
> > > i'm trying to eliminate the dead region servers.
> > >
> > > 
> > > From: Ted Yu <yuzhih...@gmail.com>
> > > Sent: Wednesday, May 24, 2017 12:17:40 PM
> > > To: user@hbase.apache.org
> > > Subject: Re: What is Dead Region Servers and how to clear them up?
> > >
> > > bq. running hbck (many times
> > >
> > > Can you describe the specific inconsistencies you were trying to
> resolve
> > ?
> > > Depending on the inconsistencies, advice can be given on the best known
> > > hbck command arguments to use.
> > >
> > > Feel free to pastebin master log if needed.
> > >
> > > On Wed, May 24, 2017 at 12:10 PM, jeff saremi <jeffsar...@hotmail.com>
> > > wrote:
> > >
> > > > these are the things I have done so far:
> > > >
> > > >
> > > > - restarting master (few times)
> > > >
> > > > - running hbck (many times; this tool does not seem to be doing anything at all)

Re: What is Dead Region Servers and how to clear them up?

2017-05-26 Thread Enis Söztutar
Jeff, please be respectful to the people who are trying to help you. This is
not acceptable behavior and will result in consequences next time.

On the specific issue that you are seeing, it is highly likely that you are
seeing this: https://issues.apache.org/jira/browse/HBASE-14223. Having
those servers in the dead servers list will not hurt operations, or
runtimes or anything else. Possibly for those servers, there is no new
instance of the regionserver running on the same host and port.

If you want to manually clean out these, you can follow these steps:
 - Manually move these directories from the file system:
<hbase rootdir>/WALs/<dead server name>-splitting
 - ONLY do this if you are sure that no WAL recovery is
happening, and there are only WAL files with names containing ".meta."
 - Restart HBase master.
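
A sketch of the move step using the Hadoop FileSystem API. The root dir
/hbase and the server name below are made-up examples; again, only do this
when you are sure no recovery is in progress:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  FileSystem fs = FileSystem.get(new Configuration());
  // hypothetical dead-server WAL dir under the default root dir
  Path src = new Path(
      "/hbase/WALs/host1.example.com,16020,1490000000000-splitting");
  Path dst = new Path("/tmp/host1.example.com,16020,1490000000000-splitting");
  fs.rename(src, dst);  // move aside rather than deleting outright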

Upon restart, you can see that these do not show up anymore. For more
technical details, please refer to the jira link.

Enis

On Fri, May 26, 2017 at 11:03 AM, jeff saremi 
wrote:

> Thank you for the GFY answer
>
> And i guess to figure out how to fix these I can always go through the
> HBase source code.
>
>
> 
> From: Dima Spivak 
> Sent: Friday, May 26, 2017 9:58:00 AM
> To: hbase-user
> Subject: Re: What is Dead Region Servers and how to clear them up?
>
> Sending this back to the user mailing list.
>
> RegionServers can die for many reasons. Looking at your RegionServer log
> files should give hints as to why it's happening.
>
>
> -Dima
>
> On Fri, May 26, 2017 at 9:48 AM, jeff saremi 
> wrote:
>
> > I had posted this to the user mailing list and I have not got any direct
> > answer to my question.
> >
> > Where do dead RS's come from and how can they be cleaned up? Someone in
> > the midst of developers should know this.
> >
> > thanks
> >
> > Jeff
> >
> > 
> > From: jeff saremi 
> > Sent: Thursday, May 25, 2017 10:23:17 AM
> > To: user@hbase.apache.org
> > Subject: Re: What is Dead Region Servers and how to clear them up?
> >
> > I'm still looking to get hints on how to remove the dead regions. thanks
> >
> > 
> > From: jeff saremi 
> > Sent: Wednesday, May 24, 2017 12:27:06 PM
> > To: user@hbase.apache.org
> > Subject: Re: What is Dead Region Servers and how to clear them up?
> >
> > i'm trying to eliminate the dead region servers.
> >
> > 
> > From: Ted Yu 
> > Sent: Wednesday, May 24, 2017 12:17:40 PM
> > To: user@hbase.apache.org
> > Subject: Re: What is Dead Region Servers and how to clear them up?
> >
> > bq. running hbck (many times
> >
> > Can you describe the specific inconsistencies you were trying to resolve
> ?
> > Depending on the inconsistencies, advice can be given on the best known
> > hbck command arguments to use.
> >
> > Feel free to pastebin master log if needed.
> >
> > On Wed, May 24, 2017 at 12:10 PM, jeff saremi 
> > wrote:
> >
> > > these are the things I have done so far:
> > >
> > >
> > > - restarting master (few times)
> > >
> > > - running hbck (many times; this tool does not seem to be doing
> anything
> > > at all)
> > >
> > > - checking the list of region servers in ZK (none of the dead ones are
> > > listed here)
> > >
> > > - checking the WALs under <hbase rootdir>/WALs. Out of 11 dead ones only 3
> > > are listed here with "-splitting" at the end of their names and they
> > > contain one single file like: 1493846660401..meta.1493922323600.meta
> > >
> > >
> > >
> > >
> > > 
> > > From: jeff saremi 
> > > Sent: Wednesday, May 24, 2017 9:04:11 AM
> > > To: user@hbase.apache.org
> > > Subject: What is Dead Region Servers and how to clear them up?
> > >
> > > Apparently having dead region servers is so common that a section of
> the
> > > master console is dedicated to that?
> > > How can we clean this up (preferably in an automated fashion)? Why
> isn't
> > > this being done by HBase automatically?
> > >
> > >
> > > thanks
> > >
> >
>


Re: [ANNOUNCE] - Welcome our new HBase committer Anastasia Braginsky

2017-03-27 Thread Enis Söztutar
Congrats and welcome.

Enis

On Mon, Mar 27, 2017 at 9:23 AM, Stephen Jiang 
wrote:

> Great!  Congratulations and welcome to the team!
>
> Thanks
> Stephen
>
> On Mon, Mar 27, 2017 at 8:53 AM, Andrew Purtell 
> wrote:
>
> > Congratulations and welcome!
> >
> > > On Mar 27, 2017, at 5:37 AM, ramkrishna vasudevan <
> > ramkrishna.s.vasude...@gmail.com> wrote:
> > >
> > > Hi All
> > >
> > > Welcome Anastasia Braginsky, one more female committer to HBase. She has
> > > been active for a while now with her Compacting memstore feature, and she,
> > > along with Eshcar, has given a lot of talks in various meetups and at
> > > HBaseCon on their feature.
> > >
> > > Welcome onboard and looking forward to working with you, Anastasia!!!
> > >
> > > Regards
> > > Ram
> >
>


Re: limiting user threads on client

2017-03-13 Thread Enis Söztutar
There are different thread pools in the client, and some of the thread
pools depend on how you are constructing connection and table instances.

The first thread pool is the one owned by the connection. If you are using
ConnectionFactory.createConnection() (which you should) then this is the
property that controls how many threads the connection can have:

hbase.hconnection.threads.max

This one configures when the threads will be discarded:

hbase.hconnection.threads.keepalivetime

You can also give your own thread pool to the Connection object if you want
to control threading behavior.
If you are creating HTable or Table objects from Connection, then by
default they share the same thread pool, so you do not have to do anything.
Otherwise, the HTable objects can have their own thread pools as well.
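
A sketch of both options (the numbers are examples, not recommendations):

  import java.util.concurrent.ExecutorService;
  import java.util.concurrent.Executors;
  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.Connection;
  import org.apache.hadoop.hbase.client.ConnectionFactory;

  Configuration conf = HBaseConfiguration.create();
  conf.setInt("hbase.hconnection.threads.max", 32);           // cap the pool
  conf.setInt("hbase.hconnection.threads.keepalivetime", 60); // seconds idle
  Connection conn = ConnectionFactory.createConnection(conf);

  // or hand the connection your own pool to control threading directly
  ExecutorService pool = Executors.newFixedThreadPool(16);
  Connection conn2 = ConnectionFactory.createConnection(conf, pool);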

Then, there are RPC-level thread pools. In 1.x versions (unless you have
netty based async RPC), there is one thread per regionserver that the
client talks to. I don't think there is a limit on how many of these the
client can have at a single time. So, if the client ends up doing RPCs to
many servers, there will be one thread per server.

You should probably use jstack or kill -3 to inspect the HBase client
threads.

Enis
On Mon, Mar 13, 2017 at 2:57 PM, anil gupta  wrote:

> I think you need to set that property before you make HBaseConfiguration
> object. Have you tried that?
>
> On Mon, Mar 13, 2017 at 10:24 AM, Henning Blohm 
> wrote:
>
> > Unfortunately it doesn't seem to make a difference.
> >
> > I see that the configuration has hbase.htable.threads.max=1 right before
> > setting up the Connection but then I still get hundreds of
> >
> > hconnection-***
> >
> > threads. Is that actually Zookeeper?
> >
> > Thanks,
> > Henning
> >
> > On 13.03.2017 17:28, Ted Yu wrote:
> >
> >> Are you using Java client ?
> >> See the following in HTable :
> >>
> >>    public static ThreadPoolExecutor getDefaultExecutor(Configuration conf) {
> >>      int maxThreads = conf.getInt("hbase.htable.threads.max",
> >>          Integer.MAX_VALUE);
> >>
> >> FYI
> >>
> >> On Mon, Mar 13, 2017 at 9:14 AM, Henning Blohm <
> henning.bl...@zfabrik.de>
> >> wrote:
> >>
> >> Hi,
> >>>
> >>> I am running an HBase client on a very resource limited machine. In
> >>> particular numproc is limited so that I frequently get "Cannot create
> >>> native thread" OOMs. I noticed that, in particular in write situations,
> >>> the
> >>> hconnection pool grows into the hundreds of threads - even when at most
> >>> writing with less than ten application threads. Threads are discarded
> >>> again
> >>> after some minutes.
> >>>
> >>> In conjunction with other programs running on that machine, this
> >>> sometimes
> >>> leads to an "overload" situation.
> >>>
> >>> Is there a way to keep thread pool usage limited - or in some closer
> >>> relation with the actual concurrency required?
> >>>
> >>> Thanks,
> >>>
> >>> Henning
> >>>
> >>>
> >>>
> >>>
> >
>
>
> --
> Thanks & Regards,
> Anil Gupta
>


Re: On HBase Read Replicas

2017-02-22 Thread Enis Söztutar
If you are doing a get to a specific replica, it will execute as a read
with retries to a single "copy". There will not be any backup / fallback
RPCs to any other replica.

Only in timeline consistency mode will there be fallback RPCs.

Enis

On Sun, Feb 19, 2017 at 9:43 PM, Anoop John <anoop.hb...@gmail.com> wrote:

> Thanks Enis.. I was not knowing the way of setting replica id
> specifically..  So what will happen if that said replica is down at
> the read time?  Will that go to another replica?
>
> -Anoop-
>
> On Sat, Feb 18, 2017 at 3:34 AM, Enis Söztutar <enis@gmail.com> wrote:
> > You can do gets using two different "modes":
> >  - Do a read with backup RPCs. In this case, the algorithm that I have above
> > will be used: 1 RPC to the primary, and 2 more RPCs after the primary times out.
> >  - Do a read to a single replica. In this case, there is only 1 RPC that
> > will happen to that given replica.
> >
> > Enis
> >
> > On Fri, Feb 17, 2017 at 12:03 PM, jeff saremi <jeffsar...@hotmail.com>
> > wrote:
> >
> >> Enis
> >>
> >> Thanks for taking the time to reply
> >>
> >> So I thought that a read request is sent to all replicas regardless. If
> we
> >> have the option of Sending to one, analyzing response, and then sending
> to
> >> another, this bodes well with our scenarios.
> >>
> >> Please confirm
> >>
> >> thanks
> >>
> >> 
> >> From: Enis Söztutar <enis@gmail.com>
> >> Sent: Friday, February 17, 2017 11:38:42 AM
> >> To: hbase-user
> >> Subject: Re: On HBase Read Replicas
> >>
> >> You can use read-replicas to distribute the read-load if you are fine
> with
> >> stale reads. The read replicas normally have a "backup rpc" path, which
> >> implements a logic like this:
> >>  - Send the RPC to the primary replica
> >>  - if no response for 100ms (or configured timeout), send RPCs to the
> other
> >> replicas
> >>  - return the first non-exception response.
> >>
> >> However, there is also another feature for read replicas, where you can
> >> indicate which exact replica_id you want to read from when you are
> doing a
> >> get. If you do this:
> >> Get get = new Get(row);
> >> get.setReplicaId(2);
> >>
> >> the Get RPC will only go to the replica_id=2. Note that if you have
> region
> >> replication = 3, then you will have regions with replica ids: {0, 1, 2}
> >> where replica_id=0 is the primary.
> >>
> >> So you can do load-balancing with a get.setReplicaId(random() %
> >> num_replicas) kind of pattern.
> >>
> >> Enis
> >>
> >>
> >>
> >> On Thu, Feb 16, 2017 at 9:41 AM, Anoop John <anoop.hb...@gmail.com>
> wrote:
> >>
> >> > Never saw this kind of discussion.
> >> >
> >> > -Anoop-
> >> >
> >> > On Thu, Feb 16, 2017 at 10:13 PM, jeff saremi <jeffsar...@hotmail.com
> >
> >> > wrote:
> >> > > Thanks Anoop.
> >> > >
> >> > > Understood.
> >> > >
> >> > > Have there been enhancement requests or discussions on load
> balancing
> >> by
> >> > providing additional replicas in the past? Has anyone else come up
> with
> >> > anything on this?
> >> > > thanks
> >> > >
> >> > > 
> >> > > From: Anoop John <anoop.hb...@gmail.com>
> >> > > Sent: Thursday, February 16, 2017 2:35:48 AM
> >> > > To: user@hbase.apache.org
> >> > > Subject: Re: On HBase Read Replicas
> >> > >
> >> > > The region replica feature came in to reduce the MTTR and so
> >> > > increase the data availability.  When the RS containing the master
> >> > > region dies, the clients can read from the secondary regions.  But keep
> >> > > in mind that the data from secondary regions will be a bit
> >> > > out of sync, as the replica is eventually consistent.  For this
> >> > > reason, changing the client to share the load across diff RSs
> >> > > might be tough.
> >> > >
> >> > > -Anoop-
> >> > >
> >> > > On Sun, Feb 12, 2017 at 8:13 AM, jeff saremi <
> jeffsar...@hotmail.com>
> >> > wrote:
> >

Re: On HBase Read Replicas

2017-02-17 Thread Enis Söztutar
You can do gets using two different "modes":
 - Do a read with backup RPCs. In this case, the algorithm that I have above
will be used: 1 RPC to the primary, and 2 more RPCs after the primary times out.
 - Do a read to a single replica. In this case, there is only 1 RPC that
will happen to that given replica.

Enis

On Fri, Feb 17, 2017 at 12:03 PM, jeff saremi <jeffsar...@hotmail.com>
wrote:

> Enis
>
> Thanks for taking the time to reply
>
> So I thought that a read request is sent to all replicas regardless. If we
> have the option of Sending to one, analyzing response, and then sending to
> another, this bodes well with our scenarios.
>
> Please confirm
>
> thanks
>
> 
> From: Enis Söztutar <enis@gmail.com>
> Sent: Friday, February 17, 2017 11:38:42 AM
> To: hbase-user
> Subject: Re: On HBase Read Replicas
>
> You can use read-replicas to distribute the read-load if you are fine with
> stale reads. The read replicas normally have a "backup rpc" path, which
> implements a logic like this:
>  - Send the RPC to the primary replica
>  - if no response for 100ms (or configured timeout), send RPCs to the other
> replicas
>  - return the first non-exception response.
>
> However, there is also another feature for read replicas, where you can
> indicate which exact replica_id you want to read from when you are doing a
> get. If you do this:
> Get get = new Get(row);
> get.setReplicaId(2);
>
> the Get RPC will only go to the replica_id=2. Note that if you have region
> replication = 3, then you will have regions with replica ids: {0, 1, 2}
> where replica_id=0 is the primary.
>
> So you can do load-balancing with a get.setReplicaId(random() %
> num_replicas) kind of pattern.
>
> Enis
>
>
>
> On Thu, Feb 16, 2017 at 9:41 AM, Anoop John <anoop.hb...@gmail.com> wrote:
>
> > Never saw this kind of discussion.
> >
> > -Anoop-
> >
> > On Thu, Feb 16, 2017 at 10:13 PM, jeff saremi <jeffsar...@hotmail.com>
> > wrote:
> > > Thanks Anoop.
> > >
> > > Understood.
> > >
> > > Have there been enhancement requests or discussions on load balancing
> by
> > providing additional replicas in the past? Has anyone else come up with
> > anything on this?
> > > thanks
> > >
> > > 
> > > From: Anoop John <anoop.hb...@gmail.com>
> > > Sent: Thursday, February 16, 2017 2:35:48 AM
> > > To: user@hbase.apache.org
> > > Subject: Re: On HBase Read Replicas
> > >
> > > The region replica feature came in to reduce the MTTR and so
> > > increase the data availability.  When the RS containing the master region
> > > dies, the clients can read from the secondary regions.  But keep
> > > in mind that the data from secondary regions will be a bit
> > > out of sync, as the replica is eventually consistent.  For this
> > > reason, changing the client to share the load across diff RSs
> > > might be tough.
> > >
> > > -Anoop-
> > >
> > > On Sun, Feb 12, 2017 at 8:13 AM, jeff saremi <jeffsar...@hotmail.com>
> > wrote:
> > >> Yes indeed. thank you very much Ted
> > >>
> > >> 
> > >> From: Ted Yu <yuzhih...@gmail.com>
> > >> Sent: Saturday, February 11, 2017 3:40:50 PM
> > >> To: user@hbase.apache.org
> > >> Subject: Re: On HBase Read Replicas
> > >>
> > >> Please take a look at the design doc attached to
> > >> https://issues.apache.org/jira/browse/HBASE-10070.
> > >>
> > >> Your first question would be answered by that document.
> > >>
> > >> Cheers
> > >>
> > >> On Sat, Feb 11, 2017 at 2:06 PM, jeff saremi <jeffsar...@hotmail.com>
> > wrote:
> > >>
> > >>> The first time I heard replicas in HBase the following thought
> > immediately
> > >>> came to my mind:
> > >>> To alleviate the load in read-heavy clusters, one could assign Region
> > >>> servers to be replicas of others so that the load is distributed and
> > there
> > >>> is less pressure on the main RS.
> > >>>
> > >>> Just 2 days ago a colleague quoted a paragraph from HBase manual that
> > >>> contradicted this completely. Apparently, the replicas do not help
> > with the
> > >>> load but they actually contribute to more traffic on the network and
> > on the
> > >>> underlying file system
> > >>>
> > >>> Would someone be able to give us some insight on why anyone would
> want
> > >>> replicas?
> > >>>
> > >>> And also could one easily change this behavior in the HBase native
> Java
> > >>> client to support what I had been imagining as the concept for
> > replicas?
> > >>>
> > >>>
> > >>> thanks
> > >>>
> >
>


Re: On HBase Read Replicas

2017-02-17 Thread Enis Söztutar
You can use read-replicas to distribute the read-load if you are fine with
stale reads. The read replicas normally have a "backup rpc" path, which
implements a logic like this:
 - Send the RPC to the primary replica
 - if no response for 100ms (or configured timeout), send RPCs to the other
replicas
 - return the first non-exception response.

However, there is also another feature for read replicas, where you can
indicate which exact replica_id you want to read from when you are doing a
get. If you do this:
Get get = new Get(row);
get.setReplicaId(2);

the Get RPC will only go to the replica_id=2. Note that if you have region
replication = 3, then you will have regions with replica ids: {0, 1, 2}
where replica_id=0 is the primary.

So you can do load-balancing with a get.setReplicaId(random() %
num_replicas) kind of pattern.
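
For contrast, a sketch of the first mode (timeline-consistent reads with
backup RPCs), assuming an existing Table instance "table" and row key "row":

  import org.apache.hadoop.hbase.client.Consistency;
  import org.apache.hadoop.hbase.client.Get;
  import org.apache.hadoop.hbase.client.Result;

  Get get = new Get(row);
  get.setConsistency(Consistency.TIMELINE); // allow backup RPCs to replicas
  Result result = table.get(get);
  if (result.isStale()) {
    // served by a secondary replica, so the data may lag the primary
  }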

Enis



On Thu, Feb 16, 2017 at 9:41 AM, Anoop John  wrote:

> Never saw this kind of discussion.
>
> -Anoop-
>
> On Thu, Feb 16, 2017 at 10:13 PM, jeff saremi 
> wrote:
> > Thanks Anoop.
> >
> > Understood.
> >
> > Have there been enhancement requests or discussions on load balancing by
> providing additional replicas in the past? Has anyone else come up with
> anything on this?
> > thanks
> >
> > 
> > From: Anoop John 
> > Sent: Thursday, February 16, 2017 2:35:48 AM
> > To: user@hbase.apache.org
> > Subject: Re: On HBase Read Replicas
> >
> > The region replica feature came in to reduce the MTTR and so
> > increase the data availability.  When the RS containing the master region
> > dies, the clients can read from the secondary regions.  But keep
> > in mind that the data from secondary regions will be a bit
> > out of sync, as the replica is eventually consistent.  For this
> > reason, changing the client to share the load across diff RSs
> > might be tough.
> >
> > -Anoop-
> >
> > On Sun, Feb 12, 2017 at 8:13 AM, jeff saremi 
> wrote:
> >> Yes indeed. thank you very much Ted
> >>
> >> 
> >> From: Ted Yu 
> >> Sent: Saturday, February 11, 2017 3:40:50 PM
> >> To: user@hbase.apache.org
> >> Subject: Re: On HBase Read Replicas
> >>
> >> Please take a look at the design doc attached to
> >> https://issues.apache.org/jira/browse/HBASE-10070.
> >>
> >> Your first question would be answered by that document.
> >>
> >> Cheers
> >>
> >> On Sat, Feb 11, 2017 at 2:06 PM, jeff saremi 
> wrote:
> >>
> >>> The first time I heard replicas in HBase the following thought
> immediately
> >>> came to my mind:
> >>> To alleviate the load in read-heavy clusters, one could assign Region
> >>> servers to be replicas of others so that the load is distributed and
> there
> >>> is less pressure on the main RS.
> >>>
> >>> Just 2 days ago a colleague quoted a paragraph from HBase manual that
> >>> contradicted this completely. Apparently, the replicas do not help
> with the
> >>> load but they actually contribute to more traffic on the network and
> on the
> >>> underlying file system
> >>>
> >>> Would someone be able to give us some insight on why anyone would want
> >>> replicas?
> >>>
> >>> And also could one easily change this behavior in the HBase native Java
> >>> client to support what I had been imagining as the concept for
> replicas?
> >>>
> >>>
> >>> thanks
> >>>
>


Re: [ANNOUNCE] New HBase Committer Josh Elser

2016-12-12 Thread Enis Söztutar
Congrats Josh!

Enis

On Mon, Dec 12, 2016 at 11:39 AM, Esteban Gutierrez 
wrote:

> Congrats and welcome, Josh!
>
> esteban.
>
>
> --
> Cloudera, Inc.
>
>
> On Sun, Dec 11, 2016 at 10:17 PM, Yu Li  wrote:
>
> > Congratulations and welcome!
> >
> > Best Regards,
> > Yu
> >
> > On 12 December 2016 at 12:47, Mikhail Antonov 
> > wrote:
> >
> > > Congratulations Josh!
> > >
> > > -Mikhail
> > >
> > > On Sun, Dec 11, 2016 at 5:20 PM, 张铎  wrote:
> > >
> > > > Congratulations!
> > > >
> > > > 2016-12-12 9:03 GMT+08:00 Jerry He :
> > > >
> > > > > Congratulations , Josh!
> > > > >
> > > > > Good work on the PQS too.
> > > > >
> > > > > Jerry
> > > > >
> > > > > On Sun, Dec 11, 2016 at 12:14 PM, Josh Elser 
> > > wrote:
> > > > >
> > > > > > Thanks, all. I'm looking forward to continuing to work with you
> > all!
> > > > > >
> > > > > >
> > > > > > Nick Dimiduk wrote:
> > > > > >
> > > > > >> On behalf of the Apache HBase PMC, I am pleased to announce that
> > > Josh
> > > > > >> Elser
> > > > > >> has accepted the PMC's invitation to become a committer on the
> > > > project.
> > > > > We
> > > > > >> appreciate all of Josh's generous contributions thus far and
> look
> > > > > forward
> > > > > >> to his continued involvement.
> > > > > >>
> > > > > >> Allow me to be the first to congratulate and welcome Josh into
> his
> > > new
> > > > > >> role!
> > > > > >>
> > > > > >>
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > Thanks,
> > > Michael Antonov
> > >
> >
>


Re: [DISCUSS] EOL 1.1 Release Branch

2016-11-07 Thread Enis Söztutar
Going back to the original discussion, the conclusion is to continue with
the 1.1 line for now and re-evaluate in a couple of months, it seems.

Nick, do you want to keep driving the next 1.1 releases, or do you want to
let Andrew do that? I can help as well if needed.

Enis

On Mon, Nov 7, 2016 at 12:17 PM, Andrew Purtell 
wrote:

> I have a patch for this and will be trying it out.
>
> On Nov 7, 2016, at 12:00 PM, Gary Helmling  wrote:
>
> >>
> >> I'm not deeply familiar with the AssignmentManager. I see when we
> process
> >> split rollbacks in onRegionSplit() we only call regionOffline() on
> >> daughters if they are known to exist. However when processing merge
> >> rollbacks in the else case of onRegionMerge() we unconditionally call
> >> regionOffline() on the parent-being-merged. Shouldn't that likewise be
> >> conditional on regionStates holding a state for the parent-being-merged?
> >> Pardon if I've missed something.
> >>
> >>
> > I'm really not familiar with the merge code, but this seems plausible to
> > me.  I see that onRegionSplit() has an early out at the top of the
> method,
> > but that will fail to evaluate if rs_a and rs_b are open and rs_p is
> null.
> > So if it's called with a code of MERGE_REVERTED, I think we could wind up
> > creating an offline meta entry for rs_p with no regioninfo, similar to
> > HBASE-16093.  And that entry could wind up hiding the (still online)
> > daughter regions.
>


Re: [DISCUSS] EOL 1.1 Release Branch

2016-11-04 Thread Enis Söztutar
I also think that keeping 1.1 going for a bit longer might still be helpful,
especially if the ITBLL is failing with branch-1.2. Almost all of our
internal testing happens with a 1.1 based code base, so I cannot tell
whether 1.2 / 1.3 has the same stability or not.

Enis

On Fri, Nov 4, 2016 at 5:05 PM, Andrew Purtell  wrote:

> Thanks. Yes I have been eyeing HBASE-16093. There might be another corner
> case there.
>
>
> On Fri, Nov 4, 2016 at 4:41 PM, Gary Helmling  wrote:
>
> > >
> > > The behavior: Looks like failed split/compaction rollback: row(s) in
> META
> > > without HRegionInfo, regions deployed without valid meta entries (at
> > > first), regions on HDFS without valid meta entries (later, after RS
> > > carrying them are killed by chaos), holes in the region chain leading
> to
> > > timeouts and job failure.
> > >
> > >
> > The empty regioninfo in meta sounds like HBASE-16093, though that fix is
> in
> > 1.2.  Interested to see if there are other problems around splits though.
> > Do you have a JIRA yet for tracking?
> >
> >
> > >
> > > You'll know you have found it when on the ITBLL console its meta
> scanner
> > > starts complaining about rows in meta without serialized HRegionInfo.
> > >
> > >
> > Will keep an eye out for this in our ITBLL runs here.
> >
>
>
>
> --
> Best regards,
>
>- Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)
>


Re: Re: Re: Re: What way to improve MTTR other than DLR(distributed log replay)

2016-10-31 Thread Enis Söztutar
inlined.

On Thu, Oct 27, 2016 at 3:01 PM, Stack <st...@duboce.net> wrote:

> On Fri, Oct 21, 2016 at 3:24 PM, Enis Söztutar <enis@gmail.com> wrote:
>
> > A bit late, but let me give my perspective. This can also be moved to
> jira
> > or dev@ I think.
> >
> > DLR was nice and had pretty good gains for MTTR. However, dealing with
> > the sequence ids, onlining regions etc and the replay paths proved to be
> > too difficult in practice. I think the way forward would be to not bring
> > DLR back, but actually fix long standing log split problems.
> >
> > The main gain in DLR is that we do not create lots and lots of tiny
> files,
> > but instead rely on the regular region flushes, to flush bigger files.
> This
> > also helps with handling requests coming from different log files etc.
> The
> > only gain that I can think of that you get with DLR, but not with log
> split
> > is the online enabling of writes while the recovery is going on.
> However, I
> > think it is not worth having DLR just for this feature.
> >
> >
> And not having to write intermediary files as you note at the start of your
> paragraph.
>
>
>
> > Now, what are the problems with Log Split you ask. The problems are
> >   - we create a lot of tiny files
> >   - these tiny files are replayed sequentially when the region is
> assigned
> >   - The region has to replay and flush all data sequentially coming from
> > all these tiny files.
> >
> >
> Longest pole in MTTR used to be noticing the RS had gone away in the first
> place. Let's not forget to add this to our list.
>
>
>
> > In terms of IO, we pay the cost of reading original WAL files, and
> writing
> > this same amount of data into many small files where the NN overhead is
> > huge. Then for every region, we do serially sort the data by re-reading
> the
> > tiny WAL files (recovered edits) and sorting them in memory and flushing
> > the data. Which means we do 2 times the reads and writes that we should
> do
> > otherwise.
> >
> > The way to solve our log split bottlenecks is re-reading the big table
> > paper and implement the WAL recovery as described there.
> >  - Implement an HFile format that can contain data from multiple regions.
> > Something like a concatenated HFile format where each region has its own
> > section, with its own sequence id, etc.
>
>  - Implement links to these files where a link can refer to this data. This
> > is very similar to our ReferenceFile concept.
>
>  - In each log splitter task, instead of generating tiny WAL files that are
> > recovered edits, we instead buffer up in memory, and do a sort (this is
> the
> > same sort of inserting into the memstore) per region. A WAL is ~100 MB on
> > average, so it should not be a big problem to buffer this up.
>
>
>
> Need to be able to spill. There will be anomalies.
>
>
>
> > At the end of
> > the WAL split task, write an hfile containing data from all the regions
> as
> > described above. Also do a multi NN request to create links in regions to
> > refer to these files (Not sure whether NN has a batch RPC call or not).
> >
> >
> It does not.
>
> So, doing an accounting, I see little difference from what we have now. In
> new scheme:
>
> + We read all WALs as before.
> + We write about the same (in current scheme, we'd aggregate
> across WAL so we didn't write a recovered edits file per WAL) though new
> scheme
>

Is the current scheme DLR or DLS (log split)? Compared to DLS, we will read
the WAL once and write sorted HFiles once, instead of reading the WAL twice
and writing small WALs (recovered.edits).


> maybe less since we currently flush after replay of recovered edits so we
> nail an
> hfile into the file system that has the recovered edits (but in new scheme,
> we'll bring
> on a compaction because we have references which will cause a rewrite of
> the big hfile
> into a smaller one...).
> + Metadata ops are about the same (rather than lots of small recovered
> edits files instead
> we write lots of small reference files)
>

Yes, I was assuming that we can do a batch createFiles() call in a single
RPC, reducing the NN overhead significantly. Lacking that, it is better to
write small hfiles directly under the region directory as Phil suggests
above. Actually, if we do hfile writes directly with spilling in regular
log split, it will be a good incremental change.

If we have the file namespace (NN) in an hbase table (like meta), as we have
talked about in the 1M regions jiras, we can gain a lot by writing a single
hfile per WAL.
The "hard links" will be cells in

Re: Re: Re: Re: What way to improve MTTR other than DLR(distributed log replay)

2016-10-21 Thread Enis Söztutar
A bit late, but let me give my perspective. This can also be moved to jira
or dev@ I think.

DLR was nice and had pretty good gains for MTTR. However, dealing with
the sequence ids, onlining regions, etc. and the replay paths proved to be
too difficult in practice. I think the way forward would be to not bring
DLR back, but actually fix long standing log split problems.

The main gain in DLR is that we do not create lots and lots of tiny files,
but instead rely on the regular region flushes, to flush bigger files. This
also helps with handling requests coming from different log files etc. The
only gain that I can think of that you get with DLR, but not with log split
is the online enabling of writes while the recovery is going on. However, I
think it is not worth having DLR just for this feature.

Now, what are the problems with Log Split you ask. The problems are
  - we create a lot of tiny files
  - these tiny files are replayed sequentially when the region is assigned
  - The region has to replay and flush all data sequentially coming from
all these tiny files.

In terms of IO, we pay the cost of reading original WAL files, and writing
this same amount of data into many small files where the NN overhead is
huge. Then for every region, we do serially sort the data by re-reading the
tiny WAL files (recovered edits) and sorting them in memory and flushing
the data. Which means we do 2 times the reads and writes that we should do
otherwise.

The way to solve our log split bottlenecks is re-reading the big table
paper and implement the WAL recovery as described there.
 - Implement an HFile format that can contain data from multiple regions.
Something like a concatenated HFile format where each region has its own
section, with its own sequence id, etc.
 - Implement links to these files where a link can refer to this data. This
is very similar to our ReferenceFile concept.
 - In each log splitter task, instead of generating tiny WAL files that are
recovered edits, we instead buffer up in memory, and do a sort (this is the
same sort of inserting into the memstore) per region. A WAL is ~100 MB on
average, so it should not be a big problem to buffer this up. At the end of
the WAL split task, write an hfile containing data from all the regions as
described above. Also do a multi NN request to create links in regions to
refer to these files (Not sure whether NN has a batch RPC call or not).

The reason this will be on par or better than DLR is that we are only
doing 1 read and 1 write, and the sort is parallelized. The region opening
does not have to block on replaying anything or waiting for flush, because
the data is already sorted and in HFile format. These hfiles will be used
the normal way by adding them to the KVHeaps, etc. When compactions run, we
will be removing the links to these files using the regular mechanisms.

Enis

On Tue, Oct 18, 2016 at 6:58 PM, Ted Yu  wrote:

> Allan:
> One factor to consider is that the assignment manager in hbase 2.0 would be
> quite different from those in 0.98 and 1.x branches.
>
> Meaning, you may need to come up with two solutions for a single problem.
>
> FYI
>
> On Tue, Oct 18, 2016 at 6:11 PM, Allan Yang  wrote:
>
> > Hi, Ted
> > These issues I mentioned above(HBASE-13567, HBASE-12743, HBASE-13535,
> > HBASE-14729) are ALL reproduced in our HBase1.x test environment. Fixing
> > them is exactly what I'm going to do. I haven't found the root cause yet,
> > but  I will update if I find solutions.
> >  What I'm afraid of is that there are other issues I don't know about yet.
> > So if you or other guys know of other issues related to DLR, please let
> > me know.
> >
> >
> > Regards
> > Allan Yang
> >
> >
> >
> >
> >
> >
> >
> > At 2016-10-19 00:19:06, "Ted Yu"  wrote:
> > >Allan:
> > >I wonder how you deal with open issues such as HBASE-13535.
> > >From your description, it seems your team fixed more DLR issues.
> > >
> > >Cheers
> > >
> > >On Mon, Oct 17, 2016 at 11:37 PM, allanwin  wrote:
> > >
> > >>
> > >>
> > >>
> > >> Here is the thing. We have backported DLR (HBASE-7006) to our 0.94
> > >> clusters in our production environment (of course a lot of bugs were
> > >> fixed and it is working well). It was proven to be a huge gain. When a
> > >> large cluster crashed, the MTTR improved from several hours to less
> > >> than an hour. Now we want to move on to HBase1.x, and we still want
> > >> DLR. This time, we don't want to backport the 'backported' DLR to
> > >> HBase1.x, but it seems that the community has decided to remove DLR...
> > >>
> > >>
> > >> The DLR feature is proven useful in our production environment, so I
> > think
> > >> I will try to fix its issues in branch-1.x
> > >>
> > >>
> > >>
> > >>
> > >>
> > >>
> > >> At 2016-10-18 13:47:17, "Anoop John"  wrote:
> > >> >Agree with ur observation.. But DLR feature we wanted to get
> removed..
> > >> >Because it 

[ANNOUNCE] Stephen Yuan Jiang joins Apache HBase PMC

2016-10-14 Thread Enis Söztutar
On behalf of the Apache HBase PMC, I am happy to announce that Stephen has
accepted our invitation to become a PMC member of the Apache HBase project.

Stephen has been working on HBase for a couple of years, and has already
been a committer for more than a year. Apart from his contributions in proc
v2, hbck and other areas, he is also helping with the 2.0 release, which is
the most important milestone for the project this year.

Welcome to the PMC Stephen,
Enis


Re: Increased response time of hbase calls

2016-09-22 Thread Enis Söztutar
Please also note that m7 IS NOT HBase and has no connection with Apache
HBase at all. Please do not let your vendor tell you otherwise.

Enis

On Thu, Sep 22, 2016 at 12:00 AM, Deepak Khandelwal <
dkhandelwal@gmail.com> wrote:

> Ok thx everyone.
>
> Will check with Mapr
>
> On Wednesday, September 21, 2016, Heng Chen 
> wrote:
>
> > Not sure hbase m7 is which version of hbase in community.
> >
> > Is your batch load job some kind of bulk load or just call HTable API
> > to dump data to HBase?
> >
> >
> > 2016-09-22 14:30 GMT+08:00 Dima Spivak  > >:
> > > Hey Deepak,
> > >
> > > Assuming I understand your question, I think you'd be better served
> > > reaching out to MapR directly. Our community isn't involved in M7 so
> the
> > > average user (or dev) wouldn't know about the ins and outs of that
> > > offering.
> > >
> > > On Wednesday, September 21, 2016, Deepak Khandelwal <
> > > dkhandelwal@gmail.com > wrote:
> > >
> > >> Hi all
> > >>
> > >> I am facing an issue while accessing data from an hbase m7 table which
> > has
> > >> about 50 million records.
> > >>
> > >> In a single Api request, we make 3 calls to hbase m7.
> > >> 1. Single Multi get to fetch about 30 records
> > >> 2. Single multi-put to update about 500 records
> > >> 3. Single multi-get to fetch about 15 records
> > >>
> > >> We consistently get the response in less than 200 ms for approx
> > >> 99% of calls. We have a tps of about 200 with 8 VMs.
> > >> But we see an issue every day between 4pm and 6pm when API response
> > >> time increases significantly from 200ms to 7-8sec. This happens
> > >> because we have a daily batch load that runs between 4 and 6pm that
> > >> puts multiple entries into the same hbase table.
> > >>
> > >> We are trying to find out why response time increases when the batch
> > >> load runs. We cannot change the time of the batch job. Is there
> > >> anything we could do to resolve this issue? Any help or pointers
> > >> would be much appreciated. Thanks
> > >>
> > >
> > >
> > > --
> > > -Dima
> >
>


Re: [ANNOUNCE] Duo Zhang (张铎) joins the Apache HBase PMC

2016-09-07 Thread Enis Söztutar
Congrats Duo.

Enis

On Wed, Sep 7, 2016 at 8:03 AM, Misty Stanley-Jones 
wrote:

> Congratulations, Duo!
>
> > On Sep 6, 2016, at 9:26 PM, Stack  wrote:
> >
> > On behalf of the Apache HBase PMC I am pleased to announce that 张铎
> > has accepted our invitation to become a PMC member on the Apache
> > HBase project. Duo has healthy notions on where the project should be
> > headed and over the last year and more has been working furiously to take
> > us there.
> >
> > Please join me in welcoming Duo to the HBase PMC!
> >
> > One of us!
> > St.Ack
>
>


Re: maybe waste on blockCache

2016-06-20 Thread Enis Söztutar
The LRU block cache does not reserve space for the in_memory tier. That
space is used by the other tiers as well, as long as it is free.

You can read the code at LruBlockCache.evict() to learn more.
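
For completeness, the in_memory flag is set per column family; a sketch with
made-up table and family names:

  import org.apache.hadoop.hbase.HColumnDescriptor;
  import org.apache.hadoop.hbase.HTableDescriptor;
  import org.apache.hadoop.hbase.TableName;

  HTableDescriptor desc = new HTableDescriptor(TableName.valueOf("t1"));
  HColumnDescriptor family = new HColumnDescriptor("f");
  family.setInMemory(true); // cache this family's blocks at in_memory priority
  desc.addFamily(family);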

Enis

On Thu, Jun 16, 2016 at 1:21 AM, WangYQ  wrote:

> in hbase 0.98.10, if we use LruBlockCache, and set regionServer's max heap
> to 10G
> by default:
> the size of in_memory priority of LruBlockCache is :
> 10G * 0.4 * 0.25 = 1G
>
>
> 0.4: hfile.block.cache.size
> 0.25: hbase.lru.blockcache.memory.percentage
>
>
> if we do not set any user table's IN_MEMORY to true, then the whole hbase
> just needs to cache hbase:meta data in the in_memory LruBlockCache.
> hbase:meta does not split, so only one regionServer needs to cache it, so
> there is some waste in the blockCache
>
>
> i think only the regionServer that opens hbase:meta needs to set the in_memory
> LruBlockCache to a certain size;
> other regionServers could set hbase.lru.blockcache.memory.percentage to 0 and
> not allocate an in_memory LruBlockCache at all.


Re: Scan.setMaxResultSize and Result.isPartial

2016-06-17 Thread Enis Söztutar
You should probably read
https://blogs.apache.org/hbase/entry/scan_improvements_in_hbase_1 first.

In HBase-1.1 and later code bases, you can call Scan.setAllowPartialResults(true)
to instruct the ClientScanner to give you partial results. In this case,
you can use Result.isPartial() to stitch together multiple Result objects
into a single row. Unless you explicitly request it, Results returned will
never be partial results. Why would you want to call
Scan.setAllowPartialResults() in the first place? It is because of client-side
memory allocation. If you have a row with millions of columns and GBs of
data let's say, you cannot afford to have the ClientScanner to auto-stitch
all the column values for you and give a single Result object, because it
will cause OOM.

Hope this helps.
Enis

On Fri, Jun 17, 2016 at 4:15 PM, Bryan Beaudreault  wrote:

> Hello,
>
> We are running 1.2.0-cdh5.7.0 on our server side, and 1.0.0-cdh5.4.5 on the
> client side. We're in the process of upgrading the client, but aren't there
> yet. I'm trying to figure out the relationship of Result.isPartial and the
> user, when setMaxResultSize is used.
>
> I've done a little reading of the code, and it looks like isPartial is
> mostly used by the internals of ClientScanner. From what I can tell the
> user should never get a Result where isPartial == true, because the
> ClientScanner will do multiple requests internally to flesh out incomplete
> rows.
>
> However, the code is a bit complex so I'd like to verify. Is this correct
> for either version of HBase above? Is it safe to use setMaxResultSize
> without any more work, or should we be handling the potential isPartial()
> Result ourselves in every scan request we make?
>
> I wonder if this should be added to the docs, either way (didn't see it),
> or remove isPartial from the public API in future versions?
>
> Thanks!
>


Re: Enabling stripe compaction without disabling table

2016-06-07 Thread Enis Söztutar
Disabling the table should not be needed. From the stripe compaction
perspective, deploying this on a disabled table versus via an online alter
is not different at all. The "hbase.online.schema.update.enable"
property was guarding against some possible race conditions that were fixed a
long time ago.
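
A sketch of such an online alter from Java; the table name is illustrative,
and the engine class is the one the stripe compaction docs reference:

  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.HTableDescriptor;
  import org.apache.hadoop.hbase.TableName;
  import org.apache.hadoop.hbase.client.*;

  try (Connection conn =
           ConnectionFactory.createConnection(HBaseConfiguration.create());
       Admin admin = conn.getAdmin()) {
    TableName name = TableName.valueOf("t1");
    HTableDescriptor desc = admin.getTableDescriptor(name);
    desc.setConfiguration("hbase.hstore.engine.class",
        "org.apache.hadoop.hbase.regionserver.StripeStoreEngine");
    admin.modifyTable(name, desc); // applied online, no disable needed
  }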

We should update the documentation. Mind creating a small patch?
Enis

On Mon, Jun 6, 2016 at 12:37 PM, Bryan Beaudreault  wrote:

> Thanks Ted, I have seen that and I have had it set for to true for years
> without issue. I was asking in this case because the docs for stripe
> compaction explicitly say to disable the table. I will test in our QA
> environment first, but would also appreciate input from anyone who has done
> this without disabling the table first, for better or worse.
>
> On Mon, Jun 6, 2016 at 3:26 PM Ted Yu  wrote:
>
> > Have you seen the doc at the top
> > of ./hbase-shell/src/main/ruby/shell/commands/alter.rb ?
> >
> > Alter a table. If the "hbase.online.schema.update.enable" property is set
> > to
> > false, then the table must be disabled (see help 'disable'). If the
> > "hbase.online.schema.update.enable" property is set to true, tables can
> be
> > altered without disabling them first. Altering enabled tables has caused
> > problems
> > in the past, so use caution and test it before using in production.
> >
> > FYI
> >
> > On Mon, Jun 6, 2016 at 12:19 PM, Bryan Beaudreault <
> > bbeaudrea...@hubspot.com
> > > wrote:
> >
> > > Hello,
> > >
> > > We're running hbase 1.2.0-cdh5.7.0. According to the HBase book, in
> order
> > > to enable stripe compactions on a table we need to first disable the
> > > table. We
> > > basically can't disable tables in production. Is it possible to do this
> > > without disabling the table?  If not, are there any plans to make this
> > > doable?
> > >
> > > Thanks!
> > >
> >
>


Re: [ANNOUNCE] Mikhail Antonov joins the Apache HBase PMC

2016-05-26 Thread Enis Söztutar
Welcome Mikhail.

Enis

On Thu, May 26, 2016 at 12:19 PM, Gary Helmling  wrote:

> Welcome Mikhail!
>
> On Thu, May 26, 2016 at 11:47 AM Ted Yu  wrote:
>
> > Congratulations, Mikhail !
> >
> > On Thu, May 26, 2016 at 11:30 AM, Andrew Purtell 
> > wrote:
> >
> > > On behalf of the Apache HBase PMC I am pleased to announce that Mikhail
> > > Antonov has accepted our invitation to become a PMC member on the
> Apache
> > > HBase project. Mikhail has been an active contributor in many areas,
> > > including recently taking on the Release Manager role for the upcoming
> > > 1.3.x code line. Please join me in thanking Mikhail for his
> contributions
> > > to date and anticipation of many more contributions.
> > >
> > > Welcome to the PMC, Mikhail!
> > >
> > > --
> > > Best regards,
> > >
> > >- Andy
> > >
> > > Problems worthy of attack prove their worth by hitting back. - Piet
> Hein
> > > (via Tom White)
> > >
> >
>


Re: [ANNOUNCE] PhoenixCon the day after HBaseCon

2016-05-19 Thread Enis Söztutar
This is great. Thanks James for organizing!

Enis

On Thu, May 19, 2016 at 2:50 PM, James Taylor 
wrote:

> The inaugural PhoenixCon will take place 9am-1pm on Wed, May 25th (at
> Salesforce @ 1 Market St, SF), the day after HBaseCon. We'll have two
> tracks: one for Apache Phoenix use cases and one for Apache Phoenix
> internals.
>
> To RSVP and for more details see here[1].
>
> We hope you can make it!
>
> James
>
> [1]
> http://www.meetup.com/SF-Bay-Area-Apache-Phoenix-Meetup/events/230545182/
>


Re: No server address listed in hbase:meta for region SYSTEM.CATALOG

2016-04-28 Thread Enis Söztutar
Phoenix does not support HBase-1.2 clusters right now. Only the 4.8 release
of Phoenix will support running Phoenix with HBase-1.2.

See https://issues.apache.org/jira/browse/PHOENIX-2833

What version of Phoenix did you deploy? It might be the case that Phoenix
coprocessors are just throwing exceptions. You can check the regionserver
logs.

Enis

On Thu, Apr 28, 2016 at 7:55 AM, Ted Yu  wrote:

> Here is sample scan output from a working cluster
>
>  hbase:namespace,,146075636.acc7841bcbaca column=info:regioninfo,
> timestamp=1460756360969, value={ENCODED =>
> acc7841bcbacafacf336e48bb14794de, NAME => 'hbase:namespace,,14607
>  facf336e48bb14794de.
> 5636.acc7841bcbacafacf336e48bb14794de.', STARTKEY => '', ENDKEY => ''}
>  hbase:namespace,,146075636.acc7841bcbaca column=info:seqnumDuringOpen,
> timestamp=1461104489067, value=\x00\x00\x00\x00\x00\x00\x00\x0C
>  facf336e48bb14794de.
>  hbase:namespace,,146075636.acc7841bcbaca column=info:server,
> timestamp=1461104489067, value=x.com:16020
>  facf336e48bb14794de.
>  hbase:namespace,,146075636.acc7841bcbaca column=info:serverstartcode,
> timestamp=1461104489067, value=1461104478409
>
> Note the info:server column.
>
> How did you deploy hbase and Phoenix.
>
> Mind pastebinning master log and meta server log ?
>
> Cheers
>
> On Thu, Apr 28, 2016 at 1:33 AM, Manisha Sethi 
> wrote:
>
> > Hi
> >
> > I have my hbase:meta table entries as :
> > SYSTEM.CATALOG,,1461831992343.a6daf63bd column=info:regioninfo,
> > timestamp=1461831993549, value={ENCODED =>
> > a6daf63bde1f1456ca4acee228b8f5fe, NAME => 'SYSTEM
> > e1f1456ca4acee228b8f5fe.
> > .CATALOG,,1461831992343.a6daf63bde1f1456ca4acee228b8f5fe.', STARTKEY =>
> '',
> > ENDKEY => ''}
> > hbase:namespace,,1461821579649.7edd6a09 column=info:regioninfo,
> > timestamp=1461821581196, value={ENCODED =>
> > 7edd6a099dc3612b7dafa52f380ac3e6, NAME => 'hbase:
> > 9dc3612b7dafa52f380ac3e6.
> >  namespace,,1461821579649.7edd6a099dc3612b7dafa52f380ac3e6.', STARTKEY =>
> > '', ENDKEY => ''}
> > hbase:namespace,,1461821579649.7edd6a09 column=info:seqnumDuringOpen,
> > timestamp=1461831928239, value=\x00\x00\x00\x00\x00\x00\x00\x1C
> > 9dc3612b7dafa52f380ac3e6.
> >
> > While I do scan 'SYSTEM.CATALOG' I get exception:
> >
> > No server address listed in hbase:meta for region
> > SYSTEM.CATALOG,.
> >
> > My aim is to connect to hbase through phoenix, but even hbase shell scan is
> > not working. I can see entries in the meta table. I tried flush and compact
> > for meta also. But no progress...
> >
> > I am using Hbase 1.2
> >
> > Thanks
> > Manisha Sethi
> >
> >
> > 
> >
> >
>


Re: new 4.x-HBase-1.1 branch

2016-04-21 Thread Enis Söztutar
>
>> Also FWIW. I'd be curious to hear how many Phoenix users are using 0.98
>> versus 1.0 and up, besides the folks at Salesforce, whom I know well
>> (smile). And, more generally, who in greater HBase land is still on 0.98
>> and won't move off this year.
>>
>
From our customers' perspective, most of the users are on 1.1+.
The ones using 0.98-based HBase releases are not getting the latest Phoenix
anyway through the vendor channels.

Enis


>
>> > On Apr 20, 2016, at 5:00 PM, James Taylor 
>> wrote:
>> >
>> > Due to some API changes in HBase, we need to have a separate branch for
>> our
>> > HBase 1.1 compatible branches. I've created a 4.x-HBase-1.1 branch for
>> this
>> > - please make sure to keep this branch in sync with the other 4.x and
>> > master branches. We can release 4.x-HBase-1.2 compatible releases out of
>> > master for 4.8. I think post 4.8 release we may not need to continue
>> with
>> > the 4.x-HBase-1.0 branch.
>> >
>> > Thanks,
>> > James
>>
>
>


Re: new 4.x-HBase-1.1 branch

2016-04-21 Thread Enis Söztutar
Makes sense to drop the branch for HBase-1.0.x.

I had proposed it here before:
http://search-hadoop.com/m/9UY0h2XrnGW1d3OBF1=+DISCUSS+Drop+branch+for+HBase+1+0+


Enis

On Thu, Apr 21, 2016 at 8:06 AM, Andrew Purtell 
wrote:

> HBase announced at the last 1.0 release that it would be the last release
> in that line and I think we would recommend any 1.0 user move up to 1.1 or
> 1.2 at their earliest convenience. FWIW
>
> Also, as RM of the 0.98 code line I am considering ending its (mostly)
> regular release cadence at the end of this calendar year. I'd continue if
> there were expressed user or dev demand but otherwise place it into the
> same state as 1.0. Also FWIW. I'd be curious to hear how many Phoenix users
> are using 0.98 versus 1.0 and up, besides the folks at Salesforce, whom I
> know well (smile). And, more generally, who in greater HBase land is still
> on 0.98 and won't move off this year.
>
> > On Apr 20, 2016, at 5:00 PM, James Taylor 
> wrote:
> >
> > Due to some API changes in HBase, we need to have a separate branch for
> our
> > HBase 1.1 compatible branches. I've created a 4.x-HBase-1.1 branch for
> this
> > - please make sure to keep this branch in sync with the other 4.x and
> > master branches. We can release 4.x-HBase-1.2 compatible releases out of
> > master for 4.8. I think post 4.8 release we may not need to continue with
> > the 4.x-HBase-1.0 branch.
> >
> > Thanks,
> > James
>


Re: [Query:]Table creation with column family in Phoenix

2016-03-14 Thread Enis Söztutar
Phoenix maintains a column with an empty value because, unlike in a
row-oriented RDBMS, a NULL column is not represented explicitly in HBase but
implicitly, by the cell not being present in HBase at all.

Here is the explanation from phoenix.apache.org:

For CREATE TABLE, an empty key value will also be added for each row so
that queries behave as expected (without requiring all columns to be
projected during scans). For CREATE VIEW, this will not be done, nor will
any HBase metadata be created. Instead the existing HBase metadata must
match the metadata specified in the DDL statement or an "ERROR 505 (42000):
Table is read only" will be thrown.

Enis

On Fri, Mar 11, 2016 at 2:53 PM, Harish Krishnan <
harish.t.krish...@gmail.com> wrote:

> Your scan query is returning all states/versions of your columns and column
> families
>
> Thanks & Regards,
> Harish.T.K
>
> On Thu, Mar 10, 2016 at 11:54 PM, Divya Gehlot 
> wrote:
>
> > Hi,
> > I created a table in Phoenix with three column families  and Inserted the
> > values as shown below
> >
> > Syntax :
> >
> > > CREATE TABLE TESTCF (MYKEY VARCHAR NOT NULL PRIMARY KEY, CF1.COL1
> > VARCHAR,
> > > CF2.COL2 VARCHAR, CF3.COL3 VARCHAR)
> > > UPSERT INTO TESTCF (MYKEY,CF1.COL1,CF2.COL2,CF3.COL3)values
> > > ('Key2','CF1','CF2','CF3')
> > > UPSERT INTO TESTCF (MYKEY,CF1.COL1,CF2.COL2,CF3.COL3)values
> > > ('Key2','CF12','CF22','CF32')
> >
> >
> >  When I try to scan same table in Hbase
> > hbase(main):010:0> scan "TESTCF"
> >
> > > ROW   COLUMN+CELL
> > >  Key1 column=CF1:COL1, timestamp=1457682385805,
> value=CF1
> > >  Key1 column=CF1:_0, timestamp=1457682385805, value=
> > >  Key1 column=CF2:COL2, timestamp=1457682385805,
> value=CF2
> > >  Key1 column=CF3:COL3, timestamp=1457682385805,
> value=CF3
> > >  Key2 column=CF1:COL1, timestamp=1457682426396,
> > value=CF12
> > >  Key2 column=CF1:_0, timestamp=1457682426396, value=
> > >  Key2 column=CF2:COL2, timestamp=1457682426396,
> > value=CF22
> > >  Key2 column=CF3:COL3, timestamp=1457682426396,
> > value=CF32
> > > 2 row(s) in 0.0260 seconds
> >
> >
> > My query is why I am getting one extra column, CF1:_0, in each row with
> > no value.
> >
> > Can anybody explain this to me?
> > Would really appreciate the help.
> >
> > Thanks,
> > Divya
> >
>


Re: How to find versions 0.95 through 1.3 of the HBase reference (book)?

2016-03-02 Thread Enis Söztutar
We do not publish per-release docs other than for master and 0.94. 0.94 is
a special case, since 0.96+ contains some major changes.

The release artifacts for every release contain the book in the tarball, if
you want to browse through it.

Enis

On Wed, Mar 2, 2016 at 4:00 PM, Cosmin Lehene  wrote:

> I'm able to get the 0.94 and 2.0.0 versions of the book on hbase.apache.org
>
> The 0.94 link is https://hbase.apache.org/0.94/book.html, but changing
> the version in the URL to https://hbase.apache.org/0.98/book.html
> doesn't seem to work.
>
>
> Am I missing something obvious, or are the rest of the versions
> unavailable on hbase.apache.org?
>
>
> Thanks,
>
> Cosmin
>
>


Re: Compaction settings per table

2016-02-25 Thread Enis Söztutar
For compaction configurations, you can also set it per table OR even per
column family.

In Java, you can use

HTableDescriptor.setConfiguration() or HColumnDescriptor.setConfiguration()
to set the specific configuration values that override the ones set in
hbase-site.xml. For compaction and flush, we have a CompoundConfiguration
that is a layered configuration of hbase-site.xml values ->
HTD.getConfiguration() values -> HCD.getConfiguration() values.

You can also use the HBase shell to set these configuration values.
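
For example, a minimal sketch with the 1.x Java API (the table and family
names are made up; assumes an Admin instance named "admin"):

    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.TableName;

    HTableDescriptor htd = new HTableDescriptor(TableName.valueOf("mytable"));
    HColumnDescriptor hcd = new HColumnDescriptor("cf");

    // Override the compaction min size for this column family only (128 MB):
    hcd.setConfiguration("hbase.hstore.compaction.min.size", "134217728");
    // Or override it for the whole table:
    htd.setConfiguration("hbase.hstore.compaction.min.size", "134217728");
    // Per-table memstore flush size has a dedicated setter:
    htd.setMemStoreFlushSize(256L * 1024 * 1024);

    htd.addFamily(hcd);
    admin.createTable(htd); // or admin.modifyTable(...) for an existing table

In recent shells, the same thing can be expressed with a CONFIGURATION map
passed to alter, either at the table level or inside a NAME => 'cf' block
for a single family.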

Enis

On Thu, Feb 25, 2016 at 4:51 AM, Gaurav Agarwal  wrote:

> Got the answer to memstore size per table
> via TableDescriptor#setMemStoreFlushSize(long)
>
> On Thu, Feb 25, 2016 at 5:38 PM, Gaurav Agarwal  wrote:
>
> > In addition, is there a way to set memstore flush size per table/cf as
> > well?
> >
> > On Thu, Feb 25, 2016 at 5:20 PM, Gaurav Agarwal 
> wrote:
> >
> >> Hi,
> >>
> >> Is there a way to set Compaction configurations differently for each of
> >> my table? Specifically, I want to tweak `
> >> hbase.hstore.compaction.min.size` parameter for one of my table while
> >> keeping it to its default value for others.
> >>
> >> --cheers, gaurav
> >>
> >
> >
> >
> > --
> > --cheers, gaurav
> >
>
>
>
> --
> --cheers, gaurav
>


Re: NPE on ProtobufLogWriter.sync()

2016-02-16 Thread Enis Söztutar
> Operation category WRITE is not supported in state standby
This looks to be coming from a NN that is in standby state (or safe mode?).
Did you check whether the underlying HDFS is healthy? Is this an HA
configuration, and does hbase-site.xml contain the correct NN configuration?

Enis


On Tue, Feb 16, 2016 at 6:51 AM, Pedro Gandola 
wrote:

> Just lost another RegionServer... The cluster was very stable until
> yesterday. Two region servers in less than 24 hours; something might be wrong
> in my configuration.
>
> Any insights?
>
> Thank you
> Pedro
>
> *(rs log)*
>
> > 2016-02-16 14:22:37,749 WARN  [ResponseProcessor for block
> > BP-568427890-10.5.1.235-1453722567252:blk_1074371748_630937]
> > hdfs.DFSClient: DFSOutputStream ResponseProcessor exception  for block
> > BP-568427890-10.5.1.235-1453722567252:blk_1074371748_630937
> > java.io.IOException: Bad response ERROR for block
> > BP-568427890-10.5.1.235-1453722567252:blk_1074371748_630937 from datanode
> > DatanodeInfoWithStorage[10.5.1.117:50010
> > ,DS-f1076321-e61d-4554-98aa-0801d64040ae,DISK]
> > at
> >
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:786)
> > 2016-02-16 14:22:37,751 INFO  [LeaseRenewer:hbase@nameservice-prod]
> > retry.RetryInvocationHandler: Exception while invoking renewLease of
> class
> > ClientNamenodeProtocolTranslatorPB over hbase-ms5.localdomain/
> > 10.5.2.248:8020 after 1 fail over attempts. Trying to fail over after
> > sleeping for 1261ms.
> >
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
> > Operation category WRITE is not supported in state standby
> > at
> >
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:87)
> > at
> >
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1872)
> > at
> >
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1306)
> > at
> >
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.renewLease(FSNamesystem.java:4457)
> > at
> >
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.renewLease(NameNodeRpcServer.java:992)
> > at
> >
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.renewLease(ClientNamenodeProtocolServerSideTranslatorPB.java:652)
> > at
> >
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> > at
> >
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
> > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
> > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2137)
> > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2133)
> > at java.security.AccessController.doPrivileged(Native Method)
> > at javax.security.auth.Subject.doAs(Subject.java:422)
> > at
> >
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2131)
> > at org.apache.hadoop.ipc.Client.call(Client.java:1427)
> > at org.apache.hadoop.ipc.Client.call(Client.java:1358)
> > at
> >
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
> > at com.sun.proxy.$Proxy16.renewLease(Unknown Source)
> > at
> >
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.renewLease(ClientNamenodeProtocolTranslatorPB.java:590)
> > at sun.reflect.GeneratedMethodAccessor33.invoke(Unknown Source)
> > at
> >
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > at java.lang.reflect.Method.invoke(Method.java:497)
> > at
> >
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
> > at
> >
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> > at com.sun.proxy.$Proxy17.renewLease(Unknown Source)
> > at sun.reflect.GeneratedMethodAccessor33.invoke(Unknown Source)
> > at
> >
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > at java.lang.reflect.Method.invoke(Method.java:497)
> > at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:279)
> > at com.sun.proxy.$Proxy18.renewLease(Unknown Source)
> > at org.apache.hadoop.hdfs.DFSClient.renewLease(DFSClient.java:892)
> > at org.apache.hadoop.hdfs.LeaseRenewer.renew(LeaseRenewer.java:417)
> > at org.apache.hadoop.hdfs.LeaseRenewer.run(LeaseRenewer.java:442)
> > at org.apache.hadoop.hdfs.LeaseRenewer.access$700(LeaseRenewer.java:71)
> > at org.apache.hadoop.hdfs.LeaseRenewer$1.run(LeaseRenewer.java:298)
> > at java.lang.Thread.run(Thread.java:745)
> > 2016-02-16 14:22:37,753 WARN  [DataStreamer for file
> >
> /apps/hbase/data/WALs/hbase-rs6.localdomain,16020,1455560986600/hbase-rs6.localdomain%2C16020%2C1455560986600.default.1455632424508
> > block 

Re: Row length is 0 at org.apache.hadoop.hbase.client.Mutation.checkRow(Mutation.java:503)

2016-02-09 Thread Enis Söztutar
The row key byte[] you are passing to Get() has a length of 0. The HBase
data model does not allow a 0-length row key; it should be at least 1 byte.
The 0-byte row key is reserved for internal usage (to designate the empty
start and end keys).

In your storm topology, you are probably passing a row key that is
0-length. You can just add a condition there.
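
A minimal sketch of that condition (a hypothetical helper; adapt it to
wherever the topology builds its Gets):

    import org.apache.hadoop.hbase.client.Get;

    // Returns null instead of constructing a Get that HBase would reject.
    static Get safeGet(byte[] rowKey) {
      if (rowKey == null || rowKey.length == 0) {
        return null; // caller should skip/ack/fail the tuple as appropriate
      }
      return new Get(rowKey); // safe: the row key is at least 1 byte
    }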

Enis

On Tue, Feb 9, 2016 at 5:50 AM, Raja.Aravapalli 
wrote:

>
> Thanks for the response Ted.
>
> Please refer to the code of HBaseClient @
> https://github.com/apache/storm/blob/0.9.3-branch/external/storm-hbase/src/
> main/java/org/apache/storm/hbase/common/HBaseClient.java
>
>
> Thanks.
>
>
> Regards,
> Raja Mahesh Aravapalli,
> Raja.Aravapalli (IM)| raja.aravapa...@target.com | +91-9900-300-945.
>
>
>
>
> On 2/9/16, 7:15 PM, "Ted Yu"  wrote:
>
> >Can you give us a bit more information ?
> >
> >Release of hbase
> >
> >snippet of your code (especially HBaseClient.java) related to the stack
> >trace
> >
> >Thanks
> >
> >On Tue, Feb 9, 2016 at 2:47 AM, Raja.Aravapalli
> >
> >wrote:
> >
> >>
> >> Hi,
> >>
> >> HBase table lookup is failing with the below exception. Someone please
> >> help me in fixing this:
> >>
> >>
> >> java.lang.RuntimeException: java.lang.IllegalArgumentException: Row
> >>length
> >> is 0 at
> >>
> >>backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.j
> >>ava:128)
> >> at
> >>
> >>backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQu
> >>eue.java:99)
> >> at
> >>
> >>backtype.storm.disruptor$consume_batch_when_available.invoke(
> disruptor.cl
> >>j:80)
> >> at
> >>
> >>backtype.storm.daemon.executor$fn__5265$fn__5278$fn__5329.invoke(executor
> >>.clj:794)
> >> at backtype.storm.util$async_loop$fn__551.invoke(util.clj:465) at
> >> clojure.lang.AFn.run(AFn.java:24) at
> >>java.lang.Thread.run(Thread.java:744)
> >> Caused by: java.lang.IllegalArgumentException: Row length is 0 at
> >> org.apache.hadoop.hbase.client.Mutation.checkRow(Mutation.java:503) at
> >> org.apache.hadoop.hbase.client.Mutation.checkRow(Mutation.java:487) at
> >> org.apache.hadoop.hbase.client.Get.<init>(Get.java:89) at
> >>
> >>org.apache.storm.hbase.common.HBaseClient.constructGetRequests(HBaseClien
> >>t.java:112)
> >> at
> >>
> >>org.apache.storm.hbase.bolt.HBaseLookupBolt.execute(HBaseLookupBolt.java:
> >>65)
> >> at
> >>
> >>backtype.storm.daemon.executor$fn__5265$tuple_action_fn__5267.invoke(exec
> >>utor.clj:659)
> >> at
> >>
> >>backtype.storm.daemon.executor$mk_task_receiver$fn__5188.invoke(executor.
> >>clj:415)
> >> at
> >>
> >>backtype.storm.disruptor$clojure_handler$reify__1064.onEvent(
> disruptor.cl
> >>j:58)
> >> at
> >>
> >>backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.j
> >>ava:125)
> >> ... 6 more
> >>
> >>
> >> I am using a Storm application to do a lookup in an HBase table. The
> >> Get request is failing/throwing an exception for the rowkey specified
> >> for lookup. Please help me with finding the issue and fixing it.
> >>
> >>
> >>
> >>
> >> Regards,
> >> Raja Mahesh Aravapalli.
> >>
>
>


[ANNOUNCE] Apache HBase 1.0.3 is now available for download

2016-01-29 Thread Enis Söztutar
The HBase Team is pleased to announce the immediate release of HBase 1.0.3.
Download it from your favorite Apache mirror [1] or maven repository.

HBase 1.0.3 is the next “patch” release in the 1.0.x release line and
supersedes all earlier 1.0.x releases. According to HBase's semantic version
guide (see [2]), the release is source and binary compatible
with 1.0.x for client applications and server side libraries (coprocessors,
filters, etc).

Please note that 1.0.3 is the last “scheduled” release in the 1.0.x line of
releases since there is a limited amount of testing and release management
bandwidth. Users are highly encouraged to upgrade to 1.1 line of releases
if possible. However, if there is enough interest, or needed otherwise, we
can still decide to do more releases. Please be encouraged to speak up if
you want us to continue scheduled 1.0.x releases. See the hbase-dev mailing
thread [6] for more information.

The 1.0.3 release contains 102 fixes on top of the 1.0.2 release. Most of
the changes are bug fixes or test fixes, except for the following:

** Sub-task
* [HBASE-14221] - Reduce the number of time row comparison is done in a
Scan
* [HBASE-14535] - Integration test for rpc connection concurrency /
deadlock testing
* [HBASE-14539] - Slight improvement of StoreScanner.optimize
* [HBASE-14605] - Split fails due to 'No valid credentials' error when
SecureBulkLoadEndpoint#start tries to access hdfs
* [HBASE-14631] - Region merge request should be audited with request
user through proper scope of doAs() calls to region observer notifications
* [HBASE-14655] - Narrow the scope of doAs() calls to region observer
notifications for compaction
* [HBASE-14657] - Remove unneeded API from EncodedSeeker
* [HBASE-14709] - Parent change breaks graceful_stop.sh on a cluster
* [HBASE-15031] - Fix merge of MVCC and SequenceID performance
regression in branch-1.0
* [HBASE-15095] - isReturnResult=false  on fast path in branch-1.1 and
branch-1.0 is not respected

** Brainstorming
* [HBASE-14869] - Better request latency and size histograms

** Improvement
* [HBASE-14261] - Enhance Chaos Monkey framework by adding zookeeper
and datanode fault injections.
* [HBASE-14325] - Add snapshotinfo command to hbase script
* [HBASE-14436] - HTableDescriptor#addCoprocessor will always make
RegionCoprocessorHost create new Configuration
* [HBASE-14582] - Regionserver status webpage bucketcache list can
become huge
* [HBASE-14586] - Use a maven profile to run Jacoco analysis
* [HBASE-14643] - Avoid Splits from once again opening a closed reader
for fetching the first and last key

** Task
* [HBASE-14361] - ReplicationSink should create Connection instances
lazily

Full list of the issues can be found at
https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12332968&styleName=Text&projectId=12310753

Compatibility
-
This release (1.0.3) is source, wire and binary compatible with all
previous 1.0.x releases. Client applications do not have to be recompiled
with the new version (unless new API is used) if upgrading from a previous
1.0.x. It is a drop-in replacement.

See release notes for 1.0.0 [2] for compatibility with earlier
versions (0.94, 0.96, 0.98). Compatibility of 1.0.3 with earlier versions
is the same as in 1.0.0.

Source Compatibility:
Client side code in HBase-1.0.x is (mostly) source compatible with 0.98.x
versions. Some minor API changes might be needed from the client side.

Wire Compatibility:
HBase-1.0.x release is wire compatible with 0.98.x releases. Clients and
servers running in different versions as long as new features are not used
should be possible. A rolling upgrade from 0.98.x clusters to 1.0.x is
supported as well.
Rolling upgrade from 0.96 directly to 1.0.x is not supported. 1.0.x is NOT
wire compatible with earlier releases (0.94, etc).

Binary Compatibility:
Binary compatibility at the Java API layer with earlier versions (0.98.x,
0.96.x and 0.94.x) is not supported. You may have to recompile your client
code and any server side code (coprocessors, filters etc) referring to
hbase jars.


Upgrading
-
This release is rolling upgradable from earlier 1.0.x releases.

See [2] and [3] for upgrade instructions from earlier versions. Upgrading
to 1.0.3 is similar to upgrading to 1.0.0 as documented in [3].

From 0.98.x: Upgrade from 0.98.x in regular upgrade or rolling upgrade
fashion is supported.

From 0.96.x: Upgrade from 0.96.x is supported with a shutdown and restart
of the cluster.

From 0.94.x: Upgrade from 0.94.x is supported similar to upgrade from 0.94
-> 0.96. The upgrade script should be run to rewrite cluster level
metadata. See [3] for details.


Supported Hadoop versions
-
1.0.x releases support only Hadoop-2.x. Hadoop-2.4.x, Hadoop-2.5.x
and Hadoop 2.6.x releases are the most tested hadoop releases and we
recommend running with those versions (or later versions). Earlier Hadoop-2

[RESULT] [VOTE] Second release candidate for HBase 1.0.3 (RC1) is available. Please vote by Jan 26 2016

2016-01-27 Thread Enis Söztutar
With 4 binding +1, the vote is now closed.

Thanks everyone for voting. I'll go ahead with the release.

Enis

On Wed, Jan 27, 2016 at 7:54 PM, Stack <st...@duboce.net> wrote:

> +1
>
> Checked signatures and hash.
>
> Built from src.
>
> Loaded data and verified it made there.
>
> Checked logs.
>
> Poked around the package, all looks good
>
> St.Ack
>
>
> On Wed, Jan 20, 2016 at 12:29 AM, Enis Söztutar <e...@apache.org> wrote:
>
> >  I am pleased to announce that the second release candidate for the
> release
> > 1.0.3
> > (HBase-1.0.3RC1), is available for download at
> > https://dist.apache.org/repos/dist/dev/hbase/hbase-1.0.3RC1/
> >
> > Maven artifacts are also available in the temporary repository
> > https://repository.apache.org/content/repositories/orgapachehbase-1126
> >
> > Signed with my code signing key E964B5FF. Can be found here:
> > https://people.apache.org/keys/committer/enis.asc
> >
> > Signed tag in the repository can be found here:
> >
> >
> https://git-wip-us.apache.org/repos/asf?p=hbase.git;a=tag;h=45baeb796eb98676b4b45f2d29ed1bd595e26cb7
> >
> > HBase 1.0.3 is the next “patch” release in the 1.0.x release line and
> > supersedes all previous 1.0.x releases. According to the HBase’s semantic
> > version guide (See [1]), the release candidate is source and binary
> > compatible with 1.0.x for client applications and server side libraries
> > (coprocessors, filters, etc).
> >
> > Please note that 1.0.3 is the last “scheduled” release in the 1.0.x line
> of
> > releases since there is a limited amount of testing and release
> management
> > bandwidth. Users are highly encouraged to upgrade to 1.1 line of releases
> > if possible. However, if there is enough interest, or needed otherwise,
> we
> > can still decide to do more releases. Please be encouraged to speak up if
> > you want us to continue scheduled 1.0.x releases. See the hbase-dev
> mailing
> > thread [6] for more information.
> >
> > Binary / source compatibility report of 1.0.3RC1 compared to 1.0.2 can be
> > reached here:
> > https://home.apache.org/~enis/1.0.2_1.0.3RC1_compat_report.html
> >
> > 1.0.3 release contains 102 fixes on top of 1.0.2 release. Most of
> > the changes are
> > bug fixes or test fixes except for the following:
> >
> > ** Sub-task
> > * [HBASE-14221] - Reduce the number of time row comparison is done
> in a
> > Scan
> > * [HBASE-14535] - Integration test for rpc connection concurrency /
> > deadlock testing
> > * [HBASE-14539] - Slight improvement of StoreScanner.optimize
> > * [HBASE-14605] - Split fails due to 'No valid credentials' error
> when
> > SecureBulkLoadEndpoint#start tries to access hdfs
> > * [HBASE-14631] - Region merge request should be audited with request
> > user through proper scope of doAs() calls to region observer
> notifications
> > * [HBASE-14655] - Narrow the scope of doAs() calls to region observer
> > notifications for compaction
> > * [HBASE-14657] - Remove unneeded API from EncodedSeeker
> > * [HBASE-14709] - Parent change breaks graceful_stop.sh on a cluster
> > * [HBASE-15031] - Fix merge of MVCC and SequenceID performance
> > regression in branch-1.0
> > * [HBASE-15095] - isReturnResult=false  on fast path in branch-1.1
> and
> > branch-1.0 is not respected
> >
> > ** Brainstorming
> > * [HBASE-14869] - Better request latency and size histograms
> >
> > ** Improvement
> > * [HBASE-14261] - Enhance Chaos Monkey framework by adding zookeeper
> > and datanode fault injections.
> > * [HBASE-14325] - Add snapshotinfo command to hbase script
> > * [HBASE-14436] - HTableDescriptor#addCoprocessor will always make
> > RegionCoprocessorHost create new Configuration
> > * [HBASE-14582] - Regionserver status webpage bucketcache list can
> > become huge
> > * [HBASE-14586] - Use a maven profile to run Jacoco analysis
> > * [HBASE-14643] - Avoid Splits from once again opening a closed
> reader
> > for fetching the first and last key
> >
> > ** Task
> > * [HBASE-14361] - ReplicationSink should create Connection instances
> > lazily
> >
> >
> > Full list of the issues can be found at
> >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12332968&styleName=Text&projectId=12310753
> >
> > Compatibility
> > -
> > This release (1.0.3) is source, wire and binary compatible with all
> > previous 1.0.x releases

Re: [VOTE] Second release candidate for HBase 1.0.3 (RC1) is available. Please vote by Jan 26 2016

2016-01-25 Thread Enis Söztutar
Tomorrow is the last day for this RC voting. Can we get some more votes,
please? This will be the last scheduled release in the 1.0 line.

Thanks,
Enis

On Mon, Jan 25, 2016 at 1:33 PM, Enis Söztutar <e...@apache.org> wrote:

> Here is the UT build for the RC:
>
>
> https://builds.apache.org/view/H-L/view/HBase/job/HBase-1.0.3RC1/2/testReport/
>
> All tests passed on the second run.
>
> Enis
>
> On Sat, Jan 23, 2016 at 8:00 PM, Enis Söztutar <e...@apache.org> wrote:
>
>> Gentle reminder for the vote.
>>
>> On Wed, Jan 20, 2016 at 2:40 PM, Enis Söztutar <e...@apache.org> wrote:
>>
>>> Here is my official +1. I executed the same tests from 1.1.3RC for
>>> 1.0.3RC.
>>>
>>> - checked crcs, sigs.
>>>
>>> - checked tarball layouts, files, jar files, etc.
>>>
>>> - checked the book in the bin tar
>>>
>>> - checked versions reported
>>>
>>> - checked the compat report
>>>
>>> - compiled with Hadoop 2.2 to 2.7
>>>
>>> - built with hbase-downstreamer
>>>
>>> - ran local mode, shell smoke tests, LTT.
>>>
>>> - Put it up on a 4 node cluster, ran LTT.
>>>
>>>
>>> Everything looks nominal.
>>>
>>> On Tue, Jan 19, 2016 at 8:29 PM, Enis Söztutar <e...@apache.org> wrote:
>>>
>>>> I am pleased to announce that the second release candidate for the
>>>> release 1.0.3
>>>> (HBase-1.0.3RC1), is available for download at
>>>> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.0.3RC1/
>>>>
>>>> Maven artifacts are also available in the temporary repository
>>>> https://repository.apache.org/content/repositories/orgapachehbase-1126
>>>>
>>>>
>>>> Signed with my code signing key E964B5FF. Can be found here:
>>>> https://people.apache.org/keys/committer/enis.asc
>>>>
>>>> Signed tag in the repository can be found here:
>>>> https://git-wip-us.apache.org/repos/asf?p=hbase.git;a=tag;h=45baeb796eb98676b4b45f2d29ed1bd595e26cb7
>>>>
>>>> HBase 1.0.3 is the next “patch” release in the 1.0.x release line and
>>>> supersedes all previous 1.0.x releases. According to the HBase’s semantic
>>>> version guide (See [1]), the release candidate is source and binary
>>>> compatible with 1.0.x for client applications and server side libraries
>>>> (coprocessors, filters, etc).
>>>>
>>>> Please note that 1.0.3 is the last “scheduled” release in the 1.0.x
>>>> line of releases since there is a limited amount of testing and release
>>>> management bandwidth. Users are highly encouraged to upgrade to 1.1 line of
>>>> releases if possible. However, if there is enough interest, or needed
>>>> otherwise, we can still decide to do more releases. Please be encouraged to
>>>> speak up if you want us to continue scheduled 1.0.x releases. See the
>>>> hbase-dev mailing thread [6] for more information.
>>>>
>>>> Binary / source compatibility report of 1.0.3RC1 compared to 1.0.2 can
>>>> be reached here:
>>>> https://home.apache.org/~enis/1.0.2_1.0.3RC1_compat_report.html
>>>>
>>>> 1.0.3 release contains 102 fixes on top of 1.0.2 release. Most of
>>>> the changes are
>>>> bug fixes or test fixes except for the following:
>>>>
>>>> ** Sub-task
>>>> * [HBASE-14221] - Reduce the number of time row comparison is done
>>>> in a Scan
>>>> * [HBASE-14535] - Integration test for rpc connection concurrency /
>>>> deadlock testing
>>>> * [HBASE-14539] - Slight improvement of StoreScanner.optimize
>>>> * [HBASE-14605] - Split fails due to 'No valid credentials' error
>>>> when SecureBulkLoadEndpoint#start tries to access hdfs
>>>> * [HBASE-14631] - Region merge request should be audited with
>>>> request user through proper scope of doAs() calls to region observer
>>>> notifications
>>>> * [HBASE-14655] - Narrow the scope of doAs() calls to region
>>>> observer notifications for compaction
>>>> * [HBASE-14657] - Remove unneeded API from EncodedSeeker
>>>> * [HBASE-14709] - Parent change breaks graceful_stop.sh on a
>>>> cluster
>>>> * [HBASE-15031] - Fix merge of MVCC and SequenceID performance
>>>> regression in branch-1.0
>>>> * [HBASE-15095] - isReturnResult=false

Re: [VOTE] Second release candidate for HBase 1.0.3 (RC1) is available. Please vote by Jan 26 2016

2016-01-25 Thread Enis Söztutar
Here is the UT build for the RC:

https://builds.apache.org/view/H-L/view/HBase/job/HBase-1.0.3RC1/2/testReport/

All tests passed on the second run.

Enis

On Sat, Jan 23, 2016 at 8:00 PM, Enis Söztutar <e...@apache.org> wrote:

> Gentle reminder for the vote.
>
> On Wed, Jan 20, 2016 at 2:40 PM, Enis Söztutar <e...@apache.org> wrote:
>
>> Here is my official +1. I executed the same tests from 1.1.3RC for
>> 1.0.3RC.
>>
>> - checked crcs, sigs.
>>
>> - checked tarball layouts, files, jar files, etc.
>>
>> - checked the book in the bin tar
>>
>> - checked versions reported
>>
>> - checked the compat report
>>
>> - compiled with Hadoop 2.2 to 2.7
>>
>> - built with hbase-downstreamer
>>
>> - ran local mode, shell smoke tests, LTT.
>>
>> - Put it up on a 4 node cluster, ran LTT.
>>
>>
>> Everything looks nominal.
>>
>> On Tue, Jan 19, 2016 at 8:29 PM, Enis Söztutar <e...@apache.org> wrote:
>>
>>> I am pleased to announce that the second release candidate for the
>>> release 1.0.3
>>> (HBase-1.0.3RC1), is available for download at
>>> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.0.3RC1/
>>>
>>> Maven artifacts are also available in the temporary repository
>>> https://repository.apache.org/content/repositories/orgapachehbase-1126
>>>
>>> Signed with my code signing key E964B5FF. Can be found here:
>>> https://people.apache.org/keys/committer/enis.asc
>>>
>>> Signed tag in the repository can be found here:
>>> https://git-wip-us.apache.org/repos/asf?p=hbase.git;a=tag;h=45baeb796eb98676b4b45f2d29ed1bd595e26cb7
>>>
>>> HBase 1.0.3 is the next “patch” release in the 1.0.x release line and
>>> supersedes all previous 1.0.x releases. According to the HBase’s semantic
>>> version guide (See [1]), the release candidate is source and binary
>>> compatible with 1.0.x for client applications and server side libraries
>>> (coprocessors, filters, etc).
>>>
>>> Please note that 1.0.3 is the last “scheduled” release in the 1.0.x line
>>> of releases since there is a limited amount of testing and release
>>> management bandwidth. Users are highly encouraged to upgrade to 1.1 line of
>>> releases if possible. However, if there is enough interest, or needed
>>> otherwise, we can still decide to do more releases. Please be encouraged to
>>> speak up if you want us to continue scheduled 1.0.x releases. See the
>>> hbase-dev mailing thread [6] for more information.
>>>
>>> Binary / source compatibility report of 1.0.3RC1 compared to 1.0.2 can
>>> be reached here:
>>> https://home.apache.org/~enis/1.0.2_1.0.3RC1_compat_report.html
>>>
>>> 1.0.3 release contains 102 fixes on top of 1.0.2 release. Most of
>>> the changes are
>>> bug fixes or test fixes except for the following:
>>>
>>> ** Sub-task
>>> * [HBASE-14221] - Reduce the number of time row comparison is done
>>> in a Scan
>>> * [HBASE-14535] - Integration test for rpc connection concurrency /
>>> deadlock testing
>>> * [HBASE-14539] - Slight improvement of StoreScanner.optimize
>>> * [HBASE-14605] - Split fails due to 'No valid credentials' error
>>> when SecureBulkLoadEndpoint#start tries to access hdfs
>>> * [HBASE-14631] - Region merge request should be audited with
>>> request user through proper scope of doAs() calls to region observer
>>> notifications
>>> * [HBASE-14655] - Narrow the scope of doAs() calls to region
>>> observer notifications for compaction
>>> * [HBASE-14657] - Remove unneeded API from EncodedSeeker
>>> * [HBASE-14709] - Parent change breaks graceful_stop.sh on a cluster
>>> * [HBASE-15031] - Fix merge of MVCC and SequenceID performance
>>> regression in branch-1.0
>>> * [HBASE-15095] - isReturnResult=false  on fast path in branch-1.1
>>> and branch-1.0 is not respected
>>>
>>> ** Brainstorming
>>> * [HBASE-14869] - Better request latency and size histograms
>>>
>>> ** Improvement
>>> * [HBASE-14261] - Enhance Chaos Monkey framework by adding zookeeper
>>> and datanode fault injections.
>>> * [HBASE-14325] - Add snapshotinfo command to hbase script
>>> * [HBASE-14436] - HTableDescriptor#addCoprocessor will always make
>>> RegionCoprocessorHost create new Configuration
>>> * [HBASE-14582] - Regionserver statu

Re: [VOTE] Second release candidate for HBase 1.0.3 (RC1) is available. Please vote by Jan 26 2016

2016-01-23 Thread Enis Söztutar
Gentle reminder for the vote.

On Wed, Jan 20, 2016 at 2:40 PM, Enis Söztutar <e...@apache.org> wrote:

> Here is my official +1. I executed the same tests from 1.1.3RC for
> 1.0.3RC.
>
> - checked crcs, sigs.
>
> - checked tarball layouts, files, jar files, etc.
>
> - checked the book in the bin tar
>
> - checked versions reported
>
> - checked the compat report
>
> - compiled with Hadoop 2.2 to 2.7
>
> - built with hbase-downstreamer
>
> - ran local mode, shell smoke tests, LTT.
>
> - Put it up on a 4 node cluster, ran LTT.
>
>
> Everything looks nominal.
>
> On Tue, Jan 19, 2016 at 8:29 PM, Enis Söztutar <e...@apache.org> wrote:
>
>> I am pleased to announce that the second release candidate for the
>> release 1.0.3
>> (HBase-1.0.3RC1), is available for download at
>> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.0.3RC1/
>>
>> Maven artifacts are also available in the temporary repository
>> https://repository.apache.org/content/repositories/orgapachehbase-1126
>>
>> Signed with my code signing key E964B5FF. Can be found here:
>> https://people.apache.org/keys/committer/enis.asc
>>
>> Signed tag in the repository can be found here:
>> https://git-wip-us.apache.org/repos/asf?p=hbase.git;a=tag;h=45baeb796eb98676b4b45f2d29ed1bd595e26cb7
>>
>> HBase 1.0.3 is the next “patch” release in the 1.0.x release line and
>> supersedes all previous 1.0.x releases. According to the HBase’s semantic
>> version guide (See [1]), the release candidate is source and binary
>> compatible with 1.0.x for client applications and server side libraries
>> (coprocessors, filters, etc).
>>
>> Please note that 1.0.3 is the last “scheduled” release in the 1.0.x line
>> of releases since there is a limited amount of testing and release
>> management bandwidth. Users are highly encouraged to upgrade to 1.1 line of
>> releases if possible. However, if there is enough interest, or needed
>> otherwise, we can still decide to do more releases. Please be encouraged to
>> speak up if you want us to continue scheduled 1.0.x releases. See the
>> hbase-dev mailing thread [6] for more information.
>>
>> Binary / source compatibility report of 1.0.3RC1 compared to 1.0.2 can be
>> reached here:
>> https://home.apache.org/~enis/1.0.2_1.0.3RC1_compat_report.html
>>
>> 1.0.3 release contains 102 fixes on top of 1.0.2 release. Most of
>> the changes are
>> bug fixes or test fixes except for the following:
>>
>> ** Sub-task
>> * [HBASE-14221] - Reduce the number of time row comparison is done in
>> a Scan
>> * [HBASE-14535] - Integration test for rpc connection concurrency /
>> deadlock testing
>> * [HBASE-14539] - Slight improvement of StoreScanner.optimize
>> * [HBASE-14605] - Split fails due to 'No valid credentials' error
>> when SecureBulkLoadEndpoint#start tries to access hdfs
>> * [HBASE-14631] - Region merge request should be audited with request
>> user through proper scope of doAs() calls to region observer notifications
>> * [HBASE-14655] - Narrow the scope of doAs() calls to region observer
>> notifications for compaction
>> * [HBASE-14657] - Remove unneeded API from EncodedSeeker
>> * [HBASE-14709] - Parent change breaks graceful_stop.sh on a cluster
>> * [HBASE-15031] - Fix merge of MVCC and SequenceID performance
>> regression in branch-1.0
>> * [HBASE-15095] - isReturnResult=false  on fast path in branch-1.1
>> and branch-1.0 is not respected
>>
>> ** Brainstorming
>> * [HBASE-14869] - Better request latency and size histograms
>>
>> ** Improvement
>> * [HBASE-14261] - Enhance Chaos Monkey framework by adding zookeeper
>> and datanode fault injections.
>> * [HBASE-14325] - Add snapshotinfo command to hbase script
>> * [HBASE-14436] - HTableDescriptor#addCoprocessor will always make
>> RegionCoprocessorHost create new Configuration
>> * [HBASE-14582] - Regionserver status webpage bucketcache list can
>> become huge
>> * [HBASE-14586] - Use a maven profile to run Jacoco analysis
>> * [HBASE-14643] - Avoid Splits from once again opening a closed
>> reader for fetching the first and last key
>>
>> ** Task
>> * [HBASE-14361] - ReplicationSink should create Connection instances
>> lazily
>>
>>
>> Full list of the issues can be found at
>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12332968&styleName=Text&projectId=12310753
>>
>> Compatibility
>> -
>> This release (1.0.3) 

Re: [VOTE] Second release candidate for HBase 1.0.3 (RC1) is available. Please vote by Jan 26 2016

2016-01-20 Thread Enis Söztutar
Here is my official +1. I executed the same tests from 1.1.3RC for 1.0.3RC.

- checked crcs, sigs.

- checked tarball layouts, files, jar files, etc.

- checked the book in the bin tar

- checked versions reported

- checked the compat report

- compiled with Hadoop 2.2 to 2.7

- built with hbase-downstreamer

- ran local mode, shell smoke tests, LTT.

- Put it up on a 4 node cluster, ran LTT.


Everything looks nominal.

On Tue, Jan 19, 2016 at 8:29 PM, Enis Söztutar <e...@apache.org> wrote:

> I am pleased to announce that the second release candidate for the release
> 1.0.3
> (HBase-1.0.3RC1), is available for download at
> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.0.3RC1/
>
> Maven artifacts are also available in the temporary repository
> https://repository.apache.org/content/repositories/orgapachehbase-1126
>
> Signed with my code signing key E964B5FF. Can be found here:
> https://people.apache.org/keys/committer/enis.asc
>
> Signed tag in the repository can be found here:
> https://git-wip-us.apache.org/repos/asf?p=hbase.git;a=tag;h=45baeb796eb98676b4b45f2d29ed1bd595e26cb7
>
> HBase 1.0.3 is the next “patch” release in the 1.0.x release line and
> supersedes all previous 1.0.x releases. According to the HBase’s semantic
> version guide (See [1]), the release candidate is source and binary
> compatible with 1.0.x for client applications and server side libraries
> (coprocessors, filters, etc).
>
> Please note that 1.0.3 is the last “scheduled” release in the 1.0.x line
> of releases since there is a limited amount of testing and release
> management bandwidth. Users are highly encouraged to upgrade to 1.1 line of
> releases if possible. However, if there is enough interest, or needed
> otherwise, we can still decide to do more releases. Please be encouraged to
> speak up if you want us to continue scheduled 1.0.x releases. See the
> hbase-dev mailing thread [6] for more information.
>
> Binary / source compatibility report of 1.0.3RC1 compared to 1.0.2 can be
> reached here:
> https://home.apache.org/~enis/1.0.2_1.0.3RC1_compat_report.html
>
> 1.0.3 release contains 102 fixes on top of 1.0.2 release. Most of
> the changes are
> bug fixes or test fixes except for the following:
>
> ** Sub-task
> * [HBASE-14221] - Reduce the number of time row comparison is done in
> a Scan
> * [HBASE-14535] - Integration test for rpc connection concurrency /
> deadlock testing
> * [HBASE-14539] - Slight improvement of StoreScanner.optimize
> * [HBASE-14605] - Split fails due to 'No valid credentials' error when
> SecureBulkLoadEndpoint#start tries to access hdfs
> * [HBASE-14631] - Region merge request should be audited with request
> user through proper scope of doAs() calls to region observer notifications
> * [HBASE-14655] - Narrow the scope of doAs() calls to region observer
> notifications for compaction
> * [HBASE-14657] - Remove unneeded API from EncodedSeeker
> * [HBASE-14709] - Parent change breaks graceful_stop.sh on a cluster
> * [HBASE-15031] - Fix merge of MVCC and SequenceID performance
> regression in branch-1.0
> * [HBASE-15095] - isReturnResult=false  on fast path in branch-1.1 and
> branch-1.0 is not respected
>
> ** Brainstorming
> * [HBASE-14869] - Better request latency and size histograms
>
> ** Improvement
> * [HBASE-14261] - Enhance Chaos Monkey framework by adding zookeeper
> and datanode fault injections.
> * [HBASE-14325] - Add snapshotinfo command to hbase script
> * [HBASE-14436] - HTableDescriptor#addCoprocessor will always make
> RegionCoprocessorHost create new Configuration
> * [HBASE-14582] - Regionserver status webpage bucketcache list can
> become huge
> * [HBASE-14586] - Use a maven profile to run Jacoco analysis
> * [HBASE-14643] - Avoid Splits from once again opening a closed reader
> for fetching the first and last key
>
> ** Task
> * [HBASE-14361] - ReplicationSink should create Connection instances
> lazily
>
>
> Full list of the issues can be found at
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12332968&styleName=Text&projectId=12310753
>
> Compatibility
> -
> This release (1.0.3) is source, wire and binary compatible with all
> previous 1.0.x releases. Client applications do not have to be recompiled
> with the new version (unless new API is used) if upgrading from a previous
> 1.0.x. It is a drop-in replacement.
>
> See release notes for 1.0.0 [2] for compatibility with earlier
> versions (0.94, 0.96, 0.98). Compatibility of 1.0.3 with earlier versions
> is the same as in 1.0.0.
>
> Source Compatibility:
> Client side code in HBase-1.0.x is (mostly) source compatible with 0.98.x
> versions.

Re: Table regions are written in wrong path in HDFS

2015-11-24 Thread Enis Söztutar
Normally this path:
/hbase/data/default/TestTable/TestTable/d5b64ad96deb3b47467db97669c009fb
should be /hbase/data/default/TestTable/d5b64ad96deb3b47467db97669c009fb.

This means that the table descriptor somehow was written under the wrong
folder (there should be only one TestTable in the path).

The master's logs might reveal something at the time of the table creation.

Enis

On Sun, Nov 22, 2015 at 6:02 PM, Pankaj kr  wrote:

> Thanks Ted,
>
> All regions are in the same condition.
> The table descriptor is affected as well,
>
> drwxr-xr-x   - hbase hadoop  0 2015-11-18 00:04
> /hbase/data/default/TestTable/TestTable/.tabledesc
> -rw-r--r--   3 hbase hadoop289 2015-11-18 00:04
> /hbase/data/default/TestTable/TestTable/.tabledesc/.tableinfo.01
>
> So restarting HM/RS doesn't help.
>
> Regards,
> Pankaj
>
> -Original Message-
> From: Ted Yu [mailto:yuzhih...@gmail.com]
> Sent: 21 November 2015 22:24
> To: user@hbase.apache.org
> Subject: Re: Table regions are written in wrong path in HDFS
>
> Can you trace this region through master / region server log to see if
> there is some clue ?
>
> Cheers
>
> > On Nov 21, 2015, at 2:56 AM, Pankaj kr  wrote:
> >
> > Hi Folks,
> >
> > We met a very weird scenario.
> > We are running the PE tool; during testing we found all regions are in
> > transition, in state FAILED_OPEN.
> >
> > Region servers failed to open the regions with the below exception:
> > 2015-11-18 02:20:38,503 | ERROR | RS_OPEN_REGION-HOSTNAME:PORT-2 |
> Failed open of
> region=TestTable,002500,1447776261671.d5b64ad96deb3b47467db97669c009fb.,
> starting to roll back the global memstore size. |
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:386)
> > java.lang.IllegalStateException: Could not instantiate a region instance.
> >at
> org.apache.hadoop.hbase.regionserver.HRegion.newHRegion(HRegion.java:6229)
> >at
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6536)
> >at
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6508)
> >at
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6464)
> >at
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6415)
> >at
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:363)
> >at
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:129)
> >at
> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
> >at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> >at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> >at java.lang.Thread.run(Thread.java:745)
> > Caused by: java.lang.reflect.InvocationTargetException
> >at
> sun.reflect.GeneratedConstructorAccessor17.newInstance(Unknown Source)
> >at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> >at
> java.lang.reflect.Constructor.newInstance(Constructor.java:422)
> >at
> org.apache.hadoop.hbase.regionserver.HRegion.newHRegion(HRegion.java:6226)
> >... 10 more
> > Caused by: java.lang.IllegalArgumentException: Need table descriptor
> >at
> org.apache.hadoop.hbase.regionserver.HRegion.<init>(HRegion.java:695)
> >at
> org.apache.hadoop.hbase.regionserver.HRegion.<init>(HRegion.java:672)
> >... 14 more
> >
> > In HDFS, Table region was written as below,
> > drwxr-xr-x   - hbase hadoop  0 2015-11-18 00:05
> /hbase/data/default/TestTable/TestTable/d5b64ad96deb3b47467db97669c009fb
> > -rw-r--r--   3 hbase hadoop 96 2015-11-18 00:05
> /hbase/data/default/TestTable/TestTable/d5b64ad96deb3b47467db97669c009fb/.regioninfo
> >
> > Here the table name "TestTable" appears twice; how is that possible?
> >
> > Have anyone met this scenario?
> > We are using HBase 1.0.2 and HDFS 2.7.2 version.
> >
> > Any suggestion/help would be much appreciated.
> >
> > Regards,
> > Pankaj
>


Re: HA reads properties

2015-11-19 Thread Enis Söztutar
Some of these properties were changed or deprecated, and new ones were
added, between different Apache versions. The Apache documentation lists the
latest configuration examples, which should work with hbase-1.1+.
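
For reference, the knobs as documented on the Apache side look like this (a
sketch only; note the values below are in MICROseconds per the Apache book,
so double-check which unit your distribution expects):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;

    Configuration conf = HBaseConfiguration.create();
    // Wait 10 ms before a get/multiget is also sent to secondary replicas:
    conf.set("hbase.client.primaryCallTimeout.get", "10000");      // microseconds
    conf.set("hbase.client.primaryCallTimeout.multiget", "10000"); // microseconds
    // Scan replica timeout (the HBASE-10357 work mentioned below):
    conf.set("hbase.client.replicaCallTimeout.scan", "1000000");   // microseconds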

Enis

On Thu, Nov 19, 2015 at 8:49 AM, Melvin Kanasseril <
melvin.kanasse...@sophos.com> wrote:

> Hi folks,
>
> Two questions -
>
> I noticed there is a discrepancy in the client side properties for
> configuring HA reads between vanilla Hbase and CDH’s Hbase. For example,
> hbase.client.primaryCallTimeout.get is in microseconds in the former (Hbase
> book client side properties<
> http://hbase.apache.org/book.html#_client_side_properties>) and in
> milliseconds in the latter (CDH Hbase Read Replicas<
> http://www.cloudera.com/content/www/en-us/documentation/enterprise/latest/topics/admin_hbase_read_replicas.html#concept_nsg_2c2_dr_unique_1__table_t34_nm2_dr_unique_1>).
> Is one of the them incorrect or is this intentional and working as
> specified?
>
> This might be a Cloudera specific question – the CDH documentation doesn’t
> seem to indicate having hbase.client.replicaCallTimeout.scan. I read
> somewhere that this was done later (HBASE-10357<
> https://issues.apache.org/jira/browse/HBASE-10357>). Does anybody know if
> this functionality made its way to hbase-CDH?
>
> Thanks!
> Melvin
>


Re: Question regarding the regionservers file on the HBase master

2015-11-19 Thread Enis Söztutar
The regionservers file is used only by scripts like start-hbase.sh and
stop-hbase.sh. That is why the documentation refers to it with "if you
rely on ssh to start your daemons". These scripts execute the start or
stop daemon requests over SSH, which is also why they need a list of
hostnames.

Nothing on a running cluster depends on the regionservers file. The
regionserver discovery, etc. already happens via zookeeper.

If you are deploying via chef or Ambari, etc, or manually starting the
daemons on every node via hbase-daemon.sh, you do not need the
regionservers file in the conf directory.
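
If you want to see the zookeeper-based view that a running cluster actually
uses, here is a minimal sketch (assumes ZK on localhost:2181 and the default
/hbase parent znode):

    import java.util.List;
    import org.apache.zookeeper.ZooKeeper;

    public class ListRegionServers {
      public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("localhost:2181", 30000, event -> { });
        // One ephemeral child per live regionserver: host,port,startcode
        List<String> servers = zk.getChildren("/hbase/rs", false);
        for (String s : servers) {
          System.out.println(s);
        }
        zk.close();
      }
    }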

Enis



On Wed, Nov 18, 2015 at 9:01 PM, F21  wrote:

> I have got 2 questions regarding the regionservers file on the HBase
> master:
>
> 1. Do we need to update the regionservers file when adding or removing a
> region server from the cluster only if we administer it via SSH? The
> documentation (http://hbase.apache.org/book.html#adding.new.node) is
> quite unclear: "If you rely on ssh to start your daemons, don’t forget to
> add the new hostname in conf/regionservers on the master."
>
> 2. What is the regionservers file used for? Why can't the region servers
> and master discover each other through the Zookeeper /hbase/rs znode?
>


Re: HBase client and server version is not compatible leadregionserver down

2015-11-18 Thread Enis Söztutar
It is highly unlikely for a client to cause the server side to abort. It
is possible that the regionservers are aborting due to some other problem.
The regionservers will reject client connection requests if there is an RPC
version mismatch.

1.x and 0.98 clients and servers have been tested to be rolling-upgrade
compatible (meaning that older clients can work with newer server versions,
and vice versa).
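
As Ted shows below, a client can at least log what it is talking to before
doing real work. A minimal sketch with the 1.x API (assumes hbase-site.xml
is on the classpath and a surrounding method that throws IOException):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.ClusterStatus;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Admin;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;

    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Admin admin = conn.getAdmin()) {
      ClusterStatus status = admin.getClusterStatus();
      System.out.println("Server HBase version: " + status.getHBaseVersion());
    }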

Enis

On Wed, Nov 18, 2015 at 10:47 AM, Ted Yu  wrote:

> HBaseAdmin has this method:
>
>   public ClusterStatus getClusterStatus() throws IOException {
>
> where ClusterStatus has:
>
>   public String getHBaseVersion() {
>
> return hbaseVersion;
>
> FYI
>
> On Wed, Nov 18, 2015 at 8:58 AM, 聪聪 <175998...@qq.com> wrote:
>
> > Ted:
> >
> >
> > Because a lot of developers use the wrong client, I am thinking about
> > how to avoid this. For example, even if we upgrade to 1.0, they may use
> > a 2.0 client.
> >
> >
> >
> >
> > -- Original Message --
> > From: "Ted Yu";
> > Sent: Wednesday, November 18, 2015, 7:37 PM
> > To: "user@hbase.apache.org";
> >
> > Subject: Re: HBase client and server version is not compatible lead
> > regionserver down
> >
> >
> >
> > See http://hbase.apache.org/book.html#hbase.rolling.upgrade
> >
> > For example, in Rolling upgrade from 0.98.x to HBase 1.0.0, we state that
> > it is possible to do a rolling upgrade between hbase-0.98.x and
> > hbase-1.0.0.
> >
> > Cheers
> >
> > On Wed, Nov 18, 2015 at 12:22 AM, 聪聪 <175998...@qq.com> wrote:
> >
> > > We recently found that a regionserver went down. Later, we found that
> > > it was because the client and server versions are not compatible. The
> > > client version is 1.0, the server version is 0.98.6. I want to know why
> > > this is, whether there is a better protection mechanism, and how to
> > > avoid this problem, because some developers will make this kind of
> > > mistaken operation.
> >
>


Re: HBase error disabled security features are not available

2015-11-04 Thread Enis Söztutar
You should also configure authentication, i.e. set
hbase.security.authentication to "kerberos" in hbase-site.xml.

Enis

On Tue, Nov 3, 2015 at 9:00 PM, kumar r  wrote:

> Yes, I have set all these properties.
>
> I got a user ticket using the kinit command, and when trying to run the
> *user_permission* or *grant* command in the hbase shell, I get this
> exception.
>
> Posted the question on Stack Overflow at the below link:
>
>
> http://stackoverflow.com/questions/33496541/hbase-error-disabled-security-features-are-not-available
>
>
>
> On Wed, Nov 4, 2015 at 10:23 AM, Ted Yu  wrote:
>
>> Have you set the following config ?
>>
>> hbase.master.keytab.file
>> hbase.master.kerberos.principal
>> hbase.regionserver.keytab.file
>> hbase.regionserver.kerberos.principal
>>
>> Refer to http://hbase.apache.org/book.html for their meaning / sample
>> value.
>>
>> Please show the stack trace of the exception you got.
>>
>> Cheers
>>
>> On Tue, Nov 3, 2015 at 8:43 PM, kumar r  wrote:
>>
>> > Configured secure HBase-1.1.2 with Hadoop-2.7.1 on Windows. When I
>> > enable authorization, following Configuring HBase Authorization
>> > <http://www.cloudera.com/content/www/en-us/documentation/archive/cdh/4-x/4-3-2/CDH4-Security-Guide/cdh4sg_topic_8_3.html>,
>> > I get an *ERROR: DISABLED: Security features are not available*
>> > exception.
>> >
>> > I have set the authorization configurations as below,
>> >
>> > <property>
>> >   <name>hbase.security.authorization</name>
>> >   <value>true</value>
>> > </property>
>> >
>> > <property>
>> >   <name>hbase.coprocessor.master.classes</name>
>> >   <value>org.apache.hadoop.hbase.security.access.AccessController</value>
>> > </property>
>> >
>> > <property>
>> >   <name>hbase.coprocessor.region.classes</name>
>> >   <value>org.apache.hadoop.hbase.security.token.TokenProvider,org.apache.hadoop.hbase.security.access.AccessController</value>
>> > </property>
>> >
>> > But HBase authorization works fine when I try with the HBase-0.98.13
>> > version. Can someone help me enable HBase authorization in the correct
>> > way?
>> >
>>
>
>


Re: equivalent to HTableUtil in v. 1.1

2015-10-28 Thread Enis Söztutar
Hi Artem,

The 1.0 API BufferedMutator should cover the use cases where HTableUtil or
HTable.setAutoFlush(false) was previously used.

BufferedMutator already groups the mutations per region server under the
hood (AsyncProcess) and sends the buffered mutations in the background.
There should be enough documentation to get started, and there is example
code in the hbase-examples module for BM usage. But let us know if it covers
your use case or not.
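
A minimal sketch of the migration (the table name and payload are made up):

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.BufferedMutator;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    public class BufferedMutatorSketch {
      public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             BufferedMutator mutator =
                 conn.getBufferedMutator(TableName.valueOf("mytable"))) {
          for (int i = 0; i < 10000; i++) {
            Put put = new Put(Bytes.toBytes("row-" + i));
            put.addColumn(Bytes.toBytes("f"), Bytes.toBytes("q"), Bytes.toBytes(i));
            mutator.mutate(put); // buffered; grouped per regionserver internally
          }
          mutator.flush(); // push out anything still buffered
        } // close() flushes as well
      }
    }

If you need a different buffer size, a BufferedMutatorParams with
writeBufferSize set can be passed to getBufferedMutator instead of the bare
TableName.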

Enis

On Tue, Oct 27, 2015 at 7:58 PM, Artem Ervits  wrote:

> Hello all,
>
> is there an equivalent utility in HBase 1.0+ to HTableUtil? I'd like to
> batch mutations by region server.
>
> Thanks
>


Re: Thinking of HBase 1.1.3 [was HBASE-14317 and 1.1.3]

2015-09-23 Thread Enis Söztutar
Agreed that we should not change the declared interface for TRR in patch
releases. Ugly, but we can rethrow it as a RuntimeException, or swallow it,
in 1.1 and earlier.
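
For the record, a hypothetical caller shows why the added throws clause is
source-incompatible (sketch only):

    import org.apache.hadoop.hbase.mapreduce.TableRecordReader;

    TableRecordReader reader = new TableRecordReader();
    // Compiled fine against 1.1.0; once close() declares "throws IOException",
    // this call no longer compiles until the caller catches or declares it.
    reader.close();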

I think this is also a blocker:
https://issues.apache.org/jira/browse/HBASE-14474

Enis

On Wed, Sep 23, 2015 at 3:50 PM, Nick Dimiduk  wrote:

> I've run the compatibility checking tool [0] between branch-1.1
> (0bf97bac2ed564994a0bcda5f1993260bf0b448f) and 1.1.0
> (e860c66d41ddc8231004b646098a58abca7fb523). There has been a little bit of
> drift, but nothing that I think is release-blocking. However, I'd like to
> bring it to your attention here, before it sinks an RC. You can compare
> this to the run between 1.1.0 and 1.1.2RC2, which became 1.1.2 [1]. Notice
> we've added a handful of methods, which is acceptable according to our
> guidelines [2].The question I have is about adding throws IOException
> to TableRecordReader.close(). IOException is in the interface declaration
> of the super type, but this will require a source code change for anyone
> consuming our type directly. I believe, according to [2], this breaks our
> guidelines for a patch release.
>
> I've also sent a note over to HBASE-14394 [3] regarding the added public
> and undocumented method to TableRecordReader, so there are potentially two
> addendum's required for this patch.
>
> How would the community like to proceed?
>
> [0]:
> http://people.apache.org/~ndimiduk/1.1.0_branch-1.1_compat_report.html
> [1]: http://people.apache.org/~ndimiduk/1.1.0_1.1.2RC2_compat_report.html
> [2]: http://hbase.apache.org/book.html#hbase.versioning
> [3]:
>
> https://issues.apache.org/jira/browse/HBASE-14394?focusedCommentId=14905429&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14905429
>
> On Mon, Sep 21, 2015 at 3:18 PM, Nick Dimiduk  wrote:
>
> > Hi folks,
> >
> > It's that time again, I'm looking at spinning 1.1.3 bit this week, with
> > hopes that we can get a release out in early October. The only issue I'm
> > actively tracking as a must for this release is HBASE-14374, the back
> port
> > for HBASE-14317. Is there anything else you're planning to get in for
> this
> > one that's not been committed yet? Please speak up. I'll be starting my
> > pre-release validations tomorrow or Wednesday.
> >
> > Thanks,
> > Nick
> >
> > On Fri, Sep 4, 2015 at 4:08 PM, Andrew Purtell 
> > wrote:
> >
> >> > PMC: do you have bandwidth to test yet another round of RC's?
> >>
> >> Yes, absolutely, and if you'd also like help making the RCs mail me
> >> privately.
> >>
> >> On Fri, Sep 4, 2015 at 8:11 AM, Nick Dimiduk 
> wrote:
> >>
> >> > Hi folks,
> >> >
> >> > I know we just got through voting periods on three patch releases, but
> >> > HBASE-14317 is looking pretty bad by my eye. Given we have a fix on
> our
> >> > end, I'm up for spinning 1.1.3 a couple weeks early. How does the
> >> community
> >> > feel about it? Users: do you need this patch immediately? PMC: do you
> >> have
> >> > bandwidth to test yet another round of RC's? I'm not on JIRA yet this
> >> > morning; is there other nastiness we should get fixed in an
> accelerated
> >> .3
> >> > as well?
> >> >
> >> > Thanks for your thoughts and your time.
> >> > -n
> >> >
> >>
> >>
> >>
> >> --
> >> Best regards,
> >>
> >>- Andy
> >>
> >> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> >> (via Tom White)
> >>
> >
> >
>


[ANNOUNCE] Apache HBase 1.0.2 is now available for download

2015-09-01 Thread Enis Söztutar
 The HBase Team is pleased to announce the immediate release of HBase
1.0.2.

Download it from your favorite Apache mirror [1] or maven repository.

HBase 1.0.2 is the next “patch” release in the 1.0.x release line and
supersedes all previous 1.0.x releases. According to HBase's semantic
version guide (see [2]), the release is source and binary
compatible with 1.0.0 for client applications and server side libraries
(coprocessors, filters, etc).

The 1.0.2 release contains 163 fixes on top of the 1.0.1 release. Most of
the changes are bug fixes or test fixes, except for the following:

** Improvement
* [HBASE-12415] - Add add(byte[][] arrays) to Bytes.
* [HBASE-12957] - region_mover#isSuccessfulScan may be extremely slow
on region with lots of expired data
* [HBASE-13247] - Change BufferedMutatorExample to use addColumn()
since add() is deprecated
* [HBASE-13344] - Add enforcer rule that matches our JDK support
statement
* [HBASE-13366] - Throw DoNotRetryIOException instead of read only
IOException
* [HBASE-13420] - RegionEnvironment.offerExecutionLatency Blocks
Threads under Heavy Load
* [HBASE-13431] - Allow to skip store file range check based on column
family while creating reference files in HRegionFileSystem#splitStoreFile
* [HBASE-13550] - [Shell] Support unset of a list of table attributes
* [HBASE-13761] - Optimize FuzzyRowFilter
* [HBASE-13780] - Default to 700 for HDFS root dir permissions for
secure deployments
* [HBASE-13828] - Add group permissions testing coverage to AC.
* [HBASE-13925] - Use zookeeper multi to clear znodes in ZKProcedureUtil

** New Feature
* [HBASE-13057] - Provide client utility to easily enable and disable
table replication

** Task
* [HBASE-11276] - Add back support for running ChaosMonkey as
standalone tool
* [HBASE-13764] - Backport HBASE-7782
(HBaseTestingUtility.truncateTable() not acting like CLI) to branch-1.x
* [HBASE-13799] - javadoc how Scan gets polluted when used; if you set
attributes or ask for scan metrics
* [HBASE-14085] - Correct LICENSE and NOTICE files in artifacts

** Sub-task
* [HBASE-7847] - Use zookeeper multi to clear znodes
* [HBASE-13035] - [0.98] Backport HBASE-12867 - Shell does not support
custom replication endpoint specification
* [HBASE-13201] - Remove HTablePool from thrift-server
* [HBASE-13496] - Make
Bytes$LexicographicalComparerHolder$UnsafeComparer::compareTo inlineable
* [HBASE-13497] - Remove MVCC stamps from HFile when that is safe
* [HBASE-13563] - Add missing table owner to AC tests.
* [HBASE-13579] - Avoid isCellTTLExpired() for NO-TAG cases
* [HBASE-13937] - Partially revert HBASE-13172
* [HBASE-13983] - Doc how the oddball HTable methods getStartKey,
getEndKey, etc. will be removed in 2.0.0
* [HBASE-14003] - work around jdk8 spec bug in WALPerfEval
* [HBASE-14086] - remove unused bundled dependencies


Full list of the issues can be found at
https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12329865&styleName=Html&projectId=12310753


Compatibility
-
This release (1.0.2) is source, wire and binary compatible with all
previous 1.0.x releases. Client applications do not have to be recompiled
with the new version (unless new API is used) if upgrading from a previous
1.0.x. It is a drop-in replacement.

See release notes for 1.0.0 [2] for compatibility with earlier
versions (0.94, 0.96, 0.98). Compatibility of 1.0.2 with earlier versions
is the same as in 1.0.0.

Source Compatibility:
Client side code in HBase-1.0.x is (mostly) source compatible with 0.98.x
versions. Some minor API changes might be needed from the client side.

Wire Compatibility:
HBase-1.0.x release is wire compatible with 0.98.x releases. Clients and
servers running in different versions as long as new features are not used
should be possible. A rolling upgrade from 0.98.x clusters to 1.0.x is
supported as well.
Rolling upgrade from 0.96 directly to 1.0.x is not supported. 1.0.x is NOT
wire compatible with earlier releases (0.94, etc).

Binary Compatibility:
Binary compatibility at the Java API layer with earlier versions (0.98.x,
0.96.x and 0.94.x) is not supported. You may have to recompile your client
code and any server side code (coprocessors, filters etc) referring to
hbase jars.


Upgrading
-
This release is rolling upgradable from earlier 1.0.x releases.

See [2] and [3] for upgrade instructions from earlier versions. Upgrading
to 1.0.2 is similar to upgrading to 1.0.0 as documented in [3].

From 0.98.x: Upgrade from 0.98.x in regular upgrade or rolling upgrade
fashion is supported.

From 0.96.x : Upgrade from 0.96.x is supported with a shutdown and restart
of the cluster.

From 0.94.x : Upgrade from 0.94.x is supported similar to upgrade from 0.94
-> 0.96. The upgrade script should be run to rewrite cluster level
metadata. See [3] for details.
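As a concrete illustration of that step, the metadata rewrite is driven by the
upgrade tool against a cleanly shut down cluster, roughly as follows (flags as
documented for the 0.96 upgrade in [3]; treat this as a sketch, not a
substitute for the full instructions):

    $ bin/hbase upgrade -check     # reports HFile v1 files that need major compaction first
    $ bin/hbase upgrade -execute   # rewrites cluster-level metadata and znodes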


Supported Hadoop versions
-
1.0.x releases 

Re: Please welcome new HBase committer Stephen Yuan Jiang

2015-08-21 Thread Enis Söztutar
Congrats, well deserved.

Enis

On Fri, Aug 21, 2015 at 11:12 AM, Devaraj Das d...@hortonworks.com wrote:

 Congratulations Stephen!

  On Aug 20, 2015, at 7:10 PM, Andrew Purtell apurt...@apache.org wrote:
 
  On behalf of the Apache HBase PMC, I am pleased to announce that Stephen
  Jiang has accepted the PMC's invitation to become a committer on the
  project. We appreciate all of Stephen's hard work and generous
  contributions thus far, and look forward to his continued involvement.
 
  Congratulations and welcome, Stephen!
 
  --
  Best regards,
 
- Andy
 
  Problems worthy of attack prove their worth by hitting back. - Piet Hein
  (via Tom White)



Re: [DISCUSS] correcting abusive behavior on mailing lists was (Re: [DISCUSS] Multi-Cluster HBase Client)

2015-06-30 Thread Enis Söztutar
I've just seen the thread in question, and I also feel that an action has to
be taken, because this type of behavior is unacceptable. It is also not the
first strike, if my memory serves me.

Moderation is fine if we have volunteers. Otherwise +1 for a temporary ban.

Enis

On Tue, Jun 30, 2015 at 6:25 PM, Sean Busbey bus...@cloudera.com wrote:

 On Tue, Jun 30, 2015 at 7:58 PM, lars hofhansl la...@apache.org wrote:

  Moderating is better than outright banning, I think. While Micheal is
  sometimes infuriating, he's also funny and smart.
  Can we have a group of moderators? I'd volunteer, but not if I'm the only
  one.
 
 
 So far we have both you and I willing to volunteer. I'm comfortable at two
 moderators if you are, though I'd certainly welcome additional.


 --
 Sean



Re: [VOTE] First release candidate for HBase 1.1.1 (RC0) is available

2015-06-25 Thread Enis Söztutar
ITBLL requires at least 1M per mapper. You can run with a smaller number of
mappers and numKeys = 1M.

Enis

On Thu, Jun 25, 2015 at 4:23 AM, Jean-Marc Spaggiari 
jean-m...@spaggiari.org wrote:

 2015-06-24 22:29 GMT-04:00 Enis Söztutar enis@gmail.com:

  
   Also, I tried to run IntegrationTestBigLinkedList and it fails:
   015-06-24 19:06:11,644 ERROR [main]
   test.IntegrationTestBigLinkedList$Verify: Expected referenced count
 does
   not match with actual referenced count. expected referenced=100
   ,actual=0
  
 
  What are the command line arguments passed? Verify cannot find any
  references?
 

 I ran this:
 bin/hbase org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList loop 3
 100 1 /tmp/loop 10

 I can retry with any other if it helps.


 
 
  
  
   And last I ran IntegrationTestLoadAndVerify but I have no idea how to
   interpret the result ;)
   org.apache.hadoop.hbase.test.IntegrationTestLoadAndVerify$Counters
   REFERENCES_WRITTEN=1980
   ROWS_WRITTEN=2000
  
   org.apache.hadoop.hbase.test.IntegrationTestLoadAndVerify$Counters
   REFERENCES_CHECKED=1036925998
   ROWS_WRITTEN=0
  
 
  This is a bit fishy. Again, what are the parameters passed? Did you run
  with a clean cluster state?
 


 Was always doing that before any test:
 echo disable 'IntegrationTestLoadAndVerify'; drop
 'IntegrationTestLoadAndVerify' | bin/hbase shell

 And ran with this command:
 bin/hbase org.apache.hadoop.hbase.test.IntegrationTestLoadAndVerify
 loadAndVerify



 
  For these two tests, I think there is at least 3 or so bugs already fixed
  in theory. Our tests and my 1.2B row tests on a previous branch-1.1 code
  base was ok.
 
 

 Also, any idea why none of the tests pass on my side?
 I tried some of them individually and they passed:
 ---
  T E S T S
 ---
 Running org.apache.hadoop.hbase.master.TestClockSkewDetection
 Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.69 sec -
 in org.apache.hadoop.hbase.master.TestClockSkewDetection
 Results :
 Tests run: 1, Failures: 0, Errors: 0, Skipped: 0

 But it prevents me from running the entire test suite at once. I will give it
 a few more tries today...

 JM



 
  
  
   So. It seems to be working on my cluster, but I have not been able to
 get
   any successful test. Therefore I'm a bit reluctant to say +1 and will
  only
   say +/-0
  
   For perf tests, I still need some more work on my clusters... So not
 for
   this release.
  
   JM
  
   2015-06-24 16:25 GMT-04:00 Ted Yu yuzhih...@gmail.com:
  
+1
   
Ran test suite against Java 1.8.0_45
Checked signature
Practiced basic shell commands
   
On Tue, Jun 23, 2015 at 4:25 PM, Nick Dimiduk ndimi...@apache.org
   wrote:
   
 I'm happy to announce the first release candidate of HBase 1.1.1
 (HBase-1.1.1RC0) is available for download at
 https://dist.apache.org/repos/dist/dev/hbase/hbase-1.1.1RC0/

 Maven artifacts are also available in the staging repository

  
 https://repository.apache.org/content/repositories/orgapachehbase-1087/

 Artifacts are signed with my code signing subkey
 0xAD9039071C3489BD,
 available in the Apache keys directory
 https://people.apache.org/keys/committer/ndimiduk.asc

 There's also a signed tag for this release at


   
  
 
 https://git-wip-us.apache.org/repos/asf?p=hbase.git;a=tag;h=af1934d826cab80f727e9a95c5b564f04da73259

 HBase 1.1.1 is the first patch release in the HBase 1.1 line,
   continuing
on
 the theme of bringing a stable, reliable database to the Hadoop and
   NoSQL
 communities. This release includes over 100 bug fixes since the
 1.1.0
 release, including an assignment manager bug that can lead to data
  loss
in
 rare cases. Users of 1.1.0 are strongly encouraged to update to
 1.1.1
   as
 soon as possible.

 The full list of issues can be found at


   
  
 
 https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310753&version=12332169

 Please try out this candidate and vote +/-1 by midnight Pacific
 time
  on
 Sunday, 2015-06-28 as to whether we should release these artifacts
 as
HBase
 1.1.1.

 Thanks,
 Nick

   
  
 



Re: [VOTE] First release candidate for HBase 1.1.1 (RC0) is available

2015-06-25 Thread Enis Söztutar
Slight correction, ITBLL needs numKeys to be a multiple of 1M. See the
javadoc and ascii art at the code.
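To make that concrete with JM's invocation below: reading the positional
arguments as iterations, mappers, nodes per mapper, output dir and reducers
(my reading of the usage text; please double-check against the javadoc), a run
that satisfies the multiple-of-1M constraint would look something like:

    bin/hbase org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList loop 3 1 1000000 /tmp/loop 10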

Enis

On Thu, Jun 25, 2015 at 1:47 PM, Enis Söztutar enis@gmail.com wrote:

 ITBLL requires at least 1M per mapper. You can run with a smaller number of
 mappers and numKeys = 1M.

 Enis

 On Thu, Jun 25, 2015 at 4:23 AM, Jean-Marc Spaggiari 
 jean-m...@spaggiari.org wrote:

 2015-06-24 22:29 GMT-04:00 Enis Söztutar enis@gmail.com:

  
   Also, I tried to run IntegrationTestBigLinkedList and it fails:
   015-06-24 19:06:11,644 ERROR [main]
   test.IntegrationTestBigLinkedList$Verify: Expected referenced count
 does
   not match with actual referenced count. expected referenced=100
   ,actual=0
  
 
  What are the command line arguments passed? Verify cannot find any
  references?
 

 I ran this:
 bin/hbase org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList loop 3
 100 1 /tmp/loop 10

 I can retry with any other if it helps.


 
 
  
  
   And last I ran IntegrationTestLoadAndVerify but I have no idea how to
   interpret the result ;)
   org.apache.hadoop.hbase.test.IntegrationTestLoadAndVerify$Counters
   REFERENCES_WRITTEN=1980
   ROWS_WRITTEN=2000
  
   org.apache.hadoop.hbase.test.IntegrationTestLoadAndVerify$Counters
   REFERENCES_CHECKED=1036925998
   ROWS_WRITTEN=0
  
 
  This is a bit fishy. Again, what are the parameters passed? Did you run
  with a clean cluster state?
 


 Was always doing that before any test:
 echo disable 'IntegrationTestLoadAndVerify'; drop
 'IntegrationTestLoadAndVerify' | bin/hbase shell

 And ran with this command:
 bin/hbase org.apache.hadoop.hbase.test.IntegrationTestLoadAndVerify
 loadAndVerify



 
  For these two tests, I think there is at least 3 or so bugs already
 fixed
  in theory. Our tests and my 1.2B row tests on a previous branch-1.1 code
  base was ok.
 
 

 Also, any idea why none of the tests pass on my side?
 I tried some of them individually and they passed:
 ---
  T E S T S
 ---
 Running org.apache.hadoop.hbase.master.TestClockSkewDetection
 Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.69 sec -
 in org.apache.hadoop.hbase.master.TestClockSkewDetection
 Results :
 Tests run: 1, Failures: 0, Errors: 0, Skipped: 0

 But it prevents me from running the entire test suite at once. I will give it
 a few more tries today...

 JM



 
  
  
   So. It seems to be working on my cluster, but I have not been able to
 get
   any successful test. Therefore I'm a bit reluctant to say +1 and will
  only
   say +/-0
  
   For perf tests, I still need some more work on my clusters... So not
 for
   this release.
  
   JM
  
   2015-06-24 16:25 GMT-04:00 Ted Yu yuzhih...@gmail.com:
  
+1
   
Ran test suite against Java 1.8.0_45
Checked signature
Practiced basic shell commands
   
On Tue, Jun 23, 2015 at 4:25 PM, Nick Dimiduk ndimi...@apache.org
   wrote:
   
 I'm happy to announce the first release candidate of HBase 1.1.1
 (HBase-1.1.1RC0) is available for download at
 https://dist.apache.org/repos/dist/dev/hbase/hbase-1.1.1RC0/

 Maven artifacts are also available in the staging repository

  
 https://repository.apache.org/content/repositories/orgapachehbase-1087/

 Artifacts are signed with my code signing subkey
 0xAD9039071C3489BD,
 available in the Apache keys directory
 https://people.apache.org/keys/committer/ndimiduk.asc

 There's also a signed tag for this release at


   
  
 
 https://git-wip-us.apache.org/repos/asf?p=hbase.git;a=tag;h=af1934d826cab80f727e9a95c5b564f04da73259

 HBase 1.1.1 is the first patch release in the HBase 1.1 line,
   continuing
on
 the theme of bringing a stable, reliable database to the Hadoop
 and
   NoSQL
 communities. This release includes over 100 bug fixes since the
 1.1.0
 release, including an assignment manager bug that can lead to data
  loss
in
 rare cases. Users of 1.1.0 are strongly encouraged to update to
 1.1.1
   as
 soon as possible.

 The full list of issues can be found at


   
  
 
 https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310753&version=12332169

 Please try out this candidate and vote +/-1 by midnight Pacific
 time
  on
 Sunday, 2015-06-28 as to whether we should release these
 artifacts as
HBase
 1.1.1.

 Thanks,
 Nick

   
  
 





Re: [VOTE] First release candidate for HBase 1.1.1 (RC0) is available

2015-06-24 Thread Enis Söztutar

 Also, I tried to run IntegrationTestBigLinkedList and it fails:
 015-06-24 19:06:11,644 ERROR [main]
 test.IntegrationTestBigLinkedList$Verify: Expected referenced count does
 not match with actual referenced count. expected referenced=100
 ,actual=0


What are the command line arguments passed? Verify cannot find any
references?




 And last I ran IntegrationTestLoadAndVerify but I have no idea how to
 interpret the result ;)
 org.apache.hadoop.hbase.test.IntegrationTestLoadAndVerify$Counters
 REFERENCES_WRITTEN=1980
 ROWS_WRITTEN=2000

 org.apache.hadoop.hbase.test.IntegrationTestLoadAndVerify$Counters
 REFERENCES_CHECKED=1036925998
 ROWS_WRITTEN=0


This is a bit fishy. Again, what are the parameters passed? Did you run
with a clean cluster state?

For these two tests, I think there is at least 3 or so bugs already fixed
in theory. Our tests and my 1.2B row tests on a previous branch-1.1 code
base was ok.




 So. It seems to be working on my cluster, but I have not been able to get
 any successful test. Therefore I'm a bit reluctant to say +1 and will only
 say +/-0

 For perf tests, I still need some more work on my clusters... So not for
 this release.

 JM

 2015-06-24 16:25 GMT-04:00 Ted Yu yuzhih...@gmail.com:

  +1
 
  Ran test suite against Java 1.8.0_45
  Checked signature
  Practiced basic shell commands
 
  On Tue, Jun 23, 2015 at 4:25 PM, Nick Dimiduk ndimi...@apache.org
 wrote:
 
   I'm happy to announce the first release candidate of HBase 1.1.1
   (HBase-1.1.1RC0) is available for download at
   https://dist.apache.org/repos/dist/dev/hbase/hbase-1.1.1RC0/
  
   Maven artifacts are also available in the staging repository
  
 https://repository.apache.org/content/repositories/orgapachehbase-1087/
  
   Artifacts are signed with my code signing subkey 0xAD9039071C3489BD,
   available in the Apache keys directory
   https://people.apache.org/keys/committer/ndimiduk.asc
  
   There's also a signed tag for this release at
  
  
 
 https://git-wip-us.apache.org/repos/asf?p=hbase.git;a=tag;h=af1934d826cab80f727e9a95c5b564f04da73259
  
   HBase 1.1.1 is the first patch release in the HBase 1.1 line,
 continuing
  on
   the theme of bringing a stable, reliable database to the Hadoop and
 NoSQL
   communities. This release includes over 100 bug fixes since the 1.1.0
   release, including an assignment manager bug that can lead to data loss
  in
   rare cases. Users of 1.1.0 are strongly encouraged to update to 1.1.1
 as
   soon as possible.
  
   The full list of issues can be found at
  
  
 
 https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310753&version=12332169
  
   Please try out this candidate and vote +/-1 by midnight Pacific time on
   Sunday, 2015-06-28 as to whether we should release these artifacts as
  HBase
   1.1.1.
  
   Thanks,
   Nick
  
 



Re: [VOTE] First release candidate for HBase 1.1.1 (RC0) is available

2015-06-24 Thread Enis Söztutar
Here is my official +1.

- Checked sigs, crcs
- Checked dir layout
- Built src with Hadoop-2.3+
- Run local mode, smoke tests from shell
- Run LTT on local mode
- Checked compat report that Nick put up.
- Checked tag
- Checked src tarball contents against tag. There are two extra files:
 hbase-shaded-client/pom.xml  and hbase-shaded-server/pom.xml. Not sure
where they are coming from. Create an issue? But not important for the RC.

Plus, we have been running (close to) 1.1.1 bits against our test rig with
most of the IT's and the results never looked better.

Enis


On Wed, Jun 24, 2015 at 7:29 PM, Enis Söztutar enis@gmail.com wrote:

 Also, I tried to run IntegrationTestBigLinkedList and it fails:
 015-06-24 19:06:11,644 ERROR [main]
 test.IntegrationTestBigLinkedList$Verify: Expected referenced count does
 not match with actual referenced count. expected referenced=100
 ,actual=0


 What are the command line arguments passed? Verify cannot find any
 references?




 And last I ran IntegrationTestLoadAndVerify but I have no idea how to
 interpret the result ;)
 org.apache.hadoop.hbase.test.IntegrationTestLoadAndVerify$Counters
 REFERENCES_WRITTEN=1980
 ROWS_WRITTEN=2000

 org.apache.hadoop.hbase.test.IntegrationTestLoadAndVerify$Counters
 REFERENCES_CHECKED=1036925998
 ROWS_WRITTEN=0


 This is a bit fishy. Again, what are the parameters passed? Did you run
 with a clean cluster state?

 For these two tests, I think there is at least 3 or so bugs already fixed
 in theory. Our tests and my 1.2B row tests on a previous branch-1.1 code
 base was ok.




 So. It seems to be working on my cluster, but I have not been able to get
 any successful test. Therefore I'm a bit reluctant to say +1 and will only
 say +/-0

 For perf tests, I still need some more work on my clusters... So not for
 this release.

 JM

 2015-06-24 16:25 GMT-04:00 Ted Yu yuzhih...@gmail.com:

  +1
 
  Ran test suite against Java 1.8.0_45
  Checked signature
  Practiced basic shell commands
 
  On Tue, Jun 23, 2015 at 4:25 PM, Nick Dimiduk ndimi...@apache.org
 wrote:
 
   I'm happy to announce the first release candidate of HBase 1.1.1
   (HBase-1.1.1RC0) is available for download at
   https://dist.apache.org/repos/dist/dev/hbase/hbase-1.1.1RC0/
  
   Maven artifacts are also available in the staging repository
  
 https://repository.apache.org/content/repositories/orgapachehbase-1087/
  
   Artifacts are signed with my code signing subkey 0xAD9039071C3489BD,
   available in the Apache keys directory
   https://people.apache.org/keys/committer/ndimiduk.asc
  
   There's also a signed tag for this release at
  
  
 
 https://git-wip-us.apache.org/repos/asf?p=hbase.git;a=tag;h=af1934d826cab80f727e9a95c5b564f04da73259
  
   HBase 1.1.1 is the first patch release in the HBase 1.1 line,
 continuing
  on
   the theme of bringing a stable, reliable database to the Hadoop and
 NoSQL
   communities. This release includes over 100 bug fixes since the 1.1.0
   release, including an assignment manager bug that can lead to data
 loss
  in
   rare cases. Users of 1.1.0 are strongly encouraged to update to 1.1.1
 as
   soon as possible.
  
   The full list of issues can be found at
  
  
 
 https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310753&version=12329043
  
   Please try out this candidate and vote +/-1 by midnight Pacific time
 on
   Sunday, 2015-06-28 as to whether we should release these artifacts as
  HBase
   1.1.1.
  
   Thanks,
   Nick
  
 





Re: Troubles with HBase 1.1.0 RC2

2015-05-13 Thread Enis Söztutar
Yeah, for coprocessors, what Andrew said. You have to make minor changes.

From your repo, I was able to build:

HW10676:hbase-deps-test$ ./build.sh

:compileJava

Download
https://repository.apache.org/content/repositories/orgapachehbase-1078/org/apache/hbase/hbase/1.1.0/hbase-1.1.0.pom

Download
https://repository.apache.org/content/repositories/orgapachehbase-1078/org/apache/hbase/hbase-server/1.1.0/hbase-server-1.1.0.jar

Download
https://repository.apache.org/content/repositories/orgapachehbase-1078/org/apache/hbase/hbase-common/1.1.0/hbase-common-1.1.0.jar

Download
https://repository.apache.org/content/repositories/orgapachehbase-1078/org/apache/hbase/hbase-common/1.1.0/hbase-common-1.1.0-tests.jar

Download
https://repository.apache.org/content/repositories/orgapachehbase-1078/org/apache/hbase/hbase-protocol/1.1.0/hbase-protocol-1.1.0.jar

Download
https://repository.apache.org/content/repositories/orgapachehbase-1078/org/apache/hbase/hbase-procedure/1.1.0/hbase-procedure-1.1.0.jar

Download
https://repository.apache.org/content/repositories/orgapachehbase-1078/org/apache/hbase/hbase-client/1.1.0/hbase-client-1.1.0.jar

Download
https://repository.apache.org/content/repositories/orgapachehbase-1078/org/apache/hbase/hbase-prefix-tree/1.1.0/hbase-prefix-tree-1.1.0.jar

Download
https://repository.apache.org/content/repositories/orgapachehbase-1078/org/apache/hbase/hbase-hadoop-compat/1.1.0/hbase-hadoop-compat-1.1.0.jar

Download
https://repository.apache.org/content/repositories/orgapachehbase-1078/org/apache/hbase/hbase-hadoop2-compat/1.1.0/hbase-hadoop2-compat-1.1.0.jar

Download
https://repository.apache.org/content/repositories/orgapachehbase-1078/org/apache/hbase/hbase-annotations/1.1.0/hbase-annotations-1.1.0.jar

:processResources UP-TO-DATE

:classes

:jar

:assemble

:compileTestJava UP-TO-DATE

:processTestResources UP-TO-DATE

:testClasses UP-TO-DATE

:test UP-TO-DATE

:check UP-TO-DATE

:build


BUILD SUCCESSFUL


Total time: 1 mins 8.182 secs


Also you should not need to pass -Dcompat.module=hbase-hadoop2-compat.
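For anyone else pointing a Gradle build at the staging bits, a sketch of the
repository stanza that makes those artifacts resolvable (the URL is the one
from the download log above; the rest is ordinary Gradle boilerplate):

    repositories {
        maven { url 'https://repository.apache.org/content/repositories/orgapachehbase-1078/' }
    }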

Enis

On Wed, May 13, 2015 at 3:21 PM, Andrew Purtell apurt...@apache.org wrote:

  So, it looks like RegionCoprocessorEnvironment.getRegion() has been
 removed?

 No, the signature has changed, basically s/HRegion/Region/. HRegion is an
 internal, low level implementation type. Has always been. We have replaced
 it with Region, an interface that contains a subset of HRegion we feel we
 can support for coprocessor source and binary compatibility longer term.
 This work was done on HBASE-12972 if you're curious to know more about it.
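
 To make the mechanical nature of the change concrete, a before/after sketch
 (the getRegion() signature change is per HBASE-12972; the surrounding
 variable names are illustrative):

     import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
     import org.apache.hadoop.hbase.regionserver.Region;

     // 1.0.x coprocessor code:
     //   HRegion region = env.getRegion();
     // 1.1.x coprocessor code:
     Region region = env.getRegion();
     byte[] startKey = region.getRegionInfo().getStartKey();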

  I guess I deploy a new coproc that uses whatever the new method is, and
 then in my client, detect at runtime which HBase version I'm talking to and
 use that to determine which coprocessor to hit?

 Coprocessors are server side extensions. These API changes will require you
 to modify the code you plan to deploy on the server. I don't think any
 client side changes are needed. Unless your coprocessor implements an
 Endpoint and _you_ are changing your RPC message formats, a 1.0.x client
 shouldn't care whether it is talking to a 1.0.x server or a 1.1.x server,
 running your coprocessor or not.



 On Wed, May 13, 2015 at 3:00 PM, James Estes james.es...@gmail.com
 wrote:

  I saw the vote thread for RC2, so tried to build my project against it.
 
  My build fails when I depend on 1.1.0. I created a bare bones project
  to show the issue I'm running into:
  https://github.com/housejester/hbase-deps-test
 
  To be clear, it works in 1.0.0 (and I did add the repository).
 
  Further, we have a coprocessor and when I stand up a 1.1.0 HBase and
  call my endpoint, I get:
 
  ! Caused by: java.lang.NoSuchMethodError:
 
 
 org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment.getRegion()Lorg/apache/hadoop/hbase/regionserver/HRegion;
 
  The same coprocessor works under 1.0.0.
 
  So, it looks like RegionCoprocessorEnvironment.getRegion() has been
  removed?
 
  The Audience annotation is:
  @InterfaceAudience.LimitedPrivate(HBaseInterfaceAudience.COPROC)
  @InterfaceStability.Evolving
 
  Since it is Evolving it is allowed to change in a breaking way. I'm
  trying to think about how I migrate. I guess I deploy a new coproc
  that uses whatever the new method is, and then in my client, detect at
  runtime which HBase version I'm talking to and use that to determine
  which coprocessor to hit?
 
  Thanks,
  James
 



 --
 Best regards,

- Andy

 Problems worthy of attack prove their worth by hitting back. - Piet Hein
 (via Tom White)



Re: [VOTE] Second release candidate for HBase 1.1.0 (RC1) is available.

2015-05-12 Thread Enis Söztutar
I guess you should mention that the RC is created from branch-1.1.0, as
opposed to branch-1.1.

I was looking at
https://people.apache.org/~ndimiduk/1.0.0_1.1.0RC1_compat_report.html and
was surprised by the HTable warning about RegionLocator. Did you
cherry-pick the commit from master or branch-1.1 for branch-1.1.0? It seems
that it is missing the HTable change (which is not needed in master).

Enis

On Mon, May 11, 2015 at 11:15 PM, Nick Dimiduk ndimi...@apache.org wrote:

 Here's my review and +1:

 [✓] verified tarballs vs public key in p.a.o/keys/committers/ndimiduk.asc.
 [✓] extracted src tgz, structure looks good.
 [✓] verified build of src tgz against hadoop versions (2.2.0/minikdc=2.3.0,
 2.3.0, 2.4.0, 2.4.1, 2.5.0, 2.5.1, 2.5.2, 2.6.0, 2.7.0), with both
 jdk1.7.0_67.jdk (mac) and jdk1.8.0_20.jdk (mac).
 [✓] run LoadTestTool against standalone built from src tgz with FAST_DIFF
 block encoder and ROWCOL blooms. No issues, logs look good.
 [✓] poked around with the shell on the same: list, status, snapshot,
 compact, drop, clone, delete_snapshot, drop. no issues, logs look good.
 [✓] built site from src tgz. site and book looks good.
 [✓] extracted bin tgz, inspect structure. look good.
 [✓] started standalone from bin tgz, poked webUI, dumped metrics, conf,
 debug. looks good.
 [✓] load the site from bin tgz. noticed issue with non-relative link, filed
 HBASE-13669.
 [✓] ran Stack's hbase-downstreamer [0] vs. the maven repo. tests pass.
 [✓] inspected compatibility reports vs. 0.98.0 [1] (skimmed) and 1.0.0 [2]
 (thoroughly, comment on HBASE-13661). Looks okay to me.

 Additionally, against RC0 I did the following:

 [✓] on 7-node cluster, verified rolling upgrade from 0.98.0 while
 concurrently running LoadTestTool with LZ4 compression (0.98.0 client). No
 issues, logs look good.
 [✓] run IntegrationTestBigLinkedList with slow deterministic CM on 6-node
 cluster for 24+ hours, 15 loops of 75mm. All chains verified.

 [0]: https://github.com/ndimiduk/hbase-downstreamer
 [1]: http://people.apache.org/~ndimiduk/0.98.0_1.1.0RC1_compat_report.html
 [2]: http://people.apache.org/~ndimiduk/1.0.0_1.1.0RC1_compat_report.html


 On Mon, May 11, 2015 at 11:04 PM, Nick Dimiduk ndimi...@apache.org
 wrote:

  I'm happy to announce the second release candidate of HBase 1.1.0
  (HBase-1.1.0RC1) is available for download at
  https://dist.apache.org/repos/dist/dev/hbase/hbase-1.1.0RC1/
 
  Maven artifacts are also available in the staging repository
  https://repository.apache.org/content/repositories/orgapachehbase-1077
 
  Artifacts are signed with my code signing subkey 0xAD9039071C3489BD,
  available in the Apache keys directory
  https://people.apache.org/keys/committer/ndimiduk.asc.
 
  There's also a signed tag for this release at
 
 https://git-wip-us.apache.org/repos/asf?p=hbase.git;a=tag;h=9d54a2e8806dc2a283859382e2a2069a8e49ee81
 
  HBase 1.1.0 is the first minor release in the HBase 1.x line, continuing
  on the theme of bringing a stable, reliable database to the Hadoop and
  NoSQL communities. This release includes nearly 200 resolved issues above
  the 1.0.x series to date. Notable features include:
 
   - Async RPC client (HBASE-12684)
   - Simple RPC throttling (HBASE-11598)
   - Improved compaction controls (HBASE-8329, HBASE-12859)
   - Scan improvements (HBASE-11544, HBASE-13090)
   - New extension interfaces for coprocessor users, better supporting
  projects like Phoenix (HBASE-12972, HBASE-12975)
   - Per-column family flush (HBASE-10201)
   - WAL on SSD (HBASE-12848)
   - BlockCache in Memcached (HBASE-13170)
   - Tons of region replica enhancements around META, WAL, and bulk loading
  (HBASE-11574, HBASE-11568, HBASE-11571, HBASE-11567)
 
  The full list of issues can be found at
 
 https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310753&version=12329043
 
  This release candidate is not materially different from the previous,
  including only three changes over the previous RC: HBASE-13661,
  HBASE-13637, HBASE-13665. Because of the small difference between this RC
  and the previous, and the extended period for which the previous RC was
  available, this vote window will be open for a modest period of 48 hours.
 
  Please try out this candidate and vote +/-1 by 23:00 Pacific time on
  2015-05-13 as to whether we should release these bits as HBase 1.1.0.
 
  Thanks,
  Nick
 



Re: [VOTE] Third release candidate for HBase 1.1.0 (RC2) is available.

2015-05-12 Thread Enis Söztutar
Here is my +1.

checked sigs, crcs

build src tarball with hadoop 2.3, 2.4.0, 2.5.0, 2.5.1, 2.5.2, 2.6.0 and
2.7.0 (did not try 2.2)

run local mode

simple tests from shell

Build with downstreamer

checked dir layouts

checked jar files

checked version, tag,

Checked the documentation.


I did not run these tests on RC2, but on RC0:

Run with LTT with 1M load on local mode and 5 node cluster. Seems fine.

Run with different block encoding and compression algorithms

Run 1.0.1 and 0.98.12 clients against 1.1.0 server, some smoke tests (list,
scan) and LTT
I am relying on the fact that RC0 and RC2 have only trivial changes in
terms of code.

Enis


On Tue, May 12, 2015 at 4:12 PM, Ted Yu yuzhih...@gmail.com wrote:

 +1

 - Checked signatures
 - Ran unit test suite
 - Exercised basic shell commands
 - Performed RAT check
 - Ran LTT in a mixed deployment (0.98.4 servers along side 1.1.0 servers)

 Cheers

 On Tue, May 12, 2015 at 3:56 PM, Andrew Purtell apurt...@apache.org
 wrote:

  +1
 
  - Checked sums and signatures
  - Unpacked src tarball, layout looks good
  - Built from source with defaults
  - RAT check passed
  - Built from source with Hadoop 2.2.0
  - Unit test suite mostly passed. (See HBASE-13676.)*
  - Unpacked bin tarball, layout looks good
  - Spot checked docs, looks good*
  - Ran LTT with 1M keys, latencies stable after warmup. Some minor issues
  with log levels (HBASE-13673, HBASE-13674, HBASE-13675)*
 
  ​* - Carried over from previous RC​
 
 
  On Tue, May 12, 2015 at 3:24 PM, Nick Dimiduk ndimi...@apache.org
 wrote:
 
   Testing done with the new bits:
  
   - confirmed there's no difference between HBASE-13661 on branch-1.1
   (07d9904) HBASE-13661 on branch-1.1.0 (e860c66).
  
   $ interdiff (git show 07d9904) (git show e860c66) 21 | wc -l
  0
  
   - inspected compatibility reports vs. 0.98.0 [0] and 1.0.0 [1]. HTable
   issue from RC1 corrected.
   - extracted src tgz, structure looks good.
   - verified build of src tgz against hadoop versions
 (2.2.0/minikdc=2.3.0,
   2.3.0, 2.4.0, 2.4.1, 2.5.0, 2.5.1, 2.5.2, 2.6.0, 2.7.0), with both
   jdk1.7.0_67.jdk (mac) and jdk1.8.0_20.jdk (mac).
   - run LoadTestTool against standalone built from src tgz with FAST_DIFF
   block encoder and ROWCOL blooms. No issues, filed HBASE-13677.
   - built site from src tgz. site and book looks good.
   - extracted bin tgz, inspect structure. looks good.
   - run LoadTestTool against standalone from bin tgz with FAST_DIFF block
   encoder and ROWCOL blooms. No issues, logs look good.
   - load the site from bin tgz. looks good.
   - ran Stack's hbase-downstreamer vs. the maven repo. tests pass.
  
   These validations combined with my previous testing means I'm +1 for
 this
   RC.
  
   [0]:
  http://people.apache.org/~ndimiduk/0.98.0_1.1.0RC2_compat_report.html
   [1]:
  http://people.apache.org/~ndimiduk/1.0.0_1.1.0RC2_compat_report.html
  
   On Tue, May 12, 2015 at 2:57 PM, Nick Dimiduk ndimi...@apache.org
  wrote:
  
I'm happy to announce the third (time's the charm) release candidate
 of
HBase 1.1.0 (HBase-1.1.0RC2) is available for download at
https://dist.apache.org/repos/dist/dev/hbase/hbase-1.1.0RC2/
   
Maven artifacts are also available in the staging repository
   
 https://repository.apache.org/content/repositories/orgapachehbase-1078
   
Artifacts are signed with my code signing subkey 0xAD9039071C3489BD,
available in the Apache keys directory
https://people.apache.org/keys/committer/ndimiduk.asc.
   
There's also a signed tag for this release at
   
  
 
 https://git-wip-us.apache.org/repos/asf?p=hbase.git;a=tag;h=831f98dc3ba722be44792d10b67c2e57d9522caf
   
HBase 1.1.0 is the first minor release in the HBase 1.x line,
  continuing
on the theme of bringing a stable, reliable database to the Hadoop
 and
NoSQL communities. This release includes nearly 200 resolved issues
  above
the 1.0.x series to date. Notable features include:
   
 - Async RPC client (HBASE-12684)
 - Simple RPC throttling (HBASE-11598)
 - Improved compaction controls (HBASE-8329, HBASE-12859)
 - Scan improvements (HBASE-11544, HBASE-13090)
 - New extension interfaces for coprocessor users, better supporting
projects like Phoenix (HBASE-12972, HBASE-12975)
 - Per-column family flush (HBASE-10201)
 - WAL on SSD (HBASE-12848)
 - BlockCache in Memcached (HBASE-13170)
 - Tons of region replica enhancements around META, WAL, and bulk
  loading
(HBASE-11574, HBASE-11568, HBASE-11571, HBASE-11567)
   
The full list of issues can be found at
   
  
 
 https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310753&version=12329043
   
This release candidate is not materially different from the previous,
correcting the application of HBASE-13661. Because of the small
   difference
between this RC and the previous, and the extended period for which
 the
first RC was available, this vote window 

Re: [VOTE] First release candidate for HBase 1.1.0 (RC0) is available.

2015-05-11 Thread Enis Söztutar

  - a resolution to the RegionScanner interface change, if deemed necessary


Sorry, I confused RegionScanner with ResultScanner. RegionScanner is not a
Public class, only for co-processors. I think we do not need a fix here.

On a totally unrelated note, I was going over the ThrottlingException for
HBASE-13661, and noticed that it extends DoNotRetryIOException. Looking at
the unit tests as well, it means that if the throttle is exceeded, it
bubbles up to the client level as a RetriesExhaustedException, and the
application side has to do explicit handling because of this. I was
intuitively expecting the hbase-client level to handle the throttling
instead of raising this as an application-level exception. Is this the
expected behavior? The exception contains enough details on the throttling
that it seems it can do the wait, but seems strange to delegate that to the
application instead of handling it at the retry layer. Did we choose this
because of fast fail semantics? Sorry I missed the reviews.
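
To make the semantics concrete, a minimal client-side sketch of the explicit
handling this design implies (ThrottlingException and getWaitInterval() are
from the HBASE-11598 quota work; the back-off policy itself is illustrative):

    try {
      table.put(put);
    } catch (org.apache.hadoop.hbase.quotas.ThrottlingException te) {
      // ThrottlingException is a DoNotRetryIOException, so the client
      // library fails fast instead of retrying internally.
      long waitMs = te.getWaitInterval();  // how long the quota asks us to wait
      // back off for waitMs, then retry, give up, or shed load as appropriate
    }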

This semantics is important for the RC I think, since it is the first time
we are introducing it. Just wanted to confirm that it is an explicit
decision.

Enis


  - corrected docs and site build

 Since this RC has already been open for 12 days and that RC would contain
 an extremely limited set of changes above RC0, I would like to run it
 through an abbreviated voting window -- say 48 hours.

 On Wed, Apr 29, 2015 at 10:35 PM, Nick Dimiduk ndimi...@apache.org
 wrote:

  I'm happy to announce the first release candidate of HBase 1.1.0
  (HBase-1.1.0RC0) is available for download at
  https://dist.apache.org/repos/dist/dev/hbase/hbase-1.0.1RC2/
 
  Maven artifacts are also available in the staging repository
  https://repository.apache.org/content/repositories/orgapachehbase-1076
 
  Artifacts are signed with my code signing subkey 0xAD9039071C3489BD,
  available in the Apache keys directory
  https://people.apache.org/keys/committer/ndimiduk.asc and in
  http://people.apache.org/~ndimiduk/KEY
 
  There's also a signed tag for this release at
 
 https://git-wip-us.apache.org/repos/asf?p=hbase.git;a=tag;h=2c102dbe56116ca342abd08e906d70d900048a55
 
  HBase 1.1.0 is the first minor release in the HBase 1.x line, continuing
  on the theme of bringing a stable, reliable database to the Hadoop and
  NoSQL communities. This release includes nearly 200 resolved issues above
  the 1.0.x series to date. Notable features include:
 
   - Async RPC client (HBASE-12684)
   - Simple RPC throttling (HBASE-11598)
   - Improved compaction controls (HBASE-8329, HBASE-12859)
   - New extension interfaces for coprocessor users, better supporting
  projects like Phoenix (HBASE-12972, HBASE-12975)
   - Per-column family flush (HBASE-10201)
   - WAL on SSD (HBASE-12848)
   - BlockCache in Memcached (HBASE-13170)
   - Tons of region replica enhancements around META, WAL, and bulk loading
  (HBASE-11574, HBASE-11568, HBASE-11571, HBASE-11567)
 
  The full list of issues can be found at
 
 https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310753&version=12329043
 
  Please try out this candidate and vote +/-1 by midnight Pacific time on
  2015-05-06 as to whether we should release these bits as HBase 1.1.0.
 
  Thanks,
  Nick
 



Re: [VOTE] First release candidate for HBase 1.1.0 (RC0) is available.

2015-05-10 Thread Enis Söztutar
Here are my tests so far:

checked sigs, crcs

build src tarball with hadoop 2.3, 2.4.0, 2.5.0, 2.5.1, 2.5.2 and 2.6.0.
Build with Hadoop-2.2.0 is broken, as noted in the previous mail. I don’t
think we should sink the RC for this.

run local mode

simple tests from shell

Build with downstreamer

checked dir layouts

checked jar files

checked version, tag,

Checked the documentation. Both the index.html and book are in the old
format. Nick, did you copy the docs from master? This is unfortunate, but
not a blocker to the RC.

Run with LTT with 1M load on local mode and 5 node cluster. Seems fine.

Run with different block encoding and compression algorithms

Run 1.0.1 and 0.98.12 clients against 1.1.0 server, some smoke tests (list,
scan) and LTT

Going through the compat report:
https://people.apache.org/~enis/1.0.1_1.1.0RC0_compat_report.html, a couple
of interesting things:

  - https://issues.apache.org/jira/browse/HBASE-13551 missed 2 classes
related to proc v2. They should not be Public.

  - AuthUtil should not be Public.

This seems to be a source incompatibility:

 - RegionScanner.nextRaw ( java.util.List<org.apache.hadoop.hbase.Cell> p1,
int p2 ) [abstract] : boolean

My vote would be -0, since the RegionScanner.nextRaw() change, although not
used much, is concerning and breaks source compat.

Enis

On Wed, May 6, 2015 at 1:07 PM, Nick Dimiduk ndimi...@gmail.com wrote:

 I'm also traveling today.

 I've already extended the vote for this RC to Sunday, and since no one has
 said this is a -1 -worthy regression, this candidate continues to stand.

 On Wed, May 6, 2015 at 12:16 PM, Andrew Purtell andrew.purt...@gmail.com
 wrote:

  Formally, -0
 
  Given tomorrow is hbasecon perhaps it would be better to spin a RC on
  Friday?
 
  I can take HBASE-13637 but am sitting on a plane at the moment. Won't be
  able to get to it until tonight.
 
   On May 6, 2015, at 10:43 AM, Nick Dimiduk ndimi...@apache.org wrote:
  
   On Wed, May 6, 2015 at 10:13 AM, Andrew Purtell 
  andrew.purt...@gmail.com
   wrote:
  
   I prefer to patch the POMs.
  
   Is this a formal -1?
  
   I've opened HBASE-13637 for tracking this issue. Let's get it fixed and
   I'll spin a new RC tonight.
  
   On May 5, 2015, at 4:16 PM, Nick Dimiduk ndimi...@gmail.com wrote:
  
   So what's the conclusion here? Are we dropping 2.2 support or
 updating
   the
   poms and sinking the RC?
  
   On Fri, May 1, 2015 at 7:47 AM, Sean Busbey bus...@cloudera.com
   wrote:
  
   On Thu, Apr 30, 2015 at 6:48 PM, Andrew Purtell 
 apurt...@apache.org
   wrote:
  
   We could patch our POMs to reference the hadoop-minikdc artifact
   independently of the rest of the Hadoop packages. It's standalone
 and
   rarely changes.
   +1. I've been using HBase to test Hadoop changes for isolating
   dependencies
   from downstream folks (HADOOP-11804), and I've just been leaving the
   hadoop-minikdc artifact as-is due to these very reasons.
  
   --
   Sean
  
 



Re: [VOTE] First release candidate for HBase 1.1.0 (RC0) is available.

2015-05-05 Thread Enis Söztutar
Ok, I am able to verify with that KEY file. Did you update your key at
id.apache.org? There is a bot that updates the key under people.apache.org
as far as I remember.
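
For reference, the verification dance against a standalone KEY file is roughly
the standard gpg usage (the key ID here is the one from the signature output
quoted elsewhere in this thread):

    $ gpg --import KEY
    $ gpg --verify hbase-1.1.0-bin.tar.gz.asc hbase-1.1.0-bin.tar.gz
    $ gpg --recv-keys AD9039071C3489BD   # alternatively, fetch from a keyserver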

Enis

On Mon, May 4, 2015 at 8:36 PM, Nick Dimiduk ndimi...@apache.org wrote:

 I had uploaded to subkeys.pgp.net originally, wasn't sure why
 people.apache.org wasn't picking up the new sub key. I've just uploaded to
 pgp.mit.edu just now.

 See also http://people.apache.org/~ndimiduk/KEY

 On Mon, May 4, 2015 at 6:42 PM, Enis Söztutar enis@gmail.com wrote:

  Nick did you upload your keys to MIT servers?
 
  I am not able to verify the sig.
 
  gpg --list-keys
 
  pub   4096R/8644EEB6 2014-03-11 [expires: 2016-04-14]
 
  uid  Nick Dimiduk ndimi...@apache.org
 
  uid  Nick Dimiduk ndimi...@gmail.com
 
  sub   4096R/D2DCE494 2014-03-11 [expires: 2016-04-14]
 
  gpg --verify hbase-1.1.0-bin.tar.gz.asc
 
  gpg: Signature made Wed Apr 29 18:45:38 2015 PDT using RSA key ID
 1C3489BD
 
  gpg: Can't check signature: public key not found
 
  Did you use a different key (1C3489BD) than the one at
  https://people.apache.org/keys/committer/ndimiduk.asc ?
 
  Enis
 
  On Wed, Apr 29, 2015 at 10:35 PM, Nick Dimiduk ndimi...@apache.org
  wrote:
 
   I'm happy to announce the first release candidate of HBase 1.1.0
   (HBase-1.1.0RC0) is available for download at
   https://dist.apache.org/repos/dist/dev/hbase/hbase-1.0.1RC2/
  
   Maven artifacts are also available in the staging repository
   https://repository.apache.org/content/repositories/orgapachehbase-1076
  
   Artifacts are signed with my code signing subkey 0xAD9039071C3489BD,
   available in the Apache keys directory
   https://people.apache.org/keys/committer/ndimiduk.asc and in
   http://people.apache.org/~ndimiduk/KEY
  
   There's also a signed tag for this release at
  
  
 
 https://git-wip-us.apache.org/repos/asf?p=hbase.git;a=tag;h=2c102dbe56116ca342abd08e906d70d900048a55
  
   HBase 1.1.0 is the first minor release in the HBase 1.x line,
 continuing
  on
   the theme of bringing a stable, reliable database to the Hadoop and
 NoSQL
   communities. This release includes nearly 200 resolved issues above the
   1.0.x series to date. Notable features include:
  
- Async RPC client (HBASE-12684)
- Simple RPC throttling (HBASE-11598)
- Improved compaction controls (HBASE-8329, HBASE-12859)
- New extension interfaces for coprocessor users, better supporting
   projects like Phoenix (HBASE-12972, HBASE-12975)
- Per-column family flush (HBASE-10201)
- WAL on SSD (HBASE-12848)
- BlockCache in Memcached (HBASE-13170)
- Tons of region replica enhancements around META, WAL, and bulk
 loading
   (HBASE-11574, HBASE-11568, HBASE-11571, HBASE-11567)
  
   The full list of issues can be found at
  
  
 
 https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310753&version=12329043
  
   Please try out this candidate and vote +/-1 by midnight Pacific time on
   2015-05-06 as to whether we should release these bits as HBase 1.1.0.
  
   Thanks,
   Nick
  
 



Re: [VOTE] First release candidate for HBase 1.1.0 (RC0) is available.

2015-05-04 Thread Enis Söztutar
Nick did you upload your keys to MIT servers?

I am not able to verify the sig.

gpg --list-keys

pub   4096R/8644EEB6 2014-03-11 [expires: 2016-04-14]

uid  Nick Dimiduk ndimi...@apache.org

uid  Nick Dimiduk ndimi...@gmail.com

sub   4096R/D2DCE494 2014-03-11 [expires: 2016-04-14]

gpg --verify hbase-1.1.0-bin.tar.gz.asc

gpg: Signature made Wed Apr 29 18:45:38 2015 PDT using RSA key ID 1C3489BD

gpg: Can't check signature: public key not found

Did you use a different key (1C3489BD) than the one at
https://people.apache.org/keys/committer/ndimiduk.asc ?

Enis

On Wed, Apr 29, 2015 at 10:35 PM, Nick Dimiduk ndimi...@apache.org wrote:

 I'm happy to announce the first release candidate of HBase 1.1.0
 (HBase-1.1.0RC0) is available for download at
 https://dist.apache.org/repos/dist/dev/hbase/hbase-1.0.1RC2/

 Maven artifacts are also available in the staging repository
 https://repository.apache.org/content/repositories/orgapachehbase-1076

 Artifacts are signed with my code signing subkey 0xAD9039071C3489BD,
 available in the Apache keys directory
 https://people.apache.org/keys/committer/ndimiduk.asc and in
 http://people.apache.org/~ndimiduk/KEY

 There's also a signed tag for this release at

 https://git-wip-us.apache.org/repos/asf?p=hbase.git;a=tag;h=2c102dbe56116ca342abd08e906d70d900048a55

 HBase 1.1.0 is the first minor release in the HBase 1.x line, continuing on
 the theme of bringing a stable, reliable database to the Hadoop and NoSQL
 communities. This release includes nearly 200 resolved issues above the
 1.0.x series to date. Notable features include:

  - Async RPC client (HBASE-12684)
  - Simple RPC throttling (HBASE-11598)
  - Improved compaction controls (HBASE-8329, HBASE-12859)
  - New extension interfaces for coprocessor users, better supporting
 projects like Phoenix (HBASE-12972, HBASE-12975)
  - Per-column family flush (HBASE-10201)
  - WAL on SSD (HBASE-12848)
  - BlockCache in Memcached (HBASE-13170)
  - Tons of region replica enhancements around META, WAL, and bulk loading
 (HBASE-11574, HBASE-11568, HBASE-11571, HBASE-11567)

 The full list of issues can be found at

 https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310753&version=12329043

 Please try out this candidate and vote +/-1 by midnight Pacific time on
 2015-05-06 as to whether we should release these bits as HBase 1.1.0.

 Thanks,
 Nick



Re: HBase Filesystem Adapter

2015-04-30 Thread Enis Söztutar
This is a nice topic. Let's put it on the ref guide.

HBase on Azure FS is GA, and there has already been some work on
supporting HBase on the Hadoop native driver.
From this thread, my gathering is that HBase should run on HDFS, MaprFS,
IBM GPFS and Azure WASB (and maybe Isilon, etc.).

I had an old write-up of all the interaction points between HBase and the
underlying FS for semantic guarantees (atomic namespace rename, recover
lease, sync/flush), etc. If anyone is interested, I can try to dig it up.
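
As a rough sketch of what those interaction points look like in code (a
minimal illustration built on the standard Hadoop FileSystem API, not HBase
source; conf, edit, tmpPath and finalPath are placeholders):

    import org.apache.hadoop.fs.*;
    import org.apache.hadoop.hdfs.DistributedFileSystem;

    FileSystem fs = FileSystem.get(conf);
    // 1. Durable appends: WAL writers depend on hflush/hsync semantics.
    FSDataOutputStream wal = fs.create(new Path("/hbase/WALs/rs1/wal.0"));
    wal.write(edit);
    wal.hflush();  // the edit must become visible/durable to readers
    // 2. Atomic namespace rename: committing flushed/compacted store files.
    fs.rename(tmpPath, finalPath);  // must be atomic
    // 3. Lease recovery: reclaiming a crashed writer's WAL before log split.
    ((DistributedFileSystem) fs).recoverLease(new Path("/hbase/WALs/rs1/wal.0"));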

Enis

On Thu, Apr 30, 2015 at 1:46 PM, Nick Dimiduk ndimi...@gmail.com wrote:

 I believe HBase also runs directly against Azure Blob Storage. This article
 [0] gives some details; I'm not sure if it's hit GA yet.

 -n

 [0]:

 http://azure.microsoft.com/blog/2014/06/06/azure-hdinsight-previewing-hbase-clusters-as-a-nosql-database-on-azure-blobs/

 On Thu, Apr 30, 2015 at 11:46 AM, Sean Busbey bus...@cloudera.com wrote:

  This thread is starting to sound like a new section for the ref guide. :)
 
  --
  Sean
  On Apr 30, 2015 1:07 PM, Jerry He jerry...@gmail.com wrote:
 
   We've also made HBase running on IBM GPFS.
   http://en.wikipedia.org/wiki/IBM_General_Parallel_File_System
  
   We have a Hadoop FileSystem implementation that translates hadoop calls
   into GPFS native calls.
   Overall it has been running well on live clusters.
  
   Jerry
  
 



Re: [VOTE] First release candidate for HBase 1.1.0 (RC0) is available.

2015-04-30 Thread Enis Söztutar
The build is broken with Hadoop-2.2 because mini-kdc is not found:

[ERROR] Failed to execute goal on project hbase-server: Could not resolve
dependencies for project org.apache.hbase:hbase-server:jar:1.1.0: Could not
find artifact org.apache.hadoop:hadoop-minikdc:jar:2.2

We are saying that 1.1 supports 2.2, but it is not tested. We can either
decide to drop support for 2.2, or sink the RC and fix the issue.
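
A sketch of the POM-level fix (pinning hadoop-minikdc independently of the
active Hadoop profile; the 2.3.0 version below is an assumption, as the first
Hadoop release that published the artifact):

    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-minikdc</artifactId>
      <version>2.3.0</version>
      <scope>test</scope>
    </dependency>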

Enis

On Thu, Apr 30, 2015 at 9:12 AM, Andrew Purtell andrew.purt...@gmail.com
wrote:

 This is a VOTE thread. This discussion is highly off topic. Please drop
 dev@ from the CC and change the subject.



  On Apr 30, 2015, at 7:30 AM, Ted Yu yuzhih...@gmail.com wrote:
 
  And the following:
 
    <dependency>
      <groupId>org.apache.hbase</groupId>
      <artifactId>hbase-protocol</artifactId>
      <version>${hbase.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hbase</groupId>
      <artifactId>hbase-hadoop-compat</artifactId>
      <version>${hbase.version}</version>
    </dependency>
 
  On Thu, Apr 30, 2015 at 7:27 AM, Jean-Marc Spaggiari 
  jean-m...@spaggiari.org wrote:
 
  most probably something like this:
 
  <dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase-client</artifactId>
    <version>${hbase.version}</version>
  </dependency>
  <dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase-server</artifactId>
    <version>${hbase.version}</version>
  </dependency>
  <dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase-common</artifactId>
    <version>${hbase.version}</version>
  </dependency>
 
 
 
 
  2015-04-30 10:25 GMT-04:00 Jeetendra Gangele gangele...@gmail.com:
 
   I have added the lines below, but this is not bringing in the required jars
 
   <repositories>
     <repository>
       <id>Hbase-1.1.0</id>
       <url>https://repository.apache.org/content/repositories/orgapachehbase-1076</url>
     </repository>
   </repositories>
 
  On 30 April 2015 at 19:50, Jeetendra Gangele gangele...@gmail.com
  wrote:
 
  will it be possible for you to give me below
 
  Artifact Id,GrouId and version.
  if this is not possible how to add this repository is Pom.xml?
 
 
  On 30 April 2015 at 19:28, Ted Yu yuzhih...@gmail.com wrote:
 
  Jeetendra:
  Add the following repo in your pom.xml (repositories section):
  https://repository.apache.org/content/repositories/orgapachehbase-1076
 
  Then you can use 1.1.0 for hbase version.
 
  Cheers
 
  On Wed, Apr 29, 2015 at 11:06 PM, Jeetendra Gangele 
  gangele...@gmail.com
  wrote:
 
   I mean: how do I include this in pom.xml?
 
  On 30 April 2015 at 11:36, Jeetendra Gangele gangele...@gmail.com
 
  wrote:
 
   How do I include this in project code? Any sample?
 
  On 30 April 2015 at 11:32, Nick Dimiduk ndimi...@gmail.com
  wrote:
 
  Nope, you're right. That link should be
  https://dist.apache.org/repos/dist/dev/hbase/hbase-1.1.0RC0/
 
  On Wed, Apr 29, 2015 at 10:39 PM, Ashish Singhi 
  ashish.singhi.apa...@gmail.com wrote:
 
  Hi Nick.
  bq. (HBase-1.1.0RC0) is available for download at
  https://dist.apache.org/repos/dist/dev/hbase/hbase-1.0.1RC2/
  The above url is correct ? from the name it does not seems to
  be.
 
  -- Ashish
 
  On Thu, Apr 30, 2015 at 11:05 AM, Nick Dimiduk 
  ndimi...@apache.org
  wrote:
 
  I'm happy to announce the first release candidate of HBase
  1.1.0
  (HBase-1.1.0RC0) is available for download at
  https://dist.apache.org/repos/dist/dev/hbase/hbase-1.0.1RC2/
 
  Maven artifacts are also available in the staging repository
  https://repository.apache.org/content/repositories/orgapachehbase-1076
 
  Artifacts are signed with my code signing subkey
  0xAD9039071C3489BD,
  available in the Apache keys directory
  https://people.apache.org/keys/committer/ndimiduk.asc and in
  http://people.apache.org/~ndimiduk/KEY
 
  There's also a signed tag for this release at
 
 https://git-wip-us.apache.org/repos/asf?p=hbase.git;a=tag;h=2c102dbe56116ca342abd08e906d70d900048a55
 
  HBase 1.1.0 is the first minor release in the HBase 1.x line,
  continuing
  on
  the theme of bringing a stable, reliable database to the
  Hadoop
  and
  NoSQL
  communities. This release includes nearly 200 resolved issues
  above
  the
  1.0.x series to date. Notable features include:
 
  - Async RPC client (HBASE-12684)
  - Simple RPC throttling (HBASE-11598)
  - Improved compaction controls (HBASE-8329, HBASE-12859)
  - New extension interfaces for coprocessor users, better
  supporting
  projects like Phoenix (HBASE-12972, HBASE-12975)
  - Per-column family flush (HBASE-10201)
  - WAL on SSD (HBASE-12848)
  - BlockCache in Memcached (HBASE-13170)
  - Tons of region replica enhancements around META, WAL, and
  bulk
  loading
  (HBASE-11574, HBASE-11568, HBASE-11571, HBASE-11567)
 
  The full list of issues can be found at
 
 https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310753&version=12329043
 
  

Re: Index out of bounds exception when reading row

2015-04-23 Thread Enis Söztutar
In case this is HBASE-11234, HDP-2.2 releases contain the fix.

Enis

On Thu, Apr 23, 2015 at 12:06 PM, Ted Yu yuzhih...@gmail.com wrote:

 I think Dejan was referring to HBASE-11234

 Cheers

 On Thu, Apr 23, 2015 at 8:28 AM, Dejan Menges dejan.men...@gmail.com
 wrote:

  Hi,
 
   This is a known bug; there's a fix already. We had it as well.
 
  Cheers,
  Dejan
 
  On Thu, Apr 23, 2015 at 5:19 PM João Alves j...@5dlab.com wrote:
 
   Hi all,
  
   I have a cluster with HDP 2.1 stack running HBase 0.98.0.2. I have one
   HBase table where there is at least one row that is impossible to get
  using
   either the java API or the hbase shell. I was unable to find online any
   examples that encompass this particular situation, maybe you guys can
  help
   me. The output error is the following:
  
  
    ERROR: java.io.IOException
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2046)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:92)
        at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:160)
        at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:38)
        at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:110)
        at java.lang.Thread.run(Thread.java:744)
    Caused by: java.lang.IndexOutOfBoundsException
        at java.nio.Buffer.checkBounds(Buffer.java:559)
        at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:143)
        at org.apache.hadoop.hbase.io.encoding.FastDiffDeltaEncoder$1.decode(FastDiffDeltaEncoder.java:489)
        at org.apache.hadoop.hbase.io.encoding.FastDiffDeltaEncoder$1.decodeNext(FastDiffDeltaEncoder.java:540)
        at org.apache.hadoop.hbase.io.encoding.BufferedDataBlockEncoder$BufferedEncodedSeeker.seekToKeyInBlock(BufferedDataBlockEncoder.java:336)
        at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$EncodedScannerV2.loadBlockAndSeekToKey(HFileReaderV2.java:1134)
        at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:501)
        at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:515)
        at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekAtOrAfter(StoreFileScanner.java:238)
        at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:153)
        at org.apache.hadoop.hbase.regionserver.StoreScanner.seekScanners(StoreScanner.java:317)
        at org.apache.hadoop.hbase.regionserver.StoreScanner.<init>(StoreScanner.java:176)
        at org.apache.hadoop.hbase.regionserver.HStore.getScanner(HStore.java:1847)
        at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.<init>(HRegion.java:3716)
        at org.apache.hadoop.hbase.regionserver.HRegion.instantiateRegionScanner(HRegion.java:1890)
        at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1876)
        at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1853)
        at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4738)
        at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4712)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:2847)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:28857)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2008)
        ... 5 more
  
  
   The description of the table is:
  
    {NAME => 'd', DATA_BLOCK_ENCODING => 'FAST_DIFF', BLOOMFILTER => 'ROW',
    REPLICATION_SCOPE => '0', COMPRESSION => 'SNAPPY', VERSIONS => '1',
    TTL => '2147483647', MIN_VERSIONS => '0', KEEP_DELETED_CELLS => 'false',
    BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'},
    {NAME => 'm', DATA_BLOCK_ENCODING => 'FAST_DIFF', BLOOMFILTER => 'ROW',
    REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'SNAPPY',
    MIN_VERSIONS => '0', TTL => '2147483647', KEEP_DELETED_CELLS => 'false',
    BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
  
   Thanks for the help,
   João
  
  
 



Re: Can't start Hmaster due to zookeeper

2015-04-21 Thread Enis Söztutar
Moving this to the user mailing list; the issues mailing list is for JIRA issues.

Did you start zookeeper, and follow the guide at
https://hbase.apache.org/book.html#quickstart ?
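
If HBase manages its own ZooKeeper, it is also worth double-checking that the
quorum in hbase-site.xml matches the hosts in the log; a minimal sketch with
the standard property names (hostnames taken from your log below):

    <property>
      <name>hbase.zookeeper.quorum</name>
      <value>pc439.emulab.net,pc440.emulab.net</value>
    </property>
    <property>
      <name>hbase.cluster.distributed</name>
      <value>true</value>
    </property>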

Enis

On Tue, Apr 21, 2015 at 11:36 AM, Bo Fu b...@uchicago.edu wrote:

 Hi,

 I’m a beginner with HBase. I’m currently deploying HBase 1.0.0 onto Emulab
 using Hadoop 2.6.0.
 When I type bin/start-hbase.sh, the HMaster and HRegionServers start and then
 shut down. The master logs are as follows:

 2015-04-21 12:13:58,607 INFO  [main-SendThread(pc439.emulab.net:2181)] zookeeper.ClientCnxn: Opening socket connection to server pc439.emulab.net/155.98.38.39:2181. Will not attempt to authenticate using SASL (unknown error)
 2015-04-21 12:13:58,608 INFO  [main-SendThread(pc439.emulab.net:2181)] zookeeper.ClientCnxn: Socket connection established to pc439.emulab.net/155.98.38.39:2181, initiating session
 2015-04-21 12:13:58,609 INFO  [main-SendThread(pc439.emulab.net:2181)] zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
 2015-04-21 12:13:59,513 INFO  [main-SendThread(pc440.emulab.net:2181)] zookeeper.ClientCnxn: Opening socket connection to server pc440.emulab.net/155.98.38.40:2181. Will not attempt to authenticate using SASL (unknown error)
 2015-04-21 12:13:59,513 INFO  [main-SendThread(pc440.emulab.net:2181)] zookeeper.ClientCnxn: Socket connection established to pc440.emulab.net/155.98.38.40:2181, initiating session
 2015-04-21 12:13:59,514 INFO  [main-SendThread(pc440.emulab.net:2181)] zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
 2015-04-21 12:14:01,531 INFO  [main-SendThread(pc439.emulab.net:2181)] zookeeper.ClientCnxn: Opening socket connection to server pc439.emulab.net/155.98.38.39:2181. Will not attempt to authenticate using SASL (unknown error)
 2015-04-21 12:14:01,531 INFO  [main-SendThread(pc439.emulab.net:2181)] zookeeper.ClientCnxn: Socket connection established to pc439.emulab.net/155.98.38.39:2181, initiating session
 2015-04-21 12:14:01,532 INFO  [main-SendThread(pc439.emulab.net:2181)] zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
 2015-04-21 12:14:01,633 WARN  [main] zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper, quorum=pc439.emulab.net:2181,pc440.emulab.net:2181, exception=org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase
 2015-04-21 12:14:01,633 ERROR [main] zookeeper.RecoverableZooKeeper:
 ZooKeeper create failed after 4 attempts
 2015-04-21 12:14:01,634 ERROR [main] master.HMasterCommandLine: Master
 exiting
 java.lang.RuntimeException: Failed construction of Master: class
 org.apache.hadoop.hbase.master.HMaster
 at
 org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:1982)
 at
 org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:198)
 at
 org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:139)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
 at
 org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
 at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1996)
 Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException:
 KeeperErrorCode = ConnectionLoss for /hbase
 at
 org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
 at
 org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
 at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
 at
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.createNonSequential(RecoverableZooKeeper.java:512)
 at
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.create(RecoverableZooKeeper.java:491)
 at
 org.apache.hadoop.hbase.zookeeper.ZKUtil.createWithParents(ZKUtil.java:1252)
 at
 org.apache.hadoop.hbase.zookeeper.ZKUtil.createWithParents(ZKUtil.java:1230)
 at
 org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.createBaseZNodes(ZooKeeperWatcher.java:174)
 at
 org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.init(ZooKeeperWatcher.java:167)

Re: hbase.apache.org homepage looks weird on Chrome and Firefox

2015-04-17 Thread Enis Söztutar
It is also possible because of HTTP resources are not loaded. I run into
this daily, because of an extension I am using which refuses to load unsafe
scripts from HTTPS links which I default to.

Usually, web sites also host the CSS and javascript files referred instead
of hot linking them. Maybe we should do the same.

Enis

On Thu, Apr 16, 2015 at 12:13 PM, anil gupta anilgupt...@gmail.com wrote:

 In chrome, i did Clear Browsing Data and then revisited 
 http://hbase.apache.org/;. It came up properly. Thanks for the pointer,
 Nick.

 On Thu, Apr 16, 2015 at 11:05 AM, Andrew Purtell apurt...@apache.org
 wrote:

  Looks fine for me, Chrome and Firefox tested. As Nick says Looks like the
  CSS asset didn't load at Anil's location for whatever reason.
 
 
  On Thu, Apr 16, 2015 at 8:36 AM, Stack st...@duboce.net wrote:
 
   Are others running into the issue Anil sees?
   Thanks,
   St.Ack
  
   On Thu, Apr 16, 2015 at 8:13 AM, anil gupta anilgupt...@gmail.com
  wrote:
  
Chrome: Version 42.0.2311.90 (64-bit) on Mac
   
But, firefox(34.0.5) also displays the page in same way.
   
On Thu, Apr 16, 2015 at 12:58 AM, Ted Yu yuzhih...@gmail.com
 wrote:
   
 Which Chrome version do you use ?

 I use 41.0.2272.104 (64-bit) (on Mac) and the page renders fine.

 Cheers

 On Wed, Apr 15, 2015 at 11:27 PM, anil gupta 
 anilgupt...@gmail.com
 wrote:

  Hi,
 
  I am aware that recently there were some updates done on HBase
   website.
 For
  last few months, more often than not, the homepage is displayed
 in
weird
  way in chrome and firefox. Is there a bug on homepage that is
  leading
to
  this view:
 
 

   
  
 
 https://www.dropbox.com/s/jcpfnu4jwim28zg/Screen%20Shot%202015-04-15%20at%2011.18.46%20PM.png?dl=0
 
 

   
  
 
 https://www.dropbox.com/s/o7xminppnzll6x7/Screen%20Shot%202015-04-15%20at%2011.19.55%20PM.png?dl=0
 
  IMO, if the homepage looks broken then its hard to proceed ahead
 to
read
  the docs. My two cents.
 
  Also, it would be nice if we could move docs of startgate from
  here:
  https://wiki.apache.org/hadoop/Hbase/Stargate to
 hbase.apache.org.
 
 
  --
  Thanks  Regards,
  Anil Gupta
 

   
   
   
--
Thanks  Regards,
Anil Gupta
   
  
 
 
 
  --
  Best regards,
 
 - Andy
 
  Problems worthy of attack prove their worth by hitting back. - Piet Hein
  (via Tom White)
 



 --
 Thanks  Regards,
 Anil Gupta



Re: Regionserver won't start

2015-03-30 Thread Enis Söztutar
The configuration and logs seem usual. Agreed that it seems the shutdown
hook binding failed.

Do you see anything in the .out files? What about the classpath?

Enis

On Fri, Mar 27, 2015 at 2:23 AM, Dejan Menges dejan.men...@gmail.com
wrote:

 Sorry, I meant to set it to $hbase.tmp.dir/local or whatever; I just reread
 what I wrote. The point is to move it out of /tmp.

 On Fri, Mar 27, 2015 at 10:21 AM Dejan Menges dejan.men...@gmail.com
 wrote:

  Hi Orlando,
 
  This is default value:
 
  property
  namehbase.local.dir/name
  value/tmp/local/value
  /property
 
   I would set it to whatever the value of hbase.tmp.dir is in your config;
   otherwise, if it stays in /tmp, you're going to have big issues.
 
  Also, check the permissions on these folders to be sure HBase can write
 to
  them.
 
  Can you give it a try?
 
 
  On Fri, Mar 27, 2015 at 10:14 AM ocassano ocass...@staff.voo.be wrote:
 
  Here is my hbase-site.conf
  hbase-site.xml
  http://apache-hbase.679495.n3.nabble.com/file/n4069700/hbase-site.xml
 
  Do you see something wrong in it?
 
  Thanks for your help.
 
  Orlando
 
 
 
  --
  View this message in context: http://apache-hbase.679495.n3.
  nabble.com/Regionserver-won-t-start-tp4069642p4069700.html
  Sent from the HBase User mailing list archive at Nabble.com.
 
 



Re: oldWALs: what it is and how can I clean it?

2015-02-26 Thread Enis Söztutar
@Madeleine,

The folder gets cleaned regularly by a chore in the master. When a WAL file is
not needed any more for recovery purposes (when HBase can guarantee that all
the data in the WAL file has been flushed), it is moved to the oldWALs folder
for archival. The log stays there until all other references to the WAL file
are finished. There are currently two services which may keep the files in the
archive dir. The first is a TTL process, which ensures that the WAL files are
kept for at least 10 minutes. This is mainly for debugging. You can reduce this
time by setting the hbase.master.logcleaner.ttl configuration property in the
master. It is 600000 (milliseconds, i.e. 10 minutes) by default. The other one
is replication. If you have replication set up, the replication processes will
hang on to the WAL files until they are replicated. Even if you disable
replication, the files are still referenced.

You can look at the master logs from the classes (LogCleaner,
TimeToLiveLogCleaner, ReplicationLogCleaner) to see whether the master is
actually running this chore and whether it is getting any exceptions.

@Liam,
Disabled replication will still hold on to the WAL files because it has a
guarantee not to lose data between disable and enable. You can remove_peer,
which frees up the WAL files to be eligible for deletion. When you re-add the
replication peer again, replication will start from the current state, whereas
if you re-enable a peer, it will continue from where it left off.
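
As a sketch (0.98-era API; the peer id "1" is an assumption, see list_peers in
the shell for the real ids), removing a peer programmatically looks like:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.replication.ReplicationAdmin;

  public class DropPeer {
    public static void main(String[] args) throws Exception {
      Configuration conf = HBaseConfiguration.create();
      ReplicationAdmin replicationAdmin = new ReplicationAdmin(conf);
      try {
        // Frees the archived WALs held for this peer; they become eligible
        // for deletion once the TTL cleaner runs.
        replicationAdmin.removePeer("1");
      } finally {
        replicationAdmin.close();
      }
    }
  }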



On Thu, Feb 26, 2015 at 12:56 AM, Madeleine Piffaretti 
mpiffare...@powerspace.com wrote:

 Hi,

 The replication is not turned on HBase...
 Does this folder should be clean regularly? Because I have data from
 december 2014...


 2015-02-26 1:40 GMT+01:00 Liam Slusser lslus...@gmail.com:

  I'm having this same problem.  I had replication enabled but have since
  been disabled.  However oldWALs still grows.  There are so many files in
  there that running hadoop fs -ls /hbase/oldWALs runs out of memory.
 
  On Wed, Feb 25, 2015 at 9:27 AM, Nishanth S nishanth.2...@gmail.com
  wrote:
 
   Do you have replication turned on in hbase and  if so is your slave
consuming the replicated data?.
  
   -Nishanth
  
   On Wed, Feb 25, 2015 at 10:19 AM, Madeleine Piffaretti 
   mpiffare...@powerspace.com wrote:
  
Hi all,
   
We are running out of space in our small hadoop cluster so I was
  checking
disk usage on HDFS and I saw that most of the space was occupied by
  the*
/hbase/oldWALs* folder.
   
I have checked in the HBase Definitive Book and others books,
  web-site
and I have also search my issue on google but I didn't find a proper
response...
   
So I would like to know what does this folder, what is use for and
 also
   how
can I free space from this folder without breaking everything...
   
   
If it's related to a specific version... our cluster is under
5.3.0-1.cdh5.3.0.p0.30 from cloudera (hbase 0.98.6).
   
Thx for your help!
   
  
 



[ANNOUNCE] Apache HBase 1.0.0 is now available for download

2015-02-24 Thread Enis Söztutar
The HBase Team is pleased to announce the immediate release of HBase 1.0.0.
Download it from your favorite Apache mirror [1] or maven repository.

HBase 1.0.0 is the next stable release, and the start of semantic
versioned
releases (See [2]).

The 1.0.0 release has three goals:
1) to lay a stable foundation for future 1.x releases;
2) to stabilize running HBase cluster and its clients; and
3) make versioning and compatibility dimensions explicit

Including previous (developer preview) 0.99.x releases, 1.0.0 contains over
1500
jiras resolved on top of 0.98.0. Some of the major changes are:

API reorganization and changes
  HBase’s client level API has evolved over the years. To simplify the
  semantics and to make it extensible and easier to use in the future, we
  revisited the API before 1.0. To that end, 1.0.0 introduces
  new APIs, and deprecates some of the commonly-used client side APIs
  (HTableInterface, HTable and HBaseAdmin).

  We advise you to update your application to use the new style of APIs, since
  deprecated APIs will be removed in future releases (2.x). See [3] and [4]
  for an overview of changes.
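
  As an illustration (a minimal sketch; the table and row names below are
  placeholders), the new style replaces HTable/HBaseAdmin with an explicit
  Connection from which light-weight Table instances are obtained:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class NewApiExample {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        // The Connection is heavy-weight and explicitly managed by the caller.
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("t1"))) {
          table.get(new Get(Bytes.toBytes("row1")));
        }
      }
    }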

  All client-side APIs are marked with the InterfaceAudience.Public annotation,
  indicating that the class/method is an official client API for HBase
  (see “11.1.1. HBase API Surface” in the HBase Refguide [2] for more details
  on the Audience annotations). Going forward, all 1.x releases are planned to
  be API compatible for classes annotated as client public.

Read availability using timeline consistent region replicas
  This release contains Phase 1 items for experimental Read availability
using
  timeline consistent region replicas feature. A region can be hosted in
  multiple region servers in read-only mode. One of the replicas for the
region
  will be primary, accepting writes, and other replicas will be sharing the
same
  data files. Read requests can be done against any replica for the region
with
  backup RPCs for high availability with timeline consistency guarantees.
More
  information can be found at HBASE-10070.
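
  A sketch of a timeline-consistent read with the client APIs from
  HBASE-10070 (the table and row names are placeholders):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Consistency;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class TimelineRead {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("t1"))) {
          Get get = new Get(Bytes.toBytes("row1"));
          get.setConsistency(Consistency.TIMELINE); // read from any replica
          Result result = table.get(get);
          // isStale() tells whether a secondary (possibly lagging) replica
          // served the read.
          System.out.println("stale: " + result.isStale());
        }
      }
    }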

Online config change and other forward ports from 0.89-fb branch
  HBASE-12147 forward ported online config change which enables some of the
  configuration from the server to be reloaded without restarting the region
  servers.

Master runs a Region Server as well
  Starting with 1.0.0, the HBase master server and backup master servers
will
  also act as a region server. RPC port and info port for web UI is shared
for
  the master and region server roles. Active master can host regions of
  defined tables if configured (disabled by default). Backup masters will
not
  host regions.

Other notable improvements in 1.0.0 are listed (but not limited to) below:
 - A new web skin in time for 1.0 (http://hbase.apache.org)
 - [HBASE-5349]  - Automatic tuning of global memstore and block cache sizes
 - Various security, tags and visibility labels improvements
 - Bucket cache improvements (usability and compressed data blocks)
 - [HBASE-11367] - A new pluggable replication endpoint to plug in to
HBase's
   inter-cluster replication to replicate to a custom data store
 - [HBASE-11885] - A Dockerfile to easily build and run HBase from source
 - [HBASE-8332]  - Truncate table command
 - [HBASE-11059] - Region assignment to use hbase:meta table instead of
   zookeeper for faster region assignment (disabled by default)
 - Extensive documentation improvements
 - [HBASE-12511] - namespace permissions - add support from table creation
   privilege in a namespace 'C'
 - [HBASE-12568] - Adopt Semantic Versioning and document it in the book
 - [HBASE-12640] - Add Thrift-over-HTTPS and doAs support for Thrift Server
 - [HBASE-12651] - Backport HBASE-12559 'Provide LoadBalancer with online
   configuration capability' to branch-1
 - [HBASE-10560] - Per cell TTLs
 - [HBASE-11997] - CopyTable with bulkload
 - [HBASE-11990] - Make setting the start and stop row for a specific prefix
   easier
 - [HBASE-12220] - Add hedgedReads and hedgedReadWins metrics
 - [HBASE-12032] - Script to stop regionservers via RPC
 - [HBASE-11907] - Use the joni byte[] regex engine in place of j.u.regex in
   RegexStringComparator
 - [HBASE-11796] - Add client support for atomic checkAndMutate
 - [HBASE-11804] - Raise default heap size if unspecified
 - [HBASE-12126] - Region server coprocessor endpoint
 - [HBASE-12075] - Preemptive Fast Fail
 - [HBASE-12363] - Improve how KEEP_DELETED_CELLS works with MIN_VERSIONS
 - [HBASE-12434] - Add a command to compact all the regions in a
regionserver
 - [HBASE-8707]  - Add LongComparator for filter
 - [HBASE-12286] - [shell] Add server/cluster online load of configuration
   changes
 - [HBASE-12361] - Show data locality of region in table page
 - [HBASE-12496] - A blockedRequestsCount metric
 - [HBASE-12730] - Backport HBASE-5162 (Basic client pushback mechanism) to
   branch-1
 - [HBASE-12731] - Heap occupancy based client pushback
 - [HBASE-12728] - buffered writes 


Re: periodicFlusher get stuck

2015-02-24 Thread Enis Söztutar
Yes, this looks like HBASE-10499, but without more logs from the region
server it is hard to tell.

The next version of HDP-2.1 is already scheduled to contain HBASE-10499. If
you want, we can continue at the HDP forums.
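
In the meantime, a manual flush may unblock things (a sketch using the 0.98
client API, not from the original thread; the table name "Host" is taken from
the log below):

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.HBaseAdmin;

  public class ManualFlush {
    public static void main(String[] args) throws Exception {
      Configuration conf = HBaseConfiguration.create();
      HBaseAdmin admin = new HBaseAdmin(conf);
      try {
        // flush() accepts either a table name or a region name; flushing
        // the table covers the region named in the log.
        admin.flush("Host");
      } finally {
        admin.close();
      }
    }
  }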

Enis

On Tue, Feb 24, 2015 at 10:38 AM, Brian Jeltema 
brian.jelt...@digitalenvoy.net wrote:


  What vendor/version/release corresponds with version
  0.98.0.2.1.2.1-471-hadoop2 ? I've not seen that before.

 That’s what Ambari 1.6.0 installed when we selected HDP 2.1 (if memory
 serves).

 
  We did recently analyze and fix an issue involving the flush queue, see
  HBASE-10499 (https://issues.apache.org/jira/browse/HBASE-10499). This
 was
  released in 0.98.10. I'm not definitively saying this is your issue but
 do
   recommend an upgrade to the current Apache HBase 0.98 release, which
 is
  0.98.10.1, or contact your vendor.

 Looks promising. Thanks.

 
 
 
  On Tue, Feb 24, 2015 at 8:15 AM, Brian Jeltema 
  brian.jelt...@digitalenvoy.net wrote:
 
  I’m seeing occasional HBase log output similar to the output shown
 below.
  It appears there is a request to flush a region, repeated every 10
  seconds, that apparently is never being performed. It’s causing MR jobs
 to
  timeout because they cannot write to this region. Is this a known
 problem?
 
  hbase version 0.98.0.2.1.2.1-471-hadoop2
  hadoop version 2.4.0.2.1.2.1-471
 
 
  2015-02-23 14:51:47,612 INFO  [regionserver60020.periodicFlusher]
  regionserver.HRegionServer: regionserver60020.periodicFlusher requesting
  flush for region
  Host,\x00_m\xB8\x06,1415044750009.6ec50faa43a312cd6465d991e5984ec6.
 after a
  delay of 13758
  2015-02-23 14:51:57,611 INFO  [regionserver60020.periodicFlusher]
  regionserver.HRegionServer: regionserver60020.periodicFlusher requesting
  flush for region
  Host,\x00_m\xB8\x06,1415044750009.6ec50faa43a312cd6465d991e5984ec6.
 after a
  delay of 18080
  2015-02-23 14:52:07,611 INFO  [regionserver60020.periodicFlusher]
  regionserver.HRegionServer: regionserver60020.periodicFlusher requesting
  flush for region
  Host,\x00_m\xB8\x06,1415044750009.6ec50faa43a312cd6465d991e5984ec6.
 after a
  delay of 17701
  2015-02-23 14:52:17,612 INFO  [regionserver60020.periodicFlusher]
  regionserver.HRegionServer: regionserver60020.periodicFlusher requesting
  flush for region
  Host,\x00_m\xB8\x06,1415044750009.6ec50faa43a312cd6465d991e5984ec6.
 after a
  delay of 19090
  2015-02-23 14:52:27,616 INFO  [regionserver60020.periodicFlusher]
  regionserver.HRegionServer: regionserver60020.periodicFlusher requesting
  flush for region
  Host,\x00_m\xB8\x06,1415044750009.6ec50faa43a312cd6465d991e5984ec6.
 after a
  delay of 4042
  2015-02-23 14:52:37,615 INFO  [regionserver60020.periodicFlusher]
  regionserver.HRegionServer: regionserver60020.periodicFlusher requesting
  flush for region
  Host,\x00_m\xB8\x06,1415044750009.6ec50faa43a312cd6465d991e5984ec6.
 after a
  delay of 12968
  2015-02-23 18:12:03,307 INFO  [regionserver60020.periodicFlusher]
  regionserver.HRegionServer: regionserver60020.periodicFlusher requesting
  flush for region
  Host,\x00_m\xB8\x06,1424724136146.48d4d3fa0e02a97a8a1d9b85d5cf0162.
 after a
  delay of 10482
  2015-02-23 18:12:13,308 INFO  [regionserver60020.periodicFlusher]
  regionserver.HRegionServer: regionserver60020.periodicFlusher requesting
  flush for region
  Host,\x00_m\xB8\x06,1424724136146.48d4d3fa0e02a97a8a1d9b85d5cf0162.
 after a
  delay of 14829
  2015-02-23 19:15:13,330 INFO  [regionserver60020.periodicFlusher]
  regionserver.HRegionServer: regionserver60020.periodicFlusher requesting
  flush for region
  Host,\x00_m\xB8\x06,1424724136146.48d4d3fa0e02a97a8a1d9b85d5cf0162.
 after a
  delay of 22888
  2015-02-23 19:15:23,329 INFO  [regionserver60020.periodicFlusher]
  regionserver.HRegionServer: regionserver60020.periodicFlusher requesting
  flush for region
  Host,\x00_m\xB8\x06,1424724136146.48d4d3fa0e02a97a8a1d9b85d5cf0162.
 after a
  delay of 21081
  2015-02-23 19:15:33,329 INFO  [regionserver60020.periodicFlusher]
  regionserver.HRegionServer: regionserver60020.periodicFlusher requesting
  flush for region
  Host,\x00_m\xB8\x06,1424724136146.48d4d3fa0e02a97a8a1d9b85d5cf0162.
 after a
  delay of 6387
  2015-02-23 20:50:23,368 INFO  [regionserver60020.periodicFlusher]
  regionserver.HRegionServer: regionserver60020.periodicFlusher requesting
  flush for region
  Host,\x00_m\xB8\x06,1424724136146.48d4d3fa0e02a97a8a1d9b85d5cf0162.
 after a
  delay of 8828
 
 
 
 
  --
  Best regards,
 
- Andy
 
  Problems worthy of attack prove their worth by hitting back. - Piet Hein
  (via Tom White)




Re: HTable or HConnectionManager, how a client connect to HBase?

2015-02-18 Thread Enis Söztutar
It is a bit more complex than that. It is actually a hash of some subset of
the configuration properties. See HConnectionKey class if you want to learn
more. But the important thing is that with the new style, you do not need
to worry anything about these since there is no implicit connection
sharing. Everything is explicit now.
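
A minimal sketch of the explicit lifecycle (0.98-era API): one shared
HConnection per process, cheap per-use table handles:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.HConnection;
  import org.apache.hadoop.hbase.client.HConnectionManager;
  import org.apache.hadoop.hbase.client.HTableInterface;

  public class ExplicitConnection {
    public static void main(String[] args) throws Exception {
      Configuration conf = HBaseConfiguration.create();
      HConnection connection = HConnectionManager.createConnection(conf);
      try {
        HTableInterface table = connection.getTable("hbase_table1");
        try {
          // ... reads and writes against the table ...
        } finally {
          table.close(); // cheap; does not tear down the shared connection
        }
      } finally {
        connection.close(); // closes ZK connection, RPC sockets, thread pools
      }
    }
  }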

Enis

On Tue, Feb 17, 2015 at 11:50 PM, Serega Sheypak serega.shey...@gmail.com
wrote:

 Hi, Enis Söztutar
 You've wrote:
 You are right that the constructor new HTable(Configuration, ..) will
 share the underlying connection if same configuration object is used.

  What does "the same" mean? Is equality checked using reference equality
  (Java ==) or using the equals(Object other) method?


 2015-02-18 7:34 GMT+03:00 Enis Söztutar enis@gmail.com:

  Hi,
 
  You are right that the constructor new HTable(Configuration, ..) will
 share
  the underlying connection if same configuration object is used.
 Connection
  is a heavy weight object, that holds the zookeeper connection, rpc
 client,
  socket connections to multiple region servers, master, and the thread
 pool,
  etc. You definitely do not want to create multiple connections per
 process
  unless you know what you are doing.
 
  The model is changed, and the old way of HTable(Configuration, ..) is
  deprecated because, we want to make the Connection lifecycle management
  explicit. In the new model, an opened Connection is closed by the user
  again, and light weight Table instances are obtained from the Connection.
  Having HTable's share their connections implicitly makes reasoning about
 it
  too hard. The new model should be pretty easy to follow.
 
  Enis
 
  On Sat, Feb 14, 2015 at 6:45 AM, Liu, Ming (HPIT-GADSC) 
 ming.l...@hp.com
  wrote:
 
   Hi,
  
   I am using HBase 0.98.6.
  
    I learned from this mailing list before that the recommended method to
    'connect' to HBase from a client is to use HConnectionManager like this:
      HConnection con = HConnectionManager.createConnection(configuration);
      HTableInterface table = con.getTable("hbase_table1");
    instead of
      HTableInterface table = new HTable(configuration, "hbase_table1");
  
   I don't quite understand the reason. I was thinking that each time I
   initialize a HTable instance, it needs to create a new HConnection. And
   that is expensive. But using the first method, multiple HTable
 instances
   can share the same HConnection. That is quite reasonable to me.
    However, I read in some articles on the internet that, even if I use
    the 'new HTable(conf, tbl)' method, all the HTable instances will still
    share the same HConnection if the 'conf' object is the same one. I
    recently read yet another article which said that when using 'new
    HTable(conf, tbl)', one does not need to use the exact same 'conf' object
    (the same one in memory): if two different 'conf' objects have all the
    same attributes (for example, created from the same hbase-site.xml and
    never changed), then the HTable objects can still share the same
    HConnection. I also tried to read the HTable source code; it is very
    hard, but it seems to me the last statement is correct: 'HTable will
    share the HConnection if the configuration is all the same'.
  
    Sorry for being so verbose. My question:
    If two 'configuration' objects are the same, can two HTable objects
    instantiated with them respectively still share the same HConnection or
    not, directly using the 'new HTable()' method?
    If the answer is 'yes', then why do I still need the HConnectionManager
    to create a shared connection?
    I am talking about 0.98.6.
    I googled for days, and even tried to read the HBase source code, but
    still got really confused. I tried to do some tests as well, but since I
    am such a newbie, I don't know how to verify the difference; I really
    don't know what a HConnection does under the hood. I counted the
    ZooKeeper client requests, and I found some difference. If this
    difference in ZooKeeper requests is a correct metric, it means to me that
    two HTables do not share the HConnection even when using the same
    'configuration' in the constructor. So it confused me more and more.
  
    Please, someone kindly help me with this newbie question, and thanks in
    advance.
  
   Thanks,
   Ming
  
  
  
 



Re: HTable or HConnectionManager, how a client connect to HBase?

2015-02-17 Thread Enis Söztutar
Hi,

You are right that the constructor new HTable(Configuration, ..) will share
the underlying connection if same configuration object is used. Connection
is a heavy weight object, that holds the zookeeper connection, rpc client,
socket connections to multiple region servers, master, and the thread pool,
etc. You definitely do not want to create multiple connections per process
unless you know what you are doing.

The model is changed, and the old way of HTable(Configuration, ..) is
deprecated because, we want to make the Connection lifecycle management
explicit. In the new model, an opened Connection is closed by the user
again, and light weight Table instances are obtained from the Connection.
Having HTable's share their connections implicitly makes reasoning about it
too hard. The new model should be pretty easy to follow.

Enis

On Sat, Feb 14, 2015 at 6:45 AM, Liu, Ming (HPIT-GADSC) ming.l...@hp.com
wrote:

 Hi,

 I am using HBase 0.98.6.

  I learned from this mailing list before that the recommended method to
  'connect' to HBase from a client is to use HConnectionManager like this:
    HConnection con = HConnectionManager.createConnection(configuration);
    HTableInterface table = con.getTable("hbase_table1");
  instead of
    HTableInterface table = new HTable(configuration, "hbase_table1");

 I don't quite understand the reason. I was thinking that each time I
 initialize a HTable instance, it needs to create a new HConnection. And
 that is expensive. But using the first method, multiple HTable instances
 can share the same HConnection. That is quite reasonable to me.
  However, I read in some articles on the internet that, even if I use
  the 'new HTable(conf, tbl)' method, all the HTable instances will still
  share the same HConnection if the 'conf' object is the same one. I recently
  read yet another article which said that when using 'new HTable(conf,
  tbl)', one does not need to use the exact same 'conf' object (the same one
  in memory): if two different 'conf' objects have all the same attributes
  (for example, created from the same hbase-site.xml and never changed), then
  the HTable objects can still share the same HConnection. I also tried to
  read the HTable source code; it is very hard, but it seems to me the last
  statement is correct: 'HTable will share the HConnection if the
  configuration is all the same'.

  Sorry for being so verbose. My question:
  If two 'configuration' objects are the same, can two HTable objects
  instantiated with them respectively still share the same HConnection or
  not, directly using the 'new HTable()' method?
  If the answer is 'yes', then why do I still need the HConnectionManager to
  create a shared connection?
  I am talking about 0.98.6.
  I googled for days, and even tried to read the HBase source code, but still
  got really confused. I tried to do some tests as well, but since I am such
  a newbie, I don't know how to verify the difference; I really don't know
  what a HConnection does under the hood. I counted the ZooKeeper client
  requests, and I found some difference. If this difference in ZooKeeper
  requests is a correct metric, it means to me that two HTables do not share
  the HConnection even when using the same 'configuration' in the
  constructor. So it confused me more and more.

  Please, someone kindly help me with this newbie question, and thanks in
  advance.

 Thanks,
 Ming





Re: Region Server Info Port

2015-01-21 Thread Enis Söztutar
We do have

Admin.getClusterStatus().getLoad(serverName).getInfoServerPort() which
maybe what you want.
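
A sketch built on that call chain (1.0-era client; error handling elided):

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.ClusterStatus;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.ServerName;
  import org.apache.hadoop.hbase.client.Admin;
  import org.apache.hadoop.hbase.client.Connection;
  import org.apache.hadoop.hbase.client.ConnectionFactory;

  public class InfoPorts {
    public static void main(String[] args) throws Exception {
      Configuration conf = HBaseConfiguration.create();
      try (Connection connection = ConnectionFactory.createConnection(conf);
           Admin admin = connection.getAdmin()) {
        ClusterStatus status = admin.getClusterStatus();
        for (ServerName sn : status.getServers()) {
          System.out.println(sn + " info port: "
              + status.getLoad(sn).getInfoServerPort());
        }
      }
    }
  }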

Enis

On Wed, Jan 21, 2015 at 5:05 PM, Stack st...@duboce.net wrote:

 On Wed, Jan 21, 2015 at 4:49 PM, Stack st...@duboce.net wrote:

  On Wed, Jan 21, 2015 at 3:03 PM, Talat Uyarer ta...@uyarer.com wrote:
 
  IMHO reaching the RS info port is impossible from the client side.
 
 
  You are right.
 
  You want the Tasks as JSON?
 
 
 
  Should I make an assumption about the info port? If I do that, my code
  does not work with a pseudo-distributed installation.
 
 
 
  We should add an API that returns pb of cluster info with stuff like info
  port in it. Info port has been a bit of a bastard child; it doesn't fit
  anywhere.  Liu Shaohui did some nice work getting the info port to the
  master. What is missing is getting it over the last stretch to the
 client.
 
  Or an easier-to-parse version of the master's /dump servlet. Has stats on
 regionservers but doesn't show info port at the moment.
 St.Ack



  Thanks Talat,
  St.Ack
 
 
 
  On Jan 22, 2015 12:59 AM, Ted Yu yuzhih...@gmail.com wrote:
 
   getRegionServerInfoPort() calls
  regionServerTracker.getRegionServerInfo(sn)
   but RegionServerTracker is in hbase-server module :-(
  
   On Wed, Jan 21, 2015 at 2:48 PM, Talat Uyarer ta...@uyarer.com
 wrote:
  
Is there any abstraction for the zk connection from the client side?
   
   
   
2015-01-22 0:42 GMT+02:00 Ted Yu yuzhih...@gmail.com:
 HBaseAdmin provides this method:

   public int getMasterInfoPort() throws IOException {

 However, there is no counterpart for region server info port.

 There're two options I can think of:

 1. parse corresponding znode to extract this information

 2. parse master UI to retrieve this information

 Cheers

 On Wed, Jan 21, 2015 at 2:25 PM, Talat Uyarer ta...@uyarer.com
   wrote:

 Hi Ted,

 I am working on HBASE-4368. I try to reach the
 http://[rs-server]:[info_port]/rs-status?format=json&filter=all URL
 for getting the task list on a region server. But the RS info_port can
 change depending on the HBase installation. Is it possible to learn the
 RS's info port?

 2015-01-21 19:26 GMT+02:00 Ted Yu yuzhih...@gmail.com:
  bq. for reaching RS's webservices
 
  What information do you want to collect from region servers ?
  Have you considered using jmx ?
  See 8.3.5. JMX under
  http://hbase.apache.org/book.html#important_configurations
 
  Cheers
 
  On Wed, Jan 21, 2015 at 7:44 AM, Talat Uyarer 
 ta...@uyarer.com
wrote:
 
  Hi folks,
 
  I am trying to create a new command for the hbase client. It will show a
  process list on the RSs. I have a problem: I need the RS's info port for
  reaching the RS's web services. On the testing side I can reach it with
  TEST_UTIL.getMiniHBaseCluster.getMaster.getRegionServerInfoPort(ServerName).
  However, I cannot reach the HMaster on the client side. Is there any way of
  learning the RS info port on the client side?

  P.s. Maybe you will ask why I do not use regionserver.info.port. If HBase
  runs on a single machine in pseudo-distributed mode, that configuration
  value does not match the real info port.
 
  Thanks
 
  --
  Talat
 



 --
 Talat UYARER
 Websitesi: http://talat.uyarer.com
 Twitter: http://twitter.com/talatuyarer
 Linkedin: http://tr.linkedin.com/pub/talat-uyarer/10/142/304

   
   
   
--
Talat UYARER
Websitesi: http://talat.uyarer.com
Twitter: http://twitter.com/talatuyarer
Linkedin: http://tr.linkedin.com/pub/talat-uyarer/10/142/304
   
  
 
 
 



Re: Does 'online region merge' make regions unavailable for some time?

2015-01-21 Thread Enis Söztutar
Online in this context means the HBase cluster is online, not individual
regions. For the merge process, the regions go briefly offline, similar to
how splits work. It should be on the order of seconds.
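
For reference, a sketch of issuing the merge (0.98+ client API; the encoded
region names are placeholders to take from the master UI or hbase:meta):

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.HBaseAdmin;
  import org.apache.hadoop.hbase.util.Bytes;

  public class MergeRegions {
    public static void main(String[] args) throws Exception {
      Configuration conf = HBaseConfiguration.create();
      HBaseAdmin admin = new HBaseAdmin(conf);
      try {
        admin.mergeRegions(Bytes.toBytes("encodedNameOfRegionA"),
            Bytes.toBytes("encodedNameOfRegionB"),
            false); // false: refuse to merge non-adjacent regions
      } finally {
        admin.close();
      }
    }
  }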

Enis

On Wed, Jan 21, 2015 at 10:26 AM, Ted Yu yuzhih...@gmail.com wrote:

 Please take a look at slides 5 and 6 in this file:

 https://issues.apache.org/jira/secure/attachment/12561887/merge%20region.pdf

 It is clear that the two regions to be merged are taken offline in step 1.

 Cheers

 On Tue, Jan 20, 2015 at 5:26 PM, Otis Gospodnetic 
 otis.gospodne...@gmail.com wrote:

  Hi,
 
  Considering this is called the *online* region merge, I would assume
  regions being merged never go offline during the merge and both regions
  being merged are available for reading and writing at all times, even
   during the merge, though I don't get how writes would work if one region
   is being moved from one RS to another, so maybe this is not truly online
   and writes are either rejected or buffered/blocked until the region is
   moved AND merged? Does anyone know for sure?
 
  I see this in one of the comments:
  Q: If one (or both) of the regions were receiving non-trivial load prior
 to
  this action, would client(s) be affected ?
   A: Yes, the region would be out of service for a short time; it is
   equivalent to moving a region, e.g. when the balancer moves a region.
 
  Also took a look at the patch:
 
 
 https://issues.apache.org/jira/secure/attachment/12574965/hbase-7403-trunkv33.patch
 
  And see:
 
  +/**
  + * The merging region A has been taken out of the server's online
  regions list.
  + */
  +OFFLINED_REGION_A,
 
 
  ... and if you look for the word offline in the patch I think it's
  pretty clear that BOTH regions being merged do go offline at some
  point.  I guess it could be after the merge, too, not before
 
  ... maybe others know?
 
 
  Thanks,
  Otis
  --
  Monitoring * Alerting * Anomaly Detection * Centralized Log Management
  Solr  Elasticsearch Support * http://sematext.com/
 
 
  On Mon, Jan 19, 2015 at 4:17 AM, Vladimir Tretyakov 
  vladimir.tretya...@sematext.com wrote:
 
   Hi, I have one question about 'online region merge' (
   https://issues.apache.org/jira/browse/HBASE-7403).
    As I understand it, regions which are passed to the merge method will be
    unavailable for some time.
  
   That means:
   1. Some data will be unavailable some time.
    2. If a client tries to write data to these regions, it will get
    exceptions.
  
   Are above sentences correct?
  
    Can somebody estimate the time for which 1 and 2 will be true? Seconds,
    minutes or hours? Is there any way to avoid 1 and 2?
  
    I am asking because over time we have a problem with the number of
    regions (our key contains a timestamp): the count of regions grows
    constantly (splitting) and it becomes a cause of performance problems
    over time.
    To avoid this effect we use 2 tables:
    1. The first table we use for writing and reading data.
    2. The second we use only for reading data.

    After some time we truncate the second table and rotate these tables (the
    first becomes the second and the second becomes the first). That allows
    us to control the count of regions, but the solution looks a bit ugly. I
    looked at 'online region merge', but we can't live with the restrictions
    I've described in the first part of the question.
  
   Can somebody help with answers?
  
   Thx, Vladimir Tretyakov.
  
 



[ANNOUNCE] HBase 0.99.2 (developer preview release) is now available for download

2014-12-10 Thread Enis Söztutar
The HBase Team is pleased to announce the immediate release of HBase 0.99.2.
Download it from your favorite Apache mirror [1] or maven repository.

THIS RELEASE IS NOT INTENDED FOR PRODUCTION USE, and does not contain any
backwards or forwards compatibility guarantees (even within minor versions
of
0.99.x). Please refrain from deploying this over important data. Use latest
0.98.x release instead. HBase 0.99.2 is a developer preview release, and
an odd-numbered release as defined in [2].

0.99.2 is the last planned release from the 0.99.x line of developer preview
releases. Please use this release as a test bed for the upcoming HBase-1.0
release. Report any encountered problems or features that you think need
fixing before 1.0. This release also contains some API changes, and
deprecation of older APIs which won't be supported in 2.0 series. Please
give them a try and let us know what you think. All contributions in terms
of testing, benchmarking, checking API / source /wire compatibility,
checking out documentation and further code contribution is highly
appreciated. 1.0 will be the first series in the 1.x line of releases
which are expected to keep compatibility with previous 1.x releases. Thus
it
is very important to check the client side and server side APIs for
compatibility and maintainability concerns for future releases.

0.99.2 builds on top of all the changes that is in the 0.99.1 and 0.99.0
releases (an overview can be found at [4,5]). The theme of (eventual) 1.0
release is to become a stable base for future 1.x series of releases.
1.0 release will aim to achieve at least the same level of stability of
0.98 releases without introducing too many new features.

The work to clearly mark and differentiate client facing APIs, and redefine
some of the client interfaces for improving semantics, ease of use and
maintainability has continued in the 0.99.2 release. Remaining work can be
found in HBASE-10602. Marking/remarking of interfaces with InterfaceAudience
has also been going on (HBASE-10462), which will identify areas for
compatibility (with clients, coprocessors and dependent projects like
Phoenix) for future releases.

0.99.2 contains 190 issues fixed on top of 0.99.1. Some other notable
improvements
in this release are
 - [HBASE-12075] - Preemptive Fast Fail
 - [HBASE-12147] - Porting Online Config Change from 89-fb
 - [HBASE-12354] - Update dependencies in time for 1.0 release
 - [HBASE-12363] - Improve how KEEP_DELETED_CELLS works with MIN_VERSIONS
 - [HBASE-12434] - Add a command to compact all the regions in a
regionserver
 - [HBASE-8707] - Add LongComparator for filter
 - [HBASE-12286] - [shell] Add server/cluster online load of configuration
changes
 - [HBASE-12361] - Show data locality of region in table page
 - [HBASE-12496] - A blockedRequestsCount metric
 - Switch to using new style of client APIs internally (in a lot of places)
 - Improvements in visibility labels
 - Perf improvements
 - Some more documentation improvements
 - Numerous improvements in other areas and bug fixes.

The release has these changes in default behaviour (from 0.99.1):
 - Disabled the Distributed Log Replay feature by default. Similar to 0.98
   and earlier releases, Distributed Log Split is the default.


The list of changes in this release can be found in the release notes [3].
Thanks to everybody who contributed to this release!

ps. The release announcement was delayed by a couple of days due to some
INFRA issues.

Cheers,
The HBase Team

1. http://www.apache.org/dyn/closer.cgi/hbase/
2. https://hbase.apache.org/book/upgrading.html#hbase.versioning
3.
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310753version=12328551
4.
https://mail-archives.apache.org/mod_mbox/hbase-dev/201409.mbox/%3ccamuu0w94oarqcz2zy4zlqy_aaqn70whhh1ycs_0bjpseeec...@mail.gmail.com%3E
5.
https://mail-archives.apache.org/mod_mbox/hbase-dev/201409.mbox/%3ccamuu0w9y_+afw6ww0ha_p8kbew35b3ncshbuqacfndzs8tc...@mail.gmail.com%3E


[ANNOUNCE] HBase 0.99.0 (developer preview release) is now available for download

2014-09-22 Thread Enis Söztutar
The HBase Team is pleased to announce the immediate release of HBase 0.99.0.
Download it from your favorite Apache mirror [1] or maven repository.

THIS RELEASE IS NOT INTENDED FOR PRODUCTION USE, and does not contain any
backwards or forwards compatibility guarantees (even within minor versions
0.99.x). Please refrain from deploying this over important data. Use latest
0.98.x release instead. HBase 0.99.0 is a developer preview release, and
an odd-numbered release as defined in [2].

A series of 0.99.x releases is planned in preparation for the 1.0.0 release
which will be the next stable and supported release. Please use this
release as
a test bed for the upcoming HBase-1.0 release. Report any encountered
problems
or features that you think need fixing before 1.0. This release also
contains
some API changes, and deprecation of older APIs which won't be supported in
2.0 series. Please give them a try and let us know what you think.

The list of changes in this release can be found in the release notes [3]
and a higher level summary can be found at [4].
Thanks to everybody who contributed to this release!

Cheers,
The HBase Team

1. http://www.apache.org/dyn/closer.cgi/hbase/
2. https://hbase.apache.org/book/upgrading.html#hbase.versioning
3.
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310753version=12325675
4.
https://mail-archives.apache.org/mod_mbox/hbase-dev/201409.mbox/%3ccamuu0w9y_+afw6ww0ha_p8kbew35b3ncshbuqacfndzs8tc...@mail.gmail.com%3E


Re: Disk space leak when using HBase and HDFS ShortCircuit

2014-06-25 Thread Enis Söztutar
Agreed, this seems like an hdfs issue unless hbase itself does not close
the hfiles properly. But judging from the fact that you were able to
circumvent the problem by getting reducing the cache size, it does seem
unlikely.

I don't think the local block reader will be notified when a file/block is
deleted.

What was your dfs.client.read.shortcircuit.streams.cache.size ?

Enis


On Wed, Jun 25, 2014 at 3:29 PM, Andrew Purtell apurt...@apache.org wrote:

 Forwarded

 -- Forwarded message --
 From: Vladimir Rodionov vrodio...@carrieriq.com
 Date: Wed, Jun 25, 2014 at 12:03 PM
 Subject: RE: Disk space leak when using HBase and HDFS ShortCircuit
 To: user@hbase.apache.org user@hbase.apache.org


  Apparently those file descriptors were stored by the HDFS
  ShortCircuit cache.

 As far as I understand, this is an issue of the HDFS short-circuit-read
 implementation, not HBase. HBase uses the HDFS API to access
 files. Did you ask this question on the hdfs dev list? This looks like a very
 serious bug.

 Best regards,
 Vladimir Rodionov
 Principal Platform Engineer
 Carrier IQ, www.carrieriq.com
 e-mail: vrodio...@carrieriq.com

 
 From: Giuseppe Reina [g.re...@gmail.com]
 Sent: Wednesday, June 25, 2014 2:54 AM
 To: user@hbase.apache.org
 Subject: Disk space leak when using HBase and HDFS ShortCircuit

 Hi all,
we have been experiencing the same problem with 2 of our clusters. We
 are currently using HDP 2.1 that comes with HBase 0.98.

 The problem manifested by showing a huge differences (hundreds of GB)
 between the output of df and du of the hdfs data directories.
 Eventually, other systems complained for the lack of space before shutting
 down. We identified the problem and discovered that all the RegionServers
 were holding lots of open file descriptors to deleted files, which
 prevented the OS to free the disk space occupied (hence the difference
 between df and du). The deleted files were pointing to the local HDFS
 blocks of old HFiles deleted from HDFS during the compaction and/or split
 operations. Apparently those file descriptors were stored by the HDFS
 ShortCircuit cache.

 My question is, isn't the short-circuit feature supposed to get notified
 somehow of the deletion of a file on HDFS so it can remove the open fds
 from the cache? This creates huge leaks whenever HBase is heavily loaded
 and we had to restart the RegionServers periodically until we identified
 the problem. We solved the problem first by disabling
 shortcircuit from HDFS and then enabling it and reducing the cache size so
 to trigger often the caching policies (this leads to some performance
 loss).


 p.s. I am aware of the dfs.client.read.shortcircuit.streams.cache.expiry.ms
 parameter, but for some reason the default value (5 mins) does not
 work out-of-the-box on HDP 2.1; moreover, the problem persists for high
 timeouts and big cache sizes.

 Kind Regards




 --
 Best regards,

- Andy

 Problems worthy of attack prove their worth by hitting back. - Piet Hein
 (via Tom White)



Re: mapred/mapreduce classes in hbase-server rather than hbase-client

2014-05-15 Thread Enis Söztutar
Hi Keegan,

Unfortunately, at the time of the module split in 0.96, we could not
completely decouple mapreduce classes from the server dependencies. I think
we actually need two modules to be extracted out, one is hbase-mapreduce
(probably separate module than client module) and hbase-storage for the
storage bits. I am sure at some point that will happen.

Enis


On Tue, May 13, 2014 at 9:09 AM, Keegan Witt keeganw...@gmail.com wrote:

 Possibly this was due to HBASE-7186 or HBASE-7188.  It's especially odd
 since I don't see usages outside the mapreduce package (at least for the
 classes that were of interest to me), so there shouldn't be any issue with
 changing the artifact the package is deployed in.
 Is this more a question for the dev list?

 -Keegan


 On Thu, May 1, 2014 at 10:59 AM, Keegan Witt keeganw...@gmail.com wrote:

  It looks like maybe as part of HBASE-4336, classes under the mapred and
  mapreduce package are now deployed in the hbase-server artifact.
  Wouldn't
  it make more sense to have these deployed in hbase-client?  hbase-server
 is
  a pretty big artifact to pull down to get access to TableOutputFormat,
 for
  example.
 
  If this makes sense, I can open a Jira.  I just thought I'd see if
 someone
  could explain the rationale first.  Thanks!
 
  -Keegan
 



Re: [VOTE] The 1st HBase 0.98.2 release candidate (RC0) is available

2014-05-13 Thread Enis Söztutar
Sorry I could not get on this sooner. I do typically run a list of tests
from my checklist to verify the release, which is why I sometimes do not
vote on releases. Let me do the dutiful for this.

Agreed with the general sentiment. Please reconsider.

Enis


On Mon, May 12, 2014 at 3:59 PM, Stack st...@duboce.net wrote:

 On Sun, May 11, 2014 at 7:54 PM, Andrew Purtell andrew.purt...@gmail.com
 wrote:

  Actually I do resign as RM for 0.98 effective immediately.
 
 
 Please reconsider.  Innocent mistake.



  I've not seen the community before disinterested in a release
 sufficiently
  that not even simple package verification is time worth spending.
 
 
 As a community we are notoriously bad at voting on releases.  Last week was
 particularly distracting (hbasecon, hackathon, mail outage).

 St.Ack



Re: Fw: [VOTE] The 1st HBase 0.98.2 release candidate (RC0) is available

2014-05-12 Thread Enis Söztutar
Here is my late +1 for the release.

 - Checked checksums
 - Checked gpg signature (gpg --verify )
 - Checked included documentation book.html, etc.
 - Running unit tests
 - Started in local mode, run LoadTestTool
 - checked hadoop libs in h1 / h2
 - build src with both hadoop 1 and 2
 - checked directory layouts
 - run local cluster
 - run smoke tests with shell on the artifacts
 - run tests locally on binaries
  -- bin/hbase org.apache.hadoop.hbase.util.LoadTestTool -write 10:10:100 -num_keys 100 -read 100:30

Enis


On Tue, May 13, 2014 at 12:19 AM, lars hofhansl la...@apache.org wrote:

 Resending... Looks like first attempt did not go through (still Apache
 email issues?).



 - Forwarded Message -
 From: lars hofhansl la...@apache.org
 To: d...@hbase.apache.org d...@hbase.apache.org
 Cc: user@hbase.apache.org user@hbase.apache.org
 Sent: Monday, May 12, 2014 11:01 AM
 Subject: Re: [VOTE] The 1st HBase 0.98.2 release candidate (RC0) is
 available



 Getting folks to test releases is like pulling teeth, or herding cats, or
 getting three kids to agree which song from Frozen is the best :)

 All jokes aside, this *is* a problem. With the goal of more frequent
 releases release verification needs to be a smooth process.
 I do not know how to fix it, but if we all could just remember to test a
 release (and a simple verification does not take much time) that'd help.

 I take some of the blame. And I'll weasel an excuse: HBaseCon. :)

 Please reconsider Andy. It's not always fun, but it's important service to
 the community.


 -- Lars



 
  From: Andrew Purtell andrew.purt...@gmail.com
 To: d...@hbase.apache.org d...@hbase.apache.org
 Cc: user@hbase.apache.org user@hbase.apache.org
 Sent: Sunday, May 11, 2014 7:54 PM
 Subject: Re: [VOTE] The 1st HBase 0.98.2 release candidate (RC0) is
 available


 Actually I do resign as RM for 0.98 effective
  immediately.

 I've not seen the community before disinterested in a release sufficiently
 that not even simple package verification is time worth spending.


  On May 12, 2014, at 10:47 AM, Andrew Purtell andrew.purt...@gmail.com
 wrote:
 
  Too late I already put out the artifacts. I thought the release vote is
 lazy consensus. If you would like me to resign as RM for 0.98 I will.
 
  On May 12, 2014, at 9:17 AM, Todd Lipcon t...@cloudera.com wrote:
 
  Hey Andrew,
 
 
  Sorry for the late email here, but -- I believe releases need at least
  three +1 votes from PMC members[1]. Perhaps people were a bit busy last
  week due to HBaseCon and we should extend the voting on this release
  candidate for another week?
 
  [1] http://www.apache.org/foundation/voting.html
 
  -Todd
 
 
  On Fri, May 9, 2014 at 12:14 AM, Andrew Purtell apurt...@apache.org
 wrote:
 
  With one +1 and no 0 or -1 votes, this vote has passed. I have sent the
  artifacts onward for mirroring and will announce in ~24 hours.
 
 
  On Wed, May 7, 2014 at 10:04 AM, Andrew Purtell apurt...@apache.org
  wrote:
 
  +1
 
  Unit test suite passes 100% 25 times out of 25 runs on Java 6u43,
 7u45,
  and 8u0.
 
  Cluster testing looks good with LoadTestTool, and YCSB.
 
  An informal performance test on a small cluster comparing 0.98.0 and
  0.98.2 indicates no serious perf regressions.
  See email to dev@ titled
  Comparing the performance of 0.98.2 RC0 and 0.98.0 using YCSB.
 
 
  On Wed, Apr 30, 2014 at 8:50 PM, Andrew Purtell apurt...@apache.org
  wrote:
 
  The 1st HBase 0.98.2 release candidate (RC0) is available for
 download
  at
  http://people.apache.org/~apurtell/0.98.2RC0/ and Maven artifacts
 are
  also available in the temporary repository
 
  https://repository.apache.org/content/repositories/orgapachehbase-1020.
 
  Signed with my code signing key D5365CCD.
 
  The issues resolved in this release can be found here:
 
 https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310753version=12326505
 
 
  Please try out the candidate and vote +1/-1 by midnight Pacific Time
  (00:00 -0800 GMT) on May 7 on whether or not we should release this
 as
  0.98.2.
 
  --
  Best regards,
 
  ​- Andy​
  ​
 
  Problems worthy of attack prove their worth by hitting back. - Piet
 Hein
  (via Tom White)
 
 
 
  --
  Best regards,
 
   - Andy
 
  Problems worthy of attack prove their worth by hitting back. - Piet
 Hein
  (via Tom White)
 
 
 
  --
  Best regards,
 
   - Andy
 
  Problems worthy of attack prove their worth by hitting back. - Piet
 Hein
  (via Tom White)
 
 
 
  --
  Todd Lipcon
  Software Engineer, Cloudera



Re: hbase replication for higher availability

2014-04-07 Thread Enis Söztutar
Hey,

Indeed it is a viable approach. Some of the HBase deployments use the
master-master replication model across DCs. The consistency semantics will
obviously depend on the application use case.

However, there is no out-of-the-box client to do across-DC requests. But
wrapping the HBase client in a higher-level layer should give you that
possibility.

On the other hand, we are adding highly available reads within the DC in the
issue HBASE-10070. You can track the development there.
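
A sketch of the wrapping approach (0.98-era API; the fallback policy and both
Configuration objects are assumptions):

  import java.io.IOException;
  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.client.Get;
  import org.apache.hadoop.hbase.client.HConnection;
  import org.apache.hadoop.hbase.client.HConnectionManager;
  import org.apache.hadoop.hbase.client.HTableInterface;
  import org.apache.hadoop.hbase.client.Result;

  public class FailoverReader {
    private final HConnection primary;
    private final HConnection secondary;

    public FailoverReader(Configuration primaryConf, Configuration secondaryConf)
        throws IOException {
      this.primary = HConnectionManager.createConnection(primaryConf);
      this.secondary = HConnectionManager.createConnection(secondaryConf);
    }

    public Result get(String tableName, Get get) throws IOException {
      try {
        return doGet(primary, tableName, get);
      } catch (IOException e) {
        // Primary read failed (e.g. region server failover in progress);
        // serve possibly stale data from the replicated cluster instead.
        return doGet(secondary, tableName, get);
      }
    }

    private static Result doGet(HConnection conn, String tableName, Get get)
        throws IOException {
      HTableInterface table = conn.getTable(tableName);
      try {
        return table.get(get);
      } finally {
        table.close();
      }
    }
  }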

Cheers,
Enis


On Mon, Mar 31, 2014 at 3:37 PM, Jeff Storey storey.j...@gmail.com wrote:

 Thank you for the input.


 On Sun, Mar 30, 2014 at 8:10 PM, Vladimir Rodionov
 vrodio...@carrieriq.comwrote:

  It can be a viable approach if you can keep the replication lag under control.
 
   I'm not sure how the java api deals with reading from a region server
  that
   is in the process of failing over? Is there a way to detect that?
 
  Do two reads in sequence:
 
  1. Read the primary cluster.
  2. Read the secondary if 1. exceeds your timeout.
 
  Best regards,
  Vladimir Rodionov
  Principal Platform Engineer
  Carrier IQ, www.carrieriq.com
  e-mail: vrodio...@carrieriq.com
 
  
  From: Jeff Storey [storey.j...@gmail.com]
  Sent: Sunday, March 30, 2014 2:31 PM
  To: user@hbase.apache.org
  Subject: hbase replication for higher availability
 
  In evaluating strategies for minimizing downtime when a region server
  fails, in addition to the common approaches such as lowering the
 zookeeper
  timeout, is it possible to use replication to improve availability (at
 the
  cost of consistency) for reads?
 
  I'm still getting more familiar with the HBASE api, but my thought would
 be
  to do something like:
 
  - attempt read from the primary cluster
  - if read fails because of downed region server, read from slave cluster
  (understanding that the read may be a little bit stale)
 
  I wouldn't expect this to happen too frequently, but in a case where I
  would rather return slightly stale data rather than no data, is this a
  viable approach?
 
  I'm not sure how the java api deals with reading from a region server
 that
  is in the process of failing over? Is there a way to detect that?
 
  Thanks for the help.
 
  Confidentiality Notice:  The information contained in this message,
  including any attachments hereto, may be confidential and is intended to
 be
  read only by the individual or entity to whom this message is addressed.
 If
  the reader of this message is not the intended recipient or an agent or
  designee of the intended recipient, please note that any review, use,
  disclosure or distribution of this message or its attachments, in any
 form,
  is strictly prohibited.  If you have received this message in error,
 please
  immediately notify the sender and/or notificati...@carrieriq.com and
  delete or destroy any copy of this message and its attachments.
 



[ANNOUNCE] Next releases of HBase will drop Hadoop-1.x support

2014-03-06 Thread Enis Söztutar
Hi,

In the dev thread [1], HBase developers are considering dropping support
for Hadoop-1 in the future releases 0.99 and HBase-1.0. This is a heads-up, so
that you can plan ahead if you are choosing to go with 0.96.x and 0.98.x
releases.

Hadoop-2.2 was released last October, and it is superior in every way to
Hadoop-1, so it is recommended that users switch to Hadoop-2.

The support matrix for Hadoop versions can be found in [2], and future
discussion in [3].

[1]
http://mail-archives.apache.org/mod_mbox/hbase-dev/201403.mbox/%3ccamuu0w93mgp7zbbxgccov+be3etmkvn5atzowvzqd_gegdk...@mail.gmail.com%3E


[2] http://hbase.apache.org/book/configuration.html#hadoop

[3] https://issues.apache.org/jira/browse/HBASE-10690

Cheers,
Enis


Re: org.apache.hadoop.hbase.ipc.SecureRpcEngine class not found in HBase jar

2014-03-05 Thread Enis Söztutar
Indeed we need some pom mockery to be able to do that. It would be good
though.

Some history:
https://issues.apache.org/jira/browse/HBASE-5341
and related
https://issues.apache.org/jira/browse/HBASE-6929

Enis



On Tue, Mar 4, 2014 at 5:03 PM, James Taylor giacomotay...@gmail.comwrote:

 That'd be good if the HBase team could put the secure build in maven too.
 Thanks,

 James


 On Tue, Mar 4, 2014 at 4:33 PM, lars hofhansl la...@apache.org wrote:

  Might be better to push the secure build to maven. No disadvantage in
  doing so. Not sure if there's any maven blackmagic missing/needed.
 
  -- Lars
 
--
   *From:* anil gupta anilgupt...@gmail.com
  *To:* user@hbase.apache.org user@hbase.apache.org
  *Cc:* James Taylor giacomotay...@gmail.com
  *Sent:* Tuesday, March 4, 2014 10:48 AM
  *Subject:* Re: org.apache.hadoop.hbase.ipc.SecureRpcEngine class not
  found in HBase jar
 
  Thanks for the reply.
 
  Since the HBase security jar is not published in the Maven repo, I am
  running into a problem with enhancing the JDBC connection of Phoenix (
  https://issues.apache.org/jira/browse/PHOENIX-19) to support connecting to
  a secure HBase cluster.
  Is there any particular reason why we don't publish the security jar of
  HBase?

  I have been using cdh4.5 and that has hbase security. For Phoenix, I don't
  think I can reference Cloudera stuff. If we cannot publish the security jar
  in the Maven repo then Phoenix might have to build hbase with the flag that
  Gary mentioned.
 
  Thanks,
  Anil Gupta
 
 
  On Tue, Mar 4, 2014 at 10:40 AM, Gary Helmling ghelml...@gmail.com
  wrote:
 
   For HBase 0.94, you need a version of HBase built with the security
   profile to get SecureRpcEngine and other security classes.  I'm not
 sure
   that the published releases on maven central actually include this.
  
   However, it's easy to build yourself: just add -Psecurity to the mvn
   command line to get the security profile.
  
   For HBase 0.96+ this is no longer necessary, as the security classes
 are
   now part of the main build.
  
  
   On Tue, Mar 4, 2014 at 10:02 AM, anil gupta anilgupt...@gmail.com
  wrote:
  
Hi All,
   
If i create a maven project with the following maven dependency then
  the
HBase jar doesn't have org.apache.hadoop.hbase.ipc.SecureRpcEngine
  class.
 <dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase</artifactId>
    <version>0.94.12</version>
 </dependency>
   
 SecureRPCEngine class is used when the cluster is secured. Is there any
 other maven dependency I need to use to get that class?
   
--
Thanks  Regards,
Anil Gupta
 
   
  
 
 
 
  --
  Thanks  Regards,
  Anil Gupta
 
 
 



Re: Help: HMaster getting Aborted at startup!

2014-03-05 Thread Enis Söztutar
From the log message, it seems that you did not run the upgrade.

Can you try with the instructions Ted sent?

Enis


On Wed, Mar 5, 2014 at 12:39 AM, Richard Chen cxd3...@gmail.com wrote:

 Thanks for the suggestion. But when I looked at the JIRA, it shows

- Fix Version/s: 0.98.0
  (https://issues.apache.org/jira/browse/HBASE/fixforversion/12323143)
  and 0.96.0
  (https://issues.apache.org/jira/browse/HBASE/fixforversion/12324822)


 So why do I need to apply the patch in 0.98.0, which supposedly has the
 fixes in it already? And 0.98.0 still shows the same error during
 start-up...weird...

 BTW, aren't the stable releases passing all the tests including integration
 tests? Why do we still see such issues in stable releases?

 Thanks,
 Richard

 On Wed, Mar 5, 2014 at 11:36 AM, Rabbit's Foot
 rabbitsf...@is-land.com.twwrote:

  You can try to refering the following link
 
  https://issues.apache.org/jira/browse/HBASE-9278
 
  https://issues.apache.org/jira/browse/HBASE-9497
 
 
  2014-03-05 10:47 GMT+08:00 Richard Chen cxd3...@gmail.com:
 
   Hmaster getting aborted after starting ./start-hbase.sh in hbase-0.96.0
   with hadoop 2.2.0.
  
   Tried with hbase-0.94.16 and hbase-0.98 but same result. Hmaster aborts
  as
   soon as it starts. Even tried with replacing jars in hbase lib manually
  as
   well as using maven but the issue is unresolved. Is there any other
   solution?
  
   Below is the corresponding hbase-hadoop-master-hadoop-master.log...
  
   2014-02-24 10:11:27,078 INFO
   [Replication.RpcServer.handler=2,port=6] ipc.RpcServer:
   Replication.RpcServer.handler=2,port=6: starting
   2014-02-24 10:11:27,565 INFO  [RpcServer.handler=23,port=6]
   ipc.RpcServer: RpcServer.handler=23,port=6: starting
   2014-02-24 10:11:27,970 INFO  [master:hadoop-master:6]
   mortbay.log: Logging to
   org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via
   org.mortbay.log.Slf4jLog
   2014-02-24 10:11:28,172 INFO  [master:hadoop-master:6]
   http.HttpServer: Added global filter 'safety'
   (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
   2014-02-24 10:11:28,177 INFO  [master:hadoop-master:6]
   http.HttpServer: Added filter static_user_filter
   (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter)
   to context master
   2014-02-24 10:11:28,177 INFO  [master:hadoop-master:6]
   http.HttpServer: Added filter static_user_filter
   (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter)
   to context static
   2014-02-24 10:11:28,191 INFO  [master:hadoop-master:6]
   http.HttpServer: Jetty bound to port 60010
   2014-02-24 10:11:28,191 INFO  [master:hadoop-master:6]
   mortbay.log: jetty-6.1.26
   2014-02-24 10:11:29,227 INFO  [master:hadoop-master:6]
   mortbay.log: Started SelectChannelConnector@0.0.0.0:60010
   2014-02-24 10:11:29,623 INFO  [master:hadoop-master:6]
   master.ActiveMasterManager: Registered Active
   Master=hadoop-master.payoda.com,6,1393236677609
   2014-02-24 10:11:29,629 INFO  [master:hadoop-master:6]
   Configuration.deprecation: fs.default.name is deprecated. Instead, use
   fs.defaultFS
   2014-02-24 10:11:29,851 DEBUG [main-EventThread]
   master.ActiveMasterManager: A master is now available
   2014-02-24 10:11:30,537 INFO  [master:hadoop-master:6]
   Configuration.deprecation: hadoop.native.lib is deprecated. Instead,
   use io.native.lib.available
   2014-02-24 10:11:30,800 DEBUG [master:hadoop-master:6]
   util.FSTableDescriptors: Current tableInfoPath =
  
  
 
 hdfs://hadoop-master:9000/hbase/data/hbase/meta/.tabledesc/.tableinfo.01
   2014-02-24 10:11:30,821 DEBUG [master:hadoop-master:6]
   util.FSTableDescriptors: TableInfo already exists.. Skipping creation
   2014-02-24 10:11:30,944 INFO  [master:hadoop-master:6]
   fs.HFileSystem: Added intercepting call to namenode#getBlockLocations
   so can do block reordering using class class
   org.apache.hadoop.hbase.fs.HFileSystem$ReorderWALBlocks
   2014-02-24 10:11:30,950 INFO  [master:hadoop-master:6]
   master.SplitLogManager: Timeout=12, unassigned timeout=18,
   distributedLogReplay=false
   2014-02-24 10:11:30,956 INFO  [master:hadoop-master:6]
   master.SplitLogManager: Found 0 orphan tasks and 0 rescan nodes
   2014-02-24 10:11:31,000 INFO  [master:hadoop-master:6]
   zookeeper.ZooKeeper: Initiating client connection,
   connectString=192.168.14.35:2181 sessionTimeout=9
   watcher=hconnection-0x4a867fad
   2014-02-24 10:11:31,012 INFO
   [master:hadoop-master:6-SendThread(hadoop-master.payoda.com:2181)]
   zookeeper.ClientCnxn: Opening socket connection to server
   hadoop-master.payoda.com/192.168.14.35:2181. Will not attempt to
   authenticate using SASL (Unable to locate a login configuration)
   2014-02-24 10:11:31,617 INFO
   [master:hadoop-master:6-SendThread(hadoop-master.payoda.com:2181)]
   zookeeper.ClientCnxn: Socket connection established to
   

Re: windows client and ignoring winutils.exe ??!!

2014-02-21 Thread Enis Söztutar
winutils.exe is the native implementation for some of the utilities that
hdfs and hadoop clients require. For the hbase client and hadoop client to
work properly, you need to have it installed.
You can build it locally: http://wiki.apache.org/hadoop/Hadoop2OnWindows.

Enis
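
For completeness, a minimal sketch of wiring this up from a Windows client
(the hadoop.home.dir path and the ZooKeeper host are hypothetical;
winutils.exe must actually exist under that directory's bin\ folder -- the
property only tells the Hadoop client libraries where to look):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HBaseAdmin;

    public class WindowsClientBootstrap {
      public static void main(String[] args) throws Exception {
        // Must be set before the first Hadoop class initializes.
        System.setProperty("hadoop.home.dir", "C:\\hadoop-2.2.0");
        Configuration conf = HBaseConfiguration.create();
        conf.set("hbase.zookeeper.quorum", "linux-hbase-host"); // hypothetical
        HBaseAdmin admin = new HBaseAdmin(conf);
        System.out.println("master running: " + admin.isMasterRunning());
        admin.close();
      }
    }

Setting the HADOOP_HOME environment variable to the same directory works as
well.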


On Fri, Feb 21, 2014 at 11:00 AM, shapoor esmaili_...@yahoo.com wrote:

 Hello everyone,
 I installed hbase-0.96.1.1-hadoop2 using hadoop-2.2.0. The installation
 is on a Linux machine. I am, however, trying to access it from a Windows
 client using Java and Eclipse. It always worked with hbase-0.94. But now I
 get the following exception and my HBaseAdmin cannot be initialized; it
 stays null.

 Could not locate executable
 C:\Users\shapoor\sources\hadoop-2.2.0\bin\winutils.exe in the Hadoop
 binaries.

 I don't have an installation in windows. How can I ignore this?

 regards,



 --
 View this message in context:
 http://apache-hbase.679495.n3.nabble.com/windows-client-and-ignoring-winutils-exe-tp4056227.html
 Sent from the HBase User mailing list archive at Nabble.com.



Re: Regarding Hardware configuration for HBase cluster

2014-02-11 Thread Enis Söztutar
We've also recently updated
http://hbase.apache.org/book/ops.capacity.html which contains similar
numbers, and some more details on the items to consider for sizing.

Enis
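
As a back-of-envelope cross-check of the sizing replies quoted below, using
the 3TB / 3B records per day figure mentioned in this thread and the
~100-server estimate:

    3 TB/day     / 100 servers  =  30 GB/day per server
    3 B rows/day / 100 servers  =  30 M rows/day per server
    30 GB/day    / 86,400 s    ~=  0.35 MB/s sustained write load per server

The sustained average is low; it is the ingest peaks, compactions, and read
load that the per-server limits quoted below are guarding against.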



On Sat, Feb 8, 2014 at 10:12 PM, Ramu M S ramu.ma...@gmail.com wrote:

 Thanks Lars.

 We were in the process of building our HBase cluster. Much smaller size
 though. This discussion helped a lot to us as well.

 Regards,
 Ramu
 On Feb 9, 2014 11:06 AM, lars hofhansl la...@apache.org wrote:

  In a year or two you won't be able to buy 1T or even 2T disks cheaply.
  More spindles are good; more cores are good too. This is a fuzzy art.
 
  A hard fact is that HBase cannot (at the moment) handle more than 8-10T
  per server; beyond that you'd just have extra disks for IOPS.
  You won't be happy if you expect each server to store 24T.
 
  I would go with more and smaller servers. Some people run two
  RegionServers on a single machine, but that is not a well explored option
  at this point (up to recently it needed an HBase patch to work).
 
  You *definitely* have to do some benchmarking with your usecase. You
 might
  be able to get away with fewer servers, you need to test for that.
 
  -- Lars
 
 
 
 
  
   From: Ramu M S ramu.ma...@gmail.com
  To: user@hbase.apache.org
  Sent: Saturday, February 8, 2014 12:10 AM
  Subject: Re: Regarding Hardware configuration for HBase cluster
 
 
  Lars,
 
   What about high-density storage servers that have a capacity of up to 24
   drives? There were also some recommendations in a few blogs about having 1
   core per disk.

   1TB disks have a slight price difference compared to 600 GB; with
   negotiations it'll be as low as $50. Also the price difference between
   8-core and 12-core processors is very small, $200-300.
 
  Do you think having 20-24 cores and 24 1TB disks will also be an option?
 
  Regards,
  Ramu
 
  On Feb 8, 2014 11:19 AM, lars hofhansl la...@apache.org wrote:
 
   Let's not refer to our users in the third person. It's not polite :)
  
   Suresh,
  
   I wrote something up about RegionServer sizing here:
  
 
 http://hadoop-hbase.blogspot.com/2013/01/hbase-region-server-memory-sizing.html
  
   For your load I would guess that you'd need about 100 servers.
  
   That would:
   1. have 8TB/server
   2. 30m rows/day/server
   3. 30GB/day/server
  
   You should not expect a single server to be able to absorb more than
  1rows/s
   or 40mb/s, whichever is less.
  
   The machines I'd size as follows:
   12-16 cores, HT, 1.8GHz-2.4GHz (more is better)
   32-96GB ram
   6-12 drives (more spindles are better to absorb the write load)
   10ge NICs and TopOfRack switches
  
   Now, this is only a *rough guideline* and obviously you'd have to perform
   your own tests, and this would only scale across the machines if your
   keys are sufficiently distributed.
   The details also depend on how compressible your data is and your exact
   access patterns (read patterns, spiky write load, etc.)
   Start with 10 data nodes and appropriately scaled down load and see how
  it
   works.
  
   Vladimir is right here, you probably want to seek professional help.
  
   -- Lars
  
  
  
  
   
From: Vladimir Rodionov vrodio...@carrieriq.com
   To: user@hbase.apache.org user@hbase.apache.org
   Sent: Friday, February 7, 2014 10:29 AM
   Subject: RE: Regarding Hardware configuration for HBase cluster
  
  
    This guy is building a system at the scale of Yahoo and asking the user
    group how to size the cluster.
    Few people here can give him advice based on their experience and I am
    not one of them. I can only speculate on how many nodes we would need to
    consume 3TB/3B records daily.
  
    For a system of this scale it's better to go to Cloudera/IBM/HW, and not
    to try to build it yourself,
    especially when you ask questions on the user group (rather than answer
    them).
  
   Best regards,
   Vladimir Rodionov
   Principal Platform Engineer
   Carrier IQ, www.carrieriq.com
   e-mail: vrodio...@carrieriq.com
  
   
  
   From: Ted Yu [yuzhih...@gmail.com]
   Sent: Friday, February 07, 2014 6:27 AM
   To: user@hbase.apache.org
   Cc: user@hbase.apache.org
   Subject: Re: Regarding Hardware configuration for HBase cluster
  
   Have you read http://www.slideshare.net/larsgeorge/hbase-sizing-notes?
  
   Cheers
  
   On Feb 6, 2014, at 8:47 PM, suresh babu bigdatac...@gmail.com wrote:
  
Hi Stana,
   
 We are trying to find out how many data nodes (including hardware
 configuration details) should be configured or set up for this
 requirement
   
-suresh
   
On Friday, February 7, 2014, stana st...@is-land.com.tw wrote:
   
HI suresh babu :
   
how many data nodes do you have?
   
   
2014-02-07 suresh babu bigdatac...@gmail.com javascript:;:
   
refreshing the thread,
   
Can you please  suggest any inputs for the hardware
 configuration(for
   the
below mentioned use case).
   
   

Re: Hbase Installation on top of HDFS high availability

2014-01-22 Thread Enis Söztutar
You can find the manual for HDP here:
http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.0.5.0/bk_system-admin-guide/content/ch_hadoop-ha.html

I think you have to configure your hbase root dir and the client-side hdfs
configuration.
Enis
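
Concretely, with an HA nameservice the root dir should point at the logical
nameservice rather than a fixed namenode host. A sketch for hbase-site.xml
("mycluster" stands in for whatever dfs.nameservices value your
hdfs-site.xml defines):

    <property>
      <name>hbase.rootdir</name>
      <value>hdfs://mycluster/hbase</value>
    </property>

HBase must also be able to resolve that logical name, so the cluster's
client-side hdfs-site.xml (with dfs.nameservices, dfs.ha.namenodes.*, and
the failover proxy provider settings) needs to be on the HBase classpath,
e.g. copied or symlinked into the HBase conf directory.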


On Wed, Jan 22, 2014 at 12:22 AM, jaksky stransky...@gmail.com wrote:

 I am using HBase 0.96.0 as I am running a Hortonworks HDP 2.0. The issue is
 that when my NameNode performs the failover, HBase gets completely
 stuck. I think the problem is that HBase is still pointing to the
 previous namenode but I couldn't find a manual on how to set it up correctly.



 --
 View this message in context:
 http://apache-hbase.679495.n3.nabble.com/Hbase-Installation-on-top-of-HDFS-high-availability-tp4055067p4055099.html
 Sent from the HBase User mailing list archive at Nabble.com.



Re: Performance between HBaseClient scan and HFileReaderV2

2014-01-02 Thread Enis Söztutar
Nice test!

There is a couple of things here:

 (1) HFileReader reads only one file, whereas an HRegion reads multiple
files (into the KeyValueHeap) to do a merge scan. So, although there is
only one file, there is some overhead of doing a merge-sorted read from
multiple files in the region. For a more realistic test, you can try to do
the reads using HRegion directly (instead of HFileReader). The overhead is
not that much though, in my tests.
 (2) For scanning with the client API, the results have to be serialized,
deserialized, and sent over the network (or loopback for local). This is
another overhead that is not there in HFileReader.
 (3) The HBase scanner RPC implementation is NOT streaming. The RPC works by
fetching batch size (1) records, and cannot fully saturate the disk and
network pipeline.

In my tests for MapReduce over snapshot files (HBASE-8369), I have
measured a 5x difference, because of layers (2) and (3). Please see my slides
at http://www.slideshare.net/enissoz/mapreduce-over-snapshots

I think we can do a much better job at (3), see HBASE-8691. However, there
will always be some overhead, although it should not be 5-8x.

As suggested above, in the meantime, you can take a look at the patch for
HBASE-8369, and https://issues.apache.org/jira/browse/HBASE-10076 to see
whether it suits your use case.

Enis
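
For reference, a rough sketch of the kind of raw HFile scan being compared
here (0.94-era API; the HFile path is hypothetical). Everything an HBase
region scanner adds on top of this -- the merge across files, filters,
delete handling, and the RPC layer -- is skipped:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.io.hfile.CacheConfig;
    import org.apache.hadoop.hbase.io.hfile.HFile;
    import org.apache.hadoop.hbase.io.hfile.HFileScanner;

    public class RawHFileScan {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        FileSystem fs = FileSystem.get(conf);
        Path hfile = new Path("/hbase/table/region/cf/hfile"); // hypothetical
        HFile.Reader reader = HFile.createReader(fs, hfile, new CacheConfig(conf));
        HFileScanner scanner = reader.getScanner(false, false); // no cache, seq
        long cells = 0;
        if (scanner.seekTo()) {          // position at the first cell
          do {
            scanner.getKeyValue();       // current KeyValue
            cells++;
          } while (scanner.next());
        }
        reader.close();
        System.out.println("cells scanned: " + cells);
      }
    }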


On Thu, Jan 2, 2014 at 1:43 PM, Sergey Shelukhin ser...@hortonworks.comwrote:

 Er, using MR over snapshots, which reads files directly...
 https://issues.apache.org/jira/browse/HBASE-8369
 However, it was only committed to 98.
 There was interest in 94 port (HBASE-10076), but it never happened...


 On Thu, Jan 2, 2014 at 1:42 PM, Sergey Shelukhin ser...@hortonworks.com
 wrote:

  You might be interested in using
  https://issues.apache.org/jira/browse/HBASE-8369
  However, it was only committed to 98.
  There was interest in 94 port (HBASE-10076), but it never happened...
 
 
  On Thu, Jan 2, 2014 at 1:32 PM, Jerry Lam chiling...@gmail.com wrote:
 
  Hello Vladimir,
 
  In my use case, I guarantee that a major compaction is executed before
 any
   scan happens because the system we build is a read-only system. There
  will
   be no deleted cells. Additionally, I only need to read from a single
  column family and therefore I don't need to access multiple HFiles.
 
  Filter conditions are nice to have because if I can read HFile 8x faster
  than using HBaseClient, I can do the filter on the client side and still
  perform faster than using HBaseClient.
 
  Thank you for your input!
 
  Jerry
 
 
 
  On Thu, Jan 2, 2014 at 1:30 PM, Vladimir Rodionov
  vrodio...@carrieriq.comwrote:
 
   HBase scanner MUST guarantee correct order of KeyValues (coming from
   different HFile's),
    apply filter conditions plus conditions on included column families and
    qualifiers, time range, and max versions, and correctly process deleted
    cells.
    A direct HFileReader does none of the above.
  
   Best regards,
   Vladimir Rodionov
   Principal Platform Engineer
   Carrier IQ, www.carrieriq.com
   e-mail: vrodio...@carrieriq.com
  
   
   From: Jerry Lam [chiling...@gmail.com]
   Sent: Thursday, January 02, 2014 7:56 AM
   To: user
   Subject: Re: Performance between HBaseClient scan and HFileReaderV2
  
   Hi Tom,
  
   Good point. Note that I also ran the HBaseClient performance test
  several
   times (as you can see from the chart). The caching should also benefit
  the
   second time I ran the HBaseClient performance test not just
 benefitting
  the
   HFileReaderV2 test.
  
    I still don't understand what makes the HBaseClient perform so poorly in
    comparison to accessing HDFS directly. I can understand maybe a factor of 2
    (even that is too much) but a factor of 8 is quite unreasonable.
  
   Any hint?
  
   Jerry
  
  
  
   On Sun, Dec 29, 2013 at 9:09 PM, Tom Hood tom.w.h...@gmail.com
 wrote:
  
I'm also new to HBase and am not familiar with HFileReaderV2.
   However,
   in
your description, you didn't mention anything about clearing the
  linux OS
 cache between tests.  That might be why you're seeing the big difference:
 if you ran the HBaseClient test first, it may have warmed the OS cache,
 and then HFileReaderV2 benefited from it.  Just a guess...
   
-- Tom
   
   
   
On Mon, Dec 23, 2013 at 12:18 PM, Jerry Lam chiling...@gmail.com
   wrote:
   
 Hello HBase users,

 I just ran a very simple performance test and would like to see if
   what I
 experienced make sense.

 The experiment is as follows:
 - I filled a hbase region with 700MB data (each row has roughly 45
columns
 and the size is 20KB for the entire row)
 - I configured the region to hold 4GB (therefore no split occurs)
 - I ran compactions after the data is loaded and make sure that
  there
   is
 only 1 region in the table under test.
 - No other table exists in the hbase cluster because this 

Re: Problems with hbase.hregion.max.filesize

2013-12-19 Thread Enis Söztutar
If the split takes too long (longer than 30 secs), I would say you may have
too many store files in the region. Split has to write two tiny files per
store file. The other thing may be that the region has to be closed before
the split, so it has to do a flush. If it cannot complete the flush in time,
it might cancel the split as well. Did you check that? Are your
compactions working as intended?

Enis
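
For reference, the split timeout JM mentions further down the thread lives
in hbase-site.xml; a sketch (the 120000 ms value is only an example -- as he
notes, put it back to the 30000 ms default once the backlog is cleared):

    <property>
      <name>hbase.regionserver.fileSplitTimeout</name>
      <value>120000</value>
    </property>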


On Wed, Dec 18, 2013 at 10:06 AM, Timo Schaepe t...@timoschaepe.de wrote:

 @Ted Yu:
 Yep, nevertheless thanks a lot!


 Am 18.12.2013 um 10:03 schrieb Ted Yu yuzhih...@gmail.com:

  Timo:
  I went through namenode log and didn't find much clue.
 
  Cheers
 
 
  On Tue, Dec 17, 2013 at 9:37 PM, Timo Schaepe t...@timoschaepe.de
 wrote:
 
  Hey Ted Yu,
 
  I had digging the name node log and so far I've found nothing special.
 No
  Exception, FATAL or ERROR message nor anything other peculiarities.
  Only I see a lot of messages like this:
 
  2013-12-12 13:53:22,541 INFO org.apache.hadoop.hdfs.StateChange:
 Removing
  lease on
 
 /hbase/Sessions_1091/d04cadb1b2252dafc476c138e9651ca7/.splits/9717de41277e207c24359a18dae72cd3/l/58ab2c11ca9b4b4994ce54bac0bb4c68.d04cadb1b2252dafc476c138e9651ca7
  from client DFSClient_hb_rs_baur-hbase7.baur.boreus.de
  ,60020,1386712527761_1295065721_26
  2013-12-12 13:53:22,541 INFO org.apache.hadoop.hdfs.StateChange: DIR*
  completeFile:
 
 /hbase/Sessions_1091/d04cadb1b2252dafc476c138e9651ca7/.splits/9717de41277e207c24359a18dae72cd3/l/58ab2c11ca9b4b4994ce54bac0bb4c68.d04cadb1b2252dafc476c138e9651ca7
  is closed by DFSClient_hb_rs_baur-hbase7.baur.boreus.de
  ,60020,1386712527761_1295065721_26
 
  But maybe that is normal. If you wanna have a look, you can find the log
  snippet at
 
 https://www.dropbox.com/s/8sls714knn4yqp3/hadoop-hadoop-namenode-baur-hbase1.log.2013-12-12.snip
 
  Thanks,
 
 Timo
 
 
 
  Am 14.12.2013 um 09:12 schrieb Ted Yu yuzhih...@gmail.com:
 
  Timo:
  Other than two occurrences of 'Took too long to split the files'
  @ 13:54:20,194 and 13:55:10,533, I don't find much clue from the posted
  log.
 
  If you have time, mind checking namenode log for 1 minute interval
  leading
  up to 13:54:20,194 and 13:55:10,533, respectively ?
 
  Thanks
 
 
  On Sat, Dec 14, 2013 at 5:21 AM, Timo Schaepe t...@timoschaepe.de
  wrote:
 
  Hey,
 
  @JM: Thanks for the hint with hbase.regionserver.fileSplitTimeout. At
  the
  moment (the import is actually working) and after I splittet the
  specific
  regions manually, we do not have growing regions anymore.
 
  hbase hbck says, all things are going fine.
  0 inconsistencies detected.
  Status: OK
 
  @Ted Yu: Sure, have a look here: http://pastebin.com/2ANFVZEU
  The relevant tablename ist data_1091.
 
  Thanks for your time.
 
Timo
 
  Am 13.12.2013 um 20:18 schrieb Ted Yu yuzhih...@gmail.com:
 
  Timo:
  Can you pastebin regionserver log around 2013-12-12 13:54:20 so that
 we
  can
  see what happened ?
 
  Thanks
 
 
  On Fri, Dec 13, 2013 at 11:02 AM, Jean-Marc Spaggiari 
  jean-m...@spaggiari.org wrote:
 
  Try to increase hbase.regionserver.fileSplitTimeout but put it back
 to
  its
  default value after.
 
  Default value is 30 seconds. I think it's not normal for a split to
  take
  more than that.
 
  What is your hardware configuration?
 
  Have you run hbck to see if everything is correct?
 
  JM
 
 
  2013/12/13 Timo Schaepe t...@timoschaepe.de
 
  Hello again,
 
  digging in the logs of the specific regionserver shows me that:
 
  2013-12-12 13:54:20,194 INFO
  org.apache.hadoop.hbase.regionserver.SplitRequest: Running
  rollback/cleanup
  of failed split of
 
 
 
 
 data,OR\x83\xCF\x02\x82\xAE\xF3U,1386851456415.d04cadb1b2252dafc476c138e9651ca7.;
  Took too long to split the files and create the references,
 aborting
  split
 
  This message appears two time, so it seems, that HBase tried to
 split
  the
  region but it failed. I don't know why. How is the behaviour of
  HBase,
  if a
  region split fails? Are there more tries to split this region
 again?
  I
  didn't find any new tries in the log. Now I split the big regions
  manually
  and this works. And also it seems, that HBase split the new regions
  again
  to crunch they down to the given limit.
 
   But it is also a mystery to me why the split size in Hannibal shows me
   10 GB and in hbase-site.xml I put 2 GB…
 
  Thanks,
 
   Timo
 
 
  Am 13.12.2013 um 10:22 schrieb Timo Schaepe t...@timoschaepe.de:
 
  Hello,
 
  during the loading of data in our cluster I noticed some strange
  behavior of some regions, that I don't understand.
 
  Scenario:
  We convert data from a mysql database to HBase. The data is
 inserted
  with a put to the specific HBase table. The row key is a
 timestamp. I
  know
  the problem with timestamp keys, but in our requirement it works
  quiet
  well. The problem is now, that there are some regions, which are
  growing
  and growing.
 
  For example the table on the picture [1]. First, all data was
  

Re: hbase log4j.properties can we change to DRFA to RFA.

2013-10-17 Thread Enis Söztutar
You can change the log4j configuration for your deployment as you see fit.
DRFA is however a better default for the general case.
You might also want to ensure that you are not running with DEBUG logging
enabled in a production cluster.

Enis
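
For example, a minimal RFA stanza for HBase's log4j.properties -- a sketch,
assuming the hbase.log.dir and hbase.log.file variables are defined as in
the stock file; the size and backup-index values are illustrative:

    log4j.appender.RFA=org.apache.log4j.RollingFileAppender
    log4j.appender.RFA.File=${hbase.log.dir}/${hbase.log.file}
    log4j.appender.RFA.MaxFileSize=256MB
    log4j.appender.RFA.MaxBackupIndex=20
    log4j.appender.RFA.layout=org.apache.log4j.PatternLayout
    log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %-5p [%t] %c{2}: %m%n

Then point log4j.rootLogger at RFA instead of DRFA to make it take effect.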


On Mon, Oct 14, 2013 at 11:01 PM, sreenivasulu y
sreenivasul...@huawei.comwrote:

 Hi,

 Can we change log4j.properties from
 DailyRollingFileAppender to RollingFileAppender
 with a specific maximum size for the log file?

 Within a day a log file's size can grow to 10GB to 12GB
 (this has happened in our cluster) because of some failure in the hbase
 cluster. In the hdfs components, log rolling is done with RFA only.

 Please provide your comments on changing the log4j.properties file.



Upcoming HBase bay area user and dev meetups

2013-10-07 Thread Enis Söztutar
Hi guys,

I just wanted to give a heads up on upcoming bay area user and dev meetups
which will happen on the same day, October 24th. (Special thanks to Stack
for pushing this.)

The user meetup will start at 6:30, and the talks scheduled so far are:

+ Steven Noels will talk about using the Lily Indexer to search your HBase
content: http://ngdata.github.io/hbase-indexer/
+ St.Ack will talk about what is in hbase-0.96.0
+ Enis will talk about Mapreduce over HBase snapshots (HBASE-8369)

There will be food and beers as usual. The event page is at
http://www.meetup.com/hbaseusergroup/events/140759692/. Please write me or
Stack off-list if you want to give a talk. There is still room for one more
talk.


The dev meetup will start at 4pm. Some of the suggested topics include:
+ When is 0.98.0?
+ When is 1.0?  What makes for an HBase 1.0.
+ Assignment Manager
+ What next on MTTR?

The event page is at: http://www.meetup.com/hackathon/events/144366512/.
Feel free to suggest / bring up topics that you think is important for
post-0.96.

Enis


Re: Shell on windows

2013-10-07 Thread Enis Söztutar
Hbase shell is just the jruby shell with some custom methods imported. Are
you running with powershell or cmd? You can just boot up jruby to see
whether there is any difference there.

Enis


On Wed, Oct 2, 2013 at 7:03 AM, Sznajder ForMailingList 
bs4mailingl...@gmail.com wrote:

 Hi

 I am running HBASE on windows.

 However, there is a strange behavior when launching the shell...
 We do not see the prompt (as we see it in Linux). It is not so disturbing,
 but I am just asking.

 Benjamin



Re: Spatial data posting in HBase

2013-10-07 Thread Enis Söztutar
You can look at the HBase in Action book, which contains a whole chapter
on an example GIS system on HBase.

Enis
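
For the simplest case -- just storing a point -- a minimal sketch (table,
family, and row key are hypothetical; for query support you would derive
the row key from a geohash of the coordinates, as Adrien suggests below):

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    public class InsertPoint {
      public static void main(String[] args) throws Exception {
        HTable table = new HTable(HBaseConfiguration.create(), "points");
        double lat = 37.78, lon = -122.41;
        Put put = new Put(Bytes.toBytes("point-0001")); // geohash would go here
        put.add(Bytes.toBytes("loc"), Bytes.toBytes("lat"), Bytes.toBytes(lat));
        put.add(Bytes.toBytes("loc"), Bytes.toBytes("lon"), Bytes.toBytes(lon));
        table.put(put);
        table.close();
      }
    }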


On Fri, Oct 4, 2013 at 1:01 AM, Adrien Mogenet adrien.moge...@gmail.comwrote:

 If you mean insert and query spatial data, look at algorithms that are
 distributed databases compliant : geohash, z-index, voronoi diagram...

 Well, that makes me want to write a blog article about these topics :)


 On Tue, Sep 24, 2013 at 3:43 PM, Ted Yu yuzhih...@gmail.com wrote:

  There're plenty of examples in unit tests.
  e.g. :
 
    Put put = new Put(Bytes.toBytes(row + String.format("%1$04d", i)));
    put.add(family, null, value);
    table.put(put);
 
  value can be obtained through Bytes.toBytes().
  table is an HTable.
 
  Cheers
 
 
  On Tue, Sep 24, 2013 at 4:15 AM, cto ankur...@tcs.com wrote:
 
   Hi ,
  
    I am very new to HBase. Could you please let me know how to insert
    spatial
    data (latitude/longitude) into HBase using Java.
  
  
  
   --
   View this message in context:
  
 
 http://apache-hbase.679495.n3.nabble.com/Spatial-data-posting-in-HBase-tp4051123.html
   Sent from the HBase User mailing list archive at Nabble.com.
  
 



 --
 Adrien Mogenet
 http://www.borntosegfault.com



Re: Hbase in embedded mode

2013-09-19 Thread Enis Söztutar
Right now we do not have what you suggest.

Eric has created an issue for this:
https://issues.apache.org/jira/browse/HBASE-8016

I think it makes a lot of sense, especially enabling HRegion as a library
to work on top of shared hdfs and building a simple layer to embed the
client side, etc.

The closest thing right now is MiniHBaseCluster, but that requires an
in-memory zookeeper, master, regionserver, etc., and still uses RPCs.

Enis
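
A minimal sketch of the MiniHBaseCluster route (it needs the HBase test
artifact on the classpath, and, as noted, it still spins up an in-process
zookeeper/master/regionserver and goes through RPC):

    import org.apache.hadoop.hbase.HBaseTestingUtility;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.util.Bytes;

    public class MiniClusterDemo {
      public static void main(String[] args) throws Exception {
        HBaseTestingUtility util = new HBaseTestingUtility();
        util.startMiniCluster(); // zk + master + regionserver in this JVM
        HTable table = util.createTable(Bytes.toBytes("t"), Bytes.toBytes("f"));
        // ... use the table through the normal client API ...
        util.shutdownMiniCluster();
      }
    }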


On Thu, Sep 19, 2013 at 11:29 AM, samar kumar samar.opensou...@gmail.comwrote:

 Hi Ted
 I am aware of the standalone mode but I was looking for something which
  will not have any IPC calls.  Everything should be a local API call.

  So no listening on ports, e.g. embedded DBs like Derby.

 Regards
 Samar
 On 19 Sep 2013 19:20, Ted Yu yuzhih...@gmail.com wrote:

  See 2.2.1 in http://hbase.apache.org/book.html#standalone_dist
 
  On Sep 19, 2013, at 6:49 AM, samar.opensource 
  samar.opensou...@gmail.com wrote:
 
   Hi Guys,
   Can we use HBase in an embedded mode? The whole of HBase should start in
  the same JVM and there should be no RPC calls. Something like the embedded
  Java DBs.

   Do we have something like this, or something close to it?
   Regards,
   Samar
 



Re: Please welcome our newest committer, Rajeshbabu Chintaguntla

2013-09-11 Thread Enis Söztutar
Congrats and welcome aboard.


On Wed, Sep 11, 2013 at 10:08 AM, Jimmy Xiang jxi...@cloudera.com wrote:

 Congrats!


 On Wed, Sep 11, 2013 at 9:54 AM, Stack st...@duboce.net wrote:

  Hurray for Rajesh!
 
 
  On Wed, Sep 11, 2013 at 9:17 AM, ramkrishna vasudevan 
  ramkrishna.s.vasude...@gmail.com wrote:
 
   Hi All,
  
   Please join me in welcoming Rajeshbabu (Rajesh) as our new HBase
  committer.
   Rajesh has been there for more than a year and has been solving some
 very
    good bugs around the Assignment Manager area.  He has been working on
   other
    stuff like HBase-Mapreduce performance improvement, migration scripts
  and
    of late on Secondary Index related things.
  
   Rajesh has made his first commit to the pom.xml already.
   Once again, congratulations and welcome to this new role (smile).
  
   Cheers
   Ram
  
 



Please welcome our newest committer, Nick Dimiduk

2013-09-10 Thread Enis Söztutar
Hi,

Please join me in welcoming Nick as our new addition to the list of
committers. Nick is exceptionally good with user-facing issues, and has
made major contributions in mapreduce-related areas, hive support, as well
as 0.96 issues and the new and shiny data types API.

Nick, as tradition, feel free to do your first commit to add yourself to
pom.xml.

Cheers,
Enis


Re: HBase - stable versions

2013-09-04 Thread Enis Söztutar
As long as there is interest for 0.94, we will care for 0.94. However, when
0.96.0 comes out, it will be marked as the next stable release, so I expect
that we would point newcomers to that branch.

Any committer can propose any branch and release candidate any time, so if
there are road blocks for 0.94.x mainline, you might as well propose a new
branch.

Enis


On Wed, Sep 4, 2013 at 4:29 PM, Varun Sharma va...@pinterest.com wrote:

 We, at Pinterest, are also going to stay on 0.94 for a while since it has
 worked well for us and we don't have the resources to test 0.96 in the EC2
 environment. That may change in the future but we don't know when...


 On Wed, Sep 4, 2013 at 1:53 PM, Andrew Purtell apurt...@apache.org
 wrote:

  If LarsH is willing to stay on as RM for 0.94 then IMHO we should proceed
  as today with the exception that 0.96 is what the stable symlink points
 to.
 
  As long as 0.94 has someone willing to RM and users such as Salesforce
 then
  there will be individuals there and in the community motivated to keep it
  in good working order with occasional point releases. We should not throw
  up roadblocks or adopt an arbitrary policy, as long as new features
 arrive
  in the branch as backports, and the changes maintain our point release
  compatibility criteria (rolling restarts possible, no API regressions).
 
 
  On Tue, Sep 3, 2013 at 5:30 PM, lars hofhansl la...@apache.org wrote:
 
   With 0.96 being imminent we should start a discussion about continuing
   support for 0.94.
  
   0.92 became stale pretty soon after 0.94 was released.
   The relationship between 0.94 and 0.96 is slightly different, though:
  
   1. 0.92.x could be upgraded to 0.94.x without downtime
   2. 0.92 clients and servers are mutually compatible with 0.94 clients
 and
   servers
   3. the user facing API stayed backward compatible
  
   None of the above is true when moving from 0.94 to 0.96+.
   Upgrade from 0.94 to 0.96 will require a one-way upgrade process
  including
   downtime, and client and server need to be upgraded in lockstep.
  
   I would like to have an informal poll about who's using 0.94 and is
   planning to continue to use it; and who is planning to upgrade from
 0.94
  to
   0.96.
   Should we officially continue support for 0.94? How long?
  
   Thanks.
  
   -- Lars
  
 
 
 
  --
  Best regards,
 
 - Andy
 
  Problems worthy of attack prove their worth by hitting back. - Piet Hein
  (via Tom White)
 



Re: Chocolatey package for Windows

2013-08-21 Thread Enis Söztutar
Hi,

Agreed with what Nick said. There is also an MSI-based installation for
HBase as part of the HDP-1.3 package. You can check it out here:
http://hortonworks.com/products/hdp-windows/

Enis


On Tue, Aug 20, 2013 at 2:54 PM, Nick Dimiduk ndimi...@gmail.com wrote:

 Hi Andrew,

 I don't think the homebrew recipes are managed by an HBase developer.
 Rather, someone in the community has taken it upon themselves to
 provide the project through brew. Likewise, the Apache HBase project does
 not provide RPM or DEB packages, but you're likely to find them if you look
 around.

 Maybe you can find a willing maintainer on the users@ list? (I don't run
 Windows very often so I won't make a good volunteer)

 Thanks,
 Nick

 On Tuesday, August 20, 2013, Andrew Pennebaker wrote:

  Could we automate the installation process for Windows with a
  Chocolatey (http://chocolatey.org/) package, the way we offer a
  Homebrew formula
  (https://github.com/mxcl/homebrew/blob/master/Library/Formula/hbase.rb)
  for Mac OS X?
 



Re: Master server abort

2013-07-11 Thread Enis Söztutar
I've seen a similar stack trace in some test as well, and opened the issue
https://issues.apache.org/jira/browse/HBASE-8912 for tracking this.

This looks like a problem in AssignmentManager that fails to recognize a
valid state transition, but I did not have the time to look into it
further. We'll spend some time to fix this issue, given that this affects
production deployments.

Can you please attach your logs to the issue as well.

Enis


On Thu, Jul 11, 2013 at 11:10 AM, Vladimir Rodionov vrodio...@carrieriq.com
 wrote:

 This is happening in  one of our small QA cluster.
 HBase 0.94.6.1 (CDH 4.3.0)

 1 master + 5 RS. Zk quorum is 1 (on master node)

 We can not start the cluster:

 In a log file I find some ERROR's and FATALs . FATAL's come first followed
 by ERRORs (this is important):

 FATALs:

 2013-07-10 19:42:00,376 INFO
 org.apache.hadoop.hbase.master.AssignmentManager: The master has opened the
 region
 SMALL_GOLDENROD_2012-IDPROFILES,31,1363783108271.77a4640bfaecc907e0ea3535a16c56a8.
 that was online on sjc1-eng-qa04.carrieriq.com,60020,1373485278882
 2013-07-10 19:42:00,376 INFO
 org.apache.hadoop.hbase.master.AssignmentManager: The master has opened the
 region
 TEST_MM5550_INDEX-UPLOADS,9C,1363689771995.2ee2e6b81ee44ff790abf38275698d45.
 that was online on sjc1-eng-qa03.carrieriq.com,60020,1373485278616
 2013-07-10 19:42:00,376 FATAL org.apache.hadoop.hbase.master.HMaster:
 Master server abort: loaded coprocessors are: []
 2013-07-10 19:42:00,376 INFO
 org.apache.hadoop.hbase.master.AssignmentManager: The master has opened the
 region
 SMALL_GOLDENROD_2012-IDPROFILES,F8,1363783108280.a0b1b6d003df84ca1404af942bcc9fbc.
 that was online on sjc1-eng-qa02.carrieriq.com,60020,1373485278611
 2013-07-10 19:42:00,376 INFO
 org.apache.hadoop.hbase.master.AssignmentManager: The master has opened the
 region
 TEST_MM5550_INDEX-UPLOADS,E0,1363689771998.36db41b10c86ac537542104c87950709.
 that was online on sjc1-eng-qa06.carrieriq.com,60020,1373485278668
 2013-07-10 19:42:00,376 INFO
 org.apache.hadoop.hbase.master.AssignmentManager: The master has opened the
 region
 SMALL_GOLDENROD_2012-IDPROFILES,FD,1363783108280.95ac4fb83f1bc2753aca0f6a914f6ff2.
 that was online on sjc1-eng-qa02.carrieriq.com,60020,1373485278611
 2013-07-10 19:42:00,376 INFO
 org.apache.hadoop.hbase.master.AssignmentManager: The master has opened the
 region
 TEST_MM5550_INDEX-UPLOADS,E5,1363689771999.65440a41f85b9dd70afd669280491363.
 that was online on sjc1-eng-qa06.carrieriq.com,60020,1373485278668
 2013-07-10 19:42:00,376 INFO
 org.apache.hadoop.hbase.master.AssignmentManager: The master has opened the
 region
 SMALL_GOLDENROD_2012-IDPROFILES,34,1363783108271.979f840723771588cf65910183ecf55c.
 that was online on sjc1-eng-qa04.carrieriq.com,60020,1373485278882
 2013-07-10 19:42:00,376 INFO
 org.apache.hadoop.hbase.master.AssignmentManager: The master has opened the
 region
 TEST_MM5550_INDEX-UPLOADS,A0,1363689771995.45f193f3dcfcfa76e705b5fa020e4309.
 that was online on sjc1-eng-qa03.carrieriq.com,60020,1373485278616
 2013-07-10 19:42:00,377 FATAL org.apache.hadoop.hbase.master.HMaster:
 Unexpected state :
 packageindex,C000,1362756765100.7287ded900b6f6c14f22db5f9ae15d32.
 state=PENDING_OPEN, ts=1373485320376, 
 server=sjc1-eng-qa06.carrieriq.com,60020,1373485278668
 .. Cannot transit it to OFFLINE.
 java.lang.IllegalStateException: Unexpected state :
 packageindex,C000,1362756765100.7287ded900b6f6c14f22db5f9ae15d32.
 state=PENDING_OPEN, ts=1373485320376, 
 server=sjc1-eng-qa06.carrieriq.com,60020,1373485278668
 .. Cannot transit it to OFFLINE.
 at
 org.apache.hadoop.hbase.master.AssignmentManager.setOfflineInZooKeeper(AssignmentManager.java:1820)
 at
 org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1659)
 at
 org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1424)
 at
 org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1399)
 at
 org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1394)
 at
 org.apache.hadoop.hbase.master.handler.ClosedRegionHandler.process(ClosedRegionHandler.java:105)
 at
 org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:175)
 at
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 2013-07-10 19:42:00,377 INFO
 org.apache.hadoop.hbase.master.AssignmentManager: The master has opened the
 region XXX-IDPROFILES,36,1363783108271.65e59eae29b9d61d294d2dfcc3f08bd9.
 that was online on sjc1-eng-qa04.carrieriq.com,60020,1373485278882
 2013-07-10 19:42:00,379 INFO org.apache.hadoop.hbase.master.HMaster:
 Aborting


 All errors in the log files are logged after the master started the abort
 operation.

 There are two types of ERRORs:

 Type 1:

 

Re: Poor HBase map-reduce scan performance

2013-07-01 Thread Enis Söztutar
Bryan,

3.6x improvement seems exciting. The ballpark difference between HBase scan
and hdfs scan is in that order, so it is expected I guess.

I plan to get back to the trunk patch and add more tests etc. next week. In
the meantime, if you have any changes to the patch, please attach them.

Enis
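
The HBASE-8369 work discussed here eventually shipped in 0.98 as
TableSnapshotInputFormat. A minimal driver sketch against that API (the
snapshot name and restore directory are hypothetical):

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
    import org.apache.hadoop.hbase.mapreduce.TableMapper;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

    public class SnapshotScanJob {
      static class CountMapper extends TableMapper<ImmutableBytesWritable, Result> {
        @Override
        protected void map(ImmutableBytesWritable row, Result value, Context ctx)
            throws java.io.IOException, InterruptedException {
          ctx.getCounter("scan", "rows").increment(1); // count rows scanned
        }
      }

      public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(HBaseConfiguration.create(), "snapshot-scan");
        job.setJarByClass(SnapshotScanJob.class);
        Scan scan = new Scan();
        scan.setCaching(500);        // larger batches for a sequential scan
        scan.setCacheBlocks(false);
        TableMapReduceUtil.initTableSnapshotMapperJob(
            "mysnapshot", scan, CountMapper.class,
            ImmutableBytesWritable.class, Result.class, job,
            true, new Path("/tmp/snapshot-restore"));
        job.setNumReduceTasks(0);
        job.setOutputFormatClass(NullOutputFormat.class);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }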


On Mon, Jul 1, 2013 at 3:59 AM, lars hofhansl la...@apache.org wrote:

 Absolutely.



 - Original Message -
 From: Ted Yu yuzhih...@gmail.com
 To: user@hbase.apache.org
 Cc:
 Sent: Sunday, June 30, 2013 9:32 PM
 Subject: Re: Poor HBase map-reduce scan performance

 Looking at the tail of HBASE-8369, there were some comments which are yet
 to be addressed.

 I think trunk patch should be finalized before backporting.

 Cheers

 On Mon, Jul 1, 2013 at 12:23 PM, Bryan Keller brya...@gmail.com wrote:

  I'll attach my patch to HBASE-8369 tomorrow.
 
  On Jun 28, 2013, at 10:56 AM, lars hofhansl la...@apache.org wrote:
 
   If we can make a clean patch with minimal impact to existing code I
  would be supportive of a backport to 0.94.
  
   -- Lars
  
  
  
   - Original Message -
   From: Bryan Keller brya...@gmail.com
   To: user@hbase.apache.org; lars hofhansl la...@apache.org
   Cc:
   Sent: Tuesday, June 25, 2013 1:56 AM
   Subject: Re: Poor HBase map-reduce scan performance
  
   I tweaked Enis's snapshot input format and backported it to 0.94.6 and
  have snapshot scanning functional on my system. Performance is
 dramatically
  better, as expected i suppose. I'm seeing about 3.6x faster performance
 vs
  TableInputFormat. Also, HBase doesn't get bogged down during a scan as
 the
  regionserver is being bypassed. I'm very excited by this. There are some
  issues with file permissions and library dependencies but nothing that
  can't be worked out.
  
   On Jun 5, 2013, at 6:03 PM, lars hofhansl la...@apache.org wrote:
  
   That's exactly the kind of pre-fetching I was investigating a bit ago
  (made a patch, but ran out of time).
   This pre-fetching is strictly client only, where the client keeps the
  server busy while it is processing the previous batch, but filling up a
 2nd
  buffer.
  
  
   -- Lars
  
  
  
   
   From: Sandy Pratt prat...@adobe.com
   To: user@hbase.apache.org user@hbase.apache.org
   Sent: Wednesday, June 5, 2013 10:58 AM
   Subject: Re: Poor HBase map-reduce scan performance
  
  
   Yong,
  
   As a thought experiment, imagine how it impacts the throughput of TCP
 to
   keep the window size at 1.  That means there's only one packet in
 flight
   at a time, and total throughput is a fraction of what it could be.
  
   That's effectively what happens with RPC.  The server sends a batch,
  then
   does nothing while it waits for the client to ask for more.  During
 that
   time, the pipe between them is empty.  Increasing the batch size can
  help
   a bit, in essence creating a really huge packet, but the problem
  remains.
   There will always be stalls in the pipe.
  
   What you want is for the window size to be large enough that the pipe
 is
   saturated.  A streaming API accomplishes that by stuffing data down
 the
   network pipe as quickly as possible.
  
   Sandy
  
   On 6/5/13 7:55 AM, yonghu yongyong...@gmail.com wrote:
  
   Can anyone explain why client + rpc + server will decrease the
  performance
   of scanning? I mean the Regionserver and Tasktracker are the same
 node
   when
   you use MapReduce to scan the HBase table. So, in my understanding,
  there
   will be no rpc cost.
  
   Thanks!
  
   Yong
  
  
   On Wed, Jun 5, 2013 at 10:09 AM, Sandy Pratt prat...@adobe.com
  wrote:
  
   https://issues.apache.org/jira/browse/HBASE-8691
  
  
   On 6/4/13 6:11 PM, Sandy Pratt prat...@adobe.com wrote:
  
   Haven't had a chance to write a JIRA yet, but I thought I'd pop in
  here
   with an update in the meantime.
  
   I tried a number of different approaches to eliminate latency and
   bubbles in the scan pipeline, and eventually arrived at adding a
   streaming scan API to the region server, along with refactoring the
   scan
   interface into an event-drive message receiver interface.  In so
   doing, I
   was able to take scan speed on my cluster from 59,537 records/sec
  with
   the
   classic scanner to 222,703 records per second with my new scan API.
   Needless to say, I'm pleased ;)
  
   More details forthcoming when I get a chance.
  
   Thanks,
   Sandy
  
   On 5/23/13 3:47 PM, Ted Yu yuzhih...@gmail.com wrote:
  
   Thanks for the update, Sandy.
  
   If you can open a JIRA and attach your producer / consumer scanner
   there,
   that would be great.
  
   On Thu, May 23, 2013 at 3:42 PM, Sandy Pratt prat...@adobe.com
   wrote:
  
   I wrote myself a Scanner wrapper that uses a producer/consumer
   queue to
   keep the client fed with a full buffer as much as possible.  When
   scanning
   my table with scanner caching at 100 records, I see about a 24%
   uplift
   in
   performance (~35k records/sec with 

Re: Poor HBase map-reduce scan performance

2013-05-29 Thread Enis Söztutar
Hi,

Regarding running raw scans on top of Hfiles, you can try a version of the
patch attached at https://issues.apache.org/jira/browse/HBASE-8369, which
enables exactly this. However, the patch is for trunk.

In that, we open one region from snapshot files in each record reader, and
run a scan through using an internal region scanner. Since this bypasses
the client + rpc + server daemon layers, it should be able to give optimum
scan performance.

There is also a tool called HFilePerformanceBenchmark that intends to
measure raw performance for HFiles. I've had to make a lot of changes to get
it workable, but it might be worth taking a look to see whether there is
any perf difference between scanning a sequence file from hdfs vs scanning
an hfile.

Enis


On Fri, May 24, 2013 at 10:50 PM, lars hofhansl la...@apache.org wrote:

 Sorry. Haven't gotten to this, yet.

 Scanning in HBase being about 3x slower than straight HDFS is in the right
 ballpark, though. It has to do a bit more work.

 Generally, HBase is great at honing in to a subset (some 10-100m rows) of
 the data. Raw scan performance is not (yet) a strength of HBase.

 So with HDFS you get to 75% of the theoretical maximum read throughput;
 hence with HBase you get to 25% of the theoretical cluster-wide maximum disk
 throughput?


 -- Lars



 - Original Message -
 From: Bryan Keller brya...@gmail.com
 To: user@hbase.apache.org
 Cc:
 Sent: Friday, May 10, 2013 8:46 AM
 Subject: Re: Poor HBase map-reduce scan performance

 FYI, I ran tests with compression on and off.

 With a plain HDFS sequence file and compression off, I am getting very
 good I/O numbers, roughly 75% of theoretical max for reads. With snappy
 compression on with a sequence file, I/O speed is about 3x slower. However
 the file size is 3x smaller so it takes about the same time to scan.

 With HBase, the results are equivalent (just much slower than a sequence
 file). Scanning a compressed table is about 3x slower I/O than an
 uncompressed table, but the table is 3x smaller, so the time to scan is
 about the same. Scanning an HBase table takes about 3x as long as scanning
 the sequence file export of the table, either compressed or uncompressed.
 The sequence file export file size ends up being just barely larger than
 the table, either compressed or uncompressed

 So in sum, compression slows down I/O 3x, but the file is 3x smaller so
 the time to scan is about the same. Adding in HBase slows things down
 another 3x. So I'm seeing 9x faster I/O scanning an uncompressed sequence
 file vs scanning a compressed table.


 On May 8, 2013, at 10:15 AM, Bryan Keller brya...@gmail.com wrote:

  Thanks for the offer Lars! I haven't made much progress speeding things
 up.
 
  I finally put together a test program that populates a table that is
 similar to my production dataset. I have a readme that should describe
 things, hopefully enough to make it useable. There is code to populate a
 test table, code to scan the table, and code to scan sequence files from an
 export (to compare HBase w/ raw HDFS). I use a gradle build script.
 
  You can find the code here:
 
  https://dl.dropboxusercontent.com/u/6880177/hbasetest.zip
 
 
  On May 4, 2013, at 6:33 PM, lars hofhansl la...@apache.org wrote:
 
  The blockbuffers are not reused, but that by itself should not be a
 problem as they are all the same size (at least I have never identified
 that as one in my profiling sessions).
 
  My offer still stands to do some profiling myself if there is an easy
 way to generate data of similar shape.
 
  -- Lars
 
 
 
  
  From: Bryan Keller brya...@gmail.com
  To: user@hbase.apache.org
  Sent: Friday, May 3, 2013 3:44 AM
  Subject: Re: Poor HBase map-reduce scan performance
 
 
  Actually I'm not too confident in my results re block size, they may
 have been related to major compaction. I'm going to rerun before drawing
 any conclusions.
 
  On May 3, 2013, at 12:17 AM, Bryan Keller brya...@gmail.com wrote:
 
  I finally made some progress. I tried a very large HBase block size
 (16mb), and it significantly improved scan performance. I went from 45-50
 min to 24 min. Not great but much better. Before I had it set to 128k.
 Scanning an equivalent sequence file takes 10 min. My random read
 performance will probably suffer with such a large block size
 (theoretically), so I probably can't keep it this big. I care about random
 read performance too. I've read having a block size this big is not
 recommended, is that correct?
 
  I haven't dug too deeply into the code, are the block buffers reused
 or is each new block read a new allocation? Perhaps a buffer pool could
 help here if there isn't one already. When doing a scan, HBase could reuse
 previously allocated block buffers instead of allocating a new one for each
 block. Then block size shouldn't affect scan performance much.
 
  I'm not using a block encoder. Also, I'm still sifting through the
 profiler results, I'll 

Re: Does Hadoop 1.0.4 provide a durable sync for HBase-0.94.6?

2013-05-28 Thread Enis Söztutar
Hi,

HDFS has two interfaces for durability: hflush and hsync:

Hflush() : Flush the data packet down the datanode pipeline. Wait for
ack’s.
Hsync() : Flush the data packet down the pipeline. Have datanodes execute
FSYNC equivalent. Wait for ack’s.

There is some work on adding a Durability API in HBase: see HBASE-7801 and
HBASE-8375.

However, as Stack mentioned, without HBASE-5954 is fixed, HBase right now
cannot make use of the hsync() API. I want to rebase the patch in
HBASE-5954, but it might take some more time.

The good news is that although not perfect, hflush, which is the current
default, makes sure that the update is sent to 3 replicas, so unless there
is a data center power failure or similar, the data will make it onto the
disks pretty quickly.

Hope this helps.
Enis
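
A minimal sketch of the two calls on the Hadoop 2 client API (the file path
is arbitrary; Hadoop 1.0.4 only exposes the weaker sync(), which behaves
like hflush()):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class DurabilityDemo {
      public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        FSDataOutputStream out = fs.create(new Path("/tmp/wal-demo"));
        out.writeBytes("edit 1\n");
        out.hflush(); // replicated to all datanodes in the pipeline, acked,
                      // but possibly still only in their memory
        out.writeBytes("edit 2\n");
        out.hsync();  // datanodes additionally fsync to disk before acking
        out.close();
      }
    }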




On Tue, May 28, 2013 at 9:53 AM, Stack st...@duboce.net wrote:

 On Tue, May 28, 2013 at 7:09 AM, jingguo yao yaojing...@gmail.com wrote:

  Section 2.1.3 says that Hadoop 1.0.4 works with HBase-0.94.x [1]. And
  Section 2.1.3.3 says that 1.0.4 has a working durable sync. But when I
  check the source code of DFSClient.DFSOutputStream's sync method, I
  find the following javadoc:
 
  /**
   * All data is written out to datanodes. It is not guaranteed
   * that data has been flushed to persistent store on the
   * datanode. Block allocations are persisted on namenode.
   */
 
  So it seems that sync does not support a durable sync. It contradicts
  with [1].
 
  Can anybody help me on this confusion? Thanks.



 This issue is probably the best source for the state of sync in hbase (and
 hdfs): https://issues.apache.org/jira/browse/HBASE-5954

 In short, the refguide is misleading -- let me fix -- as 1.0.4 indeed has a
 sync but it is just a sync to the memory of three datanodes, not a true
 fsync out to disk.  The above cited issue is tracking issues that our Lars
 and other have contributed to HDFS to add fsync support.

 Yours,
 St.Ack



Re: Schema Design Question

2013-04-26 Thread Enis Söztutar
Hi,

Interesting use case. I think it depends on how many jobIds you expect to
have. If it is on the order of thousands, I would caution against going the
one-table-per-jobId approach, since for every table, there is some master
overhead, as well as file structures in hdfs. If jobIds are manageable,
going with separate tables makes sense if you want to efficiently delete
all the data related to a job.

Also pre-splitting will depend on expected number of jobIds / batchIds and
their ranges vs desired number of regions. You would want to keep number of
regions hosted by a single region server in the low tens, thus, your splits
can be across jobs or within jobs depending on cardinality. Can you share
some more?

Enis
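
A minimal sketch of pre-splitting at table-creation time (0.96-era API; the
table and family names are hypothetical, and it assumes row keys are
bucketed by a single leading hex digit -- adjust the split points to your
actual jobId/batchId distribution):

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.HBaseAdmin;
    import org.apache.hadoop.hbase.util.Bytes;

    public class PreSplitTable {
      public static void main(String[] args) throws Exception {
        HBaseAdmin admin = new HBaseAdmin(HBaseConfiguration.create());
        HTableDescriptor desc = new HTableDescriptor(TableName.valueOf("jobs"));
        desc.addFamily(new HColumnDescriptor("d"));
        // 15 split points -> 16 regions, one per leading hex digit.
        byte[][] splits = new byte[15][];
        for (int i = 1; i <= 15; i++) {
          splits[i - 1] = Bytes.toBytes(Integer.toHexString(i));
        }
        admin.createTable(desc, splits);
        admin.close();
      }
    }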


On Fri, Apr 26, 2013 at 2:34 PM, Ted Yu yuzhih...@gmail.com wrote:

 My understanding of your use case is that data for different jobIds would
 be continuously loaded into the underlying table(s).

 Looks like you can have one table per job. This way you drop the table
 after map reduce is complete. In the single table approach, you would
 delete many rows in the table, which is not as fast as dropping a separate
 table.

 Cheers

 On Sat, Apr 27, 2013 at 3:49 AM, Cameron Gandevia cgande...@gmail.com
 wrote:

  Hi
 
  I am new to HBase, I have been trying to POC an application and have a
  design questions.
 
  Currently we have a single table with the following key design
 
  jobId_batchId_bundleId_uniquefileId
 
  This is an offline processing system so data would be bulk loaded into
  HBase via map/reduce jobs. We only need to support report generation
  queries using map/reduce over a batch (And possibly a single column
 filter)
  with the batchId as the start/end scan key. Once we have finished
  processing a job we are free to remove the data from HBase.
 
  We have varied workloads so a job could be made up of 10 rows, 100,000
 rows
  or 1 billion rows with the average falling somewhere around 10 million
  rows.
 
  My question is related to pre-splitting. If we have a billion rows all
 with
  the same batchId (Our map/reduce scan key) my understanding is we should
  perform pre-splitting to create buckets hosted by different regions. If a
   job's workload can be so varied, would it make sense to have a single table
   containing all jobs? Or should we create 1 table per job and pre-split
  the
   table for the given workload? If we had separate tables we could drop them
  when no longer needed.
 
  If we didn't have a separate table per job how should we perform
 splitting?
  Should we choose our largest possible workload and split for that? even
  though 90% of our jobs would fall in the lower bound in terms of row
 count.
  Would we experience any issue purging jobs of varying sizes if everything
  was in a single table?
 
  any advice would be greatly appreciated.
 
  Thanks
 


