Re: auto-added RegionServers

2012-05-23 Thread Bryan Beaudreault
We use puppet and ldap to assign cluster names to servers as we spin them up
in EC2. Our configs are pushed by fabric, which is organized by these cluster
names. This makes it harder to assign a server to the wrong HMaster, because
it will automatically get the configs for the cluster it is in.

Additionally, we use EC2 security groups to ensure different clusters are
sandboxed from each other. You could get the same benefit using a VPN or a
firewall.



Sent from iPhone.

On May 23, 2012, at 11:30 PM, Norbert Burger  wrote:

> We had a situation earlier today in our PROD cluster where a test machine
> was accidentally configured with our PROD cluster config.  On startup, the
> HMaster promptly accepted the RS into the fold, and started re-assigning
> regions to it.  The mass migration caused write latencies to increase, and
> it eventually took an HMaster restart to bring things back to normal.
> 
> DFS has the dfs.hosts conf setting, which dictates which datanodes are
> allowed to join.  In our setup, we're managing dfs.hosts via configuration
> management.  From what I can tell looking through
> hbase/master/ServerManager.java, there is no equivalent setting on the
> HBase side.  Do folks already rely on this auto-add feature, or would it be
> helpful if there was a similar stop-gap config param for regionservers?
> 
> Norbert


Re: auto-added RegionServers

2012-05-23 Thread Michael Drzal
I think a similar concept would be a great idea.  It would definitely
prevent the type of issue that you mentioned.  I think that if it were done
in a similar way to how Hadoop handles it (you can specify a list, but if
you don't, you get auto-add), that should keep everyone happy.
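
For reference, a sketch of what the HDFS side of this looks like in
hdfs-site.xml; dfs.hosts points at a plain-text file of allowed datanode
hostnames (the file path below is only illustrative), and an HBase
equivalent would presumably take the same shape:

  <property>
    <name>dfs.hosts</name>
    <value>/etc/hadoop/conf/dfs.hosts.allow</value>
    <!-- one allowed hostname per line; when unset, any datanode
         may join (the auto-add behavior discussed above) -->
  </property>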

  Mike

On Wed, May 23, 2012 at 8:30 PM, Norbert Burger wrote:

> We had a situation earlier today in our PROD cluster where a test machine
> was accidentally configured with our PROD cluster config.  On startup, the
> HMaster promptly accepted the RS into the fold, and started re-assigning
> regions to it.  The mass migration caused write latencies to increase, and
> it eventually took an HMaster restart to bring things back to normal.
>
> DFS has the dfs.hosts conf setting, which dictates which datanodes are
> allowed to join.  In our setup, we're managing dfs.hosts via configuration
> management.  From what I can tell looking through
> hbase/master/ServerManager.java, there is no equivalent setting on the
> HBase side.  Do folks already rely on this auto-add feature, or would it be
> helpful if there was a similar stop-gap config param for regionservers?
>
> Norbert
>


Re: Append and Put

2012-05-23 Thread Jean-Daniel Cryans
On Wed, May 23, 2012 at 8:11 PM, NNever  wrote:
> Thanks J-D.
>
> so it means 'Append' keeps a write-lock only, and 'Put' keeps both a
> write-lock and a read-lock?

Yeah... not at all. First, there's no read lock. Then, a Put is just a
Put: it takes a write lock. Append is a read+write operation, but it
still uses just a write lock.

> and if we use 'Append' instead of 'Put', the chance that clients have to
> wait will be reduced, right?

You would use Append instead of Put only if you also need a Get. A
typical example is a list, let's say you have a cell that's like:

a,b,c

Now you want to add ",d". Before Append you'd have to do a Get,
manipulate the value (basically add the new data at the end) and then
do a Put. That's 2 round trips. Append is just those operations put
together and it runs all in the region server, saving you 1 round trip
plus you don't have a race condition when you have multiple appenders
on the same cell.
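
For illustration, a minimal sketch of that comparison against the 0.94
client API (the table, family and qualifier names here are assumptions):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

Configuration conf = HBaseConfiguration.create();
HTable table = new HTable(conf, "mytable");
byte[] row = Bytes.toBytes("r1");
byte[] fam = Bytes.toBytes("f");
byte[] qual = Bytes.toBytes("list");

// Before Append: 2 round trips, and racy with concurrent writers.
Result r = table.get(new Get(row));                // round trip 1
byte[] updated = Bytes.add(r.getValue(fam, qual),  // e.g. "a,b,c"
    Bytes.toBytes(",d"));
Put p = new Put(row);
p.add(fam, qual, updated);
table.put(p);                                      // round trip 2

// With Append: 1 round trip, applied atomically in the region server.
Append a = new Append(row);
a.add(fam, qual, Bytes.toBytes(",d"));
table.append(a);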

Hope this helps,

J-D


Re: Append and Put

2012-05-23 Thread NNever
Thanks J-D.

so it means 'Append' keeps a write-lock only, and 'Put' keeps both a
write-lock and a read-lock?
and if we use 'Append' instead of 'Put', the chance that clients have to
wait will be reduced, right?



2012/5/24 Jean-Daniel Cryans 

> It's a facility so that you don't have to read+write in order to add
> something to a value. With Append the read is done in the region
> server before the write, also it solves the problem where you could
> have a race when there are multiple appenders.
>
> J-D
>
> On Tue, May 22, 2012 at 8:51 PM, NNever  wrote:
> > Simple question: what's the difference between "Append" and "Put"?
> > It seems they both can put some data into a row.
> > Does "Append" keep several write operations atomic while Put does not?
> >
> > If so, is "Append" going to take the place of Put? Might Append be
> > slower than Put?
> >
> > Thanks~
> >
> > ---
> > Best regards,
> > nn
>


Re: Unblock Put/Delete

2012-05-23 Thread NNever
Thanks Harsh, I'll try it ;)
---
Best regards,
nn

2012/5/24 Harsh J 

> NNever,
>
> You can use asynchbase (an asynchronous API for HBase) for that need:
> https://github.com/stumbleupon/asynchbase
>
> On Thu, May 24, 2012 at 7:25 AM, NNever  wrote:
> > Dear all
> >
> > When we use Put or Delete, we always need to wait for their return
> > values. But sometimes we just want to send a Put/Delete call without
> > caring whether it succeeds.
> > So, is there any way to do a Put/Delete in a non-blocking way?
> >
> > Tks very much
> >
> > ---
> > Best regards,
> > nn
>
>
>
> --
> Harsh J
>


Re: Unblock Put/Delete

2012-05-23 Thread Harsh J
NNever,

You can use asynchbase (an asynchronous API for HBase) for that need:
https://github.com/stumbleupon/asynchbase
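
For illustration, a minimal fire-and-forget Put sketch against asynchbase
(the quorum spec, table and column names below are assumptions):

import org.hbase.async.HBaseClient;
import org.hbase.async.PutRequest;
import com.stumbleupon.async.Callback;

final HBaseClient client = new HBaseClient("zkhost");  // ZK quorum spec
PutRequest put = new PutRequest("mytable".getBytes(), "r1".getBytes(),
    "f".getBytes(), "q".getBytes(), "v".getBytes());
// put() returns a Deferred immediately; attach an errback so failures
// at least get logged, then carry on without blocking on the result.
client.put(put).addErrback(new Callback<Object, Exception>() {
  public Object call(Exception e) {
    System.err.println("put failed: " + e);
    return null;
  }
});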

On Thu, May 24, 2012 at 7:25 AM, NNever  wrote:
> Dear all
>
> When we use Put or Delete, we always need to wait for their return
> values. But sometimes we just want to send a Put/Delete call without
> caring whether it succeeds.
> So, is there any way to do a Put/Delete in a non-blocking way?
>
> Tks very much
>
> ---
> Best regards,
> nn



-- 
Harsh J


Re: Append and Put

2012-05-23 Thread Jean-Daniel Cryans
It's a facility so that you don't have to read+write in order to add
something to a value. With Append the read is done in the region
server before the write, also it solves the problem where you could
have a race when there are multiple appenders.

J-D

On Tue, May 22, 2012 at 8:51 PM, NNever  wrote:
> Simple question: what's the difference between "Append" and "Put"?
> It seems they both can put some data into a row.
> Does "Append" keep several write operations atomic while Put does not?
>
> If so, is "Append" going to take the place of Put? Might Append be
> slower than Put?
>
> Thanks~
>
> ---
> Best regards,
> nn


Re: About HBase Memstore Flushes

2012-05-23 Thread Jean-Daniel Cryans
On Wed, May 23, 2012 at 2:33 PM, Alex Baranau  wrote:
> Talked to J-D (and source code). It turned out that
> when hbase.regionserver.global.memstore.lowerLimit is reached, flushes are
> forced without blocking updates (of course,
> if hbase.regionserver.global.memstore.upperLimit is not hit). Makes perfect
> sense. Though couldn't figure this out from settings description in
> hbase-default.xml (tried to come up with the patch:
> https://issues.apache.org/jira/browse/HBASE-6076).

Thanks for this.

>
> So (if one is interested), the logic is the following with regard to
> triggering flushes on the "global regionserver level":
> * flushes are forced when memstore size
> hits hbase.regionserver.global.memstore.lowerLimit
> * flushes are forced *and updates are blocked* when memstore size
> reaches hbase.regionserver.global.memstore.upperLimit. In this case flushes
> are forced and updates are blocked until memstore size is less
> than hbase.regionserver.global.memstore.lowerLimit.
>
> Not sure if it would make sense to separate these two things though:
> * the mark until which memstore flushes are forced and updates are blocked
> * the mark at which memstore flushes are forced (without blocking updates)
> As of now, hbase.regionserver.global.memstore.lowerLimit is used for both.

Yeah, I guess we could, but if you need to tune your setup in a way that
makes use of an additional config like the one you described, then I'd say
you're trying to solve the wrong problem. The upper limit is a protection
against overruns, not a feature that people should rely on :)

J-D


Re: Consider individual RSs performance when writing records with random keys?

2012-05-23 Thread Alex Baranau
Talked to Stack. It's not a completely crazy idea. It may be implemented as
a tiny lib, which can be used when row keys are randomized in some way by
application logic. In this case the randomization would take into account
how individual regionservers behave (w.r.t. writing speed).

Would be very interesting to try to implement something like this on top of
asynchbase. Note that asynchbase helps to cope with the problem of
regionservers having periodic drop-offs in writing, but it doesn't solve
the problem of individual RSs being slow. That can't be addressed in a
generic way, but it can in some more specific cases (like when row keys are
"randomized", as explained above and in the earlier message). So, as far as
I understand, this should be addressed at a higher level.
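
To make the idea concrete, a toy sketch of biased salting (nothing like
this exists in HBase or asynchbase; all of the names below are made up):

import java.util.Arrays;
import java.util.Random;

public class BiasedSalter {
  private final double[] throughput;  // recent writes/sec per salt prefix
  private final Random rnd = new Random();

  public BiasedSalter(int numPrefixes) {
    throughput = new double[numPrefixes];
    Arrays.fill(throughput, 1.0);     // start out unbiased
  }

  // Each writer thread reports its measured speed for its prefix.
  public synchronized void report(int prefix, double writesPerSec) {
    // exponential moving average to smooth out temporary spikes
    throughput[prefix] = 0.8 * throughput[prefix] + 0.2 * writesPerSec;
  }

  // Pick a prefix with probability proportional to observed throughput,
  // so faster regionservers receive proportionally more records.
  public synchronized int nextPrefix() {
    double total = 0;
    for (double t : throughput) total += t;
    double r = rnd.nextDouble() * total;
    for (int i = 0; i < throughput.length; i++) {
      r -= throughput[i];
      if (r <= 0) return i;
    }
    return throughput.length - 1;
  }
}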

Alex Baranau
--
Sematext :: http://blog.sematext.com

On Thu, May 17, 2012 at 10:23 AM, Alex Baranau wrote:

> Hi,
>
> 1.
> Not sure if you've seen HBaseWD (https://github.com/sematext/HBaseWD)
> project. It implements the "salt keys with prefix" approach when writing
> monotonically increasing row key/timeseries data. If simplified, the idea
> is to add random prefix to the row key so that writes end up on different
> region servers (avoiding single RS hotspot).
>
> 2.
> When writing data to HBase with salted or random keys (so that load is
> well distributed over the cluster), the write speed per RS is limited by
> the slowest RS in the cluster (since one region is served by one RS).
>
> Given 1 & 2 I got this crazy idea to:
> * write in multiple threads
> * each prefix (or interval of keys in case of completely random keys) is
> assigned to particular thread, so that records with this prefix always
> written by that thread
> * measure how well each thread performs (e.g. write speed)
> * based on each thread performance, salt (or randomize) keys in a biased
> way, so that threads which perform better got more records to write
>
> Thus we will be loading less those RSs that are "slower" and overall load
> will be more or less balanced which will give max write performance for the
> cluster.
> This might work only if each thread is writing to a relatively small
> subset of all the RSs, though, I think. Otherwise the threads will all
> perform more or less the same.
>
> Am I completely crazy for thinking about this? Does it make sense to you
> at all?
>
> Alex Baranau
> --
> Sematext :: http://blog.sematext.com/
>


Re: About HBase Memstore Flushes

2012-05-23 Thread Alex Baranau
Talked to J-D (and source code). It turned out that
when hbase.regionserver.global.memstore.lowerLimit is reached, flushes are
forced without blocking updates (of course,
if hbase.regionserver.global.memstore.upperLimit is not hit). Makes perfect
sense. Though couldn't figure this out from settings description in
hbase-default.xml (tried to come up with the patch:
https://issues.apache.org/jira/browse/HBASE-6076).

So (if one is interested), the logic is the following with regard to
triggering flushes on the "global regionserver level":
* flushes are forced when memstore size
hits hbase.regionserver.global.memstore.lowerLimit
* flushes are forced *and updates are blocked* when memstore size
reaches hbase.regionserver.global.memstore.upperLimit. In this case flushes
are forced and updates are blocked until memstore size is less
than hbase.regionserver.global.memstore.lowerLimit.
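
To make the two marks concrete, here is a sketch with the default values
and an assumed 10 GB regionserver heap:

  <property>
    <name>hbase.regionserver.global.memstore.lowerLimit</name>
    <value>0.35</value>
    <!-- 10 GB heap: start forcing flushes at 3.5 GB of memstore -->
  </property>
  <property>
    <name>hbase.regionserver.global.memstore.upperLimit</name>
    <value>0.4</value>
    <!-- 10 GB heap: block updates at 4 GB, and keep flushing
         until memstores are back under 3.5 GB -->
  </property>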

Not sure if it would make sense to separate these two things though:
* the mark until which memstore flushes are forced and updates are blocked
* the mark at which memstore flushes are forced (without blocking updates)
As of now, hbase.regionserver.global.memstore.lowerLimit is used for both.

Alex Baranau
--
Sematext :: http://blog.sematext.com/

On Wed, May 9, 2012 at 6:02 PM, Alex Baranau wrote:

> Should I maybe create a JIRA issue for that?
>
> Alex Baranau
> --
> Sematext :: http://blog.sematext.com/
>
> On Tue, May 8, 2012 at 4:00 PM, Alex Baranau wrote:
>
>> Hi!
>>
>> Just trying to check that I understand things correctly about configuring
>> memstore flushes.
>>
>> Basically, there are two groups of configuration properties (leaving out
>> region pre-close flushes):
>> 1. determines when flush should be triggered
>> 2. determines when flush should be triggered and updates should be
>> blocked during flushing
>>
>> The 2nd one is for safety reasons: we don't want the memstore to grow
>> without limit, so we forbid writes unless the memstore has a "bearable"
>> size. Also we don't want flushed files to be too big. These properties are:
>> * hbase.regionserver.global.memstore.upperLimit &
>> hbase.regionserver.global.memstore.lowerLimit [1]   (1)
>> * hbase.hregion.memstore.block.multiplier [2]
>>
>> 1st group (sorry for reverse order) is about triggering "regular
>> flushes". As flushes can be performed without pausing updates, we want them
>> to happen before conditions for "blocking updates" flushes are met. The
>> property for configuring this is
>> * hbase.hregion.memstore.flush.size [3]
>> (* there are also open jira issues for per colfam settings)
>>
>> As we don't want to perform too frequent flushes, we want to keep this
>> option big enough to avoid that. At the same time we want to keep it small
>> enough so that it triggers flushing *before* the "blocking updates"
>> flushing is triggered. This configuration is per-region, while (1) is per
>> regionserver. So, if we had a (more or less) constant number of regions per
>> regionserver, we could choose the value in such a way that it is not too
>> small, but small enough. However, the number of regions assigned to a
>> regionserver commonly varies a lot during the cluster's life. And we don't
>> want to adjust it over time (which requires RS restarts).
>>
>> Does thinking above make sense to you? If yes, then here are the
>> questions:
>>
>> A. is it a goal to have a more or less constant number of regions per
>> regionserver? Can anyone share their experience on whether that is achievable?
>> B. or should there be any config options for setting up triggering
>> flushes based on regionserver state (not just individual regions or
>> stores)? E.g.:
>> B.1 given a setting X%, trigger a flush of the biggest memstore (or
>> whatever the logic is for selecting the memstore to flush) when memstores
>> take up X% of the heap (similar to (1), but triggering flushing while
>> there's no need to block updates yet)
>> B.2 any other option which takes the number of regions into account
>>
>> Thoughts?
>>
>> Alex Baranau
>> --
>> Sematext :: http://blog.sematext.com/
>>
>> [1]
>>
>>   <property>
>>     <name>hbase.regionserver.global.memstore.upperLimit</name>
>>     <value>0.4</value>
>>     <description>Maximum size of all memstores in a region server before
>>       new updates are blocked and flushes are forced. Defaults to 40% of
>>       heap.
>>     </description>
>>   </property>
>>   <property>
>>     <name>hbase.regionserver.global.memstore.lowerLimit</name>
>>     <value>0.35</value>
>>     <description>When memstores are being forced to flush to make room in
>>       memory, keep flushing until we hit this mark. Defaults to 35% of
>>       heap. This value equal to hbase.regionserver.global.memstore.upperLimit
>>       causes the minimum possible flushing to occur when updates are blocked
>>       due to memstore limiting.
>>     </description>
>>   </property>
>>
>> [2]
>>
>>   <property>
>>     <name>hbase.hregion.memstore.block.multiplier</name>
>>     <value>2</value>
>>     <description>
>>       Block updates if memstore has hbase.hregion.block.memstore
>>       time hbase.hregion.flush.size bytes.  Useful preventing
>>       runaway memstore during spikes in update traffic.  Without an
>>       upper-bound, memst

Re: HBase 0.94 thrift2 (TScan struct missing filterString)

2012-05-23 Thread Jay T
Added a JIRA to track this issue.
https://issues.apache.org/jira/browse/HBASE-6073

Thanks,
Jay

On 5/23/12 1:14 PM, Ted Yu wrote:

Why don't you log a JIRA?

By the time you reach the next iteration, hopefully this feature is there -
especially if your team can contribute.

On Wed, May 23, 2012 at 10:06 AM, Jay T wrote:


We are currently on HBase 0.90 (cdh3u3) and soon will be upgrading to
HBase 0.94. Our application is written in Python and we use Thrift to
connect to HBase.
Looking at Thrift2 (hbase.thrift) I noticed that the TScan struct does not
accept filterString as a parameter. This was introduced in HBase 0.92 and
is still part of the Thrift (version 1) specification.

We are required to use thrift2 as our application logic requires columns
returned by Thrift to be sorted in the order that they are stored in
HBase. This was not possible in Thrift 1, as TRowResult had columns
stored in a map. Thrift 2 solves this as it uses a list to store column
values (TResult returns a list<TColumnValue>).

So we are in a situation where we want to use thrift2 for the sorted
columns (a bigger requirement at this time), but it would also be great
not to give up the filterString functionality from thrift1. (Our
application currently doesn't use filters, but we had plans to start
adopting them in the next iteration.)

Are there plans to incorporate filterString in the thrift2 TScan struct as
well?

Thanks,
Jay


Re: Can we store a HBase Result object using Put

2012-05-23 Thread Alex Baranau
I have seen the need for such a conversion many times before. Should we add
it as a public method in some utility class? (Create a JIRA for that?)
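
For reference, a minimal sketch of that conversion, along the lines of the
Import.java link quoted below (copy each KeyValue instead of storing the
serialized Result bytes as a value):

import java.io.IOException;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;

public static Put resultToPut(byte[] row, Result result) throws IOException {
  Put put = new Put(row);
  for (KeyValue kv : result.raw()) {
    put.add(kv);  // preserves family, qualifier, timestamp and value
  }
  return put;
}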

Alex Baranau
--
Sematext :: http://blog.sematext.com/

On Mon, May 21, 2012 at 4:26 PM, Jean-Daniel Cryans wrote:

> How exactly are you building the Put? It doesn't have a constructor
> that can take byte[] and figure out how it should use it, it only
> takes a row key (meaning that if you do new
> Put(Result.getBytes().get()), you're passing the whole thing as a row
> key which is wrong).
>
> In the HBase code we do the Result => Put conversion here:
>
> https://github.com/apache/hbase/blob/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/Import.java#L115
>
> J-D
>
> On Mon, May 21, 2012 at 12:10 PM, Shahsikant Jain wrote:
> > Hi,
> >
> > I am trying to store a result that I got out of some scan into another
> > Hbase table so that I can read it back via Get and reconstruct.
> >
> > This is what I am doing
> >
> > 1. Result.getBytes().get() -- Get the byte[] and do a Put in HBase
> > 2. Then do a get and read the bytes[] as
> > new Result(new ImmutableBytesWritable(byte[]))
> >
> > Now when I try to read some value from this Result it gives me null.
> > Wondering what I am doing wrong.
> >
> > Regards,
> > Shahsikant
>


Re: HBase 0.94 thrift2 (TScan struct missing filterString)

2012-05-23 Thread Ted Yu
Why don't you log a JIRA?

By the time you reach the next iteration, hopefully this feature is there -
especially if your team can contribute.

On Wed, May 23, 2012 at 10:06 AM, Jay T  wrote:

> We are currently on HBase 0.90 (cdh3u3) and soon will be upgrading to
> HBase 0.94. Our application is written in Python and we use Thrift to
> connect to HBase.
> Looking at Thrift2 (hbase.thrift) I noticed that the TScan struct does not
> accept filterString as a parameter. This was introduced in HBase 0.92 and
> is still part of the Thrift (version 1) specification.
>
> We are required to use thrift2 as our application logic requires columns
> returned by Thrift to be sorted in the order that they are stored in
> HBase. This was not possible in Thrift 1, as TRowResult had columns
> stored in a map. Thrift 2 solves this as it uses a list to store column
> values (TResult returns a list<TColumnValue>).
>
> So we are in a situation where we want to use thrift2 for the sorted
> columns (a bigger requirement at this time), but it would also be great
> not to give up the filterString functionality from thrift1. (Our
> application currently doesn't use filters, but we had plans to start
> adopting them in the next iteration.)
>
> Are there plans to incorporate filterString in the thrift2 TScan struct as
> well?
>
> Thanks,
> Jay
>


HBase 0.94 thrift2 (TScan struct missing filterString)

2012-05-23 Thread Jay T
We are currently on HBase 0.90 (cdh3u3) and soon will be upgrading to
HBase 0.94. Our application is written in Python and we use Thrift to
connect to HBase.
Looking at Thrift2 (hbase.thrift) I noticed that the TScan struct does not
accept filterString as a parameter. This was introduced in HBase 0.92 and
is still part of the Thrift (version 1) specification.

We are required to use thrift2 as our application logic requires columns
returned by Thrift to be sorted in the order that they are stored in
HBase. This was not possible in Thrift 1, as TRowResult had columns
stored in a map. Thrift 2 solves this as it uses a list to store column
values (TResult returns a list<TColumnValue>).

So we are in a situation where we want to use thrift2 for the sorted
columns (a bigger requirement at this time), but it would also be great
not to give up the filterString functionality from thrift1. (Our
application currently doesn't use filters, but we had plans to start
adopting them in the next iteration.)

Are there plans to incorporate filterString in the thrift2 TScan struct as
well?

Thanks,
Jay


Re: Restrictions during compactions

2012-05-23 Thread Dave Revell
On Wed, May 23, 2012 at 6:15 AM, Takahiko Kawasaki
<takahiko.kawas...@jibemobile.jp> wrote:

> Hello,
>
> I'm a newbie and wondering whether or not there is any restriction during
> HBase minor/major compactions. I read the online document but could not
> find any explicit mention about restrictions. What I'm mostly worrying
> about is whether read/write operations are blocked during compactions.
>
> The description about 'hbase.hstore.blockingWaitTime':
>
> ---
> The time an HRegion will block updates for after hitting the StoreFile
> limit defined by hbase.hstore.blockingStoreFiles. After this time has
> elapsed, the HRegion will stop blocking updates even if a compaction has
> not been completed. Default: 90 seconds.
> ---
>
> implies that updates are blocked during compactions, but at the same time
> it says that there are cases where blocking updates is stopped even if a
> compaction has not been completed. However, does it mean that the
> compaction is aborted or that the compaction continues and updates are
> performed successfully? And if the latter case is true, what is the reason
> to block updates during compactions? I'm confused.
>

That's not quite true. No operations are blocked during compactions. The
config value you're referring to is a kind of safety valve to prevent high
write throughput from overwhelming HBase's ability to compact. In the
normal case, you shouldn't have to worry about that config value until you
have very high write traffic.
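
For reference, the safety valve described above is governed by these
settings; the values shown are the defaults (blockingWaitTime is in
milliseconds, matching the "90 seconds" quoted above):

  <property>
    <name>hbase.hstore.blockingStoreFiles</name>
    <value>7</value>
    <!-- block updates in a region once a store has this many store
         files, giving compactions a chance to catch up -->
  </property>
  <property>
    <name>hbase.hstore.blockingWaitTime</name>
    <value>90000</value>
    <!-- stop blocking after this long even if the compaction
         has not completed -->
  </property>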

So, short answer: clients can operate normally during compactions.

Best,
Dave


Re: HBase and MapReduce

2012-05-23 Thread Dave Revell
>
> 1. HBase guarantees data locality of store files to the regionserver only
> if it stays up for a long time. If there are too many region movements or
> the server has been recycled recently, there is a high probability that
> store file blocks are not local to the region server. But the getSplits
> command always returns the RegionServer of the StoreFile. So in this
> scenario, MapReduce loses its data locality?
>

It's impossible to get data locality in this case since mapreduce reads
from the regionserver, and the data is not local to the regionserver. The
data moves from datanode->regionserver->mapreduce. If the blocks are not
local to the regionserver, you cannot avoid using the network from
datanode->regionserver even if the regionserver->mapreduce step is local.


> 2. As getSplits returns only the RegionServer, the MR job is not aware
> of the multiple replicas of the StoreFile block. It only accesses one
> block (which is local if the point above does not apply). This can
> constrain the MR processing, as you cannot distribute the data processing
> in the best possible manner. Is this correct?
>

I think there's a misunderstanding. The mapreduce job does not read from
HDFS when using TableInputFormat. The mapreduce tasks use the HBase client
API to talk to a regionserver, and the *regionserver* reads from HDFS.

Also yes, the locality of data blocks to regionservers can be suboptimal,
and the locality of mapreduce tasks to regionservers can also be suboptimal.

> 3. A guess: since the MR processing goes through the RegionServer, it may
> impact RegionServer performance for other random operations?
>

Yes, absolutely. Some people use separate HBase clusters for mapreduce
versus real-time traffic for this reason. You can also try to limit the
rate of data consumption by your mapreduce job by reducing the number of
map tasks, or sleeping for short periods in your mapper, or any other hack
that will slow your job down.
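
For example, a minimal sketch of a gentler TableInputFormat job setup (the
table name and mapper class are assumptions); modest scan caching and a
disabled block cache are the usual first knobs:

import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.mapreduce.Job;

Scan scan = new Scan();
scan.setCaching(100);        // rows per RPC; smaller is gentler on the RS
scan.setCacheBlocks(false);  // don't churn the RS block cache from a scan
TableMapReduceUtil.initTableMapperJob(
    "mytable",               // input table (illustrative)
    scan,
    MyMapper.class,          // assumed TableMapper subclass
    ImmutableBytesWritable.class,
    Result.class,
    job);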

Good luck!
-Dave


Re: Using put for nullifying qualifiers

2012-05-23 Thread Kristoffer Sjögren
Gotcha.

Columns are quite dynamic in my case, but since I need to fetch the rows
first anyway, a KeyOnlyFilter to find them and then overwrite the values
will do just fine.

Cheers,
-Kristoffer


Re: Using put for nullifying qualifiers

2012-05-23 Thread Tom Brown
I didn't mean to set the version to null; I meant to include a revision of
the column whose contents are empty. This empty revision will still be
returned by any Gets on that row, but you can put code into your client
that treats empty values as deleted.
that treats empty values as deleted.

It's a bit of a hack, but it's the best I can come up with.
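
Along those lines, a minimal sketch of the "empty revision" idea, reusing
the variable names from the code quoted below (t, r1, c1):

import org.apache.hadoop.hbase.HConstants;

// Write a zero-length value; client-side code then treats an empty
// value as "deleted".
Put p = new Put(r1);
p.add(c1, c1, HConstants.EMPTY_BYTE_ARRAY);
t.put(p);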

--Tom

On Wednesday, May 23, 2012, Kristoffer Sjögren wrote:

> Ted: Awesome. I can think of several use cases where this is useful, but
> I'm pretty stuck on 0.92 right now.
>
> I tried the null-version trick but I must be doing something wrong. How do
> I set the version to null on a column? Isn't the version equal to the
> timestamp (a primitive long)?
>
> Setting the timestamp to 0 or -1 doesn't seem to work.
>
> HTable t = new HTable(tablename);
> Put p = new Put(r1);
> KeyValue kv1 = new KeyValue(r1, c1, c1, new byte[]{1});
> KeyValue kv2 = new KeyValue(r1, c2, c2, new byte[]{1});
> p.add(kv1);
> p.add(kv2);
> t.put(p);
> t.flushCommits();
> Result res = t.get(new Get(r1));
> byte[] v1 = res.getValue(c1, c1);
> byte[] v2 = res.getValue(c2, c2);
> System.out.println("v1 " + v1[0] + " v2 " + v2[0]);
>
> kv1 = new KeyValue(r1, c1, c1, -1, new byte[]{1});
> p = new Put(r1);
> p.add(kv1);
> t.put(p);
> res = t.get(new Get(r1));
> v1 = res.getValue(c1, c1);
> v2 = res.getValue(c2, c2);
> System.out.println("v1 " + v1[0] + " v2 " + v2[0]);
>
> This prints:
> v1 1 v2 1
> v1 1 v2 1
>
> Any advice?
>
>
> On Tue, May 22, 2012 at 10:45 PM, Ted Yu wrote:
>
> > That's right.
> >
> > In HBase 0.94 and trunk, check out the following API in HRegion:
> >  public void mutateRowsWithLocks(Collection<Mutation> mutations,
> >      Collection<byte[]> rowsToLock) throws IOException {
> >
> > It allows you to combine Put's and Delete's for a single region,
> > atomically.
> >
> > On Tue, May 22, 2012 at 1:22 PM, Kristoffer Sjögren wrote:
> >
> > > Thanks, sounds like that should do it.
> > >
> > > So I'm guessing it is correct to assume that _all_ KeyValues added to a
> > > _single_ Put operation will either wholly succeed or wholly fail as
> > > long as they belong to the same row?
> > >
> > > On Tue, May 22, 2012 at 8:30 PM, Tom Brown wrote:
> > >
> > > > I don't think you can include a delete with a put and keep it atomic.
> > > > You could include a null version of the column with your put, though,
> > > > for a similar effect.
> > > >
> > > > --Tom
> > > >
> > > > On Tue, May 22, 2012 at 10:55 AM, Kristoffer Sjögren wrote:
> > > > > Hi
> > > > >
> > > > > I'm trying to use Put operations to replace ("set") already
> > > > > existing rows by nullifying certain columns and qualifiers as part
> > > > > of a Put operation.
> > > > >
> > > > > The reason I want to do this is 1) to keep the operation
> > > > > atomic/consistent and 2) to avoid the latency of first doing a
> > > > > Delete and then a Put.
> > > > >
> > > > > Is there some way to do this kind of operation?
> > > > >
> > > > > Cheers,
> > > > > -Kristoffer
> > > >
> > >
> >
>


Restrictions during compactions

2012-05-23 Thread Takahiko Kawasaki
Hello,

I'm a newbie and wondering whether or not there is any restriction during
HBase minor/major compactions. I read the online document but could not
find any explicit mention about restrictions. What I'm mostly worrying
about is whether read/write operations are blocked during compactions.

The description about 'hbase.hstore.blockingWaitTime':

---
The time an HRegion will block updates for after hitting the StoreFile
limit defined by hbase.hstore.blockingStoreFiles. After this time has
elapsed, the HRegion will stop blocking updates even if a compaction has
not been completed. Default: 90 seconds.
---

implies that updates are blocked during compactions, but at the same time
it says that there are cases where blocking updates is stopped even if a
compaction has not been completed. However, does it mean that the
compaction is aborted or that the compaction continues and updates are
performed successfully? And if the latter case is true, what is the reason
to block updates during compactions? I'm confused.

If the duration of major compactions were short enough, synchronous
blocking would be acceptable. But I have seen comments saying that major
compactions took hours, and if that is true and updates must be blocked
during compactions, I would be at a loss on how to configure HBase to
ensure that updates can be done at any time.

Could you give me some insights about this, please?

Best Regards,
Takahiko Kawasaki


Re: Using put for nullifying qualifiers

2012-05-23 Thread Kristoffer Sjögren
Ted: Awesome. I can think of several use cases where this is useful, but
I'm pretty stuck on 0.92 right now.

I tried the null-version trick but I must be doing something wrong. How do
I set the version to null on a column? Isn't the version equal to the
timestamp (a primitive long)?

Setting the timestamp to 0 or -1 doesn't seem to work.

HTable t = new HTable(tablename);
Put p = new Put(r1);
KeyValue kv1 = new KeyValue(r1, c1, c1, new byte[]{1});
KeyValue kv2 = new KeyValue(r1, c2, c2, new byte[]{1});
p.add(kv1);
p.add(kv2);
t.put(p);
t.flushCommits();
Result res = t.get(new Get(r1));
byte[] v1 = res.getValue(c1, c1);
byte[] v2 = res.getValue(c2, c2);
System.out.println("v1 " + v1[0] + " v2 " + v2[0]);

kv1 = new KeyValue(r1, c1, c1, -1, new byte[]{1});
p = new Put(r1);
p.add(kv1);
t.put(p);
res = t.get(new Get(r1));
v1 = res.getValue(c1, c1);
v2 = res.getValue(c2, c2);
System.out.println("v1 " + v1[0] + " v2 " + v2[0]);

This prints:
v1 1 v2 1
v1 1 v2 1

Any advice?


On Tue, May 22, 2012 at 10:45 PM, Ted Yu  wrote:

> That's right.
>
> In HBase 0.94 and trunk, check out the following API in HRegion:
>  public void mutateRowsWithLocks(Collection<Mutation> mutations,
>      Collection<byte[]> rowsToLock) throws IOException {
>
> It allows you to combine Put's and Delete's for a single region,
> atomically.
>
> On Tue, May 22, 2012 at 1:22 PM, Kristoffer Sjögren wrote:
>
> > Thanks, sounds like that should do it.
> >
> > So I'm guessing it is correct to assume that _all_ KeyValues added to a
> > _single_ Put operation will either wholly succeed or wholly fail as
> > long as they belong to the same row?
> >
> > On Tue, May 22, 2012 at 8:30 PM, Tom Brown  wrote:
> >
> > > I don't think you can include a delete with a put and keep it atomic.
> > > You could include a null version of the column with your put, though,
> > > for a similar effect.
> > >
> > > --Tom
> > >
> > > On Tue, May 22, 2012 at 10:55 AM, Kristoffer Sjögren wrote:
> > > > Hi
> > > >
> > > > I'm trying to use Put operations to replace ("set") already existing
> > > > rows by nullifying certain columns and qualifiers as part of a Put
> > > > operation.
> > > >
> > > > The reason I want to do this is 1) to keep the operation
> > > > atomic/consistent and 2) to avoid the latency of first doing a Delete
> > > > and then a Put.
> > > >
> > > > Is there some way to do this kind of operation?
> > > >
> > > > Cheers,
> > > > -Kristoffer
> > >
> >
>


HBase and MapReduce

2012-05-23 Thread Hemant Bhanawat
I have a couple of questions related to MapReduce over HBase.

 

1. HBase guarantees data locality of store files to the regionserver only
if it stays up for a long time. If there are too many region movements or
the server has been recycled recently, there is a high probability that
store file blocks are not local to the region server. But the getSplits
command always returns the RegionServer of the StoreFile. So in this
scenario, MapReduce loses its data locality?

 

2. As getSplits returns only the RegionServer, the MR job is not aware
of the multiple replicas of the StoreFile block. It only accesses one
block (which is local if the point above does not apply). This can
constrain the MR processing, as you cannot distribute the data processing
in the best possible manner. Is this correct?

 

3. A guess: since the MR processing goes through the RegionServer, it may
impact RegionServer performance for other random operations?

 

Thanks in advance,

Hemant