Want to change key structure

2018-02-19 Thread Marcell Ortutay
I have a large HBase table (~10 TB) that has an existing key structure.
Based on some recent analysis, the key structure is causing performance
problems for our current query load. I would like to re-write the table
with a new key structure that performs substantially better.

What is the best way to go about re-writing this table? Since the key
structure will change, it will affect locality, so all the data will have
to move to a new location. If anyone can point to examples of code that
does something like this, that would be very helpful.

Thanks,
Marcell
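
One common approach is a MapReduce copy job that scans the source table and
re-emits every row under its new key into a pre-split destination table. Below
is a minimal sketch, assuming the new key can be derived from the old row alone;
the table names and the remapKey() helper are hypothetical placeholders.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.mapreduce.Job;

public class RekeyTable {
  static class RekeyMapper extends TableMapper<ImmutableBytesWritable, Put> {
    @Override
    protected void map(ImmutableBytesWritable oldKey, Result result, Context context)
        throws IOException, InterruptedException {
      byte[] newKey = remapKey(oldKey.get());            // hypothetical key-mapping function
      Put put = new Put(newKey);
      for (Cell cell : result.rawCells()) {              // copy every cell under the new row key
        put.addColumn(CellUtil.cloneFamily(cell), CellUtil.cloneQualifier(cell),
            cell.getTimestamp(), CellUtil.cloneValue(cell));
      }
      context.write(new ImmutableBytesWritable(newKey), put);
    }

    private byte[] remapKey(byte[] oldKey) {
      // application-specific transformation of the old key into the new layout
      return oldKey;
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = Job.getInstance(conf, "rekey-table");
    job.setJarByClass(RekeyTable.class);
    Scan scan = new Scan();
    scan.setCaching(500);
    scan.setCacheBlocks(false);                          // don't pollute the block cache
    TableMapReduceUtil.initTableMapperJob("old_table", scan, RekeyMapper.class,
        ImmutableBytesWritable.class, Put.class, job);
    TableMapReduceUtil.initTableReducerJob("new_table", null, job);
    job.setNumReduceTasks(0);                            // map-only: Puts go straight to the table
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

For a table of this size, generating HFiles with HFileOutputFormat2 and
bulk-loading them into the pre-split destination table is usually faster than
issuing live Puts, but the scan-and-rewrite pattern is the same.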


Re: Query data like opentsdb

2018-02-19 Thread Ted Yu
Have you looked at FuzzyRowFilter ?

Cheers
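
For the key layout quoted below, a minimal sketch of applying FuzzyRowFilter to
match one customer_id while letting the timestamp vary; it assumes the key is a
1-byte "3" marker followed by an 8-byte customer_id and an 8-byte timestamp
(both longs), and the class and method names are illustrative:

import java.util.Arrays;
import java.util.Collections;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.FuzzyRowFilter;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.hbase.util.Pair;

public class FuzzyScanExample {
  // Match every row for one customer_id, with any timestamp.
  static Scan scanForCustomer(long customerId) {
    // Assumed layout: 1-byte "3" + 8-byte customer_id + 8-byte timestamp = 17 bytes.
    byte[] fuzzyKey = Bytes.add(Bytes.toBytes("3"), Bytes.toBytes(customerId), new byte[8]);
    // Mask semantics: 0 = byte must equal the fuzzy key, 1 = byte is a wildcard.
    byte[] mask = new byte[17];                  // positions 0..8 stay 0 (fixed)
    Arrays.fill(mask, 9, 17, (byte) 1);          // timestamp bytes are wildcards
    FuzzyRowFilter filter = new FuzzyRowFilter(
        Collections.singletonList(new Pair<>(fuzzyKey, mask)));
    Scan scan = new Scan();
    scan.setFilter(filter);
    return scan;
  }
}

Because the fixed bytes here happen to be a key prefix, a plain prefix or
start/stop-row scan would also work; FuzzyRowFilter becomes more useful when
the fixed positions are not at the front of the key.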

On Mon, Feb 19, 2018 at 8:00 AM, kitex101  wrote:

> I have a key design like:
>
> byte[] rowKey =
>     Bytes.add(Bytes.toBytes("3"), Bytes.toBytes(customer_id), Bytes.toBytes(timestamp));
>
> customer_id and timestamp are of long type. As opentsdb uses: […] I would like
> to filter my keys by customer_id and timestamp. How do I do that? I have tried
> using PrefixFilter, e.g.
>
> byte[] prefix = Bytes.add(Bytes.toBytes("3"),
>     Bytes.toBytes(customer_id), Bytes.toBytes(cal.getTime().getTime()));
> PrefixFilter prefixFilter = new PrefixFilter(prefix);
> Scan scan = new Scan(prefix);
>
>
>


Re: How to get string from hbase composite key?

2018-02-19 Thread Ted Yu
I mentioned the method accepting int because of the example in your first
email.
I verified in Eclipse that Bytes.toBytes(98) would end up in that
method.

I understand that in your application the data is typed as long. However, that
was not shown in your previous example.
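
For reference, a minimal sketch of decoding such a composite key by offset; the
layout (a 1-byte "3" marker plus two fixed-width numbers) is assumed from the
example above, and the widths depend on whether the values were written as int
(4 bytes) or long (8 bytes):

import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.util.Bytes;

public class RowKeyDecoder {
  static void decode(Cell cell) {
    byte[] row = CellUtil.cloneRow(cell);
    String marker = Bytes.toString(row, 0, 1);   // the leading "3"
    // If the components were written with Bytes.toBytes(int), they are 4 bytes each:
    int a = Bytes.toInt(row, 1);                 // bytes 1..4
    int b = Bytes.toInt(row, 5);                 // bytes 5..8
    // If they were written with Bytes.toBytes(long), use 8-byte offsets instead:
    // long a = Bytes.toLong(row, 1);            // bytes 1..8
    // long b = Bytes.toLong(row, 9);            // bytes 9..16
    System.out.println(marker + " / " + a + " / " + b);
  }
}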

On Sat, Feb 17, 2018 at 9:00 PM, Ted Yu  wrote:

> It seems there are 3 components in the row key.
> Assuming the 2nd and 3rd are integers, you can take a look at the
> following method of Bytes:
>
>   public static byte[] toBytes(int val) {
>
> which returns a 4-byte byte array.
> You can use this knowledge to decode each component of the row key.
>
> FYI
>
> On Sat, Feb 17, 2018 at 8:44 PM, kitex101 
> wrote:
>
>> I have a key in the following format:
>>
>> byte[] rowKey =
>> Bytes.add(Bytes.toBytes("3"), Bytes.toBytes(98), Bytes.toBytes(1211));
>>
>> It is stored as a byte array.
>>
>> How do I decode it in Java?
>>
>> Bytes.toString(CellUtil.cloneRow(cell))
>>
>> results in a�~��
>>
>>
>>
>>
>>
>
>


Query data like opentsdb

2018-02-19 Thread kitex101
I have a key design like:

byte[] rowKey =
    Bytes.add(Bytes.toBytes("3"), Bytes.toBytes(customer_id), Bytes.toBytes(timestamp));

customer_id and timestamp are of long type. As opentsdb uses: […] I would like
to filter my keys by customer_id and timestamp. How do I do that? I have tried
using PrefixFilter, e.g.

byte[] prefix = Bytes.add(Bytes.toBytes("3"),
    Bytes.toBytes(customer_id), Bytes.toBytes(cal.getTime().getTime()));
PrefixFilter prefixFilter = new PrefixFilter(prefix);
Scan scan = new Scan(prefix);
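
A minimal sketch of the two scans this key layout supports, assuming the layout
above (1-byte "3" + 8-byte customer_id + 8-byte timestamp); Scan.setRowPrefixFilter
is available from HBase 1.1 onward, and the class and method names are illustrative:

import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class CustomerScans {
  // Exact-prefix scan: "3" + customer_id + timestamp.
  static Scan prefixScan(long customerId, long timestamp) {
    byte[] prefix = Bytes.add(Bytes.toBytes("3"), Bytes.toBytes(customerId), Bytes.toBytes(timestamp));
    Scan scan = new Scan();
    scan.setRowPrefixFilter(prefix);   // computes the stop row for you
    return scan;
  }

  // Time-range scan for one customer, bounded by start/stop rows rather than a filter.
  // This works because Bytes.toBytes(long) is big-endian, so non-negative timestamps
  // sort in numeric order.
  static Scan rangeScan(long customerId, long from, long toExclusive) {
    byte[] start = Bytes.add(Bytes.toBytes("3"), Bytes.toBytes(customerId), Bytes.toBytes(from));
    byte[] stop  = Bytes.add(Bytes.toBytes("3"), Bytes.toBytes(customerId), Bytes.toBytes(toExclusive));
    return new Scan(start, stop);      // the stop row is exclusive
  }
}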




Re: How to get string from hbase composite key?

2018-02-19 Thread kitex101
So it would be like manually extracting each 4-byte component. Actually, they
are longs, so 8 bytes each.





Re: Trying To Understand BucketCache Evictions In HBase 1.3.1

2018-02-19 Thread Saad Mufti
Thanks, it all makes sense now.

Cheers.


Saad

On Mon, Feb 19, 2018 at 5:40 AM Anoop John  wrote:

> Hi
>   Seems you have write ops happening, since you mention
> minor compactions. When a compaction happens, the compacted file's
> blocks get evicted regardless of the value of
> 'hbase.rs.evictblocksonclose'. That config only comes into play when a
> Store is closed, i.e. when a region is moved or split (and so its
> stores are closed), or when the table is disabled or deleted. In all
> such store-close cases the config applies. But minor compactions mean
> there will be evictions. These are not done by the eviction threads,
> which monitor free space and select LRU blocks for eviction; they are
> done by the compaction threads. That is why you can see the evict ops
> (done by the eviction thread) as zero while the #evicted blocks count
> is non-zero. Those are likely the blocks of the compacted-away files.
> Hope this helps you understand what is going on.
>
> -Anoop-
>
>
> On Mon, Feb 19, 2018 at 5:25 AM, Saad Mufti  wrote:
> > Sorry I meant BLOCKCACHE => 'false' on the one column family we don't want
> > getting cached.
> >
> > Cheers.
> >
> >
> > Saad
> >
> >
> > On Sun, Feb 18, 2018 at 6:51 PM, Saad Mufti  wrote:
> >
> >> Hi,
> >>
> >> We have an HBase system running HBase 1.3.1 on an AWS EMR service. Our
> >> BucketCache is configured for 400 GB on a set of attached EBS disk volumes,
> >> with all column families marked for in-memory in their column family
> >> schemas using INMEMORY => 'true' (except for one column family we only ever
> >> write to, so we set BUCKETCACHE => 'false' on that one).
> >>
> >> Even though all column families are marked INMEMORY, we have the following
> >> ratios set:
> >>
> >> "hbase.bucketcache.memory.factor":"0.8",
> >>
> >> "hbase.bucketcache.single.factor":"0.1",
> >>
> >> "hbase.bucketcache.multi.factor":"0.1",
> >>
> >> Currently the bucket cache shows evictions even though it has tons of free
> >> space. I am trying to understand why we get any evictions at all? We do
> >> have minor compactions going on, but we have not set
> >> hbase.rs.evictblocksonclose
> >> to any value and from looking at the code, it defaults to false. The total
> >> bucket cache size is nowhere near any of the above limits, in fact on some
> >> long running servers where we stopped traffic, the cache size went down to
> >> 0. Which makes me think something is evicting blocks from the bucket cache
> >> in the background.
> >>
> >> You can see a screenshot from one of the regionserver L2 stats UI pages at
> >> https://imgur.com/a/2ZUSv . Another interesting thing to me on this page
> >> is that it has non-zero evicted blocks but says Evictions: 0
> >>
> >> Any help understanding this would be appreciated.
> >>
> >>
> >> Saad
> >>
> >>
>


Re: Trying To Understand BucketCache Evictions In HBase 1.3.1

2018-02-19 Thread Anoop John
Hi
  Seems you have write ops happening, since you mention
minor compactions. When a compaction happens, the compacted file's
blocks get evicted regardless of the value of
'hbase.rs.evictblocksonclose'. That config only comes into play when a
Store is closed, i.e. when a region is moved or split (and so its
stores are closed), or when the table is disabled or deleted. In all
such store-close cases the config applies. But minor compactions mean
there will be evictions. These are not done by the eviction threads,
which monitor free space and select LRU blocks for eviction; they are
done by the compaction threads. That is why you can see the evict ops
(done by the eviction thread) as zero while the #evicted blocks count
is non-zero. Those are likely the blocks of the compacted-away files.
Hope this helps you understand what is going on.

-Anoop-


On Mon, Feb 19, 2018 at 5:25 AM, Saad Mufti  wrote:
> Sorry I meant BLOCKCACHE => 'false' on the one column family we don't want
> getting cached.
>
> Cheers.
>
> 
> Saad
>
>
> On Sun, Feb 18, 2018 at 6:51 PM, Saad Mufti  wrote:
>
>> Hi,
>>
>> We have an HBase system running HBase 1.3.1 on an AWS EMR service. Our
>> BucketCache is configured for 400 GB on a set of attached EBS disk volumes,
>> with all column families marked for in-memory in their column family
>> schemas using INMEMORY => 'true' (except for one column family we only ever
>> write to, so we set BUCKETCACHE => 'false' on that one).
>>
>> Even though all column families are marked INMEMORY, we have the following
>> ratios set:
>>
>> "hbase.bucketcache.memory.factor":"0.8",
>>
>> "hbase.bucketcache.single.factor":"0.1",
>>
>>
>> "hbase.bucketcache.multi.factor":"0.1",
>>
>> Currently the bucket cache shows evictions even though it has tons of free
>> space. I am trying to understand why we get any evictions at all? We do
>> have minor compactions going on, but we have not set 
>> hbase.rs.evictblocksonclose
>> to any value and from looking at the code, it defaults to false. The total
>> bucket cache size is nowhere near any of the above limits, in fact on some
>> long running servers where we stopped traffic, the cache size went down to
>> 0. Which makes me think something is evicting blocks from the bucket cache
>> in the background.
>>
>> You can see a screenshot from one of the regionserver L2 stats UI pages at
>> https://imgur.com/a/2ZUSv . Another interesting thing to me on this page
>> is that it has non-zero evicted blocks but says Evictions: 0
>>
>> Any help understanding this would be appreciated.
>>
>> 
>> Saad
>>
>>
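
For reference, a minimal sketch of setting the column family attributes
discussed in this thread (INMEMORY => 'true', BLOCKCACHE => 'false') through
the HBase 1.x Java admin API; the table and family names are illustrative:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.util.Bytes;

public class FamilyCacheSettings {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Admin admin = conn.getAdmin()) {
      TableName table = TableName.valueOf("my_table");             // illustrative name
      HTableDescriptor desc = admin.getTableDescriptor(table);

      // Family that is read: give its blocks in-memory priority in the block cache.
      HColumnDescriptor hot = desc.getFamily(Bytes.toBytes("d"));  // INMEMORY => 'true'
      hot.setInMemory(true);
      admin.modifyColumn(table, hot);

      // Family that is only written: don't cache its blocks at all.
      HColumnDescriptor writeOnly = desc.getFamily(Bytes.toBytes("w"));  // BLOCKCACHE => 'false'
      writeOnly.setBlockCacheEnabled(false);
      admin.modifyColumn(table, writeOnly);
    }
  }
}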