Hi,
My setup is as follows:
24 regionservers (7GB RAM, 8-core CPU, 5GB heap space)
hbase 0.94.4
5-7 regions per regionserver
I am doing an avg of 4k-5k random gets per regionserver per second, and the
performance is acceptable in the beginning. I have also done ~10K gets on
a single regionserver.
This generally happens when the same block is accessed for the HFile. Are
you seeing any contention on the HDFS side?
Regards
Ram
On Thu, May 16, 2013 at 4:19 PM, Bing Jiang jiangbinglo...@gmail.com wrote:
Have you checked your HBase environment? I think it perhaps comes from:
1) System uses
Which version of HBase?
Regards
Ram
On Thu, May 16, 2013 at 10:42 PM, Tianying Chang tich...@ebaysf.com wrote:
Hi,
When our customers (using TSDB) load large amounts of data into HBase, we
saw many NullPointerExceptions in the RS logs, as below. I checked the source
code; it seems when
Are you trying to get the row lock explicitly? Using HTable.lockRow?
Regards
Ram
On Thu, May 16, 2013 at 10:46 PM, ramkrishna vasudevan
ramkrishna.s.vasude...@gmail.com wrote:
Which version of HBase?
Regards
Ram
On Thu, May 16, 2013 at 10:42 PM, Tianying Chang
it is HBase 0.92. The customer is using TSDB and AsyncHBase. I am not sure what
their client code is calling exactly, but from the calling stack it feels like
it uses HTable.lockRow. Is this not recommended? If so, what should they use
instead?
Thanks
Tian-Ying
FYI, below I quoted the customer's response after I explained that the
NullPointerException is caused by the row lock. So my question is: if it is an
allowed situation for multiple threads/processes to compete for the lock, the
one that did not get it should be considered normal and should not throw
Have you checked your HBase environment? I think it perhaps comes from:
1) System uses more swap frequently when you continue to execute Get
operations?
I have set swap to 0. AFAIK, that's a recommended practice. Let me know if
that should not be followed for nodes running HBase.
2) check
Hi,
I am wondering what happens when we add the following:
row, col, timestamp -- v1
A flush happens. Now, we add
row, col, timestamp -- v2
A flush happens again. In this case, if MAX_VERSIONS == 1, how is the tie
broken during reads and during minor compactions - is it arbitrary?
Thanks
This generally happens when the same block is accessed for the HFile. Are
you seeing any contention on the HDFS side?
When you say contention, what should I be looking for? Slow operations to
respond to data block requests? Or some specific metric in Ganglia?
-Viral
Michael is correct.
More information available about swap value on wikipedia:
http://en.wikipedia.org/wiki/Swappiness
2013/5/16 Michael Segel michael_se...@hotmail.com
Going from memory, the swap value setting to 0 is a suggestion. You may
still actually swap, but I think its a 'last resort' type of thing.
When you look at top, at the top of the page, how much swap do you see?
When I look at top it says: 0K total, 0K used, 0K free (as expected). I can
try
On May 16, 2013, at 1:43 PM, Viral Bajaria viral.baja...@gmail.com wrote:
Have
If you're not swapping then don't worry about it.
My comment was that even though you set the swap to 0 - and I'm going from
memory here - it's possible for some swap to occur.
(But I could be wrong.)
You really don't have a lot of memory, and you have a 5GB heap... MSLAB on?
Could you be facing
Last row inserted wins.
On May 16, 2013, at 1:49 PM, Varun Sharma va...@pinterest.com wrote:
Hi,
I am wondering what happens when we add the following:
row, col, timestamp -- v1
A flush happens. Now, we add
row, col, timestamp -- v2
A flush happens again. In this case if
Thanks for sharing this info. Will remember for future debugging too.
Checked the vm.swappiness
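For reference, the knob being discussed can be inspected and changed like this on Linux (these are the standard sysctl paths; where the setting is persisted can vary by distro):

```shell
# Current value (0-100; lower means the kernel avoids swapping)
cat /proc/sys/vm/swappiness

# Set for the running system
sudo sysctl vm.swappiness=0

# Persist across reboots
echo 'vm.swappiness = 0' | sudo tee -a /etc/sysctl.conf
```

Note that, as Michael says above, swappiness=0 is a strong hint rather than a hard guarantee; the kernel may still swap as a last resort.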
We are pleased to announce the immediate availability of Phoenix 1.2
(https://github.com/forcedotcom/phoenix/wiki/Download). Here are some of
the release highlights:
* Improve performance of multi-point and multi-range queries (20x plus)
using new skip scan
* Support TopN queries (3-70x
Except in the case of bulk loads; if you import cells with the same
timestamp through a bulk load, which cell wins is non-deterministic.
Facebook fixed the issue, and the patch has been backported to 0.95. The
friendly folks at Cloudera are working on backporting the fix to 0.94 as
well.
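The ordering being described can be sketched as a comparator: when row, column, and timestamp are all equal, the sequence number assigned at write time breaks the tie, so the last insert wins; cells that lack such a sequence number (as with bulk loads) leave the tie unresolved. This is a minimal plain-Java sketch of that idea, not HBase's actual KeyValue comparator; the `seqId` field stands in for the internal memstore sequence number:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Simplified cell: row and column are assumed equal; only timestamp and
// seqId matter for the tie-break sketched here.
public class TieBreak {
    static final class Cell {
        final long timestamp; // user-visible version timestamp
        final long seqId;     // assigned monotonically at write time; absent (0) for bulk loads
        final String value;
        Cell(long timestamp, long seqId, String value) {
            this.timestamp = timestamp; this.seqId = seqId; this.value = value;
        }
    }

    // Newest first: higher timestamp wins; on a timestamp tie, higher seqId wins.
    static final Comparator<Cell> NEWEST_FIRST =
        Comparator.comparingLong((Cell c) -> c.timestamp).reversed()
                  .thenComparing(Comparator.comparingLong((Cell c) -> c.seqId).reversed());

    // With MAX_VERSIONS == 1, a read (or a compaction) keeps only the head cell.
    static String winner(List<Cell> cells) {
        return cells.stream().min(NEWEST_FIRST).get().value;
    }

    public static void main(String[] args) {
        List<Cell> cells = new ArrayList<>();
        cells.add(new Cell(100L, 1L, "v1")); // first flush
        cells.add(new Cell(100L, 2L, "v2")); // second flush, same timestamp
        System.out.println(winner(cells));   // prints v2: last insert wins
    }
}
```

If both cells had seqId 0 (the bulk-load case), the comparator would report them equal and which one survives would depend on file ordering, which is the non-determinism mentioned above.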
Follow
Let's say I have the following in my table:
row1, col1 -- v1 (HFile entry would be row1,col1,ts1 -- v1)
row1c, ol1 -- v2 (HFile entry would be row1c,ol1,ts1 -- v2)
Now I issue a prefix scan asking for row row1c - how do we seek? Do
we seek
Hi James,
You have mentioned support for TopN query. Can you provide me HBase Jira
ticket for that. I am also doing similar stuff in
https://issues.apache.org/jira/browse/HBASE-7474. I am interested in
knowing the details about that implementation.
Thanks,
Anil Gupta
On Thu, May 16, 2013 at
Or do we use some kind of demarcator b/w rows and columns and timestamps
when building the HFile keys and the indices ?
Thanks
Varun
On Thu, May 16, 2013 at 1:56 PM, Varun Sharma va...@pinterest.com wrote:
Let's say I have the following in my table:
row1, col1 -- v1
On Thu, May 16, 2013 at 2:03 PM, Varun Sharma va...@pinterest.com wrote:
Or do we use some kind of demarcator b/w rows and columns and timestamps
when building the HFile keys and the indices ?
No demarcation, but in KeyValue we keep the row, column family name, column
qualifier, etc.,
What are you seeing, Varun (or think you are seeing)?
St.Ack
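The point about needing no demarcator can be illustrated: each component of the key carries an explicit length, so the comparator always knows where the row ends and `row1/col1` can never collide with `row1c/ol1`. This is a rough sketch of a length-prefixed layout, simplified from (and not byte-identical to) the real KeyValue wire format:

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Sketch of a length-prefixed key: <rowLen:short><row><famLen:byte><family><qualifier>.
// Because rowLen is explicit, the row portion is unambiguous even though the
// concatenated row+qualifier bytes of two different cells could look alike.
public class KeySketch {
    static byte[] makeKey(String row, String family, String qualifier) {
        byte[] r = row.getBytes(StandardCharsets.UTF_8);
        byte[] f = family.getBytes(StandardCharsets.UTF_8);
        byte[] q = qualifier.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write((r.length >>> 8) & 0xFF); // row length, big-endian short
        out.write(r.length & 0xFF);
        out.write(r, 0, r.length);
        out.write(f.length & 0xFF);         // family length, one byte
        out.write(f, 0, f.length);
        out.write(q, 0, q.length);          // qualifier runs to the end of the key
        return out.toByteArray();
    }

    // Extract just the row portion using the encoded length.
    static byte[] rowOf(byte[] key) {
        int rowLen = ((key[0] & 0xFF) << 8) | (key[1] & 0xFF);
        return Arrays.copyOfRange(key, 2, 2 + rowLen);
    }

    public static void main(String[] args) {
        byte[] a = makeKey("row1", "d", "col1");
        byte[] b = makeKey("row1c", "d", "ol1");
        // Without the length prefix, row "row1" + qualifier "col1" and row
        // "row1c" + qualifier "ol1" would both flatten to "row1col1"; with it,
        // the rows stay distinct.
        System.out.println(Arrays.equals(rowOf(a), rowOf(b))); // prints false
    }
}
```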
On Thu, May 16, 2013 at 2:30 PM, Stack st...@duboce.net wrote:
On Thu, May 16, 2013 at 2:03 PM, Varun Sharma va...@pinterest.com wrote:
Or do we use some kind of demarcator b/w rows and columns and timestamps
when building the HFile
Nothing, I am just curious...
So, we will do a bunch of wasteful scanning - let's say row1 has col1
- col10 - basically 100K columns - we will scan all those key values
even though we are going to discard them. Is that correct?
On Thu, May 16, 2013 at 2:30 PM, Stack st...@duboce.net
What is your query?
If scanning over rows of 100k columns, yeah, you will go through each row's
content unless you specify you are only interested in some subset of the rows.
Then a 'skipping' facility will cut in, where we will use the index to skip
over unwanted content.
St.Ack
On Thu, May 16, 2013 at
Sorry, I may have misunderstood what you meant.
When you look for row1c in the HFile index - is it going to also match
row1,col1 or only match row1c? It all depends on how the index is
organized; if it's only on HFile keys, it could also match row1,col1 unless
we use some demarcator b/w row1 and
Referring to your comment above again:
If you're doing a prefix scan w/ row1c, we should be starting the scan at
row1c, not row1 (or more correctly, at the row that starts the block we
believe has a row1c row in it...)
I am trying to understand how you could seek right across to the block
containing
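One way a client can scope the prefix scan being discussed is to derive an exclusive stop row from the prefix: copy the prefix and increment its last byte, so the scan covers exactly [prefix, stop). This sketch shows that computation in plain Java; it illustrates the general idea, not HBase's own scan-setup code:

```java
import java.util.Arrays;

// Compute the exclusive stop row for a prefix scan: the prefix with its last
// non-0xFF byte incremented (trailing 0xFF bytes cannot be incremented and
// are dropped).
public class PrefixStop {
    static byte[] stopRowForPrefix(byte[] prefix) {
        byte[] stop = Arrays.copyOf(prefix, prefix.length);
        for (int i = stop.length - 1; i >= 0; i--) {
            if (stop[i] != (byte) 0xFF) {
                stop[i]++;
                return Arrays.copyOf(stop, i + 1);
            }
        }
        return new byte[0]; // all 0xFF: scan to the end of the table
    }

    public static void main(String[] args) {
        byte[] stop = stopRowForPrefix("row1c".getBytes());
        // The scan covers [row1c, row1d), so rows like row1 or row2 are excluded.
        System.out.println(new String(stop)); // prints row1d
    }
}
```

With start row `row1c` and stop row `row1d`, the index seek lands at (or just before) `row1c` and the scanner never touches `row1`'s columns at all.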
Hi Anil,
No HBase changes were required. We're already leveraging coprocessors in
HBase which is a key enabler. The other pieces needed are:
- a type system
- a means to evaluate an ORDER BY expression on the server
- memory tracking/throttling (the topN for each region are held in
memory
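The per-region topN described above (each region holds only its N best in memory, and the client merges the per-region results) can be sketched with a bounded min-heap. The names here are illustrative, not Phoenix's actual classes:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Each "region" keeps only its local top-N in a min-heap of bounded size;
// the client merges the per-region results and takes the global top-N.
public class TopNSketch {
    static List<Integer> topN(List<Integer> values, int n) {
        PriorityQueue<Integer> heap = new PriorityQueue<>(); // min-heap, size <= n
        for (int v : values) {
            heap.offer(v);
            if (heap.size() > n) heap.poll(); // evict the smallest
        }
        List<Integer> out = new ArrayList<>(heap);
        out.sort(Collections.reverseOrder());
        return out;
    }

    public static void main(String[] args) {
        // Two "regions" compute their local top-2; the client merges them.
        List<Integer> regionA = topN(List.of(5, 1, 9, 3), 2); // [9, 5]
        List<Integer> regionB = topN(List.of(8, 2, 7), 2);    // [8, 7]
        List<Integer> merged = new ArrayList<>(regionA);
        merged.addAll(regionB);
        System.out.println(topN(merged, 2)); // prints [9, 8]
    }
}
```

The memory-tracking point above follows from this shape: each region only ever holds N entries, so the server-side cost is bounded regardless of how many rows the region scans.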
Hi James,
Is this implementation present in the GitHub repo of Phoenix? If yes, can
you provide me the package name/classes?
I haven't got the opportunity to try out Phoenix yet, but I would like to
have a look at the implementation.
Thanks,
Anil Gupta
On Thu, May 16, 2013 at 4:15 PM, James
The lockRow and unlockRow APIs have been replaced by the checkAndXXX and
increment() APIs, so that any operation on a particular row can be done
atomically.
I am not sure of the use case that you are addressing here, but I recommend
taking a look at those APIs to see if they solve the problem for you.
RowLocks are
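The checkAndXXX semantics referred to above can be sketched with a plain compare-and-set loop. This simulates the behavior (an atomic read-compare-write on one row, where losing the race returns false rather than throwing), not the HBase client API itself:

```java
import java.util.concurrent.ConcurrentHashMap;

// Simulates checkAndPut-style semantics on a single "row": the put succeeds
// only if the current value matches the expected one, atomically. A caller
// that loses the race simply gets false back - no exception is thrown.
public class CheckAndPutSketch {
    private final ConcurrentHashMap<String, String> table = new ConcurrentHashMap<>();

    // Returns true if the value for row was `expected` and is now `update`.
    // A null `expected` means "only put if the row is absent".
    boolean checkAndPut(String row, String expected, String update) {
        if (expected == null) {
            return table.putIfAbsent(row, update) == null;
        }
        return table.replace(row, expected, update);
    }

    String get(String row) { return table.get(row); }

    public static void main(String[] args) {
        CheckAndPutSketch t = new CheckAndPutSketch();
        System.out.println(t.checkAndPut("r1", null, "v1")); // prints true: row was absent
        System.out.println(t.checkAndPut("r1", "v0", "v2")); // prints false: check failed, no exception
        System.out.println(t.checkAndPut("r1", "v1", "v2")); // prints true
    }
}
```

This is also why the NullPointerException discussed earlier in the thread is surprising: with check-and-set style APIs, a competing writer that loses is an expected outcome, not an error path.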