Hi all,
As I understand it, HBase automatically splits a region when the region grows too big.
So in what scenarios does a user need to do a manual split? Could someone kindly give me some examples where a user needs to split a region explicitly via the HBase Shell or the Java API?
Thanks very much.
Hi Ming,
The reason we have it is that the user can decide where each key goes. I can think of multiple scenarios off the top of my head where it would be useful, and others can correct me if I am wrong.
1. Cases where you cannot have row keys that are evenly distributed lexically, leading to unevenly loaded regions.
I had a customer with a sequence-based key (yes, he knew all the downsides of that). Being able to split manually meant he could split a region that got too big near the end instead of right down the middle. With a sequentially increasing key, splitting the region in half left one region at half the desired size.
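For concreteness, here is a minimal sketch of such a targeted split through the Java client Admin API (the table name and split key are made-up examples; the shell equivalent is split 'events', 'seq-0009990000'). The point is that you, not the region server, choose the split key:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.util.Bytes;

public class ManualSplitExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Admin admin = conn.getAdmin()) {
      // Split near the end of the key range rather than at the midpoint,
      // so the lower daughter keeps nearly all existing data and the upper
      // daughter receives the new, sequentially increasing keys.
      admin.split(TableName.valueOf("events"), Bytes.toBytes("seq-0009990000"));
    }
  }
}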
Thanks Arun and John,
Both of your scenarios make a lot of sense to me. But for the sequence-based key case, I am still confused. It is like an append-only workload, so new data is always written into the same region, but that region will eventually reach hbase.hregion.max.filesize and be split automatically anyway, won't it?
To be honest, we were doing manual splits mainly because we wanted to make sure they happened on our schedule.
But it also occurred to me that automatic splits, at least by default, split the region in half. Normally the idea is that both new halves continue to grow, but with a sequentially increasing key only the upper half keeps receiving writes.
Thanks John,
This is a very good answer; now I understand why you use manual splits, thanks.
And I had a typo in my previous post:
C is very close to A, not to (B-A)/2. So every split in the middle of the key range will result in one big region and one small region, which is very bad.
So HBase only does auto
Hi Ted,
I have now finished reading the filtering section and the source code of TestJoinedScanners (0.94).
Facts learned:
- While scanning, an entire row will be read even for rowkey-only filtering. (Since a rowkey is not a physically separate entity but is stored in each KeyValue object, this is natural. Am I right?)
Hi All,
I am trying to run a JUnit test for SortingCoprocessor (HBASE-7474) on HBase 0.98.
I am getting this error:
14/08/06 07:06:09 ERROR namenode.FSNamesystem: FSNamesystem initialization
failed. org.apache.hadoop.metrics2.MetricsException: Metrics source
RetryCache/NameNodeRetryCache already exists!
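In case it helps: this failure typically means a second mini-cluster tried to register the same metrics source in one JVM. A hedged sketch of one workaround, assuming a JUnit class using HBaseTestingUtility, is to switch the Hadoop metrics system into mini-cluster mode before starting the cluster:

import org.apache.hadoop.hbase.HBaseTestingUtility;
import org.apache.hadoop.metrics2.lib.DefaultMetricsSystem;
import org.junit.BeforeClass;

public class SortingCoprocessorTest {
  private static final HBaseTestingUtility TEST_UTIL = new HBaseTestingUtility();

  @BeforeClass
  public static void setUpBeforeClass() throws Exception {
    // Unique-ify metrics source names so re-registration within the same
    // JVM does not throw "Metrics source ... already exists".
    DefaultMetricsSystem.setMiniClusterMode(true);
    TEST_UTIL.startMiniCluster();
  }
}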
I did not quite understand your problem. You store your data in HBase, and I guess you will later also read data from it. Generally, HBase will first check whether the data exists in the memstore; if not, it will check the disk. If you set the memstore to 0, it means every read will be forwarded directly to disk.
bq. HBase will first check whether the data exists in the memstore; if not, it will check the disk
For read path, don't forget block cache / bucket cache.
Cheers
On Wed, Aug 6, 2014 at 7:54 AM, yonghu yongyong...@gmail.com wrote:
I did not quite understand your problem. You store your data in HBase,
We have no known vulnerabilities that equate to a SQL injection attack
vulnerability. However, as Esteban says you'd want to treat HBase like any
other datastore underpinning a production service and out of an abundance
of caution deploy it into a secure enclave behind an internal service API,
so
You are just starting up a service and want the load split between multiple region servers from the start, instead of waiting for splits to happen later. Say you have 5 region servers; one way to create your table via the HBase shell is like this:
create 'tablename', 'f', {NUMREGIONS => 5, SPLITALGO => 'HexStringSplit'}
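A rough Java equivalent of that pre-split creation, sketched against the 0.98-era client API ('tablename' and 'f' as above, HexStringSplit assumed):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.util.RegionSplitter;

public class PreSplitExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (HBaseAdmin admin = new HBaseAdmin(conf)) {
      HTableDescriptor desc = new HTableDescriptor(TableName.valueOf("tablename"));
      desc.addFamily(new HColumnDescriptor("f"));
      // HexStringSplit spreads split points evenly over a hex key space;
      // 5 regions need 4 split keys.
      byte[][] splits = new RegionSplitter.HexStringSplit().split(5);
      admin.createTable(desc, splits);
    }
  }
}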
Hi,
The description of HBASE-5416 states why it was introduced. If you only have one CF, a dummy CF does not help; it is helpful for the multi-CF case, e.g. putting frequently filtered columns in one column family and infrequently used ones in another.
bq. Field name will be included in rowkey.
Please read chapter 9.
bq. While scanning, an entire row will be read even for a rowkey filtering
If you specify an essential column family in your filter, the above would not be true - only the essential column family would be loaded into memory first. Once the filter passes, the other families would be loaded.
Cheers
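To make that concrete, a small sketch (family and qualifier names are made up) of a scan where SingleColumnValueFilter reports only its own family as essential, so other families are loaded lazily under HBASE-5416's joined scanners:

import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class EssentialCfScan {
  public static Scan buildScan() {
    // Filter on a small, frequently checked family; SingleColumnValueFilter
    // declares only this family essential via isFamilyEssential().
    SingleColumnValueFilter filter = new SingleColumnValueFilter(
        Bytes.toBytes("meta"), Bytes.toBytes("flag"),
        CompareOp.EQUAL, Bytes.toBytes("1"));
    filter.setFilterIfMissing(true);

    Scan scan = new Scan();
    scan.setFilter(filter);
    // Enable joined scanners: non-essential families are fetched only
    // for rows the filter accepts.
    scan.setLoadColumnFamiliesOnDemand(true);
    return scan;
  }
}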
On 06.08.2014 at 19:07, Andrew Purtell wrote:
We have no known vulnerabilities that equate to a SQL injection attack
vulnerability. However, as Esteban says you'd want to treat HBase like any
other datastore underpinning a production service and out of an abundance
of caution deploy it into
Thank you Ted.
But the RowFilter class has no method that can be used to set which column family is essential. (Actually, no built-in filter class provides such a method.)
So, if I ever want to apply the 'dummy' column family technique, it seems that I must do as follows (see the sketch after this list):
- Write my own filter
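Something along these lines, perhaps - a hedged sketch only (class name and family are made up; delegation and the version-specific serialization hooks are elided):

import org.apache.hadoop.hbase.filter.FilterBase;
import org.apache.hadoop.hbase.util.Bytes;

// Declares only the small 'dummy' family essential so that joined
// scanners skip the heavy families for rows the filter rejects.
public class DummyFamilyRowFilter extends FilterBase {
  private final byte[] essentialFamily = Bytes.toBytes("dummy"); // hypothetical CF

  @Override
  public boolean isFamilyEssential(byte[] name) {
    return Bytes.equals(name, essentialFamily);
  }

  @Override
  public boolean filterRowKey(byte[] buffer, int offset, int length) {
    // Row-key test goes here (e.g. delegate to a wrapped RowFilter);
    // returning false keeps the row.
    return false;
  }
  // A real filter also needs toByteArray()/parseFrom() (0.96+) or
  // Writable serialization (0.94) to travel to the region server.
}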
It will be the key of the KeyValue. The key includes rk + cf + qualifier + ts + type, so all of these are part of the key. Your answer #1 is correct (but with the addition of the type as well). Hope this makes it clear for you.
-Anoop-
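If it helps, a tiny sketch showing that composition (values are illustrative):

import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.util.Bytes;

public class KeyLayoutExample {
  public static void main(String[] args) {
    KeyValue kv = new KeyValue(
        Bytes.toBytes("row1"),   // rk
        Bytes.toBytes("cf"),     // column family
        Bytes.toBytes("q"),      // qualifier
        1407300000000L,          // ts
        KeyValue.Type.Put,       // type
        Bytes.toBytes("value"));
    // getKey() returns the serialized key part only:
    // rk + cf + qualifier + ts + type (the value is excluded).
    byte[] key = kv.getKey();
    System.out.println(key.length + " key bytes");
  }
}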
On Tue, Aug 5, 2014 at 9:43 AM, innowireless TaeYun Kim
Hi Qiang,
Thank you for your help.
1. Regarding HBASE-5416, I think its purpose is simple:
avoid loading column families that are irrelevant to filtering while scanning.
So, it can be applied to my 'dummy CF' case.
That is, a dummy CF can act like a 'relevant' CF for filtering, provided that
bq. no built-in filter intelligently determines which column family is
essential, except for SingleColumnValueFilter
Mostly right - don't forget about SingleColumnValueExcludeFilter, which extends SingleColumnValueFilter.
Cheers
On Wed, Aug 6, 2014 at 9:34 PM, innowireless TaeYun Kim
Thank you Anoop.
Though it's a bit strange to include the CF in the index, since the entire block index is contained in an HFile for a specific CF, I'm sure there is a good reason (maybe the performance of the comparison).
Anyway, it should be almost a non-issue, since the length of the CF name should be short.
Hi TaeYun,
thanks for explaining.
On Thu, Aug 7, 2014 at 12:50 PM, innowireless TaeYun Kim
taeyun@innowireless.co.kr wrote:
Hi Qiang,
thank you for your help.
1. Regarding HBASE-5416, I think its purpose is simple:
avoid loading column families that are irrelevant to filtering while
Hi, all
I am now using Spark to manipulate HBase, but I can't use HBaseTestingUtility for unit tests, because Spark needs Guava 15.0 and above while HBase needs Guava 14.0.1. These two versions are incompatible. Is there any way to solve this conflict with Maven?
Thanks,
Kevin.
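One approach I have seen for this kind of clash (a hedged sketch from memory; the plugin version and relocated package name are illustrative) is to shade and relocate Guava in the module that needs the older version, so both copies can coexist on the test classpath:

<!-- pom.xml fragment: relocate Guava under a private package name -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>2.3</version>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <relocations>
          <relocation>
            <pattern>com.google.common</pattern>
            <shadedPattern>shaded.com.google.common</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>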
Are there any tutorials available on the net for connecting the Spark Java API with HBase?
Thanks and Regards
Deepa
From: Dai, Kevin yun...@ebay.com
To: user@hbase.apache.org
Date: 08/07/2014 11:11 AM
Subject: Guava version incompatible
Hi, all
I am now