Take a look at Phoenix (https://github.com/forcedotcom/phoenix). It supports
both salting and fuzzy row filtering through its skip scan.
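The fuzzy matching mentioned above works on fixed-position wildcards: a mask marks which key positions must match and which are "don't care". A minimal pure-Java sketch of that matching rule (illustrative only — not the actual HBase `FuzzyRowFilter` class, which additionally uses the mask to seek ahead during scans; it uses FuzzyRowFilter's convention of mask byte 0 = fixed, 1 = any):

```java
// Mimics the matching rule of HBase's FuzzyRowFilter:
// mask byte 0 = this position must match the pattern, 1 = any byte accepted.
public class FuzzyMatch {
    static boolean matches(byte[] row, byte[] pattern, byte[] mask) {
        if (row.length < pattern.length) return false;
        for (int i = 0; i < pattern.length; i++) {
            if (mask[i] == 0 && row[i] != pattern[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // pattern "??cd": first two bytes are wildcards, last two are fixed
        byte[] pattern = {0, 0, 'c', 'd'};
        byte[] mask    = {1, 1, 0, 0};
        System.out.println(matches(new byte[]{'a', 'b', 'c', 'd'}, pattern, mask)); // true
        System.out.println(matches(new byte[]{'a', 'b', 'x', 'd'}, pattern, mask)); // false
    }
}
```

This is what makes a fuzzy filter a good fit for salted keys: the salt byte can be a wildcard while the sub-key positions stay fixed.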
On Sun, Oct 20, 2013 at 10:42 PM, Premal Shah premal.j.s...@gmail.com wrote:
Have you looked at FuzzyRowFilter? Seems to me that it might satisfy your
Let's look at what you are trying to do...
You want to take data where the key is a timestamp (long datatype)
You append it to a salt value, 1-10 or 0-9; your example doesn't say...
You have a couple of problems with your choice of a key...
First, after your initial 10 splits, you will still
Sorry if this double posts... may have used the wrong email first.
On Oct 21, 2013, at 6:36 AM, Michael Segel michael_se...@hotmail.com wrote:
Let's look at what you are trying to do...
You want to take data where the key is a timestamp (long datatype)
You append it to a salt value, 1-10
Hi Kiru,
Thanks for the reply.
My understanding of these warning messages in the region server logs is not
very clear.
When I read the documentation, it says they are related to slow query logs.
So is it an issue in earlier versions of HBase that is fixed in 0.96, or is it
an issue in my
Thanks Lars! I will give it a try and will let you know.
On Sunday, October 20, 2013, lars hofhansl la...@apache.org wrote:
1/2 - 2/3 of your available RAM should be a good place to start.
From: A Laxmi a.lakshmi...@gmail.com
To: user@hbase.apache.org
Nice job everyone.
On Sat, Oct 19, 2013 at 1:31 PM, Dave Wang d...@cloudera.com wrote:
Congratulations everyone!
- Dave
On Saturday, October 19, 2013, Stack wrote:
hbase-0.96.0 is now available for download [0].
Apache HBase is a scalable, distributed data store that runs atop Apache
FuzzyRowFilter does not work on sub-key ranges.
Salting is bad for any scan operation, unfortunately. When salt prefix
cardinality is small (1-2 bytes),
one can try something similar to FuzzyRowFilter but with additional sub-key
range support.
If salt prefix cardinality is high (> 2 bytes) - do a
I advise you to refactor your key.
1. First, use salting of a low cardinality (say 1 random byte)
2. To improve range query - add time bucket to your time dimensions:
KEY:
salt_timebucket_time
timebucket is something similar to: day_hour_min
time = the sec+ms part of the timestamp
It will be
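The salt_timebucket_time layout above can be sketched in plain Java. Assumptions (all illustrative, not an HBase or Phoenix API): a 1-byte salt, an 8-byte minute bucket standing in for day_hour_min, a 2-byte sec+ms remainder, and 16 salt buckets. The salt here is derived deterministically from the bucket so the same timestamp always lands in the same region; Vladimir's "1 random byte" variant spreads writes more evenly but forces reads to fan out over all buckets.

```java
import java.nio.ByteBuffer;

// Sketch of a salt_timebucket_time row key: 1-byte salt + 8-byte minute
// bucket ("timebucket") + 2-byte sec+ms remainder ("time"). Names and the
// bucket count are illustrative assumptions, not an HBase API.
public class SaltedTimeKey {
    static final int SALT_BUCKETS = 16;

    static byte[] buildKey(long tsMillis) {
        long minuteBucket = tsMillis / 60_000L;        // day_hour_min resolution
        int remainder = (int) (tsMillis % 60_000L);    // sec+ms within the minute
        // Deterministic salt: rows in the same minute stay contiguous in one bucket.
        byte salt = (byte) Math.floorMod(Long.hashCode(minuteBucket), SALT_BUCKETS);
        return ByteBuffer.allocate(1 + 8 + 2)
                .put(salt)
                .putLong(minuteBucket)
                .putShort((short) remainder)
                .array();
    }

    public static void main(String[] args) {
        byte[] key = buildKey(System.currentTimeMillis());
        System.out.println("key length = " + key.length); // 11 bytes
    }
}
```

A range query over a time window then becomes, per salt value, a scan from (salt, startBucket) to (salt, endBucket).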
Phoenix restricts salting to a single byte.
Salting perhaps is misnamed, as the salt byte is a stable hash based on the
row key.
Phoenix's skip scan supports sub-key ranges.
We've found salting in general to be faster (though there are cases where
it's not), as it ensures better parallelization.
Hello,
I’m using the new 0.94.6 multi-scan feature to pull rows from different tables
into a single mapper.
ArrayList<Scan> scans = new ArrayList<Scan>();
scans.add(scanMain);
scans.add(scanJunction);
James, good to hear that Phoenix supports these features.
Best regards,
Vladimir Rodionov
Principal Platform Engineer
Carrier IQ, www.carrieriq.com
e-mail: vrodio...@carrieriq.com
From: James Taylor [jtay...@salesforce.com]
Sent: Monday, October 21, 2013
John:
Can you let us know whether the Import succeeded this time?
If not, can you provide logs from some region server?
Thanks
On Sun, Oct 20, 2013 at 11:32 AM, John johnnyenglish...@gmail.com wrote:
Thanks for that... I've started the Import again and hope it will work
this time.
Hi Jim,
I don't see an obvious way to gain access to this information. If you don't
find a clever way to get at this, would you mind opening a ticket for this
feature request?
Thanks,
Nick
On Mon, Oct 21, 2013 at 9:44 AM, Jim Holloway
jim.hollo...@windstream.net wrote:
Hello,
I’m using the
Then it's not a SALT. And please don't use the term 'salt' because it has a
specific meaning outside of what you want it to mean. Just like saying HBase
has ACID because you write the entire row as an atomic element. But I digress…
Ok so to your point…
1 byte == 256 possible values.
So
fyi.
-- Forwarded message --
From: Vivek Mishra vivek.mis...@impetus.co.in
Date: Tue, Oct 22, 2013 at 1:33 AM
Subject: {kundera-discuss} Kundera 2.8 released
To: kundera-disc...@googlegroups.com
Hi All,
We are happy to announce the release of
What do you think it should be called? Because
prepending-row-key-with-single-hashed-byte doesn't have a very good ring
to it. :-)
Agree that getting the row key design right is crucial.
The range of prepending-row-key-with-single-hashed-byte is declarative
when you create your table in Phoenix,
What do you call hashing the row key?
Or hashing the row key and then appending the row key to the hash?
Or hashing the row key, truncating the hash value to some subset and then
appending the row key to the value?
The problem is that there is a specific meaning to the term salt. Re-using it
Hi,
I would like to fetch data from an HBase table using the MapReduce Export API. I
see that I can fetch data using start and stop time, but I don't see any
information regarding start and stop row keys. Can any expert guide me or
give me an example in order to fetch the first 1000 rows (or start and stop row
Hello, I want to check the status of each coprocessor, in a given table.
Let's say I have 3 CPs and one of them is removed due to some unhandled
exception, so I want to see this status (3 deployed, 2 currently alive).
I found this from
w.r.t. #3, there is a config parameter: hbase.coprocessor.abortonerror
which determines whether the hosting server should abort.
See the following tests for examples:
hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterCoprocessorExceptionWithAbort.java
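The parameter Ted mentions is set in hbase-site.xml. A sketch (the value here is a choice, not a recommendation): with false, the failing coprocessor is unloaded and the server keeps running; with true, the hosting region server aborts on an unhandled coprocessor error.

```xml
<property>
  <name>hbase.coprocessor.abortonerror</name>
  <value>false</value>
  <!-- false: unload the failing coprocessor and keep the server running;
       true: abort the hosting region server on an unhandled coprocessor error -->
</property>
```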
The version you are using only supports PrefixFilter and RegexFilter for scans.
Unless your start and stop row have the same prefix (or you can somehow get it
into a regex), you won't be able to do it as is. You can always write your own
export (we did that to support some more functionality
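A custom export that honors a start/stop row needs to compare keys the way HBase orders them: byte by byte, as unsigned values. A minimal pure-Java sketch of that range check (illustrative; a real export would apply it while iterating scan results):

```java
// Unsigned lexicographic row-key comparison, mirroring HBase's row ordering,
// and the [startRow, stopRow) check a hand-rolled export would apply.
public class RowRange {
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xFF) - (b[i] & 0xFF);  // unsigned byte compare
            if (d != 0) return d;
        }
        return a.length - b.length;                  // shorter key sorts first
    }

    // start is inclusive, stop is exclusive -- the same convention as Scan.
    static boolean inRange(byte[] row, byte[] start, byte[] stop) {
        return compare(row, start) >= 0 && compare(row, stop) < 0;
    }

    public static void main(String[] args) {
        System.out.println(inRange("row-b".getBytes(), "row-a".getBytes(), "row-c".getBytes())); // true
        System.out.println(inRange("row-c".getBytes(), "row-a".getBytes(), "row-c".getBytes())); // false
    }
}
```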
We don't truncate the hash, we mod it. Why would you expect that data
wouldn't be evenly distributed? We've not seen this to be the case.
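The "mod it" approach James describes can be sketched in plain Java: compute a stable hash of the row key, mod it by the bucket count, and prepend the resulting byte. This is illustrative only, not Phoenix's actual implementation; the hash function and bucket count are assumptions.

```java
import java.util.Arrays;

// Sketch of single-byte salting via hash-mod-buckets: the salt is a stable
// function of the row key, so point reads can recompute it, and keys spread
// roughly evenly across the buckets. Not Phoenix's actual code.
public class SaltByte {
    static byte[] salted(byte[] rowKey, int buckets) {
        int h = Arrays.hashCode(rowKey);              // any stable hash of the key
        byte salt = (byte) Math.floorMod(h, buckets); // always in [0, buckets)
        byte[] out = new byte[rowKey.length + 1];
        out[0] = salt;                                // salt byte prepended
        System.arraycopy(rowKey, 0, out, 1, rowKey.length);
        return out;
    }

    public static void main(String[] args) {
        byte[] k = salted("user-42".getBytes(), 16);
        System.out.println(k.length); // original key length + 1
    }
}
```

Because the salt is deterministic, a Get recomputes it from the key; only range scans need to fan out, one sub-scan per bucket.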
On Mon, Oct 21, 2013 at 1:48 PM, Michael Segel msegel_had...@hotmail.com wrote:
What do you call hashing the row key?
Or hashing the row key and then
For #1, the coprocessors reported are live ones.
See CoprocessorHost#getCoprocessors()
For #2, I don't know such API (without parsing ServerLoad) exists.
Cheers
On Mon, Oct 21, 2013 at 2:50 PM, Ted Yu yuzhih...@gmail.com wrote:
w.r.t. #3, there is a config parameter:
Hi Dhaval,
Can you please share your code if possible? It would benefit others as
well.
Thanks,
karunakar.
James,
It's evenly distributed, however... because it's a timestamp, it's a 'tail-end
charlie' addition.
So when you split a region, the top half is never added to, so you end up with
all regions half filled except for the last region in each 'modded' value.
I wouldn't say it's a bad thing
Uhm...
You can't remove a coprocessor.
Well, you can, but that would require a rolling restart.
It still exists and is still loaded.
On Oct 21, 2013, at 4:41 PM, Wei Tan w...@us.ibm.com wrote:
Hello, I want to check the status of each coprocessor, in a given table.
Let's say I have 3
You can't remove a coprocessor.
Well, you can, but that would require a rolling restart.
It still exists and is still loaded.
Assuming we are talking about RegionObserver coprocessors here, when a
coprocessor throws an exception (other than IOException), it is either:
a) removed from the
Hi Gary, thanks!
It seems that the "region observer has been removed" behavior is per region and
NOT per coprocessor. So do I have to query each region to get the per-region
health status? Or is there a table-level API telling me something
like: I have 10 regions and an observer has been removed in
One thing I neglected to mention is that the table is pre-split at the
prepending-row-key-with-single-hashed-byte boundaries, so the expectation
is that you'd allocate enough buckets that you don't end up needing to
split the regions. But if you under-allocate (i.e. allocate too small a
Nice job and congrats to everyone !!!
On Mon, Oct 21, 2013 at 9:38 PM, Elliott Clark ecl...@apache.org wrote:
Nice job everyone.
On Sat, Oct 19, 2013 at 1:31 PM, Dave Wang d...@cloudera.com wrote:
Congratulations everyone!
- Dave
On Saturday, October 19, 2013, Stack wrote: