I needed to be sure it works for pre-0.90 versions *too*, because that is what one of our clusters uses (and, I believe, what most clusters in use by others run). It works for newer versions too. There's a recent jar available for download in the project source root which one can just download and use with their own cluster.
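[Editorial note: the core idea behind a write-distribution lib like HBaseWD — prefixing each row key with a short hash-derived salt so sequential writes spread across buckets — can be sketched as below. This is a minimal illustration in Python, not HBaseWD's actual API; `NUM_BUCKETS` and `distributed_key` are hypothetical names.]

```python
import hashlib

NUM_BUCKETS = 16  # hypothetical bucket count; real libs make this configurable

def distributed_key(original_key: str) -> str:
    """Prefix the key with a salt derived from its hash, so sequential
    keys are spread deterministically across NUM_BUCKETS key ranges."""
    digest = int(hashlib.md5(original_key.encode("utf-8")).hexdigest(), 16)
    salt = digest % NUM_BUCKETS
    return "%02x_%s" % (salt, original_key)
```

Because the salt is a pure function of the key, reads of a single known key can recompute the prefix; only range scans need the multi-scanner treatment discussed further down the thread.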
Alex Baranau
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - Hadoop - HBase

On Thu, Apr 21, 2011 at 6:04 PM, Eric Charles <eric.char...@u-mangate.com> wrote:
> Hi Alex,
>
> Yep, saw the "[ANN]: HBaseWD: Distribute Sequential Writes in HBase"
> thread. Tks for this :) - I will need some more time to test it (quite busy
> atm).
>
> Why did you declare 0.89.20100924 and not 0.91 (or 92) as the HBaseWD
> dependency?
>
> Tks,
> - Eric
>
> On 21/04/2011 15:56, Alex Baranau wrote:
>> For those who are looking for the solution to this or a similar issue, this
>> can be useful:
>>
>> Take a look at the HBaseWD (https://github.com/sematext/HBaseWD) lib, which
>> implements a solution close to what Lars described.
>> Also some info here: http://search-hadoop.com/m/AQ7CG2GkiO
>>
>> Alex Baranau
>> ----
>> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - Hadoop - HBase
>>
>> On Thu, Mar 17, 2011 at 11:52 AM, Eric Charles <eric.char...@u-mangate.com> wrote:
>>> Hi Lars,
>>> Many tks for your reply.
>>>
>>> For now, I just rely on random or hashed keys and don't need any range
>>> queries.
>>> I will have to choose a nice solution one day for ordered keys upon which
>>> I will range-query.
>>>
>>> I will post the results of the different data models I will try (looking
>>> for other threads/articles to get a better view of the options).
>>>
>>> Tks,
>>> - Eric
>>>
>>> On 16/03/2011 23:02, Lars George wrote:
>>>> Hi Eric,
>>>>
>>>> Oops, you are right, my example was not clear and actually confused the
>>>> hashed keys with sequential ones. The hash should map every Nth row key
>>>> to the same bucket, so that you would for example see an interleaved
>>>> distribution of row keys to regions. Region 1 holds 1, 8, 15,... while
>>>> region 2 holds 2, 9, 16,... and so on. I do not think performance is a
>>>> big issue.
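[Editorial note: the interleaved distribution Lars describes (region 1 holds 1, 8, 15,...; region 2 holds 2, 9, 16,...) is a simple modulo mapping. A minimal sketch, with hypothetical names and an illustrative bucket count of 7:]

```python
NUM_REGIONS = 7  # illustrative; matches the 7-bucket example in this thread

def region_for(row_key: int) -> int:
    """Interleaved distribution: every Nth sequential key maps to the
    same bucket, so consecutive keys land on different regions."""
    return (row_key - 1) % NUM_REGIONS + 1
```

With this mapping, keys 1, 8, 15 all fall in region 1 and keys 2, 9, 16 in region 2, exactly as in the example above.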
>>>> And yes, this is currently all client side driven :(
>>>>
>>>> Lars
>>>>
>>>> On Wed, Mar 16, 2011 at 2:57 PM, Eric Charles <eric.char...@u-mangate.com> wrote:
>>>>> Hi Lars,
>>>>> Many tks for your explanations!
>>>>>
>>>>> About the DFR (sequential-keys) vs DFW (random-keys) distinction, I
>>>>> imagine different cases (just rephrasing what you said to be sure I get it):
>>>>>
>>>>> - Keys are really random (GUID or whatever): you have the distribution
>>>>> for free, still can't do, and probably don't need, range-queries.
>>>>>
>>>>> - If keys are monotonically increasing (timestamp, autoincremented,...),
>>>>> there are two cases:
>>>>> 1) Sometimes you don't need to do range-queries and can store the key
>>>>> as a real hash (md5,...) to get distribution.
>>>>> 2) For time-based series for example, you may need to do range queries,
>>>>> and adding a salt can be an answer to combine the best of both worlds.
>>>>>
>>>>> I understand the "salt" approach as recreating "artificial" key spaces
>>>>> on the client side.
>>>>>
>>>>> I was first confused reading "row 1...1000 -> prefix h1_".
>>>>> To really make the distribution random, I would have expected the
>>>>> prefix/salt to be attributed randomly to a key, leading for example to an
>>>>> h1 keyspace such as: h1_key2032, h1_key0023, h1_key1014343, ...
>>>>>
>>>>> Maybe you meant the intermediate approach where time keys of "hour 1" go
>>>>> to the h1 keyspace, keys of "hour 2" go to the h2 keyspace,...
>>>>> In that case, if you look for keys in "hour 1", you would only need one
>>>>> scanner because you know that they reside in "h1_", and you could query
>>>>> with scan(h1_time1, h1_time2).
>>>>>
>>>>> But at some point, as you describe, you may need to scan different
>>>>> buckets with different scanners and use an ordered list to contain the result.
>>>>> - What about performance in that case? For a very large dataset, a range
>>>>> query will take much time.
>>>>> I can imagine the async client coming to the rescue. Maybe
>>>>> mapreduce jobs could also help, because they will benefit from data locality.
>>>>> - Also, the client application must manage the salts: it's a bit like
>>>>> reinventing a "salt" layer on top of the hbase region servers, letting the
>>>>> client carry this layer. The client will have to store (in hbase :)) the
>>>>> mapping between key ranges and their salt prefixes. It's a bit like
>>>>> exporting some core? functionality to the client.
>>>>>
>>>>> Strange, I feel I missed your point :)
>>>>> Tks,
>>>>>
>>>>> - Eric
>>>>>
>>>>> Sidenote: ...and yes, it seems I will have to learn some ruby stuff (I
>>>>> should get used to it, because I just learned another scripting language
>>>>> running on the jvm for another project...)
>>>>>
>>>>> On 16/03/2011 13:00, Lars George wrote:
>>>>>> Hi Eric,
>>>>>>
>>>>>> Socorro is Java and Python, I was just mentioning it as a possible
>>>>>> source of inspiration :) You can learn Ruby and implement it (I hear
>>>>>> it is easy... *cough*) or write the same in a small Java app and use
>>>>>> it from the command line or so.
>>>>>>
>>>>>> And yes, you can range scan using a prefix. We were discussing this
>>>>>> recently and there is this notion of design for reads, or design for
>>>>>> writes. DFR usually means sequential keys and DFW random keys. It is
>>>>>> tough to find common ground as both designs are at the far ends of the
>>>>>> same spectrum. A middle ground is the bucketed (or salted) approach,
>>>>>> which gives you distribution while still being able to scan...
>>>>>> but not without some client side support. One typical class of data is
>>>>>> timeseries-based keys. As for scanning them, you need N client side
>>>>>> scanners. Imagine this example:
>>>>>>
>>>>>> row 1 ... 1000 -> Prefix "h1_"
>>>>>> row 1001 ... 2000 -> Prefix "h2_"
>>>>>> row 2001 ... 3000 -> Prefix "h3_"
>>>>>> row 3001 ... 4000 -> Prefix "h4_"
>>>>>> row 4001 ... 5000 -> Prefix "h5_"
>>>>>> row 5001 ... 6000 -> Prefix "h6_"
>>>>>> row 6001 ... 7000 -> Prefix "h7_"
>>>>>>
>>>>>> So you have divided the entire range into 7 buckets. The prefixes
>>>>>> (also sometimes called salts) are used to distribute the row keys to
>>>>>> region servers. To scan the entire range as one large key space you
>>>>>> need to create 7 scanners:
>>>>>>
>>>>>> 1. scanner: start row "h1_", end row "h2_"
>>>>>> 2. scanner: start row "h2_", end row "h3_"
>>>>>> 3. scanner: start row "h3_", end row "h4_"
>>>>>> 4. scanner: start row "h4_", end row "h5_"
>>>>>> 5. scanner: start row "h5_", end row "h6_"
>>>>>> 6. scanner: start row "h6_", end row "h7_"
>>>>>> 7. scanner: start row "h7_", end row ""
>>>>>>
>>>>>> Now each of them gives you the first row that matches the start and
>>>>>> end row keys it is configured for. So you then take that first KV
>>>>>> each offers and add it to a list, sorted by kv.getRow() while removing
>>>>>> the hash prefix. For example, scanner 1 may have row "h1_1" to offer;
>>>>>> you then split off and drop the prefix "h1_" to get "1". The list would
>>>>>> then hold something like:
>>>>>>
>>>>>> 1. row "1" -> kv from scanner 1
>>>>>> 2. row "1010" -> kv from scanner 2
>>>>>> 3. row "2001" -> kv from scanner 3
>>>>>> 4. row "3033" -> kv from scanner 4
>>>>>> 5. row "4001" -> kv from scanner 5
>>>>>> 6. row "5002" -> kv from scanner 6
>>>>>> 7. row "6000" -> kv from scanner 7
>>>>>>
>>>>>> (assuming that the keys are not contiguous but have gaps)
>>>>>>
>>>>>> You then pop element #1 and do a "scanner1.next()" to get its next KV
>>>>>> offering. Then insert that into the list and you get:
>>>>>>
>>>>>> 1. row "3" -> kv from scanner 1
>>>>>> 2. row "1010" -> kv from scanner 2
>>>>>> 3. row "2001" -> kv from scanner 3
>>>>>> 4. row "3033" -> kv from scanner 4
>>>>>> 5. row "4001" -> kv from scanner 5
>>>>>> 6. row "5002" -> kv from scanner 6
>>>>>> 7. row "6000" -> kv from scanner 7
>>>>>>
>>>>>> Notice how you only ever have a list with N elements on the client
>>>>>> side, each representing the next value its scanner offers. Since the
>>>>>> list is sorted, you always access item #1 and therefore the next row
>>>>>> in the entire key space.
>>>>>>
>>>>>> Once scanner 1 runs out you can close and remove it; the list will
>>>>>> then give you values from scanner 2 as its first elements. And so on.
>>>>>>
>>>>>> Makes more sense?
>>>>>>
>>>>>> Lars
>>>>>>
>>>>>> On Wed, Mar 16, 2011 at 12:09 PM, Eric Charles <eric.char...@u-mangate.com> wrote:
>>>>>>> Hi Lars,
>>>>>>> Are you talking about http://code.google.com/p/socorro/ ?
>>>>>>> I can find python scripts, but no jruby one...
>>>>>>>
>>>>>>> Aside from the hash function I could reuse, are you saying that range
>>>>>>> queries are possible even with hashed keys (randomly distributed)?
>>>>>>> (If possible with the script, it will also be possible from the hbase
>>>>>>> java client).
>>>>>>> Even with your explanation, I can't figure out how compound keys
>>>>>>> (hashedkey+key) can be range-queried.
>>>>>>>
>>>>>>> Tks,
>>>>>>> - Eric
>>>>>>>
>>>>>>> On 16/03/2011 11:38, Lars George wrote:
>>>>>>>> Hi Eric,
>>>>>>>>
>>>>>>>> Mozilla Socorro uses an approach where they bucket ranges using
>>>>>>>> leading hashes to distribute them across servers. When you want to do
>>>>>>>> scans you need to create N scans, where N is the number of hashes, and
>>>>>>>> then do a next() on each scanner, putting all KVs into one sorted list
>>>>>>>> (use the KeyComparator for example) while stripping the prefix hash
>>>>>>>> first. You can then access the rows in sorted order, where the first
>>>>>>>> element in the list is the one with the first key to read.
>>>>>>>> Once you take off the first element (being the lowest KV key) you
>>>>>>>> next the underlying scanner and reinsert its offering into the list,
>>>>>>>> reordering it. You keep taking from the top and therefore always see
>>>>>>>> the entire range, even if the same scanner would return the next
>>>>>>>> logical rows to read.
>>>>>>>>
>>>>>>>> The shell is written in JRuby, so any function you can use there
>>>>>>>> would make sense to use in the prefix; then you could compute it on
>>>>>>>> the fly. This will not help with merging the bucketed key ranges, you
>>>>>>>> need to do that with the above approach in code. Though since this is
>>>>>>>> JRuby you could write that code in Ruby and add it to your local
>>>>>>>> shell, giving you what you need.
>>>>>>>>
>>>>>>>> Lars
>>>>>>>>
>>>>>>>> On Wed, Mar 16, 2011 at 9:01 AM, Eric Charles <eric.char...@u-mangate.com> wrote:
>>>>>>>>> Oops, forget my first question about range queries (if keys are
>>>>>>>>> hashed, they can not be queried based on a range...)
>>>>>>>>> Still curious to have info on the hash function in the shell (2.)
>>>>>>>>> and advice on md5/jenkins/sha1 (3.)
>>>>>>>>> Tks,
>>>>>>>>> Eric
>>>>>>>>>
>>>>>>>>> On 16/03/2011 09:52, Eric Charles wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> To help avoid hotspots, I'm planning to use hashed keys in some
>>>>>>>>>> tables.
>>>>>>>>>>
>>>>>>>>>> 1. I wonder if this strategy is advised for range query (from/to
>>>>>>>>>> key) use cases, because the rows will be randomly distributed in
>>>>>>>>>> different regions. Will it cause some performance loss?
>>>>>>>>>> 2. Is it possible to query from the hbase shell with something like
>>>>>>>>>> "get 't1', @hash('r1')", to let the shell compute the hash for you
>>>>>>>>>> from the readable key?
>>>>>>>>>> 3. There are MD5 and Jenkins classes in the hbase.util package.
>>>>>>>>>> What would you advise? What about SHA1?
>>>>>>>>>>
>>>>>>>>>> Tks,
>>>>>>>>>> - Eric
>>>>>>>>>>
>>>>>>>>>> PS: I searched the archive but didn't find the answers.
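[Editorial note: the multi-scanner technique Lars walks through above — keep the head element of each bucket scanner in a sorted structure, pop the smallest, advance that scanner, reinsert — is a k-way merge. A minimal sketch in Python (the thread's code would be Java or JRuby), using plain sorted lists as stand-ins for HBase scanners; `merged_scan` and `strip_prefix` are hypothetical names:]

```python
import heapq

def strip_prefix(salted_key: str) -> str:
    # "h1_0042" -> "0042"
    return salted_key.split("_", 1)[1]

def merged_scan(buckets):
    """Merge N per-bucket scanners into one stream ordered by the
    unsalted key. Each bucket yields salted row keys in sorted order."""
    iters = [iter(b) for b in buckets]
    heap = []
    # Seed the heap with each scanner's first offering.
    for i, it in enumerate(iters):
        first = next(it, None)
        if first is not None:
            heapq.heappush(heap, (strip_prefix(first), i))
    # Always pop the globally smallest key, then refill from that scanner.
    while heap:
        key, i = heapq.heappop(heap)
        yield key
        nxt = next(iters[i], None)
        if nxt is not None:
            heapq.heappush(heap, (strip_prefix(nxt), i))

# Usage: two buckets with zero-padded keys (string order == numeric order)
bucket1 = ["h1_0001", "h1_0003"]
bucket2 = ["h2_0002", "h2_1010"]
```

As in Lars's description, only N elements live on the client at any time, one per open scanner; closing an exhausted scanner is modeled here by simply not reinserting from a drained iterator.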