Hey dev-list, regarding this... re: "Be careful using hbase row locks. They are (unofficially -- we need to make it official) deprecated."
... is this the official advice? Should I update the book with this?

On 1/3/12 4:37 PM, "Stack" <st...@duboce.net> wrote:

>On Tue, Jan 3, 2012 at 12:38 PM, Joe Stein
><charmal...@allthingshadoop.com> wrote:
>> when the event happened so if we see something from November 3rd today then
>> we will only keep it for 4 more months (and for events that we see today
>> those stay for 6 months). so it sounds like this might be a viable option
>> and when we set the timestamp in our checkAndPut we make the timestamp be
>> the value that represents it as November 3rd, right? cool
>>
>
>This should be fine.
>
>You might want to protect against future dates.
>
>> well what i was thinking is that my client code would know to use the
>> november table and put the data in the november table (it is all just
>> strings) but I am leaning now towards the TTL option (need to futz with it
>> all more though). One issue/concern with TTL is when all of a sudden we
>> want to keep things for only 4 months or maybe 8 months and then having to
>> re-TTL trillions of rows =8^( (which is nagging thought in the back of my
>> head about ttls, requirements change)....
>
>This schema attribute is kept at the table level, not per row. You'll
>have to change the table schema which in 0.90.x hbase means offlining
>table (in 0.92 hbase, there is an online schema edit but needs to be
>enabled and can be problematic in the face of splitting.... more on
>this later).
>
>> That makes sense why a narrow long schema works well then, got it (I am used
>> to Cassandra and do lots of wide column range slices on those columns this
>> is like inverting everything up on myself but the row locks and checkAndPut
>> (and co-processors) hit so many of my use cases (as Cassandra still does
>> also)
>>
>
>Be careful using hbase row locks. They are (unofficially -- we need
>to make it official) deprecated. You can lock yourself out of a
>regionserver if all incoming handlers end up waiting on a particular
>row lock to clear. Check back in this mailing list for other rowlock
>downsides.
>
>You can do column range slices in hbase if you use filters (if you need to).
>
>checkAndPut shouldn't care if row is wide or not?
>
>
>> right now I am on 0.90.4 but right now I am going back and forth in
>> changing our hadoop cluster, HBase is the primary driver for that so I am
>> currently wrestling on the decision with upgrading from existing cluster
>> CDH2 to CDH3 or going with MapR ...
>
>Go to CDH3 if you are on CDH2. Does CDH2 have a working sync?
>(CDH3u3 when it arrives has some nice perf improvements).
>
>> my preference is to run my own version
>> of HBase (like I do with Kafka and Cassandra) I feel I can do this though I
>> am not comfortable with running my own Hadoop build (already overloaded
>> with things). 0.92 is exciting for co-processors too and it is cool system
>> to hack on, maybe I will pull from trunk build and test it out some too.
>>
>
>Don't do hbase trunk. Do tip of 0.92 branch if you want to hack.
>HBase Trunk is different from 0.92 already and will get even more
>"differenter"; it'll be hard to help you if you are pulling from trunk
>
>St.Ack
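
For anyone following the timestamp/TTL part of the thread quoted above, here is a minimal sketch in plain Java (no HBase dependency) of the two checks that came up: capping future dates, and working out how long an event survives when its event time is used as the cell timestamp. It assumes a cell ages out once its timestamp is older than the TTL. The class and method names (EventTimestamps, clampToNow, remainingRetention) are made up for illustration and are not HBase API.

```java
import java.time.Duration;

/** Hypothetical helper for illustration only; not part of the HBase client API. */
final class EventTimestamps {
    private EventTimestamps() {}

    /**
     * Caps an event time at the current time, so a client with a bad clock
     * cannot write cells dated in the future (which would outlive the TTL).
     */
    static long clampToNow(long eventMillis, long nowMillis) {
        return Math.min(eventMillis, nowMillis);
    }

    /**
     * Time left before a cell ages out, assuming expiry happens once the
     * cell timestamp is older than the TTL. Returns zero if already expired.
     */
    static Duration remainingRetention(long cellTimestampMillis, long nowMillis, Duration ttl) {
        long expiresAtMillis = cellTimestampMillis + ttl.toMillis();
        return Duration.ofMillis(Math.max(0L, expiresAtMillis - nowMillis));
    }
}
```

With a 180-day TTL, an event dated 2011-11-03 and written on 2012-01-03 would have 119 days of retention left, while an event stamped later than the write time is capped at the write time and gets the full 180 days.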