Hi: I opened a jira ticket on this: https://issues.apache.org/jira/browse/HBASE-4999
I have linked to HBASE-4605 in the description to show related work on Constraints by Jesse. Thanks! --Suraj On Sun, Dec 4, 2011 at 1:10 PM, Ted Yu <yuzhih...@gmail.com> wrote: > Currently ConstraintProcessor latches onto prePut() to perform validation > check. > > From HRegion.doMiniBatchPut() where prePut() is called: > /* Run coprocessor pre hook outside of locks to avoid deadlock */ > So to make use of Constraint in Suraj's scenario, we have some decisions to > make about various factors. > > Cheers > > On Sun, Dec 4, 2011 at 8:39 AM, Suraj Varma <svarma...@gmail.com> wrote: > >> Jesse: >> >> Quick soln - write a CP to check the single row (blocking the put). >> >> Yeah - given that I want this to be atomically done, I'm wondering if >> this would even work (because, I believe I'd need to unlock the row so >> that the checkAndMutate can take the lock - so, there is a brief >> window between where there is no lock being held - and some other >> thread could take that lock). One option would be to pass in a lock to >> checkAndMutate ... but that would increase the locking period and may >> have performance implications, I think. >> >> I see a lot of potential in the Constraints implementation - it would >> really open up CAS operations to do functional constraint checking, >> rather than just value comparisons. >> >> --Suraj >> >> On Sun, Dec 4, 2011 at 8:32 AM, Suraj Varma <svarma...@gmail.com> wrote: >> > Thanks - I see that the lock is taken internal to checkAndMutate. >> > >> > I'm wondering whether it is a better idea to actually pass in a >> > Constraint (or even Constraints) as the checkAndMutate argument. Right >> > now it is taking in an Comparator and a CompareOp for verification. >> > But, this could just be a special case of Constraint which is >> > evaluated within the lock. >> > >> > In other words, we could open up a richer Constraint checking api >> > where any "functional" Constraint check can be performed in the >> > checkAndPut operation. >> > >> > This would also not have the same performance impact of taking a >> > rowLock in preCheckAndPut and release in postCheckAndPut. And - it is >> > really (in my mind) implementing the compare-and-set more generically. >> > >> > I also see the potential of passing in multiple constraints (say >> > upper/lower bounds in Increment/Decrement operations) etc. >> > >> > --Suraj >> > >> > >> > On Sat, Dec 3, 2011 at 7:44 PM, Ted Yu <yuzhih...@gmail.com> wrote: >> >> From HRegionServer.checkAndPut(): >> >> if (region.getCoprocessorHost() != null) { >> >> Boolean result = region.getCoprocessorHost() >> >> .preCheckAndPut(row, family, qualifier, CompareOp.EQUAL, >> comparator, >> >> put); >> >> ... >> >> boolean result = checkAndMutate(regionName, row, family, qualifier, >> >> CompareOp.EQUAL, new BinaryComparator(value), put, >> >> lock); >> >> We can see that the lock isn't taken for preCheckAndPut(). >> >> >> >> To satisfy Suraj's requirement, I think a slight change to >> checkAndPut() is >> >> needed so that atomicity can be achieved across preCheckAndPut() and >> >> checkAndMutate(). >> >> >> >> Cheers >> >> >> >> On Sat, Dec 3, 2011 at 4:54 PM, Suraj Varma <svarma...@gmail.com> >> wrote: >> >> >> >>> Just so my question is clear ... everything I'm suggesting is in the >> >>> context of a single row (not cross row / table). - so, yes, I'm >> >>> guessing obtaining a RowLock on the region side during preCheckAndPut >> >>> / postCheckAndPut would certainly work. Which was why I was asking >> >>> whether the pre/postCheckAndPut obtains the row lock or whether the >> >>> row lock is only obtained within checkAndPut. >> >>> >> >>> Let's say the coprocessor takes a rowlock in preCheckAndPut ... will >> >>> that even work? i.e. can the same rowlock be inherited by the >> >>> checkAndPut api within that thread's context? Or will preCheckAndPut >> >>> have to release the lock so that checkAndPut can take it (which won't >> >>> work for my case, as it has to be atomic between the preCheck and >> >>> Put.) >> >>> >> >>> Thanks for pointing me to the Constraints functionality - I'll take a >> >>> look at whether it could potentially work. >> >>> --Suraj >> >>> >> >>> On Sat, Dec 3, 2011 at 10:25 AM, Jesse Yates <jesse.k.ya...@gmail.com> >> >>> wrote: >> >>> > I think the feature you are looking for is a Constraint. Currently >> they >> >>> are >> >>> > being added to 0.94 in >> >>> > HBASE-4605<https://issues.apache.org/jira/browse/HBASE-4605>; >> >>> > they are almost ready to be rolled in, and backporting to 0.92 is >> >>> > definitely doable. >> >>> > >> >>> > However, Constraints aren't going to be quite flexible enough to >> >>> > efficiently support what you are describing. For instance, with a >> >>> > constraint, you are ideally just checking the put value against some >> >>> simple >> >>> > constraint (never over 10 or always an integer), but looking at the >> >>> current >> >>> > state of the table before allowing the put would currently require >> >>> creating >> >>> > a full blown connection to the local table through another HTable. >> >>> > >> >>> > In the short term, you could write a simple coprocessor to do this >> >>> checking >> >>> > and then move over to constraints (which are a simpler, more >> flexible, >> >>> way >> >>> > of doing this) when the necessary features have been added. >> >>> > >> >>> > It is worth discussing if it makes sense to have access to the local >> >>> region >> >>> > through a constraint, though that breaks the idea a little bit, it >> would >> >>> > certainly be useful and not overly wasteful in terms of runtime. >> >>> > >> >>> > Supposing the feature would be added to talk to the local table, and >> >>> since >> >>> > the puts are going to be serialized on the regionserver (at least to >> that >> >>> > single row you are trying to update), you will never get a situation >> >>> where >> >>> > the value added is over the threshold. If you were really worried >> about >> >>> the >> >>> > atomicity of the operation, then when doing the put, first get the >> >>> RowLock, >> >>> > then do the put and release the RowLock. However, that latter method >> is >> >>> > going to be really slow, so should only be used as a stop gap if the >> >>> > constraint doesn't work as expected, until a patch is made for >> >>> constraints. >> >>> > >> >>> > Feel free to open up a ticket and link it to 4605 for adding the >> local >> >>> > table access functionality, and we can discuss the de/merits of >> adding >> >>> the >> >>> > access. >> >>> > >> >>> > -Jesse >> >>> > >> >>> > On Sat, Dec 3, 2011 at 6:24 AM, Suraj Varma <svarma...@gmail.com> >> wrote: >> >>> > >> >>> >> I'm looking at the preCheckAndPut / postCheckAndPut api with >> >>> >> coprocessors and I'm wondering ... are these pre/post checks done >> >>> >> _after_ taking the row lock or is the row lock only done within the >> >>> >> checkAndPut api. >> >>> >> >> >>> >> I'm interested in seeing if we can implement something like: >> >>> >> (in pseudo sql) >> >>> >> update table-name >> >>> >> set column-name = new-value >> >>> >> where (column-value - new-value) > threshold-value >> >>> >> >> >>> >> Basically ... I want to enhance the checkAndPut to not just compare >> >>> >> "values" ... but apply an arbitrary function on the value >> _atomically_ >> >>> >> in the Put call. Multiple threads would be firing these mutations >> and >> >>> >> I'd like the threshold-value above to never be breached under any >> >>> >> circumstance. >> >>> >> >> >>> >> Is there a solution that can be implemented either via checkAndPut >> or >> >>> >> using coprocessors preCheckAndPut? If not, would this be a useful >> >>> >> feature to build in HBase? >> >>> >> >> >>> >> Thanks, >> >>> >> --Suraj >> >>> >> >> >>> > >> >>> > >> >>> > >> >>> > -- >> >>> > ------------------- >> >>> > Jesse Yates >> >>> > 240-888-2200 >> >>> > @jesse_yates >> >>> >>