What motivated me to ask is work for alpha-4 (I think) that rips out coprocessor access to metrics objects. I'm saying if we support bypass, then metrics are a public facing API, and this isn't a good change to make. However it follows to consider if how bypass works now is a good thing to support. I claim it is not. I think we've arrived at a consensus that "some" is still good, especially for RPC. And so we have to get changes in before 2.0 goes GA. I don't think it is a blocker until then.
On Tue, Oct 17, 2017 at 3:29 PM, Josh Elser <els...@apache.org> wrote: > That's fine, and I'm quite familiar with how that works. > > I was just trying to catch up with the express purpose of driving forward > alpha-4. As a bystander, this is an unassigned blocker. I wanted to make > sure that we both had consensus on what people think we should do as well > as someone signing up to do that work. > > If we don't have both, it's my opinion that we shouldn't keep it in > alpha-4. > > > On 10/17/17 6:15 PM, Andrew Purtell wrote: > >> Sometimes we don't arrive at a point where discussions happen and >> decisions >> are made until code is about to ship. A general thing with open source, I >> think. It is less than ideal but important to strike when that iron is >> (eventually) hot. >> >> >> On Tue, Oct 17, 2017 at 3:12 PM, Josh Elser <els...@apache.org> wrote: >> >> (catching up here) >>> >>> I'm glad to see you fine folks came to a conclusion around a >>> reduced-scope >>> solution (correct me if I'm wrong). "Some" bypass mechanism would stay >>> for >>> preXXX methods, and we'd remove it for the other methods? What exactly >>> the >>> "bypass API" would be is up in the air, correct? >>> >>> Duo -- maybe you could put the "current plan" on HBASE-18770 since >>> discussion appears to have died down? >>> >>> I was originally lamenting yet another big, sweeping change to CPs when I >>> had expected alpha-4 to have already landed. But, let me play devil's >>> advocate: is this something we still think is critical to do in alpha-4? >>> I >>> can respect wanting to get address all of these smells, but I'd be worry >>> it >>> delays us further. >>> >>> >>> On 10/11/17 9:53 PM, 张铎(Duo Zhang) wrote: >>> >>> Creating an exception is expensive so if it is not suggested to do it in >>>> a >>>> normal case. A common trick is to create a global exception instance, >>>> and >>>> always throw it to avoid creating every time but I think it is more >>>> friendly to just use a return value? >>>> >>>> And for me, the bypass after preXXX for normal region operations just >>>> equals to a 'cancel', which is very clear and easy to understand, so I >>>> think it is OK to add bypass support for them. And also for compaction >>>> and >>>> flush, it is OK to give CP users the ability to cancel the operation as >>>> the >>>> semantic is clear, although I'm not sure how CP users would use this >>>> feature. >>>> >>>> In general, I think we can provide bypass/cancel support in preXXX >>>> methods >>>> where it is the very beginning of an operation. >>>> >>>> Thanks. >>>> >>>> 2017-10-12 3:10 GMT+08:00 Andrew Purtell <apurt...@apache.org>: >>>> >>>> On Phoenix Increment by-pass, an ornery item is that Phoenix wants to >>>> use >>>> >>>>> >>>>>> its long encoding writing Increments. Not sure how we'd do that, >>>>> selectively. >>>>> >>>>> If we can handle the rest of the trouble that you observed: >>>>> >>>>> 1) Lack of recognition and identification of when the key value to >>>>> increment doesn't exist >>>>> 2) Lack of the ability to set the timestamp of the updated key value. >>>>> >>>>> then they might be able to make it work. Perhaps a conversion from >>>>> HBase >>>>> native to Phoenix LONG encoding when processing results, in the >>>>> wrapping >>>>> scanner, informed by schema metadata. >>>>> >>>>> Or if we are keeping the bypass semantic in select places but >>>>> implementing >>>>> it with something other than today's bypass() API (please) this would >>>>> be >>>>> another candidate for where to keep it. Duo suggests keeping the >>>>> semantic >>>>> in all of the basic RPC preXXX hooks for query and mutation. We could >>>>> redo >>>>> those APIs to skip normal processing based on a return value or >>>>> exception >>>>> but otherwise drop bypass from all the others. It will clean up areas >>>>> of >>>>> confusion, e.g. can I bypass splits or flushes or not? Or what about >>>>> this >>>>> arcane hook in compaction? Or [insert some deep hook here]? The answer >>>>> would be: only RPC hooks will early out, and only if you return this >>>>> value, >>>>> or throw that exception. >>>>> >>>>> >>>>> On Wed, Oct 11, 2017 at 11:56 AM, Stack <st...@duboce.net> wrote: >>>>> >>>>> The YARN Timeline Server has the FlowRunCoprocessor. It does bypass >>>>> when >>>>> >>>>>> user does a Get returning instead the result of its own (Flow) Scan >>>>>> >>>>>> result. >>>>> >>>>> Not sure how we'd do alternative here; Timeline Server is keeping Tags >>>>>> internally. >>>>>> >>>>>> >>>>>> On Wed, Oct 11, 2017 at 10:59 AM, Andrew Purtell <apurt...@apache.org >>>>>> > >>>>>> wrote: >>>>>> >>>>>> Rather than continue to support a weird bypass() which works in some >>>>>> >>>>>>> >>>>>>> places >>>>>> >>>>>> and not in others, perhaps we can substitute it with an exception? So >>>>>>> >>>>>>> if >>>>>> >>>>> >>>>> the coprocessor throws this exception in the pre hook then where it is >>>>>> >>>>>>> allowed we catch it and do the right thing, and where it is not >>>>>>> allowed >>>>>>> >>>>>>> we >>>>>> >>>>>> don't catch it and the server aborts. This will at least improve the >>>>>>> >>>>>>> silent >>>>>> >>>>>> bypass() failure problem. I also don't like, in retrospect, that >>>>>>> >>>>>>> calling >>>>>> >>>>> >>>>> this environment method has magic side effects. Everyone understands >>>>>> >>>>>>> >>>>>>> how >>>>>> >>>>> >>>>> exceptions work, so it will be clearer. >>>>>> >>>>>>> >>>>>>> >>>>>>> We could do that though throw and catch of exceptions would be >>>>>>> costly. >>>>>>> >>>>>> >>>>>> What about the Duo suggestion? Purge bypass flag and replace it w/ >>>>>> preXXX >>>>>> in a few select methods returning a boolean on whether bypass? Would >>>>>> that >>>>>> work? (Would have to figure metrics still). >>>>>> >>>>>> >>>>>> >>>>>> In any case we should try to address the Tephra and Phoenix cases >>>>>> >>>>>>> >>>>>>> brought >>>>>> >>>>> >>>>> up in this discussion. They look like we can find alternatives. Shall I >>>>>> >>>>>>> file JIRAs to follow up? >>>>>>> >>>>>>> >>>>>>> >>>>>>> On Phoenix Increment by-pass, an ornery item is that Phoenix wants to >>>>>>> >>>>>> use >>>>>> its long encoding writing Increments. Not sure how we'd do that, >>>>>> selectively. >>>>>> >>>>>> St.Ack >>>>>> >>>>>> >>>>>> >>>>>> On Wed, Oct 11, 2017 at 6:00 AM, 张铎(Duo Zhang) <palomino...@gmail.com >>>>>> > >>>>>> >>>>>>> wrote: >>>>>>> >>>>>>> These examples are great. >>>>>>> >>>>>>>> >>>>>>>> And I think for normal region operations such as get, put, delete, >>>>>>>> checkAndXXX, increment, it is OK to bypass the real operation after >>>>>>>> >>>>>>>> preXXX >>>>>>> >>>>>>> as the semantic is clear enough. Instead of calling env.bypass, maybe >>>>>>>> >>>>>>>> just >>>>>>> >>>>>>> let these preXXX methods return a boolean is enough to tell the HBase >>>>>>>> framework that we have already done the real operation so just give >>>>>>>> >>>>>>>> up >>>>>>> >>>>>> >>>>> and >>>>>> >>>>>>> >>>>>>> return? >>>>>>>> >>>>>>>> Thanks. >>>>>>>> >>>>>>>> 2017-10-11 3:19 GMT+08:00 Gary Helmling <ghelml...@gmail.com>: >>>>>>>> >>>>>>>> The Tephra TransactionProcessor CP makes use of bypass() in >>>>>>>> >>>>>>>>> >>>>>>>>> preDelete() >>>>>>>> >>>>>>> >>>>>> to >>>>>>> >>>>>>>> >>>>>>>> override handling of delete tombstones in a transactional way: >>>>>>>>> https://github.com/apache/incubator-tephra/blob/master/ >>>>>>>>> tephra-hbase-compat-1.3/src/main/java/org/apache/tephra/ >>>>>>>>> >>>>>>>>> hbase/coprocessor/ >>>>>>>> >>>>>>>> TransactionProcessor.java#L244 >>>>>>>>> >>>>>>>>> The CDAP IncrementHandler CP also makes use of bypass() in >>>>>>>>> >>>>>>>>> preGetOp() >>>>>>>> >>>>>>> >>>>> and >>>>>> >>>>>>> >>>>>>> preIncrementAfterRRowLock() to provide a transaction implementation >>>>>>>> >>>>>>>>> >>>>>>>>> of >>>>>>>> >>>>>>> >>>>>> readless increments: >>>>>>> >>>>>>>> https://github.com/caskdata/cdap/blob/develop/cdap-hbase- >>>>>>>>> compat-1.1/src/main/java/co/cask/cdap/data2/increment/ >>>>>>>>> hbase11/IncrementHandler.java#L121 >>>>>>>>> >>>>>>>>> What would be the alternate approach for these applications? In >>>>>>>>> >>>>>>>>> both >>>>>>>> >>>>>>> >>>>> cases >>>>>> >>>>>>> >>>>>>>> they need to impose their own semantics on the underlying KeyValue >>>>>>>>> storage. Is there a different way this can be done? >>>>>>>>> >>>>>>>>> >>>>>>>>> On Tue, Oct 10, 2017 at 11:58 AM Anoop John <anoop.hb...@gmail.com >>>>>>>>> >>>>>>>>> >>>>>>>> wrote: >>>>>> >>>>>>> >>>>>>>> >>>>>>>>> Wrap core scanners is different right? That can be done in post >>>>>>>>> >>>>>>>>>> hooks. I have seen many use cases for this.. Its the question >>>>>>>>>> >>>>>>>>>> abt >>>>>>>>> >>>>>>>> >>>>> the pre hooks where we have not yet created the core object (like >>>>>> >>>>>>> scanner). The CP pre code itself doing the work of object >>>>>>>>>> >>>>>>>>>> creation >>>>>>>>> >>>>>>>> >>>>> and so the core code is been bypassed. Well the wrapping thing >>>>>> >>>>>>> >>>>>>>>>> can >>>>>>>>> >>>>>>>> >>>>>> be done in pre hook also. First create the core object by CP code >>>>>>> >>>>>>>> itself and then do the wrapped object and return.. I have seen in >>>>>>>>>> >>>>>>>>>> one >>>>>>>>> >>>>>>>> >>>>>> jira issue where the usage was this way.. The wrapping can be >>>>>>> >>>>>>>> >>>>>>>>>> done >>>>>>>>> >>>>>>>> >>>>>> in post also in such cases I believe. >>>>>>> >>>>>>>> >>>>>>>>>> -Anoop- >>>>>>>>>> >>>>>>>>>> On Wed, Oct 11, 2017 at 12:23 AM, Andrew Purtell < >>>>>>>>>> >>>>>>>>>> apurt...@apache.org> >>>>>>>>> >>>>>>>> >>>>>>> wrote: >>>>>>>> >>>>>>>>> >>>>>>>>>> I think we should continue to support overriding function by >>>>>>>>>>> >>>>>>>>>>> object >>>>>>>>>> >>>>>>>>> >>>>>> inheritance. I didn't mention this and am not proposing more >>>>>>> >>>>>>>> >>>>>>>>>>> than >>>>>>>>>> >>>>>>>>> >>>>> removing >>>>>> >>>>>>> >>>>>>>>>> the bypass() sematic. No more no less. Phoenix absolutely >>>>>>>>>>> >>>>>>>>>>> depends >>>>>>>>>> >>>>>>>>> >>>>> on >>>>>> >>>>>>> >>>>>>> being >>>>>>>> >>>>>>>>> >>>>>>>>>> able to wrap core scanners and return the wrappers. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Tue, Oct 10, 2017 at 11:50 AM, Anoop John < >>>>>>>>>>> >>>>>>>>>>> anoop.hb...@gmail.com> >>>>>>>>>> >>>>>>>>> >>>>>>> wrote: >>>>>>>> >>>>>>>>> >>>>>>>>>> >>>>>>>>>>> When we say bypass the core code, it can be done today not >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> only >>>>>>>>>>> >>>>>>>>>> >>>>> by >>>>>> >>>>>> calling bypass but by returning a not null object for some of >>>>>>> >>>>>>>> >>>>>>>>>>>> the >>>>>>>>>>> >>>>>>>>>> >>>>>> pre >>>>>>> >>>>>>>> >>>>>>>> hooks. Like preScannerOpen() if it return a scanner object, >>>>>>>>> >>>>>>>>>> >>>>>>>>>>>> we >>>>>>>>>>> >>>>>>>>>> >>>>> will >>>>>> >>>>>>> >>>>>>> avoid the remaining core code execution for creation of the >>>>>>>> >>>>>>>>> scanner(s). So this proposal include this aspect also and >>>>>>>>>>>> >>>>>>>>>>>> remove >>>>>>>>>>> >>>>>>>>>> >>>>>> any >>>>>>> >>>>>>>> >>>>>>>> possible way of bypassing the core code by the CP hook code >>>>>>>>> >>>>>>>>>> >>>>>>>>>>>> execution >>>>>>>>>>> >>>>>>>>>> >>>>>>>> ? Am +1. >>>>>>>>> >>>>>>>>>> >>>>>>>>>>>> -Anoop- >>>>>>>>>>>> >>>>>>>>>>>> On Tue, Oct 10, 2017 at 11:40 PM, Andrew Purtell < >>>>>>>>>>>> >>>>>>>>>>>> apurt...@apache.org >>>>>>>>>>> >>>>>>>>>> >>>>>>>> >>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> The coprocessor API provides an environment method, >>>>>>>>>>>>> >>>>>>>>>>>>> bypass(), >>>>>>>>>>>> >>>>>>>>>>> >>>>> that >>>>>> >>>>>>> >>>>>>> when >>>>>>>> >>>>>>>>> >>>>>>>>>> called from a preXXX hook will cause the core code to skip >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> all >>>>>>>>>>>> >>>>>>>>>>> >>>>> remaining >>>>>> >>>>>>> >>>>>>>>>> processing. This capability was introduced on HBASE-3348. >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> Since >>>>>>>>>>>> >>>>>>>>>>> >>>>>> this >>>>>>> >>>>>>>> >>>>>>>> time I >>>>>>>>> >>>>>>>>>> >>>>>>>>>>>> think we are more enlightened about the complications of >>>>>>>>>>>>> >>>>>>>>>>>>> this >>>>>>>>>>>> >>>>>>>>>>> >>>>> feature. >>>>>> >>>>>>> >>>>>>>>> (Or, >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> anyway, speaking for myself:) >>>>>>>>>>>>> >>>>>>>>>>>>> Not all hooks provide the bypass semantic. Where this is the >>>>>>>>>>>>> >>>>>>>>>>>>> case >>>>>>>>>>>> >>>>>>>>>>> >>>>>>> the >>>>>>>> >>>>>>>>> >>>>>>>>> javadoc for the hook says so, but it can be missed. If you >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>>> call >>>>>>>>>>>> >>>>>>>>>>> >>>>>> bypass() >>>>>>> >>>>>>>> >>>>>>>>>> in >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> a hook where it is not supported it is a no-op. This can >>>>>>>>>>>>> >>>>>>>>>>>>> lead >>>>>>>>>>>> >>>>>>>>>>> >>>>> to a >>>>>> >>>>>>> >>>>>>> poor >>>>>>>> >>>>>>>>> >>>>>>>>>> developer experience. >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> Where bypass is supported what is being bypassed is all of >>>>>>>>>>>>> >>>>>>>>>>>>> the >>>>>>>>>>>> >>>>>>>>>>> >>>>> core >>>>>> >>>>>>> >>>>>>>> code >>>>>>>>> >>>>>>>>>> >>>>>>>>>> implementing the remainder of the operation. In order to >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> understand >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>> what >>>>>>>>> >>>>>>>>>> >>>>>>>>>> calling bypass() will skip, a coprocessor implementer should >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> read >>>>>>>>>>>> >>>>>>>>>>> >>>>>>> and >>>>>>>> >>>>>>>>> >>>>>>>>> understand all of the remaining code and its nuances. >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>>> Although I >>>>>>>>>>>> >>>>>>>>>>> >>>>>> think >>>>>>> >>>>>>>> >>>>>>>>> this >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> is good practice for coprocessor developers in general, it >>>>>>>>>>>>> >>>>>>>>>>>>> demands a >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>> lot. I >>>>>>>>> >>>>>>>>>> >>>>>>>>>>>> think it would provide a much better developer experience if >>>>>>>>>>>>> >>>>>>>>>>>>> we >>>>>>>>>>>> >>>>>>>>>>> >>>>>> didn't >>>>>>> >>>>>>>> >>>>>>>>> allow bypass, even though it means - in theory - a >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>>> coprocessor >>>>>>>>>>>> >>>>>>>>>>> >>>>> would >>>>>> >>>>>>> >>>>>>>> be a >>>>>>>>> >>>>>>>>>> >>>>>>>>>> lot more limited in some ways than before. What is skipped >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> is >>>>>>>>>>>> >>>>>>>>>>> >>>>> extremely >>>>>> >>>>>>> >>>>>>>>>> version dependent. That core code will vary, perhaps >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> significantly, >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>> even >>>>>>>>> >>>>>>>>>> >>>>>>>>>> between point releases. We do not provide the promise of >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> consistent >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>> behavior even between point releases for the bypass >>>>>>>>> >>>>>>>>>> >>>>>>>>>>>>> semantic. >>>>>>>>>>>> >>>>>>>>>>> >>>>> To >>>>>> >>>>>> achieve >>>>>>> >>>>>>>> >>>>>>>>>> that we could not change any code between hook points. >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> Therefore >>>>>>>>>>>> >>>>>>>>>>> >>>>>> the >>>>>>> >>>>>>>> >>>>>>>> coprocessor implementer becomes an HBase core developer in >>>>>>>>> >>>>>>>>>> >>>>>>>>>>>>> practice >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>> as >>>>>>>>> >>>>>>>>> soon >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> as they rely on bypass(). Every release of HBase may break >>>>>>>>>>>>> >>>>>>>>>>>>> the >>>>>>>>>>>> >>>>>>>>>>> >>>>> assumption >>>>>> >>>>>>> >>>>>>>>>> that the replacement for the bypassed code takes care of all >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> necessary >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>> skipped concerns. Because those concerns can change at any >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>>> point, >>>>>>>>>>>> >>>>>>>>>>> >>>>>>> such an >>>>>>>> >>>>>>>>> >>>>>>>>>> assumption is never safe. >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> I say "in theory" because I would be surprised if anyone is >>>>>>>>>>>>> >>>>>>>>>>>>> relying >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>> on >>>>>>>>> >>>>>>>>> the >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> bypass for the above reason. I seem to recall that Phoenix >>>>>>>>>>>>> >>>>>>>>>>>>> might >>>>>>>>>>>> >>>>>>>>>>> >>>>>> use >>>>>>> >>>>>>>> >>>>>>>> it >>>>>>>>> >>>>>>>>>> >>>>>>>>>> in >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> one place to promote a normal mutation into an atomic >>>>>>>>>>>>> >>>>>>>>>>>>> operation, >>>>>>>>>>>> >>>>>>>>>>> >>>>>> by >>>>>>> >>>>>>>> >>>>>>>> substituting one for the other, but if so that objective >>>>>>>>> >>>>>>>>>> >>>>>>>>>>>>> could >>>>>>>>>>>> >>>>>>>>>>> >>>>> be >>>>>> >>>>>>> >>>>>>> reimplemented using their new locking manager. >>>>>>>> >>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>> >>>>> >>>> >> >> -- Best regards, Andrew Words like orphans lost among the crosstalk, meaning torn from truth's decrepit hands - A23, Crosstalk