On Mon, Oct 3, 2016 at 4:28 PM, Enis Söztutar <[email protected]> wrote:
> On Mon, Oct 3, 2016 at 2:55 PM, Andrew Purtell <[email protected]> > wrote: > > > I would not expect Hadoop to shade PB in HDFS in any near time frame. I > > don't think we have a good chance of impressing upon them a sense of > > urgency based on their spotty track of listening to downstreamer > concerns. > > > > Agreed. I am not sure whether we can realistically expect to have shaded > client + server(because of tests) for 3.0 + 2.8 or so time frame, hence my > question. If we can though, we can very easily declare that HBase-2.0 will > only work with Hadoop-2.8+ and get away without server-side shading by > depending on the shaded hdfs artifacts. > > Thanks Enis for asking some fundamentals. If hadoop shaded its artifacts so no leakage whether an HDFS or YARN context (or Spark -- Spark2 is pb2.5.0) and downstreamers made use of our shaded artifacts everywhere then we could avoid this effort. Some form of isolation should show up in hadoop3 before it ships. My guess is that it unlikely we'd get the backport in an hadoop2 even though hadoop has tradtionally been 'relaxed' when it comes to compat across versions. An hbase2 that required hadoop3 or even a 2.9 or 2.10 would be constraining. The above is a big if and a dependency on another project's delivery. Meanwhile the offheap project need a more modern protobuf yesterday so they can make progress. I suggest we commit this patch that lets us own when and on what we can release. If the lay of the land changes substantially before we ship hbase2, I promise to undo this patch. What do you think Enis, St.Ack > > > Even if we could and they deliver, what about all of the already deployed > > 2.x? Or 3.x? Would a change like this get into 3? Or would it be pushed > to > > 4? So we wait until Hadoop 3 or maybe 4 is released *and* GA *and* out > > there so long that telling people to get off 2 is is finally ok before > > moving up to PB3 or getting the off heaping goodness? No, let's get this > > work in. > > > > > > > > > > On Mon, Oct 3, 2016 at 2:46 PM, Enis Söztutar <[email protected]> wrote: > > > > > Is PB only coming via HDFS client or there are other usages that > pollute > > > our classpath? We are doing all this gymnastics just because upstream > > does > > > not do shading is frustrating. Any idea for how likely we can push > shaded > > > hdfs to Hadoop? Should we focus on that instead rather than working > > around > > > the problem? > > > > > > Enis > > > > > > On Mon, Oct 3, 2016 at 11:09 AM, Stack <[email protected]> wrote: > > > > > > > On Mon, Oct 3, 2016 at 9:41 AM, Ted Yu <[email protected]> wrote: > > > > > > > > > The precommit job uses compile-protobuf profile for verification. > The > > > > > absence of compile-protobuf profile in hbase-protocol-shaded module > > > means > > > > > precommit job would only invoke the existing compile-protobuf > profile > > > > > in hbase-protocol > > > > > module. > > > > > > > > > w.r.t. path, when two protoc executables (corresponding to 2.5 and > 3.x, > > > > > respectively) are available, would maven know which one to pick for > > > > > the hbase-protocol and hbase-protocol-shaded modules ? > > > > > > > > > > > > > > > > > > For the former, right, nothing changed. > > > > > > > > On the latter, it is a contrivance, a problem we do not have. The > build > > > > doesn't have to pick between pb 2.5 vs pb 3.1. We don't run protoc as > > > part > > > > of our build. It is done apart from the build by the developer and > > their > > > > results are then checked-in. That continues as it was post-patch. We > > just > > > > do 3.1 now instead of 2.5. > > > > > > > > I'll check in a bit of doc as part of this commit that hopefully will > > > help > > > > make this all clearer. > > > > > > > > St.Ack > > > > > > > > > > > > > > > > > > > > > > > > > Cheers > > > > > > > > > > On Mon, Oct 3, 2016 at 9:32 AM, Stack <[email protected]> wrote: > > > > > > > > > > > On Mon, Oct 3, 2016 at 7:16 AM, Ted Yu <[email protected]> > > wrote: > > > > > > > > > > > > > Looks like compile-protobuf profile is not in > > > > > > hbase-protocol-shaded/pom.xml > > > > > > > (in HBASE-16264 branch) > > > > > > > > > > > > > > > > > > > > Sorry. I don't get what you are saying here > > > > > > > > > > > > (The target in the new module is generate-sources. See the > included > > > > > README. > > > > > > This step does more work now more than just generating protocs, > > hence > > > > new > > > > > > profile name.) > > > > > > > > > > > > > > > > > > > > > > > > > Seems precommit jobs should pass with the current formation. > > > > > > > > > > > > > > > > > > > > Are you stating that this patch is likely to build? (Yes, the > > patch I > > > > > > submitted builds). > > > > > > > > > > > > > > > > > > > > > > > > > In the future, if we add another profile for compiling proto3 > > > files, > > > > we > > > > > > > need to specify the path to proto3 compiler. > > > > > > > > > > > > > > > > > > > > > > > > > > > Please correct me if I am wrong. > > > > > > > > > > > > > > > > > > > > I don't know what you are asking. Why do we have to specify > > 'paths'? > > > We > > > > > > don't have to currently (See the plugin we use generating protos > > now > > > > from > > > > > > hadoop). Maybe you are trying to distinguish the production of > > > > > protobuf2.5 > > > > > > vs 3.1 protos but these are isolated by module.... > > > > > > > > > > > > > > > > > > I said I'd commit this morning but let me wait a while. There may > > be > > > > some > > > > > > more questions/objections and I'd like to have a clean build up > on > > > > > jenkins > > > > > > here [1] before I commit (jenkins is being ornery). > > > > > > > > > > > > St.Ack > > > > > > 1. > > > > > > https://builds.apache.org/job/HBASE-16264/jdk=JDK%201.8%20( > > > > > > latest),label=yahoo-not-h2/28/console > > > > > > > > > > > > Service Unavailable > > > > > > > > > > > > The server is temporarily unable to service your > > > > > > request due to maintenance downtime or capacity > > > > > > problems. Please try again later. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Oct 3, 2016 at 6:58 AM, Ted Yu <[email protected]> > > > wrote: > > > > > > > > > > > > > > > The protoc generated files (such as MasterProtos) are checked > > > into > > > > > > source > > > > > > > > repo, right ? > > > > > > > > > > > > > > > > Do we need proto3 on the precommit image(s) ? > > > > > > > > > > > > > > > > On Mon, Oct 3, 2016 at 5:18 AM, 张铎 <[email protected]> > > > wrote: > > > > > > > > > > > > > > > >> Then I think we need to file an issue to change the protoc > > > version > > > > > > > >> installed in the precommit docker file to 3.x before the > > merge. > > > > > > > Otherwise > > > > > > > >> the precommit build for protoc check maybe broken after the > > > > merge... > > > > > > > >> > > > > > > > >> > > > > > > > >> 2016-10-03 1:18 GMT+08:00 Andrew Purtell < > > > > [email protected] > > > > > >: > > > > > > > >> > > > > > > > >> > I have 2.5 and 3.0 installed as /opt/protobuf-<version>, > and > > > > have > > > > > > bash > > > > > > > >> > scripts that add the appropriate version's bin directory > to > > > the > > > > > > path. > > > > > > > >> Not > > > > > > > >> > particularly onerous as I also have to switch between > > required > > > > JDK > > > > > > > >> > versions, so the scripts also set JAVA_HOME at and add JDK > > bin > > > > to > > > > > > the > > > > > > > >> path > > > > > > > >> > for the required JDK for the build. > > > > > > > >> > > > > > > > > >> > Unlike with the scala compiler, which is after all JVM > > > bytecode > > > > > at a > > > > > > > >> > fundamental level, I don't think maven automation for > > > automatic > > > > > > > download > > > > > > > >> > and execution is possible. protoc is a native binary. > > > > > > > >> > > > > > > > > >> > > On Oct 1, 2016, at 11:30 PM, 张铎 <[email protected]> > > > > wrote: > > > > > > > >> > > > > > > > > > >> > > Do we need to install protoc 3.0 manully before building > > > HBase > > > > > or > > > > > > > the > > > > > > > >> > maven > > > > > > > >> > > protobuf plugin will automatically download the protoc > > > > compiler? > > > > > > > >> Maybe we > > > > > > > >> > > need to install protoc 3.0 in the precommit docker file. > > > > > > > >> > > > > > > > > > >> > > 2016-10-02 14:20 GMT+08:00 张铎 <[email protected]>: > > > > > > > >> > > > > > > > > > >> > >> > > > > > > > >> > >> 2016-10-02 0:50 GMT+08:00 Stack <[email protected]>: > > > > > > > >> > >> > > > > > > > >> > >>>> On Sat, Oct 1, 2016 at 7:20 AM, 张铎 < > > > [email protected]> > > > > > > > wrote: > > > > > > > >> > >>>> > > > > > > > >> > >>>> Can we setup a compatibility checker job in jenkins? > > > Start > > > > a > > > > > > > >> > >>> minicluster in > > > > > > > >> > >>>> one process, and use a client in another process to > > > > > communicate > > > > > > > >> with > > > > > > > >> > it. > > > > > > > >> > >>>> The version of the client should be >= 0.98 and <= > the > > > > > version > > > > > > of > > > > > > > >> the > > > > > > > >> > >>>> minicluster. Of course we need to design the testing > > code > > > > > > > >> carefully to > > > > > > > >> > >>> make > > > > > > > >> > >>>> sure that we have tested all the cases. > > > > > > > >> > >>>> > > > > > > > >> > >>>> > > > > > > > >> > >>> +1. We need this up and running before we put out an > > > > > > hbase-2.0.0. > > > > > > > I > > > > > > > >> > know > > > > > > > >> > >>> Matteo does this test manually on a regular basis but > a > > > > > > > >> formalization > > > > > > > >> > >>> would > > > > > > > >> > >>> help. I can add an exercise of Coprocessor Endpoints. > I > > > > > believe > > > > > > > this > > > > > > > >> > is on > > > > > > > >> > >>> Dima's list of TODOs but will let him speak for > himself. > > > > > > > >> > >>> > > > > > > > >> > >>> > > > > > > > >> > >>>> And also I think we should make sure that no proto3 > > only > > > > > > feature > > > > > > > is > > > > > > > >> > >>>> introduced in our proto files until branch-1 is dead. > > > > Maybe a > > > > > > > >> > precommit > > > > > > > >> > >>>> check? > > > > > > > >> > >>>> > > > > > > > >> > >>>> > > > > > > > >> > >>> I think you mean wire-format breaking changes? Agree. > > We > > > > have > > > > > > our > > > > > > > >> PB3 > > > > > > > >> > set > > > > > > > >> > >>> to 2.5 compat mode and yes, we can't move on from this > > > until > > > > > we > > > > > > > are > > > > > > > >> in > > > > > > > >> > a > > > > > > > >> > >>> place where we can say no to 2.5 clients. > > > > > > > >> > >>> > > > > > > > >> > >> Yes, for example, pb2.5 does not support map so we > should > > > not > > > > > use > > > > > > > >> map in > > > > > > > >> > >> our proto files. > > > > > > > >> > >> > > > > > > > >> > >>> > > > > > > > >> > >>> Making use of PB3isms cannot be avoided. PB3.1 adds a > > > native > > > > > > > >> > replacement > > > > > > > >> > >>> for our HBaseZeroCopyByteString/ByteStringer hack. It > > > also > > > > > adds > > > > > > > >> > 'unsafe' > > > > > > > >> > >>> methods that we need to exploit if we are to keep our > > > > > read/write > > > > > > > >> paths > > > > > > > >> > >>> offheap. > > > > > > > >> > >>> > > > > > > > >> > >>> St.Ack > > > > > > > >> > >>> > > > > > > > >> > >>> > > > > > > > >> > >>> > > > > > > > >> > >>> > > > > > > > >> > >>> > > > > > > > >> > >>>> Thanks. > > > > > > > >> > >>>> > > > > > > > >> > >>>> 2016-10-01 11:55 GMT+08:00 Sean Busbey < > > > > [email protected] > > > > > >: > > > > > > > >> > >>>> > > > > > > > >> > >>>>> have we experimentally confirmed that wire > > compatibility > > > > is > > > > > > > >> > >>>>> maintained? I saw one mention of expecting wire > > > > > compatibility > > > > > > to > > > > > > > >> be > > > > > > > >> > >>>>> fine, but nothing with someone using e.g. the > > > clusterdock > > > > > work > > > > > > > or > > > > > > > >> > >>>>> something to mix servers / clients or do > replication. > > > > > > > >> > >>>>> > > > > > > > >> > >>>>>> On Fri, Sep 30, 2016 at 6:30 PM, Stack < > > > [email protected] > > > > > > > > > > > > wrote: > > > > > > > >> > >>>>>> I intend to do a mass commit late this weekend that > > > moves > > > > > us > > > > > > on > > > > > > > >> to a > > > > > > > >> > >>>>> shaded > > > > > > > >> > >>>>>> protobuf-3.1.0, either Sunday night or Monday > > morning. > > > > > > > >> > >>>>>> > > > > > > > >> > >>>>>> If objection, please speak up or if need more time > > for > > > > > > > >> > >>>>>> consideration/review, just shout. > > > > > > > >> > >>>>>> > > > > > > > >> > >>>>>> I want to merge the branch HBASE-16264 into master > > (it > > > is > > > > > > > running > > > > > > > >> > >>> here > > > > > > > >> > >>>> up > > > > > > > >> > >>>>>> on jenkins https://builds.apache.org/view > > > > > > > >> /H-L/view/HBase/job/HBASE- > > > > > > > >> > >>>>> 16264/). > > > > > > > >> > >>>>>> The branch at HBASE-16264 has three significant > > > > > > bodies-of-work > > > > > > > >> that > > > > > > > >> > >>>>>> unfortunately are tangled and can only go in of a > > > piece. > > > > > > > >> > >>>>>> > > > > > > > >> > >>>>>> * HBASE-16264 <https://issues.apache.org/ > > > > > > > jira/browse/HBASE-16264 > > > > > > > >> > > > > > > > > >> > >>> The > > > > > > > >> > >>>>>> shading of our protobuf usage so we can upgrade > > and/or > > > > run > > > > > > > with a > > > > > > > >> > >>>> patched > > > > > > > >> > >>>>>> protobuf WITHOUT breaking REST, Spark, and in > > > particular, > > > > > > > >> > >>> Coprocessor > > > > > > > >> > >>>>>> Endpoints. > > > > > > > >> > >>>>>> * HBASE-16567 <https://issues.apache.org/ > > > > > > > jira/browse/HBASE-16567 > > > > > > > >> > > > > > > > > >> > >>> A > > > > > > > >> > >>>>> move > > > > > > > >> > >>>>>> up on to (shaded) protobuf-3.1.0 > > > > > > > >> > >>>>>> * HBASE-16741 <https://issues.apache.org/ > > > > > > > jira/browse/HBASE-16741 > > > > > > > >> > > > > > > > > >> > >>> An > > > > > > > >> > >>>>>> amendment of our generate protobufs step to include > > > > shading > > > > > > > and a > > > > > > > >> > >>>>> bundling > > > > > > > >> > >>>>>> of protobuf src (with a means of calling a patch > srcs > > > > hook) > > > > > > > >> > >>>>>> > > > > > > > >> > >>>>>> Together we're talking about 40MB of change mostly > > made > > > > of > > > > > > the > > > > > > > >> > >>> movement > > > > > > > >> > >>>>> of > > > > > > > >> > >>>>>> generated files or the application of a pattern > that > > > > alters > > > > > > > >> where we > > > > > > > >> > >>>> get > > > > > > > >> > >>>>>> imports from. When done, you should notice no > > > difference > > > > > and > > > > > > > >> should > > > > > > > >> > >>> be > > > > > > > >> > >>>>> able > > > > > > > >> > >>>>>> to go about your business as per usual. Upside is > > that > > > we > > > > > > will > > > > > > > be > > > > > > > >> > >>> able > > > > > > > >> > >>>> to > > > > > > > >> > >>>>>> avoid coming onheap doing protobuf > > > > > marshalling/unmarshalling > > > > > > as > > > > > > > >> > >>>> protobuf > > > > > > > >> > >>>>>> 2.5.0 requires. Downside is that we repeat a good > > > portion > > > > > of > > > > > > > our > > > > > > > >> > >>>> internal > > > > > > > >> > >>>>>> protos, once non-shaded so Coprocessor Endpoints > can > > > keep > > > > > > > working > > > > > > > >> > >>> and > > > > > > > >> > >>>>> then > > > > > > > >> > >>>>>> again as shaded for internal use. > > > > > > > >> > >>>>>> > > > > > > > >> > >>>>>> I provide some more overview below on the changes. > > See > > > > the > > > > > > > >> shading > > > > > > > >> > >>> doc > > > > > > > >> > >>>>>> here: > > > > > > > >> > >>>>>> https://docs.google.com/document/d/ > > > > > 1H4NgLXQ9Y9KejwobddCqaVME > > > > > > > >> DCGby > > > > > > > >> > >>>>> DcXtdF5iAfDIEk/edit# > > > > > > > >> > >>>>>> for more detail (Patches are up on review board -- > > > except > > > > > the > > > > > > > >> latest > > > > > > > >> > >>>>>> HBASE-16264 which is too big for JIRA and RB). I am > > > > > currently > > > > > > > >> > >>> working > > > > > > > >> > >>>> on > > > > > > > >> > >>>>> a > > > > > > > >> > >>>>>> devs chapter for the book on protobuf going forward > > > that > > > > > will > > > > > > > go > > > > > > > >> in > > > > > > > >> > >>> as > > > > > > > >> > >>>>> part > > > > > > > >> > >>>>>> of this patch. > > > > > > > >> > >>>>>> > > > > > > > >> > >>>>>> Thanks, > > > > > > > >> > >>>>>> St.Ack > > > > > > > >> > >>>>>> > > > > > > > >> > >>>>>> Items of note: > > > > > > > >> > >>>>>> > > > > > > > >> > >>>>>> * Two new modules; one named hbase-protocol-shaded > > that > > > > is > > > > > > used > > > > > > > >> by > > > > > > > >> > >>>> hbase > > > > > > > >> > >>>>>> core. It has in it a shaded (and later patched) > > > protobuf. > > > > > The > > > > > > > >> other > > > > > > > >> > >>> new > > > > > > > >> > >>>>>> module is hbase-endpoint which goes after > > hbase-server > > > > and > > > > > > has > > > > > > > >> those > > > > > > > >> > >>>>>> bundled endpoints that I was able to break out of > > core > > > > > (there > > > > > > > >> are a > > > > > > > >> > >>> few > > > > > > > >> > >>>>>> that are hopelessly entangled that need to be > undone > > as > > > > > CPEPs > > > > > > > but > > > > > > > >> > >>>>>> fortunately belong in core: Auth, Access, > MultiRow). > > > > > > > >> > >>>>>> * I've tested running a branch-1 CPEP against a > > master > > > > with > > > > > > > these > > > > > > > >> > >>>>> patches > > > > > > > >> > >>>>>> in place and stuff like ACL (A CPEP) run from the > > > > branch-1 > > > > > > > shell > > > > > > > >> > >>> work > > > > > > > >> > >>>>>> against the branch-2 server. > > > > > > > >> > >>>>>> > > > > > > > >> > >>>>>> > > > > > > > >> > >>>>>> > > > > > > > >> > >>>>>> > > > > > > > >> > >>>>>> > > > > > > > >> > >>>>>>> On Mon, Aug 22, 2016 at 5:20 PM, Stack < > > > > [email protected]> > > > > > > > >> wrote: > > > > > > > >> > >>>>>>> > > > > > > > >> > >>>>>>> This project goes on. I updated HBASE-1563 "Shade > > > > > protobuf" > > > > > > > with > > > > > > > >> > >>> some > > > > > > > >> > >>>>> doc > > > > > > > >> > >>>>>>> on a final approach. We need to be able to refer > to > > > both > > > > > > > shaded > > > > > > > >> and > > > > > > > >> > >>>>>>> non-shaded protobuf so we can support sending HDFS > > > > > > old-school > > > > > > > pb > > > > > > > >> > >>>>> Messages > > > > > > > >> > >>>>>>> but also so Coprocessor Endpoints keep working > > though > > > > > > > internally > > > > > > > >> > >>>>> protobufs > > > > > > > >> > >>>>>>> have been relocated. Funny you should ask, but > yes, > > > > there > > > > > > are > > > > > > > >> some > > > > > > > >> > >>>>>>> downsides (as predicted by contributors on the > > JIRA). > > > > I'd > > > > > be > > > > > > > >> > >>>> interested > > > > > > > >> > >>>>> to > > > > > > > >> > >>>>>>> hear if they are too burdensome. In particular, > your > > > IDE > > > > > > > >> experience > > > > > > > >> > >>>>> gets a > > > > > > > >> > >>>>>>> little convoluted as you will need to add to your > > > build > > > > > > path, > > > > > > > a > > > > > > > >> jar > > > > > > > >> > >>>> with > > > > > > > >> > >>>>>>> the relocated pbs. A pain. > > > > > > > >> > >>>>>>> > > > > > > > >> > >>>>>>> Thanks, > > > > > > > >> > >>>>>>> St.Ack > > > > > > > >> > >>>>>>> > > > > > > > >> > >>>>>>> > > > > > > > >> > >>>>>>>> On Wed, Apr 13, 2016 at 6:09 AM, Stack < > > > > [email protected] > > > > > > > > > > > > > >> wrote: > > > > > > > >> > >>>>>>>> > > > > > > > >> > >>>>>>>> On Tue, Apr 12, 2016 at 9:26 PM, Sean Busbey < > > > > > > > >> [email protected]> > > > > > > > >> > >>>>> wrote: > > > > > > > >> > >>>>>>>> > > > > > > > >> > >>>>>>>>>> On Tue, Apr 12, 2016 at 6:17 PM, Stack < > > > > > [email protected] > > > > > > > > > > > > > > >> > wrote: > > > > > > > >> > >>>>>>>>>> > > > > > > > >> > >>>>>>>>>> > > > > > > > >> > >>>>>>>>>> On an initial pass, the only difficult part > seems > > > to > > > > be > > > > > > > >> > >>>> interaction > > > > > > > >> > >>>>>>>>> with > > > > > > > >> > >>>>>>>>>> HDFS in asyncwal (might just pull in the HDFS > > > > > messages). > > > > > > > >> > >>>>>>>>>> > > > > > > > >> > >>>>>>>>>> > > > > > > > >> > >>>>>>>>> > > > > > > > >> > >>>>>>>>> I have some idea how we can make this work > either > > by > > > > > > pushing > > > > > > > >> > >>>> asyncwal > > > > > > > >> > >>>>>>>>> upstream to HDFS or through some maven tricks, > > > > depending > > > > > > on > > > > > > > >> how > > > > > > > >> > >>> much > > > > > > > >> > >>>>>>>>> time we have. > > > > > > > >> > >>>>>>>>> > > > > > > > >> > >>>>>>>> > > > > > > > >> > >>>>>>>> Maven tricks? Tell us more. Here or drop a note > up > > in > > > > the > > > > > > > >> issue. > > > > > > > >> > >>>>>>>> Thanks Sean, > > > > > > > >> > >>>>>>>> St.Ack > > > > > > > >> > >>>>>>>> > > > > > > > >> > >>>>>>> > > > > > > > >> > >>>>>>> > > > > > > > >> > >>>>> > > > > > > > >> > >>>>> > > > > > > > >> > >>>>> > > > > > > > >> > >>>>> -- > > > > > > > >> > >>>>> busbey > > > > > > > >> > >>>>> > > > > > > > >> > >>>> > > > > > > > >> > >>> > > > > > > > >> > >> > > > > > > > >> > >> > > > > > > > >> > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > Best regards, > > > > - Andy > > > > Problems worthy of attack prove their worth by hitting back. - Piet Hein > > (via Tom White) > > >
