Hi All,

I have updated the PR as per @Owen O'Malley <owen.omal...@gmail.com>'s
suggestions.

  i.  Renamed the module to 'hadoop-shaded-protobuf37'
  ii. Kept the shaded package as 'o.a.h.thirdparty.protobuf37'

Please review!!

Thanks,
-Vinay

On Sat, Sep 28, 2019 at 10:29 AM 张铎(Duo Zhang) <palomino...@gmail.com> wrote:

> For HBase we have a separate repo for hbase-thirdparty:
>
> https://github.com/apache/hbase-thirdparty
>
> We publish the artifacts to Nexus, so we do not need to include
> binaries in our git repo; we just add a dependency in the pom.
>
> https://mvnrepository.com/artifact/org.apache.hbase.thirdparty/hbase-shaded-protobuf
>
> It has its own release cycle; we release only when there are special
> requirements or we want to upgrade some of the dependencies. This is the
> vote thread for the newest release, where we want to provide a shaded
> gson for jdk7:
>
> https://lists.apache.org/thread.html/f12c589baabbc79c7fb2843422d4590bea982cd102e2bd9d21e9884b@%3Cdev.hbase.apache.org%3E
>
> Thanks.
>
> Vinayakumar B <vinayakum...@apache.org> wrote on Sat, Sep 28, 2019 at 1:28 AM:
>
> > Please find replies inline.
> >
> > -Vinay
> >
> > On Fri, Sep 27, 2019 at 10:21 PM Owen O'Malley <owen.omal...@gmail.com>
> > wrote:
> >
> > > I'm very unhappy with this direction. In particular, I don't think git
> > > is a good place for distribution of binary artifacts. Furthermore, the
> > > PMC shouldn't be releasing anything without a release vote.
> >
> > The proposed solution doesn't release any binaries in git. It's actually
> > a complete sub-project which follows the entire release process,
> > including a VOTE in public. I have already mentioned that the release
> > process is similar to Hadoop's. To be specific, it uses (almost) the same
> > script used in Hadoop to generate artifacts, sign them, and deploy them
> > to the staging repository. Please let me know if I am conveying anything
> > wrong.
> >
> > > I'd propose that we make a third party module that contains the
> > > *source* of the pom files to build the relocated jars. This should
> > > absolutely be treated as a last resort for the mostly Google projects
> > > that regularly break binary compatibility (eg. Protobuf & Guava).
> >
> > The same has been implemented in the PR
> > https://github.com/apache/hadoop-thirdparty/pull/1. Please check and let
> > me know if I misunderstood. Yes, this is the last option we have AFAIK.
> >
> > > In terms of naming, I'd propose something like:
> > >
> > > org.apache.hadoop.thirdparty.protobuf2_5
> > > org.apache.hadoop.thirdparty.guava28
> > >
> > > In particular, I think we absolutely need to include the version of the
> > > underlying project. On the other hand, since we should not be shading
> > > *everything* we can drop the leading com.google.
> >
> > IMO, this naming convention makes it easy to identify the underlying
> > project, but it will be difficult to maintain going forward if the
> > underlying project's version changes. Since the thirdparty module has its
> > own releases, each of those releases can be mapped to a specific version
> > of the underlying project. The binary artifact can even include a
> > MANIFEST with the underlying project details, as per Steve's suggestion
> > on HADOOP-13363.
> > That said, if you still prefer to have the underlying project's version
> > in the artifact id, it can be done.
> >
> > > The Hadoop project can make releases of the thirdparty module:
> > >
> > > <dependency>
> > >   <groupId>org.apache.hadoop</groupId>
> > >   <artifactId>hadoop-thirdparty-protobuf25</artifactId>
> > >   <version>1.0</version>
> > > </dependency>
> > >
> > > Note that the version has to be the hadoop thirdparty release number,
> > > which is part of why you need to have the underlying version in the
> > > artifact name. These we can push to maven central as new releases from
> > > Hadoop.
> >
> > Exactly, the same has been implemented in the PR. The hadoop-thirdparty
> > module has its own releases. In the HADOOP Jira, thirdparty versions can
> > be differentiated using the prefix "thirdparty-".
> >
> > The same solution is being followed in HBase. Maybe people involved in
> > HBase can add some points here.
> >
> > Thoughts?
> >
> > > .. Owen
> > >
> > > On Fri, Sep 27, 2019 at 8:38 AM Vinayakumar B <vinayakum...@apache.org>
> > > wrote:
> > >
> > >> Hi All,
> > >>
> > >> I wanted to discuss a separate repo for the thirdparty dependencies
> > >> which we need to shade and include in the Hadoop components' jars.
> > >>
> > >> Apologies for the big text ahead, but this needs clear explanation!!
> > >>
> > >> Right now the most needed such dependency is protobuf. The protobuf
> > >> dependency has not been upgraded beyond 2.5.0 for fear that downstream
> > >> builds, which depend on the transitive protobuf dependency coming from
> > >> Hadoop's jars, may fail with the upgrade. Apparently protobuf does not
> > >> guarantee source compatibility, though it guarantees wire
> > >> compatibility between versions. Because of this behavior, a version
> > >> upgrade may cause breakage in known and unknown (private?)
> > >> downstreams.
> > >>
> > >> So to tackle this, we came up with the following proposal in
> > >> HADOOP-13363.
> > >>
> > >> Luckily, as far as I know, no APIs, either public to users or between
> > >> Hadoop processes, directly use protobuf classes in their signatures.
> > >> (If any exist, please let us know.)
> > >>
> > >> Proposal:
> > >> ------------
> > >>
> > >> 1. Create artifact(s) which contain the shaded dependencies. All such
> > >>    shading/relocation will use the known prefix
> > >>    **org.apache.hadoop.thirdparty.**.
> > >> 2. To start with, for the protobuf jar (ex:
> > >>    o.a.h.thirdparty:hadoop-shaded-protobuf), all
> > >>    **com.google.protobuf** classes will be relocated as
> > >>    **org.apache.hadoop.thirdparty.com.google.protobuf**.
> > >> 3. Hadoop modules which need protobuf as a dependency will add this
> > >>    shaded artifact as a dependency (ex:
> > >>    o.a.h.thirdparty:hadoop-shaded-protobuf).
> > >> 4. All previous usages of "com.google.protobuf" will be changed to
> > >>    "org.apache.hadoop.thirdparty.com.google.protobuf" in the code and
> > >>    committed. Please note, this replacement is one-time, directly in
> > >>    the source code, NOT during compile and package.
> > >> 5. Once all usages of "com.google.protobuf" are relocated, Hadoop no
> > >>    longer cares which version of the original "protobuf-java" is in
> > >>    the dependency tree.
> > >> 6. Just keep "protobuf-java:2.5.0" in the dependency tree so as not to
> > >>    break the downstreams. Hadoop itself will actually be using the
> > >>    latest protobuf present in
> > >>    "o.a.h.thirdparty:hadoop-shaded-protobuf".
> > >> 7. Coming back to the separate repo, the following are the main
> > >>    reasons for keeping the shaded dependency artifact in a separate
> > >>    repo instead of a submodule.
> > >>
> > >>    7a. These artifacts need not be built all the time. They need to be
> > >>        built only when there is a change in the dependency version or
> > >>        the build process.
> > >>    7b. If added as a "submodule in Hadoop repo",
> > >>        maven-shade-plugin:shade will execute only in the package
> > >>        phase. That means "mvn compile" or "mvn test-compile" will
> > >>        fail, because at that point the artifact will not have the
> > >>        relocated classes; it will have the original classes instead,
> > >>        resulting in compilation failure. The workaround is to build
> > >>        the thirdparty submodule first and exclude the "thirdparty"
> > >>        submodule in other executions. This will be a complex process
> > >>        compared to keeping it in a separate repo.
> > >>    7c. The separate repo will be a subproject of Hadoop, using the
> > >>        same HADOOP jira project, with different versioning prefixed
> > >>        with "thirdparty-" (ex: thirdparty-1.0.0).
> > >>    7d. The separate repo will have the same release process as Hadoop.
> > >>
> > >> HADOOP-13363 (https://issues.apache.org/jira/browse/HADOOP-13363) is
> > >> an umbrella jira tracking the changes for the protobuf upgrade.
> > >>
> > >> A PR (https://github.com/apache/hadoop-thirdparty/pull/1) has been
> > >> raised for the separate repo creation in HADOOP-16595
> > >> (https://issues.apache.org/jira/browse/HADOOP-16595).
> > >>
> > >> Please provide your inputs on the proposal and review the PR so that
> > >> we can proceed with the proposal.
> > >>
> > >> -Thanks,
> > >> Vinay
> > >>
> > >> On Fri, Sep 27, 2019 at 11:54 AM Vinod Kumar Vavilapalli
> > >> <vino...@apache.org> wrote:
> > >>
> > >> > Moving the thread to the dev lists.
> > >> >
> > >> > Thanks
> > >> > +Vinod
> > >> >
> > >> > > On Sep 23, 2019, at 11:43 PM, Vinayakumar B
> > >> > > <vinayakum...@apache.org> wrote:
> > >> > >
> > >> > > Thanks Marton,
> > >> > >
> > >> > > The 'hadoop-thirdparty' repo that was created is empty right now.
> > >> > > Whether to use that repo for the shaded artifact or not will be
> > >> > > tracked in the HADOOP-13363 umbrella jira. Please feel free to
> > >> > > join the discussion.
> > >> > >
> > >> > > No existing codebase is being moved out of the hadoop repo, so I
> > >> > > think right now we are good to go.
> > >> > >
> > >> > > -Vinay
> > >> > >
> > >> > > On Mon, Sep 23, 2019 at 11:38 PM Marton Elek <e...@apache.org>
> > >> > > wrote:
> > >> > >
> > >> > >> I am not sure if it's defined when a vote is required.
> > >> > >>
> > >> > >> https://www.apache.org/foundation/voting.html
> > >> > >>
> > >> > >> Personally I think it's a big enough change to send a
> > >> > >> notification to the dev lists with a 'lazy consensus' closure.
> > >> > >>
> > >> > >> Marton
> > >> > >>
> > >> > >> On 2019/09/23 17:46:37, Vinayakumar B <vinayakum...@apache.org>
> > >> > >> wrote:
> > >> > >>> Hi,
> > >> > >>>
> > >> > >>> As discussed in HADOOP-13363, the protobuf 3.x jar (and maybe
> > >> > >>> more in future) will be kept as a shaded artifact in a separate
> > >> > >>> repo, which will be referred to as a dependency in hadoop
> > >> > >>> modules. This approach avoids shading of every submodule during
> > >> > >>> build.
> > >> > >>>
> > >> > >>> So the question is: is any VOTE required before asking to
> > >> > >>> create a git repo?
> > >> > >>>
> > >> > >>> On the selfserve platform
> > >> > >>> https://gitbox.apache.org/setup/newrepo.html
> > >> > >>> I can see that the requester should be a PMC member.
> > >> > >>>
> > >> > >>> Wanted to confirm here first.
> > >> > >>>
> > >> > >>> -Vinay
> > >> > >>
> > >> > >> ---------------------------------------------------------------------
> > >> > >> To unsubscribe, e-mail: private-unsubscr...@hadoop.apache.org
> > >> > >> For additional commands, e-mail: private-h...@hadoop.apache.org
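
For reference, the relocation described in items 1 and 2 of the proposal
could be expressed with maven-shade-plugin roughly as follows. This is only
a minimal sketch and is not taken from the PR: the plugin version is
omitted, and it assumes that protobuf-java is declared as a dependency of
the module that carries this configuration.

```xml
<!-- Sketch only: relocates com.google.protobuf under the Hadoop prefix. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <!-- The shade goal runs in the package phase. -->
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <relocations>
          <relocation>
            <!-- Original package of the bundled classes. -->
            <pattern>com.google.protobuf</pattern>
            <!-- Package the classes are moved to inside the shaded jar. -->
            <shadedPattern>org.apache.hadoop.thirdparty.com.google.protobuf</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

Because the goal is bound to the package phase, the relocated classes exist
only once the jar has been packaged, which is the limitation described in
item 7b of the proposal.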