Re: [NOTICE] Removal of protobuf classes from Hadoop Token's public APIs' signature

2020-05-17 Thread Vinayakumar B
Hi Wei-Chiu and Steve,

Thanks for sharing insights.

I have also tried to compile and run Ozone against
trunk (3.4.0-SNAPSHOT), which has the shaded and upgraded protobuf.

Beyond the use of internal protobuf APIs, which breaks compilation, I
found another major problem: the Hadoop-RPC implementations in downstream
projects are based on the non-shaded protobuf
classes.

'ProtobufRpcEngine' takes its arguments and tries to typecast them to the
protobuf 'Message' class, which it expects to be the 3.7 version from the
shaded package (i.e. o.a.h.thirdparty.*).

So, unless downstream projects move to the 'hadoop-thirdparty' protobuf
classes, this issue will persist even after the compilation failures caused
by internal use of private APIs with protobuf signatures are fixed.
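The cast failure above can be illustrated without protobuf at all: on the JVM a type is identified by its fully qualified name, so a class compiled against com.google.protobuf.Message can never satisfy a cast to the relocated o.a.h.thirdparty.protobuf.Message, even though the two are identical code. A minimal sketch (the interface names below are stand-ins, not the real classes):

```java
public class ShadedCastDemo {
    // Stand-ins for the two unrelated Message hierarchies.
    interface GoogleMessage {}      // mimics com.google.protobuf.Message
    interface ThirdpartyMessage {}  // mimics o.a.h.thirdparty.protobuf.Message

    // A downstream request type compiled against the non-shaded classes.
    static class DownstreamRequest implements GoogleMessage {}

    public static void main(String[] args) {
        Object param = new DownstreamRequest();
        // After shading, ProtobufRpcEngine effectively performs
        // (ThirdpartyMessage) param, which fails at runtime:
        System.out.println(param instanceof ThirdpartyMessage); // false
        try {
            ThirdpartyMessage m = (ThirdpartyMessage) param;
        } catch (ClassCastException e) {
            System.out.println("ClassCastException");
        }
    }
}
```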

I found a possible workaround for this problem.
Please check https://issues.apache.org/jira/browse/HADOOP-17046
  This Jira proposes to keep the existing ProtobufRpcEngine as-is (without
shading and with the protobuf-2.5.0 implementation) to support downstream
implementations.
  The new ProtobufRpcEngine2 uses the shaded protobuf classes within Hadoop,
and downstream projects that wish to upgrade to protobuf 3.x can adopt it later.

For Ozone compilation:
  I have submitted two PRs to prepare for the Hadoop 3.3+ upgrade. These PRs
remove the dependency on Hadoop for those internal APIs and add Ozone's own
copies using non-shaded protobuf.
HDDS-3603: https://github.com/apache/hadoop-ozone/pull/932
HDDS-3604: https://github.com/apache/hadoop-ozone/pull/933

Also, I ran some tests on Ozone after applying these PRs and
HADOOP-17046 with 3.4.0, and the tests seem to pass.

Please help review these PRs.

Thanks,
-Vinay


On Wed, Apr 29, 2020 at 5:02 PM Steve Loughran wrote:

> Okay.
>
> I am not going to be a purist and say "what were they doing, using our
> private APIs?" because, as we all know, with things like UGI tagged @private
> there's been no way to get things done without getting into the
> private stuff.
>
> But why did we do the protobuf changes? So that we could update our private
> copy of protobuf without breaking every single downstream application. The
> great protobuf upgrade to 2.5 is not something we wanted to repeat. When
> was that? Before hadoop-2.2 shipped? I certainly remember a couple of weeks
> where absolutely nothing would build whatsoever, not until every downstream
> project had upgraded to the same version of the library.
>
> If you ever want to see an upgrade which makes a guava update seem a minor
> detail, protobuf upgrades are it. Hence the shading.
>
> HBase
> =
>
> it looks like HBase has been using deep internal stuff. That is,
> "unfortunate". I think in that world we have to look and say is there
> something specific we can do here to help HBase in a way we could also
> backport. They shouldn't need those IPC internals.
>
> Tez & Tokens
> =
>
> I didn't know Tez was using those protobuf APIs internally. That is,
> "unfortunate".
>
> What is key is this: without us moving those methods, things like Spark
> wouldn't work. And they weren't even using the methods, just trying to work
> with Token for job submission.
>
> All Tez should need is a byte array serialization of a token. Given Token
> is also Writable, that could be done via WritableUtils in a way which will
> also work with older releases.
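The byte-array round trip described above can be sketched self-contained. MiniToken below is a hypothetical miniature of org.apache.hadoop.security.token.Token, keeping only the Writable-style write/readFields contract with length-prefixed fields; the real Token also serializes kind and service:

```java
import java.io.*;

public class TokenBytesDemo {
    // Hypothetical miniature of Hadoop's Token, which is a Writable:
    // it serializes itself via write(DataOutput) / readFields(DataInput).
    static class MiniToken {
        byte[] identifier = new byte[0];
        byte[] password = new byte[0];

        void write(DataOutput out) throws IOException {
            out.writeInt(identifier.length);
            out.write(identifier);
            out.writeInt(password.length);
            out.write(password);
        }

        void readFields(DataInput in) throws IOException {
            identifier = new byte[in.readInt()];
            in.readFully(identifier);
            password = new byte[in.readInt()];
            in.readFully(password);
        }
    }

    // Serialize the token to a byte array, as a project like Tez could
    // do instead of reaching into protobuf internals.
    static byte[] toBytes(MiniToken t) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        t.write(new DataOutputStream(buf));
        return buf.toByteArray();
    }

    static MiniToken fromBytes(byte[] bytes) throws IOException {
        MiniToken t = new MiniToken();
        t.readFields(new DataInputStream(new ByteArrayInputStream(bytes)));
        return t;
    }

    public static void main(String[] args) throws IOException {
        MiniToken t = new MiniToken();
        t.identifier = "jobToken".getBytes("UTF-8");
        t.password = "secret".getBytes("UTF-8");
        MiniToken back = fromBytes(toBytes(t));
        System.out.println(new String(back.identifier, "UTF-8")); // jobToken
    }
}
```

Because the byte format is fixed by the Writable contract rather than by any protobuf version, the same bytes work against older releases too.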
>
> Ozone
> =
>
> When these were part of/in-sync with the hadoop build there wouldn't have
> been problems. Now there are. Again, they're going in deep, but here
> clearly to simulate some behaviour. Any way to do that differently?
>
> Ratis
> =
>
> No idea.
>
> On Wed, 29 Apr 2020 at 07:12, Wei-Chiu Chuang
> wrote:
>
> > Most of the problems are downstream applications using Hadoop's private
> > APIs.
> >
> > Tez:
> >
> > 17:08:38 2020/04/16 00:08:38 INFO: [ERROR] COMPILATION ERROR :
> > 17:08:38 2020/04/16 00:08:38 INFO: [INFO]
> > -
> > 17:08:38 2020/04/16 00:08:38 INFO: [ERROR]
> >
> >
> /grid/0/jenkins/workspace/workspace/CDH-CANARY-parallel-centos7/SOURCES/tez/tez-plugins/tez-aux-services/src/main/java/org/apache/tez/auxservices/ShuffleHandler.java:[757,45]
> > incompatible types: com.google.protobuf.ByteString cannot be converted
> > to org.apache.hadoop.thirdparty.protobuf.ByteString
> > 17:08:38 2020/04/16 00:08:38 INFO: [INFO] 1 error
> >
> >
> > Tez keeps track of job tokens internally.
> > The change would look like this:
> >
> > private void recordJobShuffleInfo(JobID jobId, String user,
> >     Token<JobTokenIdentifier> jobToken) throws IOException {
> >   if (stateDb != null) {
> >     TokenProto tokenProto = ProtobufHelper.protoFromToken(jobToken);
> >     /* TokenProto tokenProto = TokenProto.newBuilder()
> >         .setIdentifier(ByteString.copyFrom(jobToken.getIdentifier()))
> >         .setPassword(ByteString.copyFrom(jobToken.getPassword()))
> >         .s

[jira] [Resolved] (HADOOP-16750) Backport HADOOP-16548 - ABFS: Config to enable/disable flush operation issue to branch-3.2

2020-05-17 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka resolved HADOOP-16750.

Resolution: Duplicate

Reopened and closed this issue to change the resolution.

> Backport HADOOP-16548 - ABFS: Config to enable/disable flush operation issue 
> to branch-3.2
> --
>
> Key: HADOOP-16750
> URL: https://issues.apache.org/jira/browse/HADOOP-16750
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Affects Versions: 3.2
>Reporter: Mandar Inamdar
>Assignee: Sneha Vijayarajan
>Priority: Minor
>
> Make flush operation enabled/disabled through configuration. This is part of 
> performance improvements for ABFS driver.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Reopened] (HADOOP-16750) Backport HADOOP-16548 - ABFS: Config to enable/disable flush operation issue to branch-3.2

2020-05-17 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka reopened HADOOP-16750:





[jira] [Created] (HADOOP-17046) Support downstreams' existing Hadoop-rpc implementations using non-shaded protobuf classes.

2020-05-17 Thread Vinayakumar B (Jira)
Vinayakumar B created HADOOP-17046:
--

 Summary: Support downstreams' existing Hadoop-rpc implementations 
using non-shaded protobuf classes.
 Key: HADOOP-17046
 URL: https://issues.apache.org/jira/browse/HADOOP-17046
 Project: Hadoop Common
  Issue Type: Improvement
  Components: rpc-server
Affects Versions: 3.3.0
Reporter: Vinayakumar B


After the upgrade and shading of protobuf to version 3.7, existing Hadoop-RPC 
client-server implementations using ProtobufRpcEngine will not work.

So, this Jira proposes to keep the existing ProtobufRpcEngine as-is (without 
shading and with the protobuf-2.5.0 implementation) to support downstream 
implementations.

The new ProtobufRpcEngine2 uses the shaded protobuf classes within Hadoop, and 
downstream projects that wish to upgrade to protobuf 3.x can adopt it later.
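The proposed coexistence of the two engines can be sketched abstractly: each engine casts to its own Message hierarchy, so existing clients keep using the old engine while Hadoop-internal protocols move to the new one. All names below are illustrative stand-ins, not the real Hadoop classes:

```java
public class DualEngineDemo {
    // Stand-ins for the two Message hierarchies.
    interface LegacyMessage {}   // mimics com.google.protobuf.Message (2.5)
    interface ShadedMessage {}   // mimics o.a.h.thirdparty.protobuf.Message (3.x)

    interface RpcEngine { String invoke(Object param); }

    // Old engine: keeps accepting non-shaded messages, so existing
    // downstream stubs continue to work unchanged.
    static class LegacyEngine implements RpcEngine {
        public String invoke(Object param) {
            LegacyMessage m = (LegacyMessage) param;
            return "legacy";
        }
    }

    // New engine: used inside Hadoop and by upgraded downstreams.
    static class ShadedEngine implements RpcEngine {
        public String invoke(Object param) {
            ShadedMessage m = (ShadedMessage) param;
            return "shaded";
        }
    }

    static class OldClientRequest implements LegacyMessage {}
    static class NewClientRequest implements ShadedMessage {}

    public static void main(String[] args) {
        // Each protocol is bound to the engine matching the protobuf
        // classes its messages were compiled against.
        System.out.println(new LegacyEngine().invoke(new OldClientRequest()));
        System.out.println(new ShadedEngine().invoke(new NewClientRequest()));
    }
}
```

In real Hadoop the binding is per protocol; picking the engine by the protocol's message classes is what lets both protobuf versions live in one process.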






Apache Hadoop qbt Report: branch2.10+JDK7 on Linux/x86

2020-05-17 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/688/

No changes




-1 overall


The following subsystems voted -1:
asflicense compile mvninstall mvnsite pathlen unit xml


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc shellcheck whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

XML :

   Parsing Error(s):
      hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/empty-configuration.xml
      hadoop-tools/hadoop-azure/src/config/checkstyle-suppressions.xml
      hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/public/crossdomain.xml
      hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/src/main/webapp/public/crossdomain.xml

Failed junit tests :

   hadoop.io.compress.TestCompressorDecompressor
   hadoop.io.compress.snappy.TestSnappyCompressorDecompressor
   hadoop.hdfs.qjournal.server.TestJournalNodeRespectsBindHostKeys
   hadoop.hdfs.TestMultipleNNPortQOP
   hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes
   hadoop.hdfs.server.namenode.TestNameNodeMXBean
   hadoop.mapreduce.filecache.TestClientDistributedCacheManager

   mvninstall:
      https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/688/artifact/out/patch-mvninstall-root.txt [348K]

   compile:
      https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/688/artifact/out/patch-compile-root.txt [268K]

   cc:
      https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/688/artifact/out/patch-compile-root.txt [268K]

   javac:
      https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/688/artifact/out/patch-compile-root.txt [268K]

   checkstyle:
      https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/688/artifact/out/diff-checkstyle-root.txt [16M]

   mvnsite:
      https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/688/artifact/out/patch-mvnsite-root.txt [28K]

   pathlen:
      https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/688/artifact/out/pathlen.txt [12K]

   shellcheck:
      https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/688/artifact/out/diff-patch-shellcheck.txt [72K]

   whitespace:
      https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/688/artifact/out/whitespace-eol.txt [12M]
      https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/688/artifact/out/whitespace-tabs.txt [1.3M]

   xml:
      https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/688/artifact/out/xml.txt [12K]

   javadoc:
      https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/688/artifact/out/patch-javadoc-root.txt [224K]

   unit:
      https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/688/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt [200K]
      https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/688/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt [272K]
      https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/688/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs_src_contrib_bkjournal.txt [8.0K]
      https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/688/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-api.txt [4.0K]
      https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/688/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-common.txt [8.0K]
      https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/688/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-common.txt [8.0K]
      https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/688/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt [8.0K]
      https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/688/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-applicationhistoryservice.txt [8.0K]
      https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/688/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt [8.0K]
      https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/688/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-tests.txt [8.0K]
      https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/688/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt [8.0K]
      https://builds.apache.org/job/hadoop-qb