Re: [DISCUSS] Shade guava into hadoop-thirdparty

2020-04-04 Thread Wei-Chiu Chuang
Great question!

I can run the Java API Compliance Checker to detect any API changes. I guess
that's the only way to find out.
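
For reference, a minimal sketch of the kind of invocation I mean, using the
japi-compliance-checker tool (the jar names below are placeholders, not the
exact artifacts we would check):

    japi-compliance-checker hadoop-common-OLD.jar hadoop-common-NEW.jar

It produces an HTML report listing the source- and binary-incompatible
changes between the two jars.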

On Sat, Apr 4, 2020 at 1:19 PM Igor Dvorzhak  wrote:

> How will this proposal impact public APIs? I.e., does Hadoop expose any
> Guava classes in the client APIs that would require recompiling all client
> applications because they would need to use the shaded Guava classes?
>
> On Sat, Apr 4, 2020 at 12:13 PM Wei-Chiu Chuang 
> wrote:
>
>> Hi Hadoop devs,
>>
>> I spent a good part of the past 7 months working with a dozen colleagues
>> to update the Guava version in Cloudera's software (that includes Hadoop,
>> HBase, Spark, Hive, Cloudera Manager, and others; more than 20 projects
>> in all).
>>
>> After 7 months, I finally came to a conclusion: updating to Hadoop 3.3 /
>> 3.2.1 / 3.1.3, even if you just go from Hadoop 3.0 / 3.1.0, is going to
>> be really hard because of Guava. Because of Guava, the amount of work to
>> certify a minor release update is almost equivalent to that of a major
>> release update.
>>
>> That is because:
>> (1) Going from Guava 11 to Guava 27 is a big jump; there are several
>> incompatible API changes in many places. Too bad the Google developers
>> are not sympathetic to their users.
>> (2) Guava is used in all Hadoop jars, not just the Hadoop servers but
>> also the client jars and the Hadoop common libs.
>> (3) The Hadoop library is used in practically all software at Cloudera.
>>
>> Here is my proposal:
>> (1) Shade Guava into hadoop-thirdparty and relocate the package to
>> org.hadoop.thirdparty.com.google.common.*.
>> (2) Make a hadoop-thirdparty 1.1.0 release.
>> (3) Update existing references to Guava to the relocated path. There are
>> more than 2,000 imports that need an update.
>> (4) Release Hadoop 3.3.1 / 3.2.2 containing this change.
>>
>> In this way, we will be able to update Guava in Hadoop in the future
>> without disrupting Hadoop applications.
>>
>> Note: HBase already did this, and this Guava update project would have
>> been much more difficult if it hadn't.
>>
>> Thoughts? Other options include:
>> (1) Force downstream applications to migrate to the Hadoop client
>> artifacts listed here:
>> https://hadoop.apache.org/docs/r3.1.1/hadoop-project-dist/hadoop-common/DownstreamDev.html
>> but that's nearly impossible.
>> (2) Migrate from Guava to Java APIs. I suppose this is a big project,
>> and I can't estimate how much work it would be.
>>
>> Weichiu
>>
>


Re: [DISCUSS] Shade guava into hadoop-thirdparty

2020-04-04 Thread Igor Dvorzhak
How will this proposal impact public APIs? I.e., does Hadoop expose any Guava
classes in the client APIs that would require recompiling all client
applications because they would need to use the shaded Guava classes?
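
To make the concern concrete, a hypothetical sketch (ExampleClientApi is
made up, not an actual Hadoop interface): if a public client API referenced
a Guava type directly, like

    public interface ExampleClientApi {
        // Returns a Guava type as part of the public signature.
        com.google.common.collect.ImmutableList<String> listSomething();
    }

then relocating Guava would change the signature to the shaded
org.hadoop.thirdparty.com.google.common.collect.ImmutableList, a source- and
binary-incompatible change for callers compiled against the old type.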

On Sat, Apr 4, 2020 at 12:13 PM Wei-Chiu Chuang  wrote:

> Hi Hadoop devs,
>
> I spent a good part of the past 7 months working with a dozen colleagues
> to update the Guava version in Cloudera's software (that includes Hadoop,
> HBase, Spark, Hive, Cloudera Manager, and others; more than 20 projects
> in all).
>
> After 7 months, I finally came to a conclusion: updating to Hadoop 3.3 /
> 3.2.1 / 3.1.3, even if you just go from Hadoop 3.0 / 3.1.0, is going to be
> really hard because of Guava. Because of Guava, the amount of work to
> certify a minor release update is almost equivalent to that of a major
> release update.
>
> That is because:
> (1) Going from Guava 11 to Guava 27 is a big jump; there are several
> incompatible API changes in many places. Too bad the Google developers are
> not sympathetic to their users.
> (2) Guava is used in all Hadoop jars, not just the Hadoop servers but also
> the client jars and the Hadoop common libs.
> (3) The Hadoop library is used in practically all software at Cloudera.
>
> Here is my proposal:
> (1) Shade Guava into hadoop-thirdparty and relocate the package to
> org.hadoop.thirdparty.com.google.common.*.
> (2) Make a hadoop-thirdparty 1.1.0 release.
> (3) Update existing references to Guava to the relocated path. There are
> more than 2,000 imports that need an update.
> (4) Release Hadoop 3.3.1 / 3.2.2 containing this change.
>
> In this way, we will be able to update Guava in Hadoop in the future
> without disrupting Hadoop applications.
>
> Note: HBase already did this, and this Guava update project would have been
> much more difficult if it hadn't.
>
> Thoughts? Other options include:
> (1) Force downstream applications to migrate to the Hadoop client
> artifacts listed here:
> https://hadoop.apache.org/docs/r3.1.1/hadoop-project-dist/hadoop-common/DownstreamDev.html
> but that's nearly impossible.
> (2) Migrate from Guava to Java APIs. I suppose this is a big project, and I
> can't estimate how much work it would be.
>
> Weichiu
>




[DISCUSS] Shade guava into hadoop-thirdparty

2020-04-04 Thread Wei-Chiu Chuang
Hi Hadoop devs,

I spent a good part of the past 7 months working with a dozen colleagues
to update the Guava version in Cloudera's software (that includes Hadoop,
HBase, Spark, Hive, Cloudera Manager, and others; more than 20 projects
in all).

After 7 months, I finally came to a conclusion: updating to Hadoop 3.3 /
3.2.1 / 3.1.3, even if you just go from Hadoop 3.0 / 3.1.0, is going to be
really hard because of Guava. Because of Guava, the amount of work to
certify a minor release update is almost equivalent to that of a major
release update.

That is because:
(1) Going from Guava 11 to Guava 27 is a big jump; there are several
incompatible API changes in many places. Too bad the Google developers are
not sympathetic to their users.
(2) Guava is used in all Hadoop jars, not just the Hadoop servers but also
the client jars and the Hadoop common libs.
(3) The Hadoop library is used in practically all software at Cloudera.

Here is my proposal:
(1) Shade Guava into hadoop-thirdparty and relocate the package to
org.hadoop.thirdparty.com.google.common.*.
(2) Make a hadoop-thirdparty 1.1.0 release.
(3) Update existing references to Guava to the relocated path. There are
more than 2,000 imports that need an update.
(4) Release Hadoop 3.3.1 / 3.2.2 containing this change.
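
For step (1), a minimal maven-shade-plugin sketch of the relocation,
using the package path from the proposal (illustrative only, not the
actual hadoop-thirdparty pom):

    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <executions>
        <execution>
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
          <configuration>
            <relocations>
              <relocation>
                <pattern>com.google.common</pattern>
                <shadedPattern>org.hadoop.thirdparty.com.google.common</shadedPattern>
              </relocation>
            </relocations>
          </configuration>
        </execution>
      </executions>
    </plugin>

The plugin copies the Guava classes into the shaded jar under the relocated
package and rewrites the bytecode references accordingly.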

In this way, we will be able to update Guava in Hadoop in the future
without disrupting Hadoop applications.
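
Step (3) is then a mostly mechanical rewrite of imports. A hypothetical
before/after (Preconditions is just one example of a commonly imported
Guava class):

    // before: direct Guava dependency
    import com.google.common.base.Preconditions;

    // after: relocated copy from hadoop-thirdparty, per step (1)
    import org.hadoop.thirdparty.com.google.common.base.Preconditions;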

Note: HBase already did this, and this Guava update project would have been
much more difficult if it hadn't.

Thoughts? Other options include:
(1) Force downstream applications to migrate to the Hadoop client artifacts
listed here:
https://hadoop.apache.org/docs/r3.1.1/hadoop-project-dist/hadoop-common/DownstreamDev.html
but that's nearly impossible.
(2) Migrate from Guava to Java APIs. I suppose this is a big project, and I
can't estimate how much work it would be.
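
For a sense of what option (2) would look like, a small sketch of typical
Guava-to-JDK replacements (illustrative examples only, not a survey of what
Hadoop actually uses):

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Objects;

    public class GuavaToJdkExample {
        public static void main(String[] args) {
            // Guava: Preconditions.checkNotNull(value, "message")
            String value = Objects.requireNonNull("x", "value must not be null");

            // Guava: Lists.newArrayList()
            List<String> list = new ArrayList<>();
            list.add(value);

            // Guava: Joiner.on(",").join(list)
            System.out.println(String.join(",", list));
        }
    }

The catch is that some Guava APIs (the immutable collections, for example)
have no drop-in JDK equivalent on Java 8, which is part of why the effort is
hard to estimate.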

Weichiu


Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2020-04-04 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1459/

[Apr 3, 2020 1:59:07 AM] (github) MAPREDUCE-7268. Fix TestMapreduceConfigFields 
(#1935)
[Apr 3, 2020 7:37:41 AM] (pjoseph) YARN-10120. Amendment fix for Java Doc.
[Apr 3, 2020 8:27:02 AM] (ayushsaxena) HADOOP-16952. Add .diff to gitignore. 
Contributed by Ayush Saxena.
[Apr 3, 2020 3:13:41 PM] (github) HDFS-15258. RBF: Mark Router FSCK unstable. 
(#1934)
[Apr 3, 2020 10:20:51 PM] (iwasakims) HADOOP-16647. Support OpenSSL 1.1.1 LTS. 
Contributed by Rakesh




-1 overall


The following subsystems voted -1:
asflicense findbugs pathlen unit xml


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

XML :

   Parsing Error(s): 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-excerpt.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags2.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-sample-output.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/fair-scheduler-invalid.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/yarn-site-with-invalid-allocation-file-ref.xml
 

FindBugs :

   
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common
 
   org.apache.hadoop.yarn.server.webapp.WebServiceClient.sslFactory should 
be package protected At WebServiceClient.java: At WebServiceClient.java:[line 
42] 

FindBugs :

   module:hadoop-cloud-storage-project/hadoop-cos 
   Redundant nullcheck of dir, which is known to be non-null in 
org.apache.hadoop.fs.cosn.BufferPool.createDir(String) Redundant null check at 
BufferPool.java:is known to be non-null in 
org.apache.hadoop.fs.cosn.BufferPool.createDir(String) Redundant null check at 
BufferPool.java:[line 66] 
   org.apache.hadoop.fs.cosn.CosNInputStream$ReadBuffer.getBuffer() may 
expose internal representation by returning CosNInputStream$ReadBuffer.buffer 
At CosNInputStream.java:by returning CosNInputStream$ReadBuffer.buffer At 
CosNInputStream.java:[line 87] 
   Found reliance on default encoding in 
org.apache.hadoop.fs.cosn.CosNativeFileSystemStore.storeFile(String, File, 
byte[]):in org.apache.hadoop.fs.cosn.CosNativeFileSystemStore.storeFile(String, 
File, byte[]): new String(byte[]) At CosNativeFileSystemStore.java:[line 199] 
   Found reliance on default encoding in 
org.apache.hadoop.fs.cosn.CosNativeFileSystemStore.storeFileWithRetry(String, 
InputStream, byte[], long):in 
org.apache.hadoop.fs.cosn.CosNativeFileSystemStore.storeFileWithRetry(String, 
InputStream, byte[], long): new String(byte[]) At 
CosNativeFileSystemStore.java:[line 178] 
   org.apache.hadoop.fs.cosn.CosNativeFileSystemStore.uploadPart(File, 
String, String, int) may fail to clean up java.io.InputStream Obligation to 
clean up resource created at CosNativeFileSystemStore.java:fail to clean up 
java.io.InputStream Obligation to clean up resource created at 
CosNativeFileSystemStore.java:[line 252] is not discharged 

Failed junit tests :

   hadoop.hdfs.server.datanode.TestBPOfferService 
   hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks 
   
hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy 
   hadoop.yarn.applications.distributedshell.TestDistributedShell 
   hadoop.mapred.TestNetworkedJob 
   hadoop.yarn.sls.TestSLSRunner 
  

   cc:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1459/artifact/out/diff-compile-cc-root.txt
  [8.0K]

   javac:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1459/artifact/out/diff-compile-javac-root.txt
  [428K]

   checkstyle:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1459/artifact/out/diff-checkstyle-root.txt
  [16M]

   pathlen:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1459/artifact/out/pathlen.txt
  [12K]

   pylint:

   The source tree stderr: 
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1459/artifact/out/patch-pylint-stderr.txt
  []

   shellcheck:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1459/artifact/out/diff-patch-shellcheck.txt
  [16K]

   shelldocs:

   

Apache Hadoop qbt Report: branch2.10+JDK7 on Linux/x86

2020-04-04 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/645/

No changes




-1 overall


The following subsystems voted -1:
asflicense findbugs hadolint pathlen unit xml


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

XML :

   Parsing Error(s): 
   
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/empty-configuration.xml
 
   hadoop-tools/hadoop-azure/src/config/checkstyle-suppressions.xml 
   hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/public/crossdomain.xml 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/src/main/webapp/public/crossdomain.xml
 

FindBugs :

   
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase/hadoop-yarn-server-timelineservice-hbase-client
 
   Boxed value is unboxed and then immediately reboxed in 
org.apache.hadoop.yarn.server.timelineservice.storage.common.ColumnRWHelper.readResultsWithTimestamps(Result,
 byte[], byte[], KeyConverter, ValueConverter, boolean) At 
ColumnRWHelper.java:then immediately reboxed in 
org.apache.hadoop.yarn.server.timelineservice.storage.common.ColumnRWHelper.readResultsWithTimestamps(Result,
 byte[], byte[], KeyConverter, ValueConverter, boolean) At 
ColumnRWHelper.java:[line 335] 

Failed junit tests :

   hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA 
   hadoop.hdfs.qjournal.server.TestJournalNodeRespectsBindHostKeys 
   hadoop.contrib.bkjournal.TestBookKeeperHACheckpoints 
   hadoop.contrib.bkjournal.TestBookKeeperHACheckpoints 
   hadoop.yarn.client.api.impl.TestAMRMProxy 
   hadoop.registry.secure.TestSecureLogins 
  

   cc:

   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/645/artifact/out/diff-compile-cc-root-jdk1.7.0_95.txt
  [4.0K]

   javac:

   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/645/artifact/out/diff-compile-javac-root-jdk1.7.0_95.txt
  [324K]

   cc:

   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/645/artifact/out/diff-compile-cc-root-jdk1.8.0_242.txt
  [4.0K]

   javac:

   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/645/artifact/out/diff-compile-javac-root-jdk1.8.0_242.txt
  [304K]

   checkstyle:

   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/645/artifact/out/diff-checkstyle-root.txt
  [16M]

   hadolint:

   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/645/artifact/out/diff-patch-hadolint.txt
  [4.0K]

   pathlen:

   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/645/artifact/out/pathlen.txt
  [12K]

   pylint:

   The source tree stderr: 
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/645/artifact/out/patch-pylint-stderr.txt
  []

   shellcheck:

   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/645/artifact/out/diff-patch-shellcheck.txt
  [56K]

   shelldocs:

   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/645/artifact/out/diff-patch-shelldocs.txt
  [8.0K]

   whitespace:

   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/645/artifact/out/whitespace-eol.txt
  [12M]
   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/645/artifact/out/whitespace-tabs.txt
  [1.3M]

   xml:

   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/645/artifact/out/xml.txt
  [12K]

   findbugs:

   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/645/artifact/out/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-timelineservice-hbase_hadoop-yarn-server-timelineservice-hbase-client-warnings.html
  [8.0K]

   javadoc:

   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/645/artifact/out/diff-javadoc-javadoc-root-jdk1.7.0_95.txt
  [16K]
   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/645/artifact/out/diff-javadoc-javadoc-root-jdk1.8.0_242.txt
  [1.1M]

   unit:

   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/645/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
  [236K]
   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/645/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs_src_contrib_bkjournal.txt
  [12K]
   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/645/artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient.txt
  [96K]