Re: [VOTE] Release Apache Hadoop 2.7.3 RC1

2016-08-16 Thread Akira Ajisaka

-1 (binding)

HADOOP-13434 and HADOOP-11814, committed between RC0 and RC1, are not 
reflected in the release note.


-Akira

On 8/17/16 13:29, Allen Wittenauer wrote:



-1

HDFS-9395 is an incompatible change:

a) Why is it not marked as such in the changes file?
b) Why is an incompatible change in a micro release, much less a minor?
c) Where is the release note for this change?



On Aug 12, 2016, at 9:45 AM, Vinod Kumar Vavilapalli  wrote:

Hi all,

I've created a release candidate RC1 for Apache Hadoop 2.7.3.

As discussed before, this is the next maintenance release to follow up 2.7.2.

The RC is available for validation at: 
http://home.apache.org/~vinodkv/hadoop-2.7.3-RC1/ 


The RC tag in git is: release-2.7.3-RC1

The maven artifacts are available via repository.apache.org at 
https://repository.apache.org/content/repositories/orgapachehadoop-1045/ 


The release-notes are inside the tar-balls at location 
hadoop-common-project/hadoop-common/src/main/docs/releasenotes.html. I hosted this at 
home.apache.org/~vinodkv/hadoop-2.7.3-RC1/releasenotes.html for your 
quick perusal.

As you may have noted,
- a few issues with RC0 forced an RC1 [1]
- a very long fix-cycle for the License & Notice issues (HADOOP-12893) caused 
2.7.3 (along with every other Hadoop release) to slip by quite a bit. This 
release's related discussion thread is linked below: [2].

Please try the release and vote; the vote will run for the usual 5 days.

Thanks,
Vinod

[1] [VOTE] Release Apache Hadoop 2.7.3 RC0: 
https://www.mail-archive.com/hdfs-dev%40hadoop.apache.org/index.html#26106 

[2]: 2.7.3 release plan: 
https://www.mail-archive.com/hdfs-dev%40hadoop.apache.org/msg24439.html 




-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org




-
To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org



Re: [VOTE] Release Apache Hadoop 2.7.3 RC1

2016-08-16 Thread Allen Wittenauer


-1

HDFS-9395 is an incompatible change:

a) Why is it not marked as such in the changes file?
b) Why is an incompatible change in a micro release, much less a minor?
c) Where is the release note for this change?







[jira] [Created] (MAPREDUCE-6758) TestDFSIO should parallelize its creation of control files on setup

2016-08-16 Thread Dennis Huo (JIRA)
Dennis Huo created MAPREDUCE-6758:
-

         Summary: TestDFSIO should parallelize its creation of control files on setup
             Key: MAPREDUCE-6758
             URL: https://issues.apache.org/jira/browse/MAPREDUCE-6758
         Project: Hadoop Map/Reduce
      Issue Type: Improvement
      Components: test
        Reporter: Dennis Huo


TestDFSIO currently performs a sequential for-loop to create {{nrFiles}} 
control files in the {{controlDir}} which is a subdirectory of the overall 
{{test.build.data}} directory, which may be a non-HDFS FileSystem 
implementation:

{code:java}
private void createControlFile(FileSystem fs,
                               long nrBytes, // in bytes
                               int nrFiles
                              ) throws IOException {
  LOG.info("creating control file: "+nrBytes+" bytes, "+nrFiles+" files");

  Path controlDir = getControlDir(config);
  fs.delete(controlDir, true);

  for(int i=0; i < nrFiles; i++) {
    String name = getFileName(i);
    Path controlFile = new Path(controlDir, "in_file_" + name);
    SequenceFile.Writer writer = null;
    try {
      writer = SequenceFile.createWriter(fs, config, controlFile,
                                         Text.class, LongWritable.class,
                                         CompressionType.NONE);
      writer.append(new Text(name), new LongWritable(nrBytes));
    } catch(Exception e) {
      throw new IOException(e.getLocalizedMessage());
    } finally {
      if (writer != null)
        writer.close();
      writer = null;
    }
  }
  LOG.info("created control files for: "+nrFiles+" files");
}
{code}

When testing in an object-store based filesystem with higher round-trip latency 
than HDFS (like S3 or GCS), this means job setup that might only take seconds 
in HDFS ends up taking minutes or even tens of minutes against the object 
stores if the test is using thousands of control files. In the same vein as 
other JIRAs in [https://issues.apache.org/jira/browse/HADOOP-11694], the 
control-file creation should be parallelized/multithreaded to efficiently 
launch large TestDFSIO jobs against FileSystem impls with high round-trip 
latency but which can still support high overall throughput/QPS.
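
For illustration only, the parallelized setup proposed here might look roughly 
like the sketch below. It is not the TestDFSIO code and not a patch: to stay 
self-contained it writes plain text files through java.nio rather than 
SequenceFiles through a Hadoop FileSystem, and the class name 
ParallelControlFiles, the nrThreads parameter, and the "test_io_" + i file 
naming are all assumptions made for the example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Stream;

// Hypothetical stand-in for TestDFSIO's control-file setup; not the Hadoop code.
final class ParallelControlFiles {

  /** Creates nrFiles control files in controlDir using a fixed-size thread pool. */
  static void createControlFiles(Path controlDir, long nrBytes, int nrFiles,
                                 int nrThreads)
      throws IOException, InterruptedException {
    Files.createDirectories(controlDir);
    ExecutorService pool = Executors.newFixedThreadPool(nrThreads);
    try {
      // Submit one task per control file so the round trips overlap.
      List<Future<Path>> pending = new ArrayList<>();
      for (int i = 0; i < nrFiles; i++) {
        final String name = "test_io_" + i; // assumed naming scheme
        pending.add(pool.submit(() -> {
          Path controlFile = controlDir.resolve("in_file_" + name);
          // A single "name<TAB>nrBytes" line stands in for the SequenceFile record.
          byte[] record = (name + "\t" + nrBytes + "\n")
              .getBytes(StandardCharsets.UTF_8);
          return Files.write(controlFile, record);
        }));
      }
      // Wait for every task and surface the first failure as an IOException.
      for (Future<Path> f : pending) {
        try {
          f.get();
        } catch (ExecutionException e) {
          throw new IOException("control file creation failed", e.getCause());
        }
      }
    } finally {
      pool.shutdownNow();
    }
  }

  public static void main(String[] args) throws Exception {
    Path dir = Files.createTempDirectory("io_control");
    createControlFiles(dir, 1048576L, 100, 8);
    try (Stream<Path> files = Files.list(dir)) {
      System.out.println("created " + files.count() + " control files");
    }
  }
}
```

Waiting on each Future keeps the existing behaviour of failing setup when any 
single control file cannot be written; the right pool size for a given object 
store is left open here.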



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)




[jira] [Resolved] (MAPREDUCE-6723) Turn log level to Debug in test

2016-08-16 Thread Haibo Chen (JIRA)

 [ https://issues.apache.org/jira/browse/MAPREDUCE-6723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Haibo Chen resolved MAPREDUCE-6723.
---
Resolution: Won't Fix

> Turn log level to Debug in test
> ---
>
> Key: MAPREDUCE-6723
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6723
> Project: Hadoop Map/Reduce
>  Issue Type: Test
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: mapreduce6723.001.patch
>
>
> The current log level in the test environment for all MapReduce projects is 
> INFO. When we are investigating intermittent test failures, DEBUG-level 
> messages in the log file can be very useful for identifying problems.






Re: [VOTE] Release Apache Hadoop 2.7.3 RC1

2016-08-16 Thread Vinod Kumar Vavilapalli
Thanks Steve, this is one area that isn’t very well release-tested usually!

+Vinod

> On Aug 16, 2016, at 2:25 AM, Steve Loughran  wrote:
> 
> I've just looked at the staged JARs and how they worked with downstream apps 
> —that being a key way that Hadoop artifacts are adopted.



Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2016-08-16 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/135/

[Aug 15, 2016 4:47:30 PM] (aengineer) HDFS-10737. disk balancer add volume path 
to report command. Contributed
[Aug 15, 2016 6:01:23 PM] (kihwal) HDFS-9696. Garbage snapshot records linger 
forever. Contributed by
[Aug 15, 2016 6:33:29 PM] (varunsaxena) YARN-5521. Fix random failure of
[Aug 15, 2016 7:40:29 PM] (aengineer) HDFS-10580. DiskBalancer: Make use of 
unused methods in GreedyPlanner to
[Aug 15, 2016 9:33:44 PM] (varunsaxena) HADOOP-1. testConf.xml ls 
comparators in wrong order (Vrushali C via
[Aug 15, 2016 9:45:44 PM] (kihwal) HDFS-10744. Internally optimize path 
component resolution. Contributed
[Aug 15, 2016 10:28:09 PM] (kihwal) HDFS-10763. Open files can leak permanently 
due to inconsistent lease
[Aug 16, 2016 1:14:45 AM] (xiao) HADOOP-13437. KMS should reload whitelist and 
default key ACLs when
[Aug 16, 2016 2:58:57 AM] (aengineer) HDFS-10567. Improve plan command help 
message. Contributed by Xiaobing
[Aug 16, 2016 3:10:21 AM] (aengineer) HDFS-10559. DiskBalancer: Use SHA1 for 
Plan ID. Contributed by Xiaobing
[Aug 16, 2016 3:14:05 AM] (liuml07) HDFS-10725. Caller context should always be 
constructed by a builder.
[Aug 16, 2016 3:20:33 AM] (liuml07) HDFS-10724. Document caller context config 
keys. (Contributed by
[Aug 16, 2016 3:22:14 AM] (liuml07) HDFS-10678. Documenting 
NNThroughputBenchmark tool. (Contributed by
[Aug 16, 2016 3:23:47 AM] (liuml07) HDFS-10747. o.a.h.hdfs.tools.DebugAdmin 
usage message is misleading.
[Aug 16, 2016 3:24:54 AM] (liuml07) HADOOP-13470. GenericTestUtils$LogCapturer 
is flaky. (Contributed by
[Aug 16, 2016 3:28:40 AM] (liuml07) HDFS-10641. 
TestBlockManager#testBlockReportQueueing fails
[Aug 16, 2016 4:30:40 AM] (iwasakims) HADOOP-13419. Fix javadoc warnings by 
JDK8 in hadoop-common package.




-1 overall


The following subsystems voted -1:
asflicense unit


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

Failed junit tests :

   hadoop.ipc.TestRPCWaitForProxy 
   hadoop.hdfs.server.namenode.ha.TestBootstrapStandby 
   hadoop.hdfs.server.namenode.ha.TestHASafeMode 
   hadoop.hdfs.TestRollingUpgrade 
   hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure 
   hadoop.hdfs.server.blockmanagement.TestPendingInvalidateBlock 
   hadoop.hdfs.server.datanode.TestFsDatasetCache 
   hadoop.yarn.logaggregation.TestAggregatedLogFormat 
   hadoop.yarn.server.nodemanager.containermanager.queuing.TestQueuingContainerManager 
   hadoop.yarn.server.applicationhistoryservice.webapp.TestAHSWebServices 
   hadoop.yarn.server.TestContainerManagerSecurity 
   hadoop.yarn.server.TestMiniYarnClusterNodeUtilization 
   hadoop.yarn.client.api.impl.TestYarnClient 
  

   cc:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/135/artifact/out/diff-compile-cc-root.txt  [4.0K]

   javac:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/135/artifact/out/diff-compile-javac-root.txt  [172K]

   checkstyle:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/135/artifact/out/diff-checkstyle-root.txt  [16M]

   pylint:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/135/artifact/out/diff-patch-pylint.txt  [16K]

   shellcheck:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/135/artifact/out/diff-patch-shellcheck.txt  [20K]

   shelldocs:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/135/artifact/out/diff-patch-shelldocs.txt  [16K]

   whitespace:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/135/artifact/out/whitespace-eol.txt  [12M]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/135/artifact/out/whitespace-tabs.txt  [1.3M]

   javadoc:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/135/artifact/out/diff-javadoc-javadoc-root.txt  [2.2M]

   unit:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/135/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt  [120K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/135/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt  [160K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/135/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-common.txt  [24K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/135/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt  [36K]
   

[jira] [Created] (MAPREDUCE-6757) Multithreaded mapper corrupts buffer pusher in nativetask

2016-08-16 Thread He Tianyi (JIRA)
He Tianyi created MAPREDUCE-6757:


         Summary: Multithreaded mapper corrupts buffer pusher in nativetask
             Key: MAPREDUCE-6757
             URL: https://issues.apache.org/jira/browse/MAPREDUCE-6757
         Project: Hadoop Map/Reduce
      Issue Type: Bug
      Components: nativetask
Affects Versions: 3.0.0-alpha1
        Reporter: He Tianyi


Multiple threads may call the {{collect}} method of the same 
{{NativeMapOutputCollectorDelegator}} instance at the same time, in which case 
the buffer can be corrupted.
This may occur when executing Hive queries with a custom script.

Adding the 'synchronized' keyword to the {{collect}} method would solve the problem.
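
As a rough illustration of that fix, the sketch below guards a shared buffer 
with a synchronized collect method. It is not the nativetask code: 
SharedBufferCollector, its 64-byte buffer, and the flush accounting are 
hypothetical stand-ins for a collector that pushes records through one buffer.

```java
import java.nio.ByteBuffer;

// Hypothetical stand-in for a collector with one shared buffer; not nativetask code.
final class SharedBufferCollector {
  private final ByteBuffer buffer = ByteBuffer.allocate(64);
  private long flushedBytes = 0;

  // synchronized: only one thread at a time may check, flush, and fill the buffer.
  synchronized void collect(byte[] key, byte[] value) {
    int needed = key.length + value.length;
    if (needed > buffer.capacity()) {
      throw new IllegalArgumentException("record larger than buffer");
    }
    if (buffer.remaining() < needed) {
      flush();
    }
    buffer.put(key).put(value);
  }

  // Flushes whatever is left and returns the total number of bytes pushed.
  synchronized long close() {
    flush();
    return flushedBytes;
  }

  // Callers must hold the lock; "pushing" is modelled as counting bytes.
  private void flush() {
    flushedBytes += buffer.position();
    buffer.clear();
  }

  public static void main(String[] args) throws InterruptedException {
    final SharedBufferCollector collector = new SharedBufferCollector();
    Thread[] mappers = new Thread[4];
    for (int i = 0; i < mappers.length; i++) {
      mappers[i] = new Thread(() -> {
        for (int j = 0; j < 10000; j++) {
          collector.collect(new byte[8], new byte[8]);
        }
      });
      mappers[i].start();
    }
    for (Thread t : mappers) {
      t.join();
    }
    System.out.println("flushed " + collector.close() + " bytes");
  }
}
```

Without the synchronized keyword, two threads can interleave between the 
remaining-space check and the put calls, which is the kind of corruption 
described above. Whether a single lock on collect is acceptable for throughput 
is a separate question that this sketch does not address.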






Re: [VOTE] Release Apache Hadoop 2.7.3 RC1

2016-08-16 Thread Steve Loughran
+1 binding


1. built and tested Apache Slider (incubating) against the Hadoop 2.7.3 
artifacts

2. did a build & test of Apache Spark master branch with 2.7.3 JARs, 

For that I had to tweak spark's build to support the staging repo; hopefully 
that will get into Spark 

https://issues.apache.org/jira/browse/SPARK-17058

3. did a test run of my WiP SPARK-7481 spark-cloud module; after fixing a 
couple of things on the test setup side related to HADOOP-13058, 

mvn test --pl cloud -Pyarn,hadoop-2.7,snapshots-and-staging \
  -Dhadoop.version=2.7.3 -Dcloud.test.configuration.file=../conf/cloud-tests.xml

all was well, albeit measurably slower than Hadoop 2.8. That's proof that the 
2.8 version of s3a really does deliver measurable speedup for those tests 
(currently just file input/seek; more to come). I had originally thought things 
were broken as S3 init was failing, but that's because the S3 bucket was in 
Frankfurt, and the AWS library used can't talk to that endpoint (v4 auth 
protocol, see).

4. did a full spark distribution build of that SPARK-7481 branch

dev/make-distribution.sh -Pyarn,hadoop-2.7,snapshots-and-staging \
  -Dhadoop.version=2.7.3

ran command line test to do read of s3a data:

bin/spark-submit --class org.apache.spark.cloud.s3.examples.S3LineCount \
  --conf spark.hadoop.fs.s3a.access.key=$AWS_KEY \
  --conf spark.hadoop.fs.s3a.secret.key=$AWS_SECRET \
  examples/jars/spark-examples_2.11-2.1.0-SNAPSHOT.jar


5. Pulled out the Microsoft Azure JAR azure-storage-2.0.0.jar and repeated step 
4.

This showed that the 2.7.x branch does handle the failure to load a filesystem 
due to dependency or other classloading problems. That failure was proving a big 
problem in adding the AWS & Azure stuff to the Spark build, as it'd stop Spark 
from starting up if the dependencies were absent.

I've not done any of the .tar.gz diligence; I've just looked at the staged JARs 
and how they worked with downstream apps —that being a key way that Hadoop 
artifacts are adopted.



