[VOTE] Release Apache Hadoop 2.7.3 RC2

2016-08-17 Thread Vinod Kumar Vavilapalli
Hi all,

I've created a new release candidate RC2 for Apache Hadoop 2.7.3.

As discussed before, this is the next maintenance release to follow up 2.7.2.

The RC is available for validation at: 
http://home.apache.org/~vinodkv/hadoop-2.7.3-RC2/ 


The RC tag in git is: release-2.7.3-RC2

The maven artifacts are available via repository.apache.org at 
https://repository.apache.org/content/repositories/orgapachehadoop-1046 


The release-notes are inside the tar-balls at location 
hadoop-common-project/hadoop-common/src/main/docs/releasenotes.html. I hosted 
this at http://home.apache.org/~vinodkv/hadoop-2.7.3-RC2/releasenotes.html 
for your quick perusal.

As you may have noted,
 - a few issues with RC0 forced an RC1 [1]
 - a few more issues with RC1 forced an RC2 [2]
 - a very long fix-cycle for the License & Notice issues (HADOOP-12893) caused 
2.7.3 (along with every other Hadoop release) to slip by quite a bit. The 
discussion thread for this release is linked below: [3].

Please try the release and vote; the vote will run for the usual 5 days.

Thanks,
Vinod

[1] [VOTE] Release Apache Hadoop 2.7.3 RC0: 
https://www.mail-archive.com/hdfs-dev%40hadoop.apache.org/index.html#26106 

[2] [VOTE] Release Apache Hadoop 2.7.3 RC1: 
https://www.mail-archive.com/hdfs-dev%40hadoop.apache.org/msg26336.html 

[3] 2.7.3 release plan: 
https://www.mail-archive.com/hdfs-dev%40hadoop.apache.org/msg24439.html 


Re: [VOTE] Release Apache Hadoop 2.7.3 RC1

2016-08-17 Thread Vinod Kumar Vavilapalli
Canceling the release vote for this and other issues reported.

+Vinod

> On Aug 16, 2016, at 10:01 PM, Akira Ajisaka wrote:
> 
> -1 (binding)
> 
> HADOOP-13434 and HADOOP-11814, committed between RC0 and RC1, are not 
> reflected in the release notes.
> 
> -Akira
> 
> On 8/17/16 13:29, Allen Wittenauer wrote:
>> 
>> 
>> -1
>> 
>> HDFS-9395 is an incompatible change:
>> 
>> a) Why is it not marked as such in the changes file?
>> b) Why is an incompatible change in a micro release, much less a minor?
>> c) Where is the release note for this change?
>> 
>> 
>>> On Aug 12, 2016, at 9:45 AM, Vinod Kumar Vavilapalli wrote:
>>> 
>>> Hi all,
>>> 
>>> I've created a release candidate RC1 for Apache Hadoop 2.7.3.
>>> 
>>> As discussed before, this is the next maintenance release to follow up 
>>> 2.7.2.
>>> 
>>> The RC is available for validation at: 
>>> http://home.apache.org/~vinodkv/hadoop-2.7.3-RC1/ 
>>> 
>>> 
>>> The RC tag in git is: release-2.7.3-RC1
>>> 
>>> The maven artifacts are available via repository.apache.org at 
>>> https://repository.apache.org/content/repositories/orgapachehadoop-1045/ 
>>> 
>>> 
>>> The release-notes are inside the tar-balls at location 
>>> hadoop-common-project/hadoop-common/src/main/docs/releasenotes.html. I 
>>> hosted this at home.apache.org/~vinodkv/hadoop-2.7.3-RC1/releasenotes.html 
>>> for your quick perusal.
>>> 
>>> As you may have noted,
>>> - a few issues with RC0 forced an RC1 [1]
>>> - a very long fix-cycle for the License & Notice issues (HADOOP-12893) 
>>> caused 2.7.3 (along with every other Hadoop release) to slip by quite a 
>>> bit. This release's related discussion thread is linked below: [2].
>>> 
>>> Please try the release and vote; the vote will run for the usual 5 days.
>>> 
>>> Thanks,
>>> Vinod
>>> 
>>> [1] [VOTE] Release Apache Hadoop 2.7.3 RC0: 
>>> https://www.mail-archive.com/hdfs-dev%40hadoop.apache.org/index.html#26106 
>>> 
>>> [2]: 2.7.3 release plan: 
>>> https://www.mail-archive.com/hdfs-dev%40hadoop.apache.org/msg24439.html 
>>> 
>> 
>> 
>> -
>> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
>> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
>> 
> 
> 
> -
> To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
> 
> 


-
To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org



Re: [VOTE] Release Apache Hadoop 2.7.3 RC1

2016-08-17 Thread Vinod Kumar Vavilapalli
I always look at the CHANGES.txt entries for incompatible changes, and this JIRA 
obviously wasn’t there.

Anyway, this shouldn’t be in any of branch-2.*, as the committers there clearly 
mentioned that this is an incompatible change.

I am reverting the patch from branch-2.*.

Thanks
+Vinod

> On Aug 16, 2016, at 9:29 PM, Allen Wittenauer wrote:
> 
> 
> 
> -1
> 
> HDFS-9395 is an incompatible change:
> 
> a) Why is it not marked as such in the changes file?
> b) Why is an incompatible change in a micro release, much less a minor?
> c) Where is the release note for this change?
> 
> 



[jira] [Created] (MAPREDUCE-6759) JobSubmitter/JobResourceUploader should parallelize upload of -libjars, -files, -archives

2016-08-17 Thread Dennis Huo (JIRA)
Dennis Huo created MAPREDUCE-6759:
-

 Summary: JobSubmitter/JobResourceUploader should parallelize 
upload of -libjars, -files, -archives
 Key: MAPREDUCE-6759
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6759
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: job submission
Reporter: Dennis Huo


During job submission, the {{JobResourceUploader}} currently uploads 
{{-libjars}}, {{-files}}, and {{-archives}} one at a time in sequential 
for-loops. This can significantly slow down job startup when a large number of 
files need to be uploaded, especially when staging the files to a FileSystem 
implementation backed by a cloud object store such as S3, GCS, or WABS, where 
round-trip latencies may be higher than with HDFS even though throughput is 
good when uploads are parallelized:

{code:title=JobResourceUploader.java}
if (files != null) {
  FileSystem.mkdirs(jtFs, filesDir, mapredSysPerms);
  String[] fileArr = files.split(",");
  for (String tmpFile : fileArr) {
    URI tmpURI = null;
    try {
      tmpURI = new URI(tmpFile);
    } catch (URISyntaxException e) {
      throw new IllegalArgumentException(e);
    }
    Path tmp = new Path(tmpURI);
    Path newPath = copyRemoteFiles(filesDir, tmp, conf, replication);
    try {
      URI pathURI = getPathURI(newPath, tmpURI.getFragment());
      DistributedCache.addCacheFile(pathURI, conf);
    } catch (URISyntaxException ue) {
      // should not throw a uri exception
      throw new IOException("Failed to create uri for " + tmpFile, ue);
    }
  }
}

if (libjars != null) {
  FileSystem.mkdirs(jtFs, libjarsDir, mapredSysPerms);
  String[] libjarsArr = libjars.split(",");
  for (String tmpjars : libjarsArr) {
    Path tmp = new Path(tmpjars);
    Path newPath = copyRemoteFiles(libjarsDir, tmp, conf, replication);
    DistributedCache.addFileToClassPath(
        new Path(newPath.toUri().getPath()), conf, jtFs);
  }
}

if (archives != null) {
  FileSystem.mkdirs(jtFs, archivesDir, mapredSysPerms);
  String[] archivesArr = archives.split(",");
  for (String tmpArchives : archivesArr) {
    URI tmpURI;
    try {
      tmpURI = new URI(tmpArchives);
    } catch (URISyntaxException e) {
      throw new IllegalArgumentException(e);
    }
    Path tmp = new Path(tmpURI);
    Path newPath = copyRemoteFiles(archivesDir, tmp, conf, replication);
    try {
      URI pathURI = getPathURI(newPath, tmpURI.getFragment());
      DistributedCache.addCacheArchive(pathURI, conf);
    } catch (URISyntaxException ue) {
      // should not throw a uri exception
      throw new IOException("Failed to create uri for " + tmpArchives, ue);
    }
  }
}
{code}

Parallelizing the upload of these files would improve job submission time.
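
One possible shape for such a change, as a minimal sketch only and not a patch 
attached to this issue: submit each copy to a fixed-size thread pool and collect 
the results in input order. The class {{ParallelUploadSketch}}, the method 
{{uploadAll}}, the {{copy}} function (a stand-in for {{copyRemoteFiles}}), and 
the pool size are all hypothetical names chosen for illustration. The sketch 
assumes Java 8 or later and that each copy is independent of the others.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

public class ParallelUploadSketch {

  /**
   * Applies the (hypothetical) copy function to every source concurrently
   * and returns the results in the same order as the inputs.
   */
  public static <S, D> List<D> uploadAll(
      List<S> sources, Function<S, D> copy, int threads)
      throws InterruptedException, ExecutionException {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      // Submit every copy up front so they can overlap.
      List<Future<D>> pending = new ArrayList<>();
      for (S src : sources) {
        Callable<D> task = () -> copy.apply(src);
        pending.add(pool.submit(task));
      }
      // Wait for each result in input order; a failed copy surfaces
      // here as an ExecutionException wrapping the original cause.
      List<D> results = new ArrayList<>();
      for (Future<D> f : pending) {
        results.add(f.get());
      }
      return results;
    } finally {
      pool.shutdownNow();
    }
  }

  public static void main(String[] args) throws Exception {
    // Stand-in for copyRemoteFiles: maps a local name to a staging path.
    List<String> libjars = Arrays.asList("a.jar", "b.jar", "c.jar");
    List<String> staged =
        uploadAll(libjars, name -> "/staging/libjars/" + name, 4);
    System.out.println(staged);
  }
}
```

Because the results come back in input order, the follow-up 
{{DistributedCache}} calls, which modify the shared {{conf}}, could remain on 
the submitting thread in their original sequence; only the copies themselves 
would run concurrently.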






RE: [VOTE] Release Apache Hadoop 2.7.3 RC1

2016-08-17 Thread Balkrushna Patil
Hi all,
I want to dig deeper into the MapReduce source code, so I need to import it into
an IDE. Can anybody tell me where I should download the source code from?

Thanks and Regards
Balkrushna Patil
augmentIQ Data Sciences Pvt Ltd.
Mob: +91-9766 4996 81

-Original Message-
From: Junping Du [mailto:j...@hortonworks.com] 
Sent: 17 August 2016 18:45
To: Allen Wittenauer; common-...@hadoop.apache.org; kshu...@yahoo-inc.com;
kih...@yahoo-inc.com
Cc: hdfs-...@hadoop.apache.org; yarn-...@hadoop.apache.org;
mapreduce-dev@hadoop.apache.org
Subject: Re: [VOTE] Release Apache Hadoop 2.7.3 RC1

From my quick understanding, HDFS-9395 is more of a bug fix and
improvement for audit logging than an incompatible change. We probably marked
it incompatible because the audit log behavior could be
corrected/updated in some exception cases. I think it still belongs in the
2.7.3 scope.
Kuhu and Kihwal, any comments here?


Thanks,

Junping 

From: Allen Wittenauer 
Sent: Wednesday, August 17, 2016 5:29 AM
To: common-...@hadoop.apache.org
Cc: hdfs-...@hadoop.apache.org; yarn-...@hadoop.apache.org;
mapreduce-dev@hadoop.apache.org
Subject: Re: [VOTE] Release Apache Hadoop 2.7.3 RC1

-1

HDFS-9395 is an incompatible change:

a) Why is it not marked as such in the changes file?
b) Why is an incompatible change in a micro release, much less a minor?
c) Where is the release note for this change?





Re: [VOTE] Release Apache Hadoop 2.7.3 RC1

2016-08-17 Thread Allen Wittenauer

Touching the audit log is *extremely* dangerous from a compatibility 
perspective.  It is easily the most machine processed log in Hadoop (with the 
second likely being the fsck log).  In particular, this comment tells me that 
we are almost certainly going to break users:

"Some audit logs ( for non-ACE failures ) will go missing. So this 
change needs to be marked as Incompatible, for heads-up."

If that means what I think it means (the ordering of checks is going to 
make previously logged errors disappear in lieu of other, new messages showing 
up first), that is going to cause massive problems for users who are looking 
for a particular entry. Worse, while the JIRA was marked incompatible, there 
are absolutely zero hints to end users (changes file, release notes) that this 
could potentially break their universe without digging into the comments of 
said JIRA.  That's not a heads up, that's a landmine.

It's also debatable whether this is actually a bug fix.  A lot of the 
assumptions made in that JIRA about the audit log's original intent are 
completely wrong. Better yet, a lot of the justification is around another 
unmarked, incompatible change that was introduced in the 2.x timeline.

Even if one disagrees and still views this as a bug fix:  it's still an 
incompatible change.  Users are justifiably angry when we don't warn them about 
breakages and this is a great example of that.  

> On Aug 17, 2016, at 6:15 AM, Junping Du  wrote:
> 
> From my quick understanding, HDFS-9395 is more of a bug fix and improvement 
> for audit logging than an incompatible change. We probably marked it 
> incompatible because the audit log behavior could be corrected/updated in 
> some exception cases. I think it still belongs in the 2.7.3 scope. 
> Kuhu and Kihwal, any comments here?
> 
> 
> Thanks,
> 
> Junping 
> 
> From: Allen Wittenauer 
> Sent: Wednesday, August 17, 2016 5:29 AM
> To: common-...@hadoop.apache.org
> Cc: hdfs-...@hadoop.apache.org; yarn-...@hadoop.apache.org; 
> mapreduce-dev@hadoop.apache.org
> Subject: Re: [VOTE] Release Apache Hadoop 2.7.3 RC1
> 
> -1
> 
> HDFS-9395 is an incompatible change:
> 
> a) Why is it not marked as such in the changes file?
> b) Why is an incompatible change in a micro release, much less a minor?
> c) Where is the release note for this change?
> 
> 