Re: [VOTE] Merge yarn-native-services branch into trunk

2017-09-05 Thread Jian He
>   If it doesn’t have all the bells and whistles, then it shouldn’t be on 
> port 53 by default.
Sure, I’ll change the default port to not use 53 and document it.
>   *how* is it getting launched on a privileged port? It sounds like the 
> expectation is to run “command” as root.   *ALL* of the previous daemons in 
> Hadoop that needed a privileged port used jsvc.  Why isn’t this one? These 
> questions matter from a security standpoint.  
Yes, it is running as “root” to be able to use the privileged port. The DNS 
server is not yet integrated with the hadoop script. 

> Check the output.  It’s pretty obviously borked:
Thanks for pointing this out. I missed this when rebasing onto trunk.

> On Sep 5, 2017, at 3:11 PM, Allen Wittenauer  
> wrote:
> 
> 
>> On Sep 5, 2017, at 2:53 PM, Jian He  wrote:
>> 
>>> Based on the documentation, this doesn’t appear to be a fully functional DNS 
>>> server as an admin would expect (e.g., BIND, Knot, whatever).  Where’s 
>>> forwarding? How do I set up notify? Are secondaries even supported? etc, etc.
>> 
>> It seems like this is a rehash of some of the discussion you and others had 
>> on the JIRA. The DNS here is a thin layer backed by service registry. My 
>> understanding from the JIRA is that there are no claims that this is already 
>> a DNS with all the bells and whistles - its goal is mainly to expose dynamic 
>> services running on YARN as end-points. Clearly, this is an optional daemon; 
>> if the provided feature set is deemed insufficient, an alternative solution 
>> can be plugged in by specific admins because the DNS piece is completely 
>> decoupled from the rest of native-services. 
> 
>   If it doesn’t have all the bells and whistles, then it shouldn’t be on 
> port 53 by default. It should also be documented that one *can’t* do these 
> things.  If the standard config is likely to be a “real” server on port 53 
> either acting as a secondary to the YARN one or at least able to forward 
> queries to it, then these need to get documented.  As it stands, operations 
> folks are going to be taken completely by surprise by some relatively random 
> process sitting on a very well established port.
> 
>>> In fact:  was this even tested on port 53? How does this get launched such 
>>> that it even has access to open port 53?  I don’t see any calls to use the 
>>> secure daemon code in the shell scripts. Is there any jsvc voodoo or is it 
>>> just “run X as root”?
>> 
>> Yes, we have tested this DNS server on port 53 on a cluster by running the 
>> DNS server as root user. The port is clearly configurable, so the admin has 
>> two options. Run as root + port 53. Run as non-root + non-privileged port. 
>> We tested and left it as port 53 to keep it on a standard DNS port. It is 
>> already documented as such though I can see that part can be improved a 
>> little.
> 
>   *how* is it getting launched on a privileged port? It sounds like the 
> expectation is to run “command” as root.   *ALL* of the previous daemons in 
> Hadoop that needed a privileged port used jsvc.  Why isn’t this one? These 
> questions matter from a security standpoint.  
> 
>>> 4) Post-merge, yarn usage information is broken.  This is especially 
>>> bad since it doesn’t appear that YarnCommands was ever updated to include 
>>> the new sub-commands.
>> 
>> The “yarn” usage command is working for me. What do you mean? 
> 
> Check the output.  It’s pretty obviously borked:
> 
> ===snip
> 
>Daemon Commands:
> 
> nodemanager  run a nodemanager on each worker
> proxyserver  run the web app proxy server
> resourcemanager  run the ResourceManager
> router   run the Router daemon
> timelineserver   run the timeline server
> 
>Run a service Commands:
> 
> service  run a service
> 
>Run yarn-native-service rest server Commands:
> 
> apiserverrun yarn-native-service rest server
> 
> 
> ===snip===
> 
>> Yeah, looks like some previous features also forgot to update 
>> YarnCommands.md for the new sub commands 
> 
>   Likely.  But I was actually interested in playing with this one to 
> compare it to the competition.  [Lucky you. ;) ]  But with pretty much zero 
> documentation….
> 
> 



Re: [DISCUSS] Looking to a 2.9.0 release

2017-09-05 Thread Jonathan Hung
Hi Subru,

Thanks for starting the discussion. We are targeting merging YARN-5734
(API-based scheduler configuration) to branch-2 before the release of
2.9.0, since the feature is close to complete. Regarding the requirements
for merge,

1. API compatibility - this feature adds new APIs, does not modify any
existing ones.
2. Turning feature off - using the feature is configurable and is turned
off by default.
3. Stability/testing - this is an RM-only change, so we plan on deploying
this feature to a test RM and verifying configuration changes for capacity
scheduler. (Right now fair scheduler is not supported.)
4. Deployment - we want to get this feature in to 2.9.0 since we want to
use this feature and 2.9 version in our next upgrade.
5. Timeline - we have one main blocker which we are planning to resolve by
end of week. The rest of the month will be testing then a merge vote on the
last week of Sept.

Please let me know if you have any concerns. Thanks!


Jonathan Hung

On Wed, Jul 26, 2017 at 11:23 AM, J. Rottinghuis 
wrote:

> Thanks Vrushali for being entirely open as to the current status of ATSv2.
> I appreciate that we want to ensure things are tested at scale, and as you
> said we are working on that right now on our clusters.
> We have tested the feature to demonstrate it works at what we consider
> moderate scale.
>
> I think the criteria for including this feature in the 2.9 release should
> be if it can be safely turned off and not cause impact to anybody not using
> the new feature. The confidence for this is high for timeline service v2.
>
> Therefore, I think timeline service v2 should definitely be part of 2.9.
> That is the big draw for us to work on stabilizing a 2.9 release rather
> than just going to 2.8 and back-porting things ourselves.
>
> Thanks,
>
> Joep
>
> On Tue, Jul 25, 2017 at 11:39 PM, Vrushali Channapattan <
> vrushalic2...@gmail.com> wrote:
>
> > Thanks Subru for initiating this discussion.
> >
> > Wanted to share some thoughts in the context of Timeline Service v2. The
> > current status of this module is that we are ramping up for a second
> merge
> > to trunk. We still have a few merge blocker jiras outstanding, which we
> > think we will finish soon.
> >
> > While we have done some testing, we are yet to test at scale. Given all
> > this, we were thinking of initially targeting a beta release vehicle
> rather
> > than a stable release.
> >
> > As such, timeline service v2 has a branch-2 branch called
> > YARN-5355-branch-2 in case anyone wants to try it out. Timeline service v2
> > can be turned off and should not affect the cluster.
> >
> > thanks
> > Vrushali
> >
> >
> >
> >
> >
> > On Mon, Jul 24, 2017 at 1:26 PM, Subru Krishnan 
> wrote:
> >
> > > Folks,
> > >
> > > With the release for 2.8, we would like to look ahead to 2.9 release as
> > > there are many features/improvements in branch-2 (about 1062 commits),
> > that
> > > are in need of a release vehicle.
> > >
> > > Here's our first cut of the proposal from the YARN side:
> > >
> > >1. Scheduler improvements (decoupling allocation from node
> heartbeat,
> > >allocation ID, concurrency fixes, LightResource etc).
> > >2. Timeline Service v2
> > >3. Opportunistic containers
> > >4. Federation
> > >
> > > We would like to hear a formal list from HDFS & Hadoop (& MapReduce if
> > any)
> > > and will update the Roadmap wiki accordingly.
> > >
> > > Considering our familiarity with the above mentioned YARN features, we
> > > would like to volunteer as the co-RMs for 2.9.0.
> > >
> > > We want to keep the timeline at 8-12 weeks to keep the release
> pragmatic.
> > >
> > > Feedback?
> > >
> > > -Subru/Arun
> > >
> >
>


Re: [VOTE] Merge yarn-native-services branch into trunk

2017-09-05 Thread Gour Saha
Thanks Allen. You are right, the github renderer does have trouble
rendering the headers. I was only looking at the html generated by mvn
site, which did not have trouble rendering them. Anyway I added a space
after all the hashes and it looks ok through github now.

-Gour 

On 9/5/17, 3:20 PM, "Allen Wittenauer"  wrote:

>
>> On Sep 5, 2017, at 3:12 PM, Gour Saha  wrote:
>> 
>> 2) Lots of markdown problems in the NativeServicesDiscovery.md document.
>> This includes things like ‘yarnsite.xml’ (missing a dash.)
>> 
>> The md patch uploaded to YARN-5244 had some special chars. I fixed those
>> in YARN-7161.
>
>
>   It’s a lot more than just special chars I think.  Even github (which has
>a way better markdown processor than what we’re using for the site docs)
>is having trouble rendering it:
>
>https://github.com/apache/hadoop/blob/51c39c4261236ab714fe0ec8d00753dc4c6406ee/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/native-services/NativeServicesDiscovery.md
>
>e.g., all of those ‘###’ are likely missing a space.
>



Re: [VOTE] Merge yarn-native-services branch into trunk

2017-09-05 Thread Allen Wittenauer

> On Sep 5, 2017, at 3:12 PM, Gour Saha  wrote:
> 
> 2) Lots of markdown problems in the NativeServicesDiscovery.md document.
> This includes things like ‘yarnsite.xml’ (missing a dash.)
> 
> The md patch uploaded to YARN-5244 had some special chars. I fixed those
> in YARN-7161.


It’s a lot more than just special chars I think.  Even github (which 
has a way better markdown processor than what we’re using for the site docs) is 
having trouble rendering it:

https://github.com/apache/hadoop/blob/51c39c4261236ab714fe0ec8d00753dc4c6406ee/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/native-services/NativeServicesDiscovery.md

e.g., all of those ‘###’ are likely missing a space.
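As a hypothetical illustration (not code from the thread): strict Markdown renderers such as GitHub's require a space after the ATX `#` markers, and a one-liner like the following could batch-fix headings of that shape. The filename and the use of GNU sed's `-E` extended-regex mode are assumptions.

```shell
# Sketch: insert the space that strict Markdown renderers require after
# ATX '#' heading markers, leaving already-correct headings untouched.
fix_atx_headings() {
  sed -E 's/^(#{1,6})([^#[:space:]])/\1 \2/'
}
# e.g.: fix_atx_headings < NativeServicesDiscovery.md > NativeServicesDiscovery.fixed.md
```

Headings that already have the space (or lines that are not headings) pass through unchanged, so the filter is safe to run over a whole document.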
-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: [VOTE] Merge yarn-native-services branch into trunk

2017-09-05 Thread Gour Saha
 2) Lots of markdown problems in the NativeServicesDiscovery.md document.
This includes things like ‘yarnsite.xml’ (missing a dash.)

The md patch uploaded to YARN-5244 had some special chars. I fixed those
in YARN-7161.







Re: [VOTE] Merge yarn-native-services branch into trunk

2017-09-05 Thread Allen Wittenauer

> On Sep 5, 2017, at 2:53 PM, Jian He  wrote:
> 
>> Based on the documentation, this doesn’t appear to be a fully functional DNS 
>> server as an admin would expect (e.g., BIND, Knot, whatever).  Where’s 
>> forwarding? How do I set up notify? Are secondaries even supported? etc, etc.
> 
> It seems like this is a rehash of some of the discussion you and others had 
> on the JIRA. The DNS here is a thin layer backed by service registry. My 
> understanding from the JIRA is that there are no claims that this is already 
> a DNS with all the bells and whistles - its goal is mainly to expose dynamic 
> services running on YARN as end-points. Clearly, this is an optional daemon; 
> if the provided feature set is deemed insufficient, an alternative solution 
> can be plugged in by specific admins because the DNS piece is completely 
> decoupled from the rest of native-services. 

If it doesn’t have all the bells and whistles, then it shouldn’t be on 
port 53 by default. It should also be documented that one *can’t* do these 
things.  If the standard config is likely to be a “real” server on port 53 
either acting as a secondary to the YARN one or at least able to forward 
queries to it, then these need to get documented.  As it stands, operations 
folks are going to be taken completely by surprise by some relatively random 
process sitting on a very well established port.

>> In fact:  was this even tested on port 53? How does this get launched such 
>> that it even has access to open port 53?  I don’t see any calls to use the 
>> secure daemon code in the shell scripts. Is there any jsvc voodoo or is it 
>> just “run X as root”?
> 
> Yes, we have tested this DNS server on port 53 on a cluster by running the 
> DNS server as root user. The port is clearly configurable, so the admin has 
> two options. Run as root + port 53. Run as non-root + non-privileged port. We 
> tested and left it as port 53 to keep it on a standard DNS port. It is 
> already documented as such though I can see that part can be improved a 
> little.

*how* is it getting launched on a privileged port? It sounds like the 
expectation is to run “command” as root.   *ALL* of the previous daemons in 
Hadoop that needed a privileged port used jsvc.  Why isn’t this one? These 
questions matter from a security standpoint.  
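To make the jsvc point concrete, here is a hedged sketch (not the branch's actual launch code) of how earlier Hadoop daemons bound privileged ports via jsvc (Apache Commons Daemon): jsvc starts as root, opens the socket, then setuids to the `-user` account before the daemon's Java code runs. The paths, classpath, run-as user, and the `org.example.RegistryDNSServer` class name are all illustrative assumptions; the function only builds and prints the command rather than executing it.

```shell
#!/usr/bin/env bash
# Sketch: assemble a jsvc invocation that binds a privileged port as root
# and then drops to an unprivileged user. All paths and the main class
# are hypothetical placeholders.
build_jsvc_cmd() {
  local java_home="$1" run_as="$2" main_class="$3"
  # jsvc opens the privileged socket while still root, then switches to
  # the -user account before running the daemon's Java code.
  printf '%s ' jsvc \
    -home "$java_home" \
    -user "$run_as" \
    -pidfile /var/run/yarn-registrydns.pid \
    -outfile /var/log/yarn/registrydns.out \
    -errfile /var/log/yarn/registrydns.err \
    -cp "/opt/hadoop/share/hadoop/yarn/*" \
    "$main_class"
}
build_jsvc_cmd /usr/lib/jvm/java-8 yarn org.example.RegistryDNSServer
```

This is the pattern the secure-datanode scripts historically followed; "run the whole JVM as root" skips the privilege-drop step entirely, which is the security concern raised above.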

>>  4) Post-merge, yarn usage information is broken.  This is especially 
>> bad since it doesn’t appear that YarnCommands was ever updated to include 
>> the new sub-commands.
> 
> The “yarn” usage command is working for me. What do you mean? 

Check the output.  It’s pretty obviously borked:

===snip

Daemon Commands:

nodemanager  run a nodemanager on each worker
proxyserver  run the web app proxy server
resourcemanager  run the ResourceManager
router   run the Router daemon
timelineserver   run the timeline server

Run a service Commands:

service  run a service

Run yarn-native-service rest server Commands:

apiserverrun yarn-native-service rest server


===snip===

> Yeah, looks like some previous features also forgot to update YarnCommands.md 
> for the new sub commands 

Likely.  But I was actually interested in playing with this one to 
compare it to the competition.  [Lucky you. ;) ]  But with pretty much zero 
documentation….






Re: Apache Hadoop 2.8.2 Release Plan

2017-09-05 Thread Junping Du
I assume the quiet over the holiday means we agreed to move forward without 
taking HADOOP-14439 into 2.8.2.
There is a new release build (docker-based) issue that could be related to 
HADOOP-14474, where we removed the oracle java 7 installer due to a recent download 
address/contract change by Oracle. The build refuses to work - it reports a 
JAVA_HOME issue, but hard-coding my local java home in create-release or the 
Dockerfile doesn't help, so we may need to add a java 7 installation back (no 
matter whether Oracle JDK 7 or OpenJDK 7). 
Filed HADOOP-14842 with more details to track as blocker for 2.8.2.

Thanks,

Junping

From: Junping Du 
Sent: Friday, September 1, 2017 12:37 PM
To: larry mccay; Steve Loughran
Cc: common-dev@hadoop.apache.org; hdfs-...@hadoop.apache.org; 
mapreduce-...@hadoop.apache.org; yarn-...@hadoop.apache.org
Subject: Re: Apache Hadoop 2.8.2 Release Plan

This issue (HADOOP-14439) was off my radar given it is marked as Minor 
priority. If my understanding is correct, there is a trade-off here between security 
and backward compatibility. IMO, the priority of security is generally higher than 
that of backward compatibility, especially as 2.8.0 is still a non-production release.
I think we should skip this for 2.8.2, provided it doesn't break compatibility 
from 2.7.x. Thoughts?

Thanks,

Junping

From: larry mccay 
Sent: Friday, September 1, 2017 10:55 AM
To: Steve Loughran
Cc: common-dev@hadoop.apache.org; hdfs-...@hadoop.apache.org; 
mapreduce-...@hadoop.apache.org; yarn-...@hadoop.apache.org
Subject: Re: Apache Hadoop 2.8.2 Release Plan

If we do "fix" this in 2.8.2 we should seriously consider not doing so in
3.0.
This is a very poor practice.

I can see an argument for backward compatibility in 2.8.x line though.

On Fri, Sep 1, 2017 at 1:41 PM, Steve Loughran 
wrote:

> One thing we need to consider is
>
> HADOOP-14439: regression: secret stripping from S3x URIs breaks some
> downstream code
>
> Hadoop 2.8 has a best-effort attempt to strip out secrets from the
> toString() value of an s3a or s3n path where someone has embedded them in
> the URI; this has caused problems in some uses, specifically: when people
> use secrets this way (bad) and assume that you can round trip paths to
> string and back
>
> Should we fix this? If so, Hadoop 2.8.2 is the time to do it
>
>
> > On 1 Sep 2017, at 11:14, Junping Du  wrote:
> >
> > HADOOP-14814 got committed and HADOOP-9747 got pushed out to 2.8.3, so we
> are clean on blocker/critical issues now.
> > I finished going through the JACC report, and no more incompatible
> public API changes were found between 2.8.2 and 2.7.4. I also checked the commit
> history and fixed 10+ commits which were missing from branch-2.8.2 for some
> reason. So, the current branch-2.8.2 should be good to go for the RC stage, and
> I will kick off our first RC tomorrow.
> > In the meanwhile, please don't land any commits on branch-2.8.2 from
> now on. If an issue really is a blocker, please ping me on the JIRA
> before doing any commits. branch-2.8 is still open for landing. Thanks for
> your cooperation!
> >
> >
> > Thanks,
> >
> > Junping
> >
> > 
> > From: Junping Du 
> > Sent: Wednesday, August 30, 2017 12:35 AM
> > To: Brahma Reddy Battula; common-dev@hadoop.apache.org;
> hdfs-...@hadoop.apache.org; mapreduce-...@hadoop.apache.org;
> yarn-...@hadoop.apache.org
> > Subject: Re: Apache Hadoop 2.8.2 Release Plan
> >
> > Thanks Brahma for commenting on this thread. To be clear, I always update
> the branch version just before kicking off the RC.
> >
> > For the 2.8.2 release, I don't plan to involve Bigtop or other
> third-party test tools. As always, we will rely on test/verify efforts from
> the community, especially from large deployed production clusters - as far as I
> know, several companies, like Yahoo!, Alibaba, etc.,
> have already deployed the 2.8 release in large production clusters for months,
> which gives me more confidence in 2.8.2.
> >
> >
> > Here is more update on 2.8.2 release:
> >
> > Blocker issues:
> >
> >-  A new blocker, YARN-7076, was reported and fixed by Jian He over
> last weekend.
> >
> >-  Another new blocker - HADOOP-14814 - was identified from my latest
> jdiff run against 2.7.4. The simple fix for an incompatible API change
> should get committed soon.
> >
> >
> > Critical issues:
> >
> >-  YARN-7083 has already been committed. Thanks Jason for reporting the
> issue and delivering the fix.
> >
> >-  YARN-6091 was pushed out of 2.8.2 as the issue is not a regression and
> has been pending for a while.
> >
> >-  Daryn has been actively working on HADOOP-9747 for a while, and the
> patch is getting close to being committed. However, according to Daryn, the
> patch seems to cause some regression in some corner cases in secured
> environments (Storm auto tgt, etc.). May need 

Re: [VOTE] Merge yarn-native-services branch into trunk

2017-09-05 Thread Jian He
> 1) Did I miss it or is there no actual end-user documentation on how to use 
> this? 

Yes, we are in the process of finishing up the doc and posting it. We 
consider this a release blocker for 3.0.0-beta1, so we are working on it in 
parallel while the branch merge happens.

>   2) Lots of markdown problems in the NativeServicesDiscovery.md 
> document.  This includes things like ‘yarnsite.xml’ (missing a dash.)  Also, 
> I’m also confused why it’s called that when the title is YARN DNS, but 
> whatever.


Thanks for pointing this out. We will fix this.

>  Based on the documentation, this doesn’t appear to be a fully functional DNS 
> server as an admin would expect (e.g., BIND, Knot, whatever).  Where’s 
> forwarding? How do I set up notify? Are secondaries even supported? etc, etc.

It seems like this is a rehash of some of the discussion you and others had on 
the JIRA. The DNS here is a thin layer backed by service registry. My 
understanding from the JIRA is that there are no claims that this is already a 
DNS with all the bells and whistles - its goal is mainly to expose dynamic 
services running on YARN as end-points. Clearly, this is an optional daemon; if 
the provided feature set is deemed insufficient, an alternative solution can be 
plugged in by specific admins because the DNS piece is completely decoupled 
from the rest of native-services. 

> In fact:  was this even tested on port 53? How does this get launched such 
> that it even has access to open port 53?  I don’t see any calls to use the 
> secure daemon code in the shell scripts. Is there any jsvc voodoo or is it 
> just “run X as root”?

Yes, we have tested this DNS server on port 53 on a cluster by running the DNS 
server as root user. The port is clearly configurable, so the admin has two 
options. Run as root + port 53. Run as non-root + non-privileged port. We 
tested and left it as port 53 to keep it on a standard DNS port. It is already 
documented as such though I can see that part can be improved a little.
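For illustration only, a configuration fragment for the non-root option might look like the following. The property names are assumptions drawn from the registry DNS configuration as I recall it and should be verified against the shipped documentation; they are not quoted from this thread.

```xml
<!-- Hypothetical yarn-site.xml fragment for the "non-root + non-privileged
     port" deployment option. Property names are assumptions; verify them
     against the registry DNS documentation. -->
<property>
  <name>hadoop.registry.dns.enabled</name>
  <value>true</value>
</property>
<property>
  <!-- An unprivileged port (>1023) so the daemon need not run as root. -->
  <name>hadoop.registry.dns.bind-port</name>
  <value>5353</value>
</property>
```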

>   4) Post-merge, yarn usage information is broken.  This is especially 
> bad since it doesn’t appear that YarnCommands was ever updated to include the 
> new sub-commands.

The “yarn” usage command is working for me. What do you mean? 
Yeah, looks like some previous features also forgot to update YarnCommands.md 
for the new sub commands 




[jira] [Created] (HADOOP-14842) Hadoop 2.8.2 release build process get stuck due to java issue

2017-09-05 Thread Junping Du (JIRA)
Junping Du created HADOOP-14842:
---

 Summary: Hadoop 2.8.2 release build process get stuck due to java 
issue
 Key: HADOOP-14842
 URL: https://issues.apache.org/jira/browse/HADOOP-14842
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Reporter: Junping Du
Priority: Blocker


My latest 2.8.2 release build (via docker) failed, and the following errors were 
received:
 
{noformat}
"/usr/bin/mvn -Dmaven.repo.local=/maven -pl hadoop-maven-plugins -am clean 
install
Error: JAVA_HOME is not defined correctly. We cannot execute 
/usr/lib/jvm/java-7-oracle/bin/java"
{noformat}

This looks related to HADOOP-14474. However, reverting that patch doesn't 
work here because the build fails earlier, during the java 
download/installation - maybe, as mentioned in HADOOP-14474, because some java 7 
download addresses were changed by Oracle. 
Hard-coding my local JAVA_HOME in create-release or the Dockerfile doesn't work 
here either, although it shows the correct java home. My suspicion so far is that 
we still need to download java 7 from somewhere to let the build continue in the 
docker build process, but I haven't got a clue how to get through this.
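As a hedged sketch of one possible workaround (not the actual create-release fix): probe a list of candidate JVM locations and use the first one that actually contains `bin/java`, so the build no longer hard-depends on the removed `java-7-oracle` path. The candidate paths below are illustrative.

```shell
#!/usr/bin/env bash
# Sketch: return the first candidate directory that looks like a usable
# JAVA_HOME (i.e., contains an executable bin/java). Returns non-zero if
# none of the candidates qualify.
pick_java_home() {
  local candidate
  for candidate in "$@"; do
    if [ -x "${candidate}/bin/java" ]; then
      echo "${candidate}"
      return 0
    fi
  done
  return 1
}
# Example usage in a build script (paths are assumptions):
# JAVA_HOME=$(pick_java_home /usr/lib/jvm/java-7-openjdk-amd64 \
#                            /usr/lib/jvm/java-7-oracle) || exit 1
# export JAVA_HOME
```

Failing fast with a clear exit code when no JVM is found would at least surface the real problem instead of the misleading "JAVA_HOME is not defined correctly" message.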



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)




[jira] [Resolved] (HADOOP-13998) Merge initial S3guard release into trunk

2017-09-05 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved HADOOP-13998.
--
Resolution: Done

Re-resolving per above.

> Merge initial S3guard release into trunk
> 
>
> Key: HADOOP-13998
> URL: https://issues.apache.org/jira/browse/HADOOP-13998
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.0.0-beta1
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Fix For: 3.0.0-beta1
>
> Attachments: HADOOP-13998-001.patch, HADOOP-13998-002.patch, 
> HADOOP-13998-003.patch, HADOOP-13998-004.patch, HADOOP-13998-005.patch
>
>
> JIRA to link in all the things we think are needed for a preview/merge into 
> trunk






[jira] [Reopened] (HADOOP-13998) Merge initial S3guard release into trunk

2017-09-05 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang reopened HADOOP-13998:
--

Re-opening to resolve as "Complete" or something, since this code change was 
attributed to the parent JIRA HADOOP-13345 in the commit message.

> Merge initial S3guard release into trunk
> 
>
> Key: HADOOP-13998
> URL: https://issues.apache.org/jira/browse/HADOOP-13998
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.0.0-beta1
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Fix For: 3.0.0-beta1
>
> Attachments: HADOOP-13998-001.patch, HADOOP-13998-002.patch, 
> HADOOP-13998-003.patch, HADOOP-13998-004.patch, HADOOP-13998-005.patch
>
>
> JIRA to link in all the things we think are needed for a preview/merge into 
> trunk






[jira] [Created] (HADOOP-14841) Add KMS Client retry to handle 'No content to map' EOFExceptions

2017-09-05 Thread Xiao Chen (JIRA)
Xiao Chen created HADOOP-14841:
--

 Summary: Add KMS Client retry to handle 'No content to map' 
EOFExceptions
 Key: HADOOP-14841
 URL: https://issues.apache.org/jira/browse/HADOOP-14841
 Project: Hadoop Common
  Issue Type: Improvement
  Components: kms
Affects Versions: 2.6.0
Reporter: Xiao Chen
Assignee: Xiao Chen


We have seen quite a few occurrences where, when the KMS server is stressed, some 
of the requests end up getting a 500 return code, with this in the server 
log:
{noformat}
2017-08-31 06:45:33,021 WARN org.apache.hadoop.crypto.key.kms.server.KMS: User impala/HOSTNAME@REALM (auth:KERBEROS) request POST https://HOSTNAME:16000/kms/v1/keyversion/MNHDKEdWtZWM4vPb0p2bw544vdSRB2gy7APAQURcZns/_eek?eek_op=decrypt caused exception.
java.io.EOFException: No content to map to Object due to end of input
        at org.codehaus.jackson.map.ObjectMapper._initForReading(ObjectMapper.java:2444)
        at org.codehaus.jackson.map.ObjectMapper._readMapAndClose(ObjectMapper.java:2396)
        at org.codehaus.jackson.map.ObjectMapper.readValue(ObjectMapper.java:1648)
        at org.apache.hadoop.crypto.key.kms.server.KMSJSONReader.readFrom(KMSJSONReader.java:54)
        at com.sun.jersey.spi.container.ContainerRequest.getEntity(ContainerRequest.java:474)
        at com.sun.jersey.server.impl.model.method.dispatch.EntityParamDispatchProvider$EntityInjectable.getValue(EntityParamDispatchProvider.java:123)
        at com.sun.jersey.server.impl.inject.InjectableValuesProvider.getInjectableValues(InjectableValuesProvider.java:46)
        at com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$EntityParamInInvoker.getParams(AbstractResourceMethodDispatchProvider.java:153)
        at com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:203)
        at com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)
        at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:288)
        at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
        at com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
        at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
        at com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
        at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1469)
        at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1400)
        at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1349)
        at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1339)
        at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:416)
        at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:537)
        at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:699)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:723)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.apache.hadoop.crypto.key.kms.server.KMSMDCFilter.doFilter(KMSMDCFilter.java:84)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:631)
        at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationFilter.doFilter(DelegationTokenAuthenticationFilter.java:301)
        at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:579)
        at org.apache.hadoop.crypto.key.kms.server.KMSAuthenticationFilter.doFilter(KMSAuthenticationFilter.java:130)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
        at 

Re: [VOTE] Merge yarn-native-services branch into trunk

2017-09-05 Thread Allen Wittenauer

> On Aug 31, 2017, at 8:33 PM, Jian He  wrote:
> I would like to call a vote for merging yarn-native-services to trunk.

1) Did I miss it or is there no actual end-user documentation on how to 
use this?  I see 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/native-services/NativeServicesIntro.md,
 but that’s not particularly useful.  It looks like there are daemons that need 
to get started, based on other documentation?  How?  What do I configure? Is 
there a command to use to say “go do native for this job”?  I honestly have no 
idea how to make this do anything because most of the docs appear to be either 
TBD or expect me to read through a ton of JIRAs.  

2) Lots of markdown problems in the NativeServicesDiscovery.md 
document.  This includes things like ‘yarnsite.xml’ (missing a dash.)  Also, 
I’m also confused why it’s called that when the title is YARN DNS, but whatever.

3) The default port for the DNS server should NOT be 53 if typical 
deployments need to specify an alternate port.  Based on the documentation, 
this doesn’t appear to be a fully functional DNS server as an admin would expect 
(e.g., BIND, Knot, whatever).  Where’s forwarding? How do I set up notify? Are 
secondaries even supported? etc, etc. In fact:  was this even tested on port 
53? How does this get launched such that it even has access to open port 53?  I 
don’t see any calls to use the secure daemon code in the shell scripts. Is 
there any jsvc voodoo or is it just “run X as root”?

4) Post-merge, yarn usage information is broken.  This is especially 
bad since it doesn’t appear that YarnCommands was ever updated to include the 
new sub-commands.

At this point in time:

-1 on 3.0.0-beta1
-0 on trunk






Re: [DISCUSS] Merge yarn-native-services branch into trunk

2017-09-05 Thread Arun Suresh
Thanks for all the work on this folks.
I know the VOTE thread has started for this.

But I did have a couple of questions:
* Is there a branch-2 merge planned for this ?
* I understand YARN-7126 has some introductory documentation, but I think
we need to flesh it out a bit more before release. I would also like to see
steps to deploy a sample service.

Cheers
-Arun

On Thu, Aug 31, 2017 at 12:40 AM, Jian He  wrote:

> Update:
> I’ve chatted with Andrew offline; we’ll proceed with merging
> yarn-native-services into trunk for beta.
> We’ll advertise this feature as “alpha”.
> Currently, we have completed all the JIRAs for this merge; I’ve also
> moved out the subtasks that are not blocking this merge.
>
> I’ve created YARN-7127 to run the entire patch against trunk; once that
> goes green, I plan to start a formal vote.
>
> Thanks,
> Jian
>
> On Aug 18, 2017, at 2:48 PM, Andrew Wang wrote:
>
> Hi Jian, thanks for the reply,
>
> On Thu, Aug 17, 2017 at 1:03 PM, Jian He wrote:
> Thanks Andrew for the comments. Answers below:
>
> - There are no new APIs added in YARN/Hadoop core. In fact, all the new
> code runs outside of the existing system; it is optional and
> requires users to explicitly opt in. The new system’s own REST API is not
> stable and will be evolving.
>
> Great! That adds a lot more confidence that this is safe to merge.
>
> Are these new APIs listed in user documentation, and described as unstable?
>
> - We have been running/testing a version of the entire system internally
> for quite a while.
>
> Do you mind elaborating on the level of testing? Number of nodes, types of
> applications, production or test workload, etc. It'd help us build
> confidence.
>
> - I’d like to see this in hadoop3-beta1. Of course, we’ll take
> responsibility for moving fast and not blocking the potential timeline.
>
> Few more questions:
>
> How should we advertise this feature in the release? Since the APIs are
> unstable, I'd propose calling it "alpha" in the release notes, like we do
> for TSv2.
>
> Could you move out subtasks from YARN-5079 that are not blocking the
> merge? This would make it easier to understand what's remaining.
>
> Thanks,
> Andrew
>
>


[jira] [Created] (HADOOP-14840) Tool to estimate resource requirements of an application pipeline based on prior executions

2017-09-05 Thread Subru Krishnan (JIRA)
Subru Krishnan created HADOOP-14840:
---

 Summary: Tool to estimate resource requirements of an application 
pipeline based on prior executions
 Key: HADOOP-14840
 URL: https://issues.apache.org/jira/browse/HADOOP-14840
 Project: Hadoop Common
  Issue Type: New Feature
  Components: tools
Reporter: Subru Krishnan


We have been working on providing SLAs for job execution on Hadoop. At a high 
level this involves 2 parts: deriving the resource requirements of a job and 
guaranteeing the estimated resources at runtime. The {{YARN ReservationSystem}} 
(YARN-1051/YARN-2572/YARN-5326) enables the latter, and in this JIRA we propose 
to add a tool to Hadoop to predict the resource requirements of a job based on 
past executions of the job. The system (aka *Morpheus*) deep dive can be found 
in our OSDI'16 paper 
[here|https://www.usenix.org/conference/osdi16/technical-sessions/presentation/jyothi].
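The JIRA does not spell out the estimator itself, but the idea of predicting a job's resource ask from prior executions can be sketched as taking a high percentile of past peak usage plus some headroom. The class name, the percentile, and the margin below are illustrative assumptions, not Morpheus internals:

```java
import java.util.Arrays;

// Hypothetical sketch: estimate a job's next resource ask from past peak
// usages by taking a high percentile plus a safety margin. The class name,
// percentile (95th), and 10% margin are illustrative, not from the paper.
public class ResourceEstimator {

    // Return the p-th percentile (nearest-rank) of past peak memory usages.
    static long percentile(long[] pastPeaks, double p) {
        long[] sorted = pastPeaks.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(0, rank - 1)];
    }

    // Estimate the next run's ask: high percentile of history plus 10% headroom.
    static long estimateAskMb(long[] pastPeaksMb) {
        return (long) (percentile(pastPeaksMb, 95) * 1.10);
    }

    public static void main(String[] args) {
        long[] history = {2048, 2304, 2100, 2500, 2200};
        System.out.println("estimated ask (MB): " + estimateAskMb(history));
    }
}
```

A real estimator would also account for input-size skew and periodicity across runs; this only shows the shape of the derivation step.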



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)




Re: [VOTE] Merge yarn-native-services branch into trunk

2017-09-05 Thread Arun Suresh
+1 (binding).

Cheers
-Arun

On Fri, Sep 1, 2017 at 5:24 PM, Wangda Tan  wrote:

> +1 (Binding). I tried to use the YARN service assembly before to run different
> kinds of jobs (for example, distributed TensorFlow); it is really easy for
> end users to run jobs on YARN.
>
> Thanks to the whole team for the great job!
>
> Best,
> Wangda
>
>
> On Fri, Sep 1, 2017 at 3:33 PM, Gour Saha  wrote:
>
> > +1 (non-binding)
> >
> > On 9/1/17, 11:58 AM, "Billie Rinaldi"  wrote:
> >
> > >+1 (non-binding)
> > >
> > >On Thu, Aug 31, 2017 at 8:33 PM, Jian He  wrote:
> > >
> > >> Hi All,
> > >>
> > >> I would like to call a vote for merging yarn-native-services to trunk.
> > >>The
> > >> vote will run for 7 days as usual.
> > >>
> > >> At a high level, the following are the key features implemented.
> > >> - YARN-5079[1]. A native YARN framework (ApplicationMaster) to migrate
> > >> and orchestrate existing services to YARN, either Docker or
> > >> non-Docker based.
> > >> - YARN-4793[2]. A REST API server for users to deploy a service via a
> > >> simple JSON spec
> > >> - YARN-4757[3]. Extending today's service registry with a simple DNS
> > >> service to enable users to discover services deployed on YARN
> > >> - YARN-6419[4]. UI support for native-services on the new YARN UI
> > >> All these new services are optional and are sitting outside of the
> > >> existing system, and have no impact on existing system if disabled.
> > >>
> > >> Special thanks to a team of folks who worked hard towards this: Billie
> > >> Rinaldi, Gour Saha, Vinod Kumar Vavilapalli, Jonathan Maron, Rohith
> > >>Sharma
> > >> K S, Sunil G, Akhil PB. This effort would not have been possible without
> > >> their ideas and hard work.
> > >>
> > >> Thanks,
> > >> Jian
> > >>
> > >> [1] https://issues.apache.org/jira/browse/YARN-5079
> > >> [2] https://issues.apache.org/jira/browse/YARN-4793
> > >> [3] https://issues.apache.org/jira/browse/YARN-4757
> > >> [4] https://issues.apache.org/jira/browse/YARN-6419
> > >>
> > >>
> > >>
> > >>
> >
> >
> >
> >
>


Re: DISCUSS: Hadoop Compatibility Guidelines

2017-09-05 Thread Arun Suresh
Thanks for starting this Daniel.

I think we should also add a section for store compatibility (all state
stores including RM, NM, Federation, etc.): essentially an explicit policy
detailing when it is OK to change the major and minor versions and how they
should relate to the Hadoop release version.
Thoughts?

Cheers
-Arun


On Tue, Sep 5, 2017 at 10:38 AM, Daniel Templeton wrote:

> Good idea.  I should have thought of that. :)  Done.
>
> Daniel
>
>
> On 9/5/17 10:33 AM, Anu Engineer wrote:
>
>> Could you please attach the PDFs to the JIRA. I think the mailer is
>> stripping them off from the mail.
>>
>> Thanks
>> Anu
>>
>>
>>
>>
>>
>> On 9/5/17, 9:44 AM, "Daniel Templeton"  wrote:
>>
>> Resending with a broader audience, and reattaching the PDFs.
>>>
>>> Daniel
>>>
>>> On 9/4/17 9:01 AM, Daniel Templeton wrote:
>>>
 All, in prep for Hadoop 3 beta 1 I've been working on updating the
 compatibility guidelines on HADOOP-13714.  I think the initial doc is
 more or less complete, so I'd like to open the discussion up to the
 broader Hadoop community.

 In the new guidelines, I have drawn some lines in the sand regarding
 compatibility between releases.  In some cases these lines are more
 restrictive than the current practices.  The intent with the new
 guidelines is not to limit progress by restricting what goes into a
 release, but rather to drive release numbering to keep in line with
 the reality of the code.

 Please have a read and provide feedback on the JIRA.  I'm sure there
 are more than a couple of areas that could be improved.  If you'd
 rather not read markdown from a diff patch, I've attached PDFs of the
 two modified docs.

 Thanks!
 Daniel

>>>
>>>
>
>
>


Re: DISCUSS: Hadoop Compatibility Guidelines

2017-09-05 Thread Daniel Templeton

Good idea.  I should have thought of that. :)  Done.

Daniel

On 9/5/17 10:33 AM, Anu Engineer wrote:

Could you please attach the PDFs to the JIRA. I think the mailer is stripping 
them off from the mail.

Thanks
Anu





On 9/5/17, 9:44 AM, "Daniel Templeton"  wrote:


Resending with a broader audience, and reattaching the PDFs.

Daniel

On 9/4/17 9:01 AM, Daniel Templeton wrote:

All, in prep for Hadoop 3 beta 1 I've been working on updating the
compatibility guidelines on HADOOP-13714.  I think the initial doc is
more or less complete, so I'd like to open the discussion up to the
broader Hadoop community.

In the new guidelines, I have drawn some lines in the sand regarding
compatibility between releases.  In some cases these lines are more
restrictive than the current practices.  The intent with the new
guidelines is not to limit progress by restricting what goes into a
release, but rather to drive release numbering to keep in line with
the reality of the code.

Please have a read and provide feedback on the JIRA.  I'm sure there
are more than a couple of areas that could be improved.  If you'd
rather not read markdown from a diff patch, I've attached PDFs of the
two modified docs.

Thanks!
Daniel








Re: DISCUSS: Hadoop Compatibility Guidelines

2017-09-05 Thread Anu Engineer
Could you please attach the PDFs to the JIRA. I think the mailer is stripping 
them off from the mail.

Thanks
Anu





On 9/5/17, 9:44 AM, "Daniel Templeton"  wrote:

>Resending with a broader audience, and reattaching the PDFs.
>
>Daniel
>
>On 9/4/17 9:01 AM, Daniel Templeton wrote:
>> All, in prep for Hadoop 3 beta 1 I've been working on updating the 
>> compatibility guidelines on HADOOP-13714.  I think the initial doc is 
>> more or less complete, so I'd like to open the discussion up to the 
>> broader Hadoop community.
>>
>> In the new guidelines, I have drawn some lines in the sand regarding 
>> compatibility between releases.  In some cases these lines are more 
>> restrictive than the current practices.  The intent with the new 
>> guidelines is not to limit progress by restricting what goes into a 
>> release, but rather to drive release numbering to keep in line with 
>> the reality of the code.
>>
>> Please have a read and provide feedback on the JIRA.  I'm sure there 
>> are more than a couple of areas that could be improved.  If you'd 
>> rather not read markdown from a diff patch, I've attached PDFs of the 
>> two modified docs.
>>
>> Thanks!
>> Daniel
>
>



[jira] [Resolved] (HADOOP-14828) RetryUpToMaximumTimeWithFixedSleep is not bounded by maximum time

2017-09-05 Thread Jonathan Hung (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hung resolved HADOOP-14828.

Resolution: Duplicate

> RetryUpToMaximumTimeWithFixedSleep is not bounded by maximum time
> -
>
> Key: HADOOP-14828
> URL: https://issues.apache.org/jira/browse/HADOOP-14828
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Jonathan Hung
>
> In RetryPolicies.java, RetryUpToMaximumTimeWithFixedSleep is converted to a 
> RetryUpToMaximumCountWithFixedSleep, whose count is the maxTime / sleepTime: 
> {noformat}public RetryUpToMaximumTimeWithFixedSleep(long maxTime, long 
> sleepTime,
> TimeUnit timeUnit) {
>   super((int) (maxTime / sleepTime), sleepTime, timeUnit);
>   this.maxTime = maxTime;
>   this.timeUnit = timeUnit;
> }
> {noformat}
> But if retries take a long time, then the maxTime passed to the 
> RetryUpToMaximumTimeWithFixedSleep is exceeded.
> As an example, while doing NM restarts, we saw an issue where the NMProxy 
> creates a retry policy which specifies a maximum wait time of 15 minutes and 
> a 10 sec interval (which is converted to a MaximumCount policy with 15 min / 
> 10 sec = 90 tries). But each NMProxy retry policy invokes o.a.h.ipc.Client's 
> retry policy: {noformat}  if (connectionRetryPolicy == null) {
> final int max = conf.getInt(
> CommonConfigurationKeysPublic.IPC_CLIENT_CONNECT_MAX_RETRIES_KEY,
> 
> CommonConfigurationKeysPublic.IPC_CLIENT_CONNECT_MAX_RETRIES_DEFAULT);
> final int retryInterval = conf.getInt(
> 
> CommonConfigurationKeysPublic.IPC_CLIENT_CONNECT_RETRY_INTERVAL_KEY,
> CommonConfigurationKeysPublic
> .IPC_CLIENT_CONNECT_RETRY_INTERVAL_DEFAULT);
> connectionRetryPolicy = 
> RetryPolicies.retryUpToMaximumCountWithFixedSleep(
> max, retryInterval, TimeUnit.MILLISECONDS);
>   }{noformat}
> So the time it takes the NMProxy to fail is actually (90 retries) * (10 sec 
> NMProxy interval + o.a.h.ipc.Client retry time). In the default case, ipc 
> client retries 10 times with a 1 sec interval, meaning the time it takes for 
> NMProxy to fail is (90)(10 sec + 10 sec) = 30 min instead of the 15 min 
> specified by NMProxy configuration.
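The compounding described in the report can be checked with a little arithmetic. The values below are the ones cited above (15 min / 10 s = 90 outer tries, each also paying for an inner IPC client loop of 10 tries at 1 s); the class is just a worked example, not Hadoop code:

```java
// Worked example of the compounding above: each of the NMProxy's 90
// attempts also pays for the o.a.h.ipc.Client's own retry loop, so the
// wall-clock bound is double the configured 15 minutes.
public class RetryTimeCheck {

    // Effective worst-case wait: outerTries * (outerSleep + innerTries * innerSleep).
    static long effectiveSeconds(int outerTries, int outerSleepSec,
                                 int innerTries, int innerSleepSec) {
        return (long) outerTries * (outerSleepSec + (long) innerTries * innerSleepSec);
    }

    public static void main(String[] args) {
        // 15 min / 10 s = 90 outer tries; inner loop: 10 tries at 1 s each.
        long total = effectiveSeconds(90, 10, 10, 1);
        System.out.println(total / 60 + " minutes"); // 90 * (10 + 10) = 1800 s = 30 min
    }
}
```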







Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2017-09-05 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/514/

[Sep 4, 2017 2:41:02 PM] (varunsaxena) YARN-7152. [ATSv2] Registering timeline 
client before AMRMClient service
[Sep 5, 2017 2:36:43 AM] (sunilg) YARN-7022. Improve click interaction in queue 
topology in new YARN UI.




-1 overall


The following subsystems voted -1:
findbugs unit


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

FindBugs :

   
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 
   Hard coded reference to an absolute pathname in 
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DockerLinuxContainerRuntime.launchContainer(ContainerRuntimeContext)
 At DockerLinuxContainerRuntime.java:absolute pathname in 
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DockerLinuxContainerRuntime.launchContainer(ContainerRuntimeContext)
 At DockerLinuxContainerRuntime.java:[line 490] 

Failed junit tests :

   hadoop.hdfs.TestLeaseRecoveryStriped 
   hadoop.hdfs.TestReadStripedFileWithDecoding 
   hadoop.hdfs.TestDFSStripedOutputStreamWithFailure070 
   hadoop.hdfs.TestFileCreationDelete 
   
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerAllocation 
   hadoop.yarn.server.TestDiskFailures 
   hadoop.yarn.client.cli.TestLogsCLI 
   hadoop.yarn.client.api.impl.TestAMRMClient 
   hadoop.mapreduce.v2.hs.webapp.TestHSWebApp 
   hadoop.yarn.sls.TestReservationSystemInvariants 
   hadoop.yarn.sls.TestSLSRunner 

Timed out junit tests :

   org.apache.hadoop.hdfs.TestWriteReadStripedFile 
   
org.apache.hadoop.yarn.server.resourcemanager.TestSubmitApplicationWithRMHA 
   
org.apache.hadoop.yarn.server.resourcemanager.TestKillApplicationWithRMHA 
  

   cc:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/514/artifact/out/diff-compile-cc-root.txt
  [4.0K]

   javac:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/514/artifact/out/diff-compile-javac-root.txt
  [292K]

   checkstyle:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/514/artifact/out/diff-checkstyle-root.txt
  [17M]

   pylint:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/514/artifact/out/diff-patch-pylint.txt
  [20K]

   shellcheck:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/514/artifact/out/diff-patch-shellcheck.txt
  [20K]

   shelldocs:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/514/artifact/out/diff-patch-shelldocs.txt
  [12K]

   whitespace:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/514/artifact/out/whitespace-eol.txt
  [11M]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/514/artifact/out/whitespace-tabs.txt
  [1.2M]

   findbugs:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/514/artifact/out/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager-warnings.html
  [8.0K]

   javadoc:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/514/artifact/out/patch-javadoc-root.txt
  [2.0M]

   unit:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/514/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
  [1.4M]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/514/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
  [64K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/514/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-tests.txt
  [12K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/514/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt
  [16K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/514/artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-hs.txt
  [16K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/514/artifact/out/patch-unit-hadoop-tools_hadoop-sls.txt
  [20K]

Powered by Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org


Re: DISCUSS: Hadoop Compatibility Guidelines

2017-09-05 Thread Daniel Templeton

Resending with a broader audience, and reattaching the PDFs.

Daniel

On 9/4/17 9:01 AM, Daniel Templeton wrote:
All, in prep for Hadoop 3 beta 1 I've been working on updating the 
compatibility guidelines on HADOOP-13714.  I think the initial doc is 
more or less complete, so I'd like to open the discussion up to the 
broader Hadoop community.


In the new guidelines, I have drawn some lines in the sand regarding 
compatibility between releases.  In some cases these lines are more 
restrictive than the current practices.  The intent with the new 
guidelines is not to limit progress by restricting what goes into a 
release, but rather to drive release numbering to keep in line with 
the reality of the code.


Please have a read and provide feedback on the JIRA.  I'm sure there 
are more than a couple of areas that could be improved.  If you'd 
rather not read markdown from a diff patch, I've attached PDFs of the 
two modified docs.


Thanks!
Daniel





[jira] [Created] (HADOOP-14837) Handle S3A "glacier" data

2017-09-05 Thread Steve Loughran (JIRA)
Steve Loughran created HADOOP-14837:
---

 Summary: Handle S3A "glacier" data
 Key: HADOOP-14837
 URL: https://issues.apache.org/jira/browse/HADOOP-14837
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Affects Versions: 3.0.0-beta1
Reporter: Steve Loughran


SPARK-21797 covers how, if you have AWS S3 set to copy some files to Glacier, 
they appear in the listing but GETs fail, and so does everything else.

We should think about how best to handle this.

# report better
# if listings can identify files which are glaciated then maybe we could have 
an option to filter them out
# test & see what happens
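Option 2 above (filtering glaciated objects out of listings) could look roughly like the sketch below. The storage-class string matches S3's, but the helper, its in-memory listing, and the method names are illustrative assumptions, not part of S3A:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch of option 2 above: drop objects whose storage class
// marks them as archived in Glacier, so listings only show data that an
// immediate GET can retrieve. Not S3A code; the listing is an in-memory map.
public class GlacierFilter {

    // Storage classes whose objects cannot be read without a restore.
    static final List<String> ARCHIVED = List.of("GLACIER");

    // Keep only keys whose storage class allows an immediate GET.
    static List<String> retrievableKeys(Map<String, String> keyToStorageClass) {
        return keyToStorageClass.entrySet().stream()
                .filter(e -> !ARCHIVED.contains(e.getValue()))
                .map(Map.Entry::getKey)
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, String> listing = Map.of(
                "logs/a.gz", "STANDARD",
                "logs/b.gz", "GLACIER",
                "logs/c.gz", "STANDARD_IA");
        System.out.println(retrievableKeys(listing)); // [logs/a.gz, logs/c.gz]
    }
}
```

In a real implementation the storage class would come from each listing entry returned by the store, and filtering would presumably be gated by a config option.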








[jira] [Created] (HADOOP-14836) multiple versions of maven-clean-plugin in use

2017-09-05 Thread Allen Wittenauer (JIRA)
Allen Wittenauer created HADOOP-14836:
-

 Summary: multiple versions of maven-clean-plugin in use
 Key: HADOOP-14836
 URL: https://issues.apache.org/jira/browse/HADOOP-14836
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Affects Versions: 3.0.0-beta1
Reporter: Allen Wittenauer


hadoop-yarn-ui re-declares maven-clean-plugin with 3.0 while the rest of the 
source tree uses 2.5.  This should get synced up.







[jira] [Created] (HADOOP-14835) mvn site build throws SAX errors

2017-09-05 Thread Allen Wittenauer (JIRA)
Allen Wittenauer created HADOOP-14835:
-

 Summary: mvn site build throws SAX errors
 Key: HADOOP-14835
 URL: https://issues.apache.org/jira/browse/HADOOP-14835
 Project: Hadoop Common
  Issue Type: Bug
  Components: build, site
Affects Versions: 3.0.0-beta1
Reporter: Allen Wittenauer
Priority: Critical




Running mvn  install site site:stage -DskipTests -Pdist,src -Preleasedocs,docs 
results in a stack trace when run on a fresh .m2 directory.  It appears to be 
coming from the jdiff doclets in the annotations code.









[jira] [Reopened] (HADOOP-14736) S3AInputStream to implement an efficient skip() call through seeking

2017-09-05 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran reopened HADOOP-14736:
-

> S3AInputStream to implement an efficient skip() call through seeking
> 
>
> Key: HADOOP-14736
> URL: https://issues.apache.org/jira/browse/HADOOP-14736
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Steve Loughran
>Priority: Minor
>
> {{S3AInputStream}} implements skip() naively through the base class: reading 
> and discarding all data. This is efficient on classic "sequential" reads, provided 
> the forward skip is <1MB. For larger skip values or on random IO, seek() 
> should be used.
> After some range checks (handling past-EOF skips by seeking to EOF-1), a seek() 
> should handle the skip itself.
> *there are no FS contract tests for skip semantics*
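The seek-based skip described in the issue could be sketched as follows. The Seekable interface here is a stand-in rather than Hadoop's, and the 1 MB threshold and the clamp-to-EOF behavior are illustrative assumptions:

```java
import java.io.IOException;

// Hypothetical sketch of the seek-based skip described above: small forward
// skips fall through to sequential reading, larger ones (and past-EOF
// requests, clamped to the end) become a single seek. The Seekable interface
// is a stand-in and the 1 MB threshold is illustrative, not S3A's.
public class SkipViaSeek {

    interface Seekable {
        long getPos() throws IOException;
        void seek(long pos) throws IOException;
        long length() throws IOException;
        long readAndDiscard(long n) throws IOException; // sequential fallback
    }

    static final long SEEK_THRESHOLD = 1024 * 1024; // 1 MB

    static long skip(Seekable in, long n) throws IOException {
        if (n <= 0) {
            return 0;
        }
        long pos = in.getPos();
        long target = Math.min(pos + n, in.length()); // clamp past-EOF skips
        if (target - pos < SEEK_THRESHOLD) {
            return in.readAndDiscard(target - pos);   // cheap for short hops
        }
        in.seek(target);                              // one repositioning instead
        return target - pos;
    }

    // Tiny in-memory stand-in used only to exercise the logic.
    static class FakeStream implements Seekable {
        long pos = 0;
        final long len;
        FakeStream(long len) { this.len = len; }
        public long getPos() { return pos; }
        public void seek(long p) { pos = p; }
        public long length() { return len; }
        public long readAndDiscard(long n) { pos += n; return n; }
    }

    public static void main(String[] args) throws IOException {
        FakeStream in = new FakeStream(10L * 1024 * 1024); // 10 MB "file"
        System.out.println(skip(in, 4096));                // sequential path: 4096
        skip(in, Long.MAX_VALUE / 2);                      // clamped to EOF
        System.out.println(in.getPos() == in.length());    // prints true
    }
}
```

This is also the kind of behavior the missing FS contract tests for skip semantics would pin down.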







[jira] [Created] (HADOOP-14834) Make default output stream of S3a the block output stream

2017-09-05 Thread Steve Loughran (JIRA)
Steve Loughran created HADOOP-14834:
---

 Summary: Make default output stream of S3a the block output stream
 Key: HADOOP-14834
 URL: https://issues.apache.org/jira/browse/HADOOP-14834
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Affects Versions: 3.0.0-beta1
Reporter: Steve Loughran
Priority: Minor


The S3A Block output stream is working well and much better than the original 
stream in terms of: scale, performance, instrumentation, robustness

Proposed: switch this to be the default, as a precursor to removing it later 
HADOOP-14746







[jira] [Resolved] (HADOOP-14826) review S3 docs prior to 3.0-beta-1

2017-09-05 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-14826.
-
Resolution: Duplicate

> review S3 docs prior to 3.0-beta-1
> --
>
> Key: HADOOP-14826
> URL: https://issues.apache.org/jira/browse/HADOOP-14826
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: documentation, fs/s3
>Affects Versions: 3.0.0-beta1
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Blocker
>
> The hadoop-aws docs need a review and update
> * things marked as stabilizing (fast upload, .fadvise ..) can be considered 
> stable
> * move s3n docs off to the side
> * add a "how to move from s3n to s3a" para







[jira] [Resolved] (HADOOP-13630) split up AWS index.md

2017-09-05 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-13630.
-
Resolution: Duplicate

> split up AWS index.md
> -
>
> Key: HADOOP-13630
> URL: https://issues.apache.org/jira/browse/HADOOP-13630
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: documentation, fs/s3
>Affects Versions: 2.9.0
>Reporter: Steve Loughran
>
> The AWS index.md file is too big, too much written by developers as we go 
> along, not for end users.
> I propose splitting it into its own docs
> * Intro
> * S3A
> * S3N
> * S3 (branch-2 only, obviously)
> * testing
> * maybe in future: something on effective coding against object stores,
> though that could go toplevel, as it applies to all
> I propose waiting for HADOOP-13560 to be in, as that changes the docs.







[jira] [Resolved] (HADOOP-14736) S3AInputStream to implement an efficient skip() call through seeking

2017-09-05 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-14736.
-
Resolution: Duplicate

> S3AInputStream to implement an efficient skip() call through seeking
> 
>
> Key: HADOOP-14736
> URL: https://issues.apache.org/jira/browse/HADOOP-14736
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Steve Loughran
>Priority: Minor
>
> {{S3AInputStream}} implements skip() naively through the base class: reading 
> and discarding all data. This is efficient on classic "sequential" reads, provided 
> the forward skip is <1MB. For larger skip values or on random IO, seek() 
> should be used.
> After some range checks (handling past-EOF skips by seeking to EOF-1), a seek() 
> should handle the skip itself.
> *there are no FS contract tests for skip semantics*







[jira] [Resolved] (HADOOP-14605) transient list consistency failure in ITestS3AContractRootDir

2017-09-05 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-14605.
-
Resolution: Duplicate

closing as a duplicate of HADOOP-13271. It's a slightly different stack, but I 
suspect the same cause: observed listing inconsistency

> transient list consistency failure in ITestS3AContractRootDir
> -
>
> Key: HADOOP-14605
> URL: https://issues.apache.org/jira/browse/HADOOP-14605
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, test
>Affects Versions: 3.0.0-alpha3
> Environment: s3guard disabled
>Reporter: Steve Loughran
>Priority: Minor
>
> Test against s3 ireland just failed with a deleted-path-still-found 
> exception; clearly a consistency event of one form or another.
> This was on a {{fileSystem.getFileStatus(path);}} call; a HEAD against the 
> object. Assume that a tombstone marker hadn't made it to the shard the 
> request went to.
> Test will need to move to an eventually()







[jira] [Resolved] (HADOOP-13950) S3A create(path, overwrite=true) need only check for path being a dir, not a file

2017-09-05 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-13950.
-
Resolution: Duplicate

> S3A create(path, overwrite=true) need only check for path being a dir, not a 
> file
> -
>
> Key: HADOOP-13950
> URL: https://issues.apache.org/jira/browse/HADOOP-13950
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 2.7.3
>Reporter: Steve Loughran
>Priority: Minor
>
> When you create a file with overwrite=true, you don't care that a path 
> resolves to a file, only that there isn't a directory at the destination.
> S3A can use this, bypass the {{GET path}} and only do a {{GET path + "/"}} 
> and LIST path. That way: one HTTPS request saved, and no negative caching of 
> the path to confuse followup checks.
> That is: better performance and consistency
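The probe saving described above can be sketched with an in-memory stand-in. The "store", the probe accounting, and all method names are illustrative assumptions, not S3A code:

```java
import java.util.Set;

// Hypothetical sketch of the probe reduction above. With overwrite=true,
// only the directory probes (HEAD on path + "/" and a LIST) are needed;
// the HEAD on the file path itself is skipped. The in-memory "store" maps
// to an object bucket only for illustration.
public class CreateProbeSketch {

    static int probes;

    static boolean isFile(Set<String> store, String path) {
        probes++;                       // HEAD path
        return store.contains(path);
    }

    static boolean isDir(Set<String> store, String path) {
        probes++;                       // HEAD path + "/"
        if (store.contains(path + "/")) {
            return true;
        }
        probes++;                       // LIST path
        return store.stream().anyMatch(k -> k.startsWith(path + "/"));
    }

    // Pre-create check: with overwrite=true we only refuse directories.
    static void checkCanCreate(Set<String> store, String path, boolean overwrite) {
        if (isDir(store, path)) {
            throw new IllegalStateException(path + " is a directory");
        }
        if (!overwrite && isFile(store, path)) {
            throw new IllegalStateException(path + " already exists");
        }
    }

    public static void main(String[] args) {
        Set<String> store = Set.of("data/part-0000");
        probes = 0;
        checkCanCreate(store, "data/part-0000", true);  // overwrite: 2 probes
        int withOverwrite = probes;
        probes = 0;
        try {
            checkCanCreate(store, "data/part-0000", false);
        } catch (IllegalStateException expected) {
            // the no-overwrite path pays the extra HEAD and then fails
        }
        System.out.println(withOverwrite + " vs " + probes); // 2 vs 3
    }
}
```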







[jira] [Resolved] (HADOOP-13757) Remove verifyBuckets overhead in S3AFileSystem::initialize()

2017-09-05 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-13757.
-
Resolution: Won't Fix

I'm going to make this a wontfix; with s3guard etc., logging in has become more 
complex. Sorry.

> Remove verifyBuckets overhead in S3AFileSystem::initialize()
> 
>
> Key: HADOOP-13757
> URL: https://issues.apache.org/jira/browse/HADOOP-13757
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.0.0-beta1
>Reporter: Rajesh Balamohan
>Priority: Minor
>
> {{S3AFileSystem.initialize()}} invokes verifyBuckets(), but when the bucket 
> does not exist and a 403 error message is returned, it ends up returning 
> {{true}} for {{s3.doesBucketExists(bucketName)}}. In that respect, 
> verifyBuckets() is an unnecessary call during initialization.







[jira] [Created] (HADOOP-14833) Move s3a user:secret auth out of the main chain of auth mechanisms

2017-09-05 Thread Steve Loughran (JIRA)
Steve Loughran created HADOOP-14833:
---

 Summary: Move s3a user:secret auth out of the main chain of auth 
mechanisms
 Key: HADOOP-14833
 URL: https://issues.apache.org/jira/browse/HADOOP-14833
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Affects Versions: 3.0.0-beta1
Reporter: Steve Loughran
Priority: Minor


Remove the s3a://user:secret@host auth mechanism from S3a

I think we could retain it as an explicit credential provider you can 
ask for, so that people who cannot move off it (yet) can reconfigure their 
system; but unless you do that, it stops working. 

We could add a dummy credential handler which recognises the user:secret 
pattern & then tells the user "no longer supported, sorry, here's how to 
migrate", & add that to the default chain after everything else.



