Re: [DISCUSS] Merge yarn-native-services branch into trunk

2017-09-08 Thread Jian He
Hi Arun

Sorry for the late reply.
* Is there a branch-2 merge planned for this ?
Branch-2 is not planned for this merge.

* I understand YARN-7126 has some introductory documentation, but I think we 
need to flesh it out a bit more before release. I would also like to see steps 
to deploy a sample service.
We have added more documentation: QuickStart.md, Overview.md, and others in the 
same folder.
YarnCommands.md has also been updated to document the new shell commands.

I encourage everyone to try it out and share suggestions.

As said in another email thread, we decided to drop this for beta and re-target 
it for GA.

Thanks,
Jian

On Sep 5, 2017, at 1:37 PM, Arun Suresh wrote:

Thanks for all the work on this, folks.
I know the VOTE thread has started for this.

But I did have a couple of questions:
* Is there a branch-2 merge planned for this ?
* I understand YARN-7126 has some introductory documentation, but I think we 
need to flesh it out a bit more before release. I would also like to see steps 
to deploy a sample service.

Cheers
-Arun

On Thu, Aug 31, 2017 at 12:40 AM, Jian He wrote:
Update:
I’ve chatted with Andrew offline, we’ll proceed with merging 
yarn-native-services into trunk for beta.
We’ll advertise this feature as “alpha”.
Currently, we have completed all the JIRAs for this merge - I’ve also moved out 
the subtasks that are not blocking it.

I’ve created YARN-7127 to run the entire patch against trunk, once that goes 
green, I plan to start a formal vote.

Thanks,
Jian

On Aug 18, 2017, at 2:48 PM, Andrew Wang wrote:

Hi Jian, thanks for the reply,

On Thu, Aug 17, 2017 at 1:03 PM, Jian He wrote:
Thanks Andrew for the comments. Answers below:

- There are no new APIs added in YARN/Hadoop core. In fact, all the new code 
runs outside of the existing system; it is optional and requires users 
to explicitly opt in. The new system’s own REST API is not stable and will be 
evolving.

Great! That adds a lot more confidence that this is safe to merge.

Are these new APIs listed in user documentation, and described as unstable?

- We have been running/testing a version of the entire system internally for 
quite a while.

Do you mind elaborating on the level of testing? Number of nodes, types of 
applications, production or test workload, etc. It'd help us build confidence.

- I’d like to see this in hadoop3-beta1. Of course, we’ll take responsibility 
for moving fast and not blocking the potential timeline.

A few more questions:

How should we advertise this feature in the release? Since the APIs are 
unstable, I'd propose calling it "alpha" in the release notes, like we do for 
TSv2.

Could you move out subtasks from YARN-5079 that are not blocking the merge? 
This would make it easier to understand what's remaining.

Thanks,
Andrew





Re: [VOTE] Merge yarn-native-services branch into trunk

2017-09-08 Thread Jian He
Hi Andrew,

At this point, there are no more release blockers, including documentation, from 
our side - all work is done.
But I agree it is too close to the release. After talking with the other team 
members, we are fine with dropping this from beta, and we want to target it for 
GA.
I’m withdrawing this vote and will start a fresh vote later for GA.
Thanks to all who voted on this effort!

Thanks,
Jian


> On Sep 7, 2017, at 3:59 PM, Andrew Wang  wrote:
> 
> Hi folks,
> 
> This vote closes today. I see a -1 from Allen on inclusion in beta1. I see
> there's active fixing going on, but given that we're one week out from RC0,
> I think we should drop this from beta1.
> 
> Allen, Jian, others, is this reasonable? What release should we retarget
> this for? I don't have a sense for how much work there is left to do, but
> as a reminder, we're planning GA for Nov 1st, and 3.1.0 for January.
> 
> Best,
> Andrew
> 
> On Wed, Sep 6, 2017 at 10:19 AM, Jian He  wrote:
> 
>>>  Please correct me if I’m wrong, but the current summary of the
>> branch, post these changes, looks like:
>> Sorry for the confusion - I was actively writing the formal documentation for
>> how to use it / how it works, etc., and will post it soon, in a few hours.
>> 
>> 
>>> On Sep 6, 2017, at 10:15 AM, Allen Wittenauer 
>> wrote:
>>> 
>>> 
 On Sep 5, 2017, at 6:23 PM, Jian He  wrote:
 
>If it doesn’t have all the bells and whistles, then it shouldn’t
>> be on port 53 by default.
 Sure, I’ll change the default port to not use 53 and document it.
>*how* is it getting launched on a privileged port? It sounds like
>> the expectation is to run “command” as root.   *ALL* of the previous
>> daemons in Hadoop that needed a privileged port used jsvc.  Why isn’t this
>> one? These questions matter from a security standpoint.
 Yes, it is running as “root” to be able to use the privileged port. The
>> DNS server is not yet integrated with the hadoop script.
 
> Check the output.  It’s pretty obviously borked:
 Thanks for pointing that out. I missed this when rebasing onto trunk.
>>> 
>>> 
>>>  Please correct me if I’m wrong, but the current summary of the
>> branch, post these changes, looks like:
>>> 
>>>  * A bunch of mostly new Java code that may or may not have
>> javadocs (post-revert YARN-6877, still working out HADOOP-14835)
>>>  * ~1/3 of the docs are roadmap/TBD
>>>  * ~1/3 of the docs are for an optional DNS daemon that has
>> no end user hook to start it
>>>  * ~1/3 of the docs are for a REST API that comes from some
>> undefined daemon (apiserver?)
>>>  * Two new, but undocumented, subcommands to yarn
>>>  * There are no docs for admins or users on how to actually
>> start or use this completely new/separate/optional feature
>>> 
>>>  How are outside people (e.g., non-branch committers) supposed to
>> test this new feature under these conditions?
>>> 
>> 
>> 
>> -
>> To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
>> For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
>> 
>> 



[jira] [Created] (MAPREDUCE-6956) FileOutputCommitter to gain abstract superclass PathOutputCommitter

2017-09-08 Thread Steve Loughran (JIRA)
Steve Loughran created MAPREDUCE-6956:
-

 Summary: FileOutputCommitter to gain abstract superclass 
PathOutputCommitter
 Key: MAPREDUCE-6956
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6956
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Affects Versions: 3.0.0-beta1
Reporter: Steve Loughran
Assignee: Steve Loughran


This is the initial step of MAPREDUCE-6823, which proposes a factory behind 
{{FileOutputFormat}} to create different committers for different filesystems, 
if so configured.

This patch simply adds the new abstract superclass of {{FileOutputCommitter}}, 
{{PathOutputCommitter extends OutputCommitter}}. This abstract class adds 
{{getWorkPath()}} as an abstract method, with {{FileOutputCommitter}} 
being the implementation.

{{FileOutputFormat}} then relaxes its requirement on the committer returned by 
{{getOutputCommitter()}}: instead of requiring a 
{{FileOutputCommitter}} or subclass, it only needs a {{PathOutputCommitter}}, 
using {{PathOutputCommitter.getWorkPath()}} to get the work path.

What does that do?

It allows people to implement subclasses of {{FileOutputFormat}} which can 
provide their own committers, *which don't need to inherit the complexity that 
FileOutputCommitter has acquired over time*.

Currently anyone implementing a new committer (example: the Netflix S3 committer) 
needs to subclass {{FileOutputCommitter}}, which is too complex to understand 
except under a debugger: it has co-recursive routines, lots of methods which need 
to be overridden to guarantee a safe subclass, and, because of its critical 
role and known subclassing, it isn't ever going to be cleaned up.

A new, lean parent class which {{FileOutputFormat}} can handle allows people 
to write new committers which don't have to worry about the implementation 
details of {{FileOutputCommitter}}, and can instead focus on how well they 
implement the semantics of committing work.

The full MAPREDUCE-6823 goes beyond this with a change to {{FileOutputFormat}} 
for a factory for creating FS-specific {{PathOutputCommitter}} instances. This 
patch doesn't include that, as that is something which needs to be reviewed in 
the context of HADOOP-13786 and ideally 1+ committer for another store, so 
people can say "this factory model works".

All I'm proposing here is: tune the committer class hierarchy in MRv2 so that 
people can more easily implement committers, and so that when that factory is 
done, it can be switched to easily. And I'd like this in branch-3 from the 
outset, so existing code which calls {{FileOutputFormat.getCommitter()}} to get 
a {{FileOutputCommitter}} *just to call getWorkPath()* can move to the new 
interface across all of Hadoop 3.
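To illustrate the shape of the proposal, here is a minimal, self-contained sketch of the hierarchy. These are stand-in types with simplified signatures, not the real Hadoop classes: the real {{getWorkPath()}} returns a {{Path}}, and the real methods take job/task-attempt contexts.

```java
// Sketch only: stand-in types with simplified signatures. The real classes
// live under org.apache.hadoop.mapreduce; getWorkPath() really returns a Path.
abstract class OutputCommitter {
    abstract void commitTask();
}

// The new, lean abstract parent: all it adds is getWorkPath().
abstract class PathOutputCommitter extends OutputCommitter {
    abstract String getWorkPath();
}

// FileOutputCommitter keeps its behaviour; it just slots in under the new parent.
class FileOutputCommitter extends PathOutputCommitter {
    private final String outputPath;
    FileOutputCommitter(String outputPath) { this.outputPath = outputPath; }
    @Override String getWorkPath() { return outputPath + "/_temporary/0"; }
    @Override void commitTask() { /* rename task-attempt output into place */ }
}

public class Main {
    // FileOutputFormat can now ask any PathOutputCommitter for its work path,
    // with no downcast to FileOutputCommitter.
    static String workPathOf(PathOutputCommitter committer) {
        return committer.getWorkPath();
    }
    public static void main(String[] args) {
        // prints s3a://bucket/out/_temporary/0
        System.out.println(workPathOf(new FileOutputCommitter("s3a://bucket/out")));
    }
}
```

The point of the design: a store-specific committer only has to extend the lean parent and answer {{getWorkPath()}}, instead of inheriting all of {{FileOutputCommitter}}'s rename-based machinery.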



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org



Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2017-09-08 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/517/

[Sep 7, 2017 7:46:20 AM] (sunilg) YARN-6992. Kill application button is visible 
even if the application is
[Sep 7, 2017 12:38:23 PM] (kai.zheng) HDFS-12402. Refactor 
ErasureCodingPolicyManager and related codes.
[Sep 7, 2017 3:18:28 PM] (arp) HDFS-12376. Enable JournalNode Sync by default. 
Contributed by Hanisha
[Sep 7, 2017 4:50:36 PM] (yzhang) HDFS-12357. Let NameNode to bypass external 
attribute provider for
[Sep 7, 2017 5:35:03 PM] (stevel) HADOOP-14520. WASB: Block compaction for 
Azure Block Blobs. Contributed
[Sep 7, 2017 5:23:12 PM] (Arun Suresh) YARN-6978. Add updateContainer API to 
NMClient. (Kartheek Muthyala via
[Sep 7, 2017 6:55:56 PM] (stevel) HADOOP-14774. S3A case 
"testRandomReadOverBuffer" failed due to improper
[Sep 7, 2017 7:40:09 PM] (aengineer) HDFS-12350. Support meta tags in configs. 
Contributed by Ajay Kumar.
[Sep 7, 2017 9:13:37 PM] (wangda) YARN-7033. Add support for NM Recovery of 
assigned resources (e.g.
[Sep 7, 2017 9:17:03 PM] (jlowe) YARN-6930. Admins should be able to explicitly 
enable specific
[Sep 7, 2017 11:30:12 PM] (xiao) HDFS-12369. Edit log corruption due to hard 
lease recovery of not-closed
[Sep 7, 2017 11:56:35 PM] (wang) HDFS-12218. Rename split EC / replicated block 
metrics in BlockManager.
[Sep 7, 2017 11:57:19 PM] (wang) HDFS-12218. Addendum. Rename split EC / 
replicated block metrics in
[Sep 8, 2017 12:20:42 AM] (manojpec) HDFS-12404. Rename hdfs config 
authorization.provider.bypass.users to
[Sep 8, 2017 1:01:37 AM] (lei) HDFS-12349. Improve log message when it could 
not alloc enough blocks
[Sep 8, 2017 1:45:17 AM] (sunilg) YARN-6600. Introduce default and max lifetime 
of application at
[Sep 8, 2017 2:07:17 AM] (subru) YARN-5330. SharingPolicy enhancements required 
to support recurring
[Sep 8, 2017 3:51:02 AM] (xiao) HDFS-12400. Provide a way for NN to drain the 
local key cache before




-1 overall


The following subsystems voted -1:
findbugs unit


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

FindBugs :

   module:hadoop-hdfs-project/hadoop-hdfs 
   Format-string method String.format(String, Object[]) called with format 
string "File %s could only be written to %d of the %d %s. There are %d 
datanode(s) running and %s node(s) are excluded in this operation." wants 6 
arguments but is given 7 in 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(String,
 int, Node, Set, long, List, byte, BlockType, ErasureCodingPolicy, EnumSet) At 
BlockManager.java:with format string "File %s could only be written to %d of 
the %d %s. There are %d datanode(s) running and %s node(s) are excluded in this 
operation." wants 6 arguments but is given 7 in 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(String,
 int, Node, Set, long, List, byte, BlockType, ErasureCodingPolicy, EnumSet) At 
BlockManager.java:[line 2076] 
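The warning above is an argument-count mismatch in a format string. A hedged one-line illustration of the bug class FindBugs is flagging (not the actual BlockManager code): Java's {{Formatter}} silently ignores surplus arguments, so the mistake never fails at runtime and only a static checker catches it.

```java
public class Main {
    public static void main(String[] args) {
        // Format string wants 2 arguments but is given 3: java.util.Formatter
        // silently drops the extra argument instead of throwing, which is why
        // this only surfaces as a FindBugs warning, not a test failure.
        String msg = String.format("written to %d of the %d nodes", 1, 2, 3);
        System.out.println(msg); // prints "written to 1 of the 2 nodes"; the 3 is lost
    }
}
```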

FindBugs :

   
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 
   Hard coded reference to an absolute pathname in 
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DockerLinuxContainerRuntime.launchContainer(ContainerRuntimeContext)
 At DockerLinuxContainerRuntime.java:absolute pathname in 
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DockerLinuxContainerRuntime.launchContainer(ContainerRuntimeContext)
 At DockerLinuxContainerRuntime.java:[line 490] 

Failed junit tests :

   hadoop.hdfs.server.namenode.TestReencryption 
   hadoop.hdfs.server.namenode.TestReencryptionWithKMS 
   hadoop.hdfs.TestLeaseRecoveryStriped 
   hadoop.hdfs.TestClientProtocolForPipelineRecovery 
   hadoop.hdfs.TestReconstructStripedFile 
   hadoop.hdfs.server.blockmanagement.TestBlockManager 
   hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean 
   hadoop.hdfs.protocol.datatransfer.sasl.TestSaslDataTransfer 
   hadoop.hdfs.TestDFSStripedOutputStreamWithFailureWithRandomECPolicy 
   
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerAllocation 
   hadoop.yarn.server.TestDiskFailures 
   hadoop.mapreduce.v2.hs.webapp.TestHSWebApp 
   hadoop.yarn.sls.TestReservationSystemInvariants 
   hadoop.yarn.sls.TestSLSRunner 

Timed out junit tests :

   org.apache.hadoop.hdfs.TestWriteReadStripedFile 
   
org.apache.hadoop.yarn.server.resourcemanager.TestSubmitApplicationWithRMHA 
   
org.apache.hadoop.yarn.server.resourcemanager.TestKillApplicationWithRMHA 
  

   cc:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/517/artifact/out/diff-compile-cc-root.txt
  [4.0K]

 

[jira] [Resolved] (MAPREDUCE-6955) remove unnecessary dependency from hadoop-mapreduce-client-app to hadoop-mapreduce-client-shuffle

2017-09-08 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen resolved MAPREDUCE-6955.
---
Resolution: Not A Problem

> remove unnecessary dependency from hadoop-mapreduce-client-app to 
> hadoop-mapreduce-client-shuffle
> -
>
> Key: MAPREDUCE-6955
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6955
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>








[jira] [Created] (MAPREDUCE-6955) remove unnecessary dependency from hadoop-mapreduce-client-app to hadoop-mapreduce-client-shuffle

2017-09-08 Thread Haibo Chen (JIRA)
Haibo Chen created MAPREDUCE-6955:
-

 Summary: remove unnecessary dependency from 
hadoop-mapreduce-client-app to hadoop-mapreduce-client-shuffle
 Key: MAPREDUCE-6955
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6955
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Haibo Chen
Assignee: Haibo Chen










Re: [VOTE] Merge yarn-native-services branch into trunk

2017-09-08 Thread Jian He
Hi Allen,
The documentation has been committed. Please check QuickStart.md and the others 
in the same folder.
The YarnCommands.md doc has been updated to include the new commands.
The DNS default port is also documented.
Would you like to take a look and see if it addresses your concerns?

Jian

> On Sep 6, 2017, at 10:19 AM, Jian He  wrote:
> 
>>  Please correct me if I’m wrong, but the current summary of the branch, 
>> post these changes, looks like:
> Sorry for the confusion - I was actively writing the formal documentation for 
> how to use it / how it works, etc., and will post it soon, in a few hours.
> 
> 
>> On Sep 6, 2017, at 10:15 AM, Allen Wittenauer  
>> wrote:
>> 
>> 
>>> On Sep 5, 2017, at 6:23 PM, Jian He  wrote:
>>> 
If it doesn’t have all the bells and whistles, then it shouldn’t be on 
 port 53 by default.
>>> Sure, I’ll change the default port to not use 53 and document it.
*how* is it getting launched on a privileged port? It sounds like the 
 expectation is to run “command” as root.   *ALL* of the previous daemons 
 in Hadoop that needed a privileged port used jsvc.  Why isn’t this one? 
 These questions matter from a security standpoint.  
>>> Yes, it is running as “root” to be able to use the privileged port. The DNS 
>>> server is not yet integrated with the hadoop script. 
>>> 
 Check the output.  It’s pretty obviously borked:
>>> Thanks for pointing that out. I missed this when rebasing onto trunk.
>> 
>> 
>>  Please correct me if I’m wrong, but the current summary of the branch, 
>> post these changes, looks like:
>> 
>>  * A bunch of mostly new Java code that may or may not have 
>> javadocs (post-revert YARN-6877, still working out HADOOP-14835)
>>  * ~1/3 of the docs are roadmap/TBD
>>  * ~1/3 of the docs are for an optional DNS daemon that has no 
>> end user hook to start it
>>  * ~1/3 of the docs are for a REST API that comes from some 
>> undefined daemon (apiserver?)
>>  * Two new, but undocumented, subcommands to yarn
>>  * There are no docs for admins or users on how to actually 
>> start or use this completely new/separate/optional feature
>> 
>>  How are outside people (e.g., non-branch committers) supposed to test 
>> this new feature under these conditions?
>> 
> 



[jira] [Created] (MAPREDUCE-6954) Disable erasure coding for files that are uploaded to the MR staging area

2017-09-08 Thread Peter Bacsko (JIRA)
Peter Bacsko created MAPREDUCE-6954:
---

 Summary: Disable erasure coding for files that are uploaded to the 
MR staging area
 Key: MAPREDUCE-6954
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6954
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: client
Reporter: Peter Bacsko
Assignee: Peter Bacsko


Depending on the encoder/decoder used and the type of MR workload, EC might 
negatively affect the performance of an MR job if too many files are localized.

In such a scenario, users might want to disable EC in the staging area to speed 
up the execution.







[DISCUSS] official docker image(s) for hadoop

2017-09-08 Thread Marton, Elek


TL;DR: I propose to create official Hadoop images and upload them to 
Docker Hub.


GOAL/SCOPE: I would like to improve the existing documentation with 
easy-to-use Docker-based recipes to start Hadoop clusters with various 
configurations.


The images could also be used to test experimental features. For example, 
Ozone could be tested easily with this compose file and configuration:


https://gist.github.com/elek/1676a97b98f4ba561c9f51fce2ab2ea6

Or the configuration could even be included in the compose file:

https://github.com/elek/hadoop/blob/docker-2.8.0/example/docker-compose.yaml

I would like to create separate example compose files for federation, 
HA, metrics usage, etc., to make it easier to try out and understand the 
features.
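For illustration, such a compose file might look like the following. This is a hypothetical sketch: the `apache/hadoop` image name and the service layout are assumptions, since the official images proposed here do not exist yet.

```yaml
# Hypothetical sketch of a two-service pseudo-distributed HDFS cluster.
# The apache/hadoop image is an assumption (the point of this proposal).
version: "3"
services:
  namenode:
    image: apache/hadoop:2.8.0   # hypothetical official image
    command: hdfs namenode
    ports:
      - "50070:50070"            # NameNode web UI
  datanode:
    image: apache/hadoop:2.8.0
    command: hdfs datanode
```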


CONTEXT: There is an existing Jira 
https://issues.apache.org/jira/browse/HADOOP-13397
But it’s about a tool to generate production-quality Docker images 
(multiple types, in a flexible way). If there are no objections, I will create 
a separate issue to create simplified Docker images for rapid prototyping 
and investigating new features, and register the branch with Docker Hub 
to build the images automatically.


MY BACKGROUND: I have been working with Docker-based Hadoop/Spark clusters 
for quite a while and have run them successfully in different environments 
(Kubernetes, Docker Swarm, Nomad-based scheduling, etc.). My work is 
available here: https://github.com/flokkr but it handles 
more complex use cases (e.g. instrumenting Java processes with btrace, or 
reading/reloading configuration from Consul).
 And IMHO it’s better for the official Hadoop documentation to suggest 
the official Apache Docker images rather than external ones (which could 
change).


Please let me know if you have any comments.

Marton




hadoop roadmaps

2017-09-08 Thread Marton, Elek

Hi,

I tried to summarize all of the information from the different mail threads 
about the upcoming releases:


https://cwiki.apache.org/confluence/display/HADOOP/Roadmap

Please fix it / let me know if you see any invalid data. I will try to 
follow the conversations and update accordingly.


Two administrative questions:

 * Is there any information about which wiki should be used, or about 
the migration process? As far as I can see, new pages have recently been 
created on the cwiki.


 * Could you please give me permission (user: elek) on the old wiki? I 
would like to update the old Roadmap page 
(https://wiki.apache.org/hadoop/Roadmap).


Thanks
Marton




[jira] [Created] (MAPREDUCE-6953) Skip the testcase testJobWithChangePriority if FairScheduler is used

2017-09-08 Thread Peter Bacsko (JIRA)
Peter Bacsko created MAPREDUCE-6953:
---

 Summary: Skip the testcase testJobWithChangePriority if 
FairScheduler is used
 Key: MAPREDUCE-6953
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6953
 Project: Hadoop Map/Reduce
  Issue Type: Test
  Components: client
Reporter: Peter Bacsko
Assignee: Peter Bacsko


We run the unit tests with Fair Scheduler downstream. FS does not support 
priorities at the moment, so TestMRJobs#testJobWithChangePriority fails.

Just add {{Assume.assumeFalse(usingFairScheduler);}} and JUnit will skip the 
test.
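To make the mechanism concrete, here is a self-contained mimic of JUnit's skip-on-assumption behaviour. The `Assume` and exception classes below are stand-ins for the real org.junit ones, and `usingFairScheduler` stands for the downstream flag the report assumes the test class already holds.

```java
// Stand-ins for org.junit.Assume / AssumptionViolatedException, showing why
// assumeFalse() makes a runner report the test as skipped rather than failed.
class AssumptionViolatedException extends RuntimeException {
    AssumptionViolatedException(String msg) { super(msg); }
}

class Assume {
    static void assumeFalse(boolean condition) {
        if (condition) throw new AssumptionViolatedException("assumption failed");
    }
}

public class Main {
    static boolean usingFairScheduler = true; // downstream test-suite flag (assumed name)

    public static void main(String[] args) {
        try {
            Assume.assumeFalse(usingFairScheduler);
            System.out.println("test body ran");  // only with a supported scheduler
        } catch (AssumptionViolatedException e) {
            // A JUnit runner catches this and marks the test skipped, not failed.
            System.out.println("test skipped");
        }
    }
}
```

With the real JUnit 4 classes, the one-line guard at the top of testJobWithChangePriority is all that is needed.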







Re: 2017-09-07 Hadoop 3 release status update

2017-09-08 Thread Steve Loughran

On 8 Sep 2017, at 00:50, Andrew Wang wrote:

  - HADOOP-14738  (Remove
  S3N and obsolete bits of S3A; rework docs): Steve has been actively revving
  this with our new committer Aaron Fabbri ready to review. The scope has
  expanded from HADOOP-14826, so it's not just a doc update.

For people not tracking this: it's merged with other cleanup code, so it pulls 
the entirety of the s3n:// connector and the original 
S3AOutputStream... essentially the unmaintained and obsolete bits of code, the 
ones where any bug report would be dealt with by "have you switched to..."