[VOTE] Merging branch HDFS-7240 to trunk

2018-02-26 Thread Jitendra Pandey
Dear folks,
   We would like to start a vote to merge the HDFS-7240 branch into trunk.
The context can be reviewed in the DISCUSSION thread and in the jiras (see
references below).
  
HDFS-7240 introduces the Hadoop Distributed Storage Layer (HDSL), which is a
distributed, replicated block layer.
The old HDFS namespace and NN can be connected to this new block layer, as
we have described in HDFS-10419.
We also introduce a key-value namespace called Ozone, built on HDSL.
  
The code is in a separate module and is turned off by default. In a secure 
setup, HDSL and Ozone daemons cannot be started.

The detailed documentation is available at 
 
https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+Distributed+Storage+Layer+and+Applications


I will start with my vote.
+1 (binding)


Discussion Thread:
  https://s.apache.org/7240-merge
  https://s.apache.org/4sfU

Jiras:
   https://issues.apache.org/jira/browse/HDFS-7240
   https://issues.apache.org/jira/browse/HDFS-10419
   https://issues.apache.org/jira/browse/HDFS-13074
   https://issues.apache.org/jira/browse/HDFS-13180

   
Thanks
jitendra





DISCUSSION THREAD SUMMARY :

On 2/13/18, 6:28 PM, "sanjay Radia"  wrote:

Sorry, the formatting got messed up by my email client. Here it is
again.


Dear Hadoop Community Members,

   We had multiple community discussions, a few meetings in
smaller groups, and also jira discussions with respect to this thread. We
express our gratitude for your participation and valuable comments.

The key questions raised were the following:
1) How do the new block storage layer and OzoneFS benefit HDFS? We
were also asked to chalk out a roadmap towards the goal of a scalable namenode
working with the new storage layer.
2) We were asked to provide a security design.
3) There were questions around stability, given that Ozone brings in a
large body of code.
4) Why can’t they be separate projects forever, or be merged in
when production ready?

We have responded to all the above questions with detailed
explanations and answers on the jira as well as in the discussions. We believe
that should sufficiently address the community’s concerns.

Please see the summary below:

1) The new code base benefits HDFS scaling and a roadmap has 
been provided. 

Summary:
  - The new block storage layer addresses the scalability of the
block layer. We have shown how the existing NN can be connected to the new
block layer, and the benefits of doing so. We have shown 2 milestones; the 1st
milestone is much simpler than the 2nd while giving almost the same scaling
benefits. Originally we had proposed only milestone 2, and the community felt
that removing the FSN/BM lock was a fair amount of work and that a simpler
solution would be useful.
  - We provide a new K-V namespace called Ozone FS with
FileSystem/FileContext plugins to allow users to use the new system. BTW,
Hive and Spark work very well on KV-namespaces on the cloud. This will
facilitate stabilizing the new block layer.
  - The new block layer has a new netty-based protocol engine
in the Datanode which, when stabilized, can be used by the old HDFS block
layer. See details below on sharing of code.


2) Stability impact on the existing HDFS code base and code
separation. The new block layer and OzoneFS are in modules that are
separate from the old HDFS code. Currently there are no calls from HDFS into
Ozone except for the DN starting the new block layer module if configured to do
so. It does not add instability (the instability argument has been raised many
times). Over time, as we share code, we will ensure that the old HDFS continues
to remain stable. (For example, we plan to stabilize the new netty-based
protocol engine in the new block layer before sharing it with HDFS’s old block
layer.)


3) In the short term and medium term, the new system and HDFS
will be used side-by-side by users: side-by-side in the short term for
testing, and side-by-side in the medium term for actual production use until
the new system has feature parity with old HDFS. During this time, sharing the
DN daemon and admin functions between the two systems is operationally
important:

Apache Hadoop qbt Report: branch2+JDK7 on Linux/x86

2018-02-26 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/147/

No changes




-1 overall


The following subsystems voted -1:
asflicense findbugs mvnsite unit xml


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

FindBugs :

   module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase/hadoop-yarn-server-timelineservice-hbase-client
   Boxed value is unboxed and then immediately reboxed in
org.apache.hadoop.yarn.server.timelineservice.storage.common.ColumnRWHelper.readResultsWithTimestamps(Result, byte[], byte[], KeyConverter, ValueConverter, boolean) At ColumnRWHelper.java:[line 335]
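
For readers unfamiliar with this FindBugs pattern, the following is a minimal, self-contained Java sketch of what "boxed value is unboxed and then immediately reboxed" usually looks like. It is not the code from ColumnRWHelper.java; the class and method names (ReboxSketch, copyWithRebox, copyNoRebox) are hypothetical and exist only for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; this is NOT the code in ColumnRWHelper.java.
class ReboxSketch {

    // Flagged style: ts.longValue() unboxes the Long, and passing the
    // primitive as a Map<Long, String> key immediately boxes it again.
    static Map<Long, String> copyWithRebox(Map<Long, String> in) {
        Map<Long, String> out = new TreeMap<>();
        for (Map.Entry<Long, String> e : in.entrySet()) {
            Long ts = e.getKey();
            out.put(ts.longValue(), e.getValue()); // unbox, then rebox
        }
        return out;
    }

    // Same behaviour without the round trip: pass the boxed value through.
    static Map<Long, String> copyNoRebox(Map<Long, String> in) {
        Map<Long, String> out = new TreeMap<>();
        for (Map.Entry<Long, String> e : in.entrySet()) {
            Long ts = e.getKey();
            out.put(ts, e.getValue());
        }
        return out;
    }
}
```

Both methods return equal maps; the warning concerns the wasted unbox/box step, not correctness.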

Unreaped Processes :

   hadoop-hdfs:20 
   bkjournal:5 
   hadoop-yarn-server-resourcemanager:1 
   hadoop-yarn-client:4 
   hadoop-yarn-applications-distributedshell:1 
   hadoop-mapreduce-client-jobclient:9 
   hadoop-distcp:4 
   hadoop-extras:1 

Failed junit tests :

   hadoop.hdfs.server.namenode.snapshot.TestSnapshotBlocksMap 
   hadoop.hdfs.web.TestWebHdfsWithMultipleNameNodes 
   hadoop.hdfs.web.TestFSMainOperationsWebHdfs
   hadoop.yarn.server.nodemanager.containermanager.linux.runtime.TestDockerContainerRuntime
   hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing
   hadoop.yarn.server.resourcemanager.metrics.TestSystemMetricsPublisher 
   hadoop.yarn.server.TestDiskFailures 
   hadoop.conf.TestNoDefaultsJobConf 
   hadoop.mapred.TestJobSysDirWithDFS 
   hadoop.tools.TestIntegration 
   hadoop.tools.TestDistCpViewFs 
   hadoop.resourceestimator.solver.impl.TestLpSolver 
   hadoop.resourceestimator.service.TestResourceEstimatorService 

Timed out junit tests :

   org.apache.hadoop.hdfs.TestLeaseRecovery2 
   org.apache.hadoop.hdfs.TestRead 
   org.apache.hadoop.security.TestPermission 
   org.apache.hadoop.hdfs.web.TestWebHdfsTokens 
   org.apache.hadoop.hdfs.TestDFSInotifyEventInputStream 
   org.apache.hadoop.hdfs.TestDatanodeLayoutUpgrade 
   org.apache.hadoop.hdfs.TestFileAppendRestart 
   org.apache.hadoop.hdfs.TestReadWhileWriting 
   org.apache.hadoop.hdfs.security.TestDelegationToken 
   org.apache.hadoop.hdfs.TestDFSMkdirs 
   org.apache.hadoop.hdfs.TestDFSOutputStream 
   org.apache.hadoop.hdfs.web.TestWebHDFS 
   org.apache.hadoop.metrics2.sink.TestRollingFileSystemSinkWithSecureHdfs 
   org.apache.hadoop.hdfs.web.TestWebHDFSXAttr 
   org.apache.hadoop.metrics2.sink.TestRollingFileSystemSinkWithHdfs 
   org.apache.hadoop.hdfs.TestDistributedFileSystem 
   org.apache.hadoop.hdfs.TestReplaceDatanodeFailureReplication 
   org.apache.hadoop.hdfs.TestDFSShell 
   org.apache.hadoop.contrib.bkjournal.TestBootstrapStandbyWithBKJM 
   org.apache.hadoop.contrib.bkjournal.TestBookKeeperJournalManager 
   org.apache.hadoop.contrib.bkjournal.TestBookKeeperHACheckpoints 
   org.apache.hadoop.contrib.bkjournal.TestBookKeeperAsHASharedDir 
   org.apache.hadoop.contrib.bkjournal.TestBookKeeperSpeculativeRead
   org.apache.hadoop.yarn.server.resourcemanager.metrics.TestCombinedSystemMetricsPublisher
   org.apache.hadoop.yarn.server.resourcemanager.recovery.TestFSRMStateStore
   org.apache.hadoop.yarn.client.TestRMFailover 
   org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA 
   org.apache.hadoop.yarn.client.api.impl.TestYarnClientWithReservation 
   org.apache.hadoop.yarn.client.api.impl.TestAMRMClient
   org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell
   org.apache.hadoop.fs.TestFileSystem 
   org.apache.hadoop.mapred.TestMiniMRClasspath 
   org.apache.hadoop.mapred.TestClusterMapReduceTestCase 
   org.apache.hadoop.mapred.TestMRIntermediateDataEncryption 
   org.apache.hadoop.mapred.TestMRTimelineEventHandling 
   org.apache.hadoop.mapred.join.TestDatamerge 
   org.apache.hadoop.mapred.TestReduceFetchFromPartialMem 
   org.apache.hadoop.mapred.TestLazyOutput 
   org.apache.hadoop.mapred.TestReduceFetch 
   org.apache.hadoop.tools.TestDistCpWithAcls 
   org.apache.hadoop.tools.TestDistCpSync 
   org.apache.hadoop.tools.TestDistCpSyncReverseFromTarget 
   org.apache.hadoop.tools.TestDistCpSyncReverseFromSource 
   org.apache.hadoop.tools.TestCopyFiles 
  

   cc:

   

[jira] [Created] (MAPREDUCE-7060) Cherry Pick PathOutputCommitter class/factory to branch-3.0 & 2.10

2018-02-26 Thread Steve Loughran (JIRA)
Steve Loughran created MAPREDUCE-7060:
-

         Summary: Cherry Pick PathOutputCommitter class/factory to branch-3.0 & 2.10
             Key: MAPREDUCE-7060
             URL: https://issues.apache.org/jira/browse/MAPREDUCE-7060
         Project: Hadoop Map/Reduce
      Issue Type: Improvement
Affects Versions: 3.0.0, 2.10.0
        Reporter: Steve Loughran
        Assignee: Steve Loughran


It's easier for downstream apps like Spark to pick up the new 
PathOutputCommitter superclass if it is there on 2.10+, even if the S3A 
committer isn't there. 

Adding the interface & binding stuff of MAPREDUCE-6956 allows for third party
committers to be deployed.

I'm not proposing a backport of the HADOOP-13786 committer: that's Java 8, 
S3Guard, etc. Too traumatic. All I want here is to allow downstream code to be 
able to pick up the new interface and so be able to support it and other store
committers when available.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org



[jira] [Resolved] (MAPREDUCE-6961) Pull up FileOutputCommitter.getOutputPath to PathOutputCommitter

2018-02-26 Thread Steve Loughran (JIRA)

     [ https://issues.apache.org/jira/browse/MAPREDUCE-6961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran resolved MAPREDUCE-6961.
---
   Resolution: Duplicate
Fix Version/s: 3.1.0

This went in with HADOOP-13786

> Pull up FileOutputCommitter.getOutputPath to PathOutputCommitter
> 
>
>                 Key: MAPREDUCE-6961
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6961
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mrv2
>    Affects Versions: 3.0.0-beta1
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Minor
>             Fix For: 3.1.0
>
>
> SPARK-21549 has shown that downstream code is relying on the internal
> property. If we pulled {{FileOutputCommitter.getOutputPath}} up to the
> {{PathOutputCommitter}} of MAPREDUCE-6956, then there'd be a public/stable
> way to get this. Admittedly, it does imply that the committer will always
> have *some* output path, but FileOutputFormat depends on that anyway.
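
As a rough illustration of the "pull up" described in the quoted issue, here is a minimal Java sketch. It does not use the Hadoop classes: SketchPathCommitter, SketchFileCommitter and SketchDownstream are hypothetical stand-ins, and java.nio.file.Path is used in place of Hadoop's own Path type purely to keep the example self-contained.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical stand-ins; these are NOT the Hadoop classes themselves.
abstract class SketchPathCommitter {
    // Pulled-up accessor: every committer must be able to report an output path.
    abstract Path getOutputPath();
}

class SketchFileCommitter extends SketchPathCommitter {
    private final Path outputPath;

    SketchFileCommitter(Path outputPath) {
        this.outputPath = outputPath;
    }

    @Override
    Path getOutputPath() {
        return outputPath;
    }
}

class SketchDownstream {
    // Downstream code depends only on the abstract type,
    // not on an internal configuration property.
    static String describe(SketchPathCommitter committer) {
        return "output=" + committer.getOutputPath();
    }

    public static void main(String[] args) {
        System.out.println(describe(new SketchFileCommitter(Paths.get("out"))));
    }
}
```

The point of the design is that a caller holding only the abstract type can still ask for the output path, which is the public/stable route the issue asks for.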






Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2018-02-26 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/704/

No changes




-1 overall


The following subsystems voted -1:
findbugs unit xml


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

FindBugs :

   module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
   org.apache.hadoop.yarn.api.records.Resource.getResources() may expose
internal representation by returning Resource.resources At Resource.java:[line 234]
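
For context, a minimal Java sketch of what "may expose internal representation" means is given here. It is not the code from Resource.java; ExposeRepSketch and its methods are hypothetical names used only for illustration, and whether a defensive copy is the right trade-off for Resource itself (copying has a cost on hot paths) is a separate question that this sketch does not settle.

```java
import java.util.Arrays;

// Hypothetical illustration only; this is NOT the code in Resource.java.
class ExposeRepSketch {
    private final long[] values = {1L, 2L, 3L};

    // Flagged style: callers receive the internal array and can mutate it.
    long[] getValuesExposed() {
        return values;
    }

    // One common remedy: hand out a defensive copy instead.
    long[] getValuesCopy() {
        return Arrays.copyOf(values, values.length);
    }
}
```

Writing through the array returned by getValuesExposed() changes the object's internal state, while writing through the array returned by getValuesCopy() does not.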

Failed junit tests :

   hadoop.crypto.key.kms.server.TestKMS 
   hadoop.hdfs.TestDFSStripedOutputStreamWithFailure 
   hadoop.hdfs.web.TestWebHdfsTimeouts 
   hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure 
   hadoop.fs.http.server.TestHttpFSServerWebServer 
   hadoop.yarn.client.api.impl.TestTimelineClientV2Impl 
   hadoop.yarn.server.nodemanager.webapp.TestContainerLogsPage 
  

   cc:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/704/artifact/out/diff-compile-cc-root.txt  [4.0K]

   javac:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/704/artifact/out/diff-compile-javac-root.txt  [280K]

   checkstyle:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/704/artifact/out/diff-checkstyle-root.txt  [17M]

   pylint:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/704/artifact/out/diff-patch-pylint.txt  [24K]

   shellcheck:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/704/artifact/out/diff-patch-shellcheck.txt  [20K]

   shelldocs:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/704/artifact/out/diff-patch-shelldocs.txt  [12K]

   whitespace:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/704/artifact/out/whitespace-eol.txt  [9.2M]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/704/artifact/out/whitespace-tabs.txt  [288K]

   xml:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/704/artifact/out/xml.txt  [4.0K]

   findbugs:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/704/artifact/out/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-api-warnings.html  [8.0K]

   javadoc:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/704/artifact/out/diff-javadoc-javadoc-root.txt  [760K]

   unit:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/704/artifact/out/patch-unit-hadoop-common-project_hadoop-kms.txt  [12K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/704/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt  [320K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/704/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-httpfs.txt  [20K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/704/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-common.txt  [40K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/704/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt  [48K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/704/artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient.txt  [84K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/704/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-applications_hadoop-yarn-services_hadoop-yarn-services-core.txt  [8.0K]

Powered by Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org
