Re: [DISCUSSION] Merging HDFS-7240 Object Store (Ozone) to trunk

2017-11-04 Thread Konstantin Shvachko
Hi Sanjay,

Read your doc. I clearly see the value of Ozone for your use cases, but I
agree with Stack and others that it isn't clear why it should be a part of
Hadoop. More details are in the jira:

https://issues.apache.org/jira/browse/HDFS-7240?focusedCommentId=16239313&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16239313

Thanks,
--Konstantin

On Fri, Nov 3, 2017 at 1:56 PM, sanjay Radia  wrote:

> Konstantin,
>  Thanks for your comments, questions and feedback. I have attached a
> document to the HDFS-7240 jira
>  that explains a design for scaling HDFS and how Ozone paves the way
> towards the full solution.
>
>
> https://issues.apache.org/jira/secure/attachment/12895963/HDFS%20Scalability%20and%20Ozone.pdf
>
>
> sanjay
>
>
>
>
> On Oct 28, 2017, at 2:00 PM, Konstantin Shvachko 
> wrote:
>
> Hey guys,
>
> It is an interesting question whether Ozone should be a part of Hadoop.
> There are two main reasons why I think it should not.
>
> 1. With close to 500 sub-tasks, 6 MB of code changes, and a sizable
> community behind it, it looks to me like a whole new project.
> It is essentially a new storage system, with an architecture different
> from HDFS's and separate S3-like APIs. This is really great - the world
> sure needs more distributed file systems. But it is not clear why Ozone
> should co-exist with HDFS under the same roof.
>
> 2. Ozone is probably just the first step in rebuilding HDFS under a new
> architecture, with the next steps presumably being HDFS-10419 and
> HDFS-8.
> The design doc for the new architecture has never been published. I can
> only assume, based on some presentations and personal communications, that
> the idea is to use Ozone as a block storage layer and re-implement the
> NameNode so that it stores only a partial namespace in memory, while the
> bulk of it (cold data) is persisted to local storage.
> Such an architecture makes me wonder whether it solves Hadoop's main
> problems. There are two main limitations in HDFS:
>  a. The throughput of namespace operations, which is limited by the number
> of RPCs the NameNode can handle.
>  b. The number of objects (files + blocks) the system can maintain, which
> is limited by the memory size of the NameNode.
> The RPC performance (a) is more important for Hadoop scalability than the
> object count (b), with read RPCs being the main priority.
> The new architecture targets the object count problem, but at the expense
> of RPC throughput, which seems to be the wrong resolution of the tradeoff.
> Also, based on the usage patterns on our large clusters, we read up to 90%
> of the data we write, so cold data is a small fraction and most of it must
> be cached.
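> (For rough context on limitation (b): a commonly cited rule of thumb, not
> a figure from this thread, is that each namespace object, i.e. a file,
> directory, or block, costs on the order of 150 bytes of NameNode heap.
> Under that assumption, a hypothetical 128 GB heap caps the namespace at
> roughly
>
>   N \approx \frac{128 \times 10^9 \text{ bytes}}{150 \text{ bytes/object}}
>     \approx 8.5 \times 10^8 \text{ objects},
>
> i.e. somewhat under a billion files and blocks.)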
>
> To summarize:
> - Ozone is a big enough system to deserve its own project.
> - The architecture that Ozone leads to does not seem to solve the intrinsic
> problems of current HDFS.
>
> I will post my opinion in the Ozone jira; it should be more convenient to
> discuss it there for further reference.
>
> Thanks,
> --Konstantin
>
>
>
> On Wed, Oct 18, 2017 at 6:54 PM, Yang Weiwei 
> wrote:
>
> Hello everyone,
>
>
> I would like to start this thread to discuss merging Ozone (HDFS-7240) to
> trunk. This feature implements an object store which can co-exist with
> HDFS. Ozone is disabled by default. We have tested Ozone with cluster sizes
> varying from 1 to 100 data nodes.
>
>
>
> The merge payload includes the following:
>
>  1.  All services and management scripts
>  2.  Object store APIs, exposed via both REST and RPC
>  3.  Master service UIs and command-line interfaces
>  4.  Pluggable pipeline integration
>  5.  Ozone File System (a Hadoop-compatible file system implementation
> that passes all FileSystem contract tests; see the sketch below)
>  6.  Corona - a load generator for Ozone
>  7.  Essential documentation added to the Hadoop site
>  8.  Version-specific Ozone documentation, accessible via the service UI
>  9.  Docker support for Ozone, which enables faster development cycles
>
>
> To build Ozone and run it using Docker, please follow the instructions on
> this wiki page:
> https://cwiki.apache.org/confluence/display/HADOOP/Dev+cluster+with+docker
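>
> As a quick illustration of item 5 in the list above, the Ozone File System
> is meant to be usable through the standard Hadoop FileSystem API. The
> sketch below is hypothetical and not part of the merge payload itself; in
> particular the o3 URI, the volume/bucket naming, and the port are
> assumptions, so please check the Ozone documentation for the real values.
>
>     import java.net.URI;
>     import org.apache.hadoop.conf.Configuration;
>     import org.apache.hadoop.fs.FSDataInputStream;
>     import org.apache.hadoop.fs.FSDataOutputStream;
>     import org.apache.hadoop.fs.FileSystem;
>     import org.apache.hadoop.fs.Path;
>
>     public class OzoneFsExample {
>       public static void main(String[] args) throws Exception {
>         // Assumes the Ozone FileSystem jar is on the classpath and its
>         // scheme is registered (e.g. via an fs.<scheme>.impl setting).
>         Configuration conf = new Configuration();
>         // Hypothetical Ozone File System URI (scheme, volume, bucket and
>         // port are placeholders, not the documented values).
>         URI uri = URI.create("o3://bucket.volume.host:9864/");
>         FileSystem fs = FileSystem.get(uri, conf);
>
>         Path file = new Path("/demo/hello.txt");
>         try (FSDataOutputStream out = fs.create(file, true)) {
>           out.writeUTF("hello ozone");      // write via the generic API
>         }
>         try (FSDataInputStream in = fs.open(file)) {
>           System.out.println(in.readUTF()); // read it back the same way
>         }
>         fs.close();
>       }
>     }
>
> Because it is exposed through the FileSystem abstraction, existing tools
> (distcp, FsShell commands, MapReduce input/output) should be able to point
> at an Ozone URI without code changes; passing the FileSystem contract
> tests is what backs that expectation.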
>
>
> We have built a passionate and diverse community to drive this feature's
> development. As a team, we have achieved significant progress in the 3
> years since the first JIRA for HDFS-7240 was opened in October 2014. So
> far, almost 400 JIRAs have been resolved by 20+ contributors/committers
> from different countries and affiliations. We also want to thank the large
> number of community members who supported our efforts, contributed ideas,
> and participated in the design of Ozone.
>
>
> Please share your thoughts, thanks!
>
>
> -- Weiwei Yang
>

Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2017-11-04 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/580/

[Nov 3, 2017 7:05:45 PM] (xiao) HDFS-11467. Support ErasureCoding section in 
OIV XML/ReverseXML.
[Nov 3, 2017 8:16:46 PM] (kihwal) HDFS-12771. Add genstamp and block size to 
metasave Corrupt blocks list.
[Nov 3, 2017 9:30:57 PM] (cdouglas) HDFS-12681. Fold HdfsLocatedFileStatus into 
HdfsFileStatus.
[Nov 3, 2017 11:10:37 PM] (xyao) HADOOP-14987. Improve KMSClientProvider log 
around delegation token
[Nov 4, 2017 3:34:40 AM] (xyao) HDFS-10528. Add logging to successful standby 
checkpointing. Contributed
[Nov 4, 2017 4:01:56 AM] (liuml07) HADOOP-15015. TestConfigurationFieldsBase to 
use SLF4J for logging.




-1 overall


The following subsystems voted -1:
asflicense unit


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

Unreaped Processes :

   hadoop-hdfs:7 
   hadoop-mapreduce-client-jobclient:1 

Failed junit tests :

   hadoop.fs.shell.TestCopyPreserveFlag 
   hadoop.hdfs.TestDFSStripedOutputStreamWithFailure130 
   hadoop.hdfs.TestDFSStripedOutputStreamWithFailure200 
   hadoop.hdfs.TestDFSStripedInputStreamWithRandomECPolicy 
   hadoop.hdfs.TestSafeModeWithStripedFile 
   hadoop.hdfs.server.mover.TestMover 
   hadoop.hdfs.server.diskbalancer.TestDiskBalancerWithMockMover 
   hadoop.hdfs.TestDFSStripedOutputStreamWithFailure020 
   hadoop.hdfs.server.balancer.TestBalancerRPCDelay 
   hadoop.hdfs.TestReconstructStripedFile 
   hadoop.hdfs.TestDFSStripedOutputStreamWithFailure 
   hadoop.hdfs.TestFileChecksum 
   hadoop.hdfs.server.balancer.TestBalancer 
   hadoop.hdfs.TestDFSStripedOutputStream 
   hadoop.hdfs.TestDFSStripedOutputStreamWithFailure090 
   hadoop.hdfs.TestDFSStripedOutputStreamWithFailure180 
   hadoop.hdfs.TestDFSStripedOutputStreamWithFailure120 
   hadoop.hdfs.TestDFSStartupVersions 
   hadoop.hdfs.TestDFSStripedOutputStreamWithFailure170 
   hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped 
   hadoop.hdfs.TestDecommissionWithStriped 
   hadoop.hdfs.TestDFSStripedOutputStreamWithFailure000 
   hadoop.hdfs.TestDFSStripedOutputStreamWithFailure190 
   hadoop.hdfs.server.namenode.TestQuotaByStorageType 
   hadoop.hdfs.TestDFSStripedOutputStreamWithFailure070 
   hadoop.hdfs.TestReservedRawPaths 
   hadoop.yarn.server.TestContainerManagerSecurity 
   hadoop.mapreduce.lib.join.TestJoinDatamerge 
   hadoop.mapreduce.lib.input.TestDelegatingInputFormat 
   hadoop.mapred.lib.TestDelegatingInputFormat 
   hadoop.mapreduce.lib.join.TestJoinProperties 
   hadoop.mapred.join.TestDatamerge 
   hadoop.streaming.TestSymLink 
   hadoop.streaming.TestMultipleArchiveFiles 
   hadoop.streaming.TestMultipleCachefiles 
   hadoop.contrib.utils.join.TestDataJoin 

Timed out junit tests :

   org.apache.hadoop.mapred.pipes.TestPipeApplication 
  

   cc:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/580/artifact/out/diff-compile-cc-root.txt
  [4.0K]

   javac:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/580/artifact/out/diff-compile-javac-root.txt
  [280K]

   checkstyle:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/580/artifact/out/diff-checkstyle-root.txt
  [17M]

   pylint:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/580/artifact/out/diff-patch-pylint.txt
  [20K]

   shellcheck:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/580/artifact/out/diff-patch-shellcheck.txt
  [20K]

   shelldocs:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/580/artifact/out/diff-patch-shelldocs.txt
  [12K]

   whitespace:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/580/artifact/out/whitespace-eol.txt
  [8.8M]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/580/artifact/out/whitespace-tabs.txt
  [288K]

   javadoc:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/580/artifact/out/diff-javadoc-javadoc-root.txt
  [760K]

   UnreapedProcessesLog:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/580/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-reaper.txt
  [4.0K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/580/artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient-reaper.txt
  [4.0K]

   unit:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/580/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt
  [156K]
   

Apache Hadoop qbt Report: branch2+JDK7 on Linux/x86

2017-11-04 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/30/

No changes




-1 overall


The following subsystems voted -1:
findbugs mvninstall mvnsite unit xml


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


   mvninstall:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/30/artifact/out/patch-mvninstall-root.txt
  [4.0K]

   cc:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/30/artifact/out/diff-compile-cc-root.txt
  [4.0K]

   javac:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/30/artifact/out/diff-compile-javac-root.txt
  [324K]

   checkstyle:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/30/artifact/out/diff-checkstyle-root.txt
  [4.0K]

   mvnsite:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/30/artifact/out/patch-mvnsite-root.txt
  [12K]

   pylint:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/30/artifact/out/diff-patch-pylint.txt
  [20K]

   shellcheck:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/30/artifact/out/diff-patch-shellcheck.txt
  [72K]

   shelldocs:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/30/artifact/out/diff-patch-shelldocs.txt
  [48K]

   whitespace:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/30/artifact/out/whitespace-eol.txt
  [12M]
   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/30/artifact/out/whitespace-tabs.txt
  [1.2M]

   xml:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/30/artifact/out/xml.txt
  [4.0K]

   findbugs:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/30/artifact/out/branch-findbugs-hadoop-common-project_hadoop-annotations.txt
  [4.0K]
   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/30/artifact/out/branch-findbugs-hadoop-common-project_hadoop-auth.txt
  [4.0K]
   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/30/artifact/out/branch-findbugs-hadoop-common-project_hadoop-auth-examples.txt
  [4.0K]
   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/30/artifact/out/branch-findbugs-hadoop-common-project_hadoop-common.txt
  [4.0K]
   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/30/artifact/out/branch-findbugs-hadoop-common-project_hadoop-kms.txt
  [0]
   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/30/artifact/out/branch-findbugs-hadoop-common-project_hadoop-minikdc.txt
  [4.0K]
   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/30/artifact/out/branch-findbugs-hadoop-common-project_hadoop-nfs.txt
  [4.0K]
   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/30/artifact/out/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs.txt
  [4.0K]
   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/30/artifact/out/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-client.txt
  [4.0K]
   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/30/artifact/out/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-httpfs.txt
  [8.0K]
   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/30/artifact/out/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-nfs.txt
  [4.0K]
   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/30/artifact/out/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs_src_contrib_bkjournal.txt
  [4.0K]
   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/30/artifact/out/branch-findbugs-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-app.txt
  [16K]
   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/30/artifact/out/branch-findbugs-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-common.txt
  [4.0K]
   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/30/artifact/out/branch-findbugs-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core.txt
  [8.0K]
   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/30/artifact/out/branch-findbugs-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-hs.txt
  [0]
   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/30/artifact/out/branch-findbugs-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-hs-plugins.txt
  [4.0K]
   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/30/artifact/out/branch-findbugs-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient.txt
  [0]
   

Apache Hadoop qbt Report: branch2+JDK7 on Linux/x86

2017-11-04 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/29/

[Nov 3, 2017 6:11:25 PM] (Arun Suresh) YARN-6932. Fix 
TestFederationRMFailoverProxyProvider test case failure.




-1 overall


The following subsystems voted -1:
asflicense unit xml


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

Unreaped Processes :

   hadoop-common:1 
   hadoop-hdfs:25 
   bkjournal:5 
   hadoop-mapreduce-client-jobclient:7 
   hadoop-archives:1 
   hadoop-distcp:3 
   hadoop-extras:1 
   hadoop-yarn-applications-distributedshell:1 
   hadoop-yarn-client:4 
   hadoop-yarn-server-timelineservice:1 

Failed junit tests :

   hadoop.minikdc.TestMiniKdc 
   hadoop.hdfs.server.namenode.TestParallelImageWrite 
   hadoop.hdfs.TestRestartDFS 
   hadoop.hdfs.TestLeaseRecovery 
   hadoop.hdfs.TestSecureEncryptionZoneWithKMS 
   hadoop.tools.TestIntegration 
   hadoop.tools.TestDistCpViewFs 
   hadoop.resourceestimator.service.TestResourceEstimatorService 
   hadoop.resourceestimator.solver.impl.TestLpSolver 
   TEST-cetest 

Timed out junit tests :

   org.apache.hadoop.log.TestLogLevel 
   org.apache.hadoop.hdfs.TestModTime 
   org.apache.hadoop.hdfs.server.namenode.TestDefaultBlockPlacementPolicy 
   org.apache.hadoop.hdfs.server.namenode.TestSecondaryNameNodeUpgrade 
   org.apache.hadoop.hdfs.server.namenode.TestFileContextAcl 
   org.apache.hadoop.fs.TestEnhancedByteBufferAccess 
   org.apache.hadoop.hdfs.TestDataTransferKeepalive 
   org.apache.hadoop.hdfs.server.namenode.TestQuotaByStorageType 
   org.apache.hadoop.hdfs.TestFileAppend4 
   org.apache.hadoop.hdfs.server.namenode.TestNameNodeRespectsBindHostKeys 
   org.apache.hadoop.hdfs.TestDFSPermission 
   org.apache.hadoop.hdfs.TestDatanodeStartupFixesLegacyStorageIDs 
   org.apache.hadoop.hdfs.TestFileAppendRestart 
   org.apache.hadoop.hdfs.TestFileCreationDelete 
   org.apache.hadoop.hdfs.server.namenode.TestNameNodeMXBean 
   org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitCache 
   org.apache.hadoop.hdfs.TestFileConcurrentReader 
   org.apache.hadoop.metrics2.sink.TestRollingFileSystemSinkWithHdfs 
   org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitLocalRead 
   org.apache.hadoop.hdfs.TestFSOutputSummer 
   org.apache.hadoop.hdfs.server.namenode.TestDeleteRace 
   org.apache.hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency 
   org.apache.hadoop.hdfs.TestEncryptedTransfer 
   org.apache.hadoop.fs.TestHDFSFileContextMainOperations 
   org.apache.hadoop.contrib.bkjournal.TestBookKeeperHACheckpoints 
   org.apache.hadoop.contrib.bkjournal.TestBookKeeperAsHASharedDir 
   org.apache.hadoop.contrib.bkjournal.TestBookKeeperSpeculativeRead 
   org.apache.hadoop.contrib.bkjournal.TestCurrentInprogress 
   org.apache.hadoop.contrib.bkjournal.TestBookKeeperConfiguration 
   org.apache.hadoop.mapred.TestMiniMRClasspath 
   org.apache.hadoop.mapred.TestMRCJCFileInputFormat 
   org.apache.hadoop.mapred.TestClusterMapReduceTestCase 
   org.apache.hadoop.mapred.TestMRTimelineEventHandling 
   org.apache.hadoop.mapred.TestJobName 
   org.apache.hadoop.mapred.TestMiniMRClientCluster 
   org.apache.hadoop.mapred.TestMROpportunisticMaps 
   org.apache.hadoop.tools.TestHadoopArchives 
   org.apache.hadoop.tools.TestDistCpSync 
   org.apache.hadoop.tools.TestDistCpSyncReverseFromTarget 
   org.apache.hadoop.tools.TestDistCpSyncReverseFromSource 
   org.apache.hadoop.tools.TestCopyFiles 
   
org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell 
   org.apache.hadoop.yarn.client.TestRMFailover 
   org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA 
   org.apache.hadoop.yarn.client.api.impl.TestYarnClient 
   org.apache.hadoop.yarn.client.api.impl.TestAMRMClient 
   
org.apache.hadoop.yarn.server.timelineservice.reader.TestTimelineReaderWebServices
 
  

   cc:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/29/artifact/out/diff-compile-cc-root.txt
  [4.0K]

   javac:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/29/artifact/out/diff-compile-javac-root.txt
  [324K]

   checkstyle:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/29/artifact/out/diff-checkstyle-root.txt
  [16M]

   pylint:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/29/artifact/out/diff-patch-pylint.txt
  [20K]

   shellcheck:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/29/artifact/out/diff-patch-shellcheck.txt