Apache Hadoop qbt Report: trunk+JDK9 on Linux/x86

2017-10-20 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/2/

[Oct 20, 2017 9:24:17 PM] (haibochen) YARN-7372.
[Oct 20, 2017 9:27:04 PM] (stevel) HADOOP-14942. DistCp#cleanup() should check 
whether jobFS is null.
[Oct 20, 2017 10:54:15 PM] (wangda) YARN-7318. Fix shell check warnings of SLS. 
(Gergely Novák via wangda)
[Oct 20, 2017 11:13:41 PM] (jzhuge) HADOOP-14954. MetricsSystemImpl#init should 
increment refCount when
[Oct 20, 2017 11:25:04 PM] (xiao) HDFS-12518. Re-encryption should handle task 
cancellation and progress




-1 overall


The following subsystems voted -1:
compile findbugs mvninstall mvnsite shadedclient unit


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


Specific tests:

Failed junit tests :

   hadoop.security.authentication.client.TestKerberosAuthenticator
   hadoop.security.authentication.util.TestKerberosName
   hadoop.security.authentication.server.TestAltKerberosAuthenticationHandler
   hadoop.minikdc.TestChangeOrgNameAndDomain
   hadoop.minikdc.TestMiniKdc

   mvninstall:

   https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/2/artifact/out/patch-mvninstall-root.txt [1.7M]

   compile:

   https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/2/artifact/out/patch-compile-root.txt [48K]

   cc:

   https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/2/artifact/out/patch-compile-root.txt [48K]

   javac:

   https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/2/artifact/out/patch-compile-root.txt [48K]

   checkstyle:

   https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/2/artifact/out/diff-checkstyle-root.txt [3.1M]

   mvnsite:

   https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/2/artifact/out/patch-mvnsite-root.txt [40K]

   pylint:

   https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/2/artifact/out/diff-patch-pylint.txt [20K]

   shellcheck:

   https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/2/artifact/out/diff-patch-shellcheck.txt [20K]

   shelldocs:

   https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/2/artifact/out/diff-patch-shelldocs.txt [12K]

   whitespace:

   https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/2/artifact/out/whitespace-eol.txt [8.5M]
   https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/2/artifact/out/whitespace-tabs.txt [292K]

   findbugs:

   https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/2/artifact/out/branch-findbugs-hadoop-common-project_hadoop-annotations.txt [276K]
   https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/2/artifact/out/branch-findbugs-hadoop-common-project_hadoop-auth.txt [8.0K]
   https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/2/artifact/out/branch-findbugs-hadoop-common-project_hadoop-auth-examples.txt [8.0K]
   https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/2/artifact/out/branch-findbugs-hadoop-common-project_hadoop-common.txt [12K]
   https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/2/artifact/out/branch-findbugs-hadoop-common-project_hadoop-kms.txt [8.0K]
   https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/2/artifact/out/branch-findbugs-hadoop-common-project_hadoop-minikdc.txt [8.0K]
   https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/2/artifact/out/branch-findbugs-hadoop-common-project_hadoop-nfs.txt [4.0K]
   https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/2/artifact/out/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs.txt [12K]
   https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/2/artifact/out/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-client.txt [16K]
   https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/2/artifact/out/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-httpfs.txt [8.0K]
   https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/2/artifact/out/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-nfs.txt [8.0K]
   https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/2/artifact/out/branch-findbugs-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-app.txt [36K]
   https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/2/artifact/out/branch-findbugs-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-common.txt [12K]
   https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/2/artifact/out/branch-findbugs-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core.txt [12K]
   

Re: [DISCUSS] Feature Branch Merge and Security Audits

2017-10-20 Thread larry mccay
Hi Eric -

Thanks for the additional item suggestions!

"We might want to start a security section for Hadoop wiki for each of the
services and components.
This helps to track what has been completed."

Do you mean to keep the audit checklist for each service and component
there?
Interesting idea; I wonder what sort of maintenance that implies and
whether we want to take on that burden, even though it would be great
information to have for future reviewers.

"How do we want to enforce security completeness?  Most features will not
meet all security requirements on merge day."

This is a really important question and point.
Maybe we should have started with goals and intents before the actual list.

My high-level goals:

1. To have a holistic idea of what a given feature (or merge) is bringing
to the table in terms of attack surface
2. To understand the level of security that is intended for the feature in
its end state (GA)
3. To fully understand the stated level of security that is in place at the
time of each merge
4. To ensure that a merge meets some minimal bar for not adding security
vulnerabilities to deployments of a release or even builds from trunk. Not
the least of which is whether it is enabled by default and what it means to
disable it.
5. To be as unobtrusive to the branch committers as possible while still
communicating what we need for security review.
6. To have a reasonable checklist of security concerns that may or may not
apply to each merge but should be at least thought about in the final
security model design for the particular feature.

I think that feature merges often span multiple branch merges, with
security, like other aspects of the feature, coming in phases.
This intent should maybe be part of the checklist itself so that we can
assess the audit with the level of scrutiny appropriate for the current
merge.

I will work on another revision of the list and incorporate your
suggestions as well.

thanks!

--larry

On Fri, Oct 20, 2017 at 7:42 PM, Eric Yang  wrote:

> The checklist looks good.  Some more items to add:
>
> Kerberos
>   TGT renewal
>   SPNEGO support
>   Delegation token
> Proxy User ACL
>
> CVE tracking list
>
> We might want to start a security section in the Hadoop wiki for each of the
> services and components.
> This helps to track what has been completed.
>
> How do we want to enforce security completeness?  Most features will not
> meet all security requirements on merge day.
>
> Regards,
> Eric
>
> On 10/20/17, 12:41 PM, "larry mccay"  wrote:
>
> Adding security@hadoop list as well...
>
> On Fri, Oct 20, 2017 at 2:29 PM, larry mccay 
> wrote:
>
> > All -
> >
> > Given the maturity of Hadoop at this point, I would like to propose
> that
> > we start doing explicit security audits of features at merge time.
> >
> > There are a few reasons that I think this is a good place/time to do the
> > review:
> >
> > 1. It represents a specific snapshot of where the feature stands as a
> > whole. This means that we can more easily identify the attack surface of a
> > given feature.
> > 2. We can identify any security gaps that need to be fixed before a
> > release that carries the feature can be considered ready.
> > 3. We - in extreme cases - can block a feature from merging until some
> > baseline of security coverage is achieved.
> > 4. The folks that are interested and able to review security aspects can't
> > scale for every iteration over every JIRA but can review the checklist and
> > follow pointers for specific areas of interest.
> >
> > I have provided an impromptu security audit checklist on the DISCUSS
> > thread for merging Ozone - HDFS-7240 - into trunk.
> >
> > I don't want to pick on it particularly but I think it is a good way to
> > bootstrap this audit process and figure out how to incorporate it without
> > being too intrusive.
> >
> > The questions that I provided below are a mix of general questions that
> > could be on a standard checklist that you provide along with the merge
> > thread and some that are specific to what I read about Ozone in the
> > excellent docs provided. So, we should consider some subset of the
> > following as a proposal for a general checklist.
> >
> > Perhaps a shared document can be created to iterate over the list to
> > fine-tune it?
> >
> > Any thoughts on this, any additional datapoints to collect, etc.?
> >
> > thanks!
> >
> > --larry
> >
> > 1. UIs
> > I see there are at least two UIs - Storage Container Manager and Key Space
> > Manager. There are a number of typical vulnerabilities that we find in UIs.
> >
> > 1.1. What sort of validation is being done on any accepted user input?
> > (pointers to code would be appreciated)
> > 1.2. What explicit 

2.9.0 status update (10/20/2017)

2017-10-20 Thread Subru Krishnan
Today was the feature freeze date and we are glad to report that all the
major planned features have been merged into branch-2:
https://cwiki.apache.org/confluence/display/HADOOP/Roadmap#Roadmap-Plannedfeatures

Kudos to everyone who pulled through multiple blockers and made this
happen. Special shoutout to Vrushali, Varun (both :)), Wangda, Inigo,
Sunil, and Jonathan.

I have set up a nightly build for branch-2 (hopefully):
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/

We have 5 blockers and 10 other JIRAs in development that have to be
addressed by *27th October 2017*, following which we plan to cut branch-2.9:
Blockers - https://issues.apache.org/jira/issues/?filter=12342048
WIP JIRAs - https://issues.apache.org/jira/issues/?filter=12342468

We'll be following up on each of the above JIRAs individually next week;
let's make sure that we complete them by next Friday.

-Subru/Arun


Re: [DISCUSS] Feature Branch Merge and Security Audits

2017-10-20 Thread Eric Yang
The checklist looks good.  Some more items to add:

Kerberos
  TGT renewal
  SPNEGO support
  Delegation token
Proxy User ACL

CVE tracking list
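
For reviewers less familiar with these items, here is a minimal sketch of
what TGT renewal and proxy-user (doAs) handling look like against Hadoop's
UserGroupInformation API. The principal, keytab path, and user names below
are placeholders, not anything from a real deployment:

  import java.security.PrivilegedExceptionAction;
  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileStatus;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.security.UserGroupInformation;

  public class KerberosChecklistSketch {
    public static void main(String[] args) throws Exception {
      Configuration conf = new Configuration();
      conf.set("hadoop.security.authentication", "kerberos");
      UserGroupInformation.setConfiguration(conf);

      // Log in from a keytab to obtain a TGT.
      UserGroupInformation.loginUserFromKeytab(
          "service/host@EXAMPLE.COM", "/etc/security/keytabs/service.keytab");
      UserGroupInformation ugi = UserGroupInformation.getLoginUser();

      // TGT renewal: long-running services call this periodically so the
      // ticket is re-acquired before it expires.
      ugi.checkTGTAndReloginFromKeytab();

      // Proxy user ACL / trusted proxy: act on behalf of an end user. The
      // doAs only succeeds if the hadoop.proxyuser.<service>.hosts/groups
      // settings authorize this principal to impersonate "alice".
      UserGroupInformation proxyUgi =
          UserGroupInformation.createProxyUser("alice", ugi);
      FileStatus[] listing = proxyUgi.doAs(
          (PrivilegedExceptionAction<FileStatus[]>) () ->
              FileSystem.get(conf).listStatus(new Path("/user/alice")));
      System.out.println("entries: " + listing.length);
    }
  }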

We might want to start a security section in the Hadoop wiki for each of the 
services and components.
This helps to track what has been completed.

How do we want to enforce security completeness?  Most features will not meet 
all security requirements on merge day.

Regards,
Eric

On 10/20/17, 12:41 PM, "larry mccay"  wrote:

Adding security@hadoop list as well...

On Fri, Oct 20, 2017 at 2:29 PM, larry mccay  wrote:

> All -
>
> Given the maturity of Hadoop at this point, I would like to propose that
> we start doing explicit security audits of features at merge time.
>
> There are a few reasons that I think this is a good place/time to do the
> review:
>
> 1. It represents a specific snapshot of where the feature stands as a
> whole. This means that we can more easily identify the attack surface of a
> given feature.
> 2. We can identify any security gaps that need to be fixed before a
> release that carries the feature can be considered ready.
> 3. We - in extreme cases - can block a feature from merging until some
> baseline of security coverage is achieved.
> 4. The folks that are interested and able to review security aspects can't
> scale for every iteration over every JIRA but can review the checklist and
> follow pointers for specific areas of interest.
>
> I have provided an impromptu security audit checklist on the DISCUSS
> thread for merging Ozone - HDFS-7240 - into trunk.
>
> I don't want to pick on it particularly but I think it is a good way to
> bootstrap this audit process and figure out how to incorporate it without
> being too intrusive.
>
> The questions that I provided below are a mix of general questions that
> could be on a standard checklist that you provide along with the merge
> thread and some that are specific to what I read about Ozone in the
> excellent docs provided. So, we should consider some subset of the
> following as a proposal for a general checklist.
>
> Perhaps a shared document can be created to iterate over the list to
> fine-tune it?
>
> Any thoughts on this, any additional datapoints to collect, etc.?
>
> thanks!
>
> --larry
>
> 1. UIs
> I see there are at least two UIs - Storage Container Manager and Key Space
> Manager. There are a number of typical vulnerabilities that we find in UIs.
>
> 1.1. What sort of validation is being done on any accepted user input?
> (pointers to code would be appreciated)
> 1.2. What explicit protections have been built in for (pointers to code
> would be appreciated):
>   1.2.1. cross site scripting
>   1.2.2. cross site request forgery
>   1.2.3. clickjacking (X-Frame-Options)
> 1.3. What sort of authentication is required for access to the UIs?
> 1.4. What authorization is available for determining who can access what
> capabilities of the UIs for either viewing, modifying data or affecting
> object stores and related processes?
> 1.5. Are the UIs built with proxying in mind by leveraging X-Forwarded
> headers?
> 1.6. Is there any input that will ultimately be persisted in configuration
> for executing shell commands or processes?
> 1.7. Do the UIs support the trusted proxy pattern with doAs impersonation?
> 1.8. Is there TLS/SSL support?
>
> 2. REST APIs
>
> 2.1. Do the REST APIs support the trusted proxy pattern with doAs
> impersonation capabilities?
> 2.2. What explicit protections have been built in for:
>   2.2.1. cross site scripting (XSS)
>   2.2.2. cross site request forgery (CSRF)
>   2.2.3. XML External Entity (XXE)
> 2.3. What is being used for authentication - the Hadoop Auth Module?
> 2.4. Are there separate processes for the HTTP resources (UIs and REST
> endpoints) or are they part of existing HDFS processes?
> 2.5. Is there TLS/SSL support?
> 2.6. Are there new CLI commands and/or clients for accessing the REST APIs?
> 2.7. Bucket Level API allows for setting of ACLs on a bucket - what
> authorization is required here - is there a restrictive ACL set on
> creation?
> 2.8. Bucket Level API allows for deleting a bucket - I assume this is
> dependent on ACL-based access control?
> 2.9. Bucket Level API to list bucket returns up to 1000 keys - is there
> paging available?
> 2.10. Storage Level APIs indicate “Signed with User Authorization” - what
> does this refer to exactly?
> 2.11. Object Level APIs indicate that there is no ACL support and only
> bucket owners can read and write - but there are ACL APIs at the Bucket
> Level - are they meaningless for now?
> 2.12. How does a REST 

[jira] [Resolved] (HADOOP-14961) Docker failed to build yetus/hadoop:0de40f0: Oracle JDK 8 is NOT installed

2017-10-20 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved HADOOP-14961.
---
Resolution: Fixed

> Docker failed to build yetus/hadoop:0de40f0: Oracle JDK 8 is NOT installed
> --
>
> Key: HADOOP-14961
> URL: https://issues.apache.org/jira/browse/HADOOP-14961
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: build, test
>Affects Versions: 3.1.0
>Reporter: John Zhuge
>
> https://builds.apache.org/job/PreCommit-HADOOP-Build/13546/console 
> {noformat} 
> Downloading Oracle Java 8... 
> --2017-10-18 18:28:11-- 
> http://download.oracle.com/otn-pub/java/jdk/8u144-b01/090f390dda5b47b9b721c7dfaa008135/jdk-8u144-linux-x64.tar.gz
>  
> Resolving download.oracle.com (download.oracle.com)... 
> 23.59.190.131, 23.59.190.130 
> Connecting to download.oracle.com (download.oracle.com)|23.59.190.131|:80... 
> connected. 
> HTTP request sent, awaiting response... 302 Moved Temporarily 
> Location: 
> https://edelivery.oracle.com/otn-pub/java/jdk/8u144-b01/090f390dda5b47b9b721c7dfaa008135/jdk-8u144-linux-x64.tar.gz
>  [following] 
> --2017-10-18 18:28:11-- 
> https://edelivery.oracle.com/otn-pub/java/jdk/8u144-b01/090f390dda5b47b9b721c7dfaa008135/jdk-8u144-linux-x64.tar.gz
>  
> Resolving edelivery.oracle.com (edelivery.oracle.com)... 
> 23.39.16.136, 2600:1409:a:39c::2d3e, 2600:1409:a:39e::2d3e 
> Connecting to edelivery.oracle.com 
> (edelivery.oracle.com)|23.39.16.136|:443... connected. 
> HTTP request sent, awaiting response... 302 Moved 
> Temporarily 
> Location: 
> http://download.oracle.com/otn-pub/java/jdk/8u144-b01/090f390dda5b47b9b721c7dfaa008135/jdk-8u144-linux-x64.tar.gz?AuthParam=1508351411_3d448519d55b9741af15953ef5049a7c
>  [following] 
> --2017-10-18 18:28:11-- 
> http://download.oracle.com/otn-pub/java/jdk/8u144-b01/090f390dda5b47b9b721c7dfaa008135/jdk-8u144-linux-x64.tar.gz?AuthParam=1508351411_3d448519d55b9741af15953ef5049a7c
>  
> Connecting to download.oracle.com (download.oracle.com)|23.59.190.131|:80... 
> connected. 
> HTTP request sent, awaiting response... 404 Not Found 
> 2017-10-18 18:28:12 ERROR 404: Not Found. 
> download failed 
> Oracle JDK 8 is NOT installed. 
> {noformat}
> Looks like Oracle JDK 8u144 is no longer available for download using that 
> link. 8u151 and 8u152 are available.
> Many of the last 10 https://builds.apache.org/job/PreCommit-HADOOP-Build/ jobs 
> failed the same way, all on build hosts H1 and H6.
> [~aw] has a patch available in HADOOP-14816 "Update Dockerfile to use Xenial" 
> for a long term fix.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14971) Merge S3A committers into trunk

2017-10-20 Thread Steve Loughran (JIRA)
Steve Loughran created HADOOP-14971:
---

 Summary: Merge S3A committers into trunk
 Key: HADOOP-14971
 URL: https://issues.apache.org/jira/browse/HADOOP-14971
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Affects Versions: 3.0.0
Reporter: Steve Loughran
Assignee: Steve Loughran


Merge the HADOOP-13786 committer into trunk. This branch is being set up as a 
github PR for review there & to keep it out of the mailboxes of the watchers 
on the main JIRA.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: Re: [DISCUSSION] Merging HDFS-7240 Object Store (Ozone) to trunk

2017-10-20 Thread larry mccay
All -

I broke this list of questions out into a separate DISCUSS thread where we
can iterate over how a security audit process at merge time might look and
whether it is even something that we want to take on.

I will try and continue discussion on that thread and drive that to some
conclusion before bringing it into any particular merge discussion.

thanks,

--larry

On Fri, Oct 20, 2017 at 12:37 PM, larry mccay  wrote:

> I previously sent this same email from my work email and it doesn't seem
> to have gone through - resending from my apache account (apologizing up
> front for the length)
>
> For such sizable merges in Hadoop, I would like to start doing security
> audits in order to have an initial idea of the attack surface, the
> protections available for known threats, what sort of configuration is
> being used to launch processes, etc.
>
> I dug into the architecture documents while in the middle of this list -
> nice docs!
> I do intend to try and make a generic checklist like this for such
> security audits in the future, so a lot of this comes from that effort,
> but I also tried to direct specific questions from those docs as well.
>
> 1. UIs
> I see there are at least two UIs - Storage Container Manager and Key Space
> Manager. There are a number of typical vulnerabilities that we find in UIs.
>
> 1.1. What sort of validation is being done on any accepted user input?
> (pointers to code would be appreciated)
> 1.2. What explicit protections have been built in for (pointers to code
> would be appreciated):
>   1.2.1. cross site scripting
>   1.2.2. cross site request forgery
>   1.2.3. clickjacking (X-Frame-Options)
> 1.3. What sort of authentication is required for access to the UIs?
> 1.4. What authorization is available for determining who can access what
> capabilities of the UIs for either viewing, modifying data or affecting
> object stores and related processes?
> 1.5. Are the UIs built with proxying in mind by leveraging X-Forwarded
> headers?
> 1.6. Is there any input that will ultimately be persisted in configuration
> for executing shell commands or processes?
> 1.7. Do the UIs support the trusted proxy pattern with doAs impersonation?
> 1.8. Is there TLS/SSL support?
>
> 2. REST APIs
>
> 2.1. Do the REST APIs support the trusted proxy pattern with doAs
> impersonation capabilities?
> 2.2. What explicit protections have been built in for:
>   2.2.1. cross site scripting (XSS)
>   2.2.2. cross site request forgery (CSRF)
>   2.2.3. XML External Entity (XXE)
> 2.3. What is being used for authentication - the Hadoop Auth Module?
> 2.4. Are there separate processes for the HTTP resources (UIs and REST
> endpoints) or are they part of existing HDFS processes?
> 2.5. Is there TLS/SSL support?
> 2.6. Are there new CLI commands and/or clients for accessing the REST APIs?
> 2.7. Bucket Level API allows for setting of ACLs on a bucket - what
> authorization is required here - is there a restrictive ACL set on creation?
> 2.8. Bucket Level API allows for deleting a bucket - I assume this is
> dependent on ACL-based access control?
> 2.9. Bucket Level API to list bucket returns up to 1000 keys - is there
> paging available?
> 2.10. Storage Level APIs indicate “Signed with User Authorization” - what
> does this refer to exactly?
> 2.11. Object Level APIs indicate that there is no ACL support and only
> bucket owners can read and write - but there are ACL APIs at the Bucket
> Level - are they meaningless for now?
> 2.12. How does a REST client know which Ozone Handler to connect to or am
> I missing some well-known NN-type endpoint in the architecture doc
> somewhere?
>
> 3. Encryption
>
> 3.1. Is there any support for encryption of persisted data?
> 3.2. If so, are KMS and the hadoop key command used for key management?
>
> 4. Configuration
>
> 4.1. Are there any passwords or secrets being added to configuration?
> 4.2. If so, are they accessed via Configuration.getPassword() to allow for
> provisioning in credential providers?
> 4.3. Are there any settings that are used to launch docker containers or
> shell out any commands, etc.?
>
> 5. HA
>
> 5.1. Are there provisions for HA?
> 5.2. Are we leveraging the existing HA capabilities in HDFS?
> 5.3. Is the Storage Container Manager a SPOF?
> 5.4. I see HA listed as future work in the architecture doc - is this
> still an open issue?
>
> On Fri, Oct 20, 2017 at 11:19 AM, Anu Engineer 
> wrote:
>
>> Hi Steve,
>>
>> In addition to everything Weiwei mentioned (chapter 3 of user guide), if
>> you really want to drill down into the REST protocol, you might want to apply this
>> patch and build ozone.
>>
>> https://issues.apache.org/jira/browse/HDFS-12690
>>
>> This will generate an Open API (https://www.openapis.org ,
>> http://swagger.io) based specification which can be accessed from KSM UI
>> or just as a JSON file.
>> Unfortunately, this patch is still at code review stage, so you will have
>> to apply the patch and build it yourself.

Re: [VOTE] Release Apache Hadoop 2.8.2 (RC1)

2017-10-20 Thread Wei Yan
Thanks, Junping.

+1 (non-binding)

- Built from hadoop-2.8.2-src.tar.gz (skipped tests) and also tried
hadoop-2.8.2.tar.gz
- Setup a 30-node HDFS cluster
- Run some basic HDFS/webhdfs commands
(read/write/mkdirs/listStatus/getFileInfo)
- Check NN Web UI and operations through the UI
- Run a simple Spark job (using another version of YARN) to read/write
Parquet files
- Run a large spark job with 5000 executors to do HDFS operations and
verify FairCallQueue performance
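
For anyone repeating this verification, here is a minimal sketch of the
basic HDFS/webhdfs checks above using the stock FileSystem API; the
namenode host and paths are placeholders:

  import java.net.URI;
  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FSDataOutputStream;
  import org.apache.hadoop.fs.FileStatus;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  public class WebHdfsSmoke {
    public static void main(String[] args) throws Exception {
      // webhdfs:// goes through the NN HTTP port (50070 by default in 2.x).
      FileSystem fs = FileSystem.get(
          URI.create("webhdfs://nn.example.com:50070"), new Configuration());

      Path dir = new Path("/tmp/rc-smoke");
      fs.mkdirs(dir);                                   // mkdirs
      Path file = new Path(dir, "hello.txt");
      try (FSDataOutputStream out = fs.create(file)) {  // write
        out.writeUTF("hello 2.8.2-RC1");
      }
      for (FileStatus st : fs.listStatus(dir)) {        // listStatus
        System.out.println(st.getPath() + " " + st.getLen());
      }
      System.out.println(fs.getFileStatus(file));       // getFileInfo
      fs.delete(dir, true);                             // cleanup
    }
  }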


On Fri, Oct 20, 2017 at 1:48 PM, Hanisha Koneru 
wrote:

> Hi Junping,
>
> Thanks for preparing the 2.8.2-RC1 release.
>
> Verified the following:
> - Built from source on Mac OS X 10.11.6 with Java 1.7.0_79
> - Deployed binary to a 3-node docker cluster
> - Sanity checks
> - Basic dfs operations
> - MapReduce Wordcount & Grep
>
>
> +1 (non-binding)
>
>
>
> Thanks,
> Hanisha
>
>
>
>
>
>
>
>
> On 10/19/17, 5:42 PM, "Junping Du"  wrote:
>
> >Hi folks,
> > I've created our new release candidate (RC1) for Apache Hadoop 2.8.2.
> >
> > Apache Hadoop 2.8.2 is the first stable release of the Hadoop 2.8 line
> > and will be the latest stable/production release for Apache Hadoop - it
> > includes 315 new fixed issues since 2.8.1, and 69 fixes are marked as
> > blocker/critical issues.
> >
> >  More information about the 2.8.2 release plan can be found here:
> > https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+2.8+Release
> >
> >  New RC is available at:
> > http://home.apache.org/~junping_du/hadoop-2.8.2-RC1
> >
> >  The RC tag in git is: release-2.8.2-RC1, and the latest commit id
> > is: 66c47f2a01ad9637879e95f80c41f798373828fb
> >
> >  The maven artifacts are available via repository.apache.org at:
> > https://repository.apache.org/content/repositories/orgapachehadoop-1064
> >
> >  Please try the release and vote; the vote will run for the usual 5
> > days, ending on 10/24/2017 6pm PST.
> >
> >Thanks,
> >
> >Junping
> >
>


Re: [VOTE] Release Apache Hadoop 2.8.2 (RC1)

2017-10-20 Thread Hanisha Koneru
Hi Junping,

Thanks for preparing the 2.8.2-RC1 release.

Verified the following:
- Built from source on Mac OS X 10.11.6 with Java 1.7.0_79
- Deployed binary to a 3-node docker cluster
- Sanity checks
- Basic dfs operations
- MapReduce Wordcount & Grep


+1 (non-binding)



Thanks,
Hanisha








On 10/19/17, 5:42 PM, "Junping Du"  wrote:

>Hi folks,
> I've created our new release candidate (RC1) for Apache Hadoop 2.8.2.
>
> Apache Hadoop 2.8.2 is the first stable release of the Hadoop 2.8 line and 
> will be the latest stable/production release for Apache Hadoop - it includes 
> 315 new fixed issues since 2.8.1, and 69 fixes are marked as blocker/critical 
> issues.
>
>  More information about the 2.8.2 release plan can be found here: 
> https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+2.8+Release
>
>  New RC is available at: 
> http://home.apache.org/~junping_du/hadoop-2.8.2-RC1
>
>  The RC tag in git is: release-2.8.2-RC1, and the latest commit id is: 
> 66c47f2a01ad9637879e95f80c41f798373828fb
>
>  The maven artifacts are available via 
> repository.apache.org at: 
> https://repository.apache.org/content/repositories/orgapachehadoop-1064
>
>  Please try the release and vote; the vote will run for the usual 5 days, 
> ending on 10/24/2017 6pm PST.
>
>Thanks,
>
>Junping
>


2017-10-20 Hadoop 3 release status update

2017-10-20 Thread Andrew Wang
https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+3+release+status+updates

2017-10-20

Apologies for skipping the update last week. Here's how we're tracking for
GA.

Highlights:

   - Merge of HDFS router-based federation and API-based scheduler
   configuration with no reported problems. Kudos to the contributors involved!

Red flags:

   - We're making a last-minute push to get resource types (but not
   resource profiles) in. Coming this late, it's a risk, but we decided it's
   worthwhile for this feature. See Daniel's yarn-dev email for the full
   rationale.
   - Still uncovering EC bugs from testing

Previously tracked GA blockers that have been resolved or dropped:

   - YARN-6623 - Add support to turn off launching privileged containers in
     the container-executor. RESOLVED: Committed and resolved.
   - Change of ExecutionType
      - YARN-7275 - NM Statestore cleanup for Container updates. RESOLVED:
        Patch committed, resolved.
   - ReservationSystem
      - YARN-4859 - [Bug] Unable to submit a job to a reservation when using
        FairScheduler. RESOLVED: Yufei tested this and found things mostly
        worked, filed two non-blocker follow-ons: YARN-7347 - Fix the bug in
        Fair Scheduler to handle a queue named "root.root" (OPEN) and
        YARN-7348 - Ignore the vcore in reservation request for fair policy
        queue (OPEN).

GA blockers:

   - Change of ExecutionType
      - YARN-7178 - Add documentation for Container Update API. OPEN: Still
        no update from Arun, I pinged it.
   - ReservationSystem
      - YARN-4827 - Document configuration of ReservationSystem for
        FairScheduler. OPEN: Yufei said he'd work on it as of 2 days ago.
   - Rolling upgrade
      - YARN-6142 - Support rolling upgrade between 2.x and 3.x. OPEN: I
        pinged this and asked for a status update.
      - HDFS-11096 - Support rolling upgrade between 2.x and 3.x. PATCH
        AVAILABLE: I pinged this and asked for a status update.
   - Erasure coding
      - HDFS-12682 - ECAdmin -listPolicies will always show policy state as
        DISABLED. OPEN: New blocker filed this week, Xiao is working on it.
      - HDFS-12686 - Erasure coding system policy state is not correctly
        saved and loaded during real cluster restart. OPEN: New blocker
        filed this week, Sammi is on it.
      - HDFS-12686 - Erasure coding system policy state is not correctly
        saved and loaded during real cluster restart. OPEN: Old blocker,
        Huafeng is on it, waiting on review from Wei-Chiu or Sammi.

Features merged for GA:

   - Erasure coding
      - Continued bug reporting and fixing based on testing at Cloudera.
      - Two new blockers filed this week, mentioned above.
      - Huafeng completed a patch to reenable disabled EC tests.
   - Classpath isolation (HADOOP-11656)
      - HADOOP-13916 - Document how downstream clients should make use of
        the new shaded client artifacts. IN PROGRESS: I pinged it.
   - Compat guide (HADOOP-13714)
      - HADOOP-14876 - Create downstream developer docs from the
        compatibility guidelines. PATCH AVAILABLE: Daniel has a patch up,
        revved based on Steve's review feedback, waiting on Steve's reply.
      - HADOOP-14875 - Create end user documentation from the compatibility
        guidelines. OPEN: No patch yet.
   - TSv2 alpha 2
      - This was merged, no problems thus far. :)
   - API-based scheduler configuration
      - YARN-5734 - OrgQueue for easy CapacityScheduler queue configuration
        management. RESOLVED: Merged, no problems thus far. :)
   - HDFS router-based federation HDFS-10467

Re: [DISCUSS] Feature Branch Merge and Security Audits

2017-10-20 Thread larry mccay
Adding security@hadoop list as well...

On Fri, Oct 20, 2017 at 2:29 PM, larry mccay  wrote:

> All -
>
> Given the maturity of Hadoop at this point, I would like to propose that
> we start doing explicit security audits of features at merge time.
>
> There are a few reasons that I think this is a good place/time to do the
> review:
>
> 1. It represents a specific snapshot of where the feature stands as a
> whole. This means that we can more easily identify the attack surface of a
> given feature.
> 2. We can identify any security gaps that need to be fixed before a
> release that carries the feature can be considered ready.
> 3. We - in extreme cases - can block a feature from merging until some
> baseline of security coverage is achieved.
> 4. The folks that are interested and able to review security aspects can't
> scale for every iteration over every JIRA but can review the checklist and
> follow pointers for specific areas of interest.
>
> I have provided an impromptu security audit checklist on the DISCUSS
> thread for merging Ozone - HDFS-7240 - into trunk.
>
> I don't want to pick on it particularly but I think it is a good way to
> bootstrap this audit process and figure out how to incorporate it without
> being too intrusive.
>
> The questions that I provided below are a mix of general questions that
> could be on a standard checklist that you provide along with the merge
> thread and some that are specific to what I read about Ozone in the
> excellent docs provided. So, we should consider some subset of the
> following as a proposal for a general checklist.
>
> Perhaps a shared document can be created to iterate over the list to
> fine-tune it?
>
> Any thoughts on this, any additional datapoints to collect, etc.?
>
> thanks!
>
> --larry
>
> 1. UIs
> I see there are at least two UIs - Storage Container Manager and Key Space
> Manager. There are a number of typical vulnerabilities that we find in UIs.
>
> 1.1. What sort of validation is being done on any accepted user input?
> (pointers to code would be appreciated)
> 1.2. What explicit protections have been built in for (pointers to code
> would be appreciated):
>   1.2.1. cross site scripting
>   1.2.2. cross site request forgery
>   1.2.3. clickjacking (X-Frame-Options)
> 1.3. What sort of authentication is required for access to the UIs?
> 1.4. What authorization is available for determining who can access what
> capabilities of the UIs for either viewing, modifying data or affecting
> object stores and related processes?
> 1.5. Are the UIs built with proxying in mind by leveraging X-Forwarded
> headers?
> 1.6. Is there any input that will ultimately be persisted in configuration
> for executing shell commands or processes?
> 1.7. Do the UIs support the trusted proxy pattern with doAs impersonation?
> 1.8. Is there TLS/SSL support?
>
> 2. REST APIs
>
> 2.1. Do the REST APIs support the trusted proxy pattern with doAs
> impersonation capabilities?
> 2.2. What explicit protections have been built in for:
>   2.2.1. cross site scripting (XSS)
>   2.2.2. cross site request forgery (CSRF)
>   2.2.3. XML External Entity (XXE)
> 2.3. What is being used for authentication - the Hadoop Auth Module?
> 2.4. Are there separate processes for the HTTP resources (UIs and REST
> endpoints) or are they part of existing HDFS processes?
> 2.5. Is there TLS/SSL support?
> 2.6. Are there new CLI commands and/or clients for accessing the REST APIs?
> 2.7. Bucket Level API allows for setting of ACLs on a bucket - what
> authorization is required here - is there a restrictive ACL set on creation?
> 2.8. Bucket Level API allows for deleting a bucket - I assume this is
> dependent on ACL-based access control?
> 2.9. Bucket Level API to list bucket returns up to 1000 keys - is there
> paging available?
> 2.10. Storage Level APIs indicate “Signed with User Authorization” - what
> does this refer to exactly?
> 2.11. Object Level APIs indicate that there is no ACL support and only
> bucket owners can read and write - but there are ACL APIs at the Bucket
> Level - are they meaningless for now?
> 2.12. How does a REST client know which Ozone Handler to connect to or am
> I missing some well-known NN-type endpoint in the architecture doc
> somewhere?
>
> 3. Encryption
>
> 3.1. Is there any support for encryption of persisted data?
> 3.2. If so, are KMS and the hadoop key command used for key management?
>
> 4. Configuration
>
> 4.1. Are there any passwords or secrets being added to configuration?
> 4.2. If so, are they accessed via Configuration.getPassword() to allow for
> provisioning in credential providers?
> 4.3. Are there any settings that are used to launch docker containers or
> shell out any commands, etc.?
>
> 5. HA
>
> 5.1. Are there provisions for HA?
> 5.2. Are we leveraging the existing HA capabilities in HDFS?
> 5.3. Is the Storage Container Manager a SPOF?
> 5.4. I see HA listed as future work in the architecture doc - is this
> still an open issue?
>


[DISCUSS] Feature Branch Merge and Security Audits

2017-10-20 Thread larry mccay
All -

Given the maturity of Hadoop at this point, I would like to propose that we
start doing explicit security audits of features at merge time.

There are a few reasons that I think this is a good place/time to do the
review:

1. It represents a specific snapshot of where the feature stands as a
whole. This means that we can more easily identify the attack surface of a
given feature.
2. We can identify any security gaps that need to be fixed before a release
that carries the feature can be considered ready.
3. We - in extreme cases - can block a feature from merging until some
baseline of security coverage is achieved.
4. The folks that are interested and able to review security aspects can't
scale for every iteration over every JIRA but can review the checklist and
follow pointers for specific areas of interest.

I have provided an impromptu security audit checklist on the DISCUSS thread
for merging Ozone - HDFS-7240 - into trunk.

I don't want to pick on it particularly but I think it is a good way to
bootstrap this audit process and figure out how to incorporate it without
being too intrusive.

The questions that I provided below are a mix of general questions that
could be on a standard checklist that you provide along with the merge
thread and some that are specific to what I read about Ozone in the
excellent docs provided. So, we should consider some subset of the
following as a proposal for a general checklist.

Perhaps a shared document can be created to iterate over the list to
fine-tune it?

Any thoughts on this, any additional datapoints to collect, etc.?

thanks!

--larry

1. UIs
I see there are at least two UIs - Storage Container Manager and Key Space
Manager. There are a number of typical vulnerabilities that we find in UIs.

1.1. What sort of validation is being done on any accepted user input?
(pointers to code would be appreciated)
1.2. What explicit protections have been built in for (pointers to code
would be appreciated):
  1.2.1. cross site scripting
  1.2.2. cross site request forgery
  1.2.3. clickjacking (X-Frame-Options)
1.3. What sort of authentication is required for access to the UIs?
1.4. What authorization is available for determining who can access what
capabilities of the UIs for either viewing, modifying data or affecting
object stores and related processes?
1.5. Are the UIs built with proxying in mind by leveraging X-Forwarded
headers?
1.6. Is there any input that will ultimately be persisted in configuration
for executing shell commands or processes?
1.7. Do the UIs support the trusted proxy pattern with doAs impersonation?
1.8. Is there TLS/SSL support?

2. REST APIs

2.1. Do the REST APIs support the trusted proxy pattern with doAs
impersonation capabilities?
2.2. What explicit protections have been built in for:
  2.2.1. cross site scripting (XSS)
  2.2.2. cross site request forgery (CSRF)
  2.2.3. XML External Entity (XXE)
2.3. What is being used for authentication - the Hadoop Auth Module?
2.4. Are there separate processes for the HTTP resources (UIs and REST
endpoints) or are they part of existing HDFS processes?
2.5. Is there TLS/SSL support?
2.6. Are there new CLI commands and/or clients for accessing the REST APIs?
2.7. Bucket Level API allows for setting of ACLs on a bucket - what
authorization is required here - is there a restrictive ACL set on creation?
2.8. Bucket Level API allows for deleting a bucket - I assume this is
dependent on ACL-based access control?
2.9. Bucket Level API to list bucket returns up to 1000 keys - is there
paging available?
2.10. Storage Level APIs indicate “Signed with User Authorization” - what
does this refer to exactly?
2.11. Object Level APIs indicate that there is no ACL support and only
bucket owners can read and write - but there are ACL APIs at the Bucket
Level - are they meaningless for now?
2.12. How does a REST client know which Ozone Handler to connect to or am I
missing some well-known NN-type endpoint in the architecture doc somewhere?

3. Encryption

3.1. Is there any support for encryption of persisted data?
3.2. If so, are KMS and the hadoop key command used for key management?

4. Configuration

4.1. Are there any passwords or secrets being added to configuration?
4.2. If so, are they accessed via Configuration.getPassword() to allow for
provisioning in credential providers?
4.3. Are there any settings that are used to launch docker containers or
shell out any commands, etc.?

5. HA

5.1. Are there provisions for HA?
5.2. Are we leveraging the existing HA capabilities in HDFS?
5.3. Is the Storage Container Manager a SPOF?
5.4. I see HA listed as future work in the architecture doc - is this still
an open issue?
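
As a concrete illustration of item 4.2, here is a minimal sketch of the
Configuration.getPassword() pattern; the property name and the jceks path
below are hypothetical:

  import org.apache.hadoop.conf.Configuration;

  public class GetPasswordSketch {
    public static void main(String[] args) throws Exception {
      Configuration conf = new Configuration();
      // Point at a credential store created with "hadoop credential create".
      conf.set("hadoop.security.credential.provider.path",
          "jceks://file/etc/hadoop/secrets.jceks");

      // getPassword() checks the configured credential providers first and
      // only falls back to a clear-text value in the configuration itself.
      char[] secret = conf.getPassword("my.service.ssl.keystore.password");
      if (secret != null) {
        System.out.println("resolved " + secret.length + " chars");
        java.util.Arrays.fill(secret, '\0'); // scrub the copy after use
      }
    }
  }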


[jira] [Created] (HADOOP-14970) MiniHadoopClusterManager doesn't respect lack of format option

2017-10-20 Thread Erik Krogen (JIRA)
Erik Krogen created HADOOP-14970:


 Summary: MiniHadoopClusterManager doesn't respect lack of format 
option
 Key: HADOOP-14970
 URL: https://issues.apache.org/jira/browse/HADOOP-14970
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Erik Krogen
Assignee: Erik Krogen
Priority: Minor


The CLI MiniCluster, {{MiniHadoopClusterManager}}, says that by default it does 
not format its directories, and provides the {{-format}} option to specify that 
it should do so. However, it builds its {{MiniDFSCluster}} like:
{code}
  dfs = new MiniDFSCluster.Builder(conf).nameNodePort(nnPort)
      .nameNodeHttpPort(nnHttpPort).numDataNodes(numDataNodes)
      .startupOption(dfsOpts).build();
{code}
{{MiniDFSCluster.Builder}}, by default, sets {{format}} to true, so even though 
the {{startupOption}} is {{REGULAR}}, it will still format regardless of 
whether or not the flag is supplied.
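
One possible fix, sketched here rather than taken from any eventual patch, is 
to propagate the CLI flag into the builder's format bit; the 
{{cli.hasOption("format")}} call is illustrative:
{code}
  // Sketch of a possible fix (not necessarily the patch that lands):
  // propagate the -format flag so the builder's default of format=true
  // no longer overrides the documented "do not format" default.
  boolean doFormat = cli.hasOption("format");
  dfs = new MiniDFSCluster.Builder(conf).nameNodePort(nnPort)
      .nameNodeHttpPort(nnHttpPort).numDataNodes(numDataNodes)
      .format(doFormat)
      .startupOption(dfsOpts).build();
{code}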



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: Re: [DISCUSSION] Merging HDFS-7240 Object Store (Ozone) to trunk

2017-10-20 Thread larry mccay
I previously sent this same email from my work email and it doesn't seem to
have gone through - resending from my apache account (apologizing up front
for the length)

For such sizable merges in Hadoop, I would like to start doing security
audits in order to have an initial idea of the attack surface, the
protections available for known threats, what sort of configuration is
being used to launch processes, etc.

I dug into the architecture documents while in the middle of this list -
nice docs!
I do intend to try and make a generic checklist like this for such security
audits in the future, so a lot of this comes from that effort, but I also
tried to direct specific questions from those docs as well.

1. UIs
I see there are at least two UIs - Storage Container Manager and Key Space
Manager. There are a number of typical vulnerabilities that we find in UIs.

1.1. What sort of validation is being done on any accepted user input?
(pointers to code would be appreciated)
1.2. What explicit protections have been built in for (pointers to code
would be appreciated):
  1.2.1. cross site scripting
  1.2.2. cross site request forgery
  1.2.3. clickjacking (X-Frame-Options)
1.3. What sort of authentication is required for access to the UIs?
1.4. What authorization is available for determining who can access what
capabilities of the UIs for either viewing, modifying data or affecting
object stores and related processes?
1.5. Are the UIs built with proxying in mind by leveraging X-Forwarded
headers?
1.6. Is there any input that will ultimately be persisted in configuration
for executing shell commands or processes?
1.7. Do the UIs support the trusted proxy pattern with doAs impersonation?
1.8. Is there TLS/SSL support?

2. REST APIs

2.1. Do the REST APIs support the trusted proxy pattern with doAs
impersonation capabilities?
2.2. What explicit protections have been built in for:
  2.2.1. cross site scripting (XSS)
  2.2.2. cross site request forgery (CSRF)
  2.2.3. XML External Entity (XXE)
2.3. What is being used for authentication - the Hadoop Auth Module?
2.4. Are there separate processes for the HTTP resources (UIs and REST
endpoints) or are they part of existing HDFS processes?
2.5. Is there TLS/SSL support?
2.6. Are there new CLI commands and/or clients for accessing the REST APIs?
2.7. Bucket Level API allows for setting of ACLs on a bucket - what
authorization is required here - is there a restrictive ACL set on creation?
2.8. Bucket Level API allows for deleting a bucket - I assume this is
dependent on ACL-based access control?
2.9. Bucket Level API to list bucket returns up to 1000 keys - is there
paging available?
2.10. Storage Level APIs indicate “Signed with User Authorization” - what
does this refer to exactly?
2.11. Object Level APIs indicate that there is no ACL support and only
bucket owners can read and write - but there are ACL APIs at the Bucket
Level - are they meaningless for now?
2.12. How does a REST client know which Ozone Handler to connect to or am I
missing some well-known NN-type endpoint in the architecture doc somewhere?

3. Encryption

3.1. Is there any support for encryption of persisted data?
3.2. If so, are KMS and the hadoop key command used for key management?

4. Configuration

4.1. Are there any passwords or secrets being added to configuration?
4.2. If so, are they accessed via Configuration.getPassword() to allow for
provisioning in credential providers?
4.3. Are there any settings that are used to launch docker containers or
shell out any commands, etc.?

5. HA

5.1. Are there provisions for HA?
5.2. Are we leveraging the existing HA capabilities in HDFS?
5.3. Is the Storage Container Manager a SPOF?
5.4. I see HA listed as future work in the architecture doc - is this still
an open issue?

On Fri, Oct 20, 2017 at 11:19 AM, Anu Engineer 
wrote:

> Hi Steve,
>
> In addition to everything Weiwei mentioned (chapter 3 of user guide), if
> you really want to drill down into the REST protocol, you might want to apply this
> patch and build ozone.
>
> https://issues.apache.org/jira/browse/HDFS-12690
>
> This will generate an Open API (https://www.openapis.org ,
> http://swagger.io) based specification which can be accessed from KSM UI
> or just as a JSON file.
> Unfortunately, this patch is still at code review stage, so you will have
> to apply the patch and build it yourself.
>
> Thanks
> Anu
>
>
> On 10/20/17, 6:09 AM, "Yang Weiwei"  wrote:
>
> Hi Steve
>
>
> The code is available in HDFS-7240 feature branch, public git repo
> here.
>
> I am not sure if there is a "public" API for object stores, but the
> design doc (ozone_user_v0.pdf) uses the most common syntax so I believe
> it should be compliant. You can find the REST API doc here:
> https://github.com/apache/hadoop/blob/HDFS-7240/

[jira] [Resolved] (HADOOP-7553) hadoop-common tries to find hadoop-assemblies:jar:0.23.0-SNAPSHOT in http://snapshots.repository.codehaus.org

2017-10-20 Thread Andras Bokor (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-7553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andras Bokor resolved HADOOP-7553.
--
Resolution: Invalid

It is no longer an issue.

> hadoop-common tries to find hadoop-assemblies:jar:0.23.0-SNAPSHOT in 
> http://snapshots.repository.codehaus.org 
> --
>
> Key: HADOOP-7553
> URL: https://issues.apache.org/jira/browse/HADOOP-7553
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: build
>Reporter: Arun C Murthy
>Priority: Critical
>
> hadoop-common tries to find hadoop-assemblies:jar:0.23.0-SNAPSHOT in 
> http://snapshots.repository.codehaus.org - shouldn't it be the apache repo?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: Re: [DISCUSSION] Merging HDFS-7240 Object Store (Ozone) to trunk

2017-10-20 Thread Anu Engineer
Hi Steve,

In addition to everything Weiwei mentioned (chapter 3 of user guide), if you 
really want to drill down into the REST protocol, you might want to apply this patch 
and build ozone.

https://issues.apache.org/jira/browse/HDFS-12690

This will generate an Open API (https://www.openapis.org , http://swagger.io) 
based specification which can be accessed from KSM UI or just as a JSON file.
Unfortunately, this patch is still at code review stage, so you will have to 
apply the patch and build it yourself. 

Thanks
Anu


On 10/20/17, 6:09 AM, "Yang Weiwei"  wrote:

Hi Steve


The code is available in HDFS-7240 feature branch, public git repo 
here.

I am not sure if there is a "public" API for object stores, but the design 
doc uses the most common syntax so I believe it should be compliant. You can 
find the REST API doc here (with some example usages), and the command-line 
API here.


Look forward for your feedback!


--Weiwei



From: Steve Loughran 
Sent: October 20, 2017 11:49
To: Yang Weiwei
Cc: hdfs-...@hadoop.apache.org; mapreduce-...@hadoop.apache.org; 
yarn-...@hadoop.apache.org; common-dev@hadoop.apache.org
Subject: Re: [DISCUSSION] Merging HDFS-7240 Object Store (Ozone) to trunk


Wow, big piece of work

1. Where is a PR/branch on github with rendered docs for us to look at?
2. Have you made any public API changes related to object stores? That's 
probably something I'll have opinions on more than implementation details.

thanks

> On 19 Oct 2017, at 02:54, Yang Weiwei  wrote:
>
> Hello everyone,
>
>
> I would like to start this thread to discuss merging Ozone (HDFS-7240) to 
trunk. This feature implements an object store which can co-exist with HDFS. 
Ozone is disabled by default. We have tested Ozone with cluster sizes varying 
from 1 to 100 data nodes.
>
>
>
> The merge payload includes the following:
>
>  1.  All services, management scripts
>  2.  Object store APIs, exposed via both REST and RPC
>  3.  Master service UIs, command line interfaces
>  4.  Pluggable pipeline Integration
>  5.  Ozone File System (Hadoop compatible file system implementation, 
passes all FileSystem contract tests)
>  6.  Corona - a load generator for Ozone.
>  7.  Essential documentation added to Hadoop site.
>  8.  Version specific Ozone Documentation, accessible via service UI.
>  9.  Docker support for ozone, which enables faster development cycles.
>
>
> To build Ozone and run ozone using docker, please follow instructions in 
this wiki page. 
https://cwiki.apache.org/confluence/display/HADOOP/Dev+cluster+with+docker.



>
>
> We have built a passionate and diverse community to drive this feature 
development. As a team, we have achieved significant progress in the past 3 
years since the first JIRA for HDFS-7240 was opened in Oct 2014. So far, we 
have resolved 
almost 400 JIRAs by 20+ contributors/committers from different countries and 
affiliations. We also want to thank the large number of community members who 
were supportive of our efforts and contributed ideas and participated in the 
design of ozone.
>
>
> Please share your thoughts, thanks!
>
>
> -- Weiwei Yang




-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org


Re: [DISCUSSION] Merging HDFS-7240 Object Store (Ozone) to trunk

2017-10-20 Thread Yang Weiwei
Hi Steve


The code is available in HDFS-7240 feature branch, public git repo 
here.

I am not sure if there is a "public" API for object stores, but the design 
doc uses the most common syntax so I believe it should be compliant. You can 
find the REST API doc here (with some example usages), and the command-line 
API here.


Look forward for your feedback!


--Weiwei



From: Steve Loughran 
Sent: October 20, 2017 11:49
To: Yang Weiwei
Cc: hdfs-...@hadoop.apache.org; mapreduce-...@hadoop.apache.org; 
yarn-...@hadoop.apache.org; common-dev@hadoop.apache.org
Subject: Re: [DISCUSSION] Merging HDFS-7240 Object Store (Ozone) to trunk


Wow, big piece of work

1. Where is a PR/branch on github with rendered docs for us to look at?
2. Have you made any public API changes related to object stores? That's 
probably something I'll have opinions on more than implementation details.

thanks

> On 19 Oct 2017, at 02:54, Yang Weiwei  wrote:
>
> Hello everyone,
>
>
> I would like to start this thread to discuss merging Ozone (HDFS-7240) to 
> trunk. This feature implements an object store which can co-exist with HDFS. 
> Ozone is disabled by default. We have tested Ozone with cluster sizes varying 
> from 1 to 100 data nodes.
>
>
>
> The merge payload includes the following:
>
>  1.  All services, management scripts
>  2.  Object store APIs, exposed via both REST and RPC
>  3.  Master service UIs, command line interfaces
>  4.  Pluggable pipeline Integration
>  5.  Ozone File System (Hadoop compatible file system implementation, passes 
> all FileSystem contract tests)
>  6.  Corona - a load generator for Ozone.
>  7.  Essential documentation added to Hadoop site.
>  8.  Version specific Ozone Documentation, accessible via service UI.
>  9.  Docker support for ozone, which enables faster development cycles.
>
>
> To build Ozone and run ozone using docker, please follow instructions in this 
> wiki page. 
> https://cwiki.apache.org/confluence/display/HADOOP/Dev+cluster+with+docker.



>
>
> We have built a passionate and diverse community to drive this feature 
> development. As a team, we have achieved significant progress in the past 3 
> years since the first JIRA for HDFS-7240 was opened in Oct 2014. So far, we have 
> resolved almost 400 JIRAs by 20+ contributors/committers from different 
> countries and affiliations. We also want to thank the large number of 
> community members who were supportive of our efforts and contributed ideas 
> and participated in the design of ozone.
>
>
> Please share your thoughts, thanks!
>
>
> -- Weiwei Yang



Re: [DISCUSSION] Merging HDFS-7240 Object Store (Ozone) to trunk

2017-10-20 Thread Steve Loughran

Wow, big piece of work

1. Where is a PR/branch on github with rendered docs for us to look at?
2. Have you made any public API changes related to object stores? That's 
probably something I'll have opinions on more than implementation details.

thanks

> On 19 Oct 2017, at 02:54, Yang Weiwei  wrote:
> 
> Hello everyone,
> 
> 
> I would like to start this thread to discuss merging Ozone (HDFS-7240) to 
> trunk. This feature implements an object store which can co-exist with HDFS. 
> Ozone is disabled by default. We have tested Ozone with cluster sizes varying 
> from 1 to 100 data nodes.
> 
> 
> 
> The merge payload includes the following:
> 
>  1.  All services, management scripts
>  2.  Object store APIs, exposed via both REST and RPC
>  3.  Master service UIs, command line interfaces
>  4.  Pluggable pipeline Integration
>  5.  Ozone File System (Hadoop compatible file system implementation, passes 
> all FileSystem contract tests)
>  6.  Corona - a load generator for Ozone.
>  7.  Essential documentation added to Hadoop site.
>  8.  Version specific Ozone Documentation, accessible via service UI.
>  9.  Docker support for ozone, which enables faster development cycles.
> 
> 
> To build Ozone and run ozone using docker, please follow instructions in this 
> wiki page. 
> https://cwiki.apache.org/confluence/display/HADOOP/Dev+cluster+with+docker.
> 
> 
> We have built a passionate and diverse community to drive this feature 
> development. As a team, we have achieved significant progress in the past 3 
> years since the first JIRA for HDFS-7240 was opened in Oct 2014. So far, we have 
> resolved almost 400 JIRAs by 20+ contributors/committers from different 
> countries and affiliations. We also want to thank the large number of 
> community members who were supportive of our efforts and contributed ideas 
> and participated in the design of ozone.
> 
> 
> Please share your thoughts, thanks!
> 
> 
> -- Weiwei Yang


-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Jars in Maven Central Repo

2017-10-20 Thread Bokor Andras
Hi,

I am wondering how the jar files like javadoc, sources, tests, test-sources 
and so on get to the Central Repo.
As I check different versions I see different sets of jar files. E.g.:
 - 2.8.0 has only the tests and binary jars, but 2.8.1 has all the necessary jars
 - 3.0 alpha1 has all jars other than javadoc, but other 3.0 releases have only 
the tests and normal jar files

Is this intended?
http://repo1.maven.org/maven2/org/apache/hadoop/hadoop-common

Thanks,
Andras

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org