Re: [VOTE] Release Apache Hadoop 3.0.0 RC1

2017-12-12 Thread Elek, Marton

+1 (non-binding)

 * built from the source tarball (archlinux) / verified signature
 * Deployed to a kubernetes cluster (10/10 datanode/nodemanager pods)
 * Enabled ec on hdfs directory (hdfs cli)
 * Started example yarn jobs (pi/terragen)
 * checked yarn ui/ui2
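
For reference, the signature check was along these lines (file names are
assumptions based on the usual RC artifact naming):

  gpg --import KEYS   # committers' public keys from the dist area
  gpg --verify hadoop-3.0.0-src.tar.gz.asc hadoop-3.0.0-src.tar.gz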

Thanks for all the efforts.

Marton


On 12/08/2017 09:31 PM, Andrew Wang wrote:

Hi all,

Let me start, as always, by thanking all the contributors to this release,
especially those who jumped on the issues found in RC0.

I've prepared RC1 for Apache Hadoop 3.0.0. This release incorporates 302
fixed JIRAs since the previous 3.0.0-beta1 release.

You can find the artifacts here:

http://home.apache.org/~wang/3.0.0-RC1/

I've done the traditional testing of building from the source tarball and
running a Pi job on a single node cluster. I also verified that the shaded
jars are not empty.
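
For reference, the Pi smoke test is roughly (the jar path/version is an
assumption for the 3.0.0 binary layout):

  hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0.jar pi 10 100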

I found one issue: create-release (probably due to the mvn deploy change)
didn't sign the artifacts, but I fixed that by calling mvn one more time.
The maven artifacts are available here:

https://repository.apache.org/content/repositories/orgapachehadoop-1075/

This release will run the standard 5 days, closing on Dec 13th at 12:31pm
Pacific. My +1 to start.

Best,
Andrew



-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: [DISCUSS] Feature Branch Merge and Security Audits

2017-10-21 Thread Elek, Marton



On 10/21/2017 02:41 AM, larry mccay wrote:



"We might want to start a security section for Hadoop wiki for each of the
services and components.
This helps to track what has been completed."


Do you mean to keep the audit checklist for each service and component
there?
Interesting idea, I wonder what sort of maintenance that implies and
whether we want to take on that burden even though it would be great
information to have for future reviewers.


I think we should care about the maintenance of the documentation 
anyway; we also need to maintain all the other documentation. I think 
it could even be part of the generated docs rather than the wiki.


I also suggest filling in this list for the current trunk/3.0 as a first 
step.


1. It would be very useful documentation for the end users (some 
answers could link to the existing documentation where it exists, but I am 
not sure all the answers are in the current documentation).


2. It would be a good example of how the questions could be answered.

3. It would help to check whether something is missing from the list.

4. There are feature branches where some of the components are not 
touched (for example, no web ui or no REST service). A prefilled list 
could help to check that the branch doesn't break any existing security 
functionality on trunk.


5. It helps to document the security features in one place. If we have a 
list for the existing functionality in the same format, it will be easy 
to merge in the documentation of new features, as they will be 
reported in the same form. (So it won't be so hard to maintain the list...)


Marton

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: HADOOP-14163 proposal for new hadoop.apache.org

2018-02-16 Thread Elek, Marton

Hi,

I would like to bump this thread up.

TLDR; There is a proposed version of a new hadoop site which is 
available from here: https://elek.github.io/hadoop-site-proposal/ and 
https://issues.apache.org/jira/browse/HADOOP-14163


Please let me know what you think about it.


Longer version:

This thread started a long time ago with the goal of a more modern Hadoop site.

Goals were:

 1. To make the site easier to manage (the release entries could be 
created by a script as part of the release process)

 2. To use a better look-and-feel
 3. Move it out from svn to git

I proposed to:

 1. Move the existing site to git and generate it with hugo (which is a 
single, standalone binary)

 2. Move both the rendered and source branches to git.
 3. (Create a jenkins job to generate the site automatically)

NOTE: this is only about the Forrest-based hadoop.apache.org, NOT about the 
documentation, which is generated by mvn-site (as before)
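
As an illustration, the local hugo workflow would be roughly (a sketch; it
assumes the site sources live at the root of the new git repository):

  hugo server   # live preview at http://localhost:1313
  hugo          # render the static site (into ./public by default)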



I got a lot of valuable feedback and improved the proposed site 
according to the comments. Allen had some concerns about the 
technologies used (hugo vs. mvn-site), and I answered all the questions 
about why I think mvn-site is best for the documentation and hugo is best 
for generating the site.



I would like to finish this effort/jira: I would like to start a 
discussion about using this proposed version and approach as a new site 
of Apache Hadoop. Please let me know what you think.



Thanks a lot,
Marton

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: HADOOP-14163 proposal for new hadoop.apache.org

2018-06-21 Thread Elek, Marton



Thank you very much for bumping up this thread.


About [2]: (just for clarification) the content of the proposed 
website is exactly the same as the old one.


About [1]: I believe that "mvn site" is perfect for the 
documentation, but for website creation there are simpler and more 
powerful tools.


Hugo is simpler compared to jekyll: just one binary, without 
dependencies, and it works everywhere (mac, linux, windows).


Hugo is much more powerful compared to "mvn site": it is easier to create/use 
a more modern layout/theme, and easier to handle the content (for example, 
new release announcements could be generated as part of the release 
process).


I think it's very low risk to try out a new approach for the site (and 
easy to roll back in case of problems).


Marton

ps: I just updated the patch/preview site with the recent releases:

***
* http://hadoop.anzix.net *
***

On 06/21/2018 01:27 AM, Vinod Kumar Vavilapalli wrote:

Got pinged about this offline.

Thanks for keeping at it, Marton!

I think there are two road-blocks here
  (1) Is the mechanism using which the website is built good enough - mvn-site 
/ hugo etc?
  (2) Is the new website good enough?

For (1), I just think we need more committer attention and get feedback rapidly 
and get it in.

For (2), how about we do it in a different way in the interest of progress?
  - We create a hadoop.apache.org/new-site/ where this new site goes.
  - We then modify the existing web-site to say that there is a new 
site/experience that folks can click on a link and navigate to
  - As this new website matures and gets feedback & fixes, we finally pull the 
plug at a later point of time when we think we are good to go.

Thoughts?

+Vinod


On Feb 16, 2018, at 3:10 AM, Elek, Marton  wrote:

Hi,

I would like to bump this thread up.

TLDR; There is a proposed version of a new hadoop site which is available from 
here: https://elek.github.io/hadoop-site-proposal/ and 
https://issues.apache.org/jira/browse/HADOOP-14163

Please let me know what you think about it.


Longer version:

This thread started a long time ago with the goal of a more modern Hadoop site.

Goals were:

1. To make the site easier to manage (the release entries could be created by a 
script as part of the release process)
2. To use a better look-and-feel
3. Move it out from svn to git

I proposed to:

1. Move the existing site to git and generate it with hugo (which is a single, 
standalone binary)
2. Move both the rendered and source branches to git.
3. (Create a jenkins job to generate the site automatically)

NOTE: this is only about the Forrest-based hadoop.apache.org, NOT about the 
documentation, which is generated by mvn-site (as before)


I got a lot of valuable feedback and improved the proposed site according to 
the comments. Allen had some concerns about the technologies used (hugo vs. 
mvn-site), and I answered all the questions about why I think mvn-site is best 
for the documentation and hugo is best for generating the site.


I would like to finish this effort/jira: I would like to start a discussion 
about using this proposed version and approach as a new site of Apache Hadoop. 
Please let me know what you think.


Thanks a lot,
Marton

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: [VOTE] Release Apache Hadoop 3.1.1 - RC0

2018-08-07 Thread Elek, Marton



+1 (non-binding)

1. Built from the source package.
2. Checked the signature
3. Started docker based pseudo cluster and smoketested some basic 
functionality (hdfs cli, ec cli, viewfs, yarn examples, spark word count 
job)
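
For reference, the ec cli part was along these lines (the policy name is
just an example; any built-in policy works):

  hdfs ec -enablePolicy -policy RS-6-3-1024k
  hdfs ec -setPolicy -path /ec-test -policy RS-6-3-1024k
  hdfs ec -getPolicy -path /ec-test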


Thank you very much for the work, Wangda.
Marton


On 08/02/2018 08:43 PM, Wangda Tan wrote:

Hi folks,

I've created RC0 for Apache Hadoop 3.1.1. The artifacts are available here:

http://people.apache.org/~wangda/hadoop-3.1.1-RC0/

The RC tag in git is release-3.1.1-RC0:
https://github.com/apache/hadoop/commits/release-3.1.1-RC0

The maven artifacts are available via repository.apache.org at
https://repository.apache.org/content/repositories/orgapachehadoop-1139/

You can find my public key at
http://svn.apache.org/repos/asf/hadoop/common/dist/KEYS

This vote will run 5 days from now.

3.1.1 contains 435 [1] fixed JIRA issues since 3.1.0.

I have done testing with a pseudo cluster and distributed shell job. My +1
to start.

Best,
Wangda Tan

[1] project in (YARN, HADOOP, MAPREDUCE, HDFS) AND fixVersion in (3.1.1)
ORDER BY priority DESC



-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[DISCUSS] Alpha Release of Ozone

2018-08-06 Thread Elek, Marton

Hi All,

I would like to discuss creating an Alpha release for Ozone. The core 
functionality of Ozone is complete, but there are two missing features, 
Security and HA; work on these features is progressing in branches 
HDDS-4 and HDDS-151. Right now, Ozone can handle millions of keys and 
has a Hadoop compatible file system, which allows applications like 
Hive, Spark, and YARN to use Ozone.


Having an Alpha release of Ozone will help in getting some early 
feedback (this release will be marked as an Alpha -- and not production 
ready).


Going through a complete release cycle will help us flesh out the Ozone 
release process, update user documentation and nail down deployment models.


Please share your thoughts on the Alpha release (over mail or in 
HDDS-214). As voted on by the community earlier, Ozone releases will be 
independent of Hadoop releases.


Thanks a lot,
Marton Elek




-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: HADOOP-14163 proposal for new hadoop.apache.org

2018-09-07 Thread Elek, Marton

Thanks for all the positive feedback.

I just uploaded the new site to the new repository:

https://gitbox.apache.org/repos/asf/hadoop-site.git (asf-site branch)

It contains:

1. Same content, new layout. (source files of the site)

2. The rendered content under /content, together with all the javadocs 
(289,003 files).


3. The old site (as suggested by Vinod; I added a link back to the old 
site): https://hadoop.apache.org/old


Infra has already changed the pubsub script. The new site is live. 
Please let me know if you see any problems...


I will update the wiki pages / release instructions very soon.

Thanks,
Marton

ps:

Please give me write permission to the OLD wiki 
(https://wiki.apache.org/hadoop/), if you can. My username is MartonElek

Thanks a lot.


On 08/31/2018 10:07 AM, Elek, Marton wrote:

Bumping this thread one last time.

I have the following proposal:

1. I will request a new git repository, hadoop-site.git, and import the 
new site there (it has exactly the same content as the existing 
site).


2. I will ask infra to use the new repository as the source of 
hadoop.apache.org


3. I will manually sync all of the changes over the next two months from 
git back to the svn site (release announcements, new committers)


IN CASE OF ANY PROBLEM we can switch back to svn at any time.

If no one objects within three days, I'll assume lazy consensus and 
start with this plan. Please comment if you have objections.


Again: this allows immediate fallback at any time, as the svn repo will be kept 
as-is (+ I will keep it up-to-date for the next 2 months)


Thanks,
Marton


On 06/21/2018 09:00 PM, Elek, Marton wrote:


Thank you very much for bumping up this thread.


About [2]: (just for clarification) the content of the proposed 
website is exactly the same as the old one.


About [1]: I believe that "mvn site" is perfect for the 
documentation, but for website creation there are simpler and 
more powerful tools.


Hugo is simpler compared to jekyll: just one binary, without 
dependencies, and it works everywhere (mac, linux, windows).


Hugo is much more powerful compared to "mvn site": it is easier to 
create/use a more modern layout/theme, and easier to handle the content 
(for example, new release announcements could be generated as part of 
the release process).


I think it's very low risk to try out a new approach for the site (and 
easy to roll back in case of problems).


Marton

ps: I just updated the patch/preview site with the recent releases:

***
* http://hadoop.anzix.net *
***

On 06/21/2018 01:27 AM, Vinod Kumar Vavilapalli wrote:

Got pinged about this offline.

Thanks for keeping at it, Marton!

I think there are two road-blocks here
  (1) Is the mechanism using which the website is built good enough - 
mvn-site / hugo etc?

  (2) Is the new website good enough?

For (1), I just think we need more committer attention and get 
feedback rapidly and get it in.


For (2), how about we do it in a different way in the interest of 
progress?

  - We create a hadoop.apache.org/new-site/ where this new site goes.
  - We then modify the existing web-site to say that there is a new 
site/experience that folks can click on a link and navigate to
  - As this new website matures and gets feedback & fixes, we finally 
pull the plug at a later point of time when we think we are good to go.


Thoughts?

+Vinod


On Feb 16, 2018, at 3:10 AM, Elek, Marton  wrote:

Hi,

I would like to bump this thread up.

TLDR; There is a proposed version of a new hadoop site which is 
available from here: https://elek.github.io/hadoop-site-proposal/ 
and https://issues.apache.org/jira/browse/HADOOP-14163


Please let me know what you think about it.


Longer version:

This thread started a long time ago with the goal of a more modern Hadoop site.

Goals were:

1. To make the site easier to manage (the release entries could be 
created by a script as part of the release process)

2. To use a better look-and-feel
3. Move it out from svn to git

I proposed to:

1. Move the existing site to git and generate it with hugo (which is 
a single, standalone binary)

2. Move both the rendered and source branches to git.
3. (Create a jenkins job to generate the site automatically)

NOTE: this is only about the Forrest-based hadoop.apache.org, NOT about 
the documentation, which is generated by mvn-site (as before)



I got a lot of valuable feedback and improved the proposed site 
according to the comments. Allen had some concerns about the 
technologies used (hugo vs. mvn-site), and I answered all the questions 
about why I think mvn-site is best for the documentation and hugo is 
best for generating the site.



I would like to finish this effort/jira: I would like to start a 
discussion about using this proposed version and approach as a new 
site of Apache Hadoop. Please let me know what you think.



Thanks a lot,
Marton


Re: HADOOP-14163 proposal for new hadoop.apache.org

2018-08-31 Thread Elek, Marton

Bumping this thread one last time.

I have the following proposal:

1. I will request a new git repository, hadoop-site.git, and import the 
new site there (it has exactly the same content as the existing site).


2. I will ask infra to use the new repository as the source of 
hadoop.apache.org


3. I will manually sync all of the changes over the next two months from 
git back to the svn site (release announcements, new committers)


IN CASE OF ANY PROBLEM we can switch back to svn at any time.

If no one objects within three days, I'll assume lazy consensus and 
start with this plan. Please comment if you have objections.


Again: this allows immediate fallback at any time, as the svn repo will be kept 
as-is (+ I will keep it up-to-date for the next 2 months)


Thanks,
Marton


On 06/21/2018 09:00 PM, Elek, Marton wrote:


Thank you very much for bumping up this thread.


About [2]: (just for clarification) the content of the proposed 
website is exactly the same as the old one.


About [1]: I believe that "mvn site" is perfect for the 
documentation, but for website creation there are simpler and more 
powerful tools.


Hugo is simpler compared to jekyll: just one binary, without 
dependencies, and it works everywhere (mac, linux, windows).


Hugo is much more powerful compared to "mvn site": it is easier to create/use 
a more modern layout/theme, and easier to handle the content (for example, 
new release announcements could be generated as part of the release 
process).


I think it's very low risk to try out a new approach for the site (and 
easy to roll back in case of problems).


Marton

ps: I just updated the patch/preview site with the recent releases:

***
* http://hadoop.anzix.net *
***

On 06/21/2018 01:27 AM, Vinod Kumar Vavilapalli wrote:

Got pinged about this offline.

Thanks for keeping at it, Marton!

I think there are two road-blocks here
  (1) Is the mechanism using which the website is built good enough - 
mvn-site / hugo etc?

  (2) Is the new website good enough?

For (1), I just think we need more committer attention and get 
feedback rapidly and get it in.


For (2), how about we do it in a different way in the interest of 
progress?

  - We create a hadoop.apache.org/new-site/ where this new site goes.
  - We then modify the existing web-site to say that there is a new 
site/experience that folks can click on a link and navigate to
  - As this new website matures and gets feedback & fixes, we finally 
pull the plug at a later point of time when we think we are good to go.


Thoughts?

+Vinod


On Feb 16, 2018, at 3:10 AM, Elek, Marton  wrote:

Hi,

I would like to bump this thread up.

TLDR; There is a proposed version of a new hadoop site which is 
available from here: https://elek.github.io/hadoop-site-proposal/ and 
https://issues.apache.org/jira/browse/HADOOP-14163


Please let me know what you think about it.


Longer version:

This thread started a long time ago with the goal of a more modern Hadoop site.

Goals were:

1. To make the site easier to manage (the release entries could be 
created by a script as part of the release process)

2. To use a better look-and-feel
3. Move it out from svn to git

I proposed to:

1. Move the existing site to git and generate it with hugo (which is 
a single, standalone binary)

2. Move both the rendered and source branches to git.
3. (Create a jenkins job to generate the site automatically)

NOTE: this is only about the Forrest-based hadoop.apache.org, NOT about 
the documentation, which is generated by mvn-site (as before)



I got a lot of valuable feedback and improved the proposed site 
according to the comments. Allen had some concerns about the 
technologies used (hugo vs. mvn-site), and I answered all the questions 
about why I think mvn-site is best for the documentation and hugo is 
best for generating the site.



I would like to finish this effort/jira: I would like to start a 
discussion about using this proposed version and approach as a new 
site of Apache Hadoop. Please let me know what you think.



Thanks a lot,
Marton

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: [VOTE] Release Apache Hadoop 3.0.1 (RC1)

2018-03-22 Thread Elek, Marton


+1 (non binding)

I did a full build from source code, created a docker container, and did 
various basic-level tests with robotframework-based automation and 
docker-compose based pseudo clusters [1].


Including:

* Hdfs federation smoke test
* Basic ViewFS configuration
* Yarn example jobs
* Spark example jobs (with and without yarn)
* Simple hive table creation

Marton


[1]: https://github.com/flokkr/runtime-compose

On 03/18/2018 05:11 AM, Lei Xu wrote:

Hi, all

I've created release candidate RC-1 for Apache Hadoop 3.0.1

Apache Hadoop 3.0.1 will be the first bug fix release for the Apache
Hadoop 3.0 release line. It includes 49 bug fixes and security fixes,
of which 12 are blockers and 17 are critical.

Please note:
* HDFS-12990. Change default NameNode RPC port back to 8020. This is an
incompatible change relative to Hadoop 3.0.0. After 3.0.1 is released,
Apache Hadoop 3.0.0 will be deprecated due to this change.

The release page is:
https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+3.0+Release

New RC is available at: http://home.apache.org/~lei/hadoop-3.0.1-RC1/

The git tag is release-3.0.1-RC1, and the latest commit is
496dc57cc2e4f4da117f7a8e3840aaeac0c1d2d0

The maven artifacts are available at:
https://repository.apache.org/content/repositories/orgapachehadoop-1081/

Please try the release and vote; the vote will run for the usual 5
days, ending on 3/22/2018 at 6pm PST.

Thanks!

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[DISCUSS] Release Apache Hadoop Ozone 0.3.0-alpha

2018-10-25 Thread Elek, Marton
Hi all,

Since the previous 0.2.1-alpha ozone release, more than 170 patches have
been committed to the apache trunk under the hadoop-ozone/hadoop-hdds
subprojects.

The 0.3.0-alpha release carries an S3 compatible REST server. This
allows S3 applications to work against Ozone with zero changes. The
first phase of the implementation is done, and it has been tested with
various s3 tools such as the aws s3 cli and the s3 fuse driver.
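
For example, a quick smoke test with the aws cli looks roughly like this
(the endpoint port is an assumption based on the default docker-compose
setup):

  aws s3api --endpoint-url http://localhost:9878 create-bucket --bucket test
  aws s3 --endpoint-url http://localhost:9878 ls s3://test/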

Stability and usability have also significantly improved due to testing
using hive/spark applications.

The community has agreed to separate the ozone/hadoop release lifecycles
to make it possible to create more frequent ozone releases, so I propose
releasing the next ozone version (0.3.0-alpha) in the next few weeks.

Thanks,
Marton


PS: The Ozone release plan is available from the wiki:

https://cwiki.apache.org/confluence/display/HADOOP/Ozone+Road+Map


PPS: We have regular community calls for informal ozone related discussions:

https://cwiki.apache.org/confluence/display/HADOOP/Ozone+Community+Calls

Please join if you are interested, have questions etc.

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[VOTE] Release Apache Hadoop Ozone 0.3.0-alpha (RC0)

2018-11-13 Thread Elek, Marton
Hi all,

I've created the first release candidate (RC0) for Apache Hadoop Ozone
0.3.0-alpha according to the plans shared here previously.

This is the second release of Apache Hadoop Ozone. Notable changes since
the first release:

* A new S3 compatible REST server is added. Ozone can be used with any
S3 compatible tool (HDDS-434)
* Ozone Hadoop file system URL prefix is renamed from o3:// to o3fs://
(HDDS-651)
* Extensive testing and stability improvements of OzoneFs.
* Spark, YARN and Hive support and stability improvements.
* Improved Pipeline handling and recovery.
* Separated/dedicated classpath definitions for all the Ozone
components. (HDDS-447)

The RC artifacts are available from:
https://home.apache.org/~elek/ozone-0.3.0-alpha-rc0/

The RC tag in git is: ozone-0.3.0-alpha-RC0 (dc661083683)

Please try it out, vote, or just give us feedback.

The vote will run for 5 days, ending on November 18, 2018 13:00 UTC.


Thank you very much,
Marton

PS:

The easiest way to try it out is:

1. Download the binary artifact
2. Read the docs from ./docs/index.html
3. TLDR; cd compose/ozone && docker-compose up -d
4. open localhost:9874 or localhost:9876
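
(A quick sanity check that the cluster came up; the port is taken from
step 4:)

  docker-compose ps                       # all services should be "Up"
  curl -s http://localhost:9874/ | head   # the web UI should respond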



The easiest way to try it out from the source:

1. mvn  install -DskipTests -Pdist -Dmaven.javadoc.skip=true -Phdds
-DskipShade -am -pl :hadoop-ozone-dist
2. cd hadoop-ozone/dist/target/ozone-0.3.0-alpha && docker-compose up -d



The easiest way to test basic functionality (with acceptance tests):

1. mvn  install -DskipTests -Pdist -Dmaven.javadoc.skip=true -Phdds
-DskipShade -am -pl :hadoop-ozone-dist
2. cd hadoop-ozone/dist/target/ozone-0.3.0-alpha/smoketest
3. ./test.sh

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[CANCELED] [VOTE] Release Apache Hadoop Ozone 0.3.0-alpha (RC0)

2018-11-14 Thread Elek, Marton
Unfortunately a memory issue was found with the default settings. It is fixed
in HDDS-834 (thanks Mukul and Shashikant).

I am cancelling this vote and will start an RC1 soon.

Marton

On 11/13/18 1:53 PM, Elek, Marton wrote:
> Hi all,
> 
> I've created the first release candidate (RC0) for Apache Hadoop Ozone
> 0.3.0-alpha according to the plans shared here previously.
> 
> This is the second release of Apache Hadoop Ozone. Notable changes since
> the first release:
> 
> * A new S3 compatible REST server is added. Ozone can be used with any
> S3 compatible tool (HDDS-434)
> * Ozone Hadoop file system URL prefix is renamed from o3:// to o3fs://
> (HDDS-651)
> * Extensive testing and stability improvements of OzoneFs.
> * Spark, YARN and Hive support and stability improvements.
> * Improved Pipeline handling and recovery.
> * Separated/dedicated classpath definitions for all the Ozone
> components. (HDDS-447)
> 
> The RC artifacts are available from:
> https://home.apache.org/~elek/ozone-0.3.0-alpha-rc0/
> 
> The RC tag in git is: ozone-0.3.0-alpha-RC0 (dc661083683)
> 
> Please try it out, vote, or just give us feedback.
> 
> The vote will run for 5 days, ending on November 18, 2018 13:00 UTC.
> 
> 
> Thank you very much,
> Marton
> 
> PS:
> 
> The easiest way to try it out is:
> 
> 1. Download the binary artifact
> 2. Read the docs from ./docs/index.html
> 3. TLDR; cd compose/ozone && docker-compose up -d
> 4. open localhost:9874 or localhost:9876
> 
> 
> 
> The easiest way to try it out from the source:
> 
> 1. mvn  install -DskipTests -Pdist -Dmaven.javadoc.skip=true -Phdds
> -DskipShade -am -pl :hadoop-ozone-dist
> 2. cd hadoop-ozone/dist/target/ozone-0.3.0-alpha && docker-compose up -d
> 
> 
> 
> The easiest way to test basic functionality (with acceptance tests):
> 
> 1. mvn  install -DskipTests -Pdist -Dmaven.javadoc.skip=true -Phdds
> -DskipShade -am -pl :hadoop-ozone-dist
> 2. cd hadoop-ozone/dist/target/ozone-0.3.0-alpha/smoketest
> 3. ./test.sh
> 
> -
> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
> 

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[VOTE] Release Apache Hadoop Ozone 0.3.0-alpha (RC1)

2018-11-14 Thread Elek, Marton
Hi all,

I've created the second release candidate (RC1) for Apache Hadoop Ozone
0.3.0-alpha including one more fix on top of the previous RC0 (HDDS-854)

This is the second release of Apache Hadoop Ozone. Notable changes since
the first release:

* A new S3 compatible REST server is added. Ozone can be used with any
S3 compatible tool (HDDS-434)
* Ozone Hadoop file system URL prefix is renamed from o3:// to o3fs://
(HDDS-651)
* Extensive testing and stability improvements of OzoneFs.
* Spark, YARN and Hive support and stability improvements.
* Improved Pipeline handling and recovery.
* Separated/dedicated classpath definitions for all the Ozone
components. (HDDS-447)

The RC artifacts are available from:
https://home.apache.org/~elek/ozone-0.3.0-alpha-rc1/

The RC tag in git is: ozone-0.3.0-alpha-RC1 (ebbf459e6a6)

Please try it out, vote, or just give us feedback.

The vote will run for 5 days, ending on November 19, 2018 18:00 UTC.


Thank you very much,
Marton

PS:

The easiest way to try it out is:

1. Download the binary artifact
2. Read the docs from ./docs/index.html
3. TLDR; cd compose/ozone && docker-compose up -d
4. open localhost:9874 or localhost:9876



The easiest way to try it out from the source:

1. mvn  install -DskipTests -Pdist -Dmaven.javadoc.skip=true -Phdds
-DskipShade -am -pl :hadoop-ozone-dist
2. cd hadoop-ozone/dist/target/ozone-0.3.0-alpha && docker-compose up -d



The easiest way to test basic functionality (with acceptance tests):

1. mvn  install -DskipTests -Pdist -Dmaven.javadoc.skip=true -Phdds
-DskipShade -am -pl :hadoop-ozone-dist
2. cd hadoop-ozone/dist/target/ozone-0.3.0-alpha/smoketest
3. ./test.sh

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[ANNOUNCE] Apache Hadoop Ozone 0.2.1-alpha release

2018-10-01 Thread Elek, Marton



It gives me great pleasure to announce that the Apache Hadoop community 
has voted to release Apache Hadoop Ozone 0.2.1-alpha.


Apache Hadoop Ozone is an object store for Hadoop built using Hadoop 
Distributed Data Store.


For more information and to download, please check

https://hadoop.apache.org/ozone

Note: This release is alpha quality; it is not recommended for use in 
production.


Many thanks to everyone who contributed to the release, and everyone in 
the Apache Hadoop community! The release is a result of work from many 
contributors. Thank you to all of them.


On behalf of the Hadoop community,
Márton Elek


ps: Hadoop Ozone and HDDS are released separately from the main Hadoop 
releases; this release doesn't include new Hadoop YARN/MapReduce/HDFS 
versions.


-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[VOTE] Release Apache Hadoop Ozone 0.2.1-alpha (RC0)

2018-09-19 Thread Elek, Marton

Hi all,

After the recent discussion about the first Ozone release, I've created 
the first release candidate (RC0) for Apache Hadoop Ozone 0.2.1-alpha.


This release is alpha quality: it's not recommended for use in production, 
but we believe that it's stable enough to try out the feature set and 
collect feedback.


The RC artifacts are available from: 
https://home.apache.org/~elek/ozone-0.2.1-alpha-rc0/


The RC tag in git is: ozone-0.2.1-alpha-RC0 (968082ffa5d)

Please try the release and vote; the vote will run for the usual 5 
working days, ending on September 26, 2018 at 10pm UTC.


The easiest way to try it out is:

1. Download the binary artifact
2. Read the docs at ./docs/index.html
3. TLDR; cd compose/ozone && docker-compose up -d


Please try it out, vote, or just give us feedback.

Thank you very much,
Marton

ps: Next week we will have a BoF session at ApacheCon North America in 
Montreal on Monday evening. Please join if you are interested, need 
support trying out the package, or just have any feedback.



-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: [VOTE] Release Apache Hadoop 2.8.5 (RC0)

2018-09-19 Thread Elek, Marton

Please try

git clone https://gitbox.apache.org/repos/asf/hadoop-site.git -b asf-site

(It seems git tries to check out master instead of the branch).

I updated the wiki, sorry for the inconvenience.

Marton

On 9/18/18 8:05 PM, 俊平堵 wrote:

Hey Marton,
      The new release web-site actually doesn't work for me. When I 
follow your steps in the wiki, I hit the issue below during git clone of the 
(writable) hadoop-site repository:


git clone https://gitbox.apache.org/repos/asf/hadoop-site.git
Cloning into 'hadoop-site'...
remote: Counting objects: 252414, done.
remote: Compressing objects: 100% (29625/29625), done.
remote: Total 252414 (delta 219617), reused 252211 (delta 219422)
Receiving objects: 100% (252414/252414), 98.78 MiB | 3.32 MiB/s, done.
Resolving deltas: 100% (219617/219617), done.
warning: remote HEAD refers to nonexistent ref, unable to checkout.

Can you check above repository is correct for clone?
I can clone readable repository (https://github.com/apache/hadoop-site) 
successfully though but cannot push back changes which is expected.


Thanks,

Junping

Elek, Marton <e...@apache.org> wrote on Monday, September 17, 2018 at 6:15 AM:


Hi Junping,

Thank you for working on this release.

This release is the first release after the hadoop site change, and I
would like to be sure that everything works fine.

Unfortunately I didn't get permission to edit the old wiki, but this is
the definition of the site update on the new wiki:


https://cwiki.apache.org/confluence/display/HADOOP/How+to+generate+and+push+ASF+web+site+after+HADOOP-14163

Please let me know if something is not working for you...

Thanks,
Marton


On 09/10/2018 02:00 PM, 俊平堵 wrote:
 > Hi all,
 >
 >       I've created the first release candidate (RC0) for Apache
 > Hadoop 2.8.5. This is our next point release to follow up 2.8.4. It
 > includes 33 important fixes and improvements.
 >
 >
 >      The RC artifacts are available at:
 > http://home.apache.org/~junping_du/hadoop-2.8.5-RC0
 >
 >
 >      The RC tag in git is: release-2.8.5-RC0
 >
 >
 >
 >      The maven artifacts are available via repository.apache.org at:
 >
 > https://repository.apache.org/content/repositories/orgapachehadoop-1140
 >
 >
 >      Please try the release and vote; the vote will run for the usual 5
 > working days, ending on 9/15/2018 PST.
 >
 >
 > Thanks,
 >
 >
 > Junping
 >



-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[RESULT][VOTE] Release Apache Hadoop Ozone 0.2.1-alpha (RC0)

2018-09-26 Thread Elek, Marton

Thank you all very much for the tests and the votes.

The vote is PASSED with the following details:

3 binding +1, (thanks Anu, Xiaoyu, Arpit)

10 non-binding +1, (thanks Hanisha, Bharat, Shashikant, Sandeep, Lokesh, 
Nanda, Mukul, Ajay, Dinesh) together with my closing +1 [*]


no -1/0

Thanks again, will publish the release and announce it soon...

Marton

[*]: My (non-binding) +1: I ran the full acceptance test suite and also 
successfully executed Spark wordcount jobs (both local and yarn 
executions) with Hadoop 3.1 and Spark 2.3 using ozonefs.


-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: [VOTE] Release Apache Hadoop 2.8.5 (RC0)

2018-09-23 Thread Elek, Marton
Yes. Currently you need to commit the generated site (source + rendered 
site, both are on the same branch).
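
A rough sketch of the manual flow (repository url and branch as described
earlier in this thread; the commit message is just an example):

  git clone https://gitbox.apache.org/repos/asf/hadoop-site.git -b asf-site
  cd hadoop-site
  # edit the site sources, then regenerate the rendered content
  hugo
  git add -A
  git commit -m "Update site"
  git push origin asf-site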


We can create a jenkins job to do the generation + commit automatically.

I updated the wiki to make it clearer.

Marton

On 9/22/18 10:48 PM, Anu Engineer wrote:

I believe that you need to regenerate the site using the 'hugo' command (hugo is a 
site builder), then commit and push the generated files.

Thanks
Anu


On 9/22/18, 9:56 AM, "俊平堵"  wrote:

 Marton, thanks for your reply. It works now, but after the git changes I
 haven't seen the Apache Hadoop website get refreshed. It seems to need
 some manual steps to refresh the website; if so, can you also update
 the wiki?
 
 Thanks,
 
 Junping
 
 Elek, Marton wrote on Thursday, September 20, 2018 at 1:40 PM:
 
 > Please try

 >
 > git clone https://gitbox.apache.org/repos/asf/hadoop-site.git -b asf-site
 >
 > (It seems git tries to check out master instead of the branch).
 >
 > I updated the wiki, sorry for the inconvenience.
 >
 > Marton
 >
 > On 9/18/18 8:05 PM, 俊平堵 wrote:
 > > Hey Marton,
 > >   The new release web-site actually doesn't work for me. When I
 > > follow your steps in the wiki, I hit the issue below during git clone
 > > of the (writable) hadoop-site repository:
 > >
 > > git clone https://gitbox.apache.org/repos/asf/hadoop-site.git
 > > Cloning into 'hadoop-site'...
 > > remote: Counting objects: 252414, done.
 > > remote: Compressing objects: 100% (29625/29625), done.
 > > remote: Total 252414 (delta 219617), reused 252211 (delta 219422)
 > > Receiving objects: 100% (252414/252414), 98.78 MiB | 3.32 MiB/s, done.
 > > Resolving deltas: 100% (219617/219617), done.
 > > warning: remote HEAD refers to nonexistent ref, unable to checkout.
 > >
 > > Can you check above repository is correct for clone?
 > > I can clone readable repository (https://github.com/apache/hadoop-site)
 > > successfully though but cannot push back changes which is expected.
 > >
 > > Thanks,
 > >
 > > Junping
 > >
 > > Elek, Marton <e...@apache.org> wrote on Monday, September 17, 2018 at 6:15 AM:
 > >
 > > Hi Junping,
 > >
 > > Thank you for working on this release.
 > >
 > > This release is the first release after the hadoop site change, 
and I
 > > would like to be sure that everything works fine.
 > >
 > > Unfortunately I didn't get permission to edit the old wiki, but this
 > > is the definition of the site update on the new wiki:
 > >
 > >
 > 
https://cwiki.apache.org/confluence/display/HADOOP/How+to+generate+and+push+ASF+web+site+after+HADOOP-14163
 > >
 > > Please let me know if something is not working for you...
 > >
 > > Thanks,
 > > Marton
 > >
 > >
 > > On 09/10/2018 02:00 PM, 俊平堵 wrote:
 > >  > Hi all,
 > >  >
 > >  >   I've created the first release candidate (RC0) for Apache
 > >  > Hadoop 2.8.5. This is our next point release to follow up 2.8.4.
 > It
 > >  > includes 33 important fixes and improvements.
 > >  >
 > >  >
 > >  >  The RC artifacts are available at:
 > >  > http://home.apache.org/~junping_du/hadoop-2.8.5-RC0
 > >  >
 > >  >
 > >  >  The RC tag in git is: release-2.8.5-RC0
 > >  >
 > >  >
 > >  >
 > >  >  The maven artifacts are available via repository.apache.org at:
 > >  >
 > >  > https://repository.apache.org/content/repositories/orgapachehadoop-1140
 > >  >
 > >  >
 > >  >  Please try the release and vote; the vote will run for the usual 5
 > >  > working days, ending on 9/15/2018 PST.
 > >  >
 > >  >
 > >  > Thanks,
 > >  >
 > >  >
 > >  > Junping
 > >  >
 > >
 >
 



-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: [Urgent] Question about Nexus repo and Hadoop release

2019-01-15 Thread Elek, Marton
My key was pushed to the key server with pgp about 1 year ago, and it worked
well for the last Ratis release, so it should be synced between the key
servers.

But it seems that INFRA solved the problem by shuffling the key
server order (or it was an intermittent issue): see INFRA-17649.

Seems to be working now...
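
For reference, pushing a key to the pool is a one-liner (the key id is a
placeholder):

  gpg --keyserver hkp://pool.sks-keyservers.net --send-keys <KEYID>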

Marton


On 1/15/19 5:19 AM, Wangda Tan wrote:
> Hi Brian,
> Thanks for responding. Could you share how to push keys to the Apache pgp pool?
> 
> Best,
> Wangda
> 
> On Mon, Jan 14, 2019 at 10:44 AM Brian Fox  wrote:
> 
>> Did you push your key up to the pgp pool? That's what Nexus is validating
>> against. It might take time to propagate if you just pushed it.
>>
>> On Mon, Jan 14, 2019 at 9:59 AM Elek, Marton  wrote:
>>
>>> Seems to be an INFRA issue for me:
>>>
>>> 1. I downloaded a sample jar file [1] + the signature from the
>>> repository and it was ok, locally I verified it.
>>>
>>> 2. I tested it with an other Apache project (Ratis) and my key. I got
>>> the same problem even if it worked at last year during the 0.3.0
>>> release. (I used exactly the same command)
>>>
>>> I opened an infra ticket to check the logs of the Nexus as it was
>>> suggested in the error message:
>>>
>>> https://issues.apache.org/jira/browse/INFRA-17649
>>>
>>> Marton
>>>
>>>
>>> [1]:
>>>
>>> https://repository.apache.org/service/local/repositories/orgapachehadoop-1183/content/org/apache/hadoop/hadoop-mapreduce-client-jobclient/3.1.2/hadoop-mapreduce-client-jobclient-3.1.2-javadoc.jar
>>>
>>>
>>> On 1/13/19 6:27 AM, Wangda Tan wrote:
>>>> Uploaded sample file and signature.
>>>>
>>>>
>>>>
>>>> On Sat, Jan 12, 2019 at 9:18 PM Wangda Tan <wheele...@gmail.com> wrote:
>>>>
>>>> Actually, among the hundreds of failed messages, the "No public key"
>>>> issues still occurred several times:
>>>>
>>>> failureMessage  No public key: Key with id: (b3fa653d57300d45)
>>>> was not able to be located on http://gpg-keyserver.de/. Upload
>>>> your public key and try the operation again.
>>>> failureMessage  No public key: Key with id: (b3fa653d57300d45)
>>>> was not able to be located on
>>>> http://pool.sks-keyservers.net:11371. Upload your public key
>>> and
>>>> try the operation again.
>>>> failureMessage  No public key: Key with id: (b3fa653d57300d45)
>>>> was not able to be located on http://pgp.mit.edu:11371. Upload
>>>> your public key and try the operation again.
>>>>
>>>> Once the close operation returned, I will upload sample files which
>>>> may help troubleshoot the issue.
>>>>
>>>> Thanks,
>>>>
>>>> On Sat, Jan 12, 2019 at 9:04 PM Wangda Tan <wheele...@gmail.com> wrote:
>>>>
>>>> Thanks David for the quick response!
>>>>
>>>> I just retried, now the "No public key" issue is gone. However,
>>>> the issue:
>>>>
>>>> failureMessage  Failed to validate the pgp signature of
>>>>
>>>  
>>> '/org/apache/hadoop/hadoop-mapreduce-client-jobclient/3.1.2/hadoop-mapreduce-client-jobclient-3.1.2-tests.jar',
>>>> check the logs.
>>>> failureMessage  Failed to validate the pgp signature of
>>>>
>>>  
>>> '/org/apache/hadoop/hadoop-mapreduce-client-jobclient/3.1.2/hadoop-mapreduce-client-jobclient-3.1.2-test-sources.jar',
>>>> check the logs.
>>>> failureMessage  Failed to validate the pgp signature of
>>>>
>>>  
>>> '/org/apache/hadoop/hadoop-mapreduce-client-jobclient/3.1.2/hadoop-mapreduce-client-jobclient-3.1.2.pom',
>>>> check the logs.
>>>>
>>>>
>>>> Still exists and repeated hundreds of times. Do you know how to
>>>> access the logs mentioned by above log?
>>>>
>>>> Best,
>>>> Wangda
>>>>
>>>> On Sat, Jan 12, 2019 at 8:37 PM David Nalley <da...@gnsa.us> wrote:
>>>>
>>>> On Sat, Jan 12, 2019 at 9:09 PM Wangda Tan

Re: [VOTE] Release Apache Hadoop 3.2.0 - RC1

2019-01-14 Thread Elek, Marton
Thanks Sunil for managing this release.

+1 (non-binding)

1. built from the source (with clean local maven repo)
2. verified signatures + checksum
3. deployed 3 node cluster to Google Kubernetes Engine with generated
k8s resources [1]
4. Executed basic HDFS commands
5. Executed basic yarn example jobs

Marton

[1]: FTR: resources:
https://github.com/flokkr/k8s/tree/master/examples/hadoop , generator:
https://github.com/elek/flekszible
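
(The deploy step itself was roughly; the local path is an assumption based
on the repositories above:)

  kubectl apply -f examples/hadoop/
  kubectl get pods    # wait until all pods are Running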


On 1/8/19 12:42 PM, Sunil G wrote:
> Hi folks,
> 
> 
> Thanks to all of you who helped in this release [1] and for helping to vote
> for RC0. I have created the second release candidate (RC1) for Apache Hadoop
> 3.2.0.
> 
> 
> Artifacts for this RC are available here:
> 
> http://home.apache.org/~sunilg/hadoop-3.2.0-RC1/
> 
> 
> RC tag in git is release-3.2.0-RC1.
> 
> 
> 
> The maven artifacts are available via repository.apache.org at
> https://repository.apache.org/content/repositories/orgapachehadoop-1178/
> 
> 
> This vote will run 7 days (5 weekdays), ending on 14th Jan at 11:59 pm PST.
> 
> 
> 
> 3.2.0 contains 1092 [2] fixed JIRA issues since 3.1.0. The feature
> additions below are the highlights of this release.
> 
> 1. Node Attributes Support in YARN
> 
> 2. Hadoop Submarine project for running Deep Learning workloads on YARN
> 
> 3. Support service upgrade via YARN Service API and CLI
> 
> 4. HDFS Storage Policy Satisfier
> 
> 5. Support Windows Azure Storage - Blob file system in Hadoop
> 
> 6. Phase 3 improvements for S3Guard and Phase 5 improvements S3a
> 
> 7. Improvements in Router-based HDFS federation
> 
> 
> 
> Thanks to Wangda, Vinod, Marton for helping me in preparing the release.
> 
> I have done some testing with my pseudo cluster. My +1 to start.
> 
> 
> 
> Regards,
> 
> Sunil
> 
> 
> 
> [1]
> 
> https://lists.apache.org/thread.html/68c1745dcb65602aecce6f7e6b7f0af3d974b1bf0048e7823e58b06f@%3Cyarn-dev.hadoop.apache.org%3E
> 
> [2] project in (YARN, HADOOP, MAPREDUCE, HDFS) AND fixVersion in (3.2.0)
> AND fixVersion not in (3.1.0, 3.0.0, 3.0.0-beta1) AND status = Resolved
> ORDER BY fixVersion ASC
> 

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: [Urgent] Question about Nexus repo and Hadoop release

2019-01-14 Thread Elek, Marton
Seems to be an INFRA issue for me:

1. I downloaded a sample jar file [1] + the signature from the
repository and it was ok, locally I verified it.

2. I tested it with an other Apache project (Ratis) and my key. I got
the same problem even if it worked at last year during the 0.3.0
release. (I used exactly the same command)

I opened an infra ticket to check the logs of the Nexus as it was
suggested in the error message:

https://issues.apache.org/jira/browse/INFRA-17649

Marton


[1]:
https://repository.apache.org/service/local/repositories/orgapachehadoop-1183/content/org/apache/hadoop/hadoop-mapreduce-client-jobclient/3.1.2/hadoop-mapreduce-client-jobclient-3.1.2-javadoc.jar


On 1/13/19 6:27 AM, Wangda Tan wrote:
> Uploaded sample file and signature.  
> 
> 
> 
> On Sat, Jan 12, 2019 at 9:18 PM Wangda Tan wrote:
> 
> Actually, among the hundreds of failed messages, the "No public key"
> issues still occurred several times:
> 
> failureMessage  No public key: Key with id: (b3fa653d57300d45)
> was not able to be located on http://gpg-keyserver.de/. Upload
> your public key and try the operation again.
> failureMessage  No public key: Key with id: (b3fa653d57300d45)
> was not able to be located on
> http://pool.sks-keyservers.net:11371. Upload your public key and
> try the operation again.
> failureMessage  No public key: Key with id: (b3fa653d57300d45)
> was not able to be located on http://pgp.mit.edu:11371. Upload
> your public key and try the operation again.
> 
> Once the close operation returned, I will upload sample files which
> may help troubleshoot the issue. 
> 
> Thanks,
> 
> On Sat, Jan 12, 2019 at 9:04 PM Wangda Tan wrote:
> 
> Thanks David for the quick response! 
> 
> I just retried, now the "No public key" issue is gone. However, 
> the issue: 
> 
> failureMessage  Failed to validate the pgp signature of
> 
> '/org/apache/hadoop/hadoop-mapreduce-client-jobclient/3.1.2/hadoop-mapreduce-client-jobclient-3.1.2-tests.jar',
> check the logs.
> failureMessage  Failed to validate the pgp signature of
> 
> '/org/apache/hadoop/hadoop-mapreduce-client-jobclient/3.1.2/hadoop-mapreduce-client-jobclient-3.1.2-test-sources.jar',
> check the logs.
> failureMessage  Failed to validate the pgp signature of
> 
> '/org/apache/hadoop/hadoop-mapreduce-client-jobclient/3.1.2/hadoop-mapreduce-client-jobclient-3.1.2.pom',
> check the logs.
> 
> 
> Still exists and repeated hundreds of times. Do you know how to
> access the logs mentioned by above log?
> 
> Best,
> Wangda
> 
> On Sat, Jan 12, 2019 at 8:37 PM David Nalley wrote:
> 
> On Sat, Jan 12, 2019 at 9:09 PM Wangda Tan <wheele...@gmail.com> wrote:
> >
> > Hi Devs,
> >
> > I'm currently rolling Hadoop 3.1.2 release candidate,
> however, I saw an issue when I try to close repo in Nexus.
> >
> > Logs of https://repository.apache.org/#stagingRepositories
> (orgapachehadoop-1183) shows hundreds of lines of the
> following error:
> >
> > failureMessage  No public key: Key with id:
> (b3fa653d57300d45) was not able to be located on
> http://gpg-keyserver.de/. Upload your public key and try the
> operation again.
> > failureMessage  No public key: Key with id:
> (b3fa653d57300d45) was not able to be located on
> http://pool.sks-keyservers.net:11371. Upload your public key
> and try the operation again.
> > failureMessage  No public key: Key with id:
> (b3fa653d57300d45) was not able to be located on
> http://pgp.mit.edu:11371. Upload your public key and try the
> operation again.
> > ...
> > failureMessage  Failed to validate the pgp signature of
> 
> '/org/apache/hadoop/hadoop-yarn-registry/3.1.2/hadoop-yarn-registry-3.1.2-tests.jar',
> check the logs.
> > failureMessage  Failed to validate the pgp signature of
> 
> '/org/apache/hadoop/hadoop-yarn-registry/3.1.2/hadoop-yarn-registry-3.1.2-test-sources.jar',
> check the logs.
> > failureMessage  Failed to validate the pgp signature of
> 
> '/org/apache/hadoop/hadoop-yarn-registry/3.1.2/hadoop-yarn-registry-3.1.2-sources.jar',
> check the logs.
> >
> >
> > This is the same key I used before (and finished two
> releases), the same environment I used before.
> >
> > I 

Re: [DISCUSS] Move to gitbox

2018-12-13 Thread Elek, Marton



On 12/12/18 12:27 PM, Akira Ajisaka wrote:
> Thank you for your positive feedback! I'll file a jira to INFRA in this 
> weekend.
> 
>> If I understand correctly, the only bigger task here is updating all the
>> jenkins jobs (I am happy to help/contribute where I can).

> Thank you Elek for the information. Do you have the privilege to
> update the Jenkins jobs?
> 
I have, but I am more familiar with the Ozone jenkins jobs. I created a
jira (HADOOP-16003) to record the changes, where they can be
discussed or commented on by anybody with more expertise.

Marton

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: [DISCUSS] Move to gitbox

2018-12-09 Thread Elek, Marton


Thanks Akira,

+1 (non-binding)

I think it's better to do it now, at a planned date.

If I understand correctly, the only bigger task here is updating all the
jenkins jobs (I am happy to help/contribute where I can).


Marton

On 12/8/18 6:25 AM, Akira Ajisaka wrote:
> Hi all,
> 
> The Apache Hadoop git repository is on the git-wip-us server, which will be
> decommissioned.
> If there are no objections, I'll file a JIRA ticket with INFRA to
> migrate to https://gitbox.apache.org/ and update the documentation.
> 
> According to ASF infra team, the timeframe is as follows:
> 
>> - December 9th 2018 -> January 9th 2019: Voluntary (coordinated) relocation
>> - January 9th -> February 6th: Mandated (coordinated) relocation
>> - February 7th: All remaining repositories are mass migrated.
>> This timeline may change to accommodate various scenarios.
> 
> If we get consensus by January 9th, I can file a ticket with INFRA and
> migrate it.
> Even if we cannot reach consensus, the repository will be migrated by
> February 7th.
> 
> Regards,
> Akira
> 
> -
> To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
> 

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[ANNOUNCE] Apache Hadoop Ozone 0.3.0-alpha release

2018-11-22 Thread Elek, Marton


It gives me great pleasure to announce that the Apache Hadoop community
has voted to release Apache Hadoop Ozone 0.3.0-alpha (Arches).

Apache Hadoop Ozone is an object store for Hadoop built using Hadoop
Distributed Data Store.

This release contains a new S3 compatible interface and additional
stability improvements.

For more information and to download, please check

https://hadoop.apache.org/ozone

Many thanks to everyone who contributed to the release, and everyone in
the Apache Hadoop community! The release is a result of work from many
contributors. Thank you to all of them.

Cheers,
Marton Elek

ps: This release is still alpha quality; it is not recommended for use in
production.


-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[RESULT][VOTE] Release Apache Hadoop Ozone 0.3.0-alpha (RC1)

2018-11-19 Thread Elek, Marton


Thank you all very much for the tests and the votes.

The vote is PASSED with the following details:

3 binding +1, (thanks Arpit, Anu, Jitendra)

6 non-binding +1, (thanks Dinesh, Shashikant, Lokesh, Mukul, Bharat)
together with my closing +1 [*]

no -1/0

Thanks again, will publish the release and announce it soon...

Marton

[*]: My (non-binding) +1: Compiled from the source package and ran the full
acceptance test suite from the binary package.

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: [NOTICE] Move to gitbox

2019-01-07 Thread Elek, Marton
Please use the gitbox url instead of git.a.o:

https://gitbox.apache.org/repos/asf/hadoop.git

Or the github url:

g...@github.com:apache/hadoop.git


Note: To use the github url for _push_, you may need to link your github
account to your apache account: https://gitbox.apache.org/setup/
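
For an existing clone, repointing the remote is enough (the remote name
"apache" is taken from the output quoted below):

  git remote set-url apache https://gitbox.apache.org/repos/asf/hadoop.git
  git fetch apache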

Marton

ps: git.apache.org/hadoop was synced to the old repository and removed
by INFRA


On 1/7/19 3:11 PM, Steve Loughran wrote:
> doesn't like me doing a git pull on my R/O view of the repo
> 
>>  git remote -v
> apachegit://git.apache.org/hadoop.git (fetch)
> apachegit://git.apache.org/hadoop.git (push)
> 
>>  git checkout trunk; and git pull
> Checking out files: 100% (5084/5084), done.
> Switched to branch 'trunk'
> Your branch is up to date with 'apache/trunk'.
> fatal: remote error: access denied or repository not exported: /hadoop.git
> 
> 
> 
> 
>> On 7 Jan 2019, at 06:55, Akira Ajisaka <aajis...@apache.org> wrote:
>>
>> Thanks Ayush for the report and thanks Elek for the fix!
>>
>> -Akira
>>
>> On Thu, Jan 3, 2019 at 0:54, Elek, Marton <e...@apache.org> wrote:
>>>
>>> Thanks the report Ayush,
>>>
>>> The bogus repository is removed by the INFRA:
>>>
>>> https://issues.apache.org/jira/browse/INFRA-17526
>>>
>>> And the cwiki page[1] is updated to use the gitbox url instead of
>>> git.apache.org
>>>
>>> Marton
>>>
>>> [1] https://cwiki.apache.org/confluence/display/HADOOP/Git+And+Hadoop
>>>
>>> On 1/2/19 8:56 AM, Ayush Saxena wrote:
>>>> Hi Akira
>>>>
>>>> I guess the mirror at git.apache.org/hadoop.git hasn’t been updated
>>>> with the new location.
>>>>
>>>> It is still pointing to
>>>> https://git-wip-us.apache.org/repos/asf/hadoop.git
>>>>
>>>> Checking out the source still has its mention
>>>> As
>>>> git clone git://git.apache.org/hadoop.git
>>>>
>>>> Or does this link need to be updated?
>>>>
>>>> Can you give it a check?
>>>>
>>>> -Ayush
>>>>
>>>>> On 31-Dec-2018, at 3:33 AM, Akira Ajisaka  wrote:
>>>>>
>>>>> Hi all,
>>>>>
>>>>> The migration has been finished.
>>>>> All the jenkins jobs under the hadoop view and ozone view were also
>>>>> updated except the beam_PerformanceTests_* jobs.
>>>>>
>>>>> Thank you Elek and ASF infra team for your help!
>>>>>
>>>>> Regards,
>>>>> Akira
>>>>>
>>>>> On Tue, Dec 25, 2018 at 13:27, Akira Ajisaka wrote:
>>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> The Apache Hadoop git repository will be migrated to gitbox at 9PM UTC
>>>>>> in 30th December.
>>>>>> After the migration, the old repository cannot be accessed. Please use
>>>>>> the new repository for committing. The migration is pretty much atomic
>>>>>> and it will take up to a few minutes.
>>>>>>
>>>>>> Old repository: https://git-wip-us.apache.org/repos/asf?p=hadoop.git
>>>>>> New repository: https://gitbox.apache.org/repos/asf?p=hadoop.git
>>>>>>
>>>>>> The GitHub repository (https://github.com/apache/hadoop) is not
>>>>>> affected.
>>>>>>
>>>>>> Elek will update the jenkins jobs and I'll update the source code and
>>>>>> documentation as soon as the migration is finished.
>>>>>>
>>>>>> Discussion:
>>>>>> https://lists.apache.org/thread.html/8b37cd69191648f1163ee23e3498f33da1c44ac876c6225b429dc835@%3Ccommon-dev.hadoop.apache.org%3E
>>>>>>
>>>>>> JIRA:
>>>>>> - https://issues.apache.org/jira/browse/HADOOP-16003
>>>>>> - https://issues.apache.org/jira/browse/INFRA-17448
>>>>>>
>>>>>> Happy Holidays!
>>>>>>
>>>>>> -Akira
>>>>>



Re: [OZONE] Community calls

2019-01-07 Thread Elek, Marton
As a reminder:

In the new year we will continue our weekly, informal discussion about
the current Ozone issues/work items.

This call is open for anybody who is interested in Ozone or has any
Ozone-related questions.

The details are here:

https://cwiki.apache.org/confluence/display/HADOOP/Ozone+Community+Calls


From 2019 I would like to write quick summaries about the calls to make
the discussions more visible.

If there are no objections, I will send weekly mails...

 * only to hdfs-dev (it's storage related)
 * with [Ozone][status] prefix (to make it easy to filter out)


Any feedback is welcome,

Thanks,
Marton

On 10/5/18 5:51 PM, Elek, Marton wrote:
> 
> Hi everybody,
> 
> 
> We start a new community call series about Apache Hadoop Ozone. It's an
> informal discussion about the current items, short-term plans,
> directions and contribution possibilities.
> 
> Please join if you are interested or have questions about Ozone.
> 
> For more details, please check:
> 
> https://cwiki.apache.org/confluence/display/HADOOP/Ozone+Community+Calls
> 
> Marton
> 
> ps: As it's written in the wiki, this is not a replacement of the
> mailing lists. All main proposals/decisions will be published to the
> mailing list/wiki to generate further discussion.
> 



Re: [NOTICE] Move to gitbox

2019-01-08 Thread Elek, Marton
Yes. git.apache.org/hadoop is deceased, deleted by INFRA.

But both the gitbox and github urls are working. If you don't configure
credentials for one of them (or don't link your apache account with your
github account), you can use that one as a read-only repo url.

Marton

((I don't know the history of git.apache.org. It seems to have been a mirror
page only for the old-style+svn repositories))

On 1/7/19 3:18 PM, Steve Loughran wrote:
> OK. so the original git://git.apache.org/hadoop.git is in fact deceased?
> 
> 
> On 7 Jan 2019, at 14:15, Elek, Marton <e...@apache.org> wrote:
> 
> Please use the gitbox url instead of git.a.o:
> 
> https://gitbox.apache.org/repos/asf/hadoop.git
> 
> Or the github url:
> 
> g...@github.com:apache/hadoop.git
> 
> 
> Note: To use the github url for _push_, you may need to link your github
> account to your apache account: https://gitbox.apache.org/setup/
> 
> Marton
> 
> ps: git.apache.org/hadoop was synced to the old repository and removed
> by INFRA
> 
> 
> On 1/7/19 3:11 PM, Steve Loughran wrote:
> doesn't like me doing a git pull on my R/O view of the repo
> 
>  git remote -v
> apachegit://git.apache.org/hadoop.git (fetch)
> apachegit://git.apache.org/hadoop.git (push)
> 
>  git checkout trunk; and git pull
> Checking out files: 100% (5084/5084), done.
> Switched to branch 'trunk'
> Your branch is up to date with 'apache/trunk'.
> fatal: remote error: access denied or repository not exported: /hadoop.git
> 
> 
> 
> 
> On 7 Jan 2019, at 06:55, Akira Ajisaka <aajis...@apache.org> wrote:
> 
> Thanks Ayush for the report and thanks Elek for the fix!
> 
> -Akira
> 
> Jan 3, 2019 (Thu) 0:54 Elek, Marton <e...@apache.org>:
> 
> Thanks the report Ayush,
> 
> The bogus repository is removed by the INFRA:
> 
> https://issues.apache.org/jira/browse/INFRA-17526
> 
> And the cwiki page[1] is updated to use the gitbox url instead of
> git.apache.org
> 
> Marton
> 
> [1] https://cwiki.apache.org/confluence/display/HADOOP/Git+And+Hadoop
> 
> On 1/2/19 8:56 AM, Ayush Saxena wrote:
> Hi Akira
> 
> I guess the mirror at git.apache.org/hadoop.git hasn’t been updated
> with the new location.
> 
> It is still pointing to
> https://git-wip-us.apache.org/repos/asf/hadoop.git
> 
> Checking out the source still has its mention
> As
> git clone git://git.apache.org/hadoop.git
> 
> Or this link needs to be updated?
> 
> Can you give a check.
> 
> -Ayush
> 
> On 31-Dec-2018, at 3:33 AM, Akira Ajisaka  wrote:
> 
> Hi all,
> 
> The migration has been finished.
> All the jenkins jobs under the hadoop view and ozone view were also
> updated except the beam_PerformanceTests_* jobs.
> 
> Thank you Elek and ASF infra team for your help!
> 
> Regards,
> Akira
> 
> Dec 25, 2018 (Tue) 13:27 Akira Ajisaka :
> 
> Hi all,
> 
> The Apache Hadoop git repository will be migrated to gitbox at 9PM UTC
> in 30th December.
> After the migration, the old repository cannot be accessed. Please use
> the new repository for committing. The migration is pretty much atomic
> and it will take up to a few minutes.
> 
> Old repository: https://git-wip-us.apache.org/repos/asf?p=hadoop.git
> New repository: https://gitbox.apache.org/repos/asf?p=hadoop.git
> 
> The GitHub repository (https://github.com/apache/hadoop) is not
> affected.
> 
> Elek will update the jenkins jobs and I'll update the source code and
> documentation as soon as the migration is finished.
> 
> Discussion:
> https://lists.apache.org/thread.html/8b37cd69191648f1163ee23e3498f33da1c44ac876c6225b429dc835@%3Ccommon-dev.hadoop.apache.org%3E
> 
> JIRA:
> - https://issues.apache.org/jira/browse/HADOOP-16003
> - https://issues.apache.org/jira/browse/INFRA-17448
> 
> Happy Holidays!
> 
> -Akira
> 



Re: [NOTICE] Move to gitbox

2019-01-02 Thread Elek, Marton
Thanks for the report, Ayush,

The bogus repository is removed by the INFRA:

https://issues.apache.org/jira/browse/INFRA-17526

And the cwiki page[1] is updated to use the gitbox url instead of
git.apache.org

Marton

[1] https://cwiki.apache.org/confluence/display/HADOOP/Git+And+Hadoop

On 1/2/19 8:56 AM, Ayush Saxena wrote:
> Hi Akira
> 
> I guess the mirror at git.apache.org/hadoop.git hasn’t been updated with the 
> new location.
> 
> It is still pointing to https://git-wip-us.apache.org/repos/asf/hadoop.git
> 
> Checking out the source still has its mention 
> As 
> git clone git://git.apache.org/hadoop.git 
> 
> Or this link needs to be updated?
> 
> Can you give a check.
> 
> -Ayush
> 
>> On 31-Dec-2018, at 3:33 AM, Akira Ajisaka  wrote:
>>
>> Hi all,
>>
>> The migration has been finished.
>> All the jenkins jobs under the hadoop view and ozone view were also
>> updated except the beam_PerformanceTests_* jobs.
>>
>> Thank you Elek and ASF infra team for your help!
>>
>> Regards,
>> Akira
>>
>> Dec 25, 2018 (Tue) 13:27 Akira Ajisaka :
>>>
>>> Hi all,
>>>
>>> The Apache Hadoop git repository will be migrated to gitbox at 9PM UTC
>>> in 30th December.
>>> After the migration, the old repository cannot be accessed. Please use
>>> the new repository for committing. The migration is pretty much atomic
>>> and it will take up to a few minutes.
>>>
>>> Old repository: https://git-wip-us.apache.org/repos/asf?p=hadoop.git
>>> New repository: https://gitbox.apache.org/repos/asf?p=hadoop.git
>>>
>>> The GitHub repository (https://github.com/apache/hadoop) is not affected.
>>>
>>> Elek will update the jenkins jobs and I'll update the source code and
>>> documentation as soon as the migration is finished.
>>>
>>> Discussion: 
>>> https://lists.apache.org/thread.html/8b37cd69191648f1163ee23e3498f33da1c44ac876c6225b429dc835@%3Ccommon-dev.hadoop.apache.org%3E
>>>
>>> JIRA:
>>> - https://issues.apache.org/jira/browse/HADOOP-16003
>>> - https://issues.apache.org/jira/browse/INFRA-17448
>>>
>>> Happy Holidays!
>>>
>>> -Akira
>>



Re: [DISCUSS] Docker build process

2019-03-19 Thread Elek, Marton



Thank you, Eric, for describing the problem.

I have several small comments; let me try to separate them.

I. Separated vs. in-build container image creation

> The disadvantages are:
>
> 1.  Require developer to have access to docker.
> 2.  Default build takes longer.


These are not the only disadvantages (IMHO), as I wrote in the
previous thread and in the issue [1].

Using in-build container image creation doesn't make it possible:

1. to modify the image later (eg. apply security fixes to the container
itself or apply improvements for the startup scripts)
2. create images for older releases (eg. hadoop 2.7.1)

I think there are two kind of images:

a) images for released artifacts
b) developer images

I would prefer to manage a) with separated branch repositories, and b)
with an (optional!) in-build process.
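
To make the difference concrete, a rough sketch of the two flows (the
-Pdocker profile name here is only an assumption, not an existing profile):

  # a) release image: built by dockerhub from a dedicated packaging branch
  git clone -b docker-hadoop-3 https://github.com/apache/hadoop.git
  # b) developer image: an optional step of the normal local build
  mvn clean package -DskipTests -Pdocker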

II. I agree with Steve: it's better to make it optional, as most of
the time it's not required. We should support the default
dev build with the default settings (=just enough to start).

III. Maven best practices

(https://dzone.com/articles/maven-profile-best-practices)

I think this is a good article. But it is not against profiles as such,
only against creating multiple versions of the same artifact with the same
name (eg. jdk8/jdk11). In Hadoop, profiles are used to introduce optional
steps. I think that's fine, as the maven lifecycle/phase model is very
static (compare it with the tree-based approach in Gradle).

Marton

[1]: https://issues.apache.org/jira/browse/HADOOP-16091

On 3/13/19 11:24 PM, Eric Yang wrote:
> Hi Hadoop developers,
> 
> In the recent months, there were various discussions on creating docker build 
> process for Hadoop.  There was convergence to make docker build process 
> inline in the mailing list last month when Ozone team is planning new 
> repository for Hadoop/ozone docker images.  New feature has started to add 
> docker image build process inline in Hadoop build.
> A few lessons learnt from making docker build inline in YARN-7129.  The build 
> environment must have docker to have a successful docker build.  BUILD.txt 
> stated for easy build environment use Docker.  There is logic in place to 
> ensure that absence of docker does not trigger docker build.  The inline 
> process tries to be as non-disruptive as possible to existing development 
> environment with one exception.  If docker’s presence is detected, but user 
> does not have rights to run docker.  This will cause the build to fail.
> 
> Now, some developers are pushing back on inline docker build process because 
> existing environment did not make docker build process mandatory.  However, 
> there are benefits to use inline docker build process.  The listed benefits 
> are:
> 
> 1.  Source code tag, maven repository artifacts and docker hub artifacts can 
> all be produced in one build.
> 2.  Less manual labor to tag different source branches.
> 3.  Reduce intermediate build caches that may exist in multi-stage builds.
> 4.  Release engineers and developers do not need to search a maze of build 
> flags to acquire artifacts.
> 
> The disadvantages are:
> 
> 1.  Require developer to have access to docker.
> 2.  Default build takes longer.
> 
> There is workaround for above disadvantages by using -DskipDocker flag to 
> avoid docker build completely or -pl !modulename to bypass subprojects.
> Hadoop development did not follow Maven best practice because a full Hadoop 
> build requires a number of profile and configuration parameters.  Some 
> evolutions are working against Maven design and require fork of separate 
> source trees for different subprojects and pom files.  Maven best practice 
> (https://dzone.com/articles/maven-profile-best-practices) has explained that 
> do not use profile to trigger different artifact builds because it will 
> introduce maven artifact naming conflicts on maven repository using this 
> pattern.  Maven offers flags to skip certain operations, such as -DskipTests 
> -Dmaven.javadoc.skip=true -pl or -DskipDocker.  It seems worthwhile to make 
> some corrections to follow best practice for Hadoop build.
> 
> Some developers have advocated for separate build process for docker images.  
> We need consensus on the direction that will work best for Hadoop development 
> community.  Hence, my questions are:
> 
> Do we want to have inline docker build process in maven?
> If yes, it would be developer’s responsibility to pass -DskipDocker flag to 
> skip docker.  Docker is mandatory for default build.
> If no, what is the release flow for docker images going to look like?
> 
> Thank you for your feedback.
> 
> Regards,
> Eric
> 




Re: [DISCUSS] Docker build process

2019-03-21 Thread Elek, Marton



> If versioning is done correctly, older branches can have the same docker 
> subproject, and Hadoop 2.7.8 can be released for older Hadoop branches.  We 
> don't generate timeline paradox to allow changing the history of Hadoop 
> 2.7.1.  That release has passed and let it stay that way.

I understand your point but I am afraid that my concerns were not
expressed clearly enough (sorry for that).

Let's say that we use centos as the base image. In case of a security
problem on the centos side (eg. in libssl) or jdk side, I would rebuild
all the hadoop:2.x / hadoop:3.x images and republish them. Exactly the
same hadoop bytes but updated centos/jdk libraries.

I understand your concern that in this case an image with the same
tag (eg. hadoop:3.2.1) would be changed over time. But this can be
solved by adding date-specific postfixes (eg. the hadoop:3.2.1-20190321 tag
would never change but hadoop:3.2.1 can be changed).
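
Just a sketch of what I mean (it assumes we have push rights to an
apache/hadoop dockerhub repository):

  # same hadoop bytes, rebuilt on an updated base image
  docker build -t apache/hadoop:3.2.1-20190321 .
  # move the floating tag to the fresh build
  docker tag apache/hadoop:3.2.1-20190321 apache/hadoop:3.2.1
  docker push apache/hadoop:3.2.1-20190321
  docker push apache/hadoop:3.2.1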

I know that it's not perfect, but this is widely used. For example the
centos:7 tag is not fixed but centos:7.6.1810 is (hopefully).

Without this flexibility any centos/jdk security issue can invalidate
all of our images (and would require new releases from all the active lines)

Marton




Re: [DISCUSS] Docker build process

2019-03-22 Thread Elek, Marton



Thanks for the answer,

I agree, sha256-based tags seem to be safer, and versions should be
bumped only after some tests.


Let's say we have multiple hadoop docker images:

apache/hadoop:3.2.0
apache/hadoop:3.1.2
apache/hadoop:2.9.2
apache/hadoop:2.8.5
apache/hadoop:2.7.7


If I understood correctly, your proposal is the following:

In case of any security issue in centos/jdk, or in case of any bug in
the apache/hadoop-runner base image (we have a few shell/python scripts
there):

1) We need to wait until the next release (3.2.1) to fix them, which
means all the previous images would remain insecure / broken forever (but
still available?)

OR

2) in case of a serious problem a new release can be created from all
the lines (3.2.1, 3.1.3, 2.9.3, 2.8.6) with the help of all the release
managers. (old images remain the same).



But on the other hand, the image creation would be as easy as activating
a new profile during the release. (As a contrast: using a separated repo, a
new branch would be created and the version in the Dockerfile would be
adjusted.)

Marton

ps: for development (non-published images) I am convinced that an
optional docker profile can be an easier way to create images. I will
create a similar plugin execution for this Dockerfile:

https://github.com/apache/hadoop/tree/trunk/hadoop-ozone/dist

On 3/21/19 11:33 PM, Eric Yang wrote:
> The flexibility of date appended release number is equivalent to maven 
> snapshot or Docker latest image convention, machine can apply timestamp 
> better than human.  By using the Jenkins release process, this can be done 
> with little effort.  For official release, it is best to use Docker image 
> digest id to ensure uniqueness.  E.g.
> 
> FROM 
> centos@sha256:67dad89757a55bfdfabec8abd0e22f8c7c12a1856514726470228063ed86593b
>  
> 
> Developer downloaded released source would build with the same docker image 
> without getting side effects.  
> 
> A couple years ago, RedHat has decided to fix SSL vulnerability in RedHat 6/7 
> by adding extra parameter to disable certification validation in urllib2 
> python library and force certificate signer validation on by default.  It 
> completely broke Ambari agent and its self-signed certificate.  Customers had 
> to backtrack to pick up a specific version of python SSL library to keep 
> their production cluster operational.  Without doing the due-diligence of 
> certify Hadoop code and the OS image, there is wriggle room for errors.  OS 
> update example is a perfect example that we want the container OS image 
> certified with Hadoop binary release to avoid the wriggle rooms.  Snapshot 
> release is ok to have wriggle room for developers, but I don't think that 
> flexibility is necessary for official release.
> 
> Regards,
> Eric
> 
> On 3/21/19, 2:44 PM, "Elek, Marton"  wrote:
> 
> 
> 
> > If versioning is done correctly, older branches can have the same 
> docker subproject, and Hadoop 2.7.8 can be released for older Hadoop 
> branches.  We don't generate timeline paradox to allow changing the history 
> of Hadoop 2.7.1.  That release has passed and let it stay that way.
> 
> I understand your point but I am afraid that my concerns were not
> expressed clearly enough (sorry for that).
> 
> Let's say that we use centos as the base image. In case of a security
> problem on the centos side (eg. in libssl) or jdk side, I would rebuild
> all the hadoop:2.x / hadoop:3.x images and republish them. Exactly the
> same hadoop bytes but updated centos/jdk libraries.
> 
> I understand your concerns that in this case the an image with the same
> tag (eg. hadoop:3.2.1) will be changed over the time. But this can be
> solved by adding date specific postfixes (eg. hadoop:3.2.1-20190321 tag
> would never change but hadoop:3.2.1 can be changed)
> 
> I know that it's not perfect, but this is widely used. For example the
> centos:7 tag is not fixed but centos:7.6.1810 is (hopefully).
> 
> Without this flexibility any centos/jdk security issue can invalidate
> all of our images (and would require new releases from all the active 
> lines)
> 
> Marton
> 
> 
> 
> 



Re: proposed new repository for hadoop/ozone docker images (+update on docker works)

2019-01-30 Thread Elek, Marton
Thanks, Eric, for the suggestions.

Unfortunately (as Anu wrote) our use-case is slightly different.

It was discussed in HADOOP-14898 and HDDS-851 but let me summarize the
motivation:

We would like to upload containers to dockerhub for each release
(eg: apache/hadoop:3.2.0)

According to the Apache release policy, it's not allowed to publish
snapshot builds (=not voted by the PMC) outside of the developer community.

1. We started to follow the pattern used by other Apache
projects: docker containers are just a different packaging of the already
voted binary releases. Therefore we create the containers from the voted
releases. (See [1] as an example.)

2. By separating the build of the source code and the docker image we
get additional benefits: for example, we can rebuild the images in case
of a security problem in the underlying container OS. This is just a new
empty commit on the branch and the original release will be repackaged.
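
As a sketch, the rebuild could be as simple as (branch name from [1]):

  git checkout docker-hadoop-3
  # empty commit: same release bits, but dockerhub re-runs the automated build
  git commit --allow-empty -m "Rebuild image with patched base OS"
  git push origin docker-hadoop-3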

3. Technically it would be possible to add the Dockerfile to the source
tree and publish the docker image together with the release by the
release manager but it's also problematic:

  a) there is no easy way to stage the images for the vote
  b) we have no access to the apache dockerhub credentials
  c) it couldn't be flagged as automated on dockerhub
  d) it couldn't support critical updates, as I wrote in (2.)

So the easy way we found is to ask INFRA to register a branch on
dockerhub to use for the image creation. The build/packaging will be
done by dockerhub, but only released artifacts will be included.
Because of the dockerhub limitation on mapping branch names to tags, we
need a new repository instead of a branch (see the comments in HDDS-851
for more details).

We also have a different use case: building developer images to create a
test cluster. These images will never be uploaded to the hub. We have a
Dockerfile in the source tree for this use case (see HDDS-872). And
thank you very much for the hint, I will definitely check how YARN-7129
does it and will try to learn from it.

Thanks,
Marton


[1]: https://github.com/apache/hadoop/tree/docker-hadoop-3



On 1/30/19 2:50 AM, Anu Engineer wrote:
> Marton please correct me I am wrong, but I believe that without this branch 
> it is hard for us to push to Apache DockerHub. This allows for Apache account 
> integration and dockerHub.
> Does YARN publish to the Docker Hub via Apache account?
> 
> 
> Thanks
> Anu
> 
> 
> On 1/29/19, 4:54 PM, "Eric Yang"  wrote:
> 
> By separating Hadoop docker related build into a separate git repository 
> have some slippery slope.  It is harder to synchronize the changes between 
> two separate source trees.  There is multi-steps process to build jar, 
> tarball, and docker images.  This might be problematic to reproduce.
> 
> It would be best to arrange code such that docker image build process can 
> be invoked as part of maven build process.  The profile is activated only if 
> docker is installed and running on the environment.  This allows to produce 
> jar, tarball, and docker images all at once without hindering existing build 
> procedure.
> 
> YARN-7129 is one of the examples that making a subproject in YARN to 
> build a docker image that can run in YARN.  It automatically detects presence 
> of docker and build docker image when docker is available.  If docker is not 
> running, the subproject skips and proceed to next sub-project.  Please try 
> out YARN-7129 style of build process, and see this is a possible solution to 
> solve docker image generation issue?  Thanks
> 
> Regards,
> Eric
> 
> On 1/29/19, 3:44 PM, "Arpit Agarwal"  
> wrote:
> 
>     I’ve requested a new repo hadoop-docker-ozone.git in gitbox.
> 
> 
> > On Jan 22, 2019, at 4:59 AM, Elek, Marton  wrote:
> > 
> > 
> > 
> > TLDR;
> > 
> > I proposed to create a separated git repository for ozone docker 
> images
> > in HDDS-851 (hadoop-docker-ozone.git)
> > 
> > If there is no objections in the next 3 days I will ask an Apache 
> Member
> > to create the repository.
> > 
> > 
> > 
> > 
> > LONG VERSION:
> > 
> > In HADOOP-14898 multiple docker containers and helper scripts are
> > created for Hadoop.
> > 
> > The main goal was to:
> > 
> > 1.) help the development with easy-to-use docker images
> > 2.) provide official hadoop images to make it easy to test new 
> features
> > 
> > As of now we have:
>

Re: [DISCUSS] Making submarine to different release model like Ozone

2019-02-01 Thread Elek, Marton
+1.

I like the idea.

For me, submarine/ML-job-execution seems to be a natural extension of
the existing Hadoop/Yarn capabilities.

And I like the proposed project structure / release lifecycle, too. I
think it's better to be more modularized but keep the development in the
same project. IMHO it worked well with the Ozone releases. We can do
more frequent releases and support multiple versions of core hadoop,
while the tested new improvements can be moved back to hadoop-common.

Marton

On 1/31/19 7:53 PM, Wangda Tan wrote:
> Hi devs,
> 
> Since we started submarine-related effort last year, we received a lot of
> feedbacks, several companies (such as Netease, China Mobile, etc.)  are
> trying to deploy Submarine to their Hadoop cluster along with big data
> workloads. Linkedin also has big interests to contribute a Submarine TonY (
> https://github.com/linkedin/TonY) runtime to allow users to use the same
> interface.
> 
> From what I can see, there're several issues of putting Submarine under
> yarn-applications directory and have same release cycle with Hadoop:
> 
> 1) We started 3.2.0 release at Sep 2018, but the release is done at Jan
> 2019. Because of non-predictable blockers and security issues, it got
> delayed a lot. We need to iterate submarine fast at this point.
> 
> 2) We also see a lot of requirements to use Submarine on older Hadoop
> releases such as 2.x. Many companies may not upgrade Hadoop to 3.x in a
> short time, but the requirement to run deep learning is urgent to them. We
> should decouple Submarine from Hadoop version.
> 
> And why we wanna to keep it within Hadoop? First, Submarine included some
> innovation parts such as enhancements of user experiences for YARN
> services/containerization support which we can add it back to Hadoop later
> to address common requirements. In addition to that, we have a big overlap
> in the community developing and using it.
> 
> There're several proposals we have went through during Ozone merge to trunk
> discussion:
> https://mail-archives.apache.org/mod_mbox/hadoop-common-dev/201803.mbox/%3ccahfhakh6_m3yldf5a2kq8+w-5fbvx5ahfgs-x1vajw8gmnz...@mail.gmail.com%3E
> 
> I propose to adopt Ozone model: which is the same master branch, different
> release cycle, and different release branch. It is a great example to show
> agile release we can do (2 Ozone releases after Oct 2018) with less
> overhead to setup CI, projects, etc.
> 
> *Links:*
> - JIRA: https://issues.apache.org/jira/browse/YARN-8135
> - Design doc
> 
> - User doc
> 
> (3.2.0
> release)
> - Blogposts, {Submarine} : Running deep learning workloads on Apache Hadoop
> ,
> (Chinese Translation: Link )
> - Talks: Strata Data Conf NY
> 
> 
> Thoughts?
> 
> Thanks,
> Wangda Tan
> 




Re: proposed new repository for hadoop/ozone docker images (+update on docker works)

2019-01-31 Thread Elek, Marton


Hi Eric,

Thanks for the answers

1.

> Hadoop-docker-ozone.git source tree naming seems to create a unique
> process for Ozone.

Not at all. We would like to follow the existing practice established
in HADOOP-14898. In HDDS-851 we discussed why we need two
separate repositories for hadoop/ozone: because of the limitation of the
dockerhub branch/tag mapping.

I am 100% open to switching to another approach. I would suggest
creating a JIRA for that, as it requires code modification in the
docker-hadoop-* branches.


2.

> Flagging automated build on dockerhub seems conflicts with Apache
> release policy.

Honestly, I don't know. It was discussed in HADOOP-14989 and the
connected INFRA ticket, and there were no arguments against it,
especially as we just followed the existing practice started by other
projects.

Now that I checked the docker-related INFRA tickets again, it seems that
we have two other practices since then:

 1) build docker image on the jenkins (is it compliant?)
 2) get permission to push to the apache/... from local.

You suggested using the second one. Do you have more information on how
that is possible? How, and by whom, can permission be requested to push
to apache/hadoop, for example?


3.

From one point of view, publishing existing, voted releases in docker
images is something like repackaging them. But you may be right that
this is wrong, because they should be handled as separate releases.

Do you know of any official ASF wiki/doc/mail discussion about managing
docker images? If not, I would suggest creating a new wiki/doc, as it
seems that we have no clear answer on the most compliant way to do it.

4.

Thank you for the suggestion to use dockerhub/our own namespace to stage
docker images during the build. Sounds good to me. But I also wrote
about some other problems in my previous mail (3 b,c,d); this is just
one of them (3/a). Do you have any suggestions to solve the other problems?

 * Updating existing images (for example in case of an ssl bug, rebuild
all the existing images with exactly the same payload but updated base
image/os environment)

 * Creating images for older releases (We would like to provide images
for hadoop 2.6/2.7/2.8/2.9, especially for doing automatic testing
with different versions).

Thanks a lot,
Marton


On 1/30/19 6:50 PM, Eric Yang wrote:
> Hi Marton,
> 
> Hi Marton,
> 
> Flagging automated build on dockerhub seems conflicts with Apache release 
> policy.  The vote and release process are manual processes of Apache Way.  
> Therefore, 3 b)-3 d) improvement will be out of reach unless policy changes.
> 
> YARN-7129 is straight forward by using dockerfile-maven-plugin to build 
> docker image locally.  It also checks for existence of /var/run/docker.sock 
> to ensure docker is running.  This allows the docker image to build in 
> developer sandbox, if the developer sandbox mounts the host 
> /var/run/docker.sock.  Maven deploy can configure repository location and 
> authentication credential using ~/.docker/config.json and maven settings.xml. 
>  This can upload release candidate image to release manager's dockerhub 
> account for release vote.  Once the vote passes, the image can be pushed to 
> Apache official dockerhub repository by release manager or an Apache Jenkin 
> job to tag the image and push to Apache account.
> 
> Ozone image and application catalog image are in similar situation that test 
> image can be built and tested locally.  The official voted artifacts can be 
> uploaded to Apache dockerhub account.  Hence, less variant of the same 
> procedure will be great.  Hadoop-docker-ozone.git source tree naming seems to 
> create a unique process for Ozone.  I think it would be preferable to call 
> the Hadoop-docker.git that comprise all docker image builds or 
> dockerfile-maven-plugin approach.
> 
> Regards,
> Eric
> 
> On 1/30/19, 12:56 AM, "Elek, Marton"  wrote:
> 
> Thanks Eric the suggestions.
> 
> Unfortunately (as Anu wrote it) our use-case is slightly different.
> 
> It was discussed in HADOOP-14898 and HDDS-851 but let me summarize the
> motivation:
> 
> We would like to upload containers to the dockerhub for each releases
> (eg: apache/hadoop:3.2.0)
> 
> According to the Apache release policy, it's not allowed, to publish
> snapshot builds (=not voted by PMC) outside of the developer community.
> 
> 1. We started to follow the pattern which is used by other Apache
> projects: docker containers are just different packaging of the already
> voted binary releases. Therefore we create the containers from the voted
> releases. (See [1] as an example)
> 
> 2. With separating the build of the source code and the docker image we
> get additional benefits: for example we can rebuild the images in case
> of a security problem in the underlying container OS. [...]

Re: proposed new repository for hadoop/ozone docker images (+update on docker works)

2019-02-05 Thread Elek, Marton
Thanks, Eric, for the answers.


If I understood correctly, these are two proposals (use the same repository;
use an inline build). I created separate jiras for both of them, where we
can discuss the technical details:

https://issues.apache.org/jira/browse/HADOOP-16092

https://issues.apache.org/jira/browse/HADOOP-16091


Until those jiras are implemented we can use the existing approach,
but (again) I am fine with switching to any newer approach anytime. The
only thing we need is the availability of the images during any
transition.


I started to document the current state in the wiki to make the
discussion easier.

https://cwiki.apache.org/confluence/display/HADOOP/Container+support

https://cwiki.apache.org/confluence/display/HADOOP/Ozone+Container+support

Marton




On 1/31/19 8:00 PM, Eric Yang wrote:
> 1, 3. There are 38 Apache projects hosting docker images on Docker hub using 
> Apache Organization.  By browsing Apache github mirror.  There are only 7 
> projects using a separate repository for docker image build.  Popular 
> projects official images are not from Apache organization, such as zookeeper, 
> tomcat, httpd.  We may not disrupt what other Apache projects are doing, but 
> it looks like inline build process is widely employed by majority of projects 
> such as Nifi, Brooklyn, thrift, karaf, syncope and others.  The situation 
> seems a bit chaotic for Apache as a whole.  However, Hadoop community can 
> decide what is best for Hadoop.  My preference is to remove ozone from source 
> tree naming, if Ozone is intended to be subproject of Hadoop for long period 
> of time.  This enables Hadoop community to host docker images for various 
> subproject without having to check out several source tree to trigger a grand 
> build.  However, inline build process seems more popular than separated 
> process.  Hence, I highly recommend making docker build inline if possible.
> 
> 2. I think open an INFRA ticket, and there are Jenkins users who can 
> configure the job to run on nodes that have Apache repo credential.
> 
> 4. The docker image name maps to maven project name.  Hence, if it is 
> Hadoop-ozone as project name.  The convention automatically follows the maven 
> artifact name with option to customize.  I think it is reasonable and it 
> automatically tagged with the same maven project version, which minimize 
> version number management between maven and docker.
> 
> Regards,
> Eric
> 
> On 1/31/19, 8:59 AM, "Elek, Marton"  wrote:
> 
> 
> Hi Eric,
> 
> Thanks for the answers
> 
> 1.
> 
> > Hadoop-docker-ozone.git source tree naming seems to create a unique
> process for Ozone.
> 
> Not at all. We would like to follow the existing practice which is
> established in HADOOP-14898. In HDDS-851 we discussed why we need two
> separated repositories for hadoop/ozone: because the limitation of the
> dockerhub branch/tag mapping.
> 
> I am 100% open to switch to use an other approach. I would suggest to
> create a JIRA for that as it requires code modification in the
> docker-hadoop-* branches.
> 
> 
> 2.
> 
> > Flagging automated build on dockerhub seems conflicts with Apache
> release policy.
> 
> Honestly I don't know. It was discussed in HADOOP-14989 and the
> connected INFRA ticket and there was no arguments against it. Especially
> as we just followed the existing practice and we just followed the
> practice which is started by other projects.
> 
> Now I checked again the docker related INFRA tickets it seems that we
> have two other practice since than:
> 
>  1) build docker image on the jenkins (is it compliant?)
>  2) get permission to push to the apache/... from local.
> 
> You suggested to the the second one. Do you have more information how is
> it possible? How and who can request permission to push the
> apache/hadoop for example?
> 
> 
> 3.
> 
> From one point of view, publishing existing, voted releases in docker
> images is something like to repackage it. But you may have right and
> this is wrong because it should be handled as separated releases.
> 
> Do you know any official ASF wiki/doc/mail discussion about managing
> docker images? If not, I would suggest to create a new wiki/doc as it
> seems that we have no clear answer which is the most compliant way to do 
> it.
> 
> 4.
> 
> Thank you the suggestions to use dockerhub/own namespace to stage docker
> images during the build. Sounds good to me. But I also wrote some other
> problems in my previous mail (3 b,c,d); this is just one of them (3/a). [...]

Re: [VOTE] Propose to start new Hadoop sub project "submarine"

2019-02-04 Thread Elek, Marton
+1 (non-binding)

(my arguments are in the discuss thread. small move, huge benefit)

Thanks,
Marton

On 2/1/19 11:15 PM, Wangda Tan wrote:
> Hi all,
> 
> According to positive feedbacks from the thread [1]
> 
> This is vote thread to start a new subproject named "hadoop-submarine"
> which follows the release process already established for ozone.
> 
> The vote runs for usual 7 days, which ends at Feb 8th 5 PM PDT.
> 
> Thanks,
> Wangda Tan
> 
> [1]
> https://lists.apache.org/thread.html/f864461eb188bd12859d51b0098ec38942c4429aae7e4d001a633d96@%3Cyarn-dev.hadoop.apache.org%3E
> 




Re: VOTE: Hadoop Ozone 0.4.0-alpha RC1

2019-04-20 Thread Elek, Marton
+1 (non-binding)

 - build from source
 - run all the smoketest (from the fresh build based on the src package)
 - run all the smoketest from the binary package
 - signature files are checked
 - sha512 checksums are verified
 - ozone version shows the right commit information

Thanks, Ajay, for all the release work,
Marton

ps: used archlinux, java 8, docker-compose 1.23.2, docker 18.09.4-ce

On 4/20/19 3:24 PM, Xiaoyu Yao wrote:
> 
> +1 (binding)
> 
> - Build from source
> - Misc security tests with docker compose
> - MR and Spark sample jobs with secure ozone cluster
> 
> —Xiaoyu
> 
>> On Apr 19, 2019, at 3:40 PM, Anu Engineer  
>> wrote:
>>
>> +1 (Binding)
>>
>> -- Verified the checksums.
>> -- Built from sources.
>> -- Sniff tested the functionality.
>>
>> --Anu
>>
>>
>> On Mon, Apr 15, 2019 at 4:09 PM Ajay Kumar 
>> wrote:
>>
>>> Hi all,
>>>
>>> We have created the second release candidate (RC1) for Apache Hadoop Ozone
>>> 0.4.0-alpha.
>>>
>>> This release contains security payload for Ozone. Below are some important
>>> features in it:
>>>
>>>  *   Hadoop Delegation Tokens and Block Tokens supported for Ozone.
>>>  *   Transparent Data Encryption (TDE) Support - Allows data blocks to be
>>> encrypted-at-rest.
>>>  *   Kerberos support for Ozone.
>>>  *   Certificate Infrastructure for Ozone  - Tokens use PKI instead of
>>> shared secrets.
>>>  *   Datanode to Datanode communication secured via mutual TLS.
>>>  *   Ability secure ozone cluster that works with Yarn, Hive, and Spark.
>>>  *   Skaffold support to deploy Ozone clusters on K8s.
>>>  *   Support S3 Authentication Mechanisms like - S3 v4 Authentication
>>> protocol.
>>>  *   S3 Gateway supports Multipart upload.
>>>  *   S3A file system is tested and supported.
>>>  *   Support for Tracing and Profiling for all Ozone components.
>>>  *   Audit Support - including Audit Parser tools.
>>>  *   Apache Ranger Support in Ozone.
>>>  *   Extensive failure testing for Ozone.
>>>
>>> The RC artifacts are available at
>>> https://home.apache.org/~ajay/ozone-0.4.0-alpha-rc1
>>>
>>> The RC tag in git is ozone-0.4.0-alpha-RC1 (git hash
>>> d673e16d14bb9377f27c9017e2ffc1bcb03eebfb)
>>>
>>> Please try out
>>> <https://cwiki.apache.org/confluence/display/HADOOP/Running+via+Apache+Release>,
>>> vote, or just give us feedback.
>>>
>>> The vote will run for 5 days, ending on April 20, 2019, 19:00 UTC.
>>>
>>> Thank you very much,
>>>
>>> Ajay
>>>
>>>
>>>
> 



Re: VOTE: Hadoop Ozone 0.4.0-alpha RC2

2019-05-04 Thread Elek, Marton
+1 (non-binding).

Thanks for the continuous effort, Ajay. This is the best Ozone release
package I have ever seen.

I checked the following:

 * Signatures are checked: OK
 * sha512 checksums are checked: OK
 * can be built from the source: OK
 * smoketest executed from the bin package (after build): OK
 * smoketest executed from the src package: OK
 * 'ozone version' shows the right version info: OK
 * docs are included: OK
 * docs are visible from the web ui: OK
 * Using latest release ratis: OK

Marton

On 4/30/19 6:04 AM, Ajay Kumar wrote:
> Hi All,
> 
> 
> 
> We have created the third release candidate (RC2) for Apache Hadoop Ozone 
> 0.4.0-alpha.
> 
> 
> 
> This release contains security payload for Ozone. Below are some important 
> features in it:
> 
> 
> 
>   *   Hadoop Delegation Tokens and Block Tokens supported for Ozone.
>   *   Transparent Data Encryption (TDE) Support - Allows data blocks to be 
> encrypted-at-rest.
>   *   Kerberos support for Ozone.
>   *   Certificate Infrastructure for Ozone  - Tokens use PKI instead of 
> shared secrets.
>   *   Datanode to Datanode communication secured via mutual TLS.
>   *   Ability secure ozone cluster that works with Yarn, Hive, and Spark.
>   *   Skaffold support to deploy Ozone clusters on K8s.
>   *   Support S3 Authentication Mechanisms like - S3 v4 Authentication 
> protocol.
>   *   S3 Gateway supports Multipart upload.
>   *   S3A file system is tested and supported.
>   *   Support for Tracing and Profiling for all Ozone components.
>   *   Audit Support - including Audit Parser tools.
>   *   Apache Ranger Support in Ozone.
>   *   Extensive failure testing for Ozone.
> 
> The RC artifacts are available at 
> https://home.apache.org/~ajay/ozone-0.4.0-alpha-rc2/
> 
> 
> 
> The RC tag in git is ozone-0.4.0-alpha-RC2 (git hash 
> 4ea602c1ee7b5e1a5560c6cbd096de4b140f776b)
> 
> 
> 
> Please try it out, vote, or just give us feedback.
> 
> 
> 
> The vote will run for 5 days, ending on May 4, 2019, 04:00 UTC.
> 
> 
> 
> Thank you very much,
> 
> Ajay
> 




Re: [DISCUSS] Docker build process

2019-05-06 Thread Elek, Marton
Thanks for the answers, Eric Yang. I think we have a similar view about how
the releases work, and what you wrote is exactly the reason why I
prefer the current method (docker image creation from a separated branch)
to the proposed one (creating images from maven).

1. Not all the branches can be deprecated. Usually we have two or three
branches which have a large user base; we can't deprecate all but the last one.

2. Yes, release managers of the old releases may or may not be
available.

3. This is one reason to use 100% voted and approved packages inside
container images:

 * It makes it clear what's inside (hadoop version shows that these are
exactly the same bits which were voted and approved by the PMC)

 * It makes it possible to upgrade the convenience docker packaging (and
not hadoop!) of older but actively used releases (eg. 3.1 today), for
example in case of a serious ssl problem.

 * I prefer to keep container images for a few older versions. In Ozone
there are tests for the compatibility between different hadoop
versions, and docker containers (with older images) help a lot to test it.
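
For example (a sketch; it assumes per-release images like the above exist
and that the hadoop binary is on the image's PATH):

  # quick sanity run of older release images
  docker run --rm apache/hadoop:2.7.7 hadoop version
  docker run --rm apache/hadoop:3.1.2 hadoop version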

Marton





>> 1) We need to wait until the next release to fix them (3.2.1) which
>> means all the previous images would be unsecure / bad forever (but still
>> available?)
> 
> 
> Yes.  This prevents recursive 3.2.1.1.1 version forking to be maintained by
> Apache.  Some company own internal decision might require them to use FROM
> apache/hadoop:3.2.0 and applies their own internal patch.  Apache can phase
> out deprecated versions, and old version can be found in archives.apache.org


>> 2) in case of a serious problem a new release can be created from all
>> the lines (3.2.1, 3.1.3, 2.9.3, 2.8.6) with the help of all the release
>> managers. (old images remain the same).
>>
> 
> Release manager come and go, branches will eventually die off.  There is no
> need to address super old images with unreachable release manager (maybe
> retired).  The release only happens when there is demand for it.






[DISCUSS] Separate Hadoop Core trunk and Hadoop Ozone trunk source tree

2019-09-17 Thread Elek, Marton




TLDR; I propose to move the Ozone-related code out of Hadoop trunk and 
store it in a separate *Hadoop* git repository, apache/hadoop-ozone.git





When Ozone was adopted as a new Hadoop subproject it was proposed[1] to 
be part of the source tree but with a separate release cadence, mainly 
because it had hadoop-trunk/SNAPSHOT as a compile-time dependency.


During the last Ozone releases this dependency was removed to provide 
more stable releases. Instead of using the latest trunk/SNAPSHOT build 
from Hadoop, Ozone uses the latest stable Hadoop (3.2.0 as of now).


As we no longer have a strict dependency between Hadoop trunk SNAPSHOT and 
Ozone trunk, I propose to separate the two code bases from each other by 
creating a new Hadoop git repository (apache/hadoop-ozone.git):


With moving Ozone to a separate git repository:

 * It would be easier to contribute and understand the build (as of now 
we always need `-f pom.ozone.xml` as a Maven parameter; see the sketch 
after this list).
 * It would be possible to adjust the build process without breaking 
Hadoop/Ozone builds.
 * It would be possible to use different Readme/.asf.yaml/github 
templates for Hadoop Ozone and core Hadoop. (For example the current 
github template [2] has a link to the contribution guideline [3]. Ozone 
has an extended version [4] of this guideline with additional 
information.)
 * Testing would be safer as it won't be possible to change core 
Hadoop and Hadoop Ozone in the same patch.
 * It would be easier to cut branches for Hadoop releases (based on the 
original consensus, Ozone should be removed from all the release 
branches after creating release branches from trunk).
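
For example (a sketch; the post-split command assumes the new repository
keeps a standard Maven layout):

  # today, inside the combined hadoop tree:
  mvn clean install -DskipTests -f pom.ozone.xml
  # after the split, in apache/hadoop-ozone.git:
  mvn clean install -DskipTests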



What do you think?

Thanks,
Marton

[1]: 
https://lists.apache.org/thread.html/c85e5263dcc0ca1d13cbbe3bcfb53236784a39111b8c353f60582eb4@%3Chdfs-dev.hadoop.apache.org%3E
[2]: 
https://github.com/apache/hadoop/blob/trunk/.github/pull_request_template.md

[3]: https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute
[4]: 
https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute+to+Ozone





Re: [DISCUSS] Separate Hadoop Core trunk and Hadoop Ozone trunk source tree

2019-09-18 Thread Elek, Marton

> one thing to consider here as you are giving up your ability to make
> changes in hadoop-* modules, including hadoop-common, and their
> dependencies, in sync with your own code. That goes for filesystem contract
> tests.
>
> are you happy with that?


Yes. I think we can live with it.

Fortunately the Hadoop parts which are used by Ozone (security + rpc) 
are stable enough; we didn't need bigger changes until now (small 
patches are already included in 3.1/3.2).


I think it's better to use released Hadoop bits in Ozone anyway, and 
worst (best?) case we can try to do more frequent patch releases from 
Hadoop (if required).



m.


On 9/18/19 12:06 PM, Steve Loughran wrote:

one thing to consider here as you are giving up your ability to make
changes in hadoop-* modules, including hadoop-common, and their
dependencies, in sync with your own code. That goes for filesystem contract
tests.

are you happy with that?

On Tue, Sep 17, 2019 at 10:48 AM Elek, Marton  wrote:




TLDR; I propose to move Ozone related code out from Hadoop trunk and
store it in a separated *Hadoop* git repository apache/hadoop-ozone.git




When Ozone was adopted as a new Hadoop subproject it was proposed[1] to
be part of the source tree but with separated release cadence, mainly
because it had the hadoop-trunk/SNAPSHOT as compile time dependency.

During the last Ozone releases this dependency is removed to provide
more stable releases. Instead of using the latest trunk/SNAPSHOT build
from Hadoop, Ozone uses the latest stable Hadoop (3.2.0 as of now).

As we have no more strict dependency between Hadoop trunk SNAPSHOT and
Ozone trunk I propose to separate the two code base from each other with
creating a new Hadoop git repository (apache/hadoop-ozone.git):

With moving Ozone to a separated git repository:

   * It would be easier to contribute and understand the build (as of now
we always need `-f pom.ozone.xml` as a Maven parameter)
   * It would be possible to adjust build process without breaking
Hadoop/Ozone builds.
   * It would be possible to use different Readme/.asf.yaml/github
template for the Hadoop Ozone and core Hadoop. (For example the current
github template [2] has a link to the contribution guideline [3]. Ozone
has an extended version [4] from this guideline with additional
information.)
   * Testing would be more safe as it won't be possible to change core
Hadoop and Hadoop Ozone in the same patch.
   * It would be easier to cut branches for Hadoop releases (based on the
original consensus, Ozone should be removed from all the release
branches after creating relase branches from trunk)


What do you think?

Thanks,
Marton

[1]:

https://lists.apache.org/thread.html/c85e5263dcc0ca1d13cbbe3bcfb53236784a39111b8c353f60582eb4@%3Chdfs-dev.hadoop.apache.org%3E
[2]:

https://github.com/apache/hadoop/blob/trunk/.github/pull_request_template.md
[3]: https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute
[4]:

https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute+to+Ozone




Re: [VOTE] Release Apache Hadoop 3.2.1 - RC0

2019-09-18 Thread Elek, Marton



+1 (binding)

Thanks Rohith the work with the release.



 * built from the source (archlinux)
 * verified signatures
 * verified sha512 checksums
 * started a docker-based pseudo cluster
 * tested basic HDFS operations with CLI
 * Checked if the sources are uploaded to the maven staging repo

Note 1: I haven't seen the ./patchprocess/gpgagent.conf file in earlier 
releases and it seems to be included. But I don't think it's a blocker.


Note 2: *.sha512 files could be improved before uploading by removing 
the absolute path (to make them easier to check with sha512sum -c).
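
For example (a sketch, with an assumed artifact name):

  # generate the checksum with a relative path...
  sha512sum hadoop-3.2.1.tar.gz > hadoop-3.2.1.tar.gz.sha512
  # ...so that it can be verified from the download directory:
  sha512sum -c hadoop-3.2.1.tar.gz.sha512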



Marton


On 9/18/19 4:25 PM, Ayush Saxena wrote:

Thanks Rohith for driving the release.

+1 (non binding)

-Built from source on Ubuntu-18.04
-Successful Native build.
-Verified basic HDFS Commands.
-Verified basic Erasure Coding Commands.
-Verified basic RBF commands.
-Browsed HDFS UI.

Thanks

-Ayush

On Wed, 18 Sep 2019 at 15:41, Weiwei Yang  wrote:


+1 (binding)

Downloaded tarball, setup a pseudo cluster manually
Verified basic HDFS operations, copy/view files
Verified basic YARN operations, run sample DS jobs
Verified basic YARN restful APIs, e.g cluster/nodes info etc
Set and verified YARN node-attributes, including CLI

Thanks
Weiwei
On Sep 18, 2019, 11:41 AM +0800, zhankun tang wrote:

+1 (non-binding).
Installed and verified it by running several Spark job and DS jobs.

BR,
Zhankun

On Wed, 18 Sep 2019 at 08:05, Naganarasimha Garla <
naganarasimha...@apache.org> wrote:


Verified the source and the binary tar and the sha512 checksums
Installed and verified the basic hadoop operations (ran few MR tasks)

+1.

Thanks,
+ Naga

On Wed, Sep 18, 2019 at 1:32 AM Anil Sadineni wrote:

+1 (non-binding)

On Tue, Sep 17, 2019 at 9:55 AM Santosh Marella wrote:

+1 (non-binding)

On Wed, Sep 11, 2019 at 12:26 AM Rohith Sharma K S <
rohithsharm...@apache.org> wrote:

Hi folks,

I have put together a release candidate (RC0) for Apache Hadoop 3.2.1.

The RC is available at:
http://home.apache.org/~rohithsharmaks/hadoop-3.2.1-RC0/

The RC tag in git is release-3.2.1-RC0:
https://github.com/apache/hadoop/tree/release-3.2.1-RC0

The maven artifacts are staged at
https://repository.apache.org/content/repositories/orgapachehadoop-1226/

You can find my public key at:
https://dist.apache.org/repos/dist/release/hadoop/common/KEYS

This vote will run for 7 days (5 weekdays), ending on 18th Sept at 11:59 pm PST.

I have done testing with a pseudo cluster and distributed shell job. My +1
to start.

Thanks & Regards
Rohith Sharma K S






--
Thanks & Regards,
Anil Sadineni
Solutions Architect, Optlin Inc
Ph: 571-438-1974 | www.optlin.com












[VOTE] Force "squash and merge" option for PR merge on github UI

2019-07-17 Thread Elek, Marton
Hi,

Github UI (ui!) helps to merge Pull Requests to the proposed branch.
There are three different ways to do it [1]:

1. Keep all the different commits from the PR branch and create one
additional merge commit ("Create a merge commit")

2. Squash all the commits and commit the change as one patch ("Squash
and merge")

3. Keep all the different commits from the PR branch but rebase, merge
commit will be missing ("Rebase and merge")



As only option 2 is compatible with the existing development
practices of Hadoop (1 issue = 1 patch = 1 commit), I call for a lazy
consensus vote: if there are no objections within 3 days, I will ask INFRA to
disable options 1 and 3 to make the process less error-prone.

Please let me know what you think.

Thanks a lot
Marton

ps: Personally I prefer to merge from local as it enables me to sign the
commits and do a final build before pushing. But this is a different story;
this proposal is only about removing the options which are obviously
risky...
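
For reference, my local flow looks roughly like this (a sketch; the PR
number and JIRA id are placeholders, and the pull/... ref assumes the
github remote):

  git fetch origin pull/1234/head:pr-1234
  git checkout trunk
  git merge --squash pr-1234
  git commit -S -m "HADOOP-XXXXX. Summary. Contributed by ..."
  mvn clean install -DskipTests   # final build before push
  git push origin trunk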

ps2: You can always do any kind of merge / commits from CLI, for example
to merge a feature branch together with keeping the history.

[1]:
https://help.github.com/en/articles/merging-a-pull-request#merging-a-pull-request-on-github




Re: [VOTE] Force "squash and merge" option for PR merge on github UI

2019-07-22 Thread Elek, Marton


Thanks for all the positive feedback,

I opened INFRA-18777 to request the proposed change.

Marton

On 7/18/19 10:02 AM, Masatake Iwasaki wrote:
> +1
> 
> Thanks,
> Masatake Iwasaki
> 
> On 7/17/19 15:07, Elek, Marton wrote:
>> Hi,
>>
>> Github UI (ui!) helps to merge Pull Requests to the proposed branch.
>> There are three different ways to do it [1]:
>>
>> 1. Keep all the different commits from the PR branch and create one
>> additional merge commit ("Create a merge commit")
>>
>> 2. Squash all the commits and commit the change as one patch ("Squash
>> and merge")
>>
>> 3. Keep all the different commits from the PR branch but rebase, merge
>> commit will be missing ("Rebase and merge")
>>
>>
>>
>> As only the option 2 is compatible with the existing development
>> practices of Hadoop (1 issue = 1 patch = 1 commit), I call for a lazy
>> consensus vote: If no objections withing 3 days, I will ask INFRA to
>> disable the options 1 and 3 to make the process less error prone.
>>
>> Please let me know, what do you think,
>>
>> Thanks a lot
>> Marton
>>
>> ps: Personally I prefer to merge from local as it enables to sign the
>> commits and do a final build before push. But this is a different story,
>> this proposal is only about removing the options which are obviously
>> risky...
>>
>> ps2: You can always do any kind of merge / commits from CLI, for example
>> to merge a feature branch together with keeping the history.
>>
>> [1]:
>> https://help.github.com/en/articles/merging-a-pull-request#merging-a-pull-request-on-github
>>
>>



[VOTE] create ozone-dev and ozone-issues mailing lists

2019-10-27 Thread Elek, Marton



As discussed earlier in the thread "Hadoop-Ozone repository mailing 
list configurations" [1], I suggested solving the current 
misconfiguration problem by creating separate mailing lists 
(dev/issues) for Hadoop Ozone.


It would have some additional benefits: for example it would make it easier 
to follow the Ozone development and future plans.


Here I am starting a new vote thread (open for at least 72 hours) to 
collect more feedback about this.


Please express your opinion / vote.

Thanks a lot,
Marton

[1] 
https://lists.apache.org/thread.html/dc66a30f48a744534e748c418bf7ab6275896166ca5ade11560ebaef@%3Chdfs-dev.hadoop.apache.org%3E


-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: [VOTE] create ozone-dev and ozone-issues mailing lists

2019-10-29 Thread Elek, Marton

> The immediate problem we need to fix is to prevent github updates from
> spamming the dev mailing list.
> Might make sense to just have a separate issues@ mailing list and point
> github to that?


For me, it's less confusing to create both the issues and the dev 
list at the same time, just to follow the existing pattern.


I would prefer to solve multiple problems at the same time:

 * remove the notification noise from the hdfs-dev
 * create a place where certain features / design decisions can be discussed

I assume that we will follow the current practice:

 * All generic votes / releases will be sent to all the *-dev@hadoop 
mailing lists


 * Generic hadoop-storage specific information / summaries can be sent 
to both hdfs-dev/ozone-dev


 * Low-level design discussions can be part of ozone-dev only


But if you see any problem with ozone-dev please downvote that part and 
I will create only the -issues list...


Thanks,
Marton




On 10/28/19 6:07 PM, Jitendra Pandey wrote:

The immediate problem we need to fix is to prevent github updates from
spamming the dev mailing list.
Might make sense to just have a separate issues@ mailing list and point
github to that?

On Sun, Oct 27, 2019 at 10:12 PM Dinesh Chitlangia
 wrote:


+1

-Dinesh




On Sun, Oct 27, 2019, 4:25 AM Elek, Marton 

As discussed earlier in the thread "Hadoop-Ozone repository mailing
list configurations" [1], I suggested solving the current
misconfiguration problem by creating separate mailing lists
(dev/issues) for Hadoop Ozone.

It would have some additional benefits: for example it would make it easier
to follow the Ozone development and future plans.

Here I am starting a new vote thread (open for at least 72 hours) to
collect more feedback about this.

Please express your opinion / vote.

Thanks a lot,
Marton

[1]



https://lists.apache.org/thread.html/dc66a30f48a744534e748c418bf7ab6275896166ca5ade11560ebaef@%3Chdfs-dev.hadoop.apache.org%3E


-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org








-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: [Discuss] Hadoop-Ozone repository mailing list configurations

2019-10-23 Thread Elek, Marton

Thanks for reporting this problem, Rohith.

Yes, it seems to be configured with the wrong mailing list.

I think the right fix is to create ozone-dev@ and ozone-issues@ and use 
them instead of hdfs-(dev/issues).


Are there any objections to creating new ozone-* mailing lists?

Thanks,
Marton


On 10/21/19 6:03 AM, Rohith Sharma K S wrote:

+ common/yarn and mapreduce/submarine

Looks like the same issue exists in the submarine repository too!


On Mon, 21 Oct 2019 at 09:30, Rohith Sharma K S 
wrote:


Folks,

In the Hadoop world, every mailing list has its own purpose, as below:
1. The hdfs/common/yarn/mapreduce-*dev* mailing lists are meant for developer
discussions.
2. The hdfs/common/yarn/mapreduce-*issues* mailing lists are used for comments
made in the issues.

  It appears the Hadoop-Ozone repository configured the *hdfs-dev* mailing list
as the *hdfs-issues* list as well. As a result the hdfs-dev mailing list is
bombarded with every comment made in the hadoop-ozone repository.


Could it be fixed?

-Rohith Sharma K S







-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: [DISCUSS] Separate Hadoop Core trunk and Hadoop Ozone trunk source tree [discussion -> lazy vote]

2019-09-23 Thread Elek, Marton
> Do you see a Submarine like split-also-into-a-TLP for Ozone? If not 
now, sometime further down the line?


Good question, and I don't know what the best answer is right now. It's 
definitely an option, but the Submarine move hasn't been finished, so it's 
not yet possible to learn from the experience (which can be a useful 
input for the decision).


I think it's a bigger/more important question and I would prefer to 
start a new thread about it.


>  If so, why not do both at the same time?

That's an easier question: I think the repo separation is an easier 
step, with immediate benefits, therefore I would prefer to do it as soon 
as possible.


Moving to a separate TLP may take months (discussion, vote, proposal, 
board approval, etc.), while this code organization step can be done 
easily after the 0.4.1 Ozone release (which is very close, I hope).


As it should be done anyway (with or without a separate TLP) I propose to 
do it after the next Ozone release (in the next 1-2 weeks).




As the overall feedback was positive (in fact many of the answers were 
simple +1 votes) I don't think the thread should be repeated under a 
[VOTE] subject. Therefore I call for a lazy consensus. If you have 
any objections (against doing the repo separation now, or doing it at 
all) please express them in the next 3 days...


Thanks a lot,
Marton

On 9/22/19 4:02 PM, Vinod Kumar Vavilapalli wrote:

Looks to me that the advantages of this additional step are only incremental 
given that you've already decoupled releases and dependencies.

Do you see a Submarine like split-also-into-a-TLP for Ozone? If not now, 
sometime further down the line? If so, why not do both at the same time? I felt 
the same way with Submarine, but couldn't follow up in time.

Thanks
+Vinod


On Sep 18, 2019, at 4:04 AM, Wangda Tan  wrote:

+1 (binding).

 From my experiences of Submarine project, I think moving to a separate repo
helps.

- Wangda

On Tue, Sep 17, 2019 at 11:41 AM Subru Krishnan  wrote:


+1 (binding).

IIUC, there will not be an Ozone module in trunk anymore as that was my
only concern from the original discussion thread? IMHO, this should be the
default approach for new modules.

On Tue, Sep 17, 2019 at 9:58 AM Salvatore LaMendola (BLOOMBERG/ 731 LEX) <
slamendo...@bloomberg.net> wrote:


+1

From: e...@apache.org At: 09/17/19 05:48:32 To:

hdfs-...@hadoop.apache.org,

mapreduce-...@hadoop.apache.org,  common-dev@hadoop.apache.org,
yarn-...@hadoop.apache.org
Subject: [DISCUSS] Separate Hadoop Core trunk and Hadoop Ozone trunk
source tree


TLDR; I propose to move Ozone related code out from Hadoop trunk and
store it in a separated *Hadoop* git repository apache/hadoop-ozone.git


When Ozone was adopted as a new Hadoop subproject it was proposed[1] to
be part of the source tree but with a separate release cadence, mainly
because it had the hadoop-trunk/SNAPSHOT as a compile-time dependency.

During the last Ozone releases this dependency was removed to provide
more stable releases. Instead of using the latest trunk/SNAPSHOT build
from Hadoop, Ozone uses the latest stable Hadoop (3.2.0 as of now).

As we no longer have a strict dependency between Hadoop trunk SNAPSHOT and
Ozone trunk, I propose to separate the two code bases from each other by
creating a new Hadoop git repository (apache/hadoop-ozone.git):

With moving Ozone to a separated git repository:

  * It would be easier to contribute and understand the build (as of now
we always need `-f pom.ozone.xml` as a Maven parameter)
  * It would be possible to adjust the build process without breaking
Hadoop/Ozone builds.
  * It would be possible to use different Readme/.asf.yaml/github
template for the Hadoop Ozone and core Hadoop. (For example the current
github template [2] has a link to the contribution guideline [3]. Ozone
has an extended version [4] from this guideline with additional
information.)
  * Testing would be safer as it won't be possible to change core
Hadoop and Hadoop Ozone in the same patch.
  * It would be easier to cut branches for Hadoop releases (based on the
original consensus, Ozone should be removed from all the release
branches after creating release branches from trunk)


What do you think?

Thanks,
Marton

[1]:



https://lists.apache.org/thread.html/c85e5263dcc0ca1d13cbbe3bcfb53236784a39111b8c353f60582eb4@%3Chdfs-dev.hadoop.apache.org%3E
[2]:



https://github.com/apache/hadoop/blob/trunk/.github/pull_request_template.md

[3]:

https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute

[4]:



https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute+to+Ozone


-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org








-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org

Re: [DISCUSS] Release Docs pointers Hadoop site

2019-10-08 Thread Elek, Marton

To be honest, I have no idea. I don't know about the historical meaning.

But as there is no other feedback, here are my guesses based on pure logic:

 * current -> should point to the release with the highest number (3.2.1)
 * stable -> to the stable 3.x release with the highest number (3.2.1 
as of now)


current2 -> latest 2.x release
stable2 -> latest stable 2.x release

>> 1. But if the release manager of 3.1 line thinks 3.1.3 is stable, and 3.2
>> line is also in stable state, which release should get precedence to be
>> called as *stable* in any release line (2.x or 3.x) ?

It depends on whether stable2 means (second highest stable) or (stable from the 2.x 
line). I think the second meaning is more reasonable.


>> 3.1.3 is getting released now, could
>> http://hadoop.apache.org/docs/current/ shall be updated to 3.1.3 ? is it
>> the norms ?

No. The stable link should point to the highest stable release, not to the stable 
release that came out most recently.


Marton

On 9/30/19 10:09 AM, Sunil Govindan wrote:

Bumping up this thread again for feedback.
@Zhankun Tang   is now waiting for a confirmation to
complete 3.1.3 release publish activities.

- Sunil

On Fri, Sep 27, 2019 at 11:03 AM Sunil Govindan  wrote:


Hi Folks,

At present,
http://hadoop.apache.org/docs/stable/  points to *Apache Hadoop 3.2.1*
http://hadoop.apache.org/docs/current/ points to *Apache Hadoop 3.2.1*
http://hadoop.apache.org/docs/stable2/  points to *Apache Hadoop 2.9.2*
http://hadoop.apache.org/docs/current2/ points to *Apache Hadoop 2.9.2*

3.2.1 is released last day. *Now 3.1.3 has completed voting* and it is in
the final stages of staging
As per me,
a) 3.2.1 will be still be pointing to
http://hadoop.apache.org/docs/stable/ ?
b) 3.1.3 should be pointing to http://hadoop.apache.org/docs/current/ ?

Now my questions,
1. But if the release manager of 3.1 line thinks 3.1.3 is stable, and 3.2
line is also in stable state, which release should get precedence to be
called as *stable* in any release line (2.x or 3.x) ?
or do we need a vote or discuss thread to decide which release shall be
called as stable per release line?
2. Given 3.2.1 is released and pointing to 3.2.1 as stable, then when
3.1.3 is getting released now, could
http://hadoop.apache.org/docs/current/ shall be updated to 3.1.3 ? is it
the norms ?

Thanks
Sunil





-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: [VOTE] Release Apache Hadoop Ozone 0.4.1-alpha

2019-10-10 Thread Elek, Marton



+1

Thank you, Nanda, for the enormous work to make this release happen.



 * GPG Signatures are fine
 * SHA512 signatures are fine
 * Can be built from the source package (in isolated environment 
without cached hadoop/ozone artifacts)

 * Started the pseudo cluster with `compose/ozone`
 * Executed the FULL smoke-test suite (`cd compose && ./test-all.sh`); ALL 
passed except some intermittent issues:
   * the kinit step failed due to a timeout, but after that all the secure 
tests passed. I think my laptop was too slow... + I had other 
CPU-sensitive tasks in the meantime

 * Tested to create apache/hadoop-ozone:0.4.1 image
 * Using hadoop-docker-ozone/Dockerfile [1]
 * Started a single, one-node cluster + tested with the AWS CLI 
(REDUCED_REDUNDANCY) (`docker run elek/ozone:test`)
 * Started a pseudo cluster (`docker run elek/ozone:test cat 
docker-compose.yaml && docker run elek/ozone:test cat docker-config`)

 * Tested with kubernetes:
   * Used the image which is created earlier
   * Replaced the images under kubernetes/examples/minikube
   * Started with kubectl `kubectl apply -f` to k3s (3!) cluster
   * Tested with `ozone sh` commands (put/get keys; see the sketch below)
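
A hypothetical sketch of that Kubernetes check (the pod, volume and key names 
are made up; the `ozone sh` calls mirror the create/put/get flow):

```
kubectl apply -f kubernetes/examples/minikube/
kubectl exec -it scm-0 -- ozone sh volume create /vol1
kubectl exec -it scm-0 -- ozone sh bucket create /vol1/bucket1
kubectl exec -it scm-0 -- ozone sh key put /vol1/bucket1/key1 README.md
kubectl exec -it scm-0 -- ozone sh key get /vol1/bucket1/key1 /tmp/key1.out
```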


Marton

[1]:
```
docker build --build-arg \
  OZONE_URL=https://home.apache.org/~nanda/ozone/release/0.4.1/RC0/hadoop-ozone-0.4.1-alpha.tar.gz \
  -t elek/ozone-test .

```

On 10/4/19 7:42 PM, Nanda kumar wrote:

Hi Folks,

I have put together RC0 for Apache Hadoop Ozone 0.4.1-alpha.

The artifacts are at:
https://home.apache.org/~nanda/ozone/release/0.4.1/RC0/

The maven artifacts are staged at:
https://repository.apache.org/content/repositories/orgapachehadoop-1238/

The RC tag in git is at:
https://github.com/apache/hadoop/tree/ozone-0.4.1-alpha-RC0

And the public key used for signing the artifacts can be found at:
https://dist.apache.org/repos/dist/release/hadoop/common/KEYS

This release contains 363 fixes/improvements [1].
Thanks to everyone who put in the effort to make this happen.

*The vote will run for 7 days, ending on October 11th at 11:59 pm IST.*
Note: This release is alpha quality, it’s not recommended to use in
production but we believe that it’s stable enough to try out the feature
set and collect feedback.


[1] https://s.apache.org/yfudc

Thanks,
Team Ozone



-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: [DISCUSS] Separate Hadoop Core trunk and Hadoop Ozone trunk source tree [discussion -> lazy vote]

2019-10-13 Thread Elek, Marton



Sure. Will do it that way.

Thanks the feedback.

Marton

On 10/12/19 7:27 PM, Vinod Kumar Vavilapalli wrote:

On Mon, Sep 23, 2019 at 4:02 PM Elek, Marton  wrote:


  As the overall feedback was positive (in fact many of the answers were
simple +1 votes) I don't think the thread should be repeated under
[VOTE] subject. Therefore I call it for a lazy consensus.



Let's please not do this in the future.

These are large enough changes that we'd like to get formal votes both for
immediate visibility (VOTE threads are more in your face) as well as for
record-keeping in posterity.

Thanks
+Vinod



-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[NOTE] Ozone is moved out from Hadoop trunk

2019-10-13 Thread Elek, Marton



As discussed earlier Ozone is moved out from Hadoop trunk.

Please commit / create pull requests to the

--> https://github.com/apache/hadoop-ozone

in the future.

Remaining Ozone code will be deleted from Hadoop trunk with HDDS-2288.

Marton

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: [VOTE] Release Apache Hadoop Thirdparty 1.0.0

2020-02-28 Thread Elek, Marton



Thank you very much for working on this release, Vinay; a 1.0.0 is always 
hard work...



1. I downloaded it and I can build it from the source

2. Checked the signature and the sha512 of the src package and they are fine

3. Yetus seems to be included in the source package. I am not sure if 
it's intentional but I would remove the patchprocess directory from the 
tar file.


4. NOTICE.txt seems to be outdated (I am not sure, but I think the 
Export Notice is unnecessary, especially for the source release; the 
note about bouncycastle and the Yarn server is also unnecessary).


5. NOTICE-binary and LICENSE-binary seem to be unused (and they contain 
unrelated entries, especially the NOTICE), IMHO.


6. As far as I understand the binary release in this case is the maven 
artifact. IANAL but the original protobuf license seems to be missing 
from "unzip -p hadoop-shaded-protobuf_3_7-1.0.0.jar META-INF/LICENSE.txt"


7. Minor nit: I would suggest using only the filename in the sha512 
files (instead of having the /build/source/target prefix). It would make it 
possible to validate the checksum with a plain `sha512sum -c` (see the 
sketch below).
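
A minimal sketch of those checks (the file names are hypothetical; assuming 
GNU coreutils and the Hadoop KEYS file):

```
# import the release signing keys and verify the signature
gpg --import KEYS
gpg --verify hadoop-thirdparty-1.0.0-src.tar.gz.asc hadoop-thirdparty-1.0.0-src.tar.gz

# this only works if the .sha512 file lists the bare filename, not a build path
sha512sum -c hadoop-thirdparty-1.0.0-src.tar.gz.sha512
```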


Thanks again for working on this,
Marton

ps: I am not experienced enough with licensing to judge which of 
these are blocking, and I might be wrong.




On 2/25/20 8:17 PM, Vinayakumar B wrote:

Hi folks,

Thanks to everyone's help on this release.

I have created a release candidate (RC0) for Apache Hadoop Thirdparty 1.0.0.

RC Release artifacts are available at :
   http://home.apache.org/~vinayakumarb/release/hadoop-thirdparty-1.0.0-RC0/

Maven artifacts are available in staging repo:
 https://repository.apache.org/content/repositories/orgapachehadoop-1258/

The RC tag in git is here:
https://github.com/apache/hadoop-thirdparty/tree/release-1.0.0-RC0

And my public key is at:
https://dist.apache.org/repos/dist/release/hadoop/common/KEYS

*This vote will run for 5 days, ending on March 1st 2020 at 11:59 pm IST.*

For the testing, I have verified Hadoop trunk compilation with
"-DdistMgmtSnapshotsUrl=
https://repository.apache.org/content/repositories/orgapachehadoop-1258/
-Dhadoop-thirdparty-protobuf.version=1.0.0"

My +1 to start.

-Vinay



-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[DISCUSS] making Ozone a separate Apache project

2020-05-13 Thread Elek, Marton




I would like to start a discussion about making Ozone a separate Apache 
project.




### HISTORY [1]

 * Apache Hadoop Ozone development started on a feature branch of the 
Hadoop repository (HDFS-7240)

 * In October 2017 a discussion was started about merging it into 
the Hadoop main branch

 * After a long discussion it was merged to Hadoop trunk in March 2018

 * During the discussion of the merge, it was suggested multiple times 
to create a separate project for Ozone. But at that time:

1). Ozone was tightly integrated with Hadoop/HDFS
2). There was an active plan to use the block layer of Ozone (HDDS or 
HDSL at that time) as the block layer of HDFS

3). The community of Ozone was a subset of the HDFS community

 * The first beta release of Ozone has just come out. It seems to be a 
good time, before the first GA, to make a decision about the future.




### WHAT HAS BEEN CHANGED

 During the last years Ozone has become more and more independent, both on 
the community and on the code side. The separation has been suggested again and 
again (for example by Owen [2] and Vinod [3])




 From the COMMUNITY point of view:


  * Fortunately more and more new contributors are helping Ozone. 
Originally the Ozone community was a subset of the HDFS project. But now a 
bigger and bigger part of the community is related to Ozone only.

  * It seems to be easier to _build_ the community as a separate project.

  * A new, younger project might have different practices 
(communication, committer criteria, development style) compared to an old, 
mature project

  * It's easier to communicate (and improve) these standards in a 
separate project with clean boundaries

  * A separate project/brand can help to increase the adoption rate and 
attract more individual contributors (AFAIK this has been seen in Submarine 
after a similar move)

 * The contribution process can be communicated more easily, and we can make 
first-time contributions easier




 From the CODE point of view Ozone became more and more independent:

 * Ozone has a different release cycle

 * The code is already separated from the Hadoop code base 
(apache/hadoop-ozone.git)

 * It has separate CI (GitHub Actions)

 * Ozone uses a different (stricter) coding style (zero tolerance of 
unit test / checkstyle errors)

 * The code itself became more and more independent from Hadoop at the 
Maven level. Originally it was compiled together with the latest in-tree 
Hadoop snapshot. Now it depends on released Hadoop artifacts (RPC, 
Configuration...)

 * It has started to use multiple versions of Hadoop (on the client side)

 * The volume of resolved issues is already very high on the Ozone side (Ozone 
had slightly more resolved issues than HDFS/YARN/MAPREDUCE/COMMON all 
together in the last 2-3 months)



Summary: Before the first Ozone GA release, it seems to be a good time 
to discuss the long-term future of Ozone. Managing it as a separate TLP 
seems to have more benefits.



Please let me know what your opinion is...

Thanks a lot,
Marton





[1]: For more details, see: 
https://github.com/apache/hadoop-ozone/blob/master/HISTORY.md


[2]: 
https://lists.apache.org/thread.html/0d0253f6e5fa4f609bd9b917df8e1e4d8848e2b7fdb3099b730095e6%40%3Cprivate.hadoop.apache.org%3E


[3]: 
https://lists.apache.org/thread.html/8be74421ea495a62e159f2b15d74627c63ea1f67a2464fa02c85d4aa%40%3Chdfs-dev.hadoop.apache.org%3E


-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: [DISCUSS] making Ozone a separate Apache project

2020-05-18 Thread Elek, Marton





One question, for the committers who contributed to Ozone before and got
the committer-role in the past (like me), will they carry the
committer-role to the new repo?



In short: yes.


In more details:

This discussion (if there is agreement) should be followed by a follow-up 
discussion + vote about a very specific proposal which should contain 
all the technical information (including the committer list).


I support the same approach that we followed with Submarine:

ALL the existing (Hadoop) committers should have a free / opt-in 
opportunity to be a committer in Ozone.


(After the proposal is created on the wiki, you can add your name or 
request to be added. But as the initial list can be created based on 
statistics from the Jira, your name may already be there ;-) )




Marton

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: [VOTE] Apache Hadoop Ozone 0.5.0-beta RC2

2020-03-20 Thread Elek, Marton

+1

 * signatures checked
 * sha512 checked
 * can be built from the source (without using my local mvn cache: 
using apache/ozone-build docker image)

 * created docker image with the provided Dockerfile
 * Deployed to Kubernetes together with Yarn and Hdfs
 * tested with 100G teragen (one pipeline, storage/execution were 
separated)



Thank you Dinesh for the continuous effort for this release.

Marton

On 3/16/20 3:27 AM, Dinesh Chitlangia wrote:

Hi Folks,

We have put together RC2 for Apache Hadoop Ozone 0.5.0-beta.

The RC artifacts are at:
https://home.apache.org/~dineshc/ozone-0.5.0-rc2/

The public key used for signing the artifacts can be found at:
https://dist.apache.org/repos/dist/release/hadoop/common/KEYS

The maven artifacts are staged at:
https://repository.apache.org/content/repositories/orgapachehadoop-1262

The RC tag in git is at:
https://github.com/apache/hadoop-ozone/tree/ozone-0.5.0-beta-RC2

This release contains 800+ fixes/improvements [1].
Thanks to everyone who put in the effort to make this happen.

*The vote will run for 7 days, ending on March 22nd 2020 at 11:59 pm PST.*

Note: This release is beta quality, it’s not recommended to use in
production but we believe that it’s stable enough to try out the feature
set and collect feedback.


[1] https://s.apache.org/ozone-0.5.0-fixed-issues

Thanks,
Dinesh Chitlangia



-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: [VOTE] Apache Hadoop Ozone 1.0.0 RC1

2020-08-31 Thread Elek, Marton

+1 (binding)


1. verified signatures

2. verified checksums

3. verified the output of `ozone version` (includes the good git revision)

4. verified that the source package matches the git tag

5. verified the source can be used to build Ozone without previous state 
(docker run -v ... -it maven ... --> built from the source with zero 
local maven cache in 16 minutes --> did it on a server this time)


6. Verified Ozone can be used from the binary package (cd compose/ozone &&
test.sh --> all tests passed)

7. Verified documentation is included in SCM UI

8. Deployed to Kubernetes and executed Teragen on Yarn [1]

9. Deployed to Kubernetes and executed Spark (3.0) Word count (local 
executor) [2]


10. Deployed to Kubernetes and executed Flink Word count [3]

11. Deployed to Kubernetes and executed Nifi

Thanks very much, Sammi, for driving this release...
Marton

ps: The NiFi setup requires some more testing. Counters were not updated on 
the UI and in some cases I saw DirNotFound exceptions when I used 
master. But during the last test with -rc1 it worked well.


[1]: https://github.com/elek/ozone-perf-env/tree/master/teragen-ozone

[2]: https://github.com/elek/ozone-perf-env/tree/master/spark-ozone

[3]: https://github.com/elek/ozone-perf-env/tree/master/flink-ozone


On 8/25/20 4:01 PM, Sammi Chen wrote:

RC1 artifacts are at:
https://home.apache.org/~sammichen/ozone-1.0.0-rc1/


Maven artifacts are staged at:
https://repository.apache.org/content/repositories/orgapachehadoop-1278


The public key used for signing the artifacts can be found at:
https://dist.apache.org/repos/dist/release/hadoop/common/KEYS

The RC1 tag in github is at:
https://github.com/apache/hadoop-ozone/releases/tag/ozone-1.0.0-RC1


Change log of RC1, add
1. HDDS-4063. Fix InstallSnapshot in OM HA
2. HDDS-4139. Update version number in upgrade tests.
3. HDDS-4144, Update version info in hadoop client dependency readme

*The vote will run for 7 days, ending on Aug 31th 2020 at 11:59 pm PST.*

Thanks,
Sammi Chen



-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[DISCUSS] Ozone TLP proposal

2020-09-07 Thread Elek, Marton



Hi,

The Hadoop community decided earlier to move the Ozone sub-project out to a 
separate Apache Top Level Project (TLP). [1]


For detailed history and motivation, please check the previous thread ([1])

The Ozone community discussed and agreed on the initial version of the 
project proposal, and now it's time to discuss it with the full Hadoop 
community.


The current version is available at the Hadoop wiki:

https://cwiki.apache.org/confluence/display/HADOOP/Ozone+Hadoop+subproject+to+Apache+TLP+proposal


 1. Please read it. You can suggest any modifications or topics to 
cover (here or in the comments)


 2. Following the path of Submarine, any existing Hadoop committers -- 
who are willing to contribute -- can ask to be included in the initial 
committer list without any additional constraints. (Edit the wiki, or 
send an email to this thread or to me.) Thanks to Vinod for suggesting 
this approach (for Submarine at that time).



Next steps:

 * After this discussion thread (in case of consensus) a new VOTE 
thread will be started about the proposal (*-dev@hadoop.a.o)


 * In case the VOTE passes, the proposal will be sent to the Apache 
Board to be discussed.



Please help to make the proposal better,

Thanks a lot,
Marton


[1]. 
https://lists.apache.org/thread.html/r298eba8abecc210abd952f040b0c4f07eccc62dcdc49429c1b8f4ba9%40%3Chdfs-dev.hadoop.apache.org%3E


-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[RESULT][VOTE] Moving Ozone to a separated Apache project

2020-10-02 Thread Elek, Marton




The vote passes with 29 +1 votes (of which 22 are binding) and without any -1 or 0.


Thank you very much for all the votes/support.

As a next step, I will send the proposal and the result of this vote to 
the ASF board.


Thanks again,
Marton

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: [VOTE] Apache Hadoop Ozone 1.0.0 RC0

2020-08-25 Thread Elek, Marton
Is it a blocker? It seems to be an optional test. While it can be 
confusing to have 0.6.0 there, it's very rarely used and the functionality is 
tested anyway...


What do you think?

Thanks,
Marton

On 8/24/20 10:01 PM, Attila Doroszlai wrote:

Hi Sammi,

Thanks for creating the RC.  I have found that there are some leftover
references to version "0.6.0" in upgrade scripts and tests
(HDDS-4139).  Created a pull request to fix it, please consider
including it in the release.

thanks,
Attila

On Mon, Aug 24, 2020 at 3:55 PM Elek, Marton  wrote:



+1 (binding)




1. verified signatures

2. verified checksums

3. verified the output of `ozone version` (includes the good git revision)

4. verified that the source package matches the git tag

5. verified the source can be used to build Ozone without previous state
(docker run -v ... -it maven ... --> built from the source with zero
local maven cache in 30 minutes)

6. Verified Ozone can be used from the binary package (cd compose/ozone &&
test.sh --> all tests passed)

7. Verified documentation is included in SCM UI

8. Deployed to Kubernetes and executed Teragen on Yarn [1]

9. Deployed to Kubernetes and executed Spark (3.0) Word count (local
executor) [2]


I know about a few performance problems (like HDDS-4119) but I don't
think we should further block the release (they are not regressions, just
improvements). If we have significant performance improvements
soon, we can release 1.0.1 within 1 month.


Thanks for the great work, Sammi!!

Marton

[1]: https://github.com/elek/ozone-perf-env/tree/master/teragen-ozone
[2]: https://github.com/elek/ozone-perf-env/tree/master/spark-ozone




On 8/20/20 3:12 PM, Sammi Chen wrote:

my +1(binding)

 -  Verified ozone version of binary package

 -  Verified ozone source package content with ozone-1.0.0-RC0 tag

 -  Build ozone from source package

 -  Upgrade an existing 1+3 cluster using RC0 binary package

 -  Check Ozone UI, SCM UI, Datanode UI and Recon UI

 -  Run TestDFSIO write/read with Hadoop 2.7.5

 -  Verified basic o3fs operations, upload and download file

 -  Create bucket using aws CLI,  upload and download 10G file through s3g

Thanks,
Sammi

On Thu, Aug 20, 2020 at 8:55 PM Sammi Chen  wrote:



This Ozone 1.0.0 release includes 620 JIRAs,

https://issues.apache.org/jira/issues/?jql=project+%3D+HDDS+AND+%28cf%5B12310320%5D+%3D+0.6.0+or+fixVersion+%3D+0.6.0%29

Thanks everyone for putting in the effort and making this happen.

You can find the RC0 artifacts are at:
https://home.apache.org/~sammichen/ozone-1.0.0-rc0/

Maven artifacts are staged at:
https://repository.apache.org/content/repositories/orgapachehadoop-1277

The public key used for signing the artifacts can be found at:
https://dist.apache.org/repos/dist/release/hadoop/common/KEYS

The RC0 tag in github is at:
https://github.com/apache/hadoop-ozone/releases/tag/ozone-1.0.0-RC0

*The vote will run for 7 days, ending on Aug 27th 2020 at 11:59 pm CST.*

Thanks,
Sammi Chen







-
To unsubscribe, e-mail: ozone-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-dev-h...@hadoop.apache.org



-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: [VOTE] Apache Hadoop Ozone 1.0.0 RC0

2020-08-24 Thread Elek, Marton



+1 (binding)




1. verified signatures

2. verified checksums

3. verified the output of `ozone version` (includes the good git revision)

4. verified that the source package matches the git tag

5. verified the source can be used to build Ozone without previous state 
(docker run -v ... -it maven ... --> built from the source with zero 
local maven cache in 30 minutes)


6. Verified Ozone can be used from the binary package (cd compose/ozone &&
test.sh --> all tests passed)

7. Verified documentation is included in SCM UI

8. Deployed to Kubernetes and executed Teragen on Yarn [1]

9. Deployed to Kubernetes and executed Spark (3.0) Word count (local 
executor) [2]



I know about a few performance problems (like HDDS-4119) but I don't 
think we should further block the release (they are not regressions, just 
improvements). If we have significant performance improvements 
soon, we can release 1.0.1 within 1 month.



Thanks for the great work, Sammi!!

Marton

[1]: https://github.com/elek/ozone-perf-env/tree/master/teragen-ozone
[2]: https://github.com/elek/ozone-perf-env/tree/master/spark-ozone




On 8/20/20 3:12 PM, Sammi Chen wrote:

my +1(binding)

-  Verified ozone version of binary package

-  Verified ozone source package content with ozone-1.0.0-RC0 tag

-  Build ozone from source package

-  Upgrade an existing 1+3 cluster using RC0 binary package

-  Check Ozone UI, SCM UI, Datanode UI and Recon UI

-  Run TestDFSIO write/read with Hadoop 2.7.5

-  Verified basic o3fs operations, upload and download file

-  Create bucket using aws CLI,  upload and download 10G file through s3g

Thanks,
Sammi

On Thu, Aug 20, 2020 at 8:55 PM Sammi Chen  wrote:



This Ozone 1.0.0 release includes 620 JIRAs,

https://issues.apache.org/jira/issues/?jql=project+%3D+HDDS+AND+%28cf%5B12310320%5D+%3D+0.6.0+or+fixVersion+%3D+0.6.0%29

Thanks everyone for putting in the effort and making this happen.

You can find the RC0 artifacts are at:
https://home.apache.org/~sammichen/ozone-1.0.0-rc0/

Maven artifacts are staged at:
https://repository.apache.org/content/repositories/orgapachehadoop-1277

The public key used for signing the artifacts can be found at:
https://dist.apache.org/repos/dist/release/hadoop/common/KEYS

The RC0 tag in github is at:
https://github.com/apache/hadoop-ozone/releases/tag/ozone-1.0.0-RC0

*The vote will run for 7 days, ending on Aug 27th 2020 at 11:59 pm CST.*

Thanks,
Sammi Chen







-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[VOTE] Moving Ozone to a separated Apache project

2020-09-25 Thread Elek, Marton

Hi all,

Thank you for all the feedback and requests,

As we discussed in the previous thread(s) [1], Ozone is proposed to become a 
separate Apache Top Level Project (TLP).


The proposal with all the details, motivation and history is here:

https://cwiki.apache.org/confluence/display/HADOOP/Ozone+Hadoop+subproject+to+Apache+TLP+proposal

This vote runs for 7 days and will be concluded on the 2nd of October, 6 AM 
GMT.


Thanks,
Marton Elek

[1]: 
https://lists.apache.org/thread.html/rc6c79463330b3e993e24a564c6817aca1d290f186a1206c43ff0436a%40%3Chdfs-dev.hadoop.apache.org%3E


-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: [DISCUSS] Ozone TLP proposal

2020-09-21 Thread Elek, Marton

Thank you all for the feedback and updates.

If there are no more comments, I will start the vote on Thursday.

Thanks,
Marton

On 9/7/20 2:04 PM, Elek, Marton wrote:


Hi,

The Hadoop community decided earlier to move the Ozone sub-project out to a 
separate Apache Top Level Project (TLP). [1]


For detailed history and motivation, please check the previous thread ([1])

The Ozone community discussed and agreed on the initial version of the 
project proposal, and now it's time to discuss it with the full Hadoop 
community.


The current version is available at the Hadoop wiki:

https://cwiki.apache.org/confluence/display/HADOOP/Ozone+Hadoop+subproject+to+Apache+TLP+proposal 




  1. Please read it. You can suggest any modifications or topics to 
cover (here or in the comments)


  2. Following the path of Submarine, any existing Hadoop committers -- 
who are willing to contribute -- can ask to be included in the initial 
committer list without any additional constraints. (Edit the wiki, or 
send an email to this thread or to me.) Thanks to Vinod for suggesting 
this approach (for Submarine at that time).



Next steps:

  * After this discussion thread (in case of consensus) a new VOTE 
thread will be started about the proposal (*-dev@hadoop.a.o)


  * In case the VOTE passes, the proposal will be sent to the Apache 
Board to be discussed.



Please help to make the proposal better,

Thanks a lot,
Marton


[1]. 
https://lists.apache.org/thread.html/r298eba8abecc210abd952f040b0c4f07eccc62dcdc49429c1b8f4ba9%40%3Chdfs-dev.hadoop.apache.org%3E 



-
To unsubscribe, e-mail: ozone-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-dev-h...@hadoop.apache.org



-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: [DISCUSS] making Ozone a separate Apache project

2020-05-27 Thread Elek, Marton



Thank you for all of your responses.

It seems that we have strong agreement. This thread got ~25 
+1 / positive responses, without any arguments against the move.




The next step is on the Ozone community:

 1. A proposal should be created for the new project which covers all 
of the technical questions (HOW to do it...). This should be created and 
agreed on by the Ozone community


 2. A second discussion thread will be started here (on all Hadoop dev 
lists) to discuss the proposal itself.


 3. When everybody is happy with the proposed way of the move we can 
start the VOTE thread (based on the proposal) here


 4. The voted proposal will be sent to the ASF board to be 
discussed/decided/approved.



Thanks again for all your feedback,
Marton


On 5/13/20 9:52 AM, Elek, Marton wrote:



I would like to start a discussion about making Ozone a separate Apache 
project.




### HISTORY [1]

  * Apache Hadoop Ozone development started on a feature branch of the 
Hadoop repository (HDFS-7240)

  * In October 2017 a discussion was started about merging it into 
the Hadoop main branch

  * After a long discussion it was merged to Hadoop trunk in March 2018

  * During the discussion of the merge, it was suggested multiple times 
to create a separate project for Ozone. But at that time:

     1). Ozone was tightly integrated with Hadoop/HDFS
     2). There was an active plan to use the block layer of Ozone (HDDS or 
HDSL at that time) as the block layer of HDFS

     3). The community of Ozone was a subset of the HDFS community

  * The first beta release of Ozone has just come out. It seems to be a 
good time, before the first GA, to make a decision about the future.




### WHAT HAS BEEN CHANGED

  During the last years Ozone has become more and more independent, both on 
the community and on the code side. The separation has been suggested again and 
again (for example by Owen [2] and Vinod [3])




  From the COMMUNITY point of view:


   * Fortunately more and more new contributors are helping Ozone. 
Originally the Ozone community was a subset of the HDFS project. But now a 
bigger and bigger part of the community is related to Ozone only.

   * It seems to be easier to _build_ the community as a separate project.

   * A new, younger project might have different practices 
(communication, committer criteria, development style) compared to an old, 
mature project

   * It's easier to communicate (and improve) these standards in a 
separate project with clean boundaries

   * A separate project/brand can help to increase the adoption rate and 
attract more individual contributors (AFAIK this has been seen in Submarine 
after a similar move)

  * The contribution process can be communicated more easily, and we can make 
first-time contributions easier




  From the CODE point of view Ozone became more and more independent:

  * Ozone has a different release cycle

  * The code is already separated from the Hadoop code base 
(apache/hadoop-ozone.git)

  * It has separate CI (GitHub Actions)

  * Ozone uses a different (stricter) coding style (zero tolerance of 
unit test / checkstyle errors)

  * The code itself became more and more independent from Hadoop at the 
Maven level. Originally it was compiled together with the latest in-tree 
Hadoop snapshot. Now it depends on released Hadoop artifacts (RPC, 
Configuration...)

  * It has started to use multiple versions of Hadoop (on the client side)

  * The volume of resolved issues is already very high on the Ozone side (Ozone 
had slightly more resolved issues than HDFS/YARN/MAPREDUCE/COMMON all 
together in the last 2-3 months)



Summary: Before the first Ozone GA release, it seems to be a good time 
to discuss the long-term future of Ozone. Managing it as a separate TLP 
seems to have more benefits.



Please let me know what your opinion is...

Thanks a lot,
Marton





[1]: For more details, see: 
https://github.com/apache/hadoop-ozone/blob/master/HISTORY.md


[2]: 
https://lists.apache.org/thread.html/0d0253f6e5fa4f609bd9b917df8e1e4d8848e2b7fdb3099b730095e6%40%3Cprivate.hadoop.apache.org%3E 



[3]: 
https://lists.apache.org/thread.html/8be74421ea495a62e159f2b15d74627c63ea1f67a2464fa02c85d4aa%40%3Chdfs-dev.hadoop.apache.org%3E 



-
To unsubscribe, e-mail: ozone-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-dev-h...@hadoop.apache.org



-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: [VOTE] Release Apache Hadoop 3.1.4 (RC4)

2020-08-03 Thread Elek, Marton

+1 (binding)

 * checked signature
 * built from source
 * deployed binary package to kubernetes
 * executed teragen with automatic tests [1]
 * checked "hadoop version" and compared with git revision
 * checked if the staging repository contains src packages


Thanks for the work (and the toughness), Gabor Bota.

Marton


[1]: https://github.com/elek/ozone-perf-env/tree/master/teragen-hdfs


On 7/21/20 2:50 PM, Gabor Bota wrote:

Hi folks,

I have put together a release candidate (RC4) for Hadoop 3.1.4.

*
The RC includes in addition to the previous ones:
* fix for HDFS-15313. Ensure inodes in active filesystem are not
deleted during snapshot delete
* fix for YARN-10347. Fix double locking in
CapacityScheduler#reinitialize in branch-3.1
(https://issues.apache.org/jira/browse/YARN-10347)
* the revert of HDFS-14941, as it caused
HDFS-15421. IBR leak causes standby NN to be stuck in safe mode.
(https://issues.apache.org/jira/browse/HDFS-15421)
* HDFS-15323, as requested.
(https://issues.apache.org/jira/browse/HDFS-15323)
*

The RC is available at: http://people.apache.org/~gabota/hadoop-3.1.4-RC4/
The RC tag in git is here:
https://github.com/apache/hadoop/releases/tag/release-3.1.4-RC4
The maven artifacts are staged at
https://repository.apache.org/content/repositories/orgapachehadoop-1275/

You can find my public key at:
https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
and http://keys.gnupg.net/pks/lookup?op=get&search=0xB86249D83539B38C

Please try the release and vote. The vote will run for 8 weekdays,
until July 31. 2020. 23:00 CET.


Thanks,
Gabor

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: [DISCUSS] Removing the archiac master branch

2020-06-22 Thread Elek, Marton




On 6/19/20 7:19 PM, Owen O'Malley wrote:

We unfortunately have a lot of master/slave and whitelist/blacklist
terminology usage in Hadoop. It will take a while to fix them all, but one
is easy to fix. In particular, we have a "master" branch that hasn't been
used since the project reunification and we use "trunk" as the main branch.

I propose that we delete the "master" branch. Thoughts?




+1
I am totally fine with the proposed change.



But as I am a non-native English speaker and not from the US, can you please 
help me to understand the question better?


While the master/slave pair seems to be an obvious reference to slavery, I 
don't know what a "master" branch means without "slave" branches.


Based on the dictionary there are multiple meanings:

https://www.merriam-webster.com/dictionary/master

In my native language the primary meanings are 1/c and 1/b (we have 
almost the same word, with a Latin origin). A lot of the meanings are missing 
(2/a,b,c...)



But it's very hard to imagine which secondary meanings are common in a 
different country/culture. I would ask for some help to understand it 
better.



1. Can you please help us understand how you use "master" in 
everyday life? What kind of meanings come to your mind first?


2. Is it common to associate it with the 2 / d / 2 meaning? ("an owner 
especially of a slave or animal")


3. Is it common to associate it with the 1 / a / 1 meaning? ("a male 
teacher"). Is it really a male thing?


(Actually, even if only 3 is true, I would prefer something which is not 
male.)



Again, I am +1, but would use this opportunity to learn.

Thanks,
Marton

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14160) Create dev-support scripts to do the bulk jira update required by the release process

2017-03-08 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-14160:
-

 Summary: Create dev-support scripts to do the bulk jira update 
required by the release process
 Key: HADOOP-14160
 URL: https://issues.apache.org/jira/browse/HADOOP-14160
 Project: Hadoop Common
  Issue Type: Improvement
  Components: build
Reporter: Elek, Marton
Assignee: Elek, Marton


According to the conversation on the dev mailing list, one pain point of 
making a release is the Jira administration.

This issue is about creating new scripts to 
 
 * query Apache Jira about a possible release (remaining blockers, issues, 
etc.) -- see the sketch below
 * and do bulk changes (eg. bump fixVersions)
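
A minimal sketch of the query part (assuming the public Jira REST API and `jq`; 
the JQL filter and the 3.0.0 fixVersion are hypothetical examples):

{code}
# list the keys of the unresolved blockers targeted at a release
curl -sG 'https://issues.apache.org/jira/rest/api/2/search' \
  --data-urlencode 'jql=project = HADOOP AND priority = Blocker AND resolution = Unresolved AND fixVersion = 3.0.0' \
  | jq -r '.issues[].key'
{code}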



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14164) Update the skin of maven-site during doc generation

2017-03-08 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-14164:
-

 Summary: Update the skin of maven-site during doc generation
 Key: HADOOP-14164
 URL: https://issues.apache.org/jira/browse/HADOOP-14164
 Project: Hadoop Common
  Issue Type: Improvement
  Components: documentation
Reporter: Elek, Marton
Assignee: Elek, Marton


Together with the improvements of the hadoop site (HADOOP-14163), I suggest 
improving the theme used by the maven-site plugin for all the hadoop documentation.

One possible option is using the reflow skin:

http://andriusvelykis.github.io/reflow-maven-skin/



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14162) Improve release scripts to automate missing steps

2017-03-08 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-14162:
-

 Summary: Improve release scripts to automate missing steps
 Key: HADOOP-14162
 URL: https://issues.apache.org/jira/browse/HADOOP-14162
 Project: Hadoop Common
  Issue Type: Improvement
  Components: build
Reporter: Elek, Marton
Assignee: Elek, Marton


According to the conversation on the dev mailing list, one pain point of 
making a release is that even with the latest create-release script a lot of 
steps are not automated.

This Jira is about creating a script which guides the release manager through the 
process:

Goals:
  * It should work even without the Apache infrastructure: with custom 
configuration (forked repositories / an alternative Nexus), it should be possible 
to test the scripts even as a non-committer.  
  * Every step which can be automated should be scripted (create git 
branches, build, ...). If something cannot be automated, an explanation should 
be printed out and the script should wait for confirmation.
  * Before dangerous steps (eg. bulk jira update) we can ask for confirmation 
and explain the step (see the sketch below).
  * The run should be idempotent (and there should be an option to continue 
the release from any step).  
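
A minimal sketch of such a guided, idempotent step runner (a hypothetical shell 
helper; the step names and commands are examples only):

{code}
#!/usr/bin/env bash
# run a named step once: skip it if already done, ask before executing
step() {
  local name="$1"; shift
  [ -f ".release-done-$name" ] && { echo "skipping '$name' (already done)"; return 0; }
  read -r -p "Run step '$name'? [y/N] " answer
  [ "$answer" = "y" ] || { echo "stopped before '$name'"; exit 1; }
  "$@" && touch ".release-done-$name"   # the marker file makes re-runs idempotent
}

step create-branch git checkout -b branch-3.0.0
step build mvn clean install -DskipTests
{code}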



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14163) Refactor existing hadoop site to use more usable static website generator

2017-03-08 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-14163:
-

 Summary: Refactor existing hadoop site to use more usable static 
website generator
 Key: HADOOP-14163
 URL: https://issues.apache.org/jira/browse/HADOOP-14163
 Project: Hadoop Common
  Issue Type: Improvement
  Components: site
Reporter: Elek, Marton
Assignee: Elek, Marton


From the dev mailing list:

"Publishing can be attacked via a mix of scripting and revamping the darned 
website. Forrest is pretty bad compared to the newer static site generators out 
there (e.g. need to write XML instead of markdown, it's hard to review a 
staging site because of all the absolute links, hard to customize, did I 
mention XML?), and the look and feel of the site is from the 00s. We don't 
actually have that much site content, so it should be possible to migrate to a 
new system."

This issue is about finding a solution to migrate the old site to a new modern 
static site generator using a more contemporary theme.

Goals: 
 * existing links should keep working (or at least be redirected)
 * It should be easy to automatically add the content required by a release 
(most probably by creating separate markdown files)




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14898) Create official Docker images for development and testing features

2017-09-22 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-14898:
-

 Summary: Create official Docker images for development and testing 
features 
 Key: HADOOP-14898
 URL: https://issues.apache.org/jira/browse/HADOOP-14898
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Elek, Marton
Assignee: Elek, Marton


This is the original mail from the mailing list:

{code}
TL;DR: I propose to create official hadoop images and upload them to the 
dockerhub.

GOAL/SCOPE: I would like to improve the existing documentation with easy-to-use 
docker based recipes to start hadoop clusters with various configurations.

The images could also be used to test experimental features. For example ozone 
could be tested easily with this compose file and configuration:

https://gist.github.com/elek/1676a97b98f4ba561c9f51fce2ab2ea6

Or even the configuration could be included in the compose file:

https://github.com/elek/hadoop/blob/docker-2.8.0/example/docker-compose.yaml

I would like to create separate example compose files for federation, HA, 
metrics usage, etc. to make it easier to try out and understand the features.

CONTEXT: There is an existing Jira 
https://issues.apache.org/jira/browse/HADOOP-13397
but it's about a tool to generate production-quality docker images (multiple 
types, in a flexible way). If there are no objections, I will create a separate 
issue to create simplified docker images for rapid prototyping and investigating 
new features, and register the branch on the dockerhub to create the images 
automatically.

MY BACKGROUND: I have been working with docker-based hadoop/spark clusters for 
quite a while and have run them successfully in different environments 
(kubernetes, docker-swarm, nomad-based scheduling, etc.). My work is available 
from here: https://github.com/flokkr but they can handle more complex use cases 
(eg. instrumenting java processes with btrace, or reading/reloading configuration 
from consul).
 And IMHO in the official hadoop documentation it's better to suggest using 
official apache docker images and not external ones (which could change).
{code}

The next list enumerates the key decision points regarding docker image 
creation.

A. automated dockerhub build  / jenkins build

Docker images could be built on the dockerhub (a branch pattern should be 
defined for a github repository and the location of the Docker files) or could 
be built on a CI server and pushed.

The second one is more flexible (it's easier to create a matrix build, for 
example).
The first one has the advantage that we can get an additional flag on the 
dockerhub that the build is automated (and built from the source by the 
dockerhub).

The decision is easy as ASF supports the first approach: (see 
https://issues.apache.org/jira/browse/INFRA-12781?focusedCommentId=15824096=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15824096)

B. source: binary distribution or source build

The second question is about creating the docker image. One option is to build 
the software on the fly during the creation of the docker image; the other one 
is to use the binary releases.

I suggest to use the second approach as:

1. In that case the hadoop:2.7.3 image could contain exactly the same hadoop 
distribution as the downloadable one

2. We don't need to add development tools to the image, so the image can be 
smaller (which is important, as the goal for this image is getting started as 
fast as possible)

3. The docker definition will be simpler (and easier to maintain)

Usually this approach is used in other projects (I checked Apache Zeppelin and 
Apache Nutch)
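
A minimal sketch of this binary-based approach (the base image, paths and 
version are hypothetical; the download URL follows the Apache release archive 
layout):

{code}
# write a Dockerfile that unpacks the released binary tarball, then build it
cat > Dockerfile <<'EOF'
FROM openjdk:8-jre
ARG HADOOP_VERSION=2.7.3
ADD https://archive.apache.org/dist/hadoop/common/hadoop-${HADOOP_VERSION}/hadoop-${HADOOP_VERSION}.tar.gz /tmp/
RUN tar -xzf /tmp/hadoop-${HADOOP_VERSION}.tar.gz -C /opt \
 && ln -s /opt/hadoop-${HADOOP_VERSION} /opt/hadoop \
 && rm /tmp/hadoop-${HADOOP_VERSION}.tar.gz
ENV HADOOP_HOME=/opt/hadoop
ENV PATH=$PATH:/opt/hadoop/bin
EOF
docker build -t hadoop:2.7.3 .
{code}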

C. branch usage

Another question is the location of the Dockerfile. It could be on the official 
source-code branches (branch-2, trunk, etc.) or we could create separate 
branches for the dockerhub (eg. docker/2.7, docker/2.8, docker/3.0).

With the first approach it's easier to find the docker images, but it's less 
flexible. For example, if we had a Dockerfile on the source-code branch, it would 
be used for every release (for example the Dockerfile from the tag 
release-3.0.0 would be used for the 3.0 hadoop docker image). In that case the 
release process is much harder: in case of a Dockerfile error (which could 
be tested on dockerhub only after the tagging), a new release would be needed 
after fixing the Dockerfile.

Another problem is that with tags it's not possible to improve the 
Dockerfiles. I can imagine that we would like to improve, for example, the 
hadoop:2.7 images (say, by adding smarter startup scripts) while using 
exactly the same hadoop 2.7 distribution. 

Finally, with the tag-based approach we can't create images for older releases 
(2.8.1 for example).

So I suggest creating separate branches for the Dockerfiles.

D. Versions

We can create a separate branch for every version (2.7.1/2.7.2/2.7.3) or just 
for the main version

[jira] [Created] (HADOOP-14850) Read HttpServer2 resources directly from the source tree (if exists)

2017-09-08 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-14850:
-

 Summary: Read HttpServer2 resources directly from the source tree 
(if exists)
 Key: HADOOP-14850
 URL: https://issues.apache.org/jira/browse/HADOOP-14850
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 3.0.0-alpha4
Reporter: Elek, Marton
Assignee: Elek, Marton


Currently the Hadoop server components can't be started from the IDE during 
development. There are two reasons for that:

1. Some artifacts are in provided scope which are definitely needed to run the 
server (see HDFS-12197).

2. The src/main/webapp dir should be on the classpath (but it is not).

In this issue I suggest fixing the second problem by reading the web resources 
(html and css files) directly from the source tree, and not from the classpath, 
but ONLY if the src/main/webapp dir exists. A similar approach exists in 
other projects (eg. in Spark).

With this patch, web development of the web interfaces is significantly 
easier, as the result can be checked immediately with a running server 
(without a rebuild/restart). I used this patch during the development of the 
Ozone web interfaces.

As the original behaviour of the resource location is not changed if 
"src/main/webapp" doesn't exist, I think it's quite safe. And as the method is 
called only once, during the creation of the HttpServer2, there is also no 
change in performance.
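
For illustration, a minimal sketch of the proposed lookup (class and method 
names are illustrative, not the exact patch):

{code:java}
import java.io.File;
import java.io.FileNotFoundException;
import java.net.URL;

public class WebAppPathExample {
  // Illustrative sketch: prefer src/main/webapp from the source tree when it
  // exists, otherwise fall back to the classpath lookup (current behaviour).
  static String findWebAppsPath(String appName) throws FileNotFoundException {
    File sourceTree = new File("src/main/webapp");
    if (sourceTree.exists() && sourceTree.isDirectory()) {
      return sourceTree.getAbsolutePath();
    }
    URL url = WebAppPathExample.class.getClassLoader()
        .getResource("webapps/" + appName);
    if (url == null) {
      throw new FileNotFoundException("webapps/" + appName
          + " not found in CLASSPATH");
    }
    return url.toString();
  }
}
{code}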



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-15065) Make mapreduce specific GenericOptionsParser arguments optional

2017-11-22 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-15065:
-

 Summary: Make mapreduce specific GenericOptionsParser arguments 
optional
 Key: HADOOP-15065
 URL: https://issues.apache.org/jira/browse/HADOOP-15065
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Elek, Marton
Priority: Minor


org.apache.hadoop.util.GenericOptionsParser is widely used to handle common 
arguments in all the command line applications.

Some of the common arguments are really generic:

{code}
-D 

[jira] [Created] (HADOOP-15122) Lock down version of doxia-module-markdown plugin

2017-12-15 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-15122:
-

 Summary: Lock down version of doxia-module-markdown plugin
 Key: HADOOP-15122
 URL: https://issues.apache.org/jira/browse/HADOOP-15122
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Elek, Marton
Assignee: Elek, Marton


Since HADOOP-14364 we have a SNAPSHOT dependency in the main pom.xml:

{code}
+<dependency>
+  <groupId>org.apache.maven.doxia</groupId>
+  <artifactId>doxia-module-markdown</artifactId>
+  <version>1.8-SNAPSHOT</version>
+</dependency>
{code}

Most probably because some feature was missing from the doxia markdown module.

I propose to lock down the version and use a fixed instance of the snapshot 
version.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-14162) Improve release scripts to automate missing steps

2017-11-09 Thread Elek, Marton (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton resolved HADOOP-14162.
---
Resolution: Won't Fix

> Improve release scripts to automate missing steps
> -
>
> Key: HADOOP-14162
> URL: https://issues.apache.org/jira/browse/HADOOP-14162
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: build
>    Reporter: Elek, Marton
>Assignee: Elek, Marton
>
> According to the conversation on the dev mailing list one pain point of the 
> release making is that even with the latest create-release script a lot of 
> steps are not automated.
> This Jira is about creating a script which guides the release manager through 
> the process:
> Goals:
>   * It would work even without the apache infrastructure: with custom 
> configuration (forked repositories/alternative nexus), it would be possible 
> to test the scripts even by a non-committer.  
>   * Every step which could be automated should be scripted (create git 
> branches, build, ...). If something can not be automated, an explanation 
> could be printed out, waiting for confirmation.
>   * Before dangerous steps (eg. bulk jira update) we can ask for confirmation 
> and explain the consequences.
>   * The run should be idempotent (and there should be an option to continue 
> the release from any step).  



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-14160) Create dev-support scripts to do the bulk jira update required by the release process

2017-11-09 Thread Elek, Marton (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton resolved HADOOP-14160.
---
Resolution: Won't Fix

> Create dev-support scripts to do the bulk jira update required by the release 
> process
> -
>
> Key: HADOOP-14160
> URL: https://issues.apache.org/jira/browse/HADOOP-14160
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: build
>Reporter: Elek, Marton
>    Assignee: Elek, Marton
>
> According to the conversation on the dev mailing list one pain point of the 
> release making is the Jira administration.
> This issue is about creating new scripts to 
>  
>  * query apache jira about a possible release (remaining blockers, issues, 
> etc.)
>  * and do bulk changes (eg. bump fixVersions)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-15084) Create docker images for latest stable hadoop2 build

2017-12-01 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-15084:
-

 Summary: Create docker images for latest stable hadoop2 build
 Key: HADOOP-15084
 URL: https://issues.apache.org/jira/browse/HADOOP-15084
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Elek, Marton
Assignee: Elek, Marton






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-15083) Create base image for running hadoop in docker containers

2017-12-01 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-15083:
-

 Summary: Create base image for running hadoop in docker containers
 Key: HADOOP-15083
 URL: https://issues.apache.org/jira/browse/HADOOP-15083
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Elek, Marton
Assignee: Elek, Marton






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-15539) Make start-build-env.sh usable in non-interactive mode

2018-06-14 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-15539:
-

 Summary: Make start-build-env.sh usable in non-interactive mode
 Key: HADOOP-15539
 URL: https://issues.apache.org/jira/browse/HADOOP-15539
 Project: Hadoop Common
  Issue Type: Improvement
  Components: build
Reporter: Elek, Marton
Assignee: Elek, Marton


The current start-build-env.sh in the project root is useful to start a new 
build environment. But it's not possible to start the build environment and run 
a command in one step.

We use the dockerized build environment on jenkins 
(https://builds.apache.org/job/Hadoop-trunk-ozone-acceptance/) which requires a 
small modification to optionally run start-build-env.sh in non-interactive mode 
and execute any command in the container.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-15256) Create docker images for latest stable hadoop3 build

2018-02-23 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-15256:
-

 Summary: Create docker images for latest stable hadoop3 build
 Key: HADOOP-15256
 URL: https://issues.apache.org/jira/browse/HADOOP-15256
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Elek, Marton
Assignee: Elek, Marton


Similar to the hadoop2 image we can provide a developer hadoop image which 
contains the latest hadoop from the binary release.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-15257) Provide example docker compose file for developer builds

2018-02-23 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-15257:
-

 Summary: Provide example docker compose file for developer builds
 Key: HADOOP-15257
 URL: https://issues.apache.org/jira/browse/HADOOP-15257
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Elek, Marton
Assignee: Elek, Marton


This issue is about creating example docker-compose files which use the latest 
build from the hadoop-dist directory.

These docker-compose files would help to run a specific hadoop cluster based on 
the latest custom build without the need to build a customized docker image (by 
mounting hadoop from hadoop-dist into the container).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-15258) Create example docker-compose file for documentation

2018-02-23 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-15258:
-

 Summary: Create example docker-compose file for documentation
 Key: HADOOP-15258
 URL: https://issues.apache.org/jira/browse/HADOOP-15258
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Elek, Marton
Assignee: Elek, Marton


Another use case for docker is to use it in the documentation. For example in 
the HA documentation we can provide an example docker-compose file and 
configuration with all the required settings to get started easily with an 
HA cluster.

1. I would add an example to a documentation page
2. It will use the hadoop3 image (which contains the latest hadoop3) as the 
user of the documentation may not want to build hadoop



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-15656) Support byteman in hadoop-runner baseimage

2018-08-09 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-15656:
-

 Summary: Support byteman in hadoop-runner baseimage
 Key: HADOOP-15656
 URL: https://issues.apache.org/jira/browse/HADOOP-15656
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Elek, Marton
Assignee: Elek, Marton


[Byteman|http://byteman.jboss.org/] is an easy to use tool to instrument a java 
process with an agent string.

For example [this 
script|https://gist.githubusercontent.com/elek/0589a91b4d55afb228279f6c4f04a525/raw/8bb4e03de7397c8a9d9bb74a5ec80028b42575c4/hadoop.btm]
 defines a rule to print out all the hadoop rpc traffic to the standard output 
(which is extremely useful for testing and development).

This patch adds the byteman.jar to the baseimage and defines a simple logic to 
add the agent instrumentation string to HADOOP_OPTS (optionally it can also 
download the byteman script from an url)





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-15673) Hadoop:3 image is missing from dockerhub

2018-08-15 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-15673:
-

 Summary: Hadoop:3 image is missing from dockerhub
 Key: HADOOP-15673
 URL: https://issues.apache.org/jira/browse/HADOOP-15673
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Elek, Marton


Currently the apache/hadoop:3 image is missing from the dockerhub as the 
Dockerfile in docker-hadoop-3 branch contains the outdated 3.0.0 download url. 
It should be updated to the latest 3.1.1 url.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-15730) Add Ozone submodule to the hadoop.apache.org

2018-09-07 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-15730:
-

 Summary: Add Ozone submodule to the hadoop.apache.org
 Key: HADOOP-15730
 URL: https://issues.apache.org/jira/browse/HADOOP-15730
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Elek, Marton
Assignee: Elek, Marton


The current hadoop.apache.org doesn't mention Ozone in the "Modules" section.

We can add something like this (or better):

{quote}
Hadoop Ozone is an object store for Hadoop on top of Hadoop HDDS, which 
provides a low-level binary storage layer.
{quote}

We can also link to http://ozone.hadoop.apache.org




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-15302) Enable DataNode/NameNode service plugins with Service Provider interface

2018-03-09 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-15302:
-

 Summary: Enable DataNode/NameNode service plugins with Service 
Provider interface
 Key: HADOOP-15302
 URL: https://issues.apache.org/jira/browse/HADOOP-15302
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Elek, Marton
Assignee: Elek, Marton


HADOOP-5257 introduced ServicePlugin capabilities for NameNode/DataNode. As of 
now they can be activated by configuration values. 

I propose to activate plugins with the Service Provider Interface: if a 
special service file is added to a jar, it would be enough to put the jar on 
the classpath to activate the plugin. It would help to add optional components 
to NameNode/DataNode just by setting the classpath.

This is the same API which can be used in Java 9 to consume defined services.
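
For illustration, a minimal sketch of the proposed discovery with the standard 
ServiceLoader (class and method names other than ServicePlugin are 
illustrative):

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

import org.apache.hadoop.util.ServicePlugin;

public class PluginLoaderExample {
  // Illustrative sketch: plugins listed in
  // META-INF/services/org.apache.hadoop.util.ServicePlugin of any jar on the
  // classpath are discovered, in addition to the configured ones.
  static List<ServicePlugin> loadPluginsFromClasspath() {
    List<ServicePlugin> plugins = new ArrayList<>();
    for (ServicePlugin plugin : ServiceLoader.load(ServicePlugin.class)) {
      plugins.add(plugin);
    }
    return plugins;
  }
}
{code}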



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-15367) Update the initialization code in the docker hadoop-runner baseimage

2018-04-05 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-15367:
-

 Summary: Update the initialization code in the docker 
hadoop-runner baseimage 
 Key: HADOOP-15367
 URL: https://issues.apache.org/jira/browse/HADOOP-15367
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Elek, Marton
Assignee: Elek, Marton


The hadoop-runner baseimage contains initialization code for both the HDFS 
namenode/datanode and Ozone/Hdds scm/ksm.

The script name for the latter one has changed (from oz to ozone), therefore we 
need to update the base image.

This commit would also be a test for the dockerhub automated build.

Please apply the patch on top of the _docker-hadoop-runner_ branch. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-15369) Avoid usage of ${project.version} in parent pom

2018-04-06 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-15369:
-

 Summary: Avoid usage of ${project.version} in parent pom
 Key: HADOOP-15369
 URL: https://issues.apache.org/jira/browse/HADOOP-15369
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Affects Versions: 3.2.0
Reporter: Elek, Marton
Assignee: Elek, Marton


hadoop-project/pom.xml and hadoop-project-dist/pom.xml use the 
_${project.version}_ variable in dependencyManagement and plugin dependencies.

Unfortunately it does not work if we use a different version in a child 
project, as the ${project.version} variable is resolved *after* the 
inheritance.

From the [maven 
doc|https://maven.apache.org/guides/introduction/introduction-to-the-pom.html#Project_Inheritance]:

{quote}
For example, to access the project.version variable, you would reference it 
like so:

  <version>${project.version}</version>

One factor to note is that these variables are processed after inheritance as 
outlined above. This means that if a parent project uses a variable, then its 
definition in the child, not the parent, will be the one eventually used.
{quote}

The community voted to keep ozone in-tree but use a different release cycle. To 
achieve this we need a different version for selected subprojects, therefore we 
can't use ${project.version} any more. 
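
For illustration, the kind of replacement meant here, assuming an explicitly 
defined version property (the property name is an example, not the final 
patch):

{code}
<!-- Illustrative sketch: an explicitly defined property is resolved in the
     parent, so a child with a different project version still gets the
     intended dependency version. -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-common</artifactId>
  <version>${hadoop.version}</version>
</dependency>
{code}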

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-15352) Fix default local maven repository path in create-release script

2018-03-29 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-15352:
-

 Summary: Fix default local maven repository path in create-release 
script 
 Key: HADOOP-15352
 URL: https://issues.apache.org/jira/browse/HADOOP-15352
 Project: Hadoop Common
  Issue Type: Bug
  Components: scripts
Affects Versions: 3.1.0
Reporter: Elek, Marton
Assignee: Elek, Marton


I am testing the create-release script locally. In case the MVNCACHE is not 
set, the local ~/.m2 is used, which is not good as the packages are downloaded 
to ~/.m2/org/.../... instead of ~/.m2/repository/org/.../.../...





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-15353) Bump default yetus version in the yetus-wrapper

2018-03-29 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-15353:
-

 Summary: Bump default yetus version in the yetus-wrapper
 Key: HADOOP-15353
 URL: https://issues.apache.org/jira/browse/HADOOP-15353
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 3.1.0
Reporter: Elek, Marton
Assignee: Elek, Marton
 Attachments: HADOOP-15353.001.patch

The current precommit hook uses yetus 0.8.0-SNAPSHOT. The default version in 
the yetus-wrapper script is 0.4.0. It can be adjusted with 
HADOOP_YETUS_VERSION, but I suggest setting the default version to 0.7.0 to get 
results similar to the jenkins results locally, without adjustments.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-15339) Support additional key/value properties in JMX bean registration

2018-03-23 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-15339:
-

 Summary: Support additional key/value properties in JMX bean 
registration
 Key: HADOOP-15339
 URL: https://issues.apache.org/jira/browse/HADOOP-15339
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Elek, Marton
Assignee: Elek, Marton


org.apache.hadoop.metrics2.util.MBeans.register is a utility function to 
register objects to the JMX registry with a given name prefix and name.

JMX supports any additional key/value pairs which can be part of the address 
of the jmx bean. For example: 
_java.lang:type=MemoryManager,name=CodeCacheManager_

Using this method we can query a group of mbeans; for example we can add the 
same tag to similar mbeans from namenode and datanode.

This patch adds a small modification to support custom key/value pairs and also 
introduces a new unit test for the MBeans utility, which was missing until now.
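
For illustration, a hypothetical usage sketch of such an overload (the exact 
signature and names are assumptions, not the final API):

{code:java}
import java.util.HashMap;
import java.util.Map;

import javax.management.ObjectName;

import org.apache.hadoop.metrics2.util.MBeans;

public class TaggedMBeanExample {
  // Illustrative sketch: the extra key/value pairs become additional key
  // properties of the resulting ObjectName, e.g.
  // Hadoop:service=SCMNodeManager,name=SCMNodeManagerInfo,component=scm
  static ObjectName registerWithTags(Object mbean) {
    Map<String, String> properties = new HashMap<>();
    properties.put("component", "scm");
    return MBeans.register("SCMNodeManager", "SCMNodeManagerInfo",
        properties, mbean);
  }
}
{code}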



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-15340) Fix the RPC server name usage to provide information about the metrics

2018-03-23 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-15340:
-

 Summary: Fix the RPC server name usage to provide information 
about the metrics
 Key: HADOOP-15340
 URL: https://issues.apache.org/jira/browse/HADOOP-15340
 Project: Hadoop Common
  Issue Type: Bug
  Components: common
Affects Versions: 3.2.0
Reporter: Elek, Marton
Assignee: Elek, Marton


In case of multiple RPC servers in the same JVM it's hard to identify the 
metric data. The only available information as of now is the port number.

A server name is also passed to the constructor of Server.java but it's not 
used at all.

This patch fixes this behaviour:

 1. The server name is saved to a field in Server.java (constructor signature 
is not changed)
 2. ServerName is added as a tag to the metrics in RpcMetrics
 3. The naming convention for the servers is fixed.

About 3: if the server name is not defined, the current code tries to identify 
the name from the class name, which is not always an easy task as in some cases 
the server has a protobuf generated dirty name which also could be an inner 
class.

The patch also improves the detection of the name (if it's not defined). It's a 
compatible change as the current name is not used at all.
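
For illustration, a rough sketch of the kind of fallback detection meant here 
(purely illustrative, not the actual patch):

{code:java}
public class ServerNameExample {
  // Illustrative sketch: derive a readable server name from the implementing
  // class when no explicit name is given, stripping the package and any
  // (possibly protobuf generated) outer-class prefix.
  static String serverNameFromClass(Class<?> clazz) {
    String name = clazz.getName();                      // org.x.Outer$Inner
    name = name.substring(name.lastIndexOf('.') + 1);   // Outer$Inner
    int dollar = name.lastIndexOf('$');
    if (dollar > -1) {
      name = name.substring(dollar + 1);                // Inner
    }
    return name;
  }
}
{code}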



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-15857) Remove ozonefs class name definition from core-default.xml

2018-10-16 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-15857:
-

 Summary: Remove ozonefs class name definition from core-default.xml
 Key: HADOOP-15857
 URL: https://issues.apache.org/jira/browse/HADOOP-15857
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs
Reporter: Elek, Marton
Assignee: Elek, Marton


The Ozone file system is being renamed in HDDS-651 from o3:// to o3fs://. But 
branch-3.2 still contains a reference to o3://.

The easiest way to fix it is to remove the fs.o3.impl definition from 
core-default.xml in branch-3.2, as since HDDS-654 the file system can be 
registered with the Service Provider Interface (META-INF/services...)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Reopened] (HADOOP-15339) Support additional key/value properties in JMX bean registration

2018-10-29 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton reopened HADOOP-15339:
---

Since the commit we have been using this change from ozone/hdds and it has worked well.

This change is required to have a working ozone/hdds webui as the shared code 
path tags the common jmx beans with generic key/value tags.

I reopen this issue and propose to backport it to branch-3.1 to make it easier 
to use hdds/ozone with older hadoop versions.
 # It's a small change
 # Backward compatible
 # Safe to use (no issue during the last 6 months)
 # No conflicts for cherry-pick.

 

> Support additional key/value properties in JMX bean registration
> -
>
> Key: HADOOP-15339
> URL: https://issues.apache.org/jira/browse/HADOOP-15339
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>    Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: HADOOP-15339.001.patch, HADOOP-15339.002.patch, 
> HADOOP-15339.003.patch
>
>
> org.apache.hadoop.metrics2.util.MBeans.register is a utility function to 
> register objects to the JMX registry with a given name prefix and name.
> JMX supports any additional key/value pairs which can be part of the 
> address of the jmx bean. For example: 
> _java.lang:type=MemoryManager,name=CodeCacheManager_
> Using this method we can query a group of mbeans; for example we can add the 
> same tag to similar mbeans from namenode and datanode.
> This patch adds a small modification to support custom key/value pairs and 
> also introduces a new unit test for the MBeans utility, which was missing until now.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-15791) Remove Ozone related sources from the 3.2 branch

2018-09-25 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-15791:
-

 Summary: Remove Ozone related sources from the 3.2 branch
 Key: HADOOP-15791
 URL: https://issues.apache.org/jira/browse/HADOOP-15791
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Elek, Marton
Assignee: Elek, Marton


As it is discussed at HDDS-341 and written in the original proposal of Ozone 
merge, we can remove all the ozone/hdds projects from the 3.2 release branch.

{quote}
 * On trunk (as opposed to release branches) HDSL will be a separate module in 
Hadoop's source tree. This will enable the HDSL to work on their trunk and the 
Hadoop trunk without making releases for every change.
  * Hadoop's trunk will only build HDSL if a non-default profile is enabled.
  * When Hadoop creates a release branch, the RM will delete the HDSL module 
from the branch.
{quote}




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-16063) Docker based pseudo-cluster definitions and test scripts for Hdfs/Yarn

2019-01-22 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-16063:
-

 Summary: Docker based pseudo-cluster definitions and test scripts 
for Hdfs/Yarn
 Key: HADOOP-16063
 URL: https://issues.apache.org/jira/browse/HADOOP-16063
 Project: Hadoop Common
  Issue Type: New Feature
Reporter: Elek, Marton


During the recent releases of Apache Hadoop Ozone we had multiple experiments 
using docker/docker-compose to support the development of ozone.

As of now the hadoop-ozone distribution contains two directories in addition to 
the regular hadoop directories (bin, share/lib, etc.):
h3. compose

The ./compose directory of the distribution contains different types of 
pseudo-cluster definitions. To start an ozone cluster is as easy as "cd 
compose/ozone && docker-compose up -d"

The clusters also could be scaled up and down (docker-compose scale datanode=3)

There are multiple cluster definitions for different use cases (for example 
ozone+s3 or hdfs+ozone).

The docker-compose files are based on the apache/hadoop-runner image, which is 
an "empty" image: it doesn't contain any hadoop distribution. Instead the 
current hadoop build is used (the ../.. dir is mapped as a volume at 
/opt/hadoop); see the sketch below.

With this approach it's very easy to 1) start a cluster from the distribution 
2) test any patch from the dev tree, as after any build a new cluster can be 
started easily (with multiple nodes and datanodes)
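
For illustration, a simplified sketch of such a compose file (service names, 
ports and the docker-config env file are examples only):

{code}
# Illustrative sketch: the distribution from ../.. is mounted into the
# apache/hadoop-runner base image, so no image rebuild is needed.
version: "3"
services:
  namenode:
    image: apache/hadoop-runner
    volumes:
      - ../..:/opt/hadoop
    ports:
      - 9870:9870
    env_file:
      - ./docker-config
    command: ["hdfs", "namenode"]
  datanode:
    image: apache/hadoop-runner
    volumes:
      - ../..:/opt/hadoop
    env_file:
      - ./docker-config
    command: ["hdfs", "datanode"]
{code}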
h3. smoketest

We also started to use a simple robotframework based test suite (see the 
./smoketest directory). It's a high level test definition, very similar to the 
smoketests which are executed manually by the contributors during a release 
vote.

But it's a formal definition to start clusters from different docker-compose 
definitions and execute simple shell scripts (and compare the output).

 

I believe that both approaches helped a lot during the development of ozone and 
I propose to do the same improvements on the main hadoop distribution.

I propose to provide docker-compose based example cluster definitions for 
yarn/hdfs and for different use cases (simple hdfs, router based federation, 
etc.)

It can help to understand the different configurations and try out new features 
with predefined config sets.

Long term we can also add robot tests to help the release votes (basic 
wordcount/mr tests could be scripted)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-16064) Load configuration values from external sources

2019-01-22 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-16064:
-

 Summary: Load configuration values from external sources
 Key: HADOOP-16064
 URL: https://issues.apache.org/jira/browse/HADOOP-16064
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Elek, Marton


This is a proposal to improve Configuration.java to load configuration from 
external sources (kubernetes config map, external http request, any cluster 
manager like ambari, etc.)

I will attach a patch to illustrate the proposed solution, but please comment 
on the concept first; the patch is just a PoC and not fully implemented.

*Goals:*
 * Load the configuration files (core-site.xml/hdfs-site.xml/...) from 
external locations instead of the classpath (classpath remains the default)
 * Make the configuration loading extensible
 * Make it in a backward-compatible way, with minimal change in the existing 
Configuration.java

*Use-cases:*

 1.) load configuration from the namenode (http://namenode:9878/conf). With 
this approach only the namenode has to be configured; other components require 
only the url of the namenode

 2.) Read configuration directly from kubernetes config-map (or mesos)

 3.) Read configuration from any external cluster management (such as Apache 
Ambari or any equivalent)

 4.) as of now in the hadoop docker images we transform environment variables 
(such as HDFS-SITE.XML_fs.defaultFs) to configuration xml files with the help 
of a python script. With the proposed implementation it would be possible to 
read the configuration directly from the system environment variables.

*Problem:*

The existing Configuration.java can read configuration from multiple sources. 
But most of the time it's used to load predefined config names 
("core-site.xml" and "hdfs-site.xml") without a configuration location. In this 
case the files will be loaded from the classpath.

I propose to add an additional option to define the default location of 
core-site.xml and hdfs-site.xml (any configuration which is defined by string 
name), so that external sources can be used instead of the classpath.

The configuration loading requires an implementation + configuration (where 
the external configs are). We can't use the regular configuration to configure 
the config loader (chicken/egg problem).

I propose to use a new environment variable HADOOP_CONF_SOURCE

The environment variable could contain a URL, where the scheme of the url 
defines the config source and all the other parts configure the access to 
the resource.

Examples:

HADOOP_CONF_SOURCE=hadoop-http://namenode:9878/conf

HADOOP_CONF_SOURCE=env://prefix

HADOOP_CONF_SOURCE=k8s://config-map-name

The ConfigurationSource interface can be as easy as:
{code:java}
/**
 * Interface to load hadoop configuration from custom location.
 */
public interface ConfigurationSource {

  /**
   * Method will be called once with the defined configuration url.
   *
   * @param uri
   */
  void initialize(URI uri) throws IOException;

  /**
   * Method will be called to load a specific configuration resource.
   *
   * @param name of the configuration resource (eg. hdfs-site.xml)
   * @return List of loaded configuration keys and values.
   */
  List readConfiguration(String name);

}{code}
We can choose the right implementation based on the scheme of the uri, with the 
Java Service Provider Interface mechanism 
(META-INF/services/org.apache.hadoop.conf.ConfigurationSource)

It can be done with minimal modification in Configuration.java (see the 
attached patch as an example)

 The patch contains two example implementations:

*hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/location/Env.java*

This can load configuration from environment variables based on a naming 
convention (eg. HDFS-SITE.XML_hdfs.dfs.key=value)

*hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/location/HadoopWeb.java*

 This implementation can load the configuration from the /conf servlet of any 
Hadoop component.
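
For illustration, a minimal sketch of such an environment variable based source 
(the class name and the element type of the returned list are assumptions, not 
the attached patch):

{code:java}
import java.io.IOException;
import java.net.URI;
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative sketch: environment variables such as
// HDFS-SITE.XML_dfs.replication=3 become entries of the hdfs-site.xml
// configuration resource.
public class EnvExample implements ConfigurationSource {

  @Override
  public void initialize(URI uri) throws IOException {
    // nothing to do: the process environment itself is the source
  }

  @Override
  public List<Map.Entry<String, String>> readConfiguration(String name) {
    String prefix = name.toUpperCase() + "_";
    List<Map.Entry<String, String>> result = new ArrayList<>();
    for (Map.Entry<String, String> env : System.getenv().entrySet()) {
      if (env.getKey().startsWith(prefix)) {
        result.add(new SimpleEntry<>(
            env.getKey().substring(prefix.length()), env.getValue()));
      }
    }
    return result;
  }
}
{code}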

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-16003) Migrate the Hadoop jenkins jobs to use new gitbox urls

2018-12-13 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-16003:
-

 Summary: Migrate the Hadoop jenkins jobs to use new gitbox urls
 Key: HADOOP-16003
 URL: https://issues.apache.org/jira/browse/HADOOP-16003
 Project: Hadoop Common
  Issue Type: Task
Reporter: Elek, Marton


As announced by the INFRA team, all the apache git repositories will be 
migrated to use gitbox. I created this jira to sync on the required steps to 
update the jenkins jobs, and record the changes.

By default it could be as simple as changing the git url for all the jenkins 
jobs under the Hadoop view:

https://builds.apache.org/view/H-L/view/Hadoop/




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-16146) Make start-build-env.sh safe in case of misusage of DOCKER_INTERACTIVE_RUN

2019-02-25 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-16146:
-

 Summary: Make start-build-env.sh safe in case of misusage of 
DOCKER_INTERACTIVE_RUN
 Key: HADOOP-16146
 URL: https://issues.apache.org/jira/browse/HADOOP-16146
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Elek, Marton
Assignee: Elek, Marton


[~aw] reported the problem in HDDS-891:
{quote}DOCKER_INTERACTIVE_RUN opens the door for users to set command line 
options to docker. Most notably, -c and -v and a few others that share one 
particular characteristic: they reference the file system. As soon as shell 
code hits the file system, it is no longer safe to assume space delimited 
options. In other words, -c /My Cool Filesystem/Docker Files/config.json or -v 
/c_drive/Program Files/Data:/data may be something a user wants to do, but the 
script now breaks because of the IFS assumptions.
{quote}
DOCKER_INTERACTIVE_RUN was used in jenkins to run the normal build process in 
docker. If DOCKER_INTERACTIVE_RUN is set to empty, the docker container 
is started without the "-i -t" flags.

It can be improved by checking the value of the environment variable and 
enabling only a fixed set of values.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org


