[jira] [Resolved] (HADOOP-16146) Make start-build-env.sh safe in case of misusage of DOCKER_INTERACTIVE_RUN
[ https://issues.apache.org/jira/browse/HADOOP-16146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marton Elek resolved HADOOP-16146.
Resolution: Won't Fix

no review

> Make start-build-env.sh safe in case of misusage of DOCKER_INTERACTIVE_RUN
> --------------------------------------------------------------------------
>
> Key: HADOOP-16146
> URL: https://issues.apache.org/jira/browse/HADOOP-16146
> Project: Hadoop Common
> Issue Type: Bug
> Reporter: Marton Elek
> Assignee: Marton Elek
> Priority: Major
> Labels: pull-request-available
>
> [~aw] reported the problem in HDDS-891:
> {quote}DOCKER_INTERACTIVE_RUN opens the door for users to set command line
> options to docker. Most notably, -c and -v and a few others that share one
> particular characteristic: they reference the file system. As soon as shell
> code hits the file system, it is no longer safe to assume space-delimited
> options. In other words, -c /My Cool Filesystem/Docker Files/config.json or
> -v /c_drive/Program Files/Data:/data may be something a user wants to do, but
> the script now breaks because of the IFS assumptions.
> {quote}
> DOCKER_INTERACTIVE_RUN was used in Jenkins to run the normal build process in
> docker. If DOCKER_INTERACTIVE_RUN is set to an empty value, the docker
> container is started without the "-i -t" flags.
> This can be improved by checking the value of the environment variable and
> allowing only a fixed set of values.

--
This message was sent by Atlassian Jira (v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org
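The whitelist approach described in the issue can be sketched roughly as follows. This is an illustrative sketch only, not the actual start-build-env.sh patch; the accepted values ("interactive" vs. empty) and the function name are assumptions:

```shell
#!/usr/bin/env bash
# Sketch of the fix described above: never splice the raw variable into the
# docker command line (that breaks on paths with spaces -- the IFS problem
# quoted from HDDS-891). Instead, map a fixed set of accepted values to
# fixed flags. The accepted values here are illustrative assumptions.
docker_run_flags() {
  case "${DOCKER_INTERACTIVE_RUN-interactive}" in
    interactive) echo "-i -t" ;;  # variable unset: normal interactive build
    "")          echo ""      ;;  # explicitly empty: Jenkins / non-interactive
    *)
      echo "ERROR: DOCKER_INTERACTIVE_RUN must be unset or empty" >&2
      return 1
      ;;
  esac
}

# Usage sketch: docker run $(docker_run_flags) hadoop-build "$@"
```

Note the `${VAR-default}` (not `${VAR:-default}`) expansion: it distinguishes "unset" from "set but empty", which matters here because an empty value is the documented way to drop the "-i -t" flags.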
access to the old wiki
Could you please give me (user: MartonElek) access to the OLD hadoop wiki? (I have access to the new wiki, but would like to fix pages on the old one.)

Thank you very much,
Marton
Re: [VOTE] Release Apache Hadoop 3.0.0-beta1 RC0
+1 (non-binding)

* Built from the source
* Installed a dockerized YARN/HADOOP cluster on 20 nodes (scheduled with nomad, configured from consul, docker host networking)
* Started example yarn jobs (teragen/terasort) and hdfs dfs commands, and checked the UIs

I noticed only two very minor issues (the changelog of HADOOP-9902 didn't mention that I need a writable 'logs' dir, even with a custom log4j.properties; and there was a space typo in a yarn error message: https://issues.apache.org/jira/browse/YARN-7279)

Marton

On 09/29/2017 02:04 AM, Andrew Wang wrote:
> Hi all,
>
> Let me start, as always, by thanking the many, many contributors who helped
> with this release! I've prepared an RC0 for 3.0.0-beta1:
>
> http://home.apache.org/~wang/3.0.0-beta1-RC0/
>
> This vote will run five days, ending on Nov 3rd at 5PM Pacific.
>
> beta1 contains 576 fixed JIRA issues comprising a number of bug fixes,
> improvements, and feature enhancements. Notable additions include the
> addition of YARN Timeline Service v2 alpha2, S3Guard, completion of the
> shaded client, and HDFS erasure coding pluggable policy support.
>
> I've done the traditional testing of running a Pi job on a pseudo cluster.
> My +1 to start.
>
> We're working internally on getting this run through our integration test
> rig. I'm hoping Vijay or Ray can ring in with a +1 once that's complete.
>
> Best,
> Andrew
Re: [DISCUSS] official docker image(s) for hadoop
Thanks for all the feedback. I created an issue: https://issues.apache.org/jira/browse/HADOOP-14898

Let's continue the discussion there.

Thanks,
Marton

On 09/08/2017 02:45 PM, Marton, Elek wrote:
> TL;DR: I propose to create official hadoop images and upload them to the
> dockerhub.
>
> GOAL/SCOPE: I would like to improve the existing documentation with
> easy-to-use, docker based recipes to start hadoop clusters with various
> configurations. The images could also be used to test experimental features.
> For example ozone could be tested easily with this compose file and
> configuration: https://gist.github.com/elek/1676a97b98f4ba561c9f51fce2ab2ea6
> Or the configuration could even be included in the compose file:
> https://github.com/elek/hadoop/blob/docker-2.8.0/example/docker-compose.yaml
> I would like to create separate example compose files for federation, HA,
> metrics usage, etc. to make it easier to try out and understand the features.
>
> CONTEXT: There is an existing Jira: https://issues.apache.org/jira/browse/HADOOP-13397
> But it's about a tool to generate production quality docker images (multiple
> types, in a flexible way). If there are no objections, I will create a
> separate issue to create simplified docker images for rapid prototyping and
> investigating new features, and register the branch on the dockerhub to
> create the images automatically.
>
> MY BACKGROUND: I have been working with docker based hadoop/spark clusters
> for quite a while and have run them successfully in different environments
> (kubernetes, docker-swarm, nomad-based scheduling, etc.). My work is
> available from here: https://github.com/flokkr but it handles more complex
> use cases (eg. instrumenting java processes with btrace, or reading/reloading
> configuration from consul). And IMHO in the official hadoop documentation
> it's better to suggest official apache docker images rather than external
> ones (which could change).
>
> Please let me know if you have any comments.
> Marton
[DISCUSS] official docker image(s) for hadoop
TL;DR: I propose to create official hadoop images and upload them to the dockerhub.

GOAL/SCOPE: I would like to improve the existing documentation with easy-to-use, docker based recipes to start hadoop clusters with various configurations. The images could also be used to test experimental features. For example ozone could be tested easily with this compose file and configuration: https://gist.github.com/elek/1676a97b98f4ba561c9f51fce2ab2ea6

Or the configuration could even be included in the compose file: https://github.com/elek/hadoop/blob/docker-2.8.0/example/docker-compose.yaml

I would like to create separate example compose files for federation, HA, metrics usage, etc. to make it easier to try out and understand the features.

CONTEXT: There is an existing Jira: https://issues.apache.org/jira/browse/HADOOP-13397 But it's about a tool to generate production quality docker images (multiple types, in a flexible way). If there are no objections, I will create a separate issue to create simplified docker images for rapid prototyping and investigating new features, and register the branch on the dockerhub to create the images automatically.

MY BACKGROUND: I have been working with docker based hadoop/spark clusters for quite a while and have run them successfully in different environments (kubernetes, docker-swarm, nomad-based scheduling, etc.). My work is available from here: https://github.com/flokkr but it handles more complex use cases (eg. instrumenting java processes with btrace, or reading/reloading configuration from consul). And IMHO in the official hadoop documentation it's better to suggest official apache docker images rather than external ones (which could change).

Please let me know if you have any comments.

Marton
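For illustration, a minimal cluster recipe of the kind proposed could look like the compose fragment below. Everything in it is a hypothetical sketch — the image name, the command form, and the environment-variable config-injection convention are all assumptions; the real examples live in the gist and branch linked above.

```yaml
version: "2"
services:
  namenode:
    image: example/hadoop:2.8.0        # hypothetical image name
    command: ["hdfs", "namenode"]
    ports:
      - "50070:50070"                  # NameNode web UI
    environment:
      # hypothetical convention: env vars rendered into core-site.xml at startup
      - CORE_SITE_fs_defaultFS=hdfs://namenode:9000
  datanode:
    image: example/hadoop:2.8.0
    command: ["hdfs", "datanode"]
    environment:
      - CORE_SITE_fs_defaultFS=hdfs://namenode:9000
```

The attraction of this style is that a whole cluster variant (federation, HA, metrics) is a single self-describing file that can be started with `docker-compose up`.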
hadoop roadmaps
Hi,

I tried to summarize all of the information from the different mail threads about the upcoming releases: https://cwiki.apache.org/confluence/display/HADOOP/Roadmap

Please fix it / let me know if you see any invalid data. I will try to follow the conversations and update it accordingly.

Two administrative questions:

* Is there any information about which wiki should be used, or about the migration process? As I see it, new pages have recently been created on the cwiki.
* Could you please give me permission (user: elek) on the old wiki? I would like to update the old Roadmap page (https://wiki.apache.org/hadoop/Roadmap)

Thanks,
Marton
Re: HADOOP-14163 proposal for new hadoop.apache.org
Thank you for all of the feedback. I fixed all of the issues (except one, see the comment below) and updated the http://hadoop.anzix.net preview site.

So the next steps:

0. Let me know if you have any comments about the latest version.

1. I will wait for the 2.8.0 announcement, and migrate the new announcement as well. (I wouldn't like to complicate the 2.8.0 release with the site change.)

2. I like Owen's suggestion to move the site to a specific git branch. I wouldn't like to block on it if it takes too much time, but if any of the committers could pick it up, I would wait for it. I tested it, and it seems to be easy:

git svn clone https://svn.apache.org/repos/asf/hadoop/common/site/main
cd main
git remote add elek g...@github.com:elek/hadoop.git
git push elek master:asf-site

According to the blog entry, an INFRA issue should be opened (I guess by a committer or maybe a PMC member): https://blogs.apache.org/infra/entry/git_based_websites_available

3. After that I can submit the new site as a regular patch against the asf-site branch.

4. If it's merged, I can update the release wiki pages.

Marton

ps: The only suggested item which is not implemented is the short version names in the documentation menu (2.7 instead of 2.7.3). I think there are two competing forces: usability of the site and simplicity of the site generation. Ideally a new release could be added to the site as easily as possible (that was one of the motivations for the migration). While a new tag could be added to the header of the markdown files (eg: versionLine: 3.0), it would require updating multiple files during a new release. And if something were missed, multiple "2.7" menu items could be displayed (one for 2.7.3 and one for 2.7.4). So the current method is not so nice, but much more bug-safe. I prefer to keep the current content in this step (if possible), and once the site is migrated we can submit new patches (hopefully against a git branch) in the normal way and further improve the site.
From: Owen O'Malley <omal...@apache.org>
Sent: Monday, March 13, 2017 6:15 PM
To: Marton Elek
Cc: common-dev@hadoop.apache.org
Subject: Re: HADOOP-14163 proposal for new hadoop.apache.org

Thanks for addressing this. Getting rid of Hadoop's use of forrest is a good thing.

In terms of content, the documentation links should be sorted by number, with only the latest from each minor release line (eg. 3.0, 2.7, 2.6).

The download page points to the mirrors for checksums and signatures. It should use the direct links, such as:

https://dist.apache.org/repos/dist/release/hadoop/common/hadoop-2.7.3/hadoop-2.7.3-src.tar.gz.asc
https://dist.apache.org/repos/dist/release/hadoop/common/hadoop-2.7.3/hadoop-2.7.3-src.tar.gz.mds

Speaking of which, Hadoop's dist directory is huge and should be heavily pruned. We should probably take it down to just hadoop-2.6.5, hadoop-2.7.3, and hadoop-3.0.0-alpha2.

You might also want to move us to git-pubsub so that we can use a branch in our source code git repository to publish the html. Typically this uses the asf-site branch.

.. Owen

On Mon, Mar 13, 2017 at 7:28 AM, Marton Elek <me...@hortonworks.com> wrote:
>
> Hi,
>
> In the previous thread the current forrest based hadoop site was identified
> as one of the pain points of the release process.
>
> I created a new version of the site with exactly the same content.
>
> As it uses a newer site generator (hugo):
>
> 1. It's enough to create one new markdown file per release, and all the
> documentation/download links will be added automatically.
> 2. It requires only one single binary to render.
>
> A preview version is temporarily hosted at
>
> http://hadoop.anzix.net/
>
> to make it easier to review.
>
> For more details, you can check my comments on the issue:
> https://issues.apache.org/jira/browse/HADOOP-14163
>
> I would be thankful for any feedback/review.
>
> Cheers,
> Marton
Re: [VOTE] Release Apache Hadoop 2.8.0 (RC3)
+1 (non-binding)

Tested from the released binary package:
* 5 node cluster running from dockerized containers (every namenode/datanode/nodemanager, etc. is running in a separate container)
* Bitcoin blockchain data (~100Gb) parsed and imported into HBase (1.2.4)
* Spark (2.1.0 with included hadoop) job (executing on YARN) to query the data from HBase and write the results to HDFS

Looks good.

Marton

> On Mar 19, 2017, at 6:01 PM, Sunil Govind wrote:
>
> +1 (non-binding). Thanks Junping for the effort.
>
> I have used the release package and verified the cases below:
> - Ran MR sleep job and wordcount successfully where nodes are configured
> with labels.
> - Verified the application priority feature, and I could see high priority apps
> getting resources over lower priority apps when configured.
> - Verified RM web UI pages and they look fine (priority could be seen).
> - Intra-queue preemption related to app priority also seems fine.
>
> Thanks
> Sunil
>
> On Fri, Mar 17, 2017 at 2:48 PM Junping Du wrote:
>
>> Hi all,
>> With the fix of HDFS-11431 in, I've created a new release candidate
>> (RC3) for Apache Hadoop 2.8.0.
>>
>> This is the next minor release to follow up 2.7.0, which was released
>> more than 1 year ago. It comprises 2,900+ fixes, improvements, and
>> new features. Most of these commits are released for the first time in
>> branch-2.
>>
>> More information about the 2.8.0 release plan can be found here:
>> https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+2.8+Release
>>
>> The new RC is available at:
>> http://home.apache.org/~junping_du/hadoop-2.8.0-RC3
>>
>> The RC tag in git is release-2.8.0-RC3, and the latest commit id
>> is 91f2b7a13d1e97be65db92ddabc627cc29ac0009
>>
>> The maven artifacts are available via repository.apache.org at:
>> https://repository.apache.org/content/repositories/orgapachehadoop-1057
>>
>> Please try the release and vote; the vote will run for the usual 5
>> days, ending on 03/22/2017 PDT time.
>>
>> Thanks,
>>
>> Junping
HADOOP-14163 proposal for new hadoop.apache.org
Hi,

In the previous thread the current forrest based hadoop site was identified as one of the pain points of the release process.

I created a new version of the site with exactly the same content. As it uses a newer site generator (hugo):

1. It's enough to create one new markdown file per release, and all the documentation/download links will be added automatically.
2. It requires only one single binary to render.

A preview version is temporarily hosted at http://hadoop.anzix.net/ to make it easier to review.

For more details, you can check my comments on the issue: https://issues.apache.org/jira/browse/HADOOP-14163

I would be thankful for any feedback/review.

Cheers,
Marton
Re: About 2.7.4 Release
I think the main point here is the testing of the release script, not the creation of the official release. I think there should be an option to configure the release tool to use a forked github repo and/or a private playground nexus instead of the official apache repos. In that case it would be easy to test the tool regularly, even by a non-committer (or even from Jenkins). But it would be just a smoketest of the release script...

Marton

From: Allen Wittenauer
Sent: Wednesday, March 08, 2017 2:24 AM
To: Andrew Wang
Cc: Hadoop Common; yarn-...@hadoop.apache.org; Hdfs-dev; mapreduce-...@hadoop.apache.org
Subject: Re: About 2.7.4 Release

> On Mar 7, 2017, at 2:51 PM, Andrew Wang wrote:
> I think it'd be nice to have a nightly Jenkins job that builds an RC,

Just a reminder that any such build cannot be used for an actual release: http://www.apache.org/legal/release-policy.html#owned-controlled-hardware
Re: About 2.7.4 Release
Thank you very much for your feedback. I am opening the following JIRAs:

1. Create dev-support scripts to do the bulk jira updates required by the releases (check remaining jiras, update fix versions, etc.)

2. Create a 'wizard' like script which guides through the release process (all the steps from the wiki pages, not just a build; it may be an extension of the existing script). Goals:
* It should work even without the apache infrastructure: with custom configuration (forked repositories / an alternative nexus), it would be possible to test the scripts even by a non-committer.
* Every step which can be automated should be scripted (create git branches, build, ...). If a step cannot be automated, an explanation could be printed out, waiting for confirmation.
* Before dangerous steps (eg. bulk jira updates) we can ask for confirmation and explain what will happen (eg. "the following jira items will be changed: ...").
* The run should be idempotent (and there should be an option to continue the release from any step).

3. Migrate the forrest based home page to use a modern static site generator. Goals:
* Existing links should keep working (or at least be redirected).
* It should be easy to automatically add the content required by a release.

4. It's not about the release, but I think the current maven site theme could also be updated to a (more modern) theme, which could be similar to the main site from step 3.

Let me know if you have any other suggestions for actionable items, or comment on the Jiras if you have more specific requirements.

Marton

ps: Vinod, I will contact you soon.

From: Vinod Kumar Vavilapalli <vino...@apache.org>
Sent: Tuesday, March 07, 2017 11:58 PM
To: Sangjin Lee
Cc: Marton Elek; common-dev@hadoop.apache.org; yarn-...@hadoop.apache.org; Hdfs-dev; mapreduce-...@hadoop.apache.org
Subject: Re: About 2.7.4 Release

I was planning to take this up, celebrating my return from my paternity leave of absence after quite a while.
Marton, let me know if you do want to take this up instead, and we can work together.

Thanks
+Vinod

> On Mar 7, 2017, at 9:13 AM, Sangjin Lee <sj...@apache.org> wrote:
>
> If we have a volunteer for releasing 2.7.4, we should go full speed
> ahead. We still need a volunteer from a PMC member or a committer as some
> tasks may require certain privileges, but I don't think that precludes
> working with others to close down the release.
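The wizard goals listed above — scripted steps, confirmation before dangerous ones, and idempotent re-runs — can be sketched as a small bash skeleton. This is purely illustrative, not the actual Hadoop release tooling; the step names and commands are hypothetical placeholders:

```shell
#!/usr/bin/env bash
# Illustrative skeleton of an idempotent, resumable release "wizard":
# completed steps are recorded in a state file, so re-running the script
# continues from the first unfinished step; dangerous steps ask first.
set -euo pipefail

STATE_FILE="${STATE_FILE:-./release-state}"
touch "$STATE_FILE"

step_done() { grep -qxF "$1" "$STATE_FILE"; }
mark_done() { echo "$1" >> "$STATE_FILE"; }

run_step() {               # run_step <name> <dangerous: yes|no> <command...>
  local name="$1" dangerous="$2"; shift 2
  if step_done "$name"; then
    echo "skipping '$name' (already done)"
    return 0
  fi
  if [ "$dangerous" = yes ]; then
    # Explain and confirm before anything irreversible (eg. bulk jira updates)
    read -r -p "About to run dangerous step '$name'. Continue? [y/N] " ans
    [ "$ans" = y ] || { echo "aborted"; exit 1; }
  fi
  "$@"
  mark_done "$name"
}

# Hypothetical steps; a real wizard would create branches, build, update JIRA...
run_step create-branch no echo "would run: git checkout -b branch-X.Y.Z"
run_step build         no echo "would run: mvn clean install -DskipTests"
```

Because every step is guarded by the state file, the script can be aborted and re-run safely, which is exactly the "continue the release from any step" property described above.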
Re: About 2.7.4 Release
Is there any reason to wait for 2.8 before 2.7.4?

Unfortunately the previous thread about release cadence ended without a final decision. But if I understood well, there was more or less an agreement that it would be great to achieve more frequent releases, if possible (with or without written rules and an EOL policy).

I personally prefer to stay closer to the scheduling part of the proposal: "A minor release on the latest major line should be every 6 months, and a maintenance release on a minor release (as there may be concurrently maintained minor releases) every 2 months".

I don't know what the hardest part of creating new minor/maintenance releases is. But if the problems are technical (smoketesting, unit tests, the old release script, anything else) I would be happy to take on any task for new maintenance releases (or more frequent releases).

Regards,
Marton

From: Akira Ajisaka
Sent: Tuesday, March 07, 2017 7:34 AM
To: Brahma Reddy Battula; Hadoop Common; yarn-...@hadoop.apache.org; Hdfs-dev; mapreduce-...@hadoop.apache.org
Subject: Re: About 2.7.4 Release

Probably 2.8.0 will be released soon.
https://issues.apache.org/jira/browse/HADOOP-13866?focusedCommentId=15898379&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15898379

I'm thinking the 2.7.4 release process starts after the 2.8.0 release, so 2.7.4 will be released in April or May (hopefully). Thoughts?

Regards,
Akira

On 2017/03/01 21:01, Brahma Reddy Battula wrote:
> Hi All
>
> It has been six months since the branch-2.7 release. Is there any near-term
> plan for 2.7.4?
>
> Thanks
> Brahma Reddy Battula
Re: [VOTE] Release Apache Hadoop 3.0.0-alpha2 RC0
Ran MapReduce jobs (Pi)
5. Verified the Hadoop version command output is correct.

Best,
Yufei

On Tue, Jan 24, 2017 at 3:02 AM, Marton Elek <me...@hortonworks.com> wrote:
>> minicluster is kind of weird on filesystems that don't support mixed case,
>> like OS X's default HFS+.
>>
>> $ jar tf hadoop-client-minicluster-3.0.0-alpha3-SNAPSHOT.jar | grep -i license
>> LICENSE.txt
>> license/
>> license/LICENSE
>> license/LICENSE.dom-documentation.txt
>> license/LICENSE.dom-software.txt
>> license/LICENSE.sax.txt
>> license/NOTICE
>> license/README.dom.txt
>> license/README.sax.txt
>> LICENSE
>> Grizzly_THIRDPARTYLICENSEREADME.txt
>
> I added a patch to https://issues.apache.org/jira/browse/HADOOP-14018 to add
> the missing META-INF/LICENSE.txt to the shaded files.
>
> Question: what should be done with the other LICENSE files in the
> minicluster? Can we just exclude them (from a legal point of view)?
>
> Regards,
> Marton
Re: [VOTE] Release Apache Hadoop 3.0.0-alpha2 RC0
> minicluster is kind of weird on filesystems that don't support mixed case,
> like OS X's default HFS+.
>
> $ jar tf hadoop-client-minicluster-3.0.0-alpha3-SNAPSHOT.jar | grep -i license
> LICENSE.txt
> license/
> license/LICENSE
> license/LICENSE.dom-documentation.txt
> license/LICENSE.dom-software.txt
> license/LICENSE.sax.txt
> license/NOTICE
> license/README.dom.txt
> license/README.sax.txt
> LICENSE
> Grizzly_THIRDPARTYLICENSEREADME.txt

I added a patch to https://issues.apache.org/jira/browse/HADOOP-14018 to add the missing META-INF/LICENSE.txt to the shaded files.

Question: what should be done with the other LICENSE files in the minicluster? Can we just exclude them (from a legal point of view)?

Regards,
Marton
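The HFS+ weirdness quoted above is a case-insensitive name collision inside the jar: the top-level LICENSE file and the license/ directory merge when extracted on a case-insensitive filesystem. A quick way to flag such collisions is to normalize each entry (drop a trailing slash, lowercase) and look for duplicates. The helper below is an illustrative sketch, not part of the Hadoop build:

```shell
#!/usr/bin/env bash
# Report jar entries that collide when case is ignored (as on HFS+).
# Feed it the output of `jar tf <file>`. Normalization: strip a trailing
# "/" so a directory entry collides with a same-named file, then lowercase.
find_case_collisions() {
  sed 's:/$::' | tr '[:upper:]' '[:lower:]' | sort | uniq -d
}

# Example with a few entries from the listing quoted above; a real run
# would be:  jar tf hadoop-client-minicluster-*.jar | find_case_collisions
printf '%s\n' LICENSE.txt license/ license/LICENSE LICENSE | find_case_collisions
# → license
```

Here "license" is reported because the LICENSE file and the license/ directory normalize to the same name, which is exactly the pair that cannot coexist on a case-insensitive filesystem.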
Re: [VOTE] Release Apache Hadoop 3.0.0-alpha2 RC0
hadoop-3.0.0-alpha2.tar.gz is much smaller than hadoop-3.0.0-alpha1.tar.gz (246M vs 316M). The big difference is the generated source documentation:

find -name src-html
./hadoop-2.7.3/share/doc/hadoop/api/src-html
./hadoop-2.7.3/share/doc/hadoop/hadoop-hdfs-httpfs/apidocs/src-html
./hadoop-3.0.0-alpha1/share/doc/hadoop/api/src-html
./hadoop-3.0.0-alpha1/share/doc/hadoop/hadoop-hdfs-httpfs/apidocs/src-html
./hadoop-3.0.0-alpha1/share/doc/hadoop/hadoop-project-dist/hadoop-hdfs/api/src-html
./hadoop-3.0.0-alpha1/share/doc/hadoop/hadoop-project-dist/hadoop-hdfs-client/api/src-html
./hadoop-3.0.0-alpha1/share/doc/hadoop/hadoop-project-dist/hadoop-common/api/src-html
(./hadoop-3.0.0-alpha2 --> no match)

I am just wondering if this is intentional or not, as I can't find any related jira or mail thread (maybe I missed it).

Regards,
Marton

On 01/20/2017 11:36 PM, Andrew Wang wrote:
> Hi all,
>
> With heartfelt thanks to many contributors, the RC0 for 3.0.0-alpha2 is
> ready.
>
> 3.0.0-alpha2 is the second alpha in the planned 3.0.0 release line leading
> up to a 3.0.0 GA. It comprises 857 fixes, improvements, and new features
> since alpha1 was released on September 3rd, 2016.
>
> More information about the 3.0.0 release plan can be found here:
> https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+3.0.0+release
>
> The artifacts can be found here:
> http://home.apache.org/~wang/3.0.0-alpha2-RC0/
>
> This vote will run 5 days, ending on 01/25/2017 at 2PM pacific.
>
> I ran basic validation with a local pseudo cluster and a Pi job. RAT output
> was clean.
>
> My +1 to start.
>
> Thanks,
> Andrew
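To quantify how much of the tarball-size difference the src-html directories account for, a small helper like the following can be run against each unpacked release tree. This is an illustrative sketch (the helper name is made up; the src-html path pattern is from the listing above):

```shell
#!/usr/bin/env bash
# Sum the on-disk size (in KB) of all generated javadoc source pages
# (directories named src-html) under a release tree.
src_html_size_kb() {
  local root="$1" total=0 dir
  while IFS= read -r dir; do
    # du -sk: size of one src-html tree in KB; accumulate
    total=$(( total + $(du -sk "$dir" | cut -f1) ))
  done < <(find "$root" -type d -name src-html)
  echo "$total"
}

# Usage sketch:
#   src_html_size_kb ./hadoop-3.0.0-alpha1   # large: src-html present
#   src_html_size_kb ./hadoop-3.0.0-alpha2   # 0: no src-html generated
```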