Re: [DISCUSS] hadoop branch-3.3+ going to java11 only
In theory, I like the idea of setting aside Java 8. Unfortunately, I don't
know that upgrading within the 3.3 line adheres to our binary compatibility
policy [1]. I don't see specific discussion of the Java version there, but it
states that you should be able to drop in minor upgrades and have existing
apps keep working. Users might find it surprising if they try to upgrade a
cluster that has JDK 8.

There is also the question of impact on downstream projects [2]. We'd have to
check plans with our consumers.

What about the idea of shooting for a 3.4 release on JDK 11 (or even 17)? The
downside is that we'd probably need to set boundaries on end of life/limited
support for 3.2 and 3.3 to keep the workload manageable.

[1] https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Compatibility.html#Java_Binary_compatibility_for_end-user_applications_i.e._Apache_Hadoop_ABI
[2] https://github.com/apache/spark/blob/v3.3.2/pom.xml#L109

Chris Nauroth

On Tue, Mar 28, 2023 at 11:10 AM Ayush Saxena wrote:

> > > it's already hard to migrate from JDK8 why not retarget JDK17.
>
> +1, makes sense to me; sounds like a win-win situation, though there
> would be some additional issues to chase now :)
>
> -Ayush
>
> On Tue, 28 Mar 2023 at 23:29, Wei-Chiu Chuang wrote:
>
> > My random thoughts. Probably bad takes:
> >
> > There are projects experimenting with JDK17 now.
> > JDK11 active support will end in 6 months. If it's already hard to
> > migrate from JDK8, why not retarget JDK17?
> >
> > On Tue, Mar 28, 2023 at 10:30 AM Ayush Saxena wrote:
> >
> >> I know the Jersey upgrade is a blocker. Some folks were chasing that last
> >> year around 3.3.4 time. I don't know where it is now; I didn't see then
> >> what the problem there was, but I remember there was some initial PR which
> >> did it for HDFS at least, so I never looked beyond that...
> >>
> >> I too had JDK 11 in my mind, but only for trunk. 3.4.x can stay as a
> >> java-11-only branch maybe, but that is something to decide later, once we
> >> get the code sorted...
> >>
> >> -Ayush
> >>
> >> > On 28-Mar-2023, at 9:16 PM, Steve Loughran wrote:
> >> >
> >> > well, how about we flip the switch and get on with it.
> >> >
> >> > slf4j seems happy on java11.
> >> >
> >> > side issue: anyone seen test failures on zulu1.8? somehow my test run is
> >> > failing and I'm trying to work out whether it's a mismatch in command
> >> > line/IDE JVM versions, or the 3.3.5 JARs have been built with an openjdk
> >> > version which requires IntBuffer implements an overridden method
> >> > IntBuffer rewind().
> >> >
> >> > java.lang.NoSuchMethodError: java.nio.IntBuffer.rewind()Ljava/nio/IntBuffer;
> >> >
> >> > at org.apache.hadoop.fs.FSInputChecker.verifySums(FSInputChecker.java:341)
> >> > at org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:308)
> >> > at org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:257)
> >> > at org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:202)
> >> > at java.io.DataInputStream.read(DataInputStream.java:149)
> >> >
> >> >> On Tue, 28 Mar 2023 at 15:52, Viraj Jasani wrote:
> >> >>
> >> >> IIRC some of the ongoing major dependency upgrades (log4j 1 to 2,
> >> >> jersey 1 to 2, and junit 4 to 5) are blockers for java 11 compile +
> >> >> test stability.
> >> >>
> >> >> On Tue, Mar 28, 2023 at 4:55 AM Steve Loughran wrote:
> >> >>
> >> >>> Now that hadoop 3.3.5 is out, i want to propose something new:
> >> >>> we switch branch-3.3 and trunk to being java11 only.
> >> >>> 1. java 11 has been out for years
> >> >>> 2. oracle java 8 is no longer available under "premier support"; you
> >> >>> can't really get upgrades
> >> >>> https://www.oracle.com/java/technologies/java-se-support-roadmap.html
> >> >>> 3. openJDK 8 releases != oracle ones, and things you compile with them
> >> >>> don't always link to oracle java 8 (some classes in java.nio have
> >> >>> added more overrides)
> >> >>> 4. more and more libraries we want to upgrade to/bundle are java 11 only
> >> >>> 5. moving to java 11 would cut our yetus build workload in half, and
> >> >>> line up for adding java 17 builds instead.
> >> >>> I know there are some outstanding issues still in
> >> >>> https://issues.apache.org/jira/browse/HADOOP-16795 -but are they blockers?
> >> >>> Could we just move to java11 and enhance at our leisure, once java8 is
> >> >>> no longer a concern?
> >>
> >> -
> >> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> >> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
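[Editor's note] Steve's NoSuchMethodError above is an instance of point 3: JDK 9 added covariant overrides in java.nio, so IntBuffer.rewind() now returns IntBuffer rather than Buffer, and bytecode compiled against a JDK 9+ class library without `--release 8` references a descriptor that does not exist on a Java 8 runtime. A minimal sketch of the issue and a portable workaround (the class name here is illustrative, not Hadoop code):

```java
import java.nio.Buffer;
import java.nio.IntBuffer;

public class RewindLinkageDemo {
    public static void main(String[] args) {
        IntBuffer buf = IntBuffer.allocate(4);
        buf.put(1).put(2);

        // Compiled against JDK 9+ without --release 8, "buf.rewind()" is
        // recorded as IntBuffer.rewind()Ljava/nio/IntBuffer; - a method that
        // a Java 8 runtime does not have, hence NoSuchMethodError there.
        // Calling through the Buffer supertype (or compiling with
        // javac --release 8) pins the descriptor to the Java 8 compatible
        // Buffer.rewind()Ljava/nio/Buffer;.
        Buffer b = buf;
        b.rewind();
        System.out.println(b.position()); // back at the start
    }
}
```

This is why JARs built on a newer OpenJDK without the `--release` flag can fail to link on Oracle/zulu Java 8 even though the source is Java 8 compatible.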
Re: [VOTE] Release Apache Hadoop 3.3.5 (RC3)
+1

Thank you for the release candidate, Steve!

* Verified all checksums.
* Verified all signatures.
* Built from source, including native code on Linux.
  * mvn clean package -Pnative -Psrc -Drequire.openssl -Drequire.snappy -Drequire.zstd -DskipTests
* Tests passed.
  * mvn --fail-never clean test -Pnative -Dparallel-tests -Drequire.snappy -Drequire.zstd -Drequire.openssl -Dsurefire.rerunFailingTestsCount=3 -DtestsThreadCount=8
* Checked dependency tree to make sure we have all of the expected library
updates that are mentioned in the release notes.
  * mvn -o dependency:tree
* Confirmed that hadoop-openstack is now just a stub placeholder artifact
with no code.
* For ARM verification:
  * Ran "file " on all native binaries in the ARM tarball to confirm they
actually came out with ARM as the architecture.
  * Output of hadoop checknative -a on ARM looks good.
  * Ran a MapReduce job with the native bzip2 codec for compression, and it
worked fine.
  * Ran a MapReduce job with YARN configured to use LinuxContainerExecutor
and verified launching the containers through container-executor worked.

Chris Nauroth

On Mon, Mar 20, 2023 at 3:45 AM Ayush Saxena wrote:

> +1 (Binding)
>
> * Built from source (x86 & ARM)
> * Successful Native Build (x86 & ARM)
> * Verified Checksums (x86 & ARM)
> * Verified Signature (x86 & ARM)
> * Checked the output of hadoop version (x86 & ARM)
> * Verified the output of hadoop checknative (x86 & ARM)
> * Ran some basic HDFS shell commands.
> * Ran some basic Yarn shell commands.
> * Played a bit with HDFS Erasure Coding.
> * Ran TeraGen & TeraSort
> * Browsed through NN, DN, RM & NM UI
> * Skimmed over the contents of the website.
> * Skimmed over the contents of the maven repo.
> * Selectively ran some HDFS & CloudStore tests
>
> Thanx Steve for driving the release. Good Luck!!!
>
> -Ayush
>
> > On 20-Mar-2023, at 12:54 PM, Xiaoqiao He wrote:
> >
> > +1
> >
> > * Verified signature and checksum of the source tarball.
> > * Built the source code on Ubuntu and OpenJDK 11 by `mvn clean package
> > -DskipTests -Pnative -Pdist -Dtar`.
> > * Setup pseudo cluster with HDFS and YARN.
> > * Run simple FsShell - mkdir/put/get/mv/rm (include EC) and check the
> > result.
> > * Run example MR applications and check the result - Pi & wordcount.
> > * Check the Web UI of NameNode/DataNode/ResourceManager/NodeManager etc.
> >
> > Thanks Steve for your work.
> >
> > Best Regards,
> > - He Xiaoqiao
> >
> >> On Mon, Mar 20, 2023 at 12:04 PM Masatake Iwasaki <iwasak...@oss.nttdata.com> wrote:
> >>
> >> +1
> >>
> >> + verified the signature and checksum of the source tarball.
> >> + built from the source tarball on Rocky Linux 8 (x86_64) and OpenJDK 8
> >> with native profile enabled.
> >> + launched pseudo distributed cluster including kms and httpfs with
> >> Kerberos and SSL enabled.
> >> + created encryption zone, put and read files via httpfs.
> >> + ran example MR wordcount over encryption zone.
> >> + checked the binary of container-executor.
> >> + built rpm packages by Bigtop (with trivial modifications) on Rocky
> >> Linux 8 (aarch64).
> >> + ran smoke-tests of hdfs, yarn and mapreduce.
> >> + built site documentation and skimmed the contents.
> >> + Javadocs are contained.
> >>
> >> Thanks,
> >> Masatake Iwasaki
> >>
> >>> On 2023/03/16 4:47, Steve Loughran wrote:
> >>> Apache Hadoop 3.3.5
> >>>
> >>> Mukund and I have put together a release candidate (RC3) for Hadoop 3.3.5.
> >>>
> >>> What we would like is for anyone who can to verify the tarballs,
> >>> especially anyone who can try the arm64 binaries, as we want to include
> >>> them too.
> >>>
> >>> The RC is available at:
> >>> https://dist.apache.org/repos/dist/dev/hadoop/hadoop-3.3.5-RC3/
> >>>
> >>> The git tag is release-3.3.5-RC3, commit 706d88266ab
> >>>
> >>> The maven artifacts are staged at
> >>> https://repository.apache.org/content/repositories/orgapachehadoop-1369/
> >>>
> >>> You can find my public key at:
> >>> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
> >>>
> >>> Change log
> >>> https://dist.apache.org/repos/dist/dev/hadoop/hadoop-3.3.5-RC3/CHANGELOG.md
> >>>
> >>> Release notes
> >>>
Re: [VOTE] Release Apache Hadoop 3.3.5 (RC3)
Yes, I'm in progress on verification, so you can expect to get a vote from me.

Thank you, Steve!

Chris Nauroth

On Sat, Mar 18, 2023 at 9:19 AM Ashutosh Gupta wrote:

> Hi Steve
>
> I will also do it by today/tomorrow.
>
> Thanks,
> Ashutosh
>
> On Sat, 18 Mar, 2023, 4:07 pm Steve Loughran wrote:
>
> > Thank you for this!
> >
> > Can anyone else with time do a review too? I really want to get this one
> > done, now the HDFS issues are all resolved.
> >
> > I do not want this release to fall by the wayside through lack of votes
> > alone. In fact, I would be very unhappy.
> >
> > On Sat, 18 Mar 2023 at 06:47, Viraj Jasani wrote:
> >
> > > +1 (non-binding)
> > >
> > > * Signature/Checksum: ok
> > > * Rat check (1.8.0_341): ok
> > >   - mvn clean apache-rat:check
> > > * Built from source (1.8.0_341): ok
> > >   - mvn clean install -DskipTests
> > > * Built tar from source (1.8.0_341): ok
> > >   - mvn clean package -Pdist -DskipTests -Dtar -Dmaven.javadoc.skip=true
> > >
> > > Containerized deployments:
> > > * Deployed and started HDFS - NN, DN, JN with HBase 2.5 and Zookeeper 3.7
> > > * Deployed and started JHS, RM, NM
> > > * HBase, HDFS CRUD looks good
> > > * Sample RowCount MapReduce job looks good
> > >
> > > * S3A tests with scale profile look good
> > >
> > > On Wed, Mar 15, 2023 at 12:48 PM Steve Loughran wrote:
> > >
> > > > Apache Hadoop 3.3.5
> > > >
> > > > Mukund and I have put together a release candidate (RC3) for Hadoop
> > > > 3.3.5.
> > > >
> > > > What we would like is for anyone who can to verify the tarballs,
> > > > especially anyone who can try the arm64 binaries, as we want to
> > > > include them too.
> > > >
> > > > The RC is available at:
> > > > https://dist.apache.org/repos/dist/dev/hadoop/hadoop-3.3.5-RC3/
> > > >
> > > > The git tag is release-3.3.5-RC3, commit 706d88266ab
> > > >
> > > > The maven artifacts are staged at
> > > > https://repository.apache.org/content/repositories/orgapachehadoop-1369/
> > > >
> > > > You can find my public key at:
> > > > https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
> > > >
> > > > Change log
> > > > https://dist.apache.org/repos/dist/dev/hadoop/hadoop-3.3.5-RC3/CHANGELOG.md
> > > >
> > > > Release notes
> > > > https://dist.apache.org/repos/dist/dev/hadoop/hadoop-3.3.5-RC3/RELEASENOTES.md
> > > >
> > > > This is off branch-3.3 and is the first big release since 3.3.2.
> > > >
> > > > Key changes include
> > > >
> > > > * Big update of dependencies to try and keep those reports of
> > > > transitive CVEs under control -both genuine and false positives.
> > > > * HDFS RBF enhancements
> > > > * Critical fix to ABFS input stream prefetching for correct reading.
> > > > * Vectored IO API for all FSDataInputStream implementations, with
> > > > high-performance versions for file:// and s3a:// filesystems.
> > > >   file:// through java native io
> > > >   s3a:// parallel GET requests.
> > > > * This release includes Arm64 binaries. Please can anyone with
> > > > compatible systems validate these.
> > > > * and compared to the previous RC, all the major changes are
> > > > HDFS issues.
> > > >
> > > > Note, because the arm64 binaries are built separately on a different
> > > > platform and JVM, their jar files may not match those of the x86
> > > > release -and therefore the maven artifacts. I don't think this is
> > > > an issue (the ASF actually releases source tarballs; the binaries are
> > > > there for help only, though with the maven repo that's a bit blurred).
> > > >
> > > > The only way to be consistent would be to untar the x86.tar.gz,
> > > > overwrite its binaries with the arm stuff, retar, sign and push out
> > > > for the vote. Even automating that would be risky.
> > > >
> > > > Please try the release and vote. The vote will run for 5 days.
> > > >
> > > > -Steve
[jira] [Resolved] (YARN-11231) FSDownload set wrong permission in destinationTmp
[ https://issues.apache.org/jira/browse/YARN-11231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Nauroth resolved YARN-11231.
----------------------------------
      Assignee: Zhang Dongsheng
    Resolution: Won't Fix

Hello [~skysider]. I noticed you closed pull request [#4629|https://github.com/apache/hadoop/pull/4629]. I assume you are abandoning this change because 777 would be too dangerous, so I'm also closing the corresponding JIRA issue. (If I misunderstood, and you're still working on something for this, then the issue can be reopened.)

> FSDownload set wrong permission in destinationTmp
> -------------------------------------------------
>
>                 Key: YARN-11231
>                 URL: https://issues.apache.org/jira/browse/YARN-11231
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: yarn
>            Reporter: Zhang Dongsheng
>            Assignee: Zhang Dongsheng
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> FSDownload calls createDir in the call method to create the destinationTmp
> directory, which is later used as the parent directory to create the
> directory dFinal, which is used in doAs to perform operations such as path
> creation and path traversal. doAs cannot determine the user's identity, so
> there is a problem with setting 755 permissions for destinationTmp here; I
> think it should be set to 777 permissions.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
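[Editor's note] For context on why the 777 proposal was rejected: a mode of 755 lets other local users traverse but not write, while 777 would let any local user inject files into a localization directory. A small standalone sketch of the distinction (this is not the FSDownload code; names and paths are illustrative):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

public class TmpDirPerms {
    public static void main(String[] args) throws IOException {
        // Stand-in for a destinationTmp-style directory.
        Path tmp = Files.createTempDirectory("destinationTmp");

        // rwxr-xr-x (755): the owning user has full access; everyone else
        // can list/traverse but cannot create files. rwxrwxrwx (777) would
        // allow any local user to write into the directory, which is why
        // the change was abandoned as too dangerous.
        Set<PosixFilePermission> perms =
                PosixFilePermissions.fromString("rwxr-xr-x");
        // setPosixFilePermissions applies the mode exactly (no umask).
        Files.setPosixFilePermissions(tmp, perms);

        System.out.println(PosixFilePermissions.toString(
                Files.getPosixFilePermissions(tmp)));
        Files.delete(tmp);
    }
}
```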
Re: [VOTE] Release Apache Hadoop 3.3.5
Is it a problem limited to MiniDFSCluster, or is it a broader problem of RPC
client resource cleanup? The patch is changing connection close cleanup, so I
assumed the latter. If so, then it could potentially impact applications
integrating with the RPC clients.

If the problem is limited to MiniDFSCluster and restarts within a single JVM,
then I agree the impact is smaller. Then, we'd want to consider which
downstream projects have tests that do restarts on a MiniDFSCluster.

Chris Nauroth

On Wed, Jan 4, 2023 at 4:22 PM Ayush Saxena wrote:

> Hmm I'm looking at HADOOP-11867 related stuff but couldn't find it
>> mentioned anywhere in change log or release notes. Are they actually
>> up-to-date?
>
> I don't think there is any issue with the ReleaseNotes generation as such,
> but with the Resolution type of this ticket: it ain't marked as Fixed but
> Done. The other ticket which is marked Done is also not part of the release
> notes. [1]
>
> if I'm understanding the potential impact of HDFS-16853
>> correctly, then it's serious enough to fix before a release. (I could
>> change my vote if someone wants to make a case that it's not that
>> serious.)
>
> Chris, I just had a very quick look at HDFS-16853; I am not sure if this
> can happen outside a MiniDfsCluster setup? Just guessing from the
> description in the ticket: it looked like it arose when we did a restart of
> the Namenode in the MiniDfsCluster, which would be in the same single JVM,
> and that is why a previously blocked thread caused issues with the restart.
> That is what I understood; I haven't checked the code though.
>
> Second, in the same context: if this ends up being a MiniDfsCluster-only
> issue, do we still consider it a release blocker? Not saying it won't be
> serious; MiniDfsCluster is very widely used by downstream projects, so I
> just wanted to know.
>
> Regarding Hive & Bouncy Castle: the PR seems to have a valid binding veto,
> and I am not sure it will get done any time soon, so if the use case is
> something required, I would suggest handling it in Hadoop itself. It seems
> to be specific to Hive 3.x; I tried compiling the Hive master branch with
> 3.3.5 and it passed. Other than that, Hive officially supports only
> Hadoop-3.3.1, and that too only in the last 4.x release[2]
>
> [1] https://issues.apache.org/jira/browse/HADOOP-11867?jql=project%20%3D%20HADOOP%20AND%20resolution%20%3D%20Done%20AND%20fixVersion%20%3D%203.3.5%20ORDER%20BY%20resolution%20DESC
> [2] https://issues.apache.org/jira/browse/HIVE-24484
>
> -Ayush
>
> On Tue, 3 Jan 2023 at 23:51, Chris Nauroth wrote:
>
>> -1, because if I'm understanding the potential impact of HDFS-16853
>> correctly, then it's serious enough to fix before a release. (I could
>> change my vote if someone wants to make a case that it's not that
>> serious.)
>>
>> Otherwise, this RC was looking good:
>>
>> * Verified all checksums.
>> * Verified all signatures.
>> * Built from source, including native code on Linux.
>>   * mvn clean package -Pnative -Psrc -Drequire.openssl -Drequire.snappy -Drequire.zstd -DskipTests
>> * Tests passed.
>>   * mvn --fail-never clean test -Pnative -Dparallel-tests -Drequire.snappy -Drequire.zstd -Drequire.openssl -Dsurefire.rerunFailingTestsCount=3 -DtestsThreadCount=8
>> * Checked dependency tree to make sure we have all of the expected library
>> updates that are mentioned in the release notes.
>>   * mvn -o dependency:tree
>> * Farewell, S3Guard.
>> * Confirmed that hadoop-openstack is now just a stub placeholder artifact
>> with no code.
>> * For ARM verification:
>>   * Ran "file " on all native binaries in the ARM tarball to confirm
>> they actually came out with ARM as the architecture.
>>   * Output of hadoop checknative -a on ARM looks good.
>>   * Ran a MapReduce job with the native bzip2 codec for compression, and
>> it worked fine.
>>   * Ran a MapReduce job with YARN configured to use
>> LinuxContainerExecutor and verified launching the containers through
>> container-executor worked.
>>
>> My local setup didn't have the test failures mentioned by Viraj, though
>> there was some flakiness with a few HDFS snapshot tests timing out.
>>
>> Regarding Hive and Bouncy Castle, there is an existing issue and pull
>> request tracking an upgrade attempt. It's looking like some amount of code
>> changes are required:
>>
>> https://issues.apache.org/jira/browse/HIVE-26648
>> https://github.com/apache/hive/pull/3744
>>
>> Chris Nauroth
>>
>> On Tue, Jan 3, 2
Re: [VOTE] Release Apache Hadoop 3.3.5
-1, because if I'm understanding the potential impact of HDFS-16853
correctly, then it's serious enough to fix before a release. (I could change
my vote if someone wants to make a case that it's not that serious.)

Otherwise, this RC was looking good:

* Verified all checksums.
* Verified all signatures.
* Built from source, including native code on Linux.
  * mvn clean package -Pnative -Psrc -Drequire.openssl -Drequire.snappy -Drequire.zstd -DskipTests
* Tests passed.
  * mvn --fail-never clean test -Pnative -Dparallel-tests -Drequire.snappy -Drequire.zstd -Drequire.openssl -Dsurefire.rerunFailingTestsCount=3 -DtestsThreadCount=8
* Checked dependency tree to make sure we have all of the expected library
updates that are mentioned in the release notes.
  * mvn -o dependency:tree
* Farewell, S3Guard.
* Confirmed that hadoop-openstack is now just a stub placeholder artifact
with no code.
* For ARM verification:
  * Ran "file " on all native binaries in the ARM tarball to confirm they
actually came out with ARM as the architecture.
  * Output of hadoop checknative -a on ARM looks good.
  * Ran a MapReduce job with the native bzip2 codec for compression, and it
worked fine.
  * Ran a MapReduce job with YARN configured to use LinuxContainerExecutor
and verified launching the containers through container-executor worked.

My local setup didn't have the test failures mentioned by Viraj, though
there was some flakiness with a few HDFS snapshot tests timing out.

Regarding Hive and Bouncy Castle, there is an existing issue and pull
request tracking an upgrade attempt. It's looking like some amount of code
changes are required:

https://issues.apache.org/jira/browse/HIVE-26648
https://github.com/apache/hive/pull/3744

Chris Nauroth

On Tue, Jan 3, 2023 at 8:57 AM Chao Sun wrote:

> Hmm I'm looking at HADOOP-11867 related stuff but couldn't find it
> mentioned anywhere in change log or release notes. Are they actually
> up-to-date?
>
> On Mon, Jan 2, 2023 at 7:48 AM Masatake Iwasaki wrote:
>
> > > - building HBase 2.4.13 and Hive 3.1.3 against 3.3.5 failed due to
> > > dependency change.
> >
> > For HBase, classes under com/sun/jersey/json/* and com/sun/xml/* are not
> > expected in hbase-shaded-with-hadoop-check-invariants.
> > Updating hbase-shaded/pom.xml is expected to be the fix, as done in
> > HBASE-27292.
> > https://github.com/apache/hbase/commit/00612106b5fa78a0dd198cbcaab610bd8b1be277
> >
> >   [INFO] --- exec-maven-plugin:1.6.0:exec (check-jar-contents-for-stuff-with-hadoop) @ hbase-shaded-with-hadoop-check-invariants ---
> >   [ERROR] Found artifact with unexpected contents: '/home/rocky/srcs/bigtop/build/hbase/rpm/BUILD/hbase-2.4.13/hbase-shaded/hbase-shaded-client/target/hbase-shaded-client-2.4.13.jar'
> >   Please check the following and either correct the build or update
> >   the allowed list with reasoning.
> >
> >   com/
> >   com/sun/
> >   com/sun/jersey/
> >   com/sun/jersey/json/
> >   ...
> >
> > For Hive, classes belonging to org.bouncycastle:bcprov-jdk15on:1.68 seem
> > to be problematic. Excluding them on hive-jdbc might be the fix.
> >
> >   [ERROR] Failed to execute goal org.apache.maven.plugins:maven-shade-plugin:3.2.1:shade (default) on project hive-jdbc: Error creating shaded jar: Problem shading JAR /home/rocky/.m2/repository/org/bouncycastle/bcprov-jdk15on/1.68/bcprov-jdk15on-1.68.jar entry META-INF/versions/15/org/bouncycastle/jcajce/provider/asymmetric/edec/SignatureSpi$EdDSA.class: java.lang.IllegalArgumentException: Unsupported class file major version 59 -> [Help 1]
> >   ...
> >
> > On 2023/01/02 22:02, Masatake Iwasaki wrote:
> > > Thanks for your great effort for the new release, Steve and Mukund.
> > >
> > > +1, while it would be nice if we can address the missed Javadocs.
> > >
> > > + verified the signature and checksum.
> > > + built from source tarball on Rocky Linux 8 and OpenJDK 8 with native
> > > profile enabled.
> > >   + launched pseudo distributed cluster including kms and httpfs with
> > > Kerberos and SSL enabled.
> > >   + created encryption zone, put and read files via httpfs.
> > >   + ran example MR wordcount over encryption zone.
> > > + built rpm packages by Bigtop and ran smoke-tests on Rocky Linux 8
> > > (both x86_64 and aarch64).
> > >   - building HBase 2.4.13 and Hive 3.1.3 against 3.3.5 failed due to
> > > dependency change.
> > >     # while building HBase 2.4.13 and Hive 3.1.3 against Hadoop 3.3.4
> > > worked.
> > > + skimmed the site contents.
> > >   - Javadocs are not contained (under r3.3.5/
[jira] [Resolved] (YARN-11388) Prevent resource leaks in TestClientRMService.
[ https://issues.apache.org/jira/browse/YARN-11388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Nauroth resolved YARN-11388.
----------------------------------
    Fix Version/s: 3.4.0
                   3.2.5
                   3.3.9
       Resolution: Fixed

I have merged this to trunk, branch-3.3 and branch-3.2 (after resolving some minor merge conflicts). [~slfan1989], thank you for your review!

> Prevent resource leaks in TestClientRMService.
> ----------------------------------------------
>
>                 Key: YARN-11388
>                 URL: https://issues.apache.org/jira/browse/YARN-11388
>             Project: Hadoop YARN
>          Issue Type: Test
>          Components: test
>            Reporter: Chris Nauroth
>            Assignee: Chris Nauroth
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 3.4.0, 3.2.5, 3.3.9
>
>
> While working on YARN-11360, I noticed a few problems in
> {{TestClientRMService}} that made it difficult to work with. Tests do not
> guarantee that servers they start up get shut down. If an individual test
> fails, then it can leave TCP sockets bound, causing subsequent tests in the
> suite to fail on their socket bind attempts for the same port. There is also
> a file generated by a test that is leaking outside of the build directory
> into the source tree.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
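[Editor's note] The cleanup pattern this issue calls for: guarantee that any server a test starts is stopped even when the test body throws, so a failing test cannot leave its TCP port bound for later tests in the suite. A minimal standalone sketch (a ServerSocket stands in for an RM service; in JUnit the close belongs in an @After/@AfterEach method):

```java
import java.io.IOException;
import java.net.ServerSocket;

public class CleanupPattern {
    public static void main(String[] args) throws IOException {
        // Bind an ephemeral port, standing in for a test-started RM server.
        ServerSocket server = new ServerSocket(0);
        try {
            // ... test body that may throw or fail assertions ...
        } finally {
            // Runs even on failure, releasing the bound port so subsequent
            // tests can bind it again.
            server.close();
        }
        System.out.println(server.isClosed());
    }
}
```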
[jira] [Resolved] (YARN-11392) ClientRMService implemented getCallerUgi and verifyUserAccessForRMApp methods but forget to use sometimes, caused audit log missing.
[ https://issues.apache.org/jira/browse/YARN-11392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Nauroth resolved YARN-11392.
----------------------------------
    Fix Version/s: 3.4.0
                   3.2.5
                   3.3.9
       Resolution: Fixed

I have committed this to trunk, branch-3.3 and branch-3.2. [~chino71], thank you for the contribution.

> ClientRMService implemented getCallerUgi and verifyUserAccessForRMApp methods
> but forget to use sometimes, caused audit log missing.
> -----------------------------------------------------------------------------
>
>                 Key: YARN-11392
>                 URL: https://issues.apache.org/jira/browse/YARN-11392
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: yarn
>    Affects Versions: 3.3.4
>            Reporter: Beibei Zhao
>            Assignee: Beibei Zhao
>            Priority: Major
>              Labels: audit, log, pull-request-available, yarn
>             Fix For: 3.4.0, 3.2.5, 3.3.9
>
>
> ClientRMService implemented getCallerUgi and verifyUserAccessForRMApp
> methods.
> {code:java}
>   private UserGroupInformation getCallerUgi(ApplicationId applicationId,
>       String operation) throws YarnException {
>     UserGroupInformation callerUGI;
>     try {
>       callerUGI = UserGroupInformation.getCurrentUser();
>     } catch (IOException ie) {
>       LOG.info("Error getting UGI ", ie);
>       RMAuditLogger.logFailure("UNKNOWN", operation, "UNKNOWN",
>           "ClientRMService", "Error getting UGI", applicationId);
>       throw RPCUtil.getRemoteException(ie);
>     }
>     return callerUGI;
>   }
> {code}
> *Privileged operations* like "getContainerReport" (which called checkAccess
> before op) will call them and *record audit logs* when an *exception*
> happens, but forget to use sometimes, caused audit log {*}missing{*}:
> {code:java}
>     // getApplicationReport
>     UserGroupInformation callerUGI;
>     try {
>       callerUGI = UserGroupInformation.getCurrentUser();
>     } catch (IOException ie) {
>       LOG.info("Error getting UGI ", ie);
>       // a logFailure should be called here.
>       throw RPCUtil.getRemoteException(ie);
>     }
> {code}
> So, I will replace some code blocks like this with getCallerUgi or
> verifyUserAccessForRMApp.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
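[Editor's note] The fix above funnels every caller through one helper so a failed UGI lookup always produces an audit record before the exception propagates. A simplified, self-contained sketch of that control flow (the stand-in logger and user lookup below are illustrative; the real code uses UserGroupInformation and RMAuditLogger):

```java
import java.io.IOException;

public class AuditOnFailureDemo {
    // Stand-in audit sink; RMAuditLogger plays this role in YARN.
    static final StringBuilder AUDIT = new StringBuilder();

    static void logFailure(String user, String operation, String cause) {
        AUDIT.append("FAILURE user=").append(user)
             .append(" op=").append(operation)
             .append(" cause=").append(cause).append('\n');
    }

    // Stand-in for UserGroupInformation.getCurrentUser(); simulates the
    // failure path so the audit behavior is observable.
    static String currentUser() throws IOException {
        throw new IOException("simulated UGI lookup failure");
    }

    // The pattern the JIRA asks for: one shared helper that audits the
    // failure before rethrowing, so no call site can forget to log.
    static String getCallerUgi(String operation) throws IOException {
        try {
            return currentUser();
        } catch (IOException ie) {
            logFailure("UNKNOWN", operation, "Error getting UGI");
            throw ie;
        }
    }

    public static void main(String[] args) {
        try {
            getCallerUgi("getApplicationReport");
        } catch (IOException expected) {
            // An inlined try/catch that skips logFailure would lose this
            // audit record; the shared helper cannot.
        }
        System.out.print(AUDIT);
    }
}
```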
Re: [VOTE] Release Apache Hadoop 3.3.5
I'm not quite ready to vote yet, pending some additional testing. However, I
wanted to give a quick update that ARM support is looking good from my
perspective. I focused on verifying the native bits that would need to be
different for ARM vs. x64. Here is what I did:

* Ran "file " on all native binaries in the ARM tarball to confirm they
actually came out with ARM as the architecture.
* Output of hadoop checknative -a on ARM looks good.
* Ran a MapReduce job with the native bzip2 codec for compression, and it
worked fine.
* Ran a MapReduce job with YARN configured to use LinuxContainerExecutor and
verified launching the containers through container-executor worked.

Chris Nauroth

On Wed, Dec 21, 2022 at 11:29 AM Steve Loughran wrote:

> Mukund and I have put together a release candidate (RC0) for Hadoop 3.3.5.
>
> Given the time of year it's a bit unrealistic to run a 5 day vote and
> expect people to be able to test it thoroughly enough to make this the one
> we can ship.
>
> What we would like is for anyone who can to verify the tarballs, and test
> the binaries, especially anyone who can try the arm64 binaries. We've got
> the building of those done and now the build file will incorporate them
> into the release -but neither of us have actually tested it yet. Maybe I
> should try it on my pi400 over xmas.
>
> The maven artifacts are up on the apache staging repo -they are the ones
> from the x86 build. Building and testing downstream apps will be incredibly
> helpful.
>
> The RC is available at:
> https://dist.apache.org/repos/dist/dev/hadoop/hadoop-3.3.5-RC0/
>
> The git tag is release-3.3.5-RC0, commit 3262495904d
>
> The maven artifacts are staged at
> https://repository.apache.org/content/repositories/orgapachehadoop-1365/
>
> You can find my public key at:
> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
>
> Change log
> https://dist.apache.org/repos/dist/dev/hadoop/hadoop-3.3.5-RC0/CHANGELOG.md
>
> Release notes
> https://dist.apache.org/repos/dist/dev/hadoop/hadoop-3.3.5-RC0/RELEASENOTES.md
>
> This is off branch-3.3 and is the first big release since 3.3.2.
>
> Key changes include
>
> * Big update of dependencies to try and keep those reports of
> transitive CVEs under control -both genuine and false positives.
> * HDFS RBF enhancements
> * Critical fix to ABFS input stream prefetching for correct reading.
> * Vectored IO API for all FSDataInputStream implementations, with
> high-performance versions for file:// and s3a:// filesystems.
>   file:// through java native io
>   s3a:// parallel GET requests.
> * This release includes Arm64 binaries. Please can anyone with
> compatible systems validate these.
>
> Please try the release and vote on it, even though I don't know what is a
> good timeline here... I'm actually going on holiday in early Jan. Mukund is
> around and so can drive the process while I'm offline.
>
> Assuming we do have another iteration, the RC1 will not be before mid Jan
> for that reason.
>
> Steve (and Mukund)
[jira] [Resolved] (YARN-11390) TestResourceTrackerService.testNodeRemovalNormally: Shutdown nodes should be 0 now expected: <1> but was: <0>
[ https://issues.apache.org/jira/browse/YARN-11390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Nauroth resolved YARN-11390.
----------------------------------
    Fix Version/s: 3.4.0
                   3.2.5
                   3.3.9
       Resolution: Fixed

[~bkosztolnik], thank you for the contribution. [~pszucs], thank you for reviewing. I have committed this to trunk, branch-3.3 and branch-3.2. For the cherry-picks to branch-3.3 and branch-3.2, I resolved some minor merge conflicts and confirmed a successful test run.

> TestResourceTrackerService.testNodeRemovalNormally: Shutdown nodes should be
> 0 now expected: <1> but was: <0>
> -----------------------------------------------------------------------------
>
>                 Key: YARN-11390
>                 URL: https://issues.apache.org/jira/browse/YARN-11390
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: yarn
>            Reporter: Bence Kosztolnik
>            Assignee: Bence Kosztolnik
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.4.0, 3.2.5, 3.3.9
>
>
> Sometimes the TestResourceTrackerService.{*}testNodeRemovalNormally{*} test
> fails with the following message:
> {noformat}
> java.lang.AssertionError: Shutdown nodes should be 0 now expected:<1> but
> was:<0>
>     at org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService.testNodeRemovalUtilDecomToUntracked(TestResourceTrackerService.java:1723)
>     at org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService.testNodeRemovalUtil(TestResourceTrackerService.java:1685)
>     at org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService.testNodeRemovalNormally(TestResourceTrackerService.java:1530){noformat}
> This can happen when the hardcoded 1 s sleep in the test is not enough for a
> proper shutdown. To fix this issue, we should poll the cluster status with a
> timeout and verify that the cluster can reach the expected state.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
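[Editor's note] The fix direction described above, polling for the expected state with a deadline instead of a fixed sleep, can be sketched as follows. Hadoop tests typically use GenericTestUtils.waitFor for this; the helper below is a simplified stand-in with an illustrative signature, not the actual Hadoop API:

```java
import java.util.concurrent.TimeoutException;
import java.util.function.BooleanSupplier;

public class PollForState {
    // Poll a condition until it holds or a deadline passes. Unlike a fixed
    // Thread.sleep(1000), this neither fails on a slow machine (it keeps
    // checking up to the timeout) nor wastes time on a fast one (it returns
    // as soon as the condition is true).
    static void waitFor(BooleanSupplier check, long intervalMs, long timeoutMs)
            throws InterruptedException, TimeoutException {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (!check.getAsBoolean()) {
            if (System.nanoTime() > deadline) {
                throw new TimeoutException(
                        "condition not met within " + timeoutMs + " ms");
            }
            Thread.sleep(intervalMs);
        }
    }

    public static void main(String[] args) throws Exception {
        long start = System.currentTimeMillis();
        // Stand-in for "shutdown node count reaches the expected value":
        // becomes true after ~200 ms, so the poll returns promptly instead
        // of always paying a fixed 1 s sleep.
        waitFor(() -> System.currentTimeMillis() - start >= 200, 25, 5_000);
        System.out.println("ok");
    }
}
```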
[jira] [Created] (YARN-11388) Prevent resource leaks in TestClientRMService.
Chris Nauroth created YARN-11388: Summary: Prevent resource leaks in TestClientRMService. Key: YARN-11388 URL: https://issues.apache.org/jira/browse/YARN-11388 Project: Hadoop YARN Issue Type: Test Components: test Reporter: Chris Nauroth Assignee: Chris Nauroth While working on YARN-11360, I noticed a few problems in {{TestClientRMService}} that made it difficult to work with. Tests do not guarantee that servers they start up get shut down. If an individual test fails, then it can leave TCP sockets bound, causing subsequent tests in the suite to fail on their socket bind attempts for the same port. There is also a file generated by a test that is leaking outside of the build directory into the source tree.
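The common remedy for the leak described above is unconditional teardown, so a failed test still releases its bound port. A minimal sketch with a hypothetical server stand-in (this is not the actual TestClientRMService code; the names are invented for illustration):

```java
public class ServerCleanupSketch {

    // Hypothetical stand-in for a test server that binds a TCP port on start.
    static class FakeServer {
        boolean running;
        void start() { running = true; }
        void stop() { running = false; }
    }

    static FakeServer server;

    // In JUnit this would be an @After/@AfterEach method, so it runs even
    // when the test body throws and the port is always released.
    static void tearDown() {
        if (server != null) {
            server.stop();
            server = null;
        }
    }

    public static void main(String[] args) {
        server = new FakeServer();
        server.start();
        try {
            // ... test body: if an assertion failed here, tearDown would
            // still run because of the finally block ...
        } finally {
            tearDown();
        }
        System.out.println(server == null ? "released" : "leaked");
    }
}
```

Without the finally/teardown guarantee, one failing test can cascade into bind failures for every later test in the suite that reuses the port.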
[jira] [Resolved] (YARN-11363) Remove unused TimelineVersionWatcher and TimelineVersion from hadoop-yarn-server-tests
[ https://issues.apache.org/jira/browse/YARN-11363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth resolved YARN-11363. -- Fix Version/s: 3.3.5 3.4.0 Resolution: Fixed > Remove unused TimelineVersionWatcher and TimelineVersion from > hadoop-yarn-server-tests > --- > > Key: YARN-11363 > URL: https://issues.apache.org/jira/browse/YARN-11363 > Project: Hadoop YARN > Issue Type: Improvement > Components: test, yarn >Affects Versions: 3.3.3, 3.3.4 >Reporter: Ashutosh Gupta >Assignee: Ashutosh Gupta >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0, 3.3.5 > > > Verify and remove unused TimelineVersionWatcher and TimelineVersion from > hadoop-yarn-server-tests
[jira] [Resolved] (YARN-11360) Add number of decommissioning/shutdown nodes to YARN cluster metrics.
[ https://issues.apache.org/jira/browse/YARN-11360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth resolved YARN-11360. -- Fix Version/s: 3.4.0 3.2.5 3.3.9 Hadoop Flags: Reviewed Resolution: Fixed I have committed this to trunk, branch-3.3 and branch-3.2 (after resolving a minor merge conflict). [~mkonst], [~groot] and [~abmodi], thank you for the code reviews. > Add number of decommissioning/shutdown nodes to YARN cluster metrics. > - > > Key: YARN-11360 > URL: https://issues.apache.org/jira/browse/YARN-11360 > Project: Hadoop YARN > Issue Type: Improvement > Components: client, resourcemanager > Reporter: Chris Nauroth > Assignee: Chris Nauroth >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0, 3.2.5, 3.3.9 > > > YARN cluster metrics expose counts of NodeManagers in various states > including active and decommissioned. However, these metrics don't expose > NodeManagers that are currently in the process of decommissioning. This can > look a little spooky to a consumer of these metrics. First, the node drops > out of the active count, so it seems like a node just vanished. Then, later > (possibly hours later with consideration of graceful decommission), it comes > back into existence in the decommissioned count. > This issue tracks adding the decommissioning count to the metrics > ResourceManager RPC. This also enables exposing it in the {{yarn top}} > output. This metric is already visible through the REST API, so there isn't > any change required there.
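The "vanishing node" problem described in the issue above comes from tracking only the endpoint states of a transition. A minimal sketch of the idea — exposing the transitional "decommissioning" count so the node stays visible between leaving "active" and entering "decommissioned". The counter names here are illustrative, not the actual YARN metric names:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of node-state counters: exposing the transitional "decommissioning"
// state keeps a node visible while graceful decommission is in progress.
public class ClusterNodeMetricsSketch {
    final AtomicInteger active = new AtomicInteger();
    final AtomicInteger decommissioning = new AtomicInteger();
    final AtomicInteger decommissioned = new AtomicInteger();

    void beginGracefulDecommission() {
        active.decrementAndGet();
        decommissioning.incrementAndGet();  // node no longer just "vanishes"
    }

    void finishDecommission() {
        decommissioning.decrementAndGet();
        decommissioned.incrementAndGet();
    }

    public static void main(String[] args) {
        ClusterNodeMetricsSketch m = new ClusterNodeMetricsSketch();
        m.active.set(3);
        m.beginGracefulDecommission();
        // active / decommissioning / decommissioned
        System.out.println(m.active + " " + m.decommissioning + " " + m.decommissioned);
        m.finishDecommission();
        System.out.println(m.active + " " + m.decommissioning + " " + m.decommissioned);
    }
}
```

With only the active and decommissioned counters, the first line would read "2 0 0", which is exactly the "node just vanished" symptom the issue describes.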
YARN-11360: Proposed "yarn top" output change
I'd like to raise awareness that in YARN-11360, I've shared a patch that includes proposed changes to the header output of the "yarn top" CLI. Technically, this doesn't follow the letter of the law on CLI output compatibility [1]. However, I'd like to proceed on the basis that "yarn top" is intended as an interactive tool for users actively watching, and it seems unlikely that anyone could usefully script around its output anyway. The patch already has a +1. I'll hold off committing for a few days. Please reply if you have an opinion. If there isn't consensus, I can revert the "yarn top" portion of the patch, leaving just the protocol changes. Then, the "yarn top" changes would go in a separate patch targeting a major release upgrade that allows backward-incompatible changes. (I don't believe there is a branch to receive 4.x changes though, so this would effectively stall that part of the patch indefinitely.) [1] https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Compatibility.html#Command_Line_Interface_.28CLI.29 Chris Nauroth
[jira] [Created] (YARN-11360) Add number of decommissioning nodes to YARN cluster metrics.
Chris Nauroth created YARN-11360: Summary: Add number of decommissioning nodes to YARN cluster metrics. Key: YARN-11360 URL: https://issues.apache.org/jira/browse/YARN-11360 Project: Hadoop YARN Issue Type: Improvement Components: client, resourcemanager Environment: YARN cluster metrics expose counts of NodeManagers in various states including active and decommissioned. However, these metrics don't expose NodeManagers that are currently in the process of decommissioning. This can look a little spooky to a consumer of these metrics. First, the node drops out of the active count, so it seems like a node just vanished. Then, later (possibly hours later with consideration of graceful decommission), it comes back into existence in the decommissioned count. This issue tracks adding the decommissioning count to the metrics ResourceManager RPC. This also enables exposing it in the {{yarn top}} output. This metric is already visible through the REST API, so there isn't any change required there. Reporter: Chris Nauroth Assignee: Chris Nauroth
Re: [VOTE] Release Apache Hadoop 3.3.4
+1 (binding) * Verified all checksums. * Verified all signatures. * Built from source, including native code on Linux. * mvn clean package -Pnative -Psrc -Drequire.openssl -Drequire.snappy -Drequire.zstd -DskipTests * Tests passed. * mvn --fail-never clean test -Pnative -Dparallel-tests -Drequire.snappy -Drequire.zstd -Drequire.openssl -Dsurefire.rerunFailingTestsCount=3 -DtestsThreadCount=8 * Checked dependency tree to make sure we have all of the expected library updates that are mentioned in the release notes. * mvn -o dependency:tree I saw a LibHDFS test failure, but I know it's something flaky that's already tracked in a JIRA issue. The release looks good. Steve, thank you for driving this. Chris Nauroth On Wed, Aug 3, 2022 at 11:27 AM Steve Loughran wrote: > my vote for this is +1, binding. > > obviously I'm biased, but I do not want to have to issue any more interim > releases before the feature release off branch-3.3, so I am trying to be > ruthless. > > my client validator ant project has more targets to help with releasing, > and now builds a lot more of my local projects > https://github.com/steveloughran/validate-hadoop-client-artifacts > all good as far as my test coverage goes, with these projects validating > the staged dependencies. 
> > now, who else can review > > On Fri, 29 Jul 2022 at 19:47, Steve Loughran wrote: > > > > > > > I have put together a release candidate (RC1) for Hadoop 3.3.4 > > > > The RC is available at: > > https://dist.apache.org/repos/dist/dev/hadoop/hadoop-3.3.4-RC1/ > > > > The git tag is release-3.3.4-RC1, commit a585a73c3e0 > > > > The maven artifacts are staged at > > https://repository.apache.org/content/repositories/orgapachehadoop-1358/ > > > > You can find my public key at: > > https://dist.apache.org/repos/dist/release/hadoop/common/KEYS > > > > Change log > > > https://dist.apache.org/repos/dist/dev/hadoop/hadoop-3.3.4-RC1/CHANGELOG.md > > > > Release notes > > > > > https://dist.apache.org/repos/dist/dev/hadoop/hadoop-3.3.4-RC1/RELEASENOTES.md > > > > There's a very small number of changes, primarily critical code/packaging > > issues and security fixes. > > > > See the release notes for details. > > > > Please try the release and vote. The vote will run for 5 days. > > > > steve > > >
Re: [VOTE] Release Apache Hadoop 3.2.4 - RC0
I'm changing my vote to +1 (binding). Masatake and Ashutosh, thank you for investigating. I reran tests without the parallel options, and that mostly addressed the failures. Maybe the tests in question are just not sufficiently isolated to support parallel execution. That looks to be the case for TestFsck, where the failure was caused by missing audit log entries. This test works by toggling global logging state, so I can see why multi-threaded execution might confuse the test. Chris Nauroth On Thu, Jul 21, 2022 at 12:01 AM Ashutosh Gupta wrote: > +1(non-binding) > > * Builds from source look good. > * Checksums and signatures are correct. > > * Running basic HDFS and MapReduce commands looks good. > > > * TestAMRMProxy - Not able to reproduce locally > > * TestFsck - The only failure I can see is > TestFsck.testFsckListCorruptSnapshotFiles, which passed after applying > HDFS-15038 > > * TestSLSStreamAMSynth - Not able to reproduce locally > > * TestServiceAM - Not able to reproduce locally > > Thanks Masatake for driving this release. > > On Thu, Jul 21, 2022 at 5:51 AM Masatake Iwasaki < > iwasak...@oss.nttdata.com> > wrote: > > > Hi developers, > > > > I'm still waiting for your vote. > > I consider the intermittent test failures mentioned by Chris not to be > > blockers. > > Please file a JIRA and let me know if you find a blocker issue. > > > > I would appreciate your help with the release process. > > > > Regards, > > Masatake Iwasaki > > > > On 2022/07/20 14:50, Masatake Iwasaki wrote: > > >> TestServiceAM > > > > > > I can see the reported failure of TestServiceAM in some "Apache Hadoop > > qbt Report: branch-3.2+JDK8 on Linux/x86_64". > > > 3.3.0 and above might be fixed by YARN-8867, which added a guard using > > GenericTestUtils#waitFor to stabilize > > testContainersReleasedWhenPreLaunchFails. > > > YARN-8867 did not modify other code under hadoop-yarn-services. > > > If that is the case, TestServiceAM can be tagged as flaky in branch-3.2. 
> > > > > > > > > On 2022/07/20 14:21, Masatake Iwasaki wrote: > > >> Thanks for testing the RC0, Chris. > > >> > > >>> The following are new test failures for me on 3.2.4: > > >>> * TestAMRMProxy > > >>> * TestFsck > > >>> * TestSLSStreamAMSynth > > >>> * TestServiceAM > > >> > > >> I could not reproduce the test failures on my local. > > >> > > >> For TestFsck, if the failed test case is > > testFsckListCorruptSnapshotFiles, > > >> cherry-picking HDFS-15038 (fixing only test code) could be the fix. > > >> > > >> The failure of TestSLSStreamAMSynth looks frequently reported by > > >> "Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86_64". > > >> It could be tagged as known flaky test. > > >> > > >> On 2022/07/20 9:15, Chris Nauroth wrote: > > >>> -0 (binding) > > >>> > > >>> * Verified all checksums. > > >>> * Verified all signatures. > > >>> * Built from source, including native code on Linux. > > >>> * mvn clean package -Pnative -Psrc -Drequire.openssl > > -Drequire.snappy > > >>> -Drequire.zstd -DskipTests > > >>> * Tests mostly passed, but see below. > > >>> * mvn --fail-never clean test -Pnative -Dparallel-tests > > >>> -Drequire.snappy -Drequire.zstd -Drequire.openssl > > >>> -Dsurefire.rerunFailingTestsCount=3 -DtestsThreadCount=8 > > >>> > > >>> The following are new test failures for me on 3.2.4: > > >>> * TestAMRMProxy > > >>> * TestFsck > > >>> * TestSLSStreamAMSynth > > >>> * TestServiceAM > > >>> > > >>> The following tests also failed, but they also fail for me on 3.2.3, > so > > >>> they aren't likely to be related to this release candidate: > > >>> * TestCapacitySchedulerNodeLabelUpdate > > >>> * TestFrameworkUploader > > >>> * TestSLSGenericSynth > > >>> * TestSLSRunner > > >>> * test_libhdfs_threaded_hdfspp_test_shim_static > > >>> > > >>> I'm not voting a full -1, because I haven't done any root cause > > analysis on > > >>> these new test failures. 
I don't know if it's a quirk to my > > environment, > > >>> though I'm using the start-build-env.sh Docker container
Re: [VOTE] Release Apache Hadoop 3.2.4 - RC0
-0 (binding) * Verified all checksums. * Verified all signatures. * Built from source, including native code on Linux. * mvn clean package -Pnative -Psrc -Drequire.openssl -Drequire.snappy -Drequire.zstd -DskipTests * Tests mostly passed, but see below. * mvn --fail-never clean test -Pnative -Dparallel-tests -Drequire.snappy -Drequire.zstd -Drequire.openssl -Dsurefire.rerunFailingTestsCount=3 -DtestsThreadCount=8 The following are new test failures for me on 3.2.4: * TestAMRMProxy * TestFsck * TestSLSStreamAMSynth * TestServiceAM The following tests also failed, but they also fail for me on 3.2.3, so they aren't likely to be related to this release candidate: * TestCapacitySchedulerNodeLabelUpdate * TestFrameworkUploader * TestSLSGenericSynth * TestSLSRunner * test_libhdfs_threaded_hdfspp_test_shim_static I'm not voting a full -1, because I haven't done any root cause analysis on these new test failures. I don't know if it's a quirk to my environment, though I'm using the start-build-env.sh Docker container, so any build dependencies should be consistent. I'd be comfortable moving ahead if others are seeing these tests pass. Chris Nauroth On Thu, Jul 14, 2022 at 7:57 AM Masatake Iwasaki wrote: > +1 from myself. > > * skimmed the contents of site documentation. > > * built the source tarball on Rocky Linux 8 (x86_64) by OpenJDK 8 with > `-Pnative`. > > * launched pseudo distributed cluster including kms and httpfs with > Kerberos and SSL enabled. > >* created encryption zone, put and read files via httpfs. >* ran example MR wordcount over encryption zone. > > * launched 3-node docker cluster with NN-HA and RM-HA enabled and ran some > example MR jobs. > > * built HBase 2.4.11, Hive 3.1.2 and Spark 3.1.2 against Hadoop 3.2.4 RC0 >on CentOS 7 (x86_64) by using Bigtop branch-3.1 and ran smoke-tests. >https://github.com/apache/bigtop/pull/942 > >* Hive needs updating exclusion rule to address HADOOP-18088 (migration > to reload4j). 
> > * built Spark 3.3.0 against Hadoop 3.2.4 RC0 using the staging repository:: > > > staged > staged-releases > > https://repository.apache.org/content/repositories/orgapachehadoop-1354 > > > true > > > true > > > > Thanks, > Masatake Iwasaki > > On 2022/07/13 1:14, Masatake Iwasaki wrote: > > Hi all, > > > > Here's Hadoop 3.2.4 release candidate #0: > > > > The RC is available at: > >https://home.apache.org/~iwasakims/hadoop-3.2.4-RC0/ > > > > The RC tag is at: > >https://github.com/apache/hadoop/releases/tag/release-3.2.4-RC0 > > > > The Maven artifacts are staged at: > > > https://repository.apache.org/content/repositories/orgapachehadoop-1354 > > > > You can find my public key at: > >https://downloads.apache.org/hadoop/common/KEYS > > > > Please evaluate the RC and vote. > > The vote will be open for (at least) 5 days. > > > > Thanks, > > Masatake Iwasaki > > > > - > > To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org > > For additional commands, e-mail: common-dev-h...@hadoop.apache.org > > > > - > To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org > For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org > >
Re: [VOTE] Release Apache Hadoop 2.10.2 - RC0
+1 (binding) * Verified all checksums. * Verified all signatures. * Built from source, including native code on Linux. * mvn clean package -Pnative -Psrc -Drequire.openssl -Drequire.snappy -Drequire.zstd -DskipTests * Almost all unit tests passed. * mvn clean test -Pnative -Dparallel-tests -Drequire.snappy -Drequire.zstd -Drequire.openssl -Dsurefire.rerunFailingTestsCount=3 -DtestsThreadCount=8 * TestBookKeeperHACheckpoints consistently has a few failures. * TestCapacitySchedulerNodeLabelUpdate is flaky, intermittently timing out. These test failures don't look significant enough to hold up a release, so I'm still voting +1. Chris Nauroth On Sun, May 29, 2022 at 2:35 AM Masatake Iwasaki < iwasak...@oss.nttdata.co.jp> wrote: > Thanks for the help, Ayush. > > I committed HADOOP-16663/HADOOP-16664 and cherry-picked HADOOP-16985 to > branch-2.10 (and branch-3.2). > If I need to cut RC1, I will try cherry-picking them to branch-2.10.2 > > Masatake Iwasaki > > > On 2022/05/28 5:23, Ayush Saxena wrote: > > The checksum stuff was addressed in HADOOP-16985, so that filename stuff > is > > sorted only post 3.3.x > > BTW it is a known issue: > > > https://issues.apache.org/jira/browse/HADOOP-16494?focusedCommentId=16927236=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16927236 > > > > Must not be a blocker for us > > > > The RAT check failing with dependency issue. That also should work post > > 3.3.x because there is no Hadoop-maven-plugin dependency in > Hadoop-yarn-api > > module post 3.3.x, HADOOP-16560 removed it. > > Ref: > > > https://github.com/apache/hadoop/pull/1496/files#diff-f5d219eaf211871f9527ae48da59586e7e9958ea7649de74a1393e599caa6dd6L121-R122 > > > > So, that is why the RAT check passes for 3.3.x+ without the need of this > > module. Committing HADOOP-16663, should solve this though.(I haven't > tried > > though, just by looking at the problem) > > > > Good to have patches, but doesn't look like blockers to me. 
kind of build > > related stuffs only, nothing bad with our core Hadoop code. > > > > -Ayush > > > > On Sat, 28 May 2022 at 01:04, Viraj Jasani wrote: > > > >> +0 (non-binding), > >> > >> * Signature/Checksum looks good, though I am not sure where > >> "target/artifacts" is coming from for the tars, here is the diff (this > was > >> the case for 2.10.1 as well but checksum was correct): > >> > >> 1c1 > >> < SHA512 (hadoop-2.10.2-site.tar.gz) = > >> > >> > 3055a830003f5012660d92da68a317e15da5b73301c2c73cf618e724c67b7d830551b16928e0c28c10b66f04567e4b6f0b564647015bacc4677e232c0011537f > >> --- > >>> SHA512 (target/artifacts/hadoop-2.10.2-site.tar.gz) = > >> > >> > 3055a830003f5012660d92da68a317e15da5b73301c2c73cf618e724c67b7d830551b16928e0c28c10b66f04567e4b6f0b564647015bacc4677e232c0011537f > >> 1c1 > >> < SHA512 (hadoop-2.10.2-src.tar.gz) = > >> > >> > 483b6a4efd44234153e21ffb63a9f551530a1627f983a8837c655ce1b8ef13486d7178a7917ed3f35525c338e7df9b23404f4a1b0db186c49880448988b88600 > >> --- > >>> SHA512 (target/artifacts/hadoop-2.10.2-src.tar.gz) = > >> > >> > 483b6a4efd44234153e21ffb63a9f551530a1627f983a8837c655ce1b8ef13486d7178a7917ed3f35525c338e7df9b23404f4a1b0db186c49880448988b88600 > >> 1c1 > >> < SHA512 (hadoop-2.10.2.tar.gz) = > >> > >> > 13e95907073d815e3f86cdcc24193bb5eec0374239c79151923561e863326988c7f32a05fb7a1e5bc962728deb417f546364c2149541d6234221b00459154576 > >> --- > >>> SHA512 (target/artifacts/hadoop-2.10.2.tar.gz) = > >> > >> > 13e95907073d815e3f86cdcc24193bb5eec0374239c79151923561e863326988c7f32a05fb7a1e5bc962728deb417f546364c2149541d6234221b00459154576 > >> > >> However, checksums are correct. 
> >> > >> * Builds from source look good > >> - mvn clean install -DskipTests > >> - mvn clean package -Pdist -DskipTests -Dtar > -Dmaven.javadoc.skip=true > >> > >> * Rat check, if run before building from source locally, fails with > error: > >> > >> [ERROR] Plugin org.apache.hadoop:hadoop-maven-plugins:2.10.2 or one of > its > >> dependencies could not be resolved: Could not find artifact > >> org.apache.hadoop:hadoop-maven-plugins:jar:2.10.2 in central ( > >> https://repo.maven.apache.org/maven2) -> [Help 1] > >> [ERROR] > >> > >> However, once we build locally, rat check passes (because > >> hadoop-maven-plugins 2.10.2 would be present in
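Several votes in these threads begin by verifying the SHA-512 checksums of the release tarballs. As a minimal sketch of what that check computes, the following produces a SHA-512 hex digest in Java; a release `.sha512` file lists the same hex string for the corresponding tarball's bytes. The sample input here is illustrative, not a release artifact:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public class Sha512Hex {

    // Compute a SHA-512 digest as lowercase hex, the format used in the
    // checksum files discussed above.
    static String sha512Hex(byte[] data) throws Exception {
        byte[] digest = MessageDigest.getInstance("SHA-512").digest(data);
        StringBuilder sb = new StringBuilder(digest.length * 2);
        for (byte b : digest) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        String hex = sha512Hex("hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(hex.length());        // 128 hex chars = 512 bits
        System.out.println(hex.substring(0, 16));
    }
}
```

Note that, as the checksum-file diff in this thread shows, only the digest value matters for verification; the filename prefix in the checksum file (with or without "target/artifacts/") does not affect the hash itself.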
Re: [VOTE] Release Apache Hadoop 3.3.3 (RC1)
+1 (binding) - Verified all checksums. - Verified all signatures. - Built from source, including native code on Linux. - Ran several examples successfully. Chris Nauroth On Mon, May 16, 2022 at 10:06 AM Chao Sun wrote: > +1 > > - Compiled from source > - Verified checksums & signatures > - Launched a pseudo HDFS cluster and ran some simple commands > - Ran full Spark tests with the RC > > Thanks Steve! > > Chao > > On Mon, May 16, 2022 at 2:19 AM Ayush Saxena wrote: > > > > +1, > > * Built from source. > > * Successful native build on Ubuntu 18.04 > > * Verified Checksums. > > > (CHANGELOG.md,RELEASENOTES.md,hadoop-3.3.3-rat.txt,hadoop-3.3.3-site.tar.gz,hadoop-3.3.3-src.tar.gz,hadoop-3.3.3.tar.gz) > > * Verified Signature. > > * Successful RAT check > > * Ran basic HDFS shell commands. > > * Ran basic YARN shell commands. > > * Verified version in hadoop version command and UI > > * Ran some MR example Jobs. > > * Browsed UI(Namenode/Datanode/ResourceManager/NodeManager/HistoryServer) > > * Browsed the contents of Maven Artifacts. > > * Browsed the contents of the website. > > > > Thanx Steve for driving the release, Good Luck!!! > > > > -Ayush > > > > On Mon, 16 May 2022 at 08:20, Xiaoqiao He wrote: > > > > > +1(binding) > > > > > > * Verified signature and checksum of the source tarball. > > > * Built the source code on Ubuntu and OpenJDK 11 by `mvn clean package > > > -DskipTests -Pnative -Pdist -Dtar`. > > > * Setup pseudo cluster with HDFS and YARN. > > > * Run simple FsShell - mkdir/put/get/mv/rm and check the result. > > > * Run example mr applications and check the result - Pi & wordcount. > > > * Check the Web UI of NameNode/DataNode/Resourcemanager/NodeManager > etc. > > > > > > Thanks Steve for your work. 
> > > > > > - He Xiaoqiao > > > > > > On Mon, May 16, 2022 at 4:25 AM Viraj Jasani > wrote: > > > > > > > > +1 (non-binding) > > > > > > > > * Signature: ok > > > > * Checksum : ok > > > > * Rat check (1.8.0_301): ok > > > > - mvn clean apache-rat:check > > > > * Built from source (1.8.0_301): ok > > > > - mvn clean install -DskipTests > > > > * Built tar from source (1.8.0_301): ok > > > > - mvn clean package -Pdist -DskipTests -Dtar > -Dmaven.javadoc.skip=true > > > > > > > > HDFS, MapReduce and HBase (2.5) CRUD functional testing on > > > > pseudo-distributed mode looks good. > > > > > > > > > > > > On Wed, May 11, 2022 at 10:26 AM Steve Loughran > > > > > > > wrote: > > > > > > > > > I have put together a release candidate (RC1) for Hadoop 3.3.3 > > > > > > > > > > The RC is available at: > > > > > https://dist.apache.org/repos/dist/dev/hadoop/3.3.3-RC1/ > > > > > > > > > > The git tag is release-3.3.3-RC1, commit d37586cbda3 > > > > > > > > > > The maven artifacts are staged at > > > > > > > > > https://repository.apache.org/content/repositories/orgapachehadoop-1349/ > > > > > > > > > > You can find my public key at: > > > > > https://dist.apache.org/repos/dist/release/hadoop/common/KEYS > > > > > > > > > > Change log > > > > > > https://dist.apache.org/repos/dist/dev/hadoop/3.3.3-RC1/CHANGELOG.md > > > > > > > > > > Release notes > > > > > > > > > https://dist.apache.org/repos/dist/dev/hadoop/3.3.3-RC1/RELEASENOTES.md > > > > > > > > > > There's a very small number of changes, primarily critical > > > code/packaging > > > > > issues and security fixes. > > > > > > > > > > * The critical fixes which shipped in the 3.2.3 release. > > > > > * CVEs in our code and dependencies > > > > > * Shaded client packaging issues. > > > > > * A switch from log4j to reload4j > > > > > > > > > > reload4j is an active fork of the log4j 1.17 library with the > classes > > > > > which contain CVEs removed. 
Even though hadoop never used those > > > classes, > > > > they regularly raised alerts on security scans and concern from > users. > > > > Switching to the forked project allows us to ship a secure logging > > > > framework. It will complicate the builds of downstream > > > > maven/ivy/gradle projects which exclude our log4j artifacts, as > they > > > > need to cut the new dependency instead/as well. > > > > > > > > See the release notes for details. > > > > > > > > This is the second release attempt. It is the same git commit as > > > before, > > > > but > > > > fully recompiled with another republish to maven staging, which > has been > > > > verified by building spark, as well as a minimal test project. > > > > > > > > Please try the release and vote. The vote will run for 5 days. > > > > > > > > -Steve > > > > > > > > > - > > > To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org > > > For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org > > > > > > > > - > To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org > For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org > >
Re: [DISCUSS] Hadoop 3.3.2 release?
+1 Chao, thank you very much for volunteering on the release. Chris Nauroth On Tue, Sep 7, 2021 at 10:00 PM Igor Dvorzhak wrote: > +1 > > On Tue, Sep 7, 2021 at 10:06 AM Chao Sun wrote: > >> Hi all, >> >> It has been almost 3 months since the 3.3.1 release and branch-3.3 has >> accumulated quite a few commits (118 atm). In particular, Spark community >> recently found an issue which prevents one from using the shaded Hadoop >> client together with certain compression codecs such as lz4 and snappy >> codec. The details are recorded in HADOOP-17891 and SPARK-36669. >> >> Therefore, I'm wondering if anyone is also interested in a 3.3.2 release. >> If there is no objection, I'd like to volunteer myself for the work as >> well. >> >> Best Regards, >> Chao >> >
Re: [VOTE] Release Apache Hadoop 2.7.3 RC1
Andrew, thanks for adding your perspective on this. What is a realistic strategy for us to evolve the HDFS audit log in a backward-compatible way? If the API is essentially any form of ad-hoc scripting, then for any proposed audit log format change, I can find a reason to veto it on grounds of backward incompatibility. - I can’t add a new field on the end, because that would break an awk script that uses $NF expecting to find a specific field. - I can’t prepend a new field, because that would break a "cut -f1" expecting to find the timestamp. - HDFS can’t add any new features, because someone might have written a script that does "exit 1" if it finds an unexpected RPC in the "cmd=" field. - Hadoop is not allowed to add full IPv6 support, because someone might have written a script that looks at the "ip=" field and parses it by IPv4 syntax. On the CLI, a potential solution for evolving the output is to preserve the old format by default and only enable the new format if the user explicitly passes a new argument. What should we do for the audit log? Configuration flags in hdfs-site.xml? (That of course adds its own brand of complexity.) I’m particularly interested to hear potential solutions from people like Andrew and Allen who have been most vocal about the need for a stable format. Without a solution, this unfortunately devolves into the format being frozen within a major release line. We could benefit from getting a patch on the compatibility doc that addresses the HDFS audit log specifically. --Chris Nauroth On 8/18/16, 8:47 AM, "Andrew Purtell" <andrew.purt...@gmail.com> wrote: An incompatible APIs change is developer unfriendly. An incompatible behavioral change is operator unfriendly. Historically, one dimension of incompatibility has had a lot more mindshare than the other. It's great that this might be changing for the better. 
Where I work when we move from one Hadoop 2.x minor to another we always spend time updating our deployment plans, alerting, log scraping, and related things due to changes. Some are debatable as if qualifying for the 'incompatible' designation. I think the audit logging change that triggered this discussion is a good example of one that does. If you want to audit HDFS actions those log emissions are your API. (Inotify doesn't offer access control events.) One has to code regular expressions for parsing them and reverse engineer under what circumstances an audit line is emitted so you can make assumptions about what transpired. Change either and you might break someone's automation for meeting industry or legal compliance obligations. Not a trivial matter. If you don't operate Hadoop in production you might not realize the implications of such a change. Glad to see Hadoop has community diversity to recognize it in some cases. > On Aug 18, 2016, at 6:57 AM, Junping Du <j...@hortonworks.com> wrote: > > I think Allen's previous comments are very misleading. > In my understanding, only incompatible API (RPC, CLIs, WebService, etc.) shouldn't land on branch-2, but other incompatible behaviors (logs, audit-log, daemon's restart, etc.) should get flexible for landing. Otherwise, how could 52 issues ( https://s.apache.org/xJk5) marked with incompatible-changes could get landed on branch-2 after 2.2.0 release? Most of them are already released. > > Thanks, > > Junping > > From: Vinod Kumar Vavilapalli <vino...@apache.org> > Sent: Wednesday, August 17, 2016 9:29 PM > To: Allen Wittenauer > Cc: common-...@hadoop.apache.org; hdfs-...@hadoop.apache.org; yarn-dev@hadoop.apache.org; mapreduce-...@hadoop.apache.org > Subject: Re: [VOTE] Release Apache Hadoop 2.7.3 RC1 > > I always look at CHANGES.txt entries for incompatible-changes and this JIRA obviously wasn’t there. 
> > Anyways, this shouldn’t be in any of branch-2.* as committers there clearly mentioned that this is an incompatible change. > > I am reverting the patch from branch-2* . > > Thanks > +Vinod > >> On Aug 16, 2016, at 9:29 PM, Allen Wittenauer <a...@effectivemachines.com> wrote: >> >> >> >> -1 >> >> HDFS-9395 is an incompatible change: >> >> a) Why is not marked as such in the changes file? >> b) Why is an incompatible change in a micro release, much less a minor? >> c) Where is the release note for this change? >> >> >>> On Aug 12, 2016, at 9:45 AM, Vinod Kumar Vavilapalli <vino...@apache.org> wrote: >>> >>> Hi all, >>> >>> I've created a release candidate RC1
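The audit-log discussion above treats the log line format as a de facto API that scripts parse positionally (awk's $NF, "cut -f1"). One mitigation consumers can apply today is parsing fields by name rather than by position, which tolerates appended fields. A minimal sketch; the sample line mimics the key=value style mentioned in the thread (cmd=, ip=) and is illustrative only, not a guaranteed format:

```java
import java.util.HashMap;
import java.util.Map;

public class AuditLineParse {

    // Parse tab-separated key=value fields by name instead of by position,
    // so a field appended later does not shift positional consumers.
    static Map<String, String> parse(String line) {
        Map<String, String> fields = new HashMap<>();
        for (String token : line.split("\t")) {
            int eq = token.indexOf('=');
            if (eq > 0) {
                fields.put(token.substring(0, eq), token.substring(eq + 1));
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        // Illustrative sample only, mimicking the key=value style of the log.
        String line = "allowed=true\tugi=alice\tip=/10.0.0.1\tcmd=open\tsrc=/tmp/f";
        Map<String, String> f = parse(line);
        System.out.println(f.get("cmd") + " " + f.get("ip"));
    }
}
```

Name-based parsing addresses the "append a new field" cases raised in the thread, though it cannot help with changes to the fields themselves (such as a new IP syntax inside "ip=").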
Re: Apache MSDN Offer is Back
That definitely was possible under the old deal. You could go through the MSDN site and download an iso for various versions of Windows and run it under VirtualBox. The MSDN site also would furnish a license key that you could use to activate the machine. I haven't yet gone through this new process to see if anything has changed in the benefits. --Chris Nauroth From: Ravi Prakash <ravihad...@gmail.com> Date: Wednesday, July 20, 2016 at 12:04 PM To: Chris Nauroth <cnaur...@hortonworks.com> Cc: "common-...@hadoop.apache.org" <common-...@hadoop.apache.org>, "hdfs-...@hadoop.apache.org" <hdfs-...@hadoop.apache.org>, "yarn-dev@hadoop.apache.org" <yarn-dev@hadoop.apache.org>, "mapreduce-...@hadoop.apache.org" <mapreduce-...@hadoop.apache.org> Subject: Re: Apache MSDN Offer is Back Thanks Chris! I did avail of the offer a few months ago, and wasn't able to figure out if a windows license was also available. I want to run windows inside a virtual machine on my Linux laptop, for the rare cases that there are patches that may affect that. Any clue if that is possible? Thanks Ravi On Tue, Jul 19, 2016 at 4:09 PM, Chris Nauroth <cnaur...@hortonworks.com> wrote: A few months ago, we learned that the offer for ASF committers to get an MSDN license had gone away. I'm happy to report that as of a few weeks ago, that offer is back in place. For more details, committers can check out https://svn.apache.org/repos/private/committers and read donated-licenses/msdn.txt. --Chris Nauroth
Apache MSDN Offer is Back
A few months ago, we learned that the offer for ASF committers to get an MSDN license had gone away. I'm happy to report that as of a few weeks ago, that offer is back in place. For more details, committers can check out https://svn.apache.org/repos/private/committers and read donated-licenses/msdn.txt. --Chris Nauroth
Re: Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86
Interestingly, that FindBugs warning in hadoop-azure-datalake was not flagged during pre-commit before I committed HADOOP-12666. I'm going to propose that we address it in scope of HADOOP-12875. --Chris Nauroth On 6/10/16, 10:30 AM, "Apache Jenkins Server" <jenk...@builds.apache.org> wrote: >For more details, see >https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/58/ > >No changes > >-1 overall > >The following subsystems voted -1: >findbugs unit > >The following subsystems voted -1 but were configured to be filtered/ignored: >cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace > >The following subsystems are considered long running: >(runtime bigger than 1h 0m 0s) >unit > >Specific tests: > >FindBugs : > > module:hadoop-tools/hadoop-azure-datalake > int value cast to float and then passed to Math.round in >org.apache.hadoop.hdfs.web.PrivateAzureDataLakeFileSystem$BatchByteArrayInputStream.getSplitSize(int) At PrivateAzureDataLakeFileSystem.java: and then passed to Math.round in >org.apache.hadoop.hdfs.web.PrivateAzureDataLakeFileSystem$BatchByteArrayInputStream.getSplitSize(int) At PrivateAzureDataLakeFileSystem.java: [line 925] > >Failed junit tests : > > hadoop.hdfs.server.namenode.TestEditLog > hadoop.yarn.server.resourcemanager.TestClientRMTokens > hadoop.yarn.server.resourcemanager.TestAMAuthorization > hadoop.yarn.server.TestContainerManagerSecurity > hadoop.yarn.server.TestMiniYarnClusterNodeUtilization > hadoop.yarn.client.cli.TestLogsCLI > hadoop.yarn.client.api.impl.TestAMRMProxy > hadoop.yarn.client.api.impl.TestDistributedScheduling > hadoop.yarn.client.TestGetGroups > hadoop.mapreduce.tools.TestCLI > hadoop.mapred.TestMRCJCFileOutputCommitter > >Timed out junit tests : > > org.apache.hadoop.yarn.client.cli.TestYarnCLI > org.apache.hadoop.yarn.client.api.impl.TestAMRMClient > org.apache.hadoop.yarn.client.api.impl.TestYarnClient > org.apache.hadoop.yarn.client.api.impl.TestNMClient > > cc: >
>https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/58/artifact/out/diff-compile-cc-root.txt [4.0K] > > javac: > >https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/58/artifact/out/diff-compile-javac-root.txt [164K] > > checkstyle: > >https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/58/artifact/out/diff-checkstyle-root.txt [16M] > > pylint: > >https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/58/artifact/out/diff-patch-pylint.txt [16K] > > shellcheck: > >https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/58/artifact/out/diff-patch-shellcheck.txt [20K] > > shelldocs: > >https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/58/artifact/out/diff-patch-shelldocs.txt [16K] > > whitespace: > >https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/58/artifact/out/whitespace-eol.txt [12M] >https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/58/artifact/out/whitespace-tabs.txt [1.3M] > > findbugs: > >https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/58/artifact/out/branch-findbugs-hadoop-tools_hadoop-azure-datalake-warnings.html [8.0K] > > javadoc: > >https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/58/artifact/out/diff-javadoc-javadoc-root.txt [2.3M] > > unit: > >https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/58/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt [144K] >https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/58/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt [60K] >https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/58/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-tests.txt [268K] >https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/58/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt
>[908K] >https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/58/artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core.txt [56K] >https://builds.a
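The FindBugs warning above flags a real precision hazard: an `int` implicitly widened to `float` before `Math.round(float)`, which is a lossy no-op because `float` has only a 24-bit mantissa. The sketch below is illustrative only (class and method names are hypothetical, not the actual `PrivateAzureDataLakeFileSystem` code):

```java
public class IntCastToFloatDemo {

    // Buggy pattern: the int is widened to float before Math.round,
    // losing precision for values above 2^24.
    public static int roundViaFloat(int value) {
        return Math.round((float) value);
    }

    // Safer: double has a 53-bit mantissa, so every int is exact;
    // better yet, avoid rounding an already-integral value at all.
    public static long roundViaDouble(int value) {
        return Math.round((double) value);
    }

    public static void main(String[] args) {
        int big = 16_777_217;  // 2^24 + 1, not representable as a float
        System.out.println(roundViaFloat(big));   // off by one: 16777216
        System.out.println(roundViaDouble(big));  // exact: 16777217
    }
}
```

SpotBugs (FindBugs' successor) reports this pattern as an ICAST-category warning for the same reason.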
Re: [DISCUSS] what exactly are the stability guarantees of the YARN APIs
I recommend that we update the compatibility guide with some text that explicitly addresses subclassing/interface inheritance stability for classes/interfaces annotated Stable. This is for our own benefit too. (I often refer back to that doc when I'm coding a patch that might have a chance of being backwards-incompatible.) --Chris Nauroth On 5/31/16, 9:46 AM, "Karthik Kambatla" <ka...@cloudera.com> wrote: >Argh! Totally my bad on YARN-2882. Kept missing the changes to ContainerStatus even after you pointed out. > >Filed YARN-5184 to fix this before we release it. Thanks for pointing it out, Steve. > >On Tue, May 31, 2016 at 6:00 AM, Steve Loughran <ste...@hortonworks.com> wrote: >> >> On 31 May 2016, at 05:44, Karthik Kambatla <ka...@cloudera.com> wrote: >> >> Inline. >> >> On Sat, May 28, 2016 at 11:34 AM, Sangjin Lee <sjl...@gmail.com> wrote: >> I think there is more to it. The InterfaceStability javadoc states: >> Incompatible changes must not be made to classes marked as stable. >> >> And in practice, I don't think the class annotation can be considered a simple sum of method annotations. There is a notion of class compatibility distinct from method stability. One key example is interfaces and abstract classes as in this case. The moment a new abstract method is added, the class becomes incompatible as it would break all downstream subclasses or implementing classes. That's the case even if *all methods are declared stable*. Thus, adding any abstract method (no matter what their scope/stability is) should be considered in violation of the stable contract of the class. >> >> Fair point. I was referring to them in the context of adding @Evolving methods to @Stable classes. Our policy states that "Classes not annotated are implicitly 'Private'. Class members not annotated inherit the annotations of the enclosing class."
So, the annotation on a method >> overrides that of the enclosing class. This seems pretty reasonable to me. >> >> My code wouldn't even compile because new abstract methods were added to a class tagged as stable. >> >> As far as I'm concerned, it doesn't meet the strict semantics of "stable", unless there is some nuance I'm missing. >> >> Therefore, I'm with Sangjin: adding new abstract methods to an existing @Stable class breaks compatibility. Adding new non-abstract methods -- fine. It would have been straightforward to add some new methods to, say ContainerReport, which were no-ops/exception raising, but which at least didn't break compilation. (though they may have broken codepaths which required the methods to act as getters/setters) >> >> Do you think there is reason to revisit this? If yes, we should update this for Hadoop 3. >> >> I'm not sure about revisiting. I'm raising the fact that changes to classes marked as stable have broken code, and querying the validity of such an operation within the constraints of the 2.x codebase. >> >> And I'm raising it on yarn-dev, as that's where things broke. If we do want to revisit things, that'll mean a move to common-dev. >> >> >> >> Regarding interfaces and abstract classes, one future enhancement to the InterfaceStability annotation we could consider is formally separating the contract for users of the API and the implementers of the API. They follow different rules. It could be feasible to have an interface as Public/Stable for users (anyone can use the API in a stable manner) but Private for implementers. The idea is that it is still a public interface but no third-party code should subclass or implement it. I suspect a fair amount of hadoop's public interface might fall into that category. That itself is probably an incompatible change, so we might have to wait until after 3.0, however. >> >> Interesting thought.
Agree that we do not anticipate users sub-classing >> most of our Public-Stable classes. >> >> There are also classes which we do not anticipate end-users to directly >> use, but devs might want to sub-class. This applies to pluggable >>entities; >> e.g. SchedulingPolicy in fairscheduler. We are currently using >> Public-Evolving to capture this intent. >> >> Should we add a third annotation in addition to Audience and Stability >>to >> ca
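The point about abstract methods in the thread above can be shown with a minimal sketch (class names here are hypothetical, not actual Hadoop APIs): adding an abstract method to a published abstract class breaks every existing subclass, while adding a concrete method with a default body does not.

```java
// Hypothetical names for illustration; not actual Hadoop classes.

// Version 1 of a Public/Stable abstract class that third parties extend.
abstract class StableReportV1 {
    abstract String name();
}

// A downstream subclass written and compiled against version 1.
class UserReport extends StableReportV1 {
    @Override
    String name() { return "user"; }
}

// If version 2 declared `abstract String diagnostics();`, UserReport would
// no longer compile, and a v1-era binary would throw AbstractMethodError at
// runtime. The compatible evolution adds the method with a concrete body:
abstract class StableReportV2 {
    abstract String name();

    // New in v2; non-abstract, so v1-era subclasses keep working unchanged.
    String diagnostics() { return ""; }
}

class SiteReport extends StableReportV2 {
    @Override
    String name() { return "site"; }
}

public class StableEvolutionDemo {
    public static void main(String[] args) {
        System.out.println(new UserReport().name());
        System.out.println(new SiteReport().diagnostics().isEmpty());
    }
}
```

This mirrors the later suggestion in the thread of adding no-op or exception-raising concrete methods: source and binary compatibility are preserved even if the new code path is not yet useful to old subclasses.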
Re: ASF OS X Build Infrastructure
Hi Ravi, Something certainly seems off about that bootstrapping problem you encountered. :-) When I've done this, the artifact I downloaded was an .iso file, which I could then use to install a VirtualBox VM. I'm now tuned into the discussion Sean referenced about the ASF MSDN program. I'll send another update when I have something more specific to share. --Chris Nauroth From: Ravi Prakash <ravihad...@gmail.com> Date: Friday, May 20, 2016 at 4:56 PM To: Sean Busbey <bus...@cloudera.com> Cc: Chris Nauroth <cnaur...@hortonworks.com>, Steve Loughran <ste...@hortonworks.com>, Hadoop Common <common-...@hadoop.apache.org>, "mapreduce-...@hadoop.apache.org", "hdfs-...@hadoop.apache.org", "yarn-dev@hadoop.apache.org" Subject: Re: ASF OS X Build Infrastructure FWIW, I was able to get a response from the form last month. I was issued a new MSDN subscriber ID using which I could have downloaded Microsoft Visual Studio (and some other products, I think). I was interested in downloading an image of Windows to run in a VM, but the downloader is... wait for it... an exe file :-) Haven't gotten around to begging someone with a Windows OS to run that image downloader. On Fri, May 20, 2016 at 10:39 AM, Sean Busbey <bus...@cloudera.com> wrote: Some talk about the MSDN-for-committers program recently passed by on a private list. It's still active, it just changed homes within Microsoft. The info should still be in the committer repo.
If something is amiss please let me know and I'll pipe up to the folks already plugged in to confirming it's active. On Fri, May 20, 2016 at 12:13 PM, Chris Nauroth <cnaur...@hortonworks.com> wrote: > It's very disappointing to see that vanish. I'm following up to see if I can learn more about what happened or if I can do anything to help reinstate it. > > --Chris Nauroth > > > > > On 5/20/16, 6:11 AM, "Steve Loughran" <ste...@hortonworks.com> wrote: > >> >>> On 20 May 2016, at 10:40, Lars Francke <lars.fran...@gmail.com> wrote: >>> >>>> >>>> Regarding lack of personal access to anything but Linux, I'll take this as an opportunity to remind everyone that ASF committers (not just limited to Hadoop committers) are entitled to a free MSDN license, which can get you a Windows VM for validating Windows issues and any patches that touch cross-platform concerns, like the native code. Contributors who are not committers still might struggle to get access to Windows, but all of us reviewing and committing patches do have access. >>>> >>> >>> Actually, from all I can tell this MSDN offer has been discontinued for now. All the information has been removed from the committers repo. Do you have any more up to date information on this? >>> >> >> >>That's interesting.
>> >>I did an SVN update and it went away... looks like something happened on April 26 >> >>No idea, though the svn log has a bit of detail >> >>- >>To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org >>For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org >> -- busbey
Re: ASF OS X Build Infrastructure
It's very disappointing to see that vanish. I'm following up to see if I can learn more about what happened or if I can do anything to help reinstate it. --Chris Nauroth On 5/20/16, 6:11 AM, "Steve Loughran" <ste...@hortonworks.com> wrote: > >> On 20 May 2016, at 10:40, Lars Francke <lars.fran...@gmail.com> wrote: >> >>> >>> Regarding lack of personal access to anything but Linux, I'll take this as an opportunity to remind everyone that ASF committers (not just limited to Hadoop committers) are entitled to a free MSDN license, which can get you a Windows VM for validating Windows issues and any patches that touch cross-platform concerns, like the native code. Contributors who are not committers still might struggle to get access to Windows, but all of us reviewing and committing patches do have access. >>> >> >> Actually, from all I can tell this MSDN offer has been discontinued for now. All the information has been removed from the committers repo. Do you have any more up to date information on this? >> > > >That's interesting. > >I did an SVN update and it went away... looks like something happened on April 26 > >No idea, though the svn log has a bit of detail
Re: ASF OS X Build Infrastructure
Allen, thank you for doing this. Regarding lack of personal access to anything but Linux, I'll take this as an opportunity to remind everyone that ASF committers (not just limited to Hadoop committers) are entitled to a free MSDN license, which can get you a Windows VM for validating Windows issues and any patches that touch cross-platform concerns, like the native code. Contributors who are not committers still might struggle to get access to Windows, but all of us reviewing and committing patches do have access. It has long been on my TODO list to set up similar Jenkins jobs for Windows, but it keeps slipping. I'll try once again to bump up priority. --Chris Nauroth On 5/19/16, 9:41 AM, "Allen Wittenauer" <a...@apache.org> wrote: > > Some of you may not know that the ASF actually does have an OS X machine (a Mac mini, so it's not a speed demon) in the build infrastructure. While messing around with getting all of the trunk jobs reconfigured to do Java 8 and separate maven repos, I noticed that this box tends to sit idle most of the day. Why not take advantage of it? Therefore, I also setup two jobs for us to use to help alleviate the "I don't have access to anything but Linux" excuse when writing code that may not work in a portable manner. > >Job #1: > > https://builds.apache.org/view/H-L/view/Hadoop/job/Precommit-HADOOP-OSX > > This basically runs Apache Yetus precommit with quite a few of the unnecessary tests disabled. For example, there's no point in running checkstyle. Note that this job takes the *full* JIRA issue id as input. So 'HADOOP-9902' not '9902'. This allows for one Jenkins job to be used for all the Hadoop sub-projects (HADOOP, HDFS, MR, YARN). "But my code is on github and I don't want to upload a patch!" I haven't tested it, but it should also take a URL, so just add a .diff to the end of your github compare URL and put that in the issue box. It hypothetically should work.
> >Job #2: > > I'm still hammering on this one because the email notifications aren't working to my satisfaction plus we have some extremely Linux-specific code in YARN... but > > > https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-trunk-osx-java8/ > > ... is a "build the world" job similar to what is currently running under the individual sub projects. (This actually makes it one of the few "build everything" jobs we have running. Most of the other jobs only build that particular sub project.) It does not run the full unit test suite and it also does not build all of the native code. This gives us a place to start on our journey of making Hadoop actually, truly run everywhere. (Interesting side note: It's been *extremely* consistent in what fails vs. the Linux build hosts.) > > At some point, likely after YETUS-390 is complete, I'll switch this job over to be run by Apache Yetus in qbt mode so that it's actually easier to track failures across all dirs. A huge advantage over raw maven commands. > > Happy testing everyone. > > NOTE: if you don't have access to launch jobs on builds.apache.org, you'll need to send a request to private@. The Apache Hadoop PMC has the keys to give access to folks.
Re: [VOTE] Merge feature branch HADOOP-12930
Understood about the tests. --Chris Nauroth On 5/15/16, 7:30 AM, "Allen Wittenauer" <a...@apache.org> wrote: > >> On May 14, 2016, at 3:11 PM, Chris Nauroth <cnaur...@hortonworks.com> wrote: >> >> +1 (binding) >> >> -Tried a dry-run merge of HADOOP-12930 to trunk. >> -Successfully built distro on Windows. >> -Ran "hdfs namenode", "hdfs datanode", and various interactive hdfs commands through Cygwin. >> -Reviewed documentation. >> >> Allen, thank you for the contribution. Would you please attach a full patch to HADOOP-12930 to check pre-commit results? > > > Nope. The whole reason this was done as a branch with multiple patches was to prevent Jenkins from getting overwhelmed since it would trigger full unit tests on pretty much the entire code base.... > >> While testing this, I discovered a bug in the distro build for Windows. >> Could someone please code review my patch on HADOOP-13149? > > Done! > >> >> --Chris Nauroth >> >> >> >> >> On 5/9/16, 1:26 PM, "Allen Wittenauer" <a...@apache.org> wrote: >> >>> >>> Hey gang! >>> >>> I'd like to call a vote to run for 7 days (ending May 16 at 13:30 PT) to merge the HADOOP-12930 feature branch into trunk. This branch was developed exclusively by me as per the discussion two months ago as a way to make what would be a rather large patch hopefully easier to review. The vast majority of the branch is code movement in the same file, additional license headers, maven assembly hooks for distribution, and variable renames. Not a whole lot of new code, but a big diff file none-the-less. >>> >>> This branch modifies the 'hadoop', 'hdfs', 'mapred', and 'yarn' commands to allow for subcommands to be added or modified at runtime. This allows for individual users or entire sites to tweak the execution environment to suit their local needs. For example, it has been a practice for some locations to change the distcp jar out for a custom one.
Using this functionality, it is possible that the 'hadoop distcp' command could run the local version without overwriting the bundled jar and for existing documentation (read: results from Internet searches) to work as written without modification. This has the potential to be a huge win, especially for: >>> >>> * advanced end users looking to supplement the Apache Hadoop experience >>> * operations teams that may be able to leverage existing documentation without having to maintain local "exception" docs >>> * development groups wanting an easy way to trial experimental features >>> >>> Additionally, this branch includes the following, related changes: >>> >>> * Adds the first unit tests for the 'hadoop' command >>> * Adds the infrastructure for hdfs script testing and the first unit test for the 'hdfs' command >>> * Modifies the hadoop-tools components to be dynamic rather than hard coded >>> * Renames the shell profiles for hdfs, mapred, and yarn to be consistent with other bundled profiles, including the ones introduced in this branch >>> >>> Documentation, including a 'hello world'-style example, is in the UnixShellGuide markdown file. (Of course!) >>> >>> I am at ApacheCon this week if anyone wants to discuss in-depth. >>> >>> Thanks! >>> >>> P.S., >>> >>> There are still two open sub-tasks. These are blocked by other issues so that we may add unit testing to the shell code in those respective areas. I'll convert to full issues after HADOOP-12930 is closed.
Re: [VOTE] Merge feature branch HADOOP-12930
+1 (binding) -Tried a dry-run merge of HADOOP-12930 to trunk. -Successfully built distro on Windows. -Ran "hdfs namenode", "hdfs datanode", and various interactive hdfs commands through Cygwin. -Reviewed documentation. Allen, thank you for the contribution. Would you please attach a full patch to HADOOP-12930 to check pre-commit results? While testing this, I discovered a bug in the distro build for Windows. Could someone please code review my patch on HADOOP-13149? --Chris Nauroth On 5/9/16, 1:26 PM, "Allen Wittenauer" <a...@apache.org> wrote: > > Hey gang! > > I'd like to call a vote to run for 7 days (ending May 16 at 13:30 PT) to merge the HADOOP-12930 feature branch into trunk. This branch was developed exclusively by me as per the discussion two months ago as a way to make what would be a rather large patch hopefully easier to review. The vast majority of the branch is code movement in the same file, additional license headers, maven assembly hooks for distribution, and variable renames. Not a whole lot of new code, but a big diff file none-the-less. > > This branch modifies the 'hadoop', 'hdfs', 'mapred', and 'yarn' commands to allow for subcommands to be added or modified at runtime. This allows for individual users or entire sites to tweak the execution environment to suit their local needs. For example, it has been a practice for some locations to change the distcp jar out for a custom one. Using this functionality, it is possible that the 'hadoop distcp' command could run the local version without overwriting the bundled jar and for existing documentation (read: results from Internet searches) to work as written without modification.
This has the potential to be a huge win, especially for: > > * advanced end users looking to supplement the Apache Hadoop experience > * operations teams that may be able to leverage existing documentation without having to maintain local "exception" docs > * development groups wanting an easy way to trial experimental features > > Additionally, this branch includes the following, related changes: > > * Adds the first unit tests for the 'hadoop' command > * Adds the infrastructure for hdfs script testing and the first unit test for the 'hdfs' command > * Modifies the hadoop-tools components to be dynamic rather than hard coded > * Renames the shell profiles for hdfs, mapred, and yarn to be consistent with other bundled profiles, including the ones introduced in this branch > > Documentation, including a 'hello world'-style example, is in the UnixShellGuide markdown file. (Of course!) > >I am at ApacheCon this week if anyone wants to discuss in-depth. > > Thanks! > >P.S., > > There are still two open sub-tasks. These are blocked by other issues so that we may add unit testing to the shell code in those respective areas. I'll convert to full issues after HADOOP-12930 is closed.
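The runtime-pluggable subcommand mechanism the vote describes can be sketched with a generic bash dispatch table. This is illustrative only -- the function names below are hypothetical, not the actual Hadoop shell API, which is documented in the UnixShellGuide markdown file mentioned above:

```shell
#!/usr/bin/env bash
# Generic sketch of runtime-pluggable subcommands: the launcher looks up
# a function named subcommand_<name> at dispatch time, so any function
# defined before dispatch (e.g. from a sourced user or site shell
# profile) becomes a new subcommand without editing the launcher.

# A subcommand bundled with the launcher itself:
subcommand_version() { echo "demo 1.0"; }

# A user- or site-supplied subcommand, as a profile might define:
subcommand_distcp() { echo "running site-local distcp: $*"; }

dispatch() {
  local cmd="$1"
  shift
  # declare -F tests whether a function with this name exists.
  if declare -F "subcommand_${cmd}" > /dev/null; then
    "subcommand_${cmd}" "$@"
  else
    echo "unknown subcommand: ${cmd}" >&2
    return 1
  fi
}

dispatch version
dispatch distcp /src /dst
```

The design choice worth noting is that lookup happens at call time, so sourcing one more file is all it takes to add or replace a subcommand -- which is exactly the "swap in a custom distcp" scenario described in the vote text.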
Re: [DISCUSS] Treating LimitedPrivate({"MapReduce"}) as Public APIs for YARN applications
Yes, I agree with you Andrew. Sorry, I should clarify my prior response. I didn't mean to imply a blind s/LimitedPrivate/Public/g across the whole codebase. Instead, I'm +1 for the intent of HADOOP-10776: a transition to Public for UserGroupInformation, and by extension the related parts of its API like Credentials. I'm in the camp that generally questions the usefulness of LimitedPrivate, but I agree that transitions to Public need case-by-case consideration. --Chris Nauroth From: Andrew Wang <andrew.w...@cloudera.com> Date: Tuesday, May 10, 2016 at 2:40 PM To: Chris Nauroth <cnaur...@hortonworks.com> Cc: Hitesh Shah <hit...@apache.org>, "yarn-dev@hadoop.apache.org", "mapreduce-...@hadoop.apache.org", "common-...@hadoop.apache.org" Subject: Re: [DISCUSS] Treating LimitedPrivate({"MapReduce"}) as Public APIs for YARN applications Why don't we address these on a case-by-case basis, changing the annotations on these key classes to Public? LimitedPrivate{"YARN applications"} is the same thing as Public. This way we don't need to add special exceptions to our compatibility policy. Keeps it simple. Best, Andrew On Tue, May 10, 2016 at 2:26 PM, Chris Nauroth <cnaur...@hortonworks.com> wrote: +1 for transitioning from LimitedPrivate to Public. I view this as an extension of the need for UserGroupInformation and related APIs to be Public. Regardless of the original intent behind LimitedPrivate, these are de facto public now, because there is no viable alternative for applications that want to integrate with a secured Hadoop cluster.
There is prior discussion of this topic on HADOOP-10776 and HADOOP-12913. HADOOP-10776 is a blocker for 2.8.0 to make the transition to Public. --Chris Nauroth On 5/10/16, 11:34 AM, "Hitesh Shah" <hit...@apache.org<mailto:hit...@apache.org>> wrote: >There seems to be some incorrect assumptions on why the application had >an issue. For rolling upgrade deployments, the application bundles the >client-side jars that it was compiled against and uses them in its >classpath and expects to be able to communicate with upgraded servers. >Given that hadoop-common is a monolithic jar, it ends up being used on >both client-side and server-side. The problem in this case was caused by >the fact that the ResourceManager was generating the credentials file >with a format understood only by hadoop-common from 3.x. For an >application compiled against 2.x and has *only* hadoop-common from 2.x on >its classpath, trying to read this file fails. > >This is not about whether internal implementations can change for >non-public APIs. The file format for the Credential file in this scenario >is *not* internal implementation especially when you can have different >versions of the library trying to read the file. If an older client is >talking to a newer versioned server, the general backward compat >assumption is that the client should receive a response that it can parse >and understand. In this scenario, the credentials file provided to the >YARN app by the RM should have been written out with the older version or >at the very least been readable by the older hadoop-common.jar. > >In any case, does anyone have any specific concerns with changing >LimitedPrivate({"MapReduce"}) to Public? > >And sure, if we are saying that Hadoop-3.x requires all apps built >against it to go through a full re-compile as well as downtime as >existing apps may no longer work out of the box, lets call it out very >explicitly in the Release notes. 
> >-- Hitesh > >> On May 10, 2016, at 9:24 AM, Allen Wittenauer >><allenwittena...@yahoo.com> wrote: >> >> >>> On May 10, 2016, at 8:37 AM, Hitesh Shah >>> <hit...@apache.org> wrote: >>> >>> There have been various discussions on various JIRAs where upstream projects such as YARN apps ( Tez, Slider, etc ) are called out for using the above so-called Private APIs. A lot of YARN applications that have been built out have picked up various bits and pieces of implementation from MapReduce and DistributedShell to get things to work. >>>
Re: [DISCUSS] Treating LimitedPrivate({"MapReduce"}) as Public APIs for YARN applications
+1 for transitioning from LimitedPrivate to Public. I view this as an extension of the need for UserGroupInformation and related APIs to be Public. Regardless of the original intent behind LimitedPrivate, these are de facto public now, because there is no viable alternative for applications that want to integrate with a secured Hadoop cluster. There is prior discussion of this topic on HADOOP-10776 and HADOOP-12913. HADOOP-10776 is a blocker for 2.8.0 to make the transition to Public. --Chris Nauroth On 5/10/16, 11:34 AM, "Hitesh Shah" <hit...@apache.org> wrote: >There seems to be some incorrect assumptions on why the application had >an issue. For rolling upgrade deployments, the application bundles the >client-side jars that it was compiled against and uses them in its >classpath and expects to be able to communicate with upgraded servers. >Given that hadoop-common is a monolithic jar, it ends up being used on >both client-side and server-side. The problem in this case was caused by >the fact that the ResourceManager was generating the credentials file >with a format understood only by hadoop-common from 3.x. For an >application compiled against 2.x and has *only* hadoop-common from 2.x on >its classpath, trying to read this file fails. > >This is not about whether internal implementations can change for >non-public APIs. The file format for the Credential file in this scenario >is *not* internal implementation especially when you can have different >versions of the library trying to read the file. If an older client is >talking to a newer versioned server, the general backward compat >assumption is that the client should receive a response that it can parse >and understand. In this scenario, the credentials file provided to the >YARN app by the RM should have been written out with the older version or >at the very least been readable by the older hadoop-common.jar. 
> >In any case, does anyone have any specific concerns with changing >LimitedPrivate({"MapReduce"}) to Public? > >And sure, if we are saying that Hadoop-3.x requires all apps built >against it to go through a full re-compile as well as downtime as >existing apps may no longer work out of the box, let's call it out very >explicitly in the Release notes. > >-- Hitesh > >> On May 10, 2016, at 9:24 AM, Allen Wittenauer >><allenwittena...@yahoo.com> wrote: >> >> >>> On May 10, 2016, at 8:37 AM, Hitesh Shah <hit...@apache.org> wrote: >>> >>> There have been various discussions on various JIRAs where upstream >>>projects such as YARN apps ( Tez, Slider, etc ) are called out for >>>using the above so-called Private APIs. A lot of YARN applications that >>>have been built out have picked up various bits and pieces of >>>implementation from MapReduce and DistributedShell to get things to >>>work. >>> >>> A recent example is a backward incompatible change introduced ( where >>>the API is not even directly invoked ) in the Credentials class related >>>to the ability to read tokens/credentials from a file. >> >> Let's be careful here. It should be noted that the problem happened >>primarily because the application jar appears to have included some >>hadoop jars in it. So the API invocation isn't the problem: it's >>the fact that the implementation under the hood changed. If the >>application jar didn't bundle hadoop jars --especially given that they were >>already on the classpath--this problem should never have happened. >> >>> This functionality is required by pretty much everyone as YARN >>>provides the credentials to the app by writing the credentials/tokens >>>to a local file which is read in when >>>UserGroupInformation.getCurrentUser() is invoked. >> >> What you're effectively arguing is that implementations should never >>change for public (and in this case LimitedPrivate) APIs. I don't think >>that's reasonable. 
Hadoop is filled with changes in major branches >>where the implementations have changed but the internals have been >>reworked to perform the work in a slightly different manner. >> >>> This change breaks rolling upgrades for yarn applications from 2.x to >>>3.x (whether we end up supporting rolling upgrades across 2.x to 3.x is >>>a separate discussion ) >> >> >> At least today, according to the document attached to YARN-666 (lol), >>rolling upgrades are only supported within the same major version. >> >>> >>> I would like to change our compatibility docs to state that any API >>>that is marked as LimitedPrivate{Mapreduce} impl
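The thread above hinges on a versioned file format shared between a writer (the ResourceManager) and a reader (an app's bundled hadoop-common). The sketch below is a minimal, hypothetical illustration of that compatibility problem in Python -- it is NOT Hadoop's actual Credentials serialization, and the `TOKS` magic and layout are invented for this example. It shows why an old reader can only reject a newer format version, which is the basis of the argument that the server should keep writing a format old clients can parse.

```python
import io
import struct

# Hypothetical format, for illustration only: NOT Hadoop's actual
# Credentials on-disk format. Layout: 4-byte magic, 1-byte version,
# token count, then length-prefixed (alias, secret) pairs.
MAGIC = b"TOKS"

def write_tokens(tokens, version=1):
    """Serialize (alias, secret) pairs with a magic header and version byte."""
    buf = io.BytesIO()
    buf.write(MAGIC)
    buf.write(struct.pack(">B", version))
    buf.write(struct.pack(">I", len(tokens)))
    for alias, secret in tokens:
        for part in (alias.encode("utf-8"), secret):
            buf.write(struct.pack(">I", len(part)))
            buf.write(part)
    return buf.getvalue()

def read_tokens_v1(data):
    """An 'old' reader: it understands only version 1 and must reject newer data."""
    buf = io.BytesIO(data)
    if buf.read(4) != MAGIC:
        raise ValueError("bad magic")
    (version,) = struct.unpack(">B", buf.read(1))
    if version != 1:
        # The old reader cannot do better than fail cleanly here; only the
        # writer can preserve compatibility, by sticking to version 1.
        raise ValueError("unsupported credentials format version %d" % version)
    (count,) = struct.unpack(">I", buf.read(4))
    tokens = []
    for _ in range(count):
        (n,) = struct.unpack(">I", buf.read(4))
        alias = buf.read(n).decode("utf-8")
        (n,) = struct.unpack(">I", buf.read(4))
        tokens.append((alias, buf.read(n)))
    return tokens
```

A version-1 file round-trips through the old reader, while a version-2 file raises an error -- the same failure mode the YARN apps in the thread hit when an upgraded RM wrote a format their bundled 2.x jars could not read.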
Re: ProcFsBasedProcessTree and clean pages in smaps
Interesting, I didn't know about "Locked" in smaps. Thanks for pointing that out. At this point, if Varun's suggestion to check out YARN-1856 doesn't solve the problem, then I suggest opening a JIRA to track further design discussion. --Chris Nauroth On 2/5/16, 6:10 AM, "Varun Vasudev" <vvasu...@apache.org> wrote: >Hi Jan, > >YARN-1856 was recently committed which allows admins to use cgroups >instead of the ProcfsBasedProcessTree monitor. Would that solve your >problem? However, that requires usage of the LinuxContainerExecutor. > >-Varun > > > >On 2/5/16, 6:45 PM, "Jan Lukavský" <jan.lukav...@firma.seznam.cz> wrote: > >>Hi Chris, >> >>thanks for your reply. As far as I can see, new linux kernels show >>the locked memory in the "Locked" field. >> >>If I mmap a file and mlock it, I see the following in the 'smaps' file: >> >>7efd20aeb000-7efd2172b000 r--p 103:04 1870 >>/tmp/file.bin >>Size: 12544 kB >>Rss: 12544 kB >>Pss: 12544 kB >>Shared_Clean: 0 kB >>Shared_Dirty: 0 kB >>Private_Clean: 12544 kB >>Private_Dirty: 0 kB >>Referenced: 12544 kB >>Anonymous: 0 kB >>AnonHugePages: 0 kB >>Swap: 0 kB >>KernelPageSize: 4 kB >>MMUPageSize: 4 kB >>Locked: 12544 kB >> >>... >># uname -a >>Linux XX 3.2.0-4-amd64 #1 SMP Debian 3.2.68-1+deb7u3 x86_64 GNU/Linux >> >>If I do this on an older kernel (2.6.x), the Locked field is missing. >> >>I can make a patch for the ProcfsBasedProcessTree that will calculate >>the "Locked" pages instead of the "Private_Clean" (based on a >>configuration option). The question is - should there be even more >>changes made in the way the memory footprint is calculated? For instance, I >>believe the kernel can write to disk even all dirty pages (if they are >>backed by a file), making them clean, and therefore it can later free them. >>Should I open a JIRA for this to have some discussion on this topic? 
>> >>Regards, >> Jan >> >> >>On 02/04/2016 07:20 PM, Chris Nauroth wrote: >>> Hello Jan, >>> >>> I am moving this thread from u...@hadoop.apache.org to >>> yarn-dev@hadoop.apache.org, since it's less a question of general usage >>> and more a question of internal code implementation details and >>>possible >>> enhancements. >>> >>> I think the issue is that it's not guaranteed in the general case that >>> Private_Clean pages are easily evictable from page cache by the kernel. >>> For example, if the pages have been pinned into RAM by calling mlock >>>[1], >>> then the kernel cannot evict them. Since YARN can execute any code >>> submitted by an application, including possibly code that calls mlock, >>>it >>> takes a cautious approach and assumes that these pages must be counted >>> towards the process footprint. Although your Spark use case won't >>>mlock >>> the pages (I assume), YARN doesn't have a way to identify this. >>> >>> Perhaps there is room for improvement here. If there is a reliable >>>way to >>> count only mlock'ed pages, then perhaps that behavior could be added as >>> another option in ProcfsBasedProcessTree. Off the top of my head, I >>>can't >>> think of a reliable way to do this, and I can't research it further >>> immediately. Do others on the thread have ideas? >>> >>> --Chris Nauroth >>> >>> [1] http://linux.die.net/man/2/mlock >>> >>> >>> >>> >>> On 2/4/16, 5:11 AM, "Jan Lukavský" <jan.lukav...@firma.seznam.cz> >>>wrote: >>> >>>> Hello, >>>> >>>> I have a question about the way LinuxResourceCalculatorPlugin >>>>calculates >>>> memory consumed by process tree (it is calculated via >>>> ProcfsBasedProcessTree class). When we enable caching (disk) in apache >>>> spark jobs run on YARN cluster, the node manager starts to kill the >>>> containers while reading the cached data, because of "Container is >>>> running beyond memory limits ...". 
The reason is that even if we >>>>enable >>>> parsing of the smaps file >>>> >>>>(yarn.nodemanager.container-monitor.procfs-tree.smaps-based-rss.enabled) >>>> the ProcfsBased
Re: [Release thread] 2.8.0 release activities
FYI, I've just needed to raise HDFS-9761 to blocker status for the 2.8.0 release. --Chris Nauroth On 2/3/16, 6:19 PM, "Karthik Kambatla" <ka...@cloudera.com> wrote: >Thanks Vinod. Not labeling 2.8.0 stable sounds perfectly reasonable to me. >Let us not call it alpha or beta though, it is quite confusing. :) > >On Wed, Feb 3, 2016 at 8:17 PM, Gangumalla, Uma <uma.ganguma...@intel.com> >wrote: > >> Thanks Vinod. +1 for 2.8 release start. >> >> Regards, >> Uma >> >> On 2/3/16, 3:53 PM, "Vinod Kumar Vavilapalli" <vino...@apache.org> >>wrote: >> >> >Seems like all the features listed in the Roadmap wiki are in. I'm >>going >> >to try cutting an RC this weekend for a first/non-stable release off of >> >branch-2.8. >> > >> >Let me know if anyone has any objections/concerns. >> > >> >Thanks >> >+Vinod >> > >> >> On Nov 25, 2015, at 5:59 PM, Vinod Kumar Vavilapalli >> >><vino...@apache.org> wrote: >> >> >> >> Branch-2.8 is created. >> >> >> >> As mentioned before, the goal on branch-2.8 is to put improvements / >> >>fixes to existing features with a goal of converging on an alpha >>release >> >>soon. >> >> >> >> Thanks >> >> +Vinod >> >> >> >> >> >>> On Nov 25, 2015, at 5:30 PM, Vinod Kumar Vavilapalli >> >>><vino...@apache.org> wrote: >> >>> >> >>> Forking threads now in order to track all things related to the >> >>>release. >> >>> >> >>> Creating the branch now. >> >>> >> >>> Thanks >> >>> +Vinod >> >>> >> >>> >> >>>> On Nov 25, 2015, at 11:37 AM, Vinod Kumar Vavilapalli >> >>>><vino...@apache.org> wrote: >> >>>> >> >>>> I think we've converged at a high level w.r.t 2.8. And as I just >>sent >> >>>>out an email, I updated the Roadmap wiki reflecting the same: >> >>>>https://wiki.apache.org/hadoop/Roadmap >> >>>> >> >>>> I plan to create a 2.8 branch EOD today. 
>> >>>> >> >>>> The goal for all of us should be to restrict improvements & fixes >>to >> >>>>only (a) the feature-set documented under 2.8 in the RoadMap wiki >>and >> >>>>(b) other minor features that are already in 2.8. >> >>>> >> >>>> Thanks >> >>>> +Vinod >> >>>> >> >>>> >> >>>>> On Nov 11, 2015, at 12:13 PM, Vinod Kumar Vavilapalli >> >>>>><vino...@hortonworks.com> wrote: >> >>>>> >> >>>>> - Cut a branch about two weeks from now >> >>>>> - Do an RC mid next month (leaving ~4 weeks since branch-cut) >> >>>>> - As with 2.7.x series, the first release will still be called as >> >>>>>early / alpha release in the interest of >> >>>>> -- gaining downstream adoption >> >>>>> -- wider testing, >> >>>>> -- yet reserving our right to fix any inadvertent >>incompatibilities >> >>>>>introduced. >> >>>> >> >>> >> >> >> > >> >>
Re: ProcFsBasedProcessTree and clean pages in smaps
Hello Jan, I am moving this thread from u...@hadoop.apache.org to yarn-dev@hadoop.apache.org, since it's less a question of general usage and more a question of internal code implementation details and possible enhancements. I think the issue is that it's not guaranteed in the general case that Private_Clean pages are easily evictable from page cache by the kernel. For example, if the pages have been pinned into RAM by calling mlock [1], then the kernel cannot evict them. Since YARN can execute any code submitted by an application, including possibly code that calls mlock, it takes a cautious approach and assumes that these pages must be counted towards the process footprint. Although your Spark use case won't mlock the pages (I assume), YARN doesn't have a way to identify this. Perhaps there is room for improvement here. If there is a reliable way to count only mlock'ed pages, then perhaps that behavior could be added as another option in ProcfsBasedProcessTree. Off the top of my head, I can't think of a reliable way to do this, and I can't research it further immediately. Do others on the thread have ideas? --Chris Nauroth [1] http://linux.die.net/man/2/mlock On 2/4/16, 5:11 AM, "Jan Lukavský" <jan.lukav...@firma.seznam.cz> wrote: >Hello, > >I have a question about the way LinuxResourceCalculatorPlugin calculates >memory consumed by process tree (it is calculated via >ProcfsBasedProcessTree class). When we enable caching (disk) in apache >spark jobs run on YARN cluster, the node manager starts to kill the >containers while reading the cached data, because of "Container is >running beyond memory limits ...". The reason is that even if we enable >parsing of the smaps file >(yarn.nodemanager.container-monitor.procfs-tree.smaps-based-rss.enabled) >the ProcfsBasedProcessTree calculates mmaped read-only pages as consumed >by the process tree, while spark uses FileChannel.map(MapMode.READ_ONLY) >to read the cached data. 
The JVM then consumes *a lot* more memory than >the configured heap size (and it cannot be really controlled), but this >memory is IMO not really consumed by the process, the kernel can reclaim >these pages, if needed. My question is - is there any explicit reason >why "Private_Clean" pages are calculated as consumed by process tree? I >patched the ProcfsBasedProcessTree not to calculate them, but I don't >know if this is the "correct" solution. > >Thanks for opinions, > cheers, > Jan > > >- >To unsubscribe, e-mail: user-unsubscr...@hadoop.apache.org >For additional commands, e-mail: user-h...@hadoop.apache.org > >
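The accounting change Jan proposes -- counting "Locked" pages rather than "Private_Clean" ones -- can be sketched as a small smaps parser. This is a Python illustration only (the real ProcfsBasedProcessTree is Java); the field names and kB format come from the smaps excerpt earlier in the thread, and the sample below is trimmed to a single mapping.

```python
import re

# A trimmed sample in the format of /proc/<pid>/smaps (one mapping),
# matching the excerpt quoted in the thread.
SAMPLE_SMAPS = """\
7efd20aeb000-7efd2172b000 r--p 103:04 1870 /tmp/file.bin
Size:          12544 kB
Rss:           12544 kB
Shared_Dirty:      0 kB
Private_Clean: 12544 kB
Private_Dirty:     0 kB
Locked:        12544 kB
"""

def sum_field_kb(smaps_text, field):
    """Sum a kB-valued smaps field (e.g. 'Locked' or 'Private_Clean')
    across all mappings in the given smaps text."""
    pattern = re.compile(r"^%s:\s+(\d+)\s+kB" % re.escape(field), re.MULTILINE)
    return sum(int(m.group(1)) for m in pattern.finditer(smaps_text))

if __name__ == "__main__":
    # On newer kernels, charging 'Locked' instead of 'Private_Clean' would
    # exempt clean, reclaimable mmap'ed pages from the container footprint.
    print(sum_field_kb(SAMPLE_SMAPS, "Locked"))
    print(sum_field_kb(SAMPLE_SMAPS, "Private_Clean"))
```

Against a real process one would read `/proc/<pid>/smaps` and sum over every mapping; the parsing logic is the same.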
Re: YARN application container state doubt
If you have a copy of the Hadoop source and a working build environment, then here is a cool way to answer questions about which state transitions are valid. There is a Maven build command that inspects the current implementation of the state machines and summarizes the state transitions in a GraphViz file. There are visualization tools available for browsing the GraphViz file, but they are plain text, so I tend to just browse them in my text editor. This special build command is available for both ResourceManager and NodeManager. Your question relates to the ResourceManager, so let's look at that one. > cd hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/ > mvn -o clean compile -Pvisualize > ls -lrt *.gv -rw-r--r--+ 1 chris staff 14K Jan 6 10:42 ResourceManager.gv Looking in that file, I can see that this is one of the state transitions for RMContainer. subgraph cluster_RMContainer { label="RMContainer" ... "RMContainer.ACQUIRED" -> "RMContainer.COMPLETED" [ label = "FINISHED" ]; ... Based on this, we know that ACQUIRED to COMPLETED is a valid state transition. Unfortunately, I'm not deep enough on this code to quickly answer the second part of your question: when would this happen? Can someone deeper on YARN please add to this? --Chris Nauroth On 1/6/16, 10:33 AM, "Prabhu Joseph" <prabhujose.ga...@gmail.com> wrote: >Hi Experts, > > I have a Custom ApplicationMaster written using TWILL which asks for 20 >Containers and RM ApplicationMasterService has assigned 20 containers, out >of which 19 are moved from ACQUIRED to RUNNING, and there is one moved >from >*ACQUIRED to COMPLETED.* >So my doubt is: is ACQUIRED to COMPLETED a correct state flow, and if so, when >would that happen? > > >Thanks, >Prabhu Joseph
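Once the `.gv` file is generated, checking whether a transition is valid can be automated instead of eyeballed. A minimal sketch, assuming the edge syntax shown in the ResourceManager.gv excerpt above (`"A" -> "B" [ label = "EVENT" ];`); the sample text and the second LAUNCHED edge are illustrative, not a full dump of the real state machine.

```python
import re

# Sample fragment in the dumped .gv edge syntax; illustrative only.
SAMPLE_GV = '''
subgraph cluster_RMContainer {
  label="RMContainer"
  "RMContainer.ACQUIRED" -> "RMContainer.COMPLETED" [ label = "FINISHED" ];
  "RMContainer.ACQUIRED" -> "RMContainer.RUNNING" [ label = "LAUNCHED" ];
}
'''

# Matches: "from" -> "to" [ label = "EVENT" ]
EDGE_RE = re.compile(r'"([^"]+)"\s*->\s*"([^"]+)"\s*\[\s*label\s*=\s*"([^"]+)"\s*\]')

def transitions(gv_text):
    """Return (from_state, to_state, event) triples found in a *.gv dump."""
    return [m.groups() for m in EDGE_RE.finditer(gv_text)]

def is_valid(gv_text, src, dst):
    """True if some event drives src -> dst in the dumped state machine."""
    return any(a == src and b == dst for a, b, _ in transitions(gv_text))
```

Running `is_valid(open("ResourceManager.gv").read(), "RMContainer.ACQUIRED", "RMContainer.COMPLETED")` against a freshly generated dump would answer the original question directly.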
Re: How to debug hadoop(or YARN) locally?
If you want the capability to run live pseudo-distributed and deploy code changes without doing a full distro tarball build, then you can control the classpath by setting a few more environment variables in hadoop-env.sh. Here is an example of what I'm doing in one of my dev environments. export HADOOP_USER_CLASSPATH_FIRST=1 HADOOP_REPO=~/git/hadoop export HADOOP_CLASSPATH=$HADOOP_REPO/hadoop-common-project/hadoop-common/target/classes:$HADOOP_REPO/hadoop-hdfs-project/hadoop-hdfs-client/target/classes:$HADOOP_REPO/hadoop-hdfs-project/hadoop-hdfs/target/classes Setting HADOOP_CLASSPATH adds additional paths to the classpath before the shell launches the JVM. In my case, I have the source checked out to ~/git/hadoop, and I point to the target/classes sub-directories for the individual sub-modules that I want to override and test. Then, I can make code changes, run "mvn compile" in the sub-module directory, and restart the daemons. By default, the HADOOP_CLASSPATH entries are added at the end of the standard classpath. Setting HADOOP_USER_CLASSPATH_FIRST=1 changes that behavior so that the custom entries are first. This way, my built code changes override the classes that were bundled in the tarball distro. --Chris Nauroth On 12/21/15, 7:29 PM, "Allen Zhang" <allenzhang...@126.com> wrote: > >oh, so cool. awesome. Thanks > > > > > > > >At 2015-12-22 11:01:55, "Jeff Zhang" <zjf...@gmail.com> wrote: >>If you want to change the yarn internal code, you can use MiniYarnCluster >>for testing. 
>> >>https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/java/org/apache/hadoop/yarn/server/MiniYARNCluster.java >> >>On Tue, Dec 22, 2015 at 10:00 AM, Allen Zhang <allenzhang...@126.com> >>wrote: >> >>> >>> >>> so, does it mean that, if I change or add some code, I have to >>> re-tarball the whole project using "mvn clean package -Pdist >>>-DskipTests >>> -Dtar", and then, deploy it to somewhere to remote debug? if yes, I >>>think >>> it is so inconvenient. if no, can you guys explain more in this way? >>> >>> >>> Thanks, >>> Allen >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> At 2015-12-22 08:55:01, "Jeff Zhang" <zjf...@gmail.com> wrote: >>> >+1 for Chris, remote debug will help you. >>> > >>> >On Tue, Dec 22, 2015 at 1:54 AM, Chris Nauroth >>><cnaur...@hortonworks.com> >>> >wrote: >>> > >>> >> If you're running the Hadoop daemons in pseudo-distributed mode >>>(all the >>> >> daemons running as separate processes, but on a single dev host), >>>then >>> >> another option is to launch the daemon's JVM with the JDWP >>>arguments and >>> >> attach a "remote" debugger. This can be either the jdb CLI debugger >>> that >>> >> ships with the JDK or a fancier IDE like Eclipse or IntelliJ. >>> >> >>> >> Each daemon's JVM arguments are controlled with an environment >>>variable >>> >> suffixed with "_OPTS" defined in files named *-env.sh. For >>>example, in >>> >> hadoop-env.sh, you could set something like this to enable remote >>> >> debugging for the NameNode process: >>> >> >>> >> export >>> >> >>>HADOOP_NAMENODE_OPTS="-agentlib:jdwp=transport=dt_socket,server=y,address=8000,suspend=n $HADOOP_NAMENODE_OPTS" >>> >> >>> >> >>> >> Then, you can run "jdb -attach localhost:8000" to attach the >>>debugger, >>> or >>> >> do the equivalent in your IDE of choice. 
>>> >> >>> >> --Chris Nauroth >>> >> >>> >> >>> >> >>> >> On 12/21/15, 7:25 AM, "Daniel Templeton" <dan...@cloudera.com> >>>wrote: >>> >> >>> >> >Your best bet is to find a test that includes all the bits you >>>want and >>> >> >execute that test in debug mode. (You can also change an existing >>>test >>> >> >to include what you want, but in most cases it is easier to start >>>with >>> >> >an existing test than to start from scratch.) >>> >> > >>> >> >Daniel >>> >> > >>> >> >On 12/20/15 6:01 PM, Allen Zhang wrote: >>> >> >> Hi all, >>> >> >> >>> >> >> I am reading hadoop-2.6.0 source code, mainly focusing on hadoop >>> yarn. >>> >> >> However I have some problems in reading or debugging the source >>> >> >>code, can I debug it locally (I mean in my laptop locally with this >>> source >>> >> >>code I've downloaded, not remotely debug), >>> >> >> because I need to track its execution flow step by step, and then >>>I >>> want >>> >> >>to add a new feature or enhancement. >>> >> >> >>> >> >> So can anyone give some good suggestions or share your method or >>>any >>> >> >>wiki page? Really appreciate!! >>> >> >> >>> >> >> >>> >> >> Thanks, >>> >> >> Allen >>> >> > >>> >> > >>> >> >>> >> >>> > >>> > >>> >-- >>> >Best Regards >>> > >>> >Jeff Zhang >> >> >> >>-- >>Best Regards >> >>Jeff Zhang
Re: How to debug hadoop(or YARN) locally?
If you're running the Hadoop daemons in pseudo-distributed mode (all the daemons running as separate processes, but on a single dev host), then another option is to launch the daemon's JVM with the JDWP arguments and attach a "remote" debugger. This can be either the jdb CLI debugger that ships with the JDK or a fancier IDE like Eclipse or IntelliJ. Each daemon's JVM arguments are controlled with an environment variable suffixed with "_OPTS" defined in files named *-env.sh. For example, in hadoop-env.sh, you could set something like this to enable remote debugging for the NameNode process: export HADOOP_NAMENODE_OPTS="-agentlib:jdwp=transport=dt_socket,server=y,address=8000,suspend=n $HADOOP_NAMENODE_OPTS" Then, you can run "jdb -attach localhost:8000" to attach the debugger, or do the equivalent in your IDE of choice. --Chris Nauroth On 12/21/15, 7:25 AM, "Daniel Templeton" <dan...@cloudera.com> wrote: >Your best bet is to find a test that includes all the bits you want and >execute that test in debug mode. (You can also change an existing test >to include what you want, but in most cases it is easier to start with >an existing test than to start from scratch.) > >Daniel > >On 12/20/15 6:01 PM, Allen Zhang wrote: >> Hi all, >> >> I am reading hadoop-2.6.0 source code, mainly focusing on hadoop yarn. >> However I have some problems in reading or debugging the source >>code, can I debug it locally (I mean in my laptop locally with this source >>code I've downloaded, not remotely debug), >> because I need to track its execution flow step by step, and then I want >>to add a new feature or enhancement. >> >> >> So can anyone give some good suggestions or share your method or any >>wiki page? Really appreciate!! >> >> >> Thanks, >> Allen > >
Re: [DISCUSS] Looking to a 2.8.0 release
+1. Thanks, Vinod. --Chris Nauroth On 11/25/15, 1:45 PM, "Vinod Kumar Vavilapalli" <vino...@apache.org> wrote: >Okay, tx for this clarification Chris! I dug more into this and now >realized the actual scope of this. Given the limited nature of this >feature (non-Namenode etc) and the WIP nature of the larger umbrella >HADOOP-11744, we will ship the feature but I’ll stop calling this out as >a notable feature. > >Thanks >+Vinod > > >> On Nov 25, 2015, at 12:04 PM, Chris Nauroth <cnaur...@hortonworks.com> >>wrote: >> >> Hi Vinod, >> >> The HDFS-8155 work is complete in branch-2 already, so feel free to >> include it in the roadmap. >> >> For those watching the thread that aren't familiar with HDFS-8155, I >>want >> to call out that it was a client-side change only. The WebHDFS client >>is >> capable of obtaining OAuth2 tokens and passing them along in its HTTP >> requests. The NameNode and DataNode server side currently do not have >>any >> support for OAuth2, so overall, this feature is only useful in some very >> unique deployment architectures right now. This is all discussed >> explicitly in documentation committed with HDFS-8155, but I wanted to >> prevent any mistaken assumptions for people only reading this thread. >> >> --Chris Nauroth >> >> >> >> >> On 11/25/15, 11:08 AM, "Vinod Kumar Vavilapalli" <vino...@apache.org> >> wrote: >> >>> This is the current state from the feedback I gathered. >>> - Support priorities across applications within the same queue >>>YARN-1963 >>> -- Can push as an alpha / beta feature per Sunil >>> - YARN-1197 Support changing resources of an allocated container: >>> -- Can push as an alpha/beta feature per Wangda >>> - YARN-3611 Support Docker Containers In LinuxContainerExecutor: Well >>> most of it anyways. >>> -- Can push as an alpha feature. >>> - YARN Timeline Service v1.5 - YARN-4233 >>> -- Should include per Li Lu >>> - YARN Timeline Service Next generation: YARN-2928 >>> -- Per analysis from Sangjin, drop this from 2.8. 
>>> >>> One open feature status >>> - HDFS-8155 Support OAuth2 in WebHDFS: Alpha / Early feature? >>> >>> Updated the Roadmap wiki with the same. >>> >>> Thanks >>> +Vinod >>> >>>> On Nov 13, 2015, at 12:12 PM, Sangjin Lee <sj...@apache.org> wrote: >>>> >>>> I reviewed the current state of the YARN-2928 changes regarding its >>>> impact >>>> if the timeline service v.2 is disabled. It does appear that there >>>>are a >>>> lot of things that still do get created and enabled unconditionally >>>> regardless of configuration. While this is understandable when we were >>>> working to implement the feature, this clearly needs to be cleaned up >>>>so >>>> that when disabled the timeline service v.2 doesn't impact other >>>>things. >>>> >>>> I filed a JIRA for that work: >>>> https://issues.apache.org/jira/browse/YARN-4356 >>>> >>>> We need to complete it before we can merge. >>>> >>>> Somewhat related is the status of the configuration and what it means >>>>in >>>> various contexts (client/app-side vs. server-side, v.1 vs. v.2, >>>>etc.). I >>>> know there is an ongoing discussion regarding YARN-4183. We'll need to >>>> reflect the outcome of that discussion. >>>> >>>> My overall impression of whether this can be done for 2.8 is that it >>>> looks >>>> rather challenging given the suggested timeframe. We also need to >>>> complete >>>> several major tasks before it is ready. >>>> >>>> Sangjin >>>> >>>> >>>> On Wed, Nov 11, 2015 at 5:49 PM, Sangjin Lee <sjl...@gmail.com> wrote: >>>> >>>>> >>>>> On Wed, Nov 11, 2015 at 12:13 PM, Vinod Vavilapalli < >>>>> vino...@hortonworks.com> wrote: >>>>> >>>>>> -- YARN Timeline Service Next generation: YARN-2928: Lots of >>>>>> momentum, >>>>>> but clearly a work in progress. Two options here >>>>>> -- If it is safe to ship it into 2.8 in a disabled manner, we >>>>>>can >>>>>> get the early code into trunk and all the way into 2.8. 
>>>>>> -- If it is not safe, it organically rolls over into 2.9 >>>>>> >>>>> >>>>> I'll review the changes on YARN-2928 to see what impact it has (if >>>>> any) if >>>>> the timeline service v.2 is disabled. >>>>> >>>>> Another condition for it to make 2.8 is whether the branch will be >>>>>in a >>>>> shape in a couple of weeks such that it adds value for folks that >>>>>want >>>>> to >>>>> test it. Hopefully it will become clearer soon. >>>>> >>>>> Sangjin >>>>> >>> >> >> > >
Re: [DISCUSS] Looking to a 2.8.0 release
Hi Vinod, The HDFS-8155 work is complete in branch-2 already, so feel free to include it in the roadmap. For those watching the thread that aren't familiar with HDFS-8155, I want to call out that it was a client-side change only. The WebHDFS client is capable of obtaining OAuth2 tokens and passing them along in its HTTP requests. The NameNode and DataNode server side currently do not have any support for OAuth2, so overall, this feature is only useful in some very unique deployment architectures right now. This is all discussed explicitly in documentation committed with HDFS-8155, but I wanted to prevent any mistaken assumptions for people only reading this thread. --Chris Nauroth On 11/25/15, 11:08 AM, "Vinod Kumar Vavilapalli" <vino...@apache.org> wrote: >This is the current state from the feedback I gathered. > - Support priorities across applications within the same queue YARN-1963 >-- Can push as an alpha / beta feature per Sunil > - YARN-1197 Support changing resources of an allocated container: >-- Can push as an alpha/beta feature per Wangda > - YARN-3611 Support Docker Containers In LinuxContainerExecutor: Well >most of it anyways. >-- Can push as an alpha feature. > - YARN Timeline Service v1.5 - YARN-4233 >-- Should include per Li Lu > - YARN Timeline Service Next generation: YARN-2928 >-- Per analysis from Sangjin, drop this from 2.8. > >One open feature status > - HDFS-8155 Support OAuth2 in WebHDFS: Alpha / Early feature? > >Updated the Roadmap wiki with the same. > >Thanks >+Vinod > >> On Nov 13, 2015, at 12:12 PM, Sangjin Lee <sj...@apache.org> wrote: >> >> I reviewed the current state of the YARN-2928 changes regarding its >>impact >> if the timeline service v.2 is disabled. It does appear that there are a >> lot of things that still do get created and enabled unconditionally >> regardless of configuration. 
While this is understandable when we were >> working to implement the feature, this clearly needs to be cleaned up so >> that when disabled the timeline service v.2 doesn't impact other things. >> >> I filed a JIRA for that work: >> https://issues.apache.org/jira/browse/YARN-4356 >> >> We need to complete it before we can merge. >> >> Somewhat related is the status of the configuration and what it means in >> various contexts (client/app-side vs. server-side, v.1 vs. v.2, etc.). I >> know there is an ongoing discussion regarding YARN-4183. We'll need to >> reflect the outcome of that discussion. >> >> My overall impression of whether this can be done for 2.8 is that it >>looks >> rather challenging given the suggested timeframe. We also need to >>complete >> several major tasks before it is ready. >> >> Sangjin >> >> >> On Wed, Nov 11, 2015 at 5:49 PM, Sangjin Lee <sjl...@gmail.com> wrote: >> >>> >>> On Wed, Nov 11, 2015 at 12:13 PM, Vinod Vavilapalli < >>> vino...@hortonworks.com> wrote: >>> >>>>-- YARN Timeline Service Next generation: YARN-2928: Lots of >>>>momentum, >>>> but clearly a work in progress. Two options here >>>>-- If it is safe to ship it into 2.8 in a disabled manner, we can >>>> get the early code into trunk and all the way into 2.8. >>>>-- If it is not safe, it organically rolls over into 2.9 >>>> >>> >>> I'll review the changes on YARN-2928 to see what impact it has (if >>>any) if >>> the timeline service v.2 is disabled. >>> >>> Another condition for it to make 2.8 is whether the branch will be in a >>> shape in a couple of weeks such that it adds value for folks that want >>>to >>> test it. Hopefully it will become clearer soon. >>> >>> Sangjin >>> >
Re: Hadoop 2.6.1 Release process thread
The HADOOP-10786 patch is compatible with JDK 6. This was a point of discussion during the original development of the patch. If you'd like full details, please see the comments there. Like Akira, I also confirmed that the new test works correctly when running with JDK 6. Thanks! --Chris Nauroth On 8/14/15, 9:22 AM, Akira AJISAKA ajisa...@oss.nttdata.co.jp wrote: Good point. I ran the regression test in HADOOP-10786 successfully on ajisakaa/common-merge branch with JDK6. I'll run all the unit tests against JDK6 locally after merging all the jiras. Thanks, Akira On 8/14/15 23:21, Allen Wittenauer wrote: I hope someone tests this against JDK6, otherwise this is an incompatible change. On Aug 12, 2015, at 2:21 PM, Chris Nauroth cnaur...@hortonworks.com wrote: I've just applied the 2.6.1-candidate label to HADOOP-10786. Since this is somewhat late in the process, I thought I'd better follow up over email too. This bug was originally reported with JDK 8. A code change in JDK 8 broke our automatic relogin from a Kerberos keytab, and we needed to change UserGroupInformation to fix it. Just today I discovered that the JDK code change has made it into the JDK 7 code line too. Specifically, I can repro the bug against OpenJDK 1.7.0_85. Since many users wouldn't expect a minor version upgrade of the JDK to cause such a severe problem, I think HADOOP-10786 is justified for inclusion in a patch release. --Chris Nauroth On 8/11/15, 7:48 PM, Sangjin Lee sjl...@gmail.com wrote: It might have been because we thought that HDFS-7704 was going to make it. It's both make it or neither does. Now that we know HDFS-7704 is out, HDFS-7916 should definitely be out. I hope that clarifies. On Tue, Aug 11, 2015 at 6:26 PM, Vinod Kumar Vavilapalli vino...@hortonworks.com wrote: I earlier removed HDFS-7916 from the list given HDFS-7704 was only in 2.7.0. Chris Trezzo added it back and so it appeared in my lists. I removed it again, Chris, please comment on why you added it back. 
If you want it included, please comment here and we can add it after we figure out the why and the dependent tickets. Thanks +Vinod On Aug 11, 2015, at 4:37 PM, Sangjin Lee sjl...@gmail.com wrote: Could you double check HDFS-7916? HDFS-7916 is needed only if HDFS-7704 makes it. However, I see commits for HDFS-7916 in this list, but not for HDFS-7704. If HDFS-7704 is not in the list, we should not backport HDFS-7916 as it fixes an issue introduced by HDFS-7704. On Tue, Aug 11, 2015 at 4:10 PM, Vinod Kumar Vavilapalli vino...@hortonworks.com wrote: Put the list here: https://wiki.apache.org/hadoop/Release-2.6.1-Working-Notes. And started figuring out ways to fast-path the cherry-picks. Thanks +Vinod On Aug 11, 2015, at 1:15 PM, Vinod Kumar Vavilapalli vino...@apache.org wrote: (2) With Wangda's help offline, I prepared an ordered list of cherry-pick commits that we can do from our candidate list [1], will do some ground work today.
Re: Hadoop 2.6.1 Release process thread
I've just applied the 2.6.1-candidate label to HADOOP-10786. Since this is somewhat late in the process, I thought I'd better follow up over email too. This bug was originally reported with JDK 8. A code change in JDK 8 broke our automatic relogin from a Kerberos keytab, and we needed to change UserGroupInformation to fix it. Just today I discovered that the JDK code change has made it into the JDK 7 code line too. Specifically, I can repro the bug against OpenJDK 1.7.0_85. Since many users wouldn't expect a minor version upgrade of the JDK to cause such a severe problem, I think HADOOP-10786 is justified for inclusion in a patch release. --Chris Nauroth On 8/11/15, 7:48 PM, Sangjin Lee sjl...@gmail.com wrote: It might have been because we thought that HDFS-7704 was going to make it. It's both make it or neither does. Now that we know HDFS-7704 is out, HDFS-7916 should definitely be out. I hope that clarifies. On Tue, Aug 11, 2015 at 6:26 PM, Vinod Kumar Vavilapalli vino...@hortonworks.com wrote: I earlier removed HDFS-7916 from the list given HDFS-7704 was only in 2.7.0. Chris Trezzo added it back and so it appeared in my lists. I removed it again, Chris, please comment on why you added it back. If you want it included, please comment here and we can add it after we figure out the why and the dependent tickets. Thanks +Vinod On Aug 11, 2015, at 4:37 PM, Sangjin Lee sjl...@gmail.com wrote: Could you double check HDFS-7916? HDFS-7916 is needed only if HDFS-7704 makes it. However, I see commits for HDFS-7916 in this list, but not for HDFS-7704. If HDFS-7704 is not in the list, we should not backport HDFS-7916 as it fixes an issue introduced by HDFS-7704. On Tue, Aug 11, 2015 at 4:10 PM, Vinod Kumar Vavilapalli vino...@hortonworks.com wrote: Put the list here: https://wiki.apache.org/hadoop/Release-2.6.1-Working-Notes. And started figuring out ways to fast-path the cherry-picks. 
Thanks +Vinod On Aug 11, 2015, at 1:15 PM, Vinod Kumar Vavilapalli vino...@apache.org mailto:vino...@apache.org wrote: (2) With Wangda's help offline, I prepared an ordered list of cherry-pick commits that we can do from our candidate list [1], will do some ground work today.
Re: 2.7.2 release plan
I'd be comfortable with inclusion of any doc-only patch in minor releases. There is a lot of value to end users in pushing documentation fixes as quickly as possible, and they don't bear the same risk of regressions or incompatibilities as code changes. --Chris Nauroth On 7/16/15, 12:38 AM, Tsuyoshi Ozawa oz...@apache.org wrote: Hi, thank you for starting the discussion about 2.7.2 release. The focus obviously is to have blocker issues [2], bug-fixes and *no* features / improvements. I've committed YARN-3170, which is an improvement of documentation. I thought documentation pages which can be fit into branch-2.7 can be included easily. Should I revert it? I need help from all committers in automatically merging in any patch that fits the above criterion into 2.7.2 instead of only on trunk or 2.8. Sure, I'll try my best. That way we can include not only blocker but also critical bug fixes to 2.7.2 release. As Vinod mentioned, we should also apply major bug fixes into branch-2.7. Thanks, - Tsuyoshi On Thu, Jul 16, 2015 at 3:52 PM, Akira AJISAKA ajisa...@oss.nttdata.co.jp wrote: Thanks Vinod for starting 2.7.2 release plan. The focus obviously is to have blocker issues [2], bug-fixes and *no* features / improvements. Can we adopt the plan as Karthik mentioned in Additional maintenance releases for Hadoop 2.y versions thread? That way we can include not only blocker but also critical bug fixes to 2.7.2 release. In addition, branch-2.7 is a special case. (2.7.1 is the first stable release) Therefore I'm thinking we can include major bug fixes as well. Regards, Akira On 7/16/15 04:13, Vinod Kumar Vavilapalli wrote: Hi all, Thanks everyone for the push on 2.7.1! Branch-2.7 is now open for commits to a 2.7.2 release. JIRA also now has a 2.7.2 version for all the sub-projects. Continuing the previous 2.7.1 thread on steady maintenance releases [1], we should follow up 2.7.1 with a 2.7.2 within 4 weeks. 
Earlier I tried a 2-3 week cycle for 2.7.1, but it seems to be impractical given the community size. So, I propose we target a release at the end of 4 weeks from now, starting the release close-down within 2-3 weeks. The focus obviously is to have blocker issues [2], bug-fixes and *no* features / improvements. I need help from all committers in automatically merging in any patch that fits the above criterion into 2.7.2 instead of only on trunk or 2.8. Thoughts? Thanks, +Vinod [1] A 2.7.1 release to follow up 2.7.0: http://markmail.org/message/zwzze6cqqgwq4rmw [2] 2.7.2 release blockers: https://issues.apache.org/jira/issues/?filter=12332867
Re: IMPORTANT: automatic changelog creation
+1 Thank you to Allen for the script, and thank you to Andrew for volunteering to drive the conversion. --Chris Nauroth On 7/2/15, 2:01 PM, Andrew Wang andrew.w...@cloudera.com wrote: Hi all, I want to revive the discussion on this thread, since the overhead of CHANGES.txt came up again in the context of backporting fixes for maintenance releases. Allen's automatic generation script (HADOOP-11731) went into trunk but not branch-2, so we're still maintaining CHANGES.txt everywhere. What do people think about backporting this to branch-2 and then removing CHANGES.txt from trunk/branch-2 (HADOOP-11792)? Based on discussion on this thread and in HADOOP-11731, we seem to agree that CHANGES.txt is an unreliable source of information, and JIRA is at least as reliable and probably much more so. Thus I don't see any downsides to backporting it. Would like to hear everyone's thoughts on this; I'm willing to drive the effort. Thanks, Andrew On Thu, Apr 2, 2015 at 2:00 PM, Tsz Wo Sze szets...@yahoo.com.invalid wrote: Generating the change log from JIRA is a good idea. It is based on the assumption that each JIRA has an accurate summary (a.k.a. JIRA title) reflecting the committed change. Unfortunately, the assumption is invalid in many cases, since we never enforce that the JIRA summary must be the same as the change log entry. We may compare the current CHANGES.txt with the generated change log. I bet the diff is long. Besides, after a release R1 is out, someone may (accidentally or intentionally) modify the JIRA summary. Then, the entry for the same item in a later release R2 could be different from the one in R1. I agree that manually editing CHANGES.txt is not a perfect solution. However, it has worked well in the past for many releases. I suggest we keep the current dev workflow, and try using the new script provided by HADOOP-11731 to generate the next release. If everything works well, we shall remove CHANGES.txt and revise the dev workflow. What do you think?
Regards, Tsz-Wo On Thursday, April 2, 2015 12:57 PM, Allen Wittenauer a...@altiscale.com wrote: On Apr 2, 2015, at 12:40 PM, Vinod Kumar Vavilapalli vino...@hortonworks.com wrote: We'd then be doing two commits for every patch. Let's simply not remove CHANGES.txt from trunk, keep the existing dev workflow, but doc the release process to remove CHANGES.txt in trunk at the time of a release going out of trunk. Might as well copy branch-2's changes.txt into trunk then. (Or 2.7's. Last I looked, people updated branch-2 and not 2.7's, or vice versa, for some patches that went into both branches.) So that folks who are committing to both branches and want to cherry pick all changes can. I mean, trunk's is very very very wrong. Right now. Today. Borderline useless. See HADOOP-11718 (which I will now close out as won't fix)... and that jira is only what is miscategorized, not what is missing.
[jira] [Created] (YARN-3834) Scrub debug logging of tokens during resource localization.
Chris Nauroth created YARN-3834: --- Summary: Scrub debug logging of tokens during resource localization. Key: YARN-3834 URL: https://issues.apache.org/jira/browse/YARN-3834 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.7.1 Reporter: Chris Nauroth Assignee: Chris Nauroth During resource localization, the NodeManager logs tokens at debug level to aid troubleshooting. This includes the full token representation. Best practice is to avoid logging anything secret, even at debug level. We can improve on this by changing the logging to use a scrubbed representation of the token. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
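The fix the JIRA proposes is to log a scrubbed representation of the token instead of its full contents. A minimal sketch of that pattern follows; the class and method names here are hypothetical illustrations, not Hadoop's actual org.apache.hadoop.security.token.Token API:

```java
// Illustrative sketch only: build a log-safe string form of a token for
// debug-level logging. The field names below are assumptions for the
// example, not the real Hadoop token representation.
public class TokenLogScrubber {

    // Keep the non-secret metadata (kind, service) and fully mask the
    // secret password bytes, reporting only their length.
    static String scrubbedForLog(String kind, String service, byte[] password) {
        return "Token { kind: " + kind
                + ", service: " + service
                + ", password: <redacted " + password.length + " bytes> }";
    }

    public static void main(String[] args) {
        // The secret bytes never reach the log line.
        String line = scrubbedForLog("HDFS_DELEGATION_TOKEN",
                "nn1.example.com:8020", new byte[] {42, 7, 99});
        System.out.println(line);
    }
}
```

The key design point is that the scrubbed form still carries enough metadata (kind and service) to correlate a localization failure with a specific token during troubleshooting.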
F 6/19: Jenkins clogged up
Hi everyone, I was just in contact with Apache infrastructure. Jenkins wasn't running jobs for a while, so there is a large backlog in the queue now (over 200 jobs). Infra has fixed the problems, so jobs are running now, but our test-patch runs might have to sit in the queue a long time today. --Chris Nauroth
Reminder: Apache committers have access to a free MSDN license
If you are a committer on any Apache project (not just Hadoop), then you have access to a free MSDN license. The details are described here: https://svn.apache.org/repos/private/committers/donated-licenses/msdn-license-grants.txt You'll need to authenticate with your Apache credentials. This means that all Hadoop committers, and a large number of contributors who are also committers on other Apache projects, are empowered to review and test patches on Windows. After getting the free MSDN license, you can download the installation iso for Windows Server 2008 or 2010 and run it in a VirtualBox VM (or your hypervisor of choice). Instructions for setting up a Windows development environment have been in BUILDING.txt for a few years. This would prevent situations where patches are blocked from getting committed while waiting for me or any other individual to test. --Chris Nauroth
Re: 2.7.1 status
Thanks, Larry. I have marked HADOOP-11934 as a blocker for 2.7.1. I have reviewed and +1'd it. I can commit it after we get feedback from Jenkins. --Chris Nauroth On 5/26/15, 12:41 PM, larry mccay lmc...@apache.org wrote: Hi Vinod - I think that https://issues.apache.org/jira/browse/HADOOP-11934 should also be added to the blocker list. This is a critical bug in our ability to protect the LDAP connection password in LdapGroupsMapper. thanks! --larry On Tue, May 26, 2015 at 3:32 PM, Vinod Kumar Vavilapalli vino...@hortonworks.com wrote: Tx for reporting this, Elliot. Made it a blocker, not with a deeper understanding of the problem. Can you please chime in with your opinion and perhaps code reviews? Thanks +Vinod On May 26, 2015, at 10:48 AM, Elliott Clark ecl...@apache.org wrote: HADOOP-12001 should probably be added to the blocker list since it's a regression that can keep ldap from working.
Re: [DISCUSS] branch-1
I think it would be fine to auto-close most remaining branch-1 issues even if the branch is still formally considered alive. I don't expect us to create a new 1.x release unless a security vulnerability or critical bug forces it. Closing all non-critical issues would match the reality that no one is actively developing for the branch, but there would still be the option of filing new critical bugs if someone decides that they want a new 1.x release. --Chris Nauroth On 5/8/15, 10:50 AM, Karthik Kambatla ka...@cloudera.com wrote: I would be -1 to declaring the branch dead just yet. There have been 7 commits to that branch this year. I know this isn't comparable to trunk or branch-2, but it is not negligible either. I propose we come up with a policy for deprecating past major release branches. Maybe something along the lines of: deprecate branch-x when release x+3.0.0 goes GA? On Fri, May 8, 2015 at 10:41 AM, Allen Wittenauer a...@altiscale.com wrote: May we declare this branch dead and just close bugs (but not necessarily concepts, ideas, etc) with won't fix? I don't think anyone has any intention of working on the 1.3 release, especially given that 1.2.1 was Aug 2013... I guess we need a PMC member to declare a vote or whatever... -- Karthik Kambatla Software Engineer, Cloudera Inc. http://five.sentenc.es
Re: [DISCUSS] Developing features in branches
I also recommend frequent merges with trunk, ramping up to at least daily as the work on the branch winds down. The sooner you know about a merge conflict, the easier it is to trace back to the commit that caused it and untangle things. --Chris Nauroth On 5/1/15, 2:58 AM, Steve Loughran ste...@hortonworks.com wrote: people are already doing work in branches, on private github repos, with their own personal review/commit policy. Branches in the Apache codebase would permit more structured collaboration between committers. But to get the same review as a final patch, they probably need supervision as they go along. The strengths of doing things in branches are: -no half-complete patches, either during the work or otherwise -one person's work-in-progress doesn't break other people's work. The weaknesses? -the longer-lived the branch, the harder the merge. You can reduce the impact through rebasing the branch, which you can't do on shared branches, or simply through regular merges (which complicates the graph). -there's potentially more of a tendency to accept a long-lived branch without enough final review, on the basis that the work has been ongoing for longer. whatever: it works for HDFS On 1 May 2015, at 05:14, Bikas Saha bi...@hortonworks.com wrote: In other words, the best solution is careful up-front design and break-up of changes so that they can be made incrementally. At that point working on master is not much different than working in a branch. However, if that allows for a set of changes to be made inside a branch in a non-disruptive manner, and people want the extra work of maintaining a branch, then that choice could be made. E.g. there could be parts which are less thought out and more risky. Those parts could be split out via preparatory patches to master, and then the isolated riskier changes done in a branch. This would make sense when the riskier change is worth 5-10 jiras or more. I.e.
The work is substantial enough that it needs multiple jiras over multiple weeks to get to completion. So avoiding master is beneficial. That's the way I would think about it. -Original Message- From: Chris Nauroth [mailto:cnaur...@hortonworks.com] Sent: Thursday, April 30, 2015 3:46 PM To: yarn-dev@hadoop.apache.org Subject: Re: [DISCUSS] Developing features in branches In HDFS, our recent feature branches tried to keep large portions of their new code in new classes (e.g. org.apache.hadoop.hdfs.server.namenode.CacheManager) or even new Java packages (e.g. org.apache.hadoop.hdfs.server.namenode.snapshot). We tried to make minimal changes in existing code: just enough to hook into the new code. If hooking into the new code isn't easy for some reason, then sometimes you can submit a non-impactful refactoring patch to trunk to help make it easier. By submitting straightforward refactorings to trunk first, you can reduce some of the difficulty of reviewing a large consolidated patch at merge time. Reviewers can focus on the new logic. This tends to minimize the impact of merge conflicts coming from either trunk or a sibling feature branch. This is only possible if it's a logically distinct new feature and this kind of code organization makes sense for that feature, but it's something to keep in mind. --Chris Nauroth On 4/30/15, 3:23 PM, Zhijie Shen zs...@hortonworks.com wrote: Exactly. Branch development is good, but I'm concerned about too many concurrent branches. In terms of code management, good branch development candidates would be those like registry, shared cache and timeline service. Most of their changes are incremental code in some new sub-module, are less likely to conflict with trunk/branch-2, and are rarely depended on by other parallel development.
Thanks, Zhijie From: Bikas Saha bi...@hortonworks.com Sent: Thursday, April 30, 2015 12:52 PM To: yarn-dev@hadoop.apache.org Subject: RE: [DISCUSS] Developing features in branches I think what Zhijie is talking about is a little different. Work happening in parallel across 2 branches has no clue about the other since the branches don't get updates via master. If a bunch of these branches are merged close to a release, then there are likely to be a lot of surprises. As an example, let's say support for speculation and node labels were happening in separate branches. It is very likely that 50% of the code would conflict - not just in code but also in semantics. Bikas -Original Message- From: Ray Chiang [mailto:rchi...@cloudera.com] Sent: Thursday, April 30, 2015 10:35 AM To: yarn-dev@hadoop.apache.org Subject: Re: [DISCUSS] Developing features in branches Following up on Zhijie's comments, there's nothing to prevent periodically pulling updates from the main branch (e.g. branch-2 or trunk) into the feature branch, is there? Or cherry-picking some changes to alleviate conflict management during branch merging? I've
Re: [DISCUSS] Developing features in branches
In HDFS, our recent feature branches tried to keep large portions of their new code in new classes (e.g. org.apache.hadoop.hdfs.server.namenode.CacheManager) or even new Java packages (e.g. org.apache.hadoop.hdfs.server.namenode.snapshot). We tried to make minimal changes in existing code: just enough to hook into the new code. If hooking into the new code isn't easy for some reason, then sometimes you can submit a non-impactful refactoring patch to trunk to help make it easier. By submitting straightforward refactorings to trunk first, you can reduce some of the difficulty of reviewing a large consolidated patch at merge time. Reviewers can focus on the new logic. This tends to minimize the impact of merge conflicts coming from either trunk or a sibling feature branch. This is only possible if it's a logically distinct new feature and this kind of code organization makes sense for that feature, but it's something to keep in mind. --Chris Nauroth On 4/30/15, 3:23 PM, Zhijie Shen zs...@hortonworks.com wrote: Exactly. Branch development is good, but I'm concerned about too many concurrent branches. In terms of code management, good branch development candidates would be those like registry, shared cache and timeline service. Most of their changes are incremental code in some new sub-module, are less likely to conflict with trunk/branch-2, and are rarely depended on by other parallel development. Thanks, Zhijie From: Bikas Saha bi...@hortonworks.com Sent: Thursday, April 30, 2015 12:52 PM To: yarn-dev@hadoop.apache.org Subject: RE: [DISCUSS] Developing features in branches I think what Zhijie is talking about is a little different. Work happening in parallel across 2 branches has no clue about the other since the branches don't get updates via master. If a bunch of these branches are merged close to a release, then there are likely to be a lot of surprises. As an example, let's say support for speculation and node labels were happening in separate branches.
It is very likely that 50% of the code would conflict - not just in code but also in semantics. Bikas -Original Message- From: Ray Chiang [mailto:rchi...@cloudera.com] Sent: Thursday, April 30, 2015 10:35 AM To: yarn-dev@hadoop.apache.org Subject: Re: [DISCUSS] Developing features in branches Following up on Zhijie's comments, there's nothing to prevent periodically pulling updates from the main branch (e.g. branch-2 or trunk) into the feature branch, is there? Or cherry-picking some changes to alleviate conflict management during branch merging? I've seen other projects use one of the two techniques above. -Ray On Wed, Apr 29, 2015 at 9:43 PM, Zhijie Shen zs...@hortonworks.com wrote: My 2 cents: Branch maintenance cost should be fine if we have few features to be developed in branches. However, if there are too many, each branch may be blind to most of the latest code changes from the others, and trunk/branch-2 becomes stale. That said, with the increasing adoption of branch development, the cost of merging each branch back is likely to increase. Some features may last more than one release, such as RM restart before and the timeline service now. Even if it's developed in a branch, we may want to merge its milestones (such as phase 1 and phase 2) back to trunk/branch-2 to align with some release before it's completely done. Moreover, my experience is that the longer a feature stays in the branch, the more conflicts we have to merge. Hence, it may not be a good idea to hold a feature in the branch too long before merging it back. Thanks, Zhijie From: Subramaniam V K subru...@gmail.com Sent: Wednesday, April 29, 2015 7:16 PM To: yarn-dev@hadoop.apache.org Subject: Re: [DISCUSS] Developing features in branches Karthik, thanks for starting the thread. Here's my $0.02 based on the experience of working on a feature branch while adding reservations (YARN-1051). Overall a +1 for the approach.
The couple of pain points we faced were: 1) Merge cost with trunk 2) Lack of CI in the feature branch. The migration to git, keeping the feature branch in continuous sync with trunk, mitigated (1), and with Allen's new test-patch.sh addressing (2), branches for features, especially if used for all major features, seem like an excellent choice. -Subru On Tue, Apr 28, 2015 at 5:47 PM, Sangjin Lee sjl...@gmail.com wrote: Ah, I missed that part (obviously). Fantastic! On Tue, Apr 28, 2015 at 5:31 PM, Sean Busbey bus...@cloudera.com wrote: On Apr 28, 2015 5:59 PM, Sangjin Lee sjl...@gmail.com wrote: That said, in a way we're deferring the cost of cleaning things up towards the end of the branch. For example, we don't get the same treatment of the hadoop jenkins in a branch development. It's left up to the group or the individuals to make sure to run test-patch.sh to ensure tech
Re: Planning Hadoop 2.6.1 release
Thank you, Arpit. In addition, I suggest we include the following:

HADOOP-11333. Fix deadlock in DomainSocketWatcher when the notification pipe is full
HADOOP-11604. Prevent ConcurrentModificationException while closing domain sockets during shutdown of DomainSocketWatcher thread.
HADOOP-11648. Set DomainSocketWatcher thread name explicitly
HADOOP-11802. DomainSocketWatcher thread terminates sometimes after there is an I/O error during requestShortCircuitShm

HADOOP-11604 and 11648 are not critical by themselves, but they are prerequisites to getting a clean cherry-pick of 11802, which we believe finally fixes the root cause of this issue. --Chris Nauroth On 4/30/15, 3:55 PM, Arpit Agarwal aagar...@hortonworks.com wrote: HDFS candidates for back-porting to Hadoop 2.6.1. The first two were requested in [1].

HADOOP-11674. oneByteBuf in CryptoInputStream and CryptoOutputStream should be non static
HADOOP-11710. Make CryptoOutputStream behave like DFSOutputStream wrt synchronization
HDFS-7009. Active NN and standby NN have different live nodes.
HDFS-7035. Make adding a new data directory to the DataNode an atomic operation and improve error handling
HDFS-7425. NameNode block deletion logging uses incorrect appender.
HDFS-7443. Datanode upgrade to BLOCKID_BASED_LAYOUT fails if duplicate block files are present in the same volume.
HDFS-7489. Incorrect locking in FsVolumeList#checkDirs can hang datanodes
HDFS-7503. Namenode restart after large deletions can cause slow processReport.
HDFS-7575. Upgrade should generate a unique storage ID for each volume.
HDFS-7579. Improve log reporting during block report rpc failure.
HDFS-7587. Edit log corruption can happen if append fails with a quota violation.
HDFS-7596. NameNode should prune dead storages from storageMap.
HDFS-7611. deleteSnapshot and delete of a file can leave orphaned blocks in the blocksMap on NameNode restart.
HDFS-7714. Simultaneous restart of HA NameNodes and DataNode can cause DataNode to register successfully with only one NameNode.
HDFS-7733. NFS: readdir/readdirplus return null directory attribute on failure.
HDFS-7831. Fix the starting index and end condition of the loop in FileDiffList.findEarlierSnapshotBlocks().
HDFS-7885. Datanode should not trust the generation stamp provided by client.
HDFS-7960. The full block report should prune zombie storages even if they're not empty.
HDFS-8072. Reserved RBW space is not released if client terminates while writing block.
HDFS-8127. NameNode Failover during HA upgrade can cause DataNode to finalize upgrade.

Arpit [1] Will Hadoop 2.6.1 be released soon? http://markmail.org/thread/zlsr6prejyogdyvh On 4/27/15, 11:47 AM, Vinod Kumar Vavilapalli vino...@apache.org wrote: There were several requests on the user lists [1] for a 2.6.1 release. I got many offline comments too. Planning to do a 2.6.1 release in a few weeks time. We already have a bunch of tickets committed to 2.7.1. I created a filter [2] to track pending tickets. We need to collectively come up with a list of critical issues. We can use the JIRA Target Version field for the same. I see some but not a whole lot of new work for this release; most of it is likely going to be pulling in critical patches from 2.7.1/2.8 etc. Thoughts? Thanks +Vinod [1] Will Hadoop 2.6.1 be released soon? http://markmail.org/thread/zlsr6prejyogdyvh [2] 2.6.1 pending tickets: https://issues.apache.org/jira/issues/?filter=12331711
[jira] [Resolved] (YARN-3524) Mapreduce failed due to AM Container-Launch failure at NM on windows
[ https://issues.apache.org/jira/browse/YARN-3524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth resolved YARN-3524. - Resolution: Not A Problem Hello [~KaveenBigdata]. Nice debugging! The native components for Hadoop on Windows are built using either Windows SDK 7.1 or Visual Studio 2010. Because of this, there is a runtime dependency on the C++ 2010 runtime dll, which is MSVCR100.dll. You are correct that the fix in this case is to install the missing dll. I believe this is the official download location: https://www.microsoft.com/en-us/download/details.aspx?id=13523 Since this does not represent a bug in the Hadoop codebase, I'm resolving this issue as Not a Problem. Mapreduce failed due to AM Container-Launch failure at NM on windows Key: YARN-3524 URL: https://issues.apache.org/jira/browse/YARN-3524 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.5.2 Environment: Windows Server 2012 and Windows 8 Hadoop-2.5.2 Java-1.7 Reporter: Kaveen Raajan I tried to run a Tez job on a Windows machine. I successfully built Tez-0.6.0 against Hadoop-2.5.2, then configured Tez-0.6.0 as described at http://tez.apache.org/install.html, but I faced the following error while running this command. Note: I'm using a Hadoop High Availability setup.
{code}
Running OrderedWordCount
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/C:/Hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/C:/Tez/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
15/04/15 10:47:57 INFO client.TezClient: Tez Client Version: [ component=tez-api, version=0.6.0, revision=${buildNumber}, SCM-URL=scm:git:https://git-wip-us.apache.org/repos/asf/tez.git, buildTime=2015-04-15T01:13:02Z ]
15/04/15 10:48:00 INFO client.TezClient: Submitting DAG application with id: application_1429073725727_0005
15/04/15 10:48:00 INFO Configuration.deprecation: fs.default.name is deprecated. Instead, use fs.defaultFS
15/04/15 10:48:00 INFO client.TezClientUtils: Using tez.lib.uris value from configuration: hdfs://HACluster/apps/Tez/,hdfs://HACluster/apps/Tez/lib/
15/04/15 10:48:01 INFO client.TezClient: Stage directory /tmp/app/tez/staging doesn't exist and is created
15/04/15 10:48:01 INFO client.TezClient: Tez system stage directory hdfs://HACluster/tmp/app/tez/staging/.tez/application_1429073725727_0005 doesn't exist and is created
15/04/15 10:48:02 INFO client.TezClient: Submitting DAG to YARN, applicationId=application_1429073725727_0005, dagName=OrderedWordCount
15/04/15 10:48:03 INFO impl.YarnClientImpl: Submitted application application_1429073725727_0005
15/04/15 10:48:03 INFO client.TezClient: The url to track the Tez AM: http://MASTER_NN1:8088/proxy/application_1429073725727_0005/
15/04/15 10:48:03 INFO client.DAGClientImpl: Waiting for DAG to start running
15/04/15 10:48:09 INFO client.DAGClientImpl: DAG completed. FinalState=FAILED
OrderedWordCount failed with diagnostics: [Application application_1429073725727_0005 failed 2 times due to AM Container for appattempt_1429073725727_0005_02 exited with exitCode: -1073741515 due to: Exception from container-launch: ExitCodeException exitCode=-1073741515:
ExitCodeException exitCode=-1073741515:
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
        at org.apache.hadoop.util.Shell.run(Shell.java:455)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
        at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:300)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
1 file(s) moved.
Container exited with a non-zero exit code -1073741515. Failing this attempt.. Failing the application.]
{code}
Looking at the ResourceManager log:
{code}
2015-04-19 21:49:57,533 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: completedContainer
Re: Set minimum version of Hadoop 3 to JDK 8
Suppose we configure maven-compiler-plugin with source set to 1.7 but target set to 1.8 in trunk. I believe this would have the effect of generating JDK 8 bytecode, but enforcing that our code sticks to JDK 7 compatibility at compile time. Does that still satisfy requirements for HADOOP-11858? I'd prefer to avoid executing duplicate builds for different JDK versions. Pre-commit already takes a long time, and I suspect doubling the amount of builds will make us starved for executors in Jenkins. Chris Nauroth Hortonworks http://hortonworks.com/ On 4/21/15, 8:38 PM, Sean Busbey bus...@cloudera.com wrote: A few options: * Only change the builds for master to use jdk8 * build with both jdk7 and jdk8 by copying jobs * build with both jdk7 and jdk8 using a jenkins matrix build Robert, if you'd like help with any of these please send me a ping off-list. On Tue, Apr 21, 2015 at 8:19 PM, Vinod Kumar Vavilapalli vino...@hortonworks.com wrote: We don't want JDK 8 only code going into branch-2 line. Moving Jenkins to 1.8 right-away will shield such code, how do we address that? Thanks, +Vinod On Apr 21, 2015, at 5:54 PM, Robert Kanter rkan...@cloudera.com wrote: Sure, I'll try to change the Jenkins builds to 1.8 first. On Tue, Apr 21, 2015 at 3:31 PM, Andrew Wang andrew.w...@cloudera.com wrote: Hey Robert, As a first step, could we try switching all our precommit and nightly builds over to use 1.8? This is a prerequisite for HADOOP-11858, and safe to do in any case since it'll still target 1.7. I'll note that HADOOP-10530 details the pain Steve went through switching us to JDK7. Might be some lessons learned about how to do this transition more smoothly. 
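For reference, the compiler setup Chris describes would look roughly like the following pom fragment. This is an illustrative sketch, not the actual Hadoop pom, which wires such values through Maven properties:

```xml
<!-- Illustrative fragment only: compile against Java 7 language rules
     while emitting JDK 8 bytecode. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-compiler-plugin</artifactId>
  <configuration>
    <source>1.7</source>
    <target>1.8</target>
  </configuration>
</plugin>
```

One caveat worth noting: -source only enforces language syntax, so calls into JDK 8-only library APIs would still compile when building on a JDK 8 toolchain unless the build also checks the API surface, for example with animal-sniffer or a JDK 7 bootclasspath.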
Thanks, Andrew On Tue, Apr 21, 2015 at 3:15 PM, Robert Kanter rkan...@cloudera.com wrote: + yarn-dev, hdfs-dev, mapred-dev On Tue, Apr 21, 2015 at 3:14 PM, Robert Kanter rkan...@cloudera.com wrote: Hi all, Moving forward on some of the discussions on Hadoop 3, I've created HADOOP-11858 to set the minimum version of Hadoop 3 to JDK 8. I just wanted to let everyone know in case there's some reason we shouldn't go ahead with this. thanks - Robert -- Sean
committing HADOOP-11746 test-patch improvements
I'd like to thank Allen Wittenauer for his work on HADOOP-11746 to rewrite test-patch.sh. There is a lot of nice new functionality in there. My favorite part is that some patches will execute much faster, so I expect this will make the project more efficient overall at moving patches through the pre-commit process. I have +1'd the patch, but since this is a tool that we all use frequently, I'd like to delay a day before committing. Please comment on the jira if you have any additional feedback. We're aiming to commit on Friday, 4/17. Chris Nauroth Hortonworks http://hortonworks.com/
Re: [VOTE] Release Apache Hadoop 2.7.0 RC0
+1 (binding)

- Downloaded source tarball and verified signature and checksum.
- Built from source, including native code on Linux and Windows.
- Ran a 3-node unsecured cluster.
- Tested various HDFS and YARN commands, including sample MapReduce jobs.
- Confirmed that SecondaryNameNode can take a checkpoint.
- HADOOP-9629: Tested integration with Azure storage as an alternative FileSystem.
- HADOOP-11394/11395/11396: Built site docs and confirmed presence of these fixes in the documentation for Hadoop Compatible File Systems.
- HDFS-7604: Confirmed presence of DataNode volume failure reporting in web UI and metrics.
- HDFS-7879: Ran nm -g libhdfs.so and dumpbin /exports hdfs.dll to confirm export of public API symbols in libhdfs.

Allen mentioned HDFS-8132, which reported a problem using JCarder on the build. Brahma, Todd and I have determined that the root cause is incompatibility of the JCarder build with Java 7 classes. (2.7.0 is our first release compiling as Java 7.) This is not a problem that needs to hold up the release. Vinod, thank you for putting together the release. Chris Nauroth Hortonworks http://hortonworks.com/ On 4/15/15, 7:07 AM, Mit Desai mitde...@yahoo-inc.com.INVALID wrote: +1 (non-binding)

- Downloaded and built the source.
- Deployed to a local cluster and ran sample jobs like sleepJob and Wordcount.
- Verified Signatures
- Verified the RM UI for correctness.

Thanks Vinod for taking the time and effort to drive this release. -Mit Desai On Wednesday, April 15, 2015 8:03 AM, Vinod Kumar Vavilapalli vino...@hortonworks.com wrote: Tx Pat. This is really interesting news! +Vinod On Apr 14, 2015, at 11:18 PM, Pat White patwh...@yahoo-inc.com.INVALID wrote: +1 non-binding Ran a performance comparison of 2.6.0 Latest with 2.7.0 RC, looks good, no regressions observed, most metrics are well within 5% tolerance (dfsio, sort, amrestart, gmv3) and some tests (scan, amscale, compression, shuffle) appear to have improvements.
Disclaimer, please note a JDK difference, 2.6.0 ran with jdk1.7 while 2.7.0 had jdk1.8, so some of the improvement may be from Java 1.8.0, that said, the 2.7.0 benchmarks compare well against current 2.6.0. Thanks. patw - Forwarded Message - From: Vinod Kumar Vavilapalli vino...@apache.org To: common-...@hadoop.apache.org; hdfs-...@hadoop.apache.org; yarn-dev@hadoop.apache.org; mapreduce-...@hadoop.apache.org Cc: vino...@apache.org Sent: Friday, April 10, 2015 6:44 PM Subject: [VOTE] Release Apache Hadoop 2.7.0 RC0 Hi all, I've created a release candidate RC0 for Apache Hadoop 2.7.0. The RC is available at: http://people.apache.org/~vinodkv/hadoop-2.7.0-RC0/ The RC tag in git is: release-2.7.0-RC0 The maven artifacts are available via repository.apache.org at https://repository.apache.org/content/repositories/orgapachehadoop-1017/ As discussed before - This release will only work with JDK 1.7 and above - I'd like to use this as a starting release for 2.7.x [1], depending on how it goes, get it stabilized and potentially use a 2.7.1 in a few weeks as the stable release. Please try the release and vote; the vote will run for the usual 5 days. Thanks, Vinod [1]: A 2.7.1 release to follow up 2.7.0 http://markmail.org/thread/zwzze6cqqgwq4rmw
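The "verified signature and checksum" step reported in these votes boils down to recomputing a digest of the downloaded artifact and comparing it against the published value (a full release check also verifies the GPG signature). A minimal, self-contained Java sketch; it digests the fixed NIST test vector "abc" instead of a real tarball so the expected value is known in advance:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public class ChecksumCheck {
    // Hex-encode a digest for comparison against a published checksum string.
    static String hex(byte[] digest) {
        StringBuilder sb = new StringBuilder();
        for (byte b : digest) sb.append(String.format("%02x", b));
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        // In a real verification these bytes would come from the release
        // tarball; "abc" is the standard SHA-256 test vector.
        byte[] data = "abc".getBytes(StandardCharsets.UTF_8);
        String actual = hex(MessageDigest.getInstance("SHA-256").digest(data));
        String expected =
            "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad";
        System.out.println(actual.equals(expected) ? "checksum OK" : "MISMATCH");
    }
}
```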
Re: A 2.7.1 release to follow up 2.7.0
+1, full agreement with both Vinod and Karthik. Thanks! Chris Nauroth Hortonworks http://hortonworks.com/ On 4/9/15, 12:07 PM, Karthik Kambatla ka...@cloudera.com wrote: Inline. On Thu, Apr 9, 2015 at 11:48 AM, Vinod Kumar Vavilapalli vino...@apache.org wrote: Hi all, I feel like we haven't done a great job of maintaining the previous 2.x releases. Seeing as how long the 2.7.0 release has taken, I am sure we will spend more time stabilizing it, fixing issues etc. I propose that we immediately follow up 2.7.0 with a 2.7.1 within 2-3 weeks. The focus obviously is to have blocker issues, bug-fixes and *no* features. +1. Having a 2.7.2/2.7.3 to continue stabilizing is also appealing. Would greatly help folks who upgrade to later releases for major bug fixes instead of the new and shiny features. Improvements are going to be slightly hard to reason about, but I propose limiting ourselves to very small improvements, if at all. I would avoid any improvements unless they are to fix severe regressions - performance or otherwise. I guess they become blockers in that case. So, yeah, I suggest no improvements at all. The other area of concern with the previous releases had been compatibility. With help from Li Lu, I got jdiff reinstated in branch-2 (though patches are not yet in), and did a pass. In the unavoidable event that we find incompatibilities with 2.7.0, we can fix those in 2.7.1 and promote that to be the stable release. Sounds reasonable. Thoughts? Thanks, +Vinod -- Karthik Kambatla Software Engineer, Cloudera Inc. http://five.sentenc.es
Re: Erratic Jenkins behavior
I'm pretty sure there is no guarantee of isolation on a shared .m2/repository directory for multiple concurrent Maven processes. I've had a theory for a while that one build running "mvn install" can overwrite the snapshot artifact that was just installed by another concurrent build. This can create bizarre problems, for example if a patch introduces a new class in hadoop-common and then references that class from hadoop-hdfs. I expect using completely separate work directories for .m2/repository, the patch directory, and the Jenkins workspace could resolve this. The typical cost for this kind of change is increased disk consumption and increased build time, since Maven would need to download dependencies fresh every time. Chris Nauroth Hortonworks http://hortonworks.com/ On 2/12/15, 2:00 PM, Colin P. McCabe cmcc...@apache.org wrote: We could potentially use different .m2 directories for each executor. I think this has been brought up in the past as well. I'm not sure how maven handles concurrent access to the .m2 directory... if it's not using flock or fcntl then it's not really safe. This might explain some of our missing class error issues. Colin On Tue, Feb 10, 2015 at 2:13 AM, Steve Loughran ste...@hortonworks.com wrote: Mvn is a dark mystery to us all. I wouldn't trust it not to pick up things from other builds if they ended up published to ~/.m2/repository during the process On 9 February 2015 at 19:29:06, Colin P. McCabe (cmcc...@apache.org) wrote: I'm sorry, I don't have any insight into this. With regard to HADOOP-11084, I thought that $BUILD_URL would be unique for each concurrent build, which would prevent build artifacts from getting mixed up between jobs. Based on the value of PATCHPROCESS that Kihwal posted, perhaps this is not the case? Perhaps someone can explain how this is supposed to work (I am a Jenkins newbie). 
regards, Colin On Thu, Feb 5, 2015 at 10:42 AM, Yongjun Zhang yzh...@cloudera.com wrote: Thanks Kihwal for bringing this up. Seems related to: https://issues.apache.org/jira/browse/HADOOP-11084 Hi Andrew/Arpit/Colin/Steve, you guys worked on this jira before, any insight about the issue Kihwal described? Thanks. --Yongjun On Thu, Feb 5, 2015 at 9:49 AM, Kihwal Lee kih...@yahoo-inc.com.invalid wrote: I am sure many of us have seen strange jenkins behavior out of the precommit builds. - build artifacts missing - serving build artifact belonging to another build. This also causes wrong precommit results to be posted on the bug. - etc. The latest one I saw is disappearance of the unit test stdout/stderr file during a build. After a successful run of unit tests, the file vanished, so the script could not cat it. It looked like another build process had deleted it, while this build was in progress. It might have something to do with the fact that the patch-dir is set like following: PATCHPROCESS=/home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/../patchprocess I don't have access to the jenkins build configs or the build machines, so I can't debug it further, but I think we need to take care of it sooner than later. Can any one help? Kihwal
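The per-executor isolation discussed in this thread is commonly achieved by pointing each build at a private Maven local repository. The fragment below is a hypothetical Jenkins shell-step configuration, not the script these builds actually used; `-Dmaven.repo.local` is the standard Maven property and `$WORKSPACE` is Jenkins' per-job workspace variable:

```shell
# Hypothetical Jenkins shell step: give each build its own local repository
# so a concurrent "mvn install" cannot overwrite this build's snapshot
# artifacts in a shared ~/.m2/repository.
mvn -Dmaven.repo.local="$WORKSPACE/.m2/repository" clean install
```

The trade-off is the one Chris notes above: more disk use and longer builds, since each workspace downloads its dependencies fresh.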
[jira] [Created] (YARN-3015) yarn classpath command should support same options as hadoop classpath.
Chris Nauroth created YARN-3015: --- Summary: yarn classpath command should support same options as hadoop classpath. Key: YARN-3015 URL: https://issues.apache.org/jira/browse/YARN-3015 Project: Hadoop YARN Issue Type: Bug Components: scripts Reporter: Chris Nauroth Priority: Minor HADOOP-10903 enhanced the {{hadoop classpath}} command to support optional expansion of the wildcards and bundling the classpath into a jar file containing a manifest with the Class-Path attribute. The other classpath commands should do the same for consistency. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
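The HADOOP-10903 behavior referenced in this issue (bundling the classpath into a jar whose manifest carries a Class-Path attribute) can be sketched with the standard java.util.jar API. This is an illustration of the manifest trick, not Hadoop's actual implementation, and the entry names are made up:

```java
import java.io.File;
import java.io.FileOutputStream;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

public class ClasspathJar {
    public static void main(String[] args) throws Exception {
        // A long classpath collapsed into one manifest attribute; Class-Path
        // entries are space-separated relative URLs. This avoids OS limits on
        // command-line length (the original motivation on Windows).
        String classPath = "lib/a.jar lib/b.jar lib/c.jar";

        Manifest mf = new Manifest();
        Attributes attrs = mf.getMainAttributes();
        attrs.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        attrs.put(Attributes.Name.CLASS_PATH, classPath);

        File jar = File.createTempFile("classpath", ".jar");
        // The jar needs no entries: the manifest itself is the payload.
        try (JarOutputStream out =
                 new JarOutputStream(new FileOutputStream(jar), mf)) {
        }

        // Read it back to confirm the attribute survived.
        try (JarFile jf = new JarFile(jar)) {
            System.out.println("Class-Path: "
                + jf.getManifest().getMainAttributes()
                     .get(Attributes.Name.CLASS_PATH));
        }
        jar.delete();
    }
}
```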
Re: [VOTE] Release Apache Hadoop 2.6.0
+1 (binding) - Verified checksums and signatures for source and binary tarballs. - Started a pseudo-distributed HDFS cluster in secure mode with SSL. - Tested various file system operations. - Verified HDFS-2856, the new feature to run a secure DataNode without requiring root. - Verified HDFS-7385, the recent blocker related to incorrect ACLs serialized to the edit log. Thank you to Arun as release manager, and thank you to all of the contributors for their hard work on this release. Chris Nauroth Hortonworks http://hortonworks.com/ On Fri, Nov 14, 2014 at 10:57 AM, Yongjun Zhang yzh...@cloudera.com wrote: Thanks Arun for leading the 2.6 release effort. +1 (non-binding) - Downloaded rc1 source and did build - Created two single-node clusters running 2.6 - Ran sample mapreduce job - Ran distcp between two clusters --Yongjun On Thu, Nov 13, 2014 at 3:08 PM, Arun C Murthy a...@hortonworks.com wrote: Folks, I've created another release candidate (rc1) for hadoop-2.6.0 based on the feedback. The RC is available at: http://people.apache.org/~acmurthy/hadoop-2.6.0-rc1 The RC tag in git is: release-2.6.0-rc1 The maven artifacts are available via repository.apache.org at https://repository.apache.org/content/repositories/orgapachehadoop-1013. Please try the release and vote; the vote will run for the usual 5 days. thanks, Arun -- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You. 
Re: [VOTE] Release Apache Hadoop 2.6.0
Hi Eric, There was a second release candidate created (named rc1), and voting started fresh in a new thread. You might want to join in on that second thread to make sure that your vote gets counted. Thanks! Chris Nauroth Hortonworks http://hortonworks.com/ On Fri, Nov 14, 2014 at 3:08 PM, Eric Payne erichadoo...@yahoo.com.invalid wrote: +1 I downloaded and built source. Started local cluster and ran wordcount, sleep, and simple streaming job. Also, I ran a distributed shell job which tested preserving containers across AM restart by setting the -keep_containers_across_application_attempts flag and killing the first AM once the containers start. Enabled the preemption feature and verified containers were preempted and queues were levelized. Ran unit tests for hadoop-yarn-server-resourcemanager. Ran unit tests for hadoop-hdfs. Thank you, -Eric Payne From: Arun C Murthy a...@hortonworks.com To: common-...@hadoop.apache.org; hdfs-...@hadoop.apache.org; yarn-dev@hadoop.apache.org; mapreduce-...@hadoop.apache.org Sent: Monday, November 10, 2014 8:52 PM Subject: [VOTE] Release Apache Hadoop 2.6.0 Folks, I've created a release candidate (rc0) for hadoop-2.6.0 that I would like to see released. The RC is available at: http://people.apache.org/~acmurthy/hadoop-2.6.0-rc0 The RC tag in git is: release-2.6.0-rc0 The maven artifacts are available via repository.apache.org at https://repository.apache.org/content/repositories/orgapachehadoop-1012. Please try the release and vote; the vote will run for the usual 5 days. thanks, Arun
Re: [VOTE] Release Apache Hadoop 2.6.0
I'm helping to expedite a complete, approved patch for HDFS-7385 now. Then, we can make a final decision on its inclusion in 2.6.0. Thank you for bringing it up, Yi. Chris Nauroth Hortonworks http://hortonworks.com/ On Wed, Nov 12, 2014 at 7:31 PM, Liu, Yi A yi.a@intel.com wrote: Arun, could you wait for HDFS-7385? It will cause issue of HDFS ACL and XAttrs in some case, the fix is very easy but I think the issue is critical. I'm helping review it, and expect to commit today. Thanks. Regards, Yi Liu -Original Message- From: Arun C Murthy [mailto:a...@hortonworks.com] Sent: Thursday, November 13, 2014 12:58 AM To: yarn-dev@hadoop.apache.org Cc: mapreduce-...@hadoop.apache.org; Ravi Prakash; hdfs-...@hadoop.apache.org; common-...@hadoop.apache.org Subject: Re: [VOTE] Release Apache Hadoop 2.6.0 Sounds good. I'll create an rc1. Thanks. Arun On Nov 11, 2014, at 2:06 PM, Robert Kanter rkan...@cloudera.com wrote: Hi Arun, We were testing the RC and ran into a problem with the recent fixes that were done for POODLE for Tomcat (HADOOP-11217 for KMS and HDFS-7274 for HttpFS). Basically, in disabling SSLv3, we also disabled SSLv2Hello, which is required for older clients (e.g. Java 6 with openssl 0.9.8x) so they can't connect without it. Just to be clear, it does not mean SSLv2, which is insecure. This also affects the MR shuffle in HADOOP-11243. The fix is super simple, so I think we should reopen these 3 JIRAs and put in addendum patches and get them into 2.6.0. thanks - Robert On Tue, Nov 11, 2014 at 1:04 PM, Ravi Prakash ravi...@ymail.com wrote: Hi Arun! We are very close to completion on YARN-1964 (DockerContainerExecutor). I'd also like HDFS-4882 to be checked in. Do you think these issues merit another RC? ThanksRavi On Tuesday, November 11, 2014 11:57 AM, Steve Loughran ste...@hortonworks.com wrote: +1 binding -patched slider pom to build against 2.6.0 -verified build did download, which it did at up to ~8Mbps. Faster than a local build. 
-full clean test runs on OS/X Linux Windows 2012: Same thing. I did have to first build my own set of the windows native binaries, by checking out branch-2.6.0; doing a native build, copying the binaries and then purging the local m2 repository of hadoop artifacts to be confident I was building against. For anyone who wants those native libs they will be up on https://github.com/apache/incubator-slider/tree/develop/bin/windows/ once it syncs with the ASF repos. afterwords: the tests worked! On 11 November 2014 02:52, Arun C Murthy a...@hortonworks.com wrote: Folks, I've created a release candidate (rc0) for hadoop-2.6.0 that I would like to see released. The RC is available at: http://people.apache.org/~acmurthy/hadoop-2.6.0-rc0 The RC tag in git is: release-2.6.0-rc0 The maven artifacts are available via repository.apache.org at https://repository.apache.org/content/repositories/orgapachehadoop-1012. Please try the release and vote; the vote will run for the usual 5 days. thanks, Arun -- Arun C. Murthy Hortonworks Inc. http://hortonworks.com/hdp/
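The POODLE regression described in this thread came from a protocol filter that was too aggressive. Below is a hedged sketch of the distinction Robert draws: the protocol list is a hardcoded, JDK-6-era illustration (not read from a real SSLEngine, and not the actual Tomcat/Hadoop patch), and the filter drops only SSLv3 while keeping the SSLv2Hello hello format that older clients need:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ProtocolFilter {
    // Drop only SSLv3 (the POODLE-vulnerable protocol). "SSLv2Hello" is a
    // compatibility *hello format* used by older clients, not the insecure
    // SSLv2 protocol itself, so it must survive the filter.
    static List<String> withoutSslv3(List<String> supported) {
        List<String> enabled = new ArrayList<>();
        for (String p : supported) {
            if (!"SSLv3".equals(p)) {
                enabled.add(p);
            }
        }
        return enabled;
    }

    public static void main(String[] args) {
        // Illustrative protocol set as an old JDK might report it.
        List<String> supported =
            Arrays.asList("SSLv2Hello", "SSLv3", "TLSv1", "TLSv1.1", "TLSv1.2");
        System.out.println(withoutSslv3(supported));
    }
}
```

A filter that instead removed every protocol whose name starts with "SSLv" would reproduce the bug: it would also strip SSLv2Hello and lock out older clients.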
Re: [VOTE] Release Apache Hadoop 2.6.0
I have committed HDFS-7385 down through branch-2.6.0. Thank you! Chris Nauroth Hortonworks http://hortonworks.com/ On Thu, Nov 13, 2014 at 9:14 AM, Chris Nauroth cnaur...@hortonworks.com wrote: I'm helping to expedite a complete, approved patch for HDFS-7385 now. Then, we can make a final decision on its inclusion in 2.6.0. Thank you for bringing it up, Yi. Chris Nauroth Hortonworks http://hortonworks.com/ On Wed, Nov 12, 2014 at 7:31 PM, Liu, Yi A yi.a@intel.com wrote: Arun, could you wait for HDFS-7385? It will cause issue of HDFS ACL and XAttrs in some case, the fix is very easy but I think the issue is critical. I'm helping review it, and expect to commit today. Thanks. Regards, Yi Liu -Original Message- From: Arun C Murthy [mailto:a...@hortonworks.com] Sent: Thursday, November 13, 2014 12:58 AM To: yarn-dev@hadoop.apache.org Cc: mapreduce-...@hadoop.apache.org; Ravi Prakash; hdfs-...@hadoop.apache.org; common-...@hadoop.apache.org Subject: Re: [VOTE] Release Apache Hadoop 2.6.0 Sounds good. I'll create an rc1. Thanks. Arun On Nov 11, 2014, at 2:06 PM, Robert Kanter rkan...@cloudera.com wrote: Hi Arun, We were testing the RC and ran into a problem with the recent fixes that were done for POODLE for Tomcat (HADOOP-11217 for KMS and HDFS-7274 for HttpFS). Basically, in disabling SSLv3, we also disabled SSLv2Hello, which is required for older clients (e.g. Java 6 with openssl 0.9.8x) so they can't connect without it. Just to be clear, it does not mean SSLv2, which is insecure. This also affects the MR shuffle in HADOOP-11243. The fix is super simple, so I think we should reopen these 3 JIRAs and put in addendum patches and get them into 2.6.0. thanks - Robert On Tue, Nov 11, 2014 at 1:04 PM, Ravi Prakash ravi...@ymail.com wrote: Hi Arun! We are very close to completion on YARN-1964 (DockerContainerExecutor). I'd also like HDFS-4882 to be checked in. Do you think these issues merit another RC? 
Thanks, Ravi On Tuesday, November 11, 2014 11:57 AM, Steve Loughran ste...@hortonworks.com wrote: +1 binding -patched slider pom to build against 2.6.0 -verified build did download, which it did at up to ~8Mbps. Faster than a local build. -full clean test runs on OS/X Linux Windows 2012: Same thing. I did have to first build my own set of the windows native binaries, by checking out branch-2.6.0; doing a native build, copying the binaries and then purging the local m2 repository of hadoop artifacts to be confident I was building against. For anyone who wants those native libs they will be up on https://github.com/apache/incubator-slider/tree/develop/bin/windows/ once it syncs with the ASF repos. afterwords: the tests worked! On 11 November 2014 02:52, Arun C Murthy a...@hortonworks.com wrote: Folks, I've created a release candidate (rc0) for hadoop-2.6.0 that I would like to see released. The RC is available at: http://people.apache.org/~acmurthy/hadoop-2.6.0-rc0 The RC tag in git is: release-2.6.0-rc0 The maven artifacts are available via repository.apache.org at https://repository.apache.org/content/repositories/orgapachehadoop-1012. Please try the release and vote; the vote will run for the usual 5 days. thanks, Arun -- Arun C. Murthy Hortonworks Inc. http://hortonworks.com/hdp/
[jira] [Created] (YARN-2803) MR distributed cache not working correctly on Windows after NodeManager privileged account changes.
Chris Nauroth created YARN-2803: --- Summary: MR distributed cache not working correctly on Windows after NodeManager privileged account changes. Key: YARN-2803 URL: https://issues.apache.org/jira/browse/YARN-2803 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Reporter: Chris Nauroth Priority: Critical This problem is visible by running {{TestMRJobs#testDistributedCache}} or {{TestUberAM#testDistributedCache}} on Windows. Both tests fail. Running git bisect, I traced it to the YARN-2198 patch to remove the need to run NodeManager as a privileged account. The tests started failing when that patch was committed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-2662) TestCgroupsLCEResourcesHandler leaks file descriptors.
Chris Nauroth created YARN-2662: --- Summary: TestCgroupsLCEResourcesHandler leaks file descriptors. Key: YARN-2662 URL: https://issues.apache.org/jira/browse/YARN-2662 Project: Hadoop YARN Issue Type: Bug Components: test Reporter: Chris Nauroth Assignee: Chris Nauroth Priority: Minor {{TestCgroupsLCEResourcesHandler}} includes tests that write and read values from the various cgroups files. After the tests read from a file, they do not close it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
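The leak pattern described in this issue (a test reads a value from a cgroups-style file and never closes it) is conventionally fixed with try-with-resources. A minimal sketch, not the actual test code:

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;

public class ReadValueClosed {
    // Read a single value the way the cgroups tests do, but with
    // try-with-resources so the file descriptor is released even if the
    // read throws.
    static String readFirstLine(File f) throws Exception {
        try (BufferedReader r = new BufferedReader(new FileReader(f))) {
            return r.readLine();
        } // reader (and its underlying descriptor) closed here
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a cgroups limit file.
        File f = File.createTempFile("cgroup", ".limit");
        try (FileWriter w = new FileWriter(f)) {
            w.write("1024\n");
        }
        System.out.println(readFirstLine(f));
        f.delete();
    }
}
```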
[jira] [Created] (YARN-2549) TestContainerLaunch fails due to classpath problem with hamcrest classes.
Chris Nauroth created YARN-2549: --- Summary: TestContainerLaunch fails due to classpath problem with hamcrest classes. Key: YARN-2549 URL: https://issues.apache.org/jira/browse/YARN-2549 Project: Hadoop YARN Issue Type: Test Components: test, nodemanager Reporter: Chris Nauroth Assignee: Chris Nauroth Priority: Minor The mockito jar bundles its own copy of the hamcrest classes, and it's ahead of our hamcrest dependency jar on the test classpath for hadoop-yarn-server-nodemanager. Unfortunately, the version bundled in mockito doesn't match the version we need, so it's missing the {{CoreMatchers#containsString}} method. This causes the tests to fail with {{NoSuchMethodError}} on Windows. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
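One conventional mitigation for this kind of bundled-dependency conflict is to lean on Maven's first-wins classpath ordering, declaring the hamcrest artifact the tests need ahead of mockito so mockito's bundled copy cannot shadow it. The pom.xml fragment below is illustrative only; the versions are assumptions, and the actual YARN-2549 fix may have taken a different approach:

```xml
<!-- Illustrative ordering only: the hamcrest version the tests need is
     declared before mockito-all, whose bundled hamcrest classes would
     otherwise appear first on the test classpath. -->
<dependency>
  <groupId>org.hamcrest</groupId>
  <artifactId>hamcrest-core</artifactId>
  <version>1.3</version>
  <scope>test</scope>
</dependency>
<dependency>
  <groupId>org.mockito</groupId>
  <artifactId>mockito-all</artifactId>
  <version>1.8.5</version>
  <scope>test</scope>
</dependency>
```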
Re: should clients be able to override *-site.xml?
I participated on the HADOOP-9450 code review. There was some debate about the significance of keeping HADOOP_CONF_DIR first. Ultimately, there was agreement that FIRST really ought to mean first. Apparently, other Hadoop ecosystem projects have made the same interpretation and implemented their scripts accordingly. Just echoing what Allen said, there were always plenty of ways for a user to override *-site.xml too. Of course, I'm sorry to hear the change caused an issue at your company, Sangjin. Chris Nauroth Hortonworks http://hortonworks.com/ On Wed, Sep 10, 2014 at 10:55 AM, Allen Wittenauer a...@altiscale.com wrote: On Sep 10, 2014, at 10:14 AM, Sangjin Lee sj...@apache.org wrote: if HADOOP_USER_CLASSPATH_FIRST is set and the user provides his/her version of *-site.xml through HADOOP_CLASSPATH, the user would end up trumping the hadoop configuration. And I believe this behavior is preserved after Allen's changes (HADOOP-9902). I'd be surprised if 9902 changed this behavior, especially given that HADOOP_CONF_DIR is added fairly quickly to the CLASSPATH and HADOOP_USER_CLASSPATH is processed last or near last. Is this an intended behavior? What I'm not sure of is whether we expect the client to be able to override the site.xml files the hadoop configuration provides. If that is true, then it is working as desired. If not, we'd need to fix this behavior. Thoughts? Users have always been able to override the *-site.xml files via java properties, jobconf/etc, pointing HADOOP_CONF_DIR to somewhere else, or providing --conf flags to shell commands. In other words, this isn't new behavior, but provides yet another option to override the provided settings.
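The ordering question in this thread comes down to the JVM's first-wins resource lookup: whichever classpath entry containing core-site.xml appears first is the copy that gets loaded. A self-contained demonstration using URLClassLoader with two synthetic conf directories (nothing here is Hadoop's actual script logic):

```java
import java.io.File;
import java.io.FileWriter;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.util.Scanner;

public class FirstWins {
    // Create a temp directory holding a core-site.xml whose content is a
    // marker string identifying which copy was loaded.
    static File dirWith(String marker) throws Exception {
        File dir = Files.createTempDirectory("conf").toFile();
        try (FileWriter w = new FileWriter(new File(dir, "core-site.xml"))) {
            w.write(marker);
        }
        return dir;
    }

    // Resource lookup returns the FIRST matching classpath entry, which is
    // why putting user entries ahead of HADOOP_CONF_DIR trumps *-site.xml.
    static String resolve(File first, File second) throws Exception {
        try (URLClassLoader cl = new URLClassLoader(
                new URL[] {first.toURI().toURL(), second.toURI().toURL()},
                null)) {
            URL res = cl.getResource("core-site.xml");
            try (Scanner s = new Scanner(res.openStream())) {
                return s.nextLine();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        File site = dirWith("cluster-config");
        File user = dirWith("user-override");
        System.out.println(resolve(site, user));
        System.out.println(resolve(user, site));
    }
}
```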
HDFS-573: Windows builds will require CMake
FYI, I plan to commit HDFS-573 next week, which ports libhdfs to Windows. With this patch, we have a new build requirement: Windows build machines now require CMake. I've updated BUILDING.txt accordingly. For those of you working on a Windows dev machine, please install CMake at your earliest convenience. I'll hold off committing for a few more days to give everyone time to react. Chris Nauroth Hortonworks http://hortonworks.com/
Re: [DISCUSS] Assume Private-Unstable for classes that are not annotated
+1 for the proposal. I believe stating that classes without annotations are implicitly private is consistent with what we publish for our JavaDocs. IncludePublicAnnotationsStandardDoclet, used in the root pom.xml, filters out classes that don't explicitly have the Public annotation. Chris Nauroth Hortonworks http://hortonworks.com/ On Wed, Jul 23, 2014 at 10:55 AM, Karthik Kambatla ka...@cloudera.com wrote: Fair points, Jason. The fact that we include this in the compatibility guideline should not affect how developers go about this. We should still strive to annotate every new class we add, and reviewers should continue to check for them. However, in case we miss annotations, we won't be burdened to support those APIs for essentially eternity. I am aware of downstream projects that use @Private APIs, but I have also seen that improve in the recent past with compatible 2.x releases. So, I am hoping they will let us know of APIs they would like to see and eventually use only Public-Stable APIs. On Wed, Jul 23, 2014 at 7:22 AM, Jason Lowe jl...@yahoo-inc.com.invalid wrote: I think that's a reasonable proposal as long as we understand it changes the burden from finding all the things that should be marked @Private to finding all the things that should be marked @Public. As Tom Graves pointed out in an earlier discussion about @LimitedPrivate, it may be impossible to do a straightforward task and use only interfaces marked @Public. If users can't do basic things without straying from @Public interfaces then tons of code can break if we assume it's always fair game to change anything not marked @Public. The well you shouldn't have used a non-@Public interface argument is not very useful in that context. So as long as we're good about making sure officially supported features have corresponding @Public interfaces to wield them then I agree it will be easier to track those rather than track all the classes that should be @Private. 
Hopefully if users understand that's how things work they'll help file JIRAs for interfaces that need to be @Public to get their work done. Jason On 07/22/2014 04:54 PM, Karthik Kambatla wrote: Hi devs, As you might have noticed, we have several classes and methods in them that are not annotated at all. This is seldom intentional. Avoiding incompatible changes to all these classes can be considerable baggage. I was wondering if we should add an explicit disclaimer in our compatibility guide that says, Classes without annotations are to be considered @Private. For methods, is it reasonable to say - Class members without specific annotations inherit the annotations of the class? Thanks Karthik
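Under the proposal in this thread, an unannotated class would be treated exactly like one marked @Private. A self-contained sketch of checking for the audience annotation at runtime; the annotations below are local stand-ins for the real ones in org.apache.hadoop.classification (InterfaceAudience.Public and friends), defined here only so the example compiles on its own:

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

public class AudienceCheck {
    // Stand-ins for org.apache.hadoop.classification.InterfaceAudience.*.
    @Retention(RetentionPolicy.RUNTIME) @interface Public {}
    @Retention(RetentionPolicy.RUNTIME) @interface Private {}

    @Public static class SupportedApi {}
    static class UnannotatedApi {} // under the proposal: implicitly Private

    public static void main(String[] args) {
        Class<?>[] apis = {SupportedApi.class, UnannotatedApi.class};
        for (Class<?> c : apis) {
            // Anything without an explicit @Public marker is off-limits to
            // downstream users, exactly as if it carried @Private.
            boolean isPublic = c.isAnnotationPresent(Public.class);
            System.out.println(c.getSimpleName() + " public=" + isPublic);
        }
    }
}
```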
Re: Jenkins Build Slaves
Thanks, Giri, for taking care of pkgconfig. It looks like most (all?) pre-commit builds have some new failing tests: https://builds.apache.org/job/PreCommit-HADOOP-Build/4247/testReport/ On the symlink tests, is there any chance that the new hosts have a different version/different behavior for the ln command? The TestIPC failure is in a stress test that checks behavior after spamming a lot of connections at an RPC server. Maybe the new hosts have something different in the TCP stack, such as TCP backlog? I likely won't get a chance to investigate any more today, but I wanted to raise the issue in case someone else gets an opportunity to look. Chris Nauroth Hortonworks http://hortonworks.com/ On Wed, Jul 9, 2014 at 10:33 AM, Giridharan Kesavan gkesa...@hortonworks.com wrote: I don't think so, let me fix that. Thanks Chris for pointing that out. -giri On Wed, Jul 9, 2014 at 9:50 AM, Chris Nauroth cnaur...@hortonworks.com wrote: Hi Giri, Is pkgconfig deployed on the new Jenkins slaves? I noticed this build failed: https://builds.apache.org/job/PreCommit-HADOOP-Build/4237/ Looking in the console output, it appears the HDFS native code failed to build due to missing pkgconfig. [exec] CMake Error at /usr/share/cmake-2.8/Modules/FindPackageHandleStandardArgs.cmake:108 (message): [exec] Could NOT find PkgConfig (missing: PKG_CONFIG_EXECUTABLE) Chris Nauroth Hortonworks http://hortonworks.com/ On Wed, Jul 9, 2014 at 7:08 AM, Giridharan Kesavan gkesa...@hortonworks.com wrote: Build jobs are now configured to run on the newer set of slaves. -giri On Mon, Jul 7, 2014 at 4:12 PM, Giridharan Kesavan gkesa...@hortonworks.com wrote: All Yahoo is in the process of retiring all the hadoop jenkins build slaves, hadoop[1-9], and replacing them with a newer set of beefier hosts. These new machines are configured with ubuntu-14.04. Over the next couple of days I will be configuring the build jobs to run on these newly configured build slaves. 
To automate the installation of tools and build libraries I have put together ansible scripts; here is the link to the toolchain repo: https://github.com/apache/toolchain During the transition, the old build slaves will be accessible, and are expected to be shut down by 07/15. I will send out an update later this week when this transition is complete. Meanwhile, I would like to request the project owners to remove/clean up any stale Jenkins jobs for their respective projects and help with any build issues to make this transition seamless. Thanks - Giri
Re: Jenkins Build Slaves
Hi Giri, Is pkgconfig deployed on the new Jenkins slaves? I noticed this build failed: https://builds.apache.org/job/PreCommit-HADOOP-Build/4237/ Looking in the console output, it appears the HDFS native code failed to build due to missing pkgconfig. [exec] CMake Error at /usr/share/cmake-2.8/Modules/FindPackageHandleStandardArgs.cmake:108 (message): [exec] Could NOT find PkgConfig (missing: PKG_CONFIG_EXECUTABLE) Chris Nauroth Hortonworks http://hortonworks.com/ On Wed, Jul 9, 2014 at 7:08 AM, Giridharan Kesavan gkesa...@hortonworks.com wrote: Build jobs are now configured to run on the newer set of slaves. -giri On Mon, Jul 7, 2014 at 4:12 PM, Giridharan Kesavan gkesa...@hortonworks.com wrote: All, Yahoo is in the process of retiring all the hadoop jenkins build slaves, *hadoop[1-9]*, and replacing them with a newer set of beefier hosts. These new machines are configured with *ubuntu-14.04*. Over the next couple of days I will be configuring the build jobs to run on these newly configured build slaves. To automate the installation of tools and build libraries I have put together ansible scripts; here is the link to the toolchain repo: https://github.com/apache/toolchain During the transition, the old build slaves will be accessible, and are expected to be shut down by 07/15. I will send out an update later this week when this transition is complete. Meanwhile, I would like to request the project owners to remove/clean up any stale Jenkins jobs for their respective projects and help with any build issues to make this transition seamless. Thanks - Giri
Re: Moving to JDK7, JDK8 and new major releases
Following up on ecosystem, I just took a look at the Apache trunk pom.xml files for HBase, Flume and Oozie. All are specifying 1.6 for source and target in the maven-compiler-plugin configuration, so there may be additional follow-up required here. (For example, if HBase has made a statement that its client will continue to support JDK6, then it wouldn't be practical for them to link to a JDK7 version of hadoop-common.) +1 for the whole plan though. We can work through these details. Chris Nauroth Hortonworks http://hortonworks.com/ On Fri, Jun 27, 2014 at 3:10 PM, Karthik Kambatla ka...@cloudera.com wrote: +1 to making 2.6 the last JDK6 release. If we want, 2.7 could be a parallel release or one soon after 2.6. We could upgrade other dependencies that require JDK7 as well. On Fri, Jun 27, 2014 at 3:01 PM, Arun C. Murthy a...@hortonworks.com wrote: Thanks everyone for the discussion. Looks like we have come to a pragmatic and progressive conclusion. In terms of execution of the consensus plan, I think a little bit of caution is in order. Let's give downstream projects more of a runway. I propose we inform HBase, Pig, Hive etc. that we are considering making 2.6 (not 2.5) the last JDK6 release and solicit their feedback. Once they are comfortable we can pull the trigger in 2.7. thanks, Arun On Jun 27, 2014, at 11:34 AM, Karthik Kambatla ka...@cloudera.com wrote: As someone else already mentioned, we should announce one future release (may be, 2.5) as the last JDK6-based release before making the move to JDK7. I am comfortable calling 2.5 the last JDK6 release. On Fri, Jun 27, 2014 at 11:26 AM, Andrew Wang andrew.w...@cloudera.com wrote: Hi all, responding to multiple messages here, Arun, thanks for the clarification regarding MR classpaths. It sounds like the story there is improved and still improving. However, I think we still suffer from this at least on the HDFS side. 
We have a single JAR for all of HDFS, and our clients need to have all the fun deps like Guava on the classpath. I'm told Spark sticks a newer Guava at the front of the classpath and the HDFS client still works okay, but this is more happy coincidence than anything else. While we're leaking deps, we're in a scary situation. API compat to me means that an app should be able to run on a new minor version of Hadoop and not have anything break. MAPREDUCE-4421 sounds like it allows you to run e.g. 2.3 MR jobs on a 2.4 YARN cluster, but what should also be possible is running an HDFS 2.3 app with HDFS 2.4 JARs and have nothing break. If we muck with the classpath, my understanding is that this could break. Owen, bumping the minimum JDK version in a minor release like this should be a one-time exception as Tucu stated. A number of people have pointed out how painful a forced JDK upgrade is for end users, and it's not something we should be springing on them in a minor release unless we're *very* confident like in this case. Chris, thanks for bringing up the ecosystem. For CDH5, we standardized on JDK7 across the CDH stack, so I think that's an indication that most ecosystem projects are ready to make the jump. Is that sufficient in your mind? For the record, I'm also +1 on the Tucu plan. Is it too late to do this for 2.5? I'll offer to help out with some of the mechanics. Thanks, Andrew On Wed, Jun 25, 2014 at 4:18 PM, Chris Nauroth cnaur...@hortonworks.com wrote: I understood the plan for avoiding JDK7-specific features in our code, and your suggestion to add an extra Jenkins job is a great way to guard against that. The thing I haven't seen discussed yet is how downstream projects will continue to consume our built artifacts. If a downstream project upgrades to pick up a bug fix, and the jar switches to 1.7 class files, but their project is still building with 1.6, then it would be a nasty surprise. These are the options I see: 1. 
Make sure all other projects upgrade first. This doesn't sound feasible, unless all other ecosystem projects have moved to JDK7 already. If not, then waiting on a single long pole project would hold up our migration indefinitely. 2. We switch to JDK7, but run javac with -target 1.6 until the whole ecosystem upgrades. I find this undesirable, because in a certain sense, it still leaves a bit of 1.6 lingering in the project. (I'll assume that end-of-life for JDK6 also means end-of-life for the 1.6 bytecode format.) 3. Just declare a clean break on some version (your earlier email said 2.5) and start publishing artifacts built with JDK7 and no -target option. Overall, this is my preferred option. However, as a side effect, this sets us up for longer-term maintenance and patch
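Option 2 above is normally expressed through the maven-compiler-plugin configuration, the same mechanism the downstream poms mentioned earlier use for 1.6. A hypothetical fragment (not taken from the actual Hadoop pom), with the caveat that source/target flags alone do not stop code from calling JDK7-only library APIs:

```xml
<!-- Hypothetical pom.xml fragment: build with a JDK7 toolchain but
     emit 1.6-compatible bytecode (option 2 above). Note that
     -source/-target do not prevent use of JDK7-only library APIs;
     catching that still requires compiling/testing on JDK6. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-compiler-plugin</artifactId>
  <configuration>
    <source>1.6</source>
    <target>1.6</target>
  </configuration>
</plugin>
```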
Re: Anyone know how to mock a secured hdfs for unit test?
Hi David and Kai, There are a couple of challenges with this, but I just figured out a pretty decent setup while working on HDFS-2856. That code isn't committed yet, but if you open patch version 5 attached to that issue and look for the TestSaslDataTransfer class, then you'll see how it works. Most of the logic for bootstrapping a MiniKDC and setting up the right HDFS configuration properties is in an abstract base class named SaslDataTransferTestCase. I hope this helps. There are a few other open issues out there related to tests in secure mode. I know of HDFS-4312 and HDFS-5410. It would be great to get more regular test coverage with something that more closely approximates a secured deployment. Chris Nauroth Hortonworks http://hortonworks.com/ On Thu, Jun 26, 2014 at 7:27 AM, Zheng, Kai kai.zh...@intel.com wrote: Hi David, Quite some time ago I opened HADOOP-9952 and planned to create secured MiniClusters by making use of MiniKDC. Unfortunately since then I didn't get the chance to work on it yet. If you need something like that and would contribute, please let me know and see if anything I can help with. Thanks. Regards, Kai -Original Message- From: Liu, David [mailto:liujion...@gmail.com] Sent: Thursday, June 26, 2014 10:12 PM To: hdfs-...@hadoop.apache.org; hdfs-iss...@hadoop.apache.org; yarn-dev@hadoop.apache.org; yarn-iss...@hadoop.apache.org; mapreduce-...@hadoop.apache.org; secur...@hadoop.apache.org Subject: Anyone know how to mock a secured hdfs for unit test? Hi all, I need to test my code which reads data from secured hdfs. Is there any library to mock secured hdfs? Can MiniDFSCluster do the work? Any suggestion is appreciated. Thanks
Re: Moving to JDK7, JDK8 and new major releases
I'm also +1 for getting us to JDK7 within the 2.x line after reading the proposals and catching up on the discussion in this thread. Has anyone yet considered how to coordinate this change with downstream projects? Would we request downstream projects to upgrade to JDK7 first before we make the move? Would we switch to JDK7, but run javac -target 1.6 to maintain compatibility for downstream projects during an interim period? Chris Nauroth Hortonworks http://hortonworks.com/ On Wed, Jun 25, 2014 at 9:48 AM, Owen O'Malley omal...@apache.org wrote: On Tue, Jun 24, 2014 at 4:44 PM, Alejandro Abdelnur t...@cloudera.com wrote: After reading this thread and thinking a bit about it, I think it should be OK to make such a move up to JDK7 in Hadoop I agree with Alejandro. Changing minimum JDKs is not an incompatible change and is fine in the 2 branch. (Although I think it would *not* be appropriate for a patch release.) Of course we need to do it with forethought and testing, but moving off of JDK 6, which is EOL'ed, is a good thing. Moving to Java 8 as a minimum seems much too aggressive and I would push back on that. I also think that we need to let the dust settle on the Hadoop 2 line for a while before we talk about Hadoop 3. It seems that it has only been in the last 6 months that Hadoop 2 adoption has reached mainstream users. Our user community needs time to digest the changes in Hadoop 2.x before we fracture the community by starting to discuss Hadoop 3 releases. .. Owen
Re: Moving to JDK7, JDK8 and new major releases
I understood the plan for avoiding JDK7-specific features in our code, and your suggestion to add an extra Jenkins job is a great way to guard against that. The thing I haven't seen discussed yet is how downstream projects will continue to consume our built artifacts. If a downstream project upgrades to pick up a bug fix, and the jar switches to 1.7 class files, but their project is still building with 1.6, then it would be a nasty surprise. These are the options I see: 1. Make sure all other projects upgrade first. This doesn't sound feasible, unless all other ecosystem projects have moved to JDK7 already. If not, then waiting on a single long pole project would hold up our migration indefinitely. 2. We switch to JDK7, but run javac with -target 1.6 until the whole ecosystem upgrades. I find this undesirable, because in a certain sense, it still leaves a bit of 1.6 lingering in the project. (I'll assume that end-of-life for JDK6 also means end-of-life for the 1.6 bytecode format.) 3. Just declare a clean break on some version (your earlier email said 2.5) and start publishing artifacts built with JDK7 and no -target option. Overall, this is my preferred option. However, as a side effect, this sets us up for longer-term maintenance and patch releases off of the 2.4 branch if a downstream project that's still on 1.6 needs to pick up a critical bug fix. Of course, this is all a moot point if all the downstream ecosystem projects have already made the switch to JDK7. I don't know the status of that off the top of my head. Maybe someone else out there knows? If not, then I expect I can free up enough in a few weeks to volunteer for tracking down that information. Chris Nauroth Hortonworks http://hortonworks.com/ On Wed, Jun 25, 2014 at 3:12 PM, Alejandro Abdelnur t...@cloudera.com wrote: Chris, Compiling with jdk7 and doing javac -target 1.6 is not sufficient, you are still using jdk7 libraries and you could use new APIs, thus breaking jdk6 both at compile and runtime. 
You need to compile with jdk6 to ensure you are not running into that scenario. That is why I was suggesting the nightly jdk6 build/test jenkins job. On Wed, Jun 25, 2014 at 2:04 PM, Chris Nauroth cnaur...@hortonworks.com wrote: I'm also +1 for getting us to JDK7 within the 2.x line after reading the proposals and catching up on the discussion in this thread. Has anyone yet considered how to coordinate this change with downstream projects? Would we request downstream projects to upgrade to JDK7 first before we make the move? Would we switch to JDK7, but run javac -target 1.6 to maintain compatibility for downstream projects during an interim period? Chris Nauroth Hortonworks http://hortonworks.com/ On Wed, Jun 25, 2014 at 9:48 AM, Owen O'Malley omal...@apache.org wrote: On Tue, Jun 24, 2014 at 4:44 PM, Alejandro Abdelnur t...@cloudera.com wrote: After reading this thread and thinking a bit about it, I think it should be OK to make such a move up to JDK7 in Hadoop I agree with Alejandro. Changing minimum JDKs is not an incompatible change and is fine in the 2 branch. (Although I think it would *not* be appropriate for a patch release.) Of course we need to do it with forethought and testing, but moving off of JDK 6, which is EOL'ed, is a good thing. Moving to Java 8 as a minimum seems much too aggressive and I would push back on that. I also think that we need to let the dust settle on the Hadoop 2 line for a while before we talk about Hadoop 3. It seems that it has only been in the last 6 months that Hadoop 2 adoption has reached mainstream users. Our user community needs time to digest the changes in Hadoop 2.x before we fracture the community by starting to discuss Hadoop 3 releases. .. Owen -- Alejandro
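Alejandro's point can be made concrete with a stdlib-only example. The class below (a hypothetical illustration, not Hadoop code) compiles cleanly with `javac -source 1.6 -target 1.6` on a JDK7 compiler, yet fails on a JDK6 runtime with NoClassDefFoundError, because `java.nio.file` and `StandardCharsets` were introduced in JDK7 — the target flag only constrains bytecode format, not which class libraries you link against:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Compiles with -source 1.6 -target 1.6 under JDK7 (modulo a
// bootclasspath warning), but throws NoClassDefFoundError on a JDK6
// runtime: java.nio.file.Files does not exist there.
public class Jdk7ApiLeak {
    static String roundTrip(String s) throws IOException {
        Path tmp = Files.createTempFile("leak", ".txt");      // JDK7-only API
        Files.write(tmp, s.getBytes(StandardCharsets.UTF_8)); // JDK7-only API
        String back = Files.readAllLines(tmp, StandardCharsets.UTF_8).get(0);
        Files.delete(tmp);
        return back;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip("hello")); // prints "hello" on JDK7+
    }
}
```

This is exactly the class of breakage only a real JDK6 compile (or nightly JDK6 Jenkins job) would catch.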
Re: [DISCUSS] Change by-laws on release votes: 5 days instead of 7
+1 binding. Thanks, Arun. Chris Nauroth Hortonworks http://hortonworks.com/ On Sat, Jun 21, 2014 at 10:36 AM, Arun C. Murthy a...@hortonworks.com wrote: Uma, Voting periods are defined in *minimum* terms, so it already covers what you'd like to see, i.e. the vote can continue longer. thanks, Arun On Jun 21, 2014, at 2:19 AM, Gangumalla, Uma uma.ganguma...@intel.com wrote: How about proposing the vote for 5 days and giving the RM the chance to extend the vote by 2 more days (7 days total) if the RC did not receive enough votes within 5 days? If an RC receives enough votes in 5 days, the RM can close the vote. An advantage of 7-day voting is that it covers both weekdays and the weekend. So, if someone wants to test on the weekend (due to weekday schedules), that gives them the chance to do so. Regards, Uma -Original Message- From: Arun C Murthy [mailto:a...@hortonworks.com] Sent: Saturday, June 21, 2014 11:25 AM To: hdfs-...@hadoop.apache.org; common-...@hadoop.apache.org; yarn-dev@hadoop.apache.org; mapreduce-...@hadoop.apache.org Subject: [DISCUSS] Change by-laws on release votes: 5 days instead of 7 Folks, I'd like to propose we change our by-laws to reduce our voting periods on new releases from 7 days to 5. Currently, it just takes too long to turn around releases; particularly if we have critical security fixes etc. Thoughts? thanks, Arun
[jira] [Created] (YARN-1970) Prepare YARN codebase for JUnit 4.11.
Chris Nauroth created YARN-1970: --- Summary: Prepare YARN codebase for JUnit 4.11. Key: YARN-1970 URL: https://issues.apache.org/jira/browse/YARN-1970 Project: Hadoop YARN Issue Type: Test Affects Versions: 2.4.0, 3.0.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Priority: Minor Attachments: YARN-1970.1.patch HADOOP-10503 upgrades the entire Hadoop repo to use JUnit 4.11. Some of the YARN code needs some minor updates to fix deprecation warnings and test isolation problems before the upgrade. -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Policy on adding timeouts to tests
+common-dev, hdfs-dev My understanding of the current situation is that we had a period where we tried to enforce adding timeouts on all new tests in patches, but it caused trouble, and now we're back to not requiring it. Jenkins test-patch isn't checking for it anymore. I don't think patches are getting rejected for using timeouts though. The difficulty is that execution time is quite sensitive to the build environment. (Consider top-of-the-line server hardware used in build infrastructure vs. a dev running a VirtualBox VM with 1 dedicated CPU, 2 GB RAM and slow virtualized disk.) When we were enforcing timeouts, it was quite common to see follow-up patches tuning up the timeout settings to make tests work reliably in a greater variety of environments. At that point, the benefit of using the timeout becomes questionable, because now the fast machine is running with the longer timeout too. Chris Nauroth Hortonworks http://hortonworks.com/ On Mon, Apr 14, 2014 at 9:41 AM, Karthik Kambatla ka...@cloudera.com wrote: Hi folks, Just wanted to check what our policy for adding timeouts to tests is. Do we encourage/discourage using timeouts for tests? If we discourage using timeouts for tests in general, are we okay with adding timeouts for a few tests where we explicitly want the test to fail if it takes longer than a particular amount of time? Thanks Karthik
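The mechanism under discussion is JUnit's `@Test(timeout = ...)`; the tuning problem Chris describes can be sketched with a plain-Java watchdog (illustrative stand-in, not Hadoop or JUnit code): whatever limit is generous enough for the slowest supported environment is also the limit every fast build machine must wait out when a test genuinely hangs.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Illustrative watchdog: runs a "test body" under a hard deadline,
// the way JUnit's @Test(timeout = ...) does. Returns true if the
// body finished in time, false if it was cut off.
public class TimeoutDemo {
    static boolean runWithTimeout(Callable<Void> body, long millis) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Future<Void> f = pool.submit(body);
        try {
            f.get(millis, TimeUnit.MILLISECONDS);
            return true;                  // finished within the limit
        } catch (TimeoutException e) {
            f.cancel(true);
            return false;                 // "test" timed out
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Instant body passes even with a tight limit; a body tuned for
        // a slow VM would need a limit the fast machine waits on too.
        boolean fastOk = runWithTimeout(new Callable<Void>() {
            public Void call() { return null; }
        }, 1000);
        boolean slowOk = runWithTimeout(new Callable<Void>() {
            public Void call() throws Exception { Thread.sleep(5000); return null; }
        }, 200);
        System.out.println(fastOk + " " + slowOk); // prints "true false"
    }
}
```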
Re: Thinking ahead
+1 The proposed content for 2.5 in the roadmap wiki looks good to me. On Apr 12, 2014 7:26 AM, Arun C Murthy a...@hortonworks.com wrote: Gang, With hadoop-2.4 out, it's time to think ahead. In the short term hadoop-2.4.1 is in order; particularly with https://issues.apache.org/jira/browse/MAPREDUCE-5830 (it's a break to a @Private API, unfortunately something Hive is using - sigh!). There are some other fixes which testing has uncovered, so it will be nice to pull them in. I'm thinking of an RC by end of the coming week - committers, please be *very* conservative when getting stuff into 2.4.1 (i.e. merging to branch-2.4). Next up, hadoop-2.5. I've updated https://wiki.apache.org/hadoop/Roadmap with some candidates for consideration - please chime in and say 'aye'/'nay' or add new content. IAC, I suspect that list is too large. Rather than wait for everything, it would be better to plan on releasing it in a time-bound manner, particularly around the Hadoop Summit. If that makes sense, I think we should target branching for 2.5 by mid-May to get it stable and released by early June. Thoughts? thanks, Arun -- Arun C. Murthy Hortonworks Inc. http://hortonworks.com/
[jira] [Created] (YARN-1905) TestProcfsBasedProcessTree must only run on Linux.
Chris Nauroth created YARN-1905: --- Summary: TestProcfsBasedProcessTree must only run on Linux. Key: YARN-1905 URL: https://issues.apache.org/jira/browse/YARN-1905 Project: Hadoop YARN Issue Type: Test Components: nodemanager Affects Versions: 3.0.0, 2.4.0 Reporter: Chris Nauroth Assignee: Chris Nauroth The tests in {{TestProcfsBasedProcessTree}} only make sense on Linux, where the process tree calculations are based on reading the /proc file system. Right now, not all of the individual tests are skipped when the OS is not Linux. This patch will make it consistent. -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: very old dependencies
Regarding JDK7, we have at least one jira I'm aware of proposing several improvements that we can make in our code by using new JDK7 APIs: https://issues.apache.org/jira/browse/HADOOP-9590 If anyone wants to repurpose this as a master JDK7 enhancement jira and start cataloging additional improvement ideas for using JDK7 APIs, please feel free. The reluctance I've heard around JDK7 mostly relates to testing and compatibility across ecosystem components. If there are success stories around large-scale Hadoop system test suites executed with JDK7, then that could help build confidence. For compatibility, I think this effectively means that Hadoop, HDFS, YARN and MapReduce have to wait until after all downstream ecosystem projects finish their upgrades to JDK7. Chris Nauroth Hortonworks http://hortonworks.com/ On Fri, Mar 28, 2014 at 11:04 AM, Steve Loughran ste...@hortonworks.com wrote: On 28 March 2014 17:00, Sandy Ryza sandy.r...@cloudera.com wrote: My understanding is that unfortunately we're stuck with these for the rest of 2.x, because changing them could break jobs that rely on them. For jobs that want to use newer versions, the recommendation is to use mapreduce.user.classpath.first or turn on classpath isolation with mapreduce.job.classloader. If you look at the compatibility statement of Hadoop, it makes clear there are no guarantees about dependencies, and especially transitive ones. http://hadoop.apache.org/docs/r2.3.0/hadoop-project-dist/hadoop-common/Compatibility.html#Java_Classpath That is not an accident - any statement of stability would risk painting us into a corner and never being able to update anything. It also allowed us to ship hadoop 2.2 without doing a rush update of every dependency - that would only have caused chaos and delays. Protobuf - 2.5 was enough - and that was done because it was even worse than guava in terms of compatibility policy.
The issue goes beyond MR, as YARN apps pick up the core binaries, so are constrained by what comes in hadoop/lib, hdfs/lib and yarn/lib, most of which is pretty dated. And if you have an app like hbase or accumulo, with more current dependencies, you have to play games excluding all of hadoop's dependencies. But even that doesn't help with Guava, because it is so aggressive about retiring classes and methods. For the sake of our own code and more modern apps, I'm in favour of gradually rolling out updates that don't break things, force a move to java7 or require changes to the source.
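One concrete example of the kind of cleanup HADOOP-9590 proposes (shown here as a stdlib-only sketch, not actual Hadoop code) is replacing JDK6-era try/finally close-quietly boilerplate with JDK7 try-with-resources:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

// JDK7 try-with-resources closes the stream automatically, even on
// exception, replacing the try/finally + close-in-a-catch boilerplate
// that JDK6 code (e.g. IOUtils-style cleanup helpers) needs.
public class TryWithResourcesDemo {
    static String firstLine(String text) throws IOException {
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            return reader.readLine();  // reader.close() runs automatically
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(firstLine("line1\nline2")); // prints "line1"
    }
}
```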
Re: [DISCUSS] Clarification on Compatibility Policy: Upgraded Client + Old Server
Adding back all *-dev lists to make sure everyone is covered. Chris Nauroth Hortonworks http://hortonworks.com/ On Mon, Mar 24, 2014 at 2:02 PM, Chris Nauroth cnaur...@hortonworks.com wrote: Thank you, everyone, for the discussion. There is general agreement, so I have filed HADOOP-10423 with a patch to update the compatibility documentation. Chris Nauroth Hortonworks http://hortonworks.com/ On Thu, Mar 20, 2014 at 11:24 AM, Colin McCabe cmcc...@alumni.cmu.edu wrote: +1 for making this guarantee explicit. It also definitely seems like a good idea to test mixed versions in bigtop. HDFS is not immune to new client, old server scenarios because the HDFS client gets bundled into a lot of places. Colin On Mar 20, 2014 10:55 AM, Chris Nauroth cnaur...@hortonworks.com wrote: Our use of protobuf helps mitigate a lot of compatibility concerns, but there still can be situations that require careful coding on our part. When adding a new field to a protobuf message, the client might need to do a null check, even if the server-side implementation in the new version always populates the field. When adding a whole new RPC endpoint, the client might need to consider the possibility that the RPC endpoint isn't there on an old server, and degrade gracefully after the RPC fails. The original issue in MAPREDUCE-4052 concerned the script commands passed in a YARN container submission, where protobuf doesn't provide any validation beyond the fact that they're strings. Forward compatibility is harder than backward compatibility, and testing is a big challenge. Our test suites in the Hadoop repo don't cover this. Does anyone know if anything in Bigtop tries to run with mixed versions? I agree that we need to make it clear in the language that upgrading the client alone is insufficient to get access to new server-side features, including new YARN APIs. Thanks for the suggestions, Steve.
Chris Nauroth Hortonworks http://hortonworks.com/ On Thu, Mar 20, 2014 at 5:53 AM, Steve Loughran ste...@hortonworks.com wrote: I'm clearly supportive of this, though of course the testing costs needed to back up the assertion make it more expensive than just a statement. Two issues -we'd need to make clear that new cluster features that a client can invoke won't be available. You can't expect snapshot or symlink support running against a -2.2.0 cluster, even if the client supports it. -in YARN, there are no guarantees that an app compiled against later YARN APIs will work in old clusters. Because YARN apps upload themselves to the server, and run with their hadoop, hdfs and yarn libraries. We have to do a bit of introspection in our code already to support this situation. The compatibility doc would need to be clear on that too: YARN apps that use new APIs (including new fields in datastructures) can expect link exceptions On 20 March 2014 04:25, Vinayakumar B vinayakuma...@huawei.com wrote: +1, I agree with your point Chris. It depends on the client application how they are using the hdfs jars in their classpath. As the implementation already supports this compatibility (through protobuf), no extra code changes are required to support new client + old server. I feel it will be good to explicitly mention the compatibility of existing APIs in both versions. Anyway this is not applicable for the new APIs in the latest client, and this is understood. We can make it explicit in the document though. Regards, Vinayakumar B -Original Message- From: Chris Nauroth [mailto:cnaur...@hortonworks.com] Sent: 20 March 2014 05:36 To: common-...@hadoop.apache.org Cc: mapreduce-...@hadoop.apache.org; hdfs-...@hadoop.apache.org; yarn-dev@hadoop.apache.org Subject: Re: [DISCUSS] Clarification on Compatibility Policy: Upgraded Client + Old Server I think this kind of compatibility issue still could surface for HDFS, particularly for custom applications (i.e. 
something not executed via hadoop jar on a cluster node, where the client classes ought to be injected into the classpath automatically). Running DistCP between 2 clusters of different versions could result in a 2.4.0 client calling a 2.3.0 NameNode. Someone could potentially pick up the 2.4.0 WebHDFS client as a dependency and try to use it to make HTTP calls to a 2.3.0 HDFS cluster. Chris Nauroth Hortonworks http://hortonworks.com/ On Wed, Mar 19, 2014 at 4:28 PM, Vinod Kumar Vavilapalli vino...@apache.org wrote: It makes sense only for YARN today where we separated out the clients. HDFS is still a monolithic jar so this compatibility issue is kind of invalid there. +vinod On Mar 19, 2014, at 1:59 PM, Chris Nauroth
Re: [DISCUSS] Clarification on Compatibility Policy: Upgraded Client + Old Server
Our use of protobuf helps mitigate a lot of compatibility concerns, but there still can be situations that require careful coding on our part. When adding a new field to a protobuf message, the client might need to do a null check, even if the server-side implementation in the new version always populates the field. When adding a whole new RPC endpoint, the client might need to consider the possibility that the RPC endpoint isn't there on an old server, and degrade gracefully after the RPC fails. The original issue in MAPREDUCE-4052 concerned the script commands passed in a YARN container submission, where protobuf doesn't provide any validation beyond the fact that they're strings. Forward compatibility is harder than backward compatibility, and testing is a big challenge. Our test suites in the Hadoop repo don't cover this. Does anyone know if anything in Bigtop tries to run with mixed versions? I agree that we need to make it clear in the language that upgrading client alone is insufficient to get access to new server-side features, including new YARN APIs. Thanks for the suggestions, Steve. Chris Nauroth Hortonworks http://hortonworks.com/ On Thu, Mar 20, 2014 at 5:53 AM, Steve Loughran ste...@hortonworks.com wrote: I'm clearly supportive of this, though of course the testing costs needed to back up the assertion make it more expensive than just a statement. Two issues -we'd need to make clear that new cluster features that a client can invoke won't be available. You can't expect snapshot or symlink support running against a -2.2.0 cluster, even if the client supports it. -in YARN, there are no guarantees that an app compiled against later YARN APIs will work in old clusters. Because YARN apps upload themselves to the server, and run with their hadoop, hdfs and yarn libraries. We have to do a bit of introspection in our code already to support this situation. 
The compatibility doc would need to be clear on that too: YARN apps that use new APIs (including new fields in datastructures) can expect link exceptions On 20 March 2014 04:25, Vinayakumar B vinayakuma...@huawei.com wrote: +1, I agree with your point Chris. It depends on the client application how they are using the hdfs jars in their classpath. As the implementation already supports this compatibility (through protobuf), no extra code changes are required to support new client + old server. I feel it will be good to explicitly mention the compatibility of existing APIs in both versions. Anyway this is not applicable for the new APIs in the latest client, and this is understood. We can make it explicit in the document though. Regards, Vinayakumar B -Original Message- From: Chris Nauroth [mailto:cnaur...@hortonworks.com] Sent: 20 March 2014 05:36 To: common-...@hadoop.apache.org Cc: mapreduce-...@hadoop.apache.org; hdfs-...@hadoop.apache.org; yarn-dev@hadoop.apache.org Subject: Re: [DISCUSS] Clarification on Compatibility Policy: Upgraded Client + Old Server I think this kind of compatibility issue still could surface for HDFS, particularly for custom applications (i.e. something not executed via hadoop jar on a cluster node, where the client classes ought to be injected into the classpath automatically). Running DistCP between 2 clusters of different versions could result in a 2.4.0 client calling a 2.3.0 NameNode. Someone could potentially pick up the 2.4.0 WebHDFS client as a dependency and try to use it to make HTTP calls to a 2.3.0 HDFS cluster. Chris Nauroth Hortonworks http://hortonworks.com/ On Wed, Mar 19, 2014 at 4:28 PM, Vinod Kumar Vavilapalli vino...@apache.org wrote: It makes sense only for YARN today where we separated out the clients. HDFS is still a monolithic jar so this compatibility issue is kind of invalid there. 
+vinod On Mar 19, 2014, at 1:59 PM, Chris Nauroth cnaur...@hortonworks.com wrote: I'd like to discuss clarification of part of our compatibility policy. Here is a link to the compatibility documentation for release 2.3.0: http://hadoop.apache.org/docs/r2.3.0/hadoop-project-dist/hadoop-common /Compatibility.html#Wire_compatibility For convenience, here are the specific lines in question: Client-Server compatibility is required to allow users to continue using the old clients even after upgrading the server (cluster) to a later version (or vice versa). For example, a Hadoop 2.1.0 client talking to a Hadoop 2.3.0 cluster. Client-Server compatibility is also required to allow upgrading individual components without upgrading others. For example, upgrade HDFS from version 2.1.0 to 2.2.0 without upgrading MapReduce. Server-Server compatibility is required to allow mixed versions within an active cluster so the cluster may be upgraded without downtime in a rolling fashion
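The defensive-coding patterns described in this thread (null-checking a protobuf field that only a newer server populates, and degrading gracefully when it is absent) can be sketched as follows. This is a minimal illustration using a hand-written stand-in class; `GetReportResponseProto` and its accessors are hypothetical, not real Hadoop or protobuf-generated types:

```java
public class CompatSketch {
    // Stand-in for a protobuf-generated response message. In real generated
    // code, optional fields come with a hasXxx() presence check.
    static class GetReportResponseProto {
        private final String newField; // added in a later server version; may be absent
        GetReportResponseProto(String newField) { this.newField = newField; }
        boolean hasNewField() { return newField != null; }
        String getNewField() { return newField; }
    }

    // Client-side handling: never assume a field added in a newer version is set,
    // because the response may have come from an older server.
    static String describe(GetReportResponseProto resp) {
        if (resp.hasNewField()) {
            return resp.getNewField();
        }
        return "<unsupported by server>"; // degrade gracefully against an old server
    }

    public static void main(String[] args) {
        System.out.println(describe(new GetReportResponseProto("details")));
        System.out.println(describe(new GetReportResponseProto(null)));
    }
}
```

The same idea extends to whole RPC endpoints: a client calling a method the old server lacks should catch the resulting RPC failure and fall back, rather than assume the endpoint exists.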
[DISCUSS] Clarification on Compatibility Policy: Upgraded Client + Old Server
I'd like to discuss clarification of part of our compatibility policy. Here is a link to the compatibility documentation for release 2.3.0: http://hadoop.apache.org/docs/r2.3.0/hadoop-project-dist/hadoop-common/Compatibility.html#Wire_compatibility For convenience, here are the specific lines in question: Client-Server compatibility is required to allow users to continue using the old clients even after upgrading the server (cluster) to a later version (or vice versa). For example, a Hadoop 2.1.0 client talking to a Hadoop 2.3.0 cluster. Client-Server compatibility is also required to allow upgrading individual components without upgrading others. For example, upgrade HDFS from version 2.1.0 to 2.2.0 without upgrading MapReduce. Server-Server compatibility is required to allow mixed versions within an active cluster so the cluster may be upgraded without downtime in a rolling fashion. Notice that there is no specific mention of upgrading the client ahead of the server. (There is no clause for upgraded client + old server.) Based on my experience, this is a valid use case when a user wants to pick up a client-side bug fix ahead of the cluster administrator's upgrade schedule. Is it our policy to maintain client compatibility with old clusters within the same major release? I think many of us have assumed that the answer is yes and coded our new features accordingly, but it isn't made explicit in the documentation. Do we all agree that the answer is yes, or is it possibly up for debate depending on the change in question? In RFC 2119 lingo, is it a MUST or a SHOULD? Either way, I'd like to update the policy text to make our decision clear. After we have consensus, I can volunteer to file an issue and patch the text of the policy. This discussion started initially in MAPREDUCE-4052, which involved changing our scripting syntax for MapReduce YARN container submissions. We settled the question there by gating the syntax change behind a configuration option. 
By default, it will continue using the existing syntax currently understood by the pre-2.4.0 NodeManager, thus preserving compatibility. We wanted to open the policy question for wider discussion though. Thanks, everyone. Chris Nauroth Hortonworks http://hortonworks.com/
Re: [DISCUSS] Clarification on Compatibility Policy: Upgraded Client + Old Server
I think this kind of compatibility issue still could surface for HDFS, particularly for custom applications (i.e. something not executed via hadoop jar on a cluster node, where the client classes ought to be injected into the classpath automatically). Running DistCP between 2 clusters of different versions could result in a 2.4.0 client calling a 2.3.0 NameNode. Someone could potentially pick up the 2.4.0 WebHDFS client as a dependency and try to use it to make HTTP calls to a 2.3.0 HDFS cluster. Chris Nauroth Hortonworks http://hortonworks.com/ On Wed, Mar 19, 2014 at 4:28 PM, Vinod Kumar Vavilapalli vino...@apache.org wrote: It makes sense only for YARN today where we separated out the clients. HDFS is still a monolithic jar so this compatibility issue is kind of invalid there. +vinod On Mar 19, 2014, at 1:59 PM, Chris Nauroth cnaur...@hortonworks.com wrote: I'd like to discuss clarification of part of our compatibility policy. Here is a link to the compatibility documentation for release 2.3.0: http://hadoop.apache.org/docs/r2.3.0/hadoop-project-dist/hadoop-common/Compatibility.html#Wire_compatibility For convenience, here are the specific lines in question: Client-Server compatibility is required to allow users to continue using the old clients even after upgrading the server (cluster) to a later version (or vice versa). For example, a Hadoop 2.1.0 client talking to a Hadoop 2.3.0 cluster. Client-Server compatibility is also required to allow upgrading individual components without upgrading others. For example, upgrade HDFS from version 2.1.0 to 2.2.0 without upgrading MapReduce. Server-Server compatibility is required to allow mixed versions within an active cluster so the cluster may be upgraded without downtime in a rolling fashion. Notice that there is no specific mention of upgrading the client ahead of the server. (There is no clause for upgraded client + old server.) 
Based on my experience, this is a valid use case when a user wants to pick up a client-side bug fix ahead of the cluster administrator's upgrade schedule. Is it our policy to maintain client compatibility with old clusters within the same major release? I think many of us have assumed that the answer is yes and coded our new features accordingly, but it isn't made explicit in the documentation. Do we all agree that the answer is yes, or is it possibly up for debate depending on the change in question? In RFC 2119 lingo, is it a MUST or a SHOULD? Either way, I'd like to update the policy text to make our decision clear. After we have consensus, I can volunteer to file an issue and patch the text of the policy. This discussion started initially in MAPREDUCE-4052, which involved changing our scripting syntax for MapReduce YARN container submissions. We settled the question there by gating the syntax change behind a configuration option. By default, it will continue using the existing syntax currently understood by the pre-2.4.0 NodeManager, thus preserving compatibility. We wanted to open the policy question for wider discussion though. Thanks, everyone. Chris Nauroth Hortonworks http://hortonworks.com/
Re: Help : Source Code for Map and Reduce task spawning
Hi Sandeep, If you haven't already seen it, then you might find it helpful to look at the documentation on writing a YARN application. Since MapReduce is implemented in terms of a YARN application, it might be helpful to see the simpler example presented in this documentation before attempting to understand MapReduce. http://hadoop.apache.org/docs/r2.2.0/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html As far as task spawning, the responsibilities are split across several classes in the hadoop-mapreduce-client-app sub-project. Here are a few classes that I think are relevant: RMContainerRequestor is responsible for obtaining the YARN containers for running MapReduce tasks. This is where you'll find the code that uses AllocateRequest to ask the ResourceManager for containers. ContainerLauncherImpl is responsible for actually launching the containers. This is where you'll find the code that uses StartContainerRequest to ask a NodeManager to run a task. TaskAttemptImpl is responsible for configuring exactly what gets run in one of the containers. This is where you'll find the code that uses ContainerLaunchContext to set up the exact commands to run for the task (either map or reduce). MapReduceChildJVM is also significant as a helper class. TaskAttemptImpl calls this to do things like setting up the task's environment variables and the exact launch command. Hope this helps! Chris Nauroth Hortonworks http://hortonworks.com/ On Thu, Feb 20, 2014 at 9:47 PM, Sandeep Kandula srkan...@ncsu.edu wrote: Hi, I am new to Hadoop and *I am interested in finding the code where the reduce and map tasks are spawned*. Towards this goal I have been going through the MapReduce, YARN source code for the past few days. I have started from the NodeManager class and found it launches containers on the corresponding node. MRAppMaster class is run by the launch_container.sh script downloaded on each of the nodes. 
I have observed that state machines are used for the transitions of a job and task, and each of these transitions affects the state of the object. But I haven't really found a specific location in the code base where the map and reduce tasks are spawned. Any help in this regard is much appreciated. Thanks, Sandeep
Re: Is there any alternative solution thinking on the event model of YARN
The event model is so much simpler and mvn -Pvisualize draws out a beautiful state diagram. Oh my goodness. How have I gone so long without knowing about this? This is so awesome! Thanks for the tip, Ravi! Chris Nauroth Hortonworks http://hortonworks.com/ On Thu, Feb 20, 2014 at 7:47 PM, Vinod Kumar Vavilapalli vino...@apache.org wrote: I actually think that the component boundaries are much cleaner now in YARN. Components (mostly) only interact via events and not via synchronous method calls, which Ravi hinted at. Each event is decorated with its source and destination. This is arguably done only via code comments, but if you think it helps, you can pursue https://issues.apache.org/jira/browse/YARN-1743. The implementation in YARN is in fact loosely modeled around actors. It's a custom implementation; we didn't go the full route as we didn't need to. Like Ravi said, it takes a little getting used to. I have seen developers beyond the initial set take a little while getting used to it, but then they do lots of things much more easily after they get a grip on it; specifically compared to my experience with devs working around Hadoop 1.x code, where we didn't have cleaner component boundaries. Let us know if things like YARN-1743 will help. We can do more. Definitely look for the state machines as Ravi mentioned; that can simplify your understanding of things a lot. +Vinod On Feb 20, 2014, at 5:54 PM, Jeff Zhang jezh...@gopivotal.com wrote: Hi Ravi, Thanks for your reply. The reason I am considering an alternative to the event model is that I found the actor model used by Spark much easier to read and understand. Here I will compare two differences in the usage of these two frameworks (I will ignore the performance comparison for now). 1. An actor explicitly specifies the event destination (event handler) when sending a message, while the event handler is not obvious in the YARN event model. e.g. actor: actorRef ! 
message // it is easy to understand that actorRef is the event destination (event handler) yarn: dispatcher.dispatch(message) // it's not clear who the event handler is; we must look for the event registration code, which is in other places. 2. An actor has the event source built in, so it is easy to send a message back. There are lots of state machines in yarn, and these state machines often send messages to each other; e.g. ContainerImpl interacts with ApplicationImpl by sending messages. e.g. actor: sender ! message // sender is the message sender's actor reference, which is built into the actor, so it is easy to send a message back yarn: dispatcher.dispatch(event) // the yarn event model does not know the event source, and even if it knew the source, it would still need to rely on the dispatcher to send a message. It is not easy for a user to see the event flow from this piece of code. You still need to look for the event registration code to find the event handler. Let me know if you have any thoughts. Thanks Jeff Zhang On Fri, Feb 21, 2014 at 4:02 AM, Ravi Prakash ravi...@ymail.com wrote: Hi Jeff! The event model does have some issues, but I believe it has made things a lot simpler. The source could easily be added to the event object if you needed it to. There might be issues with flow control, but I thought they were fixed where they were cropping up. MRv1 had all these method calls which could affect the state in several ways, and synchronization and locking was extremely difficult to get right (perhaps only by the select few who completely understood the codebase). The event model is so much simpler and mvn -Pvisualize draws out a beautiful state diagram. It takes a little getting used to, but you can connect the debugger and trace through the code too with conditional breakpoints. This is of course just my opinion. 
Ravi On Wednesday, February 19, 2014 6:33 PM, Jeff Zhang jezh...@gopivotal.com wrote: Hi all, I have studied YARN for several months, and have some thoughts on the event model of YARN. 1. The event model does help the performance of YARN by allowing async calls. 2. But the event model makes the boundary of each component unclear. The event receiver does not know the sender of an event, which makes it difficult for the reader to follow the event flow. E.g. in the node manager, there are several event senders and handlers, which include the container, application, localization server, log aggregation service and so on. One component will send events to another component. Because the receiver lacks the event sender, it is not easy to read the code and understand the event flow. The event flow in the resource manager is even more complex, involving the RMApp, RMAppAttempt, RMContainer, RMNode
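The registration-vs-dispatch split Jeff describes can be seen in a minimal sketch. This is not the real YARN AsyncDispatcher API, just an illustration of why `dispatch()` alone does not reveal who handles an event; the handler is bound elsewhere, in registration code:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

// Minimal dispatcher sketch: handlers are keyed by event type in one place,
// so a dispatch() call site names only the event, never the receiving component.
public class DispatcherSketch {
    enum EventType { CONTAINER_LAUNCHED, APP_FINISHED }

    private final Map<EventType, Consumer<String>> handlers = new HashMap<>();

    // Registration typically lives far from the dispatch call sites,
    // which is exactly what makes the event flow hard to read.
    void register(EventType type, Consumer<String> handler) {
        handlers.put(type, handler);
    }

    void dispatch(EventType type, String payload) {
        // The caller names only the event type; unhandled events are dropped here.
        handlers.getOrDefault(type, p -> { }).accept(payload);
    }

    public static void main(String[] args) {
        DispatcherSketch d = new DispatcherSketch();
        d.register(EventType.CONTAINER_LAUNCHED,
            p -> System.out.println("container handler got: " + p));
        d.dispatch(EventType.CONTAINER_LAUNCHED, "container_01");
    }
}
```

By contrast, an actor-style send (`actorRef ! message`) carries the destination at the call site, which is the readability difference the thread is debating.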
[jira] [Resolved] (YARN-191) Enhancements to YARN for Windows Server and Windows Azure development and runtime environments
[ https://issues.apache.org/jira/browse/YARN-191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth resolved YARN-191. Resolution: Fixed We've completed the intended scope of this issue, so I'm resolving it. Thank you to all contributors! Enhancements to YARN for Windows Server and Windows Azure development and runtime environments -- Key: YARN-191 URL: https://issues.apache.org/jira/browse/YARN-191 Project: Hadoop YARN Issue Type: Improvement Reporter: Bikas Saha Assignee: Bikas Saha This JIRA tracks the work that needs to be done on trunk to enable Hadoop to run on Windows Server and Azure environments. This incorporates porting relevant work from the similar effort on branch 1 tracked via HADOOP-8079. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Re: branch development for HADOOP-9639
+1 for the idea. The branch committership clause was added for exactly this kind of scenario. From the phrasing in the bylaws, it looks like we'll need assistance from PMC to get the ball rolling. Is there a PMC member out there who could volunteer to help start the process with Sangjin? Chris Nauroth Hortonworks http://hortonworks.com/ On Mon, Dec 2, 2013 at 11:47 AM, Sangjin Lee sj...@apache.org wrote: We have been having discussions on HADOOP-9639 (shared cache for jars) and the proposed design there for some time now. We are going to start work on this and have it vetted and reviewed by the community. I have just filed some more implementation JIRAs for this feature: YARN-1465, MAPREDUCE-5662, YARN-1466, YARN-1467 Rather than working privately in our corner and sharing a big patch at the end, I'd like to explore the idea of developing on a branch in the public to foster more public feedback. Recently the Hadoop PMC has passed the change to the bylaws to allow for branch committers ( http://mail-archives.apache.org/mod_mbox/hadoop-general/201307.mbox/%3CCACO5Y4y7HZnn3BS-ZyCVfv-UBcMudeQhndr2vqg%3DXqE1oBiQvQ%40mail.gmail.com%3E ), and I think it would be a good model for this development. I'd like to propose a branch development and a branch committer status for a couple of us who are going to work on this per bylaw. Could you please let me know what you think? Thanks, Sangjin
Re: Next releases
Arun, what are your thoughts on test-only patches? I know I've been merging a lot of Windows test stabilization patches down to branch-2.2. These can't rightly be called blockers, but they do improve dev experience, and there is no risk to product code. Chris Nauroth Hortonworks http://hortonworks.com/ On Fri, Nov 8, 2013 at 1:30 AM, Steve Loughran ste...@hortonworks.com wrote: On 8 November 2013 02:42, Arun C Murthy a...@hortonworks.com wrote: Gang, Thinking through the next couple of releases here, appreciate f/b. # hadoop-2.2.1 I was looking through commit logs and there is a *lot* of content here (81 commits as on 11/7). Some are features/improvements and some are fixes - it's really hard to distinguish what is important and what isn't. I propose we start with a blank slate (i.e. blow away branch-2.2 and start fresh from a copy of branch-2.2.0) and then be very careful and meticulous about including only *blocker* fixes in branch-2.2. So, most of the content here comes via the next minor release (i.e. hadoop-2.3) In future, we continue to be *very* parsimonious about what gets into a patch release (major.minor.patch) - in general, these should be only *blocker* fixes or key operational issues. +1 # hadoop-2.3 I'd like to propose the following features for YARN/MR to make it into hadoop-2.3 and punt the rest to hadoop-2.4 and beyond: * Application History Server - This is happening in a branch and is close; with it we can provide a reasonable experience for new frameworks being built on top of YARN. * Bug-fixes in RM Restart * Minimal support for long-running applications (e.g. security) via YARN-896 +1 -the complete set isn't going to make it, but I'm sure we can identify the key ones * RM Fail-over via ZKFC * Anything else? HDFS??? 
- If I had the time, I'd like to do some work on the HADOOP-9361 filesystem spec tests -this is mostly some specification, the basis of a better test framework for newer FS tests, and some more tests, with a couple of minor changes to some of the FS code, mainly in terms of tightening some of the exceptions thrown (IOE - EOF) otherwise: - I'd like the hadoop-openstack JAR in; it's already in branch-2 so it's a matter of ensuring testing during the release against as many providers as possible. - There are a fair few JIRAs about updating versions of dependencies -the S3 JetS3t update went in this week, but there are more, as well as cruft in the POMs which shows up downstream. I think we could update the low-risk dependencies (test-time, log4j, &c), while avoiding those we know will be trouble (jetty). This may seem minor but it does make a big diff to the downstream projects.
[jira] [Resolved] (YARN-1219) FSDownload changes file suffix making FileUtil.unTar() throw exception
[ https://issues.apache.org/jira/browse/YARN-1219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth resolved YARN-1219. - Resolution: Fixed I've committed this to trunk, branch-2, and branch-2.1-beta. Shanyu, thank you for the patch. Omkar, thank you for help with code review. FSDownload changes file suffix making FileUtil.unTar() throw exception -- Key: YARN-1219 URL: https://issues.apache.org/jira/browse/YARN-1219 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 3.0.0, 2.1.1-beta, 2.1.2-beta Reporter: shanyu zhao Assignee: shanyu zhao Fix For: 2.1.2-beta Attachments: YARN-1219.patch While running a Hive join operation on Yarn, I saw an exception as described below. This is caused by FSDownload copying the files into a temp file and changing the suffix to .tmp before unpacking. In unpack(), it uses FileUtil.unTar(), which determines whether the file is gzipped by looking at the file suffix: {code} boolean gzipped = inFile.toString().endsWith("gz"); {code} To fix this problem, we can remove the .tmp in the temp file name. 
Here is the detailed exception: org.apache.commons.compress.archivers.tar.TarArchiveInputStream.getNextTarEntry(TarArchiveInputStream.java:240) at org.apache.hadoop.fs.FileUtil.unTarUsingJava(FileUtil.java:676) at org.apache.hadoop.fs.FileUtil.unTar(FileUtil.java:625) at org.apache.hadoop.yarn.util.FSDownload.unpack(FSDownload.java:203) at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:287) at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:50) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:722) -- This message was sent by Atlassian JIRA (v6.1#6144)
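The suffix problem reported in YARN-1219 can be illustrated in isolation. This sketch mirrors the detection check quoted in the issue (the method name and file names here are illustrative, not the real FSDownload/FileUtil code):

```java
// Illustrates the YARN-1219 failure mode: gzip detection keyed off the file
// name breaks once a ".tmp" suffix is appended to the downloaded resource.
public class SuffixSketch {
    // Mirrors the check quoted above from FileUtil.unTar():
    //   boolean gzipped = inFile.toString().endsWith("gz");
    static boolean looksGzipped(String fileName) {
        return fileName.endsWith("gz");
    }

    public static void main(String[] args) {
        // The real resource is detected as gzipped...
        System.out.println(looksGzipped("resources.tar.gz"));     // true
        // ...but the renamed temp copy is not, so unTar() mis-handles it.
        System.out.println(looksGzipped("resources.tar.gz.tmp")); // false
    }
}
```

This is why the fix described in the issue (dropping the .tmp from the temp file name) restores correct detection.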
Re: [VOTE] Release Apache Hadoop 2.1.1-beta
I suspect that HDFS-5228 needs to be a blocker for the RC, considering the impact on client code. Existing working client code can now get a NullPointerException as a result of this bug. Chris Nauroth Hortonworks http://hortonworks.com/ On Wed, Sep 18, 2013 at 3:41 PM, Roman Shaposhnik r...@apache.org wrote: On Tue, Sep 17, 2013 at 9:58 PM, Roman Shaposhnik r...@apache.org wrote: I'm also running the tests on fully distributed clusters in Bigtop -- will report the findings tomorrow. The first result of my test run is this: https://issues.apache.org/jira/browse/HDFS-5225 Not sure if it qualifies as a blocker, but it looks serious enough. Not just because it essentially makes DN logs grow to infinity (this issue actually can be mitigated by managing the local FS) but because it suggests a deeper problem somehow. I'd love to proceed with more testing, but I'm keeping this current cluster running if anybody wants to take a look. Thanks, Roman.
Re: [VOTE] Plan to create release candidate for 0.23.8
+1 (non-binding) BTW, I left a comment on HDFS-4835 suggesting that you include HDFS-3180 for WebHDFS socket connect/read timeouts. It's up to you. (I'm voting +1 for the release plan either way.) Chris Nauroth Hortonworks http://hortonworks.com/ On Fri, May 17, 2013 at 7:25 PM, Eli Collins e...@cloudera.com wrote: +1 On Friday, May 17, 2013, Thomas Graves wrote: Hello all, We've had a few critical issues come up in 0.23.7 that I think warrants a 0.23.8 release. The main one is MAPREDUCE-5211. There are a couple of other issues that I want finished up and get in before we spin it. Those include HDFS-3875, HDFS-4805, and HDFS-4835. I think those are on track to finish up early next week. So I hope to spin 0.23.8 soon after this vote completes. Please vote '+1' to approve this plan. Voting will close on Friday May 24th at 2:00pm PDT. Thanks, Tom Graves
[jira] [Created] (YARN-593) container launch on Windows does not correctly populate classpath with new process's environment variables and localized resources
Chris Nauroth created YARN-593: -- Summary: container launch on Windows does not correctly populate classpath with new process's environment variables and localized resources Key: YARN-593 URL: https://issues.apache.org/jira/browse/YARN-593 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 3.0.0 Reporter: Chris Nauroth Assignee: Chris Nauroth On Windows, we must bundle the classpath of a launched container in an intermediate jar with a manifest. Currently, this logic incorrectly uses the nodemanager process's environment variables for substitution. Instead, it needs to use the new environment for the launched process. Also, the bundled classpath is missing some localized resources for directories, due to a quirk in the way {{File#toURI}} decides whether or not to append a trailing '/'. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
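The {{File#toURI}} quirk mentioned in the issue is easy to reproduce: for a path that exists and is a directory, the returned URI carries a trailing '/', but for a path that does not exist yet (as localized resource directories may not, at the time the manifest is built) it does not. A minimal sketch of the behavior (the class name is illustrative, not from the Hadoop source):

```java
import java.io.File;
import java.nio.file.Files;

public class ToUriTrailingSlash {
    public static void main(String[] args) throws Exception {
        // An existing directory: toURI() appends a trailing '/'.
        File dir = Files.createTempDirectory("yarn-demo").toFile();
        System.out.println(dir.toURI().toString().endsWith("/"));   // true

        // A path that does not exist at call time: no trailing '/',
        // even if it will later become a directory of localized resources.
        File missing = new File(dir, "not-localized-yet");
        System.out.println(missing.toURI().toString().endsWith("/")); // false
    }
}
```

This is why a classpath manifest assembled before localization can end up with directory entries missing the trailing slash that the classloader expects.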
Re: [VOTE] Release Apache Hadoop 0.23.7
+1 (non-binding) - downloaded binary tarball - verified signatures and checksums - deployed to 3 Ubuntu VMs: 1xNN, 1xRM, 2xDN, 2xNM, 1x2NN - tested multiple HDFS operations - ran wordcount MR job - verified that 2NN can take a checkpoint On Wed, Apr 17, 2013 at 1:17 AM, Chris Douglas cdoug...@apache.org wrote: +1 Verified checksums and signatures, ran some tests, built the tarball. -C On Thu, Apr 11, 2013 at 12:55 PM, Thomas Graves tgra...@yahoo-inc.com wrote: I've created a release candidate (RC0) for hadoop-0.23.7 that I would like to release. This release is a sustaining release with several important bug fixes in it. The RC is available at: http://people.apache.org/~tgraves/hadoop-0.23.7-candidate-0/ The RC tag in svn is here: http://svn.apache.org/viewvc/hadoop/common/tags/release-0.23.7-rc0/ The maven artifacts are available via repository.apache.org. Please try the release and vote; the vote will run for the usual 7 days. thanks, Tom Graves
Re: [VOTE] Release Apache Hadoop 2.0.4-alpha
+1 (non-binding) - downloaded binary tarball - verified signatures and checksums - deployed to 3 Ubuntu VMs: 1xNN, 1xRM, 2xDN, 2xNM, 1x2NN - tested multiple HDFS operations - ran wordcount MR job - verified that 2NN can take a checkpoint On Wed, Apr 17, 2013 at 1:24 AM, Chris Douglas cdoug...@apache.org wrote: +1 Verified checksum, signatures. Ran some tests, built the package. -C On Fri, Apr 12, 2013 at 2:56 PM, Arun C Murthy a...@hortonworks.com wrote: Folks, I've created a release candidate (RC2) for hadoop-2.0.4-alpha that I would like to release. The RC is available at: http://people.apache.org/~acmurthy/hadoop-2.0.4-alpha-rc2/ The RC tag in svn is here: http://svn.apache.org/repos/asf/hadoop/common/tags/release-2.0.4-alpha-rc2 The maven artifacts are available via repository.apache.org. Please try the release and vote; the vote will run for the usual 7 days. thanks, Arun -- Arun C. Murthy Hortonworks Inc. http://hortonworks.com/
Re: Maven build YARN ResourceManager only
No problem! I think yarn-dev is appropriate, so I'm removing user (bcc'd one last time). The user list is focused on how to use Hadoop, and the *-dev lists are focused on how to develop Hadoop. What specific problem are you seeing when you try to compile hadoop-yarn-server-resourcemanager independently? I'm going to take a guess that it can't find classes from its other dependencies in the Hadoop source tree. To handle this, you can run the following from the top of the source tree: mvn clean install -DskipTests This will build the whole source tree and install the resulting jars into your local Maven repository. Then, subsequent builds of individual submodules like hadoop-yarn-server-resourcemanager will link to the jars in your local Maven repository during their builds. When you pull in new changes from upstream, you may need to repeat the install. For example, this would be required if someone added a new method in hadoop-yarn-common and changed hadoop-yarn-server-resourcemanager to call it. (General rule of thumb: if your build breaks after pulling in new changes, try a fresh mvn clean install to see if that fixes it.) You may want to read the file BUILDING.txt in the root of the source tree, especially the section titled Building components separately. That file contains the same information and a lot of other helpful build tips. --Chris On Sat, Apr 13, 2013 at 7:10 AM, Chin-Jung Hsu oxhead.l...@gmail.com wrote: Hi Chris, Appreciate your help, and sorry for the crossposting on both 'user' and 'yarn-dev'. I first posted on yarn-dev and didn't see anything. I then thought I might not be able to post on that list. That's why I posted it again on 'user'. Should I post this kind of question on 'yarn-dev' or 'user'? Right now, I cannot compile hadoop-yarn-server-resourcemanager independently. I have to use mvn package -Pdist -DskipTests -Dtar to compile the whole project. What command is appropriate to this scenario? 
Thanks, oxhead On Sat, Apr 13, 2013 at 12:41 AM, Chris Nauroth cnaur...@hortonworks.com wrote: I don't have an answer to your exact question, but I do have a different suggestion that prevents the need to do frequent rebuilds of the whole Hadoop source tree. First, do a full build of the distribution tar.gz. Extract it and set up a custom hadoop-env.sh for yourself. Inside the hadoop-env.sh file, export the environment variables HADOOP_USER_CLASSPATH_FIRST=true and HADOOP_CLASSPATH= any classpath that you want to prepend before the classes loaded from the distribution. For example, this is what I have in mine right now, because I'm mostly working on HDFS and NodeManager: export HADOOP_USER_CLASSPATH_FIRST=true HADOOP_REPO=~/git/hadoop-common export HADOOP_CLASSPATH=$HADOOP_REPO/hadoop-common-project/hadoop-common/target/classes:$HADOOP_REPO/hadoop-hdfs-project/hadoop-hdfs/target/classes:$HADOOP_REPO/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/target/classes:$HADOOP_REPO/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/classes For your ResourceManager work, you could set up your HADOOP_CLASSPATH to point at your hadoop-yarn-server-resourcemanager/target/classes directory. Then, source (.) this hadoop-env.sh in any shell that you're using to run hadoop commands. The daemons will print their full classpath before launching, so you can check that to see if it worked. With all of this in place, you can keep recompiling just hadoop-yarn-server-resourcemanager whenever you make changes instead of the whole hadoop-common tree. Does this help? Thanks, --Chris On Fri, Apr 12, 2013 at 8:46 PM, Chin-Jung Hsu oxhead.l...@gmail.com wrote: I am implementing my own YARN scheduler under 2.0.3-alpha. Is that possible to build only the ResourceManager project, and then create a distribution tar.gz for the entire Hadoop project? Right now, the compiling time takes me about 9 minutes. Thanks, oxhead
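The full-tree install plus per-module rebuild workflow described in this thread can be sketched as a short command sequence (paths are relative to a hadoop-common checkout; the module directory is an assumption based on the layout referenced elsewhere in the thread):

```shell
# One-time, and again after pulling upstream changes: build everything
# and install the resulting jars into the local Maven repository.
mvn clean install -DskipTests

# Afterwards, rebuild just the ResourceManager module; its dependencies
# on sibling modules resolve against the jars installed above.
cd hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
mvn install -DskipTests
```

The second build is much faster than `mvn package -Pdist -DskipTests -Dtar` because only the one module is compiled.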
Re: Maven build YARN ResourceManager only
I don't have an answer to your exact question, but I do have a different suggestion that prevents the need to do frequent rebuilds of the whole Hadoop source tree. First, do a full build of the distribution tar.gz. Extract it and set up a custom hadoop-env.sh for yourself. Inside the hadoop-env.sh file, export the environment variable HADOOP_USER_CLASSPATH_FIRST=true and set HADOOP_CLASSPATH to any classpath that you want to prepend before the classes loaded from the distribution. For example, this is what I have in mine right now, because I'm mostly working on HDFS and NodeManager:

export HADOOP_USER_CLASSPATH_FIRST=true
HADOOP_REPO=~/git/hadoop-common
export HADOOP_CLASSPATH=$HADOOP_REPO/hadoop-common-project/hadoop-common/target/classes:$HADOOP_REPO/hadoop-hdfs-project/hadoop-hdfs/target/classes:$HADOOP_REPO/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/target/classes:$HADOOP_REPO/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/classes

For your ResourceManager work, you could set up your HADOOP_CLASSPATH to point at your hadoop-yarn-server-resourcemanager/target/classes directory. Then, source (.) this hadoop-env.sh in any shell that you're using to run hadoop commands. The daemons will print their full classpath before launching, so you can check that to see if it worked. With all of this in place, you can keep recompiling just hadoop-yarn-server-resourcemanager whenever you make changes instead of the whole hadoop-common tree. Does this help? Thanks, --Chris On Fri, Apr 12, 2013 at 8:46 PM, Chin-Jung Hsu oxhead.l...@gmail.com wrote: I am implementing my own YARN scheduler under 2.0.3-alpha. Is it possible to build only the ResourceManager project, and then create a distribution tar.gz for the entire Hadoop project? Right now, the compiling time takes me about 9 minutes. Thanks, oxhead
[jira] [Created] (YARN-557) TestUnmanagedAMLauncher fails on Windows
Chris Nauroth created YARN-557: -- Summary: TestUnmanagedAMLauncher fails on Windows Key: YARN-557 URL: https://issues.apache.org/jira/browse/YARN-557 Project: Hadoop YARN Issue Type: Bug Components: applications Affects Versions: 3.0.0 Reporter: Chris Nauroth Assignee: Chris Nauroth {{TestUnmanagedAMLauncher}} fails on Windows due to attempting to run a Unix-specific command in distributed shell and use of a Unix-specific environment variable to determine username for the {{ContainerLaunchContext}}.
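A portable alternative to reading a Unix-only environment variable such as $USER is the user.name system property, which the JVM populates on both Unix and Windows. A minimal sketch of the idea (the class name and fallback chain here are illustrative, not the fix that was committed for YARN-557):

```java
public class CurrentUser {
    // Resolve the current user without depending on a Unix-only
    // environment variable: prefer the JVM's user.name property,
    // then fall back to USER (Unix) or USERNAME (Windows).
    static String currentUser() {
        String name = System.getProperty("user.name");
        if (name == null || name.isEmpty()) {
            name = System.getenv("USER");     // typically set on Unix
        }
        if (name == null || name.isEmpty()) {
            name = System.getenv("USERNAME"); // typically set on Windows
        }
        return name;
    }

    public static void main(String[] args) {
        System.out.println(currentUser());
    }
}
```

Tests that launch containers can then derive the username the same way on every platform instead of assuming a Unix environment.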