Re: [Vote] Merge branch-trunk-win to trunk
Andrew, this used to be on all -dev lists. Let's keep it that way.

To the point: does this mean that people are silently porting Windows changes to branch-2? New features on a branch should be voted on first, no?

Thanks,
--Konstantin

On Mon, Mar 25, 2013 at 1:36 PM, Andrew Purtell apurt...@apache.org wrote:

Noticed this too. Simply a 'public' modifier is missing, but it's unclear how this could not have been caught prior to check-in.

On Mon, Mar 25, 2013 at 9:17 PM, Konstantin Boudnik c...@apache.org wrote:

It doesn't look like any progress has been done on the ticket below in the last 3 weeks. And now branch-2 can't be compiled because of

hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSShell.java:[895,15] WINDOWS is not public in org.apache.hadoop.fs.Path; cannot be accessed from outside package

That's exactly why I was -1'ing this...

Cos

On Mon, Mar 04, 2013 at 05:41PM, Matt Foley wrote:

Thanks, gentlemen. I've opened and taken responsibility for https://issues.apache.org/jira/browse/HADOOP-9359. Giri Kesavan has agreed to help with the parts that require Jenkins admin access.

Thanks,
--Matt

On Mon, Mar 4, 2013 at 5:00 PM, Konstantin Shvachko shv.had...@gmail.com wrote:

+1 on the merge. I am glad we agreed. Having Jira to track the CI effort is a good idea.

Thanks,
--Konstantin

On Mon, Mar 4, 2013 at 3:29 PM, Matt Foley mfo...@hortonworks.com wrote:

Thanks. I agree Windows -1's in test-patch should not block commits.

--Matt

On Mon, Mar 4, 2013 at 2:30 PM, Konstantin Shvachko shv.had...@gmail.com wrote:

On Mon, Mar 4, 2013 at 12:22 PM, Matt Foley mfo...@hortonworks.com wrote:

Konstantine, you have voted -1, and stated some requirements before you'll withdraw that -1. As I plan to do work to fulfill those requirements, I want to make sure that what I'm proposing will, in fact, satisfy you. That's why I'm asking, if we implement full test-patch integration for Windows, does it seem to you that that would provide adequate support?

Yes.
I have learned not to presume that my interpretation is correct. My interpretation of item #1 is that test-patch provides pre-commit build, so it would satisfy item #1. But rather than assuming that I am interpreting it correctly, I simply want your agreement that it would, or if not, clarification why it won't. I agree it will satisfy my item #1. I did not agree in my previous email, but I changed my mind based on the latest discussion. I have to explain why now. I was proposing nightly build because I did not want pre-commit build for Windows block commits to Linux. But if people are fine just ignoring -1s for the Windows part of the build it should be good. Regarding item #2, it is also my interpretation that test-patch provides an on-demand (perhaps 20-minutes deferred) Jenkins build and unit test, with logs available to the developer, so it would satisfy item #2. But rather than assuming that I am interpreting it correctly, I simply want your agreement that it would, or if not, clarification why it won't. It will satisfy my item #2 in the following way: I can duplicate your pre-commit build for Windows and add an input parameter, which would let people run the build on their patches chosen from local machine rather than attaching them to Jiras. Thanks, --Konstantin In agile terms, you are the Owner of these requirements. Please give me owner feedback as to whether my proposed work sounds like it will satisfy the requirements. Thank you, --Matt On Sun, Mar 3, 2013 at 12:16 PM, Konstantin Shvachko shv.had...@gmail.com wrote: Didn't I explain in details what I am asking for? Thanks, --Konst On Sun, Mar 3, 2013 at 11:08 AM, Matt Foley mfo...@hortonworks.com wrote: Hi Konstantin, I'd like to point out two things: First, I already committed in this thread (email of Thu, Feb 28, 2013 at 6:01 PM) to providing CI for Windows builds. So please stop acting like I'm resisting this idea or something. 
Second, you didn't answer my question, you just kvetched about the phrasing. So I ask again: Will providing full test-patch integration (pre-commit build and unit test triggered by Jira Patch Available state) satisfy your request for functionality #1 and #2? Yes or no, please. Thanks, --Matt On Sat, Mar 2, 2013 at 7:32 PM, Konstantin Shvachko shv.had...@gmail.com wrote: Hi Matt, On Sat, Mar 2, 2013 at 12:32 PM, Matt Foley
Re: [Vote] Merge branch-trunk-win to trunk
Adding other mailing lists I missed earlier.

Cos,

There is progress being made on that ticket. Also, the compile failure has nothing to do with that ticket. Please follow the discussion here on why this happened; it was due to an invalid commit that was reverted:
https://issues.apache.org/jira/browse/HDFS-4615?focusedCommentId=13612650&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13612650

Regards,
Suresh

On Mon, Mar 25, 2013 at 1:17 PM, Konstantin Boudnik c...@apache.org wrote:

It doesn't look like any progress has been done on the ticket below in the last 3 weeks. And now branch-2 can't be compiled because of

hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSShell.java:[895,15] WINDOWS is not public in org.apache.hadoop.fs.Path; cannot be accessed from outside package

That's exactly why I was -1'ing this...

Cos

On Mon, Mar 04, 2013 at 05:41PM, Matt Foley wrote:

Thanks, gentlemen. I've opened and taken responsibility for https://issues.apache.org/jira/browse/HADOOP-9359. Giri Kesavan has agreed to help with the parts that require Jenkins admin access.

Thanks,
--Matt

On Mon, Mar 4, 2013 at 5:00 PM, Konstantin Shvachko shv.had...@gmail.com wrote:

+1 on the merge. I am glad we agreed. Having Jira to track the CI effort is a good idea.

Thanks,
--Konstantin

On Mon, Mar 4, 2013 at 3:29 PM, Matt Foley mfo...@hortonworks.com wrote:

Thanks. I agree Windows -1's in test-patch should not block commits.

--Matt

On Mon, Mar 4, 2013 at 2:30 PM, Konstantin Shvachko shv.had...@gmail.com wrote:

On Mon, Mar 4, 2013 at 12:22 PM, Matt Foley mfo...@hortonworks.com wrote:

Konstantine, you have voted -1, and stated some requirements before you'll withdraw that -1. As I plan to do work to fulfill those requirements, I want to make sure that what I'm proposing will, in fact, satisfy you. That's why I'm asking, if we implement full test-patch integration for Windows, does it seem to you that that would provide adequate support?

Yes.
I have learned not to presume that my interpretation is correct. My interpretation of item #1 is that test-patch provides pre-commit build, so it would satisfy item #1. But rather than assuming that I am interpreting it correctly, I simply want your agreement that it would, or if not, clarification why it won't. I agree it will satisfy my item #1. I did not agree in my previous email, but I changed my mind based on the latest discussion. I have to explain why now. I was proposing nightly build because I did not want pre-commit build for Windows block commits to Linux. But if people are fine just ignoring -1s for the Windows part of the build it should be good. Regarding item #2, it is also my interpretation that test-patch provides an on-demand (perhaps 20-minutes deferred) Jenkins build and unit test, with logs available to the developer, so it would satisfy item #2. But rather than assuming that I am interpreting it correctly, I simply want your agreement that it would, or if not, clarification why it won't. It will satisfy my item #2 in the following way: I can duplicate your pre-commit build for Windows and add an input parameter, which would let people run the build on their patches chosen from local machine rather than attaching them to Jiras. Thanks, --Konstantin In agile terms, you are the Owner of these requirements. Please give me owner feedback as to whether my proposed work sounds like it will satisfy the requirements. Thank you, --Matt On Sun, Mar 3, 2013 at 12:16 PM, Konstantin Shvachko shv.had...@gmail.com wrote: Didn't I explain in details what I am asking for? Thanks, --Konst On Sun, Mar 3, 2013 at 11:08 AM, Matt Foley mfo...@hortonworks.com wrote: Hi Konstantin, I'd like to point out two things: First, I already committed in this thread (email of Thu, Feb 28, 2013 at 6:01 PM) to providing CI for Windows builds. So please stop acting like I'm resisting this idea or something. 
Second, you didn't answer my question, you just kvetched about the phrasing. So I ask again: Will providing full test-patch integration (pre-commit build and unit test triggered by Jira Patch Available state) satisfy your request for functionality #1 and #2? Yes or no, please. Thanks, --Matt On Sat, Mar 2, 2013 at 7:32 PM, Konstantin Shvachko shv.had...@gmail.com wrote: Hi Matt, On Sat, Mar 2, 2013 at 12:32 PM, Matt Foley mfo...@hortonworks.com wrote:
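
For readers unfamiliar with the compiler error quoted in this thread ("WINDOWS is not public in org.apache.hadoop.fs.Path; cannot be accessed from outside package"), here is a minimal Java sketch of the rule behind it. It is an illustration under stated assumptions, not the actual Hadoop source: the classes PackagePrivatePath, PublicPath and VisibilityCheck are hypothetical stand-ins, and because a single file cannot span two packages, the sketch inspects the field modifiers through reflection instead of reproducing the compile failure itself.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

// Hypothetical stand-in for a class whose constant lacks the 'public' modifier.
class PackagePrivatePath {
    // No access modifier: the field is visible only inside its own package,
    // so a test class living in a different package cannot compile against it.
    static final boolean WINDOWS =
        System.getProperty("os.name").startsWith("Windows");
}

// Hypothetical stand-in for the same class after adding the 'public' modifier.
class PublicPath {
    // In real code the declaring class must be public as well for
    // cross-package access; this sketch only looks at the field modifier.
    public static final boolean WINDOWS =
        System.getProperty("os.name").startsWith("Windows");
}

class VisibilityCheck {
    // Returns true if the named field is declared 'public' on the given class.
    static boolean isPublicField(Class<?> owner, String name) {
        try {
            Field field = owner.getDeclaredField(name);
            return Modifier.isPublic(field.getModifiers());
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException("No field named " + name, e);
        }
    }

    public static void main(String[] args) {
        System.out.println("package-private field public? "
            + isPublicField(PackagePrivatePath.class, "WINDOWS"));
        System.out.println("public field public? "
            + isPublicField(PublicPath.class, "WINDOWS"));
    }
}
```

A pre-commit build that compiles the test sources of every module would flag this kind of mismatch before check-in, which is the point being argued in the thread.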
Re: [Vote] Merge branch-trunk-win to trunk
On Mon, Mar 4, 2013 at 12:22 PM, Matt Foley mfo...@hortonworks.com wrote: Konstantine, you have voted -1, and stated some requirements before you'll withdraw that -1. As I plan to do work to fulfill those requirements, I want to make sure that what I'm proposing will, in fact, satisfy you. That's why I'm asking, if we implement full test-patch integration for Windows, does it seem to you that that would provide adequate support? Yes. I have learned not to presume that my interpretation is correct. My interpretation of item #1 is that test-patch provides pre-commit build, so it would satisfy item #1. But rather than assuming that I am interpreting it correctly, I simply want your agreement that it would, or if not, clarification why it won't. I agree it will satisfy my item #1. I did not agree in my previous email, but I changed my mind based on the latest discussion. I have to explain why now. I was proposing nightly build because I did not want pre-commit build for Windows block commits to Linux. But if people are fine just ignoring -1s for the Windows part of the build it should be good. Regarding item #2, it is also my interpretation that test-patch provides an on-demand (perhaps 20-minutes deferred) Jenkins build and unit test, with logs available to the developer, so it would satisfy item #2. But rather than assuming that I am interpreting it correctly, I simply want your agreement that it would, or if not, clarification why it won't. It will satisfy my item #2 in the following way: I can duplicate your pre-commit build for Windows and add an input parameter, which would let people run the build on their patches chosen from local machine rather than attaching them to Jiras. Thanks, --Konstantin In agile terms, you are the Owner of these requirements. Please give me owner feedback as to whether my proposed work sounds like it will satisfy the requirements. 
Thank you, --Matt On Sun, Mar 3, 2013 at 12:16 PM, Konstantin Shvachko shv.had...@gmail.com wrote: Didn't I explain in details what I am asking for? Thanks, --Konst On Sun, Mar 3, 2013 at 11:08 AM, Matt Foley mfo...@hortonworks.com wrote: Hi Konstantin, I'd like to point out two things: First, I already committed in this thread (email of Thu, Feb 28, 2013 at 6:01 PM) to providing CI for Windows builds. So please stop acting like I'm resisting this idea or something. Second, you didn't answer my question, you just kvetched about the phrasing. So I ask again: Will providing full test-patch integration (pre-commit build and unit test triggered by Jira Patch Available state) satisfy your request for functionality #1 and #2? Yes or no, please. Thanks, --Matt On Sat, Mar 2, 2013 at 7:32 PM, Konstantin Shvachko shv.had...@gmail.com wrote: Hi Matt, On Sat, Mar 2, 2013 at 12:32 PM, Matt Foley mfo...@hortonworks.com wrote: Konstantin, I would like to explore what it would take to remove this perceived impediment -- Glad you decided to explore. Thank you. although I reserve the right to argue that this is not pre-requisite to merging the cross-platform support patch. It's your right indeed. So as mine to question what the platform support means for you, which I believe remained unclear. I do not impede the change as you should have noticed. My requirement comes from my perception of the support, which means to me exactly two things: 1. The ability to recognise the code is broken for the platform 2. The ability to test new patches on the platform The latter is problematic, as many noticed in this thread, for those whose customary environment does not include Windows. If we implemented full test-patch support for Windows on trunk, would that fulfill both your items #1 and #2? Please note that: a) Pushing the Patch Available button in Jira shall cause a pre-commit build to start within, I believe, 20 minutes. 
b) That build keeps logs for both java build and unit tests for several days, that are accessible to all viewers. In item #1 I mostly asking for the nightly build, which is simpler than test-patch. The latter would be ideal from the platform support viewpoint, but it is for the community to decide if we want to add extra +3 hours to the build. Nightly build in my understanding is triggered by the timer rather than by Jira's submit patch button. On Jenkins build configuration you can specify it under Build periodically. So, does this provide sufficient on-demand support that we don't have to implement a whole new on-demand VM support structure of some sort for #2 (which would be an extraordinary and impractical demand)? I did not mention VMs. Item #2 means a build, which runs test-patch target with the file specified by a user (instead of a jira attachment). When user clicks Build Now link a box is displayed where the user can enter the file
Re: [Vote] Merge branch-trunk-win to trunk
+1 on the merge. I am glad we agreed. Having Jira to track the CI effort is a good idea. Thanks, --Konstantin On Mon, Mar 4, 2013 at 3:29 PM, Matt Foley mfo...@hortonworks.com wrote: Thanks. I agree Windows -1's in test-patch should not block commits. --Matt On Mon, Mar 4, 2013 at 2:30 PM, Konstantin Shvachko shv.had...@gmail.com wrote: On Mon, Mar 4, 2013 at 12:22 PM, Matt Foley mfo...@hortonworks.com wrote: Konstantine, you have voted -1, and stated some requirements before you'll withdraw that -1. As I plan to do work to fulfill those requirements, I want to make sure that what I'm proposing will, in fact, satisfy you. That's why I'm asking, if we implement full test-patch integration for Windows, does it seem to you that that would provide adequate support? Yes. I have learned not to presume that my interpretation is correct. My interpretation of item #1 is that test-patch provides pre-commit build, so it would satisfy item #1. But rather than assuming that I am interpreting it correctly, I simply want your agreement that it would, or if not, clarification why it won't. I agree it will satisfy my item #1. I did not agree in my previous email, but I changed my mind based on the latest discussion. I have to explain why now. I was proposing nightly build because I did not want pre-commit build for Windows block commits to Linux. But if people are fine just ignoring -1s for the Windows part of the build it should be good. Regarding item #2, it is also my interpretation that test-patch provides an on-demand (perhaps 20-minutes deferred) Jenkins build and unit test, with logs available to the developer, so it would satisfy item #2. But rather than assuming that I am interpreting it correctly, I simply want your agreement that it would, or if not, clarification why it won't. 
It will satisfy my item #2 in the following way: I can duplicate your pre-commit build for Windows and add an input parameter, which would let people run the build on their patches chosen from local machine rather than attaching them to Jiras. Thanks, --Konstantin In agile terms, you are the Owner of these requirements. Please give me owner feedback as to whether my proposed work sounds like it will satisfy the requirements. Thank you, --Matt On Sun, Mar 3, 2013 at 12:16 PM, Konstantin Shvachko shv.had...@gmail.com wrote: Didn't I explain in details what I am asking for? Thanks, --Konst On Sun, Mar 3, 2013 at 11:08 AM, Matt Foley mfo...@hortonworks.com wrote: Hi Konstantin, I'd like to point out two things: First, I already committed in this thread (email of Thu, Feb 28, 2013 at 6:01 PM) to providing CI for Windows builds. So please stop acting like I'm resisting this idea or something. Second, you didn't answer my question, you just kvetched about the phrasing. So I ask again: Will providing full test-patch integration (pre-commit build and unit test triggered by Jira Patch Available state) satisfy your request for functionality #1 and #2? Yes or no, please. Thanks, --Matt On Sat, Mar 2, 2013 at 7:32 PM, Konstantin Shvachko shv.had...@gmail.com wrote: Hi Matt, On Sat, Mar 2, 2013 at 12:32 PM, Matt Foley mfo...@hortonworks.com wrote: Konstantin, I would like to explore what it would take to remove this perceived impediment -- Glad you decided to explore. Thank you. although I reserve the right to argue that this is not pre-requisite to merging the cross-platform support patch. It's your right indeed. So as mine to question what the platform support means for you, which I believe remained unclear. I do not impede the change as you should have noticed. My requirement comes from my perception of the support, which means to me exactly two things: 1. The ability to recognise the code is broken for the platform 2. 
The ability to test new patches on the platform The latter is problematic, as many noticed in this thread, for those whose customary environment does not include Windows. If we implemented full test-patch support for Windows on trunk, would that fulfill both your items #1 and #2? Please note that: a) Pushing the Patch Available button in Jira shall cause a pre-commit build to start within, I believe, 20 minutes. b) That build keeps logs for both java build and unit tests for several days, that are accessible to all viewers. In item #1 I mostly asking for the nightly build, which is simpler than test-patch. The latter would be ideal from the platform support viewpoint, but it is for the community to decide if we want to add extra +3 hours to the build. Nightly build in my understanding is triggered by the timer rather than by Jira's submit patch button. On Jenkins build configuration
RE: [Vote] Merge branch-trunk-win to trunk
That sounds like a reasonable approach. As you say below, we need to ensure that the Java side of OS-specific optimizations is generic and won't need drastic surgery when an optimization is ported to another platform.

Bikas

-----Original Message-----
From: Uma Maheswara Rao G [mailto:mahesw...@huawei.com]
Sent: Thursday, February 28, 2013 9:20 PM
To: hdfs-...@hadoop.apache.org; common-...@hadoop.apache.org
Cc: yarn-dev@hadoop.apache.org; mapreduce-...@hadoop.apache.org
Subject: RE: [Vote] Merge branch-trunk-win to trunk

+1 (non-binding)

Thanks a lot for the work done by Suresh and the community team!

I don't think there will be many problems because of platform dependency, as our development is in Java. If we have some native code ports, then dev members have to take care of them.

One question regarding this. For example: suppose a contributor provides a patch porting some native code for performance improvements and is interested only in Linux. I hope some other contributors will then help get the Windows patch done if possible. If others are too busy to get that done within the timelines, then I think we can commit the Linux support patch and leave a JIRA open for Windows support? [Make sure that the porting options are configurable, add a platform check, give a release note about platform support, etc.?]

Regards,
Uma

From: Suresh Srinivas [sur...@hortonworks.com]
Sent: Wednesday, February 27, 2013 4:25 AM
To: common-...@hadoop.apache.org
Cc: yarn-dev@hadoop.apache.org; hdfs-...@hadoop.apache.org; mapreduce-...@hadoop.apache.org
Subject: [Vote] Merge branch-trunk-win to trunk

I had posted a heads-up about merging branch-trunk-win to trunk on Feb 8th. I am happy to announce that we are ready for the merge.
Here is a brief recap on the highlights of the work done:

- Command-line scripts for the Hadoop surface area
- Mapping the HDFS permissions model to Windows
- Abstracted and reconciled mismatches around differences in Path semantics in Java and Windows
- Native Task Controller for Windows
- Implementation of a Block Placement Policy to support cloud environments, more specifically Azure.
- Implementation of Hadoop native libraries for Windows (compression codecs, native I/O)
- Several reliability issues, including race-conditions, intermittent test failures, resource leaks.
- Several new unit test cases written for the above changes

Please find the details of the work in CHANGES.branch-trunk-win.txt - Common changes http://bit.ly/Xe7Ynv, HDFS changes http://bit.ly/13QOSo9, and YARN and MapReduce changes http://bit.ly/128zzMt. This is the work ported from branch-1-win to a branch based on trunk.

For details of the testing done, please see the thread - http://bit.ly/WpavJ4. Merge patch for this is available on HADOOP-8562 https://issues.apache.org/jira/browse/HADOOP-8562.

This was a large undertaking that involved developing code, testing the entire Hadoop stack, including scale tests. This is made possible only with the contribution from many many folks in the community. Following people contributed to this work: Ivan Mitic, Chuan Liu, Ramya Sunil, Bikas Saha, Kanna Karanam, John Gordon, Brandon Li, Chris Nauroth, David Lao, Sumadhur Reddy Bolli, Arpit Agarwal, Ahmed El Baz, Mike Liddell, Jing Zhao, Thejas Nair, Steve Maine, Ganeshan Iyer, Raja Aluri, Giridharan Kesavan, Ramya Bharathi Nimmagadda, Daryn Sharp, Arun Murthy, Tsz-Wo Nicholas Sze, Suresh Srinivas and Sanjay Radia. There are many others who contributed as well providing feedback and comments on numerous jiras.

The vote will run for seven days and will end on March 5, 6:00PM PST.
Regards,
Suresh

On Thu, Feb 7, 2013 at 6:41 PM, Mahadevan Venkatraman mah...@microsoft.com wrote:

It is super exciting to look at the prospect of these changes being merged to trunk. Having Windows as one of the supported Hadoop platforms is a fantastic opportunity both for the Hadoop project and Microsoft customers.

This work began around a year back when a few of us started with a basic port of Hadoop on Windows. Ever since, the Hadoop team in Microsoft have made significant progress in the following areas: (PS: Some of these items are already included in Suresh's email, but including again for completeness)

- Command-line scripts for the Hadoop surface area
- Mapping the HDFS permissions model to Windows
- Abstracted and reconciled mismatches around differences in Path semantics in Java and Windows
- Native Task Controller for Windows
- Implementation of a Block Placement Policy to support cloud environments, more specifically Azure.
- Implementation of Hadoop native libraries for Windows (compression codecs, native I/O)
- Several reliability issues, including race-conditions, intermittent test failures, resource leaks.
- Several new unit test cases written for the above changes

In the process, we have closely engaged with the Apache open source community and have got great support
Re: [Vote] Merge branch-trunk-win to trunk
Commitment is a good thing. I think the two builds that I proposed are a prerequisite for Win support. If we commit the Windows patch, people will start breaking it the next day, which we won't know without the nightly build and won't be able to fix without the on-demand one.

Making two builds is less than 2 days' work, imho, given that there is a Windows node available and that mvn targets are in place. Correct me if I missed any complications in the process.

Thanks,
--Konst

On Fri, Mar 1, 2013 at 1:28 PM, Chris Douglas cdoug...@apache.org wrote:

Konstantin-

There's no debate on the necessity of CI and related infrastructure to support the platform well. Suresh outlined the support to effect this here: http://s.apache.org/s1

Is the commitment to establish this infrastructure after the merge sufficient?

-C

On Fri, Mar 1, 2013 at 12:18 PM, Konstantin Shvachko shv.had...@gmail.com wrote:

-1

We should have a CI infrastructure in place before we can commit to supporting the Windows platform.

Eric is right, Win/Cygwin was supported since day one. I had a Windows box under my desk running nightly builds back in 2006-07. People were irritated but I was filing Windows bugs until the 0.22 release. Times are changing and I am glad to see wider support for the Win platform.

But in order to make it work you guys need to put the CI process in place:

1. Windows Jenkins build: could be nightly or PreCommit.
   - Nightly would mean that changes can be committed to trunk based on the Linux PreCommit build. And people will file bugs if the change broke the Windows nightly build.
   - PreCommit-win build will mean automatic reporting of failed tests to the respective jira, blocking commits the same way as it is now with Linux PreCommit builds.
   We should discuss which way is more efficient for developers.
2. On-demand-windows Jenkins build. I see it as a build to which I can attach my patch and the build will run my changes on a dedicated Windows box. That way people can test their changes without having personal Windows nodes.
I think this is the minimal set of requirements for us to be able to commit to the new platform. Right now I see only one Windows-related build, https://builds.apache.org/view/Hadoop/job/Hadoop-1-win/, which has been failing since Sept 8, 2012 and did not run in the last month.

Thanks,
--Konst

On Thu, Feb 28, 2013 at 8:47 PM, Eric Baldeschwieler eri...@hortonworks.com wrote:

+1 (non-binding)

A few observations:

- Windows has actually been a supported platform for Hadoop since 0.1. Doug championed supporting Windows then and we've continued to do it with varying vigor over time. To my knowledge we've never made a decision to drop Windows support. The change here is improving our support and dropping the requirement of Cygwin. We had Nutch Windows users on the list in 2006 and we've been supporting Windows FS requirements since inception.
- A little pragmatism will go a long way. As a community we've got to stay committed to keeping Hadoop simple (so it does work on many platforms) and extending it to take advantage of key emerging OS/hardware features, such as containers, new FSs, virtualization, flash ... We should all plan to let new features and optimizations emerge that don't work everywhere, if they are compelling and central to Hadoop's mission of being THE best fabric for storing and processing big data.
- A UI project like KDE has to deal with the MANY differences between Windows and Linux UI APIs. Hadoop faces no such complex challenge and hence can be maintained from a single codeline IMO. It is mostly abstracted from the OS APIs via Java and our design choices. Where it is not we can continue to add pluggable abstractions.
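
The "pluggable abstractions" point above can be sketched roughly as follows. This is an illustration only, under stated assumptions: the names ProcessTerminator, UnixProcessTerminator, WindowsProcessTerminator and ProcessTerminators are hypothetical and are not Hadoop APIs, and the sketch does not describe how Hadoop itself terminates tasks. It only shows the general shape: callers depend on an interface, and one small factory is the single place that looks at the operating system.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical abstraction: callers depend on this interface, not on the OS.
interface ProcessTerminator {
    // Returns the command line that would forcibly stop the given process.
    List<String> killCommand(long pid);
}

// Hypothetical Unix-like implementation using the standard 'kill' utility.
class UnixProcessTerminator implements ProcessTerminator {
    @Override
    public List<String> killCommand(long pid) {
        return Arrays.asList("kill", "-9", Long.toString(pid));
    }
}

// Hypothetical Windows implementation using the 'taskkill' utility.
class WindowsProcessTerminator implements ProcessTerminator {
    @Override
    public List<String> killCommand(long pid) {
        return Arrays.asList("taskkill", "/F", "/PID", Long.toString(pid));
    }
}

class ProcessTerminators {
    // The only place that inspects the OS name; everything else stays generic.
    static ProcessTerminator forOsName(String osName) {
        boolean windows = osName.toLowerCase(Locale.ROOT).startsWith("windows");
        return windows ? new WindowsProcessTerminator() : new UnixProcessTerminator();
    }

    public static void main(String[] args) {
        ProcessTerminator terminator = forOsName(System.getProperty("os.name"));
        // Only prints the command; nothing is executed.
        System.out.println(String.join(" ", terminator.killCommand(1234L)));
    }
}
```

Passing the OS name in as a parameter, rather than reading it inside each implementation, keeps both code paths testable from a Linux-only development machine, which speaks to the concern raised in this thread by contributors without access to Windows.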
Re: [Vote] Merge branch-trunk-win to trunk
On Fri, Mar 1, 2013 at 1:57 PM, Konstantin Shvachko shv.had...@gmail.com wrote:

Commitment is a good thing. I think the two builds that I proposed are a prerequisite for Win support. If we commit windows patch people will start breaking it the next day. Which we won't know without the nightly build and won't be able to fix without the on-demand one.

As several people have pointed out already, the surface of possible conflicts is relatively limited, and, as you did in 2007, the devs on Windows will report and fix bugs in that platform as they find them. CI is important for detecting and preventing bugs, but this isn't software we're launching into orbit.

Making two builds is less than 2 days work, imho, given that there is a Windows node available and that mvn targets are in place. Correct me if I missed any complications in the process.

On Fri, Mar 1, 2013 at 3:47 PM, Konstantin Boudnik c...@apache.org wrote:

It seems that with the HW in place, the matter of setting at least nightly build is trivial for anyone with up to date Windows knowledge. I wish I could help. Going without a validation is a recipe for a disaster IMO.

Fair enough, though that also implies that the window for regressions is small, and it leaves little room to doubt that this will receive priority. Until it's merged, spurious notifications that the current trunk breaks Windows are an awkward introduction to devs' workflow. The order of merge/CI is a choice between mild annoyances, really.

But it might be moot. Giri: you're doing the work on this. When do you think it can be complete?

-C
Re: [Vote] Merge branch-trunk-win to trunk
On Mar 1, 2013, at 1:57 PM, Konstantin Shvachko wrote:

Commitment is a good thing. I think the two builds that I proposed are a prerequisite for Win support. If we commit windows patch people will start breaking it the next day. Which we won't know without the nightly build and won't be able to fix without the on-demand one.

They clearly are a prerequisite for declaring official support for Windows. But they should not be a prerequisite for the merge.

Currently we enable Windows through Cygwin. There is no Jenkins. Folks have been fixing Windows issues as they are discovered. Merging the branch makes the situation no worse than it is today: all tests pass on Linux, there is no regression. Merging now removes the Cygwin dependency.

Jenkins is critical to make Windows an officially supported platform without Cygwin. When Jenkins is enabled, the team that has worked on this branch will have to fix any bugs that have arisen in the meantime.

sanjay
RE: [Vote] Merge branch-trunk-win to trunk
+1 (non-binding)

I want to share my vote of confidence in this community. If motivated to do so, this community can keep this project cross-platform and continue to rapidly innovate without breaking a sweat. The day we started working on this, I saw the foundations of greatness in the quality and volume of dev tests, the code itself, and the Apache values themselves.

1.) Hadoop's unit tests and their frameworks are very well thought out and the consideration and energy that went into their design is worthy of praise. The MiniCluster abstractions utilize very few resources and put all the processes into one JVM for easy debugging. It is very easy to select specific tests from the full suite to reproduce an issue reported in another environment - like the Jenkins build server or another contributor's environment.
2.) This community has done an excellent job of incorporating well-placed log messages to make it easy to post mortem troubleshoot most failures. The logs are very useful, and it is extremely rare that troubleshooting a failure requires debugging a live repro.
3.) Hadoop is written primarily in Java, a cross-platform language that provides its own platform in the form of the JVM to insulate most of the code from the specifics of the OS layer.
4.) CoPDoC - The right priorities, and well stated.

Thank you,
John

-----Original Message-----
From: Ivan Mitic [mailto:iva...@microsoft.com]
Sent: Wednesday, February 27, 2013 6:32 PM
To: mapreduce-...@hadoop.apache.org; common-...@hadoop.apache.org
Cc: yarn-dev@hadoop.apache.org; hdfs-...@hadoop.apache.org
Subject: RE: [Vote] Merge branch-trunk-win to trunk

+1 (non-binding)

I am really glad to see this happening! As people already mentioned, this has been a great engineering effort involving many people!

Folks raised some valid concerns below and I thought it would be good to share my 2 cents. In my opinion, we don't have to solve all these problems right now.
As we move forward with two platforms, we can start addressing one problem at a time and incrementally improve. In the first iteration, maintaining Hadoop on Windows could be just everyone trying to do their best effort (make sure Jenkins build succeeds at least). We already have people who are building/running trunk on Windows daily, so they would jump in and fix problems as needed (we've been doing this in branch-trunk-win for a while now). Although I see that problems could arise with platform-specific features/optimizations, I don't think these are frequent, so in most cases everything will just work.

Merging the two branches sooner rather than later does seem like the right thing to do if the ultimate goal is to have Hadoop on both platforms. Now that the port has completed, we will have people in Microsoft (and elsewhere) wanting to contribute features/improvements to the trunk branch. A separate branch would just make things more difficult and confusing for everyone :)

Hope this makes sense.

-----Original Message-----
From: Todd Lipcon [mailto:t...@cloudera.com]
Sent: Wednesday, February 27, 2013 3:43 PM
To: common-...@hadoop.apache.org
Cc: yarn-dev@hadoop.apache.org; hdfs-...@hadoop.apache.org; mapreduce-...@hadoop.apache.org
Subject: Re: [Vote] Merge branch-trunk-win to trunk

On Wed, Feb 27, 2013 at 2:54 PM, Suresh Srinivas sur...@hortonworks.com wrote:

With that we need to decide how our precommit process looks. My inclination is to wait for +1 from precommit builds on both the platforms to ensure no issues are introduced. Thoughts?

2. Feature development impact

Some questions have been raised about whether new features would need to be supported on both the platforms. Yes. I do not see a reason why features cannot work on both the platforms, with the exception of platform specific optimizations. This is what Java gives us.

I'm concerned about the above.
Personally, I don't have access to any Windows boxes with development tools, and I know nothing about developing on Windows. The only Windows I run is an 8GB VM with 1 GB RAM allocated, for PowerPoint :) If I submit a patch and it gets "-1 tests failed" on the Windows slave, how am I supposed to proceed?

I think a reasonable compromise would be that the tests should always *build* on Windows before commit, and contributors should do their best to look at the test logs for any Windows-specific failures. But, beyond looking at the logs, a "-1 tests failed" on Windows should not block a commit. Those contributors who are interested in Windows being a first-class platform should be responsible for watching the Windows builds and debugging/fixing any regressions that might be Windows-specific.

I also think the KDE model that Harsh pointed out is an interesting one -- i.e., the idea that we would not merge Windows support to trunk, but rather treat it as a parallel code line which lives in the ASF and has its own builds and releases. The Windows team would periodically merge trunk
Re: [Vote] Merge branch-trunk-win to trunk
+1 non-binding

Nice to see that this work is going to trunk.

Raja Aluri

On Tue, Feb 26, 2013 at 2:55 PM, Suresh Srinivas sur...@hortonworks.com wrote:

I had posted heads up about merging branch-trunk-win to trunk on Feb 8th. I am happy to announce that we are ready for the merge. Here is a brief recap on the highlights of the work done:

- Command-line scripts for the Hadoop surface area
- Mapping the HDFS permissions model to Windows
- Abstracted and reconciled mismatches around differences in Path semantics in Java and Windows
- Native Task Controller for Windows
- Implementation of a Block Placement Policy to support cloud environments, more specifically Azure.
- Implementation of Hadoop native libraries for Windows (compression codecs, native I/O)
- Several reliability issues, including race-conditions, intermittent test failures, resource leaks.
- Several new unit test cases written for the above changes

Please find the details of the work in CHANGES.branch-trunk-win.txt - Common changes http://bit.ly/Xe7Ynv, HDFS changes http://bit.ly/13QOSo9, and YARN and MapReduce changes http://bit.ly/128zzMt. This is the work ported from branch-1-win to a branch based on trunk. For details of the testing done, please see the thread - http://bit.ly/WpavJ4. Merge patch for this is available on HADOOP-8562 https://issues.apache.org/jira/browse/HADOOP-8562.

This was a large undertaking that involved developing code, testing the entire Hadoop stack, including scale tests. This is made possible only with the contribution from many many folks in the community. Following people contributed to this work: Ivan Mitic, Chuan Liu, Ramya Sunil, Bikas Saha, Kanna Karanam, John Gordon, Brandon Li, Chris Nauroth, David Lao, Sumadhur Reddy Bolli, Arpit Agarwal, Ahmed El Baz, Mike Liddell, Jing Zhao, Thejas Nair, Steve Maine, Ganeshan Iyer, Raja Aluri, Giridharan Kesavan, Ramya Bharathi Nimmagadda, Daryn Sharp, Arun Murthy, Tsz-Wo Nicholas Sze, Suresh Srinivas and Sanjay Radia.
There are many others who contributed as well providing feedback and comments on numerous jiras. The vote will run for seven days and will end on March 5, 6:00PM PST. Regards, Suresh On Thu, Feb 7, 2013 at 6:41 PM, Mahadevan Venkatraman mah...@microsoft.comwrote: It is super exciting to look at the prospect of these changes being merged to trunk. Having Windows as one of the supported Hadoop platforms is a fantastic opportunity both for the Hadoop project and Microsoft customers. This work began around a year back when a few of us started with a basic port of Hadoop on Windows. Ever since, the Hadoop team in Microsoft have made significant progress in the following areas: (PS: Some of these items are already included in Suresh's email, but including again for completeness) - Command-line scripts for the Hadoop surface area - Mapping the HDFS permissions model to Windows - Abstracted and reconciled mismatches around differences in Path semantics in Java and Windows - Native Task Controller for Windows - Implementation of a Block Placement Policy to support cloud environments, more specifically Azure. - Implementation of Hadoop native libraries for Windows (compression codecs, native I/O) - Several reliability issues, including race-conditions, intermittent test failures, resource leaks. - Several new unit test cases written for the above changes In the process, we have closely engaged with the Apache open source community and have got great support and assistance from the community in terms of contributing fixes, code review comments and commits. In addition, the Hadoop team at Microsoft has also made good progress in other projects including Hive, Pig, Sqoop, Oozie, HCat and HBase. Many of these changes have already been committed to the respective trunks with help from various committers and contributors. 
It is great to see the commitment of the community to support multiple platforms, and we look forward to the day when a developer/customer is able to successfully deploy a complete solution stack based on Apache Hadoop releases. Next Steps: All of the above changes are part of the Windows Azure HDInsight and HDInsight Server products from Microsoft. We have successfully on-boarded several internal customers and have been running production workloads on Windows Azure HDInsight. Our vision is to create a big data platform based on Hadoop, and we are committed to helping make Hadoop a world-class solution that anyone can use to solve their biggest data challenges. As an immediate next step, we would like to have a discussion around how we can ensure that the quality of the mainline Hadoop branches on Windows is maintained. To this end, we would like to get to the state where we have pre-checkin validation gates and nightly test runs enabled on Windows. If you have any suggestions around this, please do send an
Re: [Vote] Merge branch-trunk-win to trunk
themselves. 1.) Hadoop's unit tests and their frameworks are very well thought out and the consideration and energy that went into their design is worthy of praise. The MiniCluster abstractions utilize very few resources and put all the processes into one JVM for easy debugging. It is very easy to select specific tests from the full suite to reproduce an issue reported in another environment - like the Jenkins build server or another contributor's environment. 2.) This community has done an excellent job of incorporating well-placed log messages to make it easy to post mortem troubleshoot most failures. The logs are very useful, and it is extremely rare that troubleshooting a failure requires debugging a live repro. 3.) Hadoop is written primarily in Java, a cross-platform language that provides its own platform in the form of the JVM to insulate most of the code from the specifics of the OS layer. 4.) CoPDoC - The right priorities, and well stated. Thank you, John -Original Message- From: Ivan Mitic [mailto:iva...@microsoft.com] Sent: Wednesday, February 27, 2013 6:32 PM To: mapreduce-...@hadoop.apache.org; common-...@hadoop.apache.org Cc: yarn-dev@hadoop.apache.org; hdfs-...@hadoop.apache.org Subject: RE: [Vote] Merge branch-trunk-win to trunk +1 (non-binding) I am really glad to see this happening! As people already mentioned, this has been a great engineering effort involving many people! Folks raised some valid concerns below and I thought it would be good to share my 2 cents. In my opinion, we don't have to solve all these problems right now. As we move forward with two platforms, we can start addressing one problem at a time and incrementally improve. In the first iteration, maintaining Hadoop on Windows could be just everyone trying to do their best effort (make sure Jenkins build succeeds at least). 
We already have people who are building/running trunk on Windows daily, so they would jump in and fix problems as needed (we've been doing this in branch-trunk-win for a while now). Although I see that the problems could arise with platform specific features/optimizations, I don't think these are frequent, so in most cases everything will just work. Merging the two branches sooner rather than later does seems like the right thing to do if the ultimate goal is to have Hadoop on both platforms. Now that the port has completed, we will have people in Microsoft (and elsewhere) wanting to contribute features/improvements to the trunk branch. A separate branch would just make things more difficult and confusing for everyone :) Hope this makes sense. -Original Message- From: Todd Lipcon [mailto:t...@cloudera.com] Sent: Wednesday, February 27, 2013 3:43 PM To: common-...@hadoop.apache.org Cc: yarn-dev@hadoop.apache.org; hdfs-...@hadoop.apache.org; mapreduce-...@hadoop.apache.org Subject: Re: [Vote] Merge branch-trunk-win to trunk On Wed, Feb 27, 2013 at 2:54 PM, Suresh Srinivas sur...@hortonworks.com wrote: With that we need to decide how our precommit process looks. My inclination is to wait for +1 from precommit builds on both the platforms to ensure no issues are introduced. Thoughts? 2. Feature development impact Some questions have been raised about would new features need to be supported on both the platforms. Yes. I do not see a reason why features cannot work on both the platforms, with the exception of platform specific optimizations. This what Java gives us. I'm concerned about the above. Personally, I don't have access to any Windows boxes with development tools, and I know nothing about developing on Windows. The only Windows I run is an 8GB VM with 1 GB RAM allocated, for powerpoint :) If I submit a patch and it gets -1 tests failed on the Windows slave, how am I supposed to proceed? 
I think a reasonable compromise would be that the tests should always *build* on Windows before commit, and contributors should do their best to look at the test logs for any Windows-specific failures. But, beyond looking at the logs, a "-1 tests failed" on Windows should not block a commit. Those contributors who are interested in Windows being a first-class platform should be responsible for watching the Windows builds and debugging/fixing any regressions that might be Windows-specific.

I also think the KDE model that Harsh pointed out is an interesting one -- i.e., the idea that we would not merge Windows support to trunk, but rather treat it as a parallel code line which lives in the ASF and has its own builds and releases. The Windows team would periodically merge trunk-win to pick up any new changes, and do a separate test/release process. I'm not convinced this is the best idea, but it is worth a discussion of pros and cons.

-Todd

On Wed, Feb 27, 2013 at 11:56 AM, Eli Collins e...@cloudera.com wrote:

Bobby raises some good questions. A related one, since most current developers won't
Re: [Vote] Merge branch-trunk-win to trunk
On Thu, Feb 28, 2013 at 03:08PM, sanjay Radia wrote:

+1 Java has done the bulk of the work in making Hadoop multi-platform. Windows-specific code is a tiny percentage of the code. Jenkins support for Windows is going to help us keep the platform portable going forward. I expect that the vast majority of new commits will have no problems. I propose that we start by fixing problems that Jenkins raises but not block new commits for too long if the author does not have a Windows box or if a volunteer does not step up.

Considering the typical set of software most of the people here work with, it would be completely inappropriate to block commits for failing Windows-specific features. After all, Microsoft never did bother to check what features or compatibility matters they have broken in Java and elsewhere, so why should we? I believe these kinds of rules have to be set and discussed before the merge is done.

Cheers, Cos
Re: [Vote] Merge branch-trunk-win to trunk
+1 for the merge.

As someone who has been testing the code for many months now, on both single-node and multi-node clusters, I am very confident about the stability and the quality of the code. I have run several regression tests to verify distributed cache, streaming, compression, capacity scheduler, job history and many more features in HDFS and MR.

- Ramya

On Thu, Feb 28, 2013 at 3:08 PM, sanjay Radia san...@hortonworks.com wrote:

+1 Java has done the bulk of the work in making Hadoop multi-platform. Windows-specific code is a tiny percentage of the code. Jenkins support for Windows is going to help us keep the platform portable going forward. I expect that the vast majority of new commits will have no problems. I propose that we start by fixing problems that Jenkins raises but not block new commits for too long if the author does not have a Windows box or if a volunteer does not step up.

sanjay
Re: [Vote] Merge branch-trunk-win to trunk
+1 Java has done the bulk of the work in making Hadoop multi-platform. Windows-specific code is a tiny percentage of the code. Jenkins support for Windows is going to help us keep the platform portable going forward. I expect that the vast majority of new commits will have no problems. I propose that we start by fixing problems that Jenkins raises but not block new commits for too long if the author does not have a Windows box or if a volunteer does not step up.

sanjay
Re: [Vote] Merge branch-trunk-win to trunk
After this is merged in is Windows still going to be a second class citizen but happens to work for more than just development or is it a fully supported platform where if something breaks it can block a release? How do we as a community intend to keep Windows support from breaking? We don't have any Jenkins slaves to be able to run nightly tests to validate everything still compiles/runs. This is not a blocker for me because we often rely on individuals and groups to test Hadoop, but I do think we need to have this discussion before we put it in. --Bobby On 2/26/13 4:55 PM, Suresh Srinivas sur...@hortonworks.com wrote: I had posted heads up about merging branch-trunk-win to trunk on Feb 8th. I am happy to announce that we are ready for the merge. Here is a brief recap on the highlights of the work done: - Command-line scripts for the Hadoop surface area - Mapping the HDFS permissions model to Windows - Abstracted and reconciled mismatches around differences in Path semantics in Java and Windows - Native Task Controller for Windows - Implementation of a Block Placement Policy to support cloud environments, more specifically Azure. - Implementation of Hadoop native libraries for Windows (compression codecs, native I/O) - Several reliability issues, including race-conditions, intermittent test failures, resource leaks. - Several new unit test cases written for the above changes Please find the details of the work in CHANGES.branch-trunk-win.txt - Common changeshttp://bit.ly/Xe7Ynv, HDFS changeshttp://bit.ly/13QOSo9, and YARN and MapReduce changes http://bit.ly/128zzMt. This is the work ported from branch-1-win to a branch based on trunk. For details of the testing done, please see the thread - http://bit.ly/WpavJ4. Merge patch for this is available on HADOOP-8562 https://issues.apache.org/jira/browse/HADOOP-8562. This was a large undertaking that involved developing code, testing the entire Hadoop stack, including scale tests. 
This is made possible only with the contribution from many many folks in the community. Following people contributed to this work: Ivan Mitic, Chuan Liu, Ramya Sunil, Bikas Saha, Kanna Karanam, John Gordon, Brandon Li, Chris Nauroth, David Lao, Sumadhur Reddy Bolli, Arpit Agarwal, Ahmed El Baz, Mike Liddell, Jing Zhao, Thejas Nair, Steve Maine, Ganeshan Iyer, Raja Aluri, Giridharan Kesavan, Ramya Bharathi Nimmagadda, Daryn Sharp, Arun Murthy, Tsz-Wo Nicholas Sze, Suresh Srinivas and Sanjay Radia. There are many others who contributed as well providing feedback and comments on numerous jiras. The vote will run for seven days and will end on March 5, 6:00PM PST. Regards, Suresh On Thu, Feb 7, 2013 at 6:41 PM, Mahadevan Venkatraman mah...@microsoft.comwrote: It is super exciting to look at the prospect of these changes being merged to trunk. Having Windows as one of the supported Hadoop platforms is a fantastic opportunity both for the Hadoop project and Microsoft customers. This work began around a year back when a few of us started with a basic port of Hadoop on Windows. Ever since, the Hadoop team in Microsoft have made significant progress in the following areas: (PS: Some of these items are already included in Suresh's email, but including again for completeness) - Command-line scripts for the Hadoop surface area - Mapping the HDFS permissions model to Windows - Abstracted and reconciled mismatches around differences in Path semantics in Java and Windows - Native Task Controller for Windows - Implementation of a Block Placement Policy to support cloud environments, more specifically Azure. - Implementation of Hadoop native libraries for Windows (compression codecs, native I/O) - Several reliability issues, including race-conditions, intermittent test failures, resource leaks. 
- Several new unit test cases written for the above changes In the process, we have closely engaged with the Apache open source community and have got great support and assistance from the community in terms of contributing fixes, code review comments and commits. In addition, the Hadoop team at Microsoft has also made good progress in other projects including Hive, Pig, Sqoop, Oozie, HCat and HBase. Many of these changes have already been committed to the respective trunks with help from various committers and contributors. It is great to see the commitment of the community to support multiple platforms, and we look forward to the day when a developer/customer is able to successfully deploy a complete solution stack based on Apache Hadoop releases. Next Steps: All of the above changes are part of the Windows Azure HDInsight and HDInsight Server products from Microsoft. We have successfully on-boarded several internal customers and have been running production workloads on Windows Azure HDInsight. Our vision is to create a big data platform based on Hadoop, and we are committed to helping make Hadoop a world-class solution that anyone can use to solve
Re: [Vote] Merge branch-trunk-win to trunk
Similar personal concern as Robert: Does this bring about a development process change? Do new features all need to work on Windows as well to go into trunk (i.e. immediately or eventually, either way requires a new policy for all of us devs)? Not that anyone would be avoiding doing that, I just ask because it could impact the time and effort required for any major undertaking.

While most of the project today is cross-platform (via Java, etc., which thankfully removes path handling problems and the like), performance improvements at least are going to the native side these days, which is where I see this having some impact. We've not been perfectly successful in having the natives continuously work on Solaris/etc. in the past, mainly due to the platform focus of the majority (not all) of devs working on the project(s). Some form of development policy here would ensure that Windows support for the features we intend to ship does not diverge so much that we end up having to maintain docs as well, detailing each task yet to be done (these tend to grow if allowed).

It is also useful to note that in another OSS project, KDE, the working Windows builds are usually 1-2 releases behind (for example, the KDE release is currently at 4.10, but the last released Windows port is still 4.8 today). KDE uses Qt, which is cross-platform by itself, but there's still a port team and a ported release maintained separately (but under the same org.) due to the major development happening on Linux. Same case for *BSD as well. My own patches there have at some point caused trouble because I did something that I only tested on one platform, and a bit later things got revised to support the other ones where it was unnecessarily breaking.

Or if I am being too concerned about feature/performance divergence, let me know.
On Wed, Feb 27, 2013 at 9:47 PM, Robert Evans ev...@yahoo-inc.com wrote: After this is merged in is Windows still going to be a second class citizen but happens to work for more than just development or is it a fully supported platform where if something breaks it can block a release? How do we as a community intend to keep Windows support from breaking? We don't have any Jenkins slaves to be able to run nightly tests to validate everything still compiles/runs. This is not a blocker for me because we often rely on individuals and groups to test Hadoop, but I do think we need to have this discussion before we put it in. --Bobby On 2/26/13 4:55 PM, Suresh Srinivas sur...@hortonworks.com wrote: I had posted heads up about merging branch-trunk-win to trunk on Feb 8th. I am happy to announce that we are ready for the merge. Here is a brief recap on the highlights of the work done: - Command-line scripts for the Hadoop surface area - Mapping the HDFS permissions model to Windows - Abstracted and reconciled mismatches around differences in Path semantics in Java and Windows - Native Task Controller for Windows - Implementation of a Block Placement Policy to support cloud environments, more specifically Azure. - Implementation of Hadoop native libraries for Windows (compression codecs, native I/O) - Several reliability issues, including race-conditions, intermittent test failures, resource leaks. - Several new unit test cases written for the above changes Please find the details of the work in CHANGES.branch-trunk-win.txt - Common changeshttp://bit.ly/Xe7Ynv, HDFS changeshttp://bit.ly/13QOSo9, and YARN and MapReduce changes http://bit.ly/128zzMt. This is the work ported from branch-1-win to a branch based on trunk. For details of the testing done, please see the thread - http://bit.ly/WpavJ4. Merge patch for this is available on HADOOP-8562 https://issues.apache.org/jira/browse/HADOOP-8562. 
This was a large undertaking that involved developing code, testing the entire Hadoop stack, including scale tests. This is made possible only with the contribution from many many folks in the community. Following people contributed to this work: Ivan Mitic, Chuan Liu, Ramya Sunil, Bikas Saha, Kanna Karanam, John Gordon, Brandon Li, Chris Nauroth, David Lao, Sumadhur Reddy Bolli, Arpit Agarwal, Ahmed El Baz, Mike Liddell, Jing Zhao, Thejas Nair, Steve Maine, Ganeshan Iyer, Raja Aluri, Giridharan Kesavan, Ramya Bharathi Nimmagadda, Daryn Sharp, Arun Murthy, Tsz-Wo Nicholas Sze, Suresh Srinivas and Sanjay Radia. There are many others who contributed as well providing feedback and comments on numerous jiras. The vote will run for seven days and will end on March 5, 6:00PM PST. Regards, Suresh On Thu, Feb 7, 2013 at 6:41 PM, Mahadevan Venkatraman mah...@microsoft.comwrote: It is super exciting to look at the prospect of these changes being merged to trunk. Having Windows as one of the supported Hadoop platforms is a fantastic opportunity both for the Hadoop project and Microsoft customers. This work began around a year back when a few of us started with a basic port of Hadoop on Windows. Ever
Re: [Vote] Merge branch-trunk-win to trunk
Bobby raises some good questions. A related one, since most current developers won't add Windows support for new features that are platform specific is it assumed that Windows development will either lag or will people actively work on keeping Windows up with the latest? And vice versa in case Windows support is implemented first. Is there a jira for resolving the outstanding TODOs in the code base (similar to HDFS-2148)? Looks like this merge doesn't introduce many which is great (just did a quick diff and grep). Thanks, Eli On Wed, Feb 27, 2013 at 8:17 AM, Robert Evans ev...@yahoo-inc.com wrote: After this is merged in is Windows still going to be a second class citizen but happens to work for more than just development or is it a fully supported platform where if something breaks it can block a release? How do we as a community intend to keep Windows support from breaking? We don't have any Jenkins slaves to be able to run nightly tests to validate everything still compiles/runs. This is not a blocker for me because we often rely on individuals and groups to test Hadoop, but I do think we need to have this discussion before we put it in. --Bobby On 2/26/13 4:55 PM, Suresh Srinivas sur...@hortonworks.com wrote: I had posted heads up about merging branch-trunk-win to trunk on Feb 8th. I am happy to announce that we are ready for the merge. Here is a brief recap on the highlights of the work done: - Command-line scripts for the Hadoop surface area - Mapping the HDFS permissions model to Windows - Abstracted and reconciled mismatches around differences in Path semantics in Java and Windows - Native Task Controller for Windows - Implementation of a Block Placement Policy to support cloud environments, more specifically Azure. - Implementation of Hadoop native libraries for Windows (compression codecs, native I/O) - Several reliability issues, including race-conditions, intermittent test failures, resource leaks. 
- Several new unit test cases written for the above changes Please find the details of the work in CHANGES.branch-trunk-win.txt - Common changeshttp://bit.ly/Xe7Ynv, HDFS changeshttp://bit.ly/13QOSo9, and YARN and MapReduce changes http://bit.ly/128zzMt. This is the work ported from branch-1-win to a branch based on trunk. For details of the testing done, please see the thread - http://bit.ly/WpavJ4. Merge patch for this is available on HADOOP-8562 https://issues.apache.org/jira/browse/HADOOP-8562. This was a large undertaking that involved developing code, testing the entire Hadoop stack, including scale tests. This is made possible only with the contribution from many many folks in the community. Following people contributed to this work: Ivan Mitic, Chuan Liu, Ramya Sunil, Bikas Saha, Kanna Karanam, John Gordon, Brandon Li, Chris Nauroth, David Lao, Sumadhur Reddy Bolli, Arpit Agarwal, Ahmed El Baz, Mike Liddell, Jing Zhao, Thejas Nair, Steve Maine, Ganeshan Iyer, Raja Aluri, Giridharan Kesavan, Ramya Bharathi Nimmagadda, Daryn Sharp, Arun Murthy, Tsz-Wo Nicholas Sze, Suresh Srinivas and Sanjay Radia. There are many others who contributed as well providing feedback and comments on numerous jiras. The vote will run for seven days and will end on March 5, 6:00PM PST. Regards, Suresh On Thu, Feb 7, 2013 at 6:41 PM, Mahadevan Venkatraman mah...@microsoft.comwrote: It is super exciting to look at the prospect of these changes being merged to trunk. Having Windows as one of the supported Hadoop platforms is a fantastic opportunity both for the Hadoop project and Microsoft customers. This work began around a year back when a few of us started with a basic port of Hadoop on Windows. 
Ever since, the Hadoop team in Microsoft have made significant progress in the following areas: (PS: Some of these items are already included in Suresh's email, but including again for completeness) - Command-line scripts for the Hadoop surface area - Mapping the HDFS permissions model to Windows - Abstracted and reconciled mismatches around differences in Path semantics in Java and Windows - Native Task Controller for Windows - Implementation of a Block Placement Policy to support cloud environments, more specifically Azure. - Implementation of Hadoop native libraries for Windows (compression codecs, native I/O) - Several reliability issues, including race-conditions, intermittent test failures, resource leaks. - Several new unit test cases written for the above changes In the process, we have closely engaged with the Apache open source community and have got great support and assistance from the community in terms of contributing fixes, code review comments and commits. In addition, the Hadoop team at Microsoft has also made good progress in other projects including Hive, Pig, Sqoop, Oozie, HCat and HBase. Many of these changes have already been committed to the respective trunks with help from various committers and contributors. It is great to see the
Re: [Vote] Merge branch-trunk-win to trunk
Thanks for raising good questions. Currently the merge patch passes all the tests on Linux, hence the proposal for merging the patch to trunk. But as Bobby, Harsh and Eli pointed out, before declaring support for Windows, we need the discussion on the following:

1. Precommit and development process

Jenkins infrastructure for Windows build will be made available. Giri and Microsoft contributors have volunteered to help make this happen. With that we need to decide how our precommit process looks. My inclination is to wait for +1 from precommit builds on both the platforms to ensure no issues are introduced. Thoughts?

2. Feature development impact

Some questions have been raised about whether new features would need to be supported on both the platforms. Yes. I do not see a reason why features cannot work on both the platforms, with the exception of platform-specific optimizations. This is what Java gives us.

3. Platform specific features/optimizations

As regards platform-specific optimization, each platform can evolve at its own pace and should not block progress of a specific platform.

As indicated in my earlier email, there is a sizable number of contributors to work on issues and support of Hadoop on the Windows platform. I am excited to see Hadoop reach the other large part of the server market.

Eli, as pointed out by you, the TODO items need to be addressed. Also we realized we still need to add information on how to build on Windows in BUILDING.txt. We will address this ASAP. Giri and Matt have some experience with this and should be able to provide more information.

On Wed, Feb 27, 2013 at 11:56 AM, Eli Collins e...@cloudera.com wrote:

Bobby raises some good questions. A related one, since most current developers won't add Windows support for new features that are platform specific is it assumed that Windows development will either lag or will people actively work on keeping Windows up with the latest? And vice versa in case Windows support is implemented first.
Is there a jira for resolving the outstanding TODOs in the code base (similar to HDFS-2148)? Looks like this merge doesn't introduce many which is great (just did a quick diff and grep). Thanks, Eli On Wed, Feb 27, 2013 at 8:17 AM, Robert Evans ev...@yahoo-inc.com wrote: After this is merged in is Windows still going to be a second class citizen but happens to work for more than just development or is it a fully supported platform where if something breaks it can block a release? How do we as a community intend to keep Windows support from breaking? We don't have any Jenkins slaves to be able to run nightly tests to validate everything still compiles/runs. This is not a blocker for me because we often rely on individuals and groups to test Hadoop, but I do think we need to have this discussion before we put it in. --Bobby On 2/26/13 4:55 PM, Suresh Srinivas sur...@hortonworks.com wrote: I had posted heads up about merging branch-trunk-win to trunk on Feb 8th. I am happy to announce that we are ready for the merge. Here is a brief recap on the highlights of the work done: - Command-line scripts for the Hadoop surface area - Mapping the HDFS permissions model to Windows - Abstracted and reconciled mismatches around differences in Path semantics in Java and Windows - Native Task Controller for Windows - Implementation of a Block Placement Policy to support cloud environments, more specifically Azure. - Implementation of Hadoop native libraries for Windows (compression codecs, native I/O) - Several reliability issues, including race-conditions, intermittent test failures, resource leaks. - Several new unit test cases written for the above changes Please find the details of the work in CHANGES.branch-trunk-win.txt - Common changeshttp://bit.ly/Xe7Ynv, HDFS changeshttp://bit.ly/13QOSo9 , and YARN and MapReduce changes http://bit.ly/128zzMt. This is the work ported from branch-1-win to a branch based on trunk. 
For details of the testing done, please see the thread - http://bit.ly/WpavJ4. Merge patch for this is available on HADOOP-8562 https://issues.apache.org/jira/browse/HADOOP-8562. This was a large undertaking that involved developing code, testing the entire Hadoop stack, including scale tests. This is made possible only with the contribution from many many folks in the community. Following people contributed to this work: Ivan Mitic, Chuan Liu, Ramya Sunil, Bikas Saha, Kanna Karanam, John Gordon, Brandon Li, Chris Nauroth, David Lao, Sumadhur Reddy Bolli, Arpit Agarwal, Ahmed El Baz, Mike Liddell, Jing Zhao, Thejas Nair, Steve Maine, Ganeshan Iyer, Raja Aluri, Giridharan Kesavan, Ramya Bharathi Nimmagadda, Daryn Sharp, Arun Murthy, Tsz-Wo Nicholas Sze, Suresh Srinivas and Sanjay Radia. There are many others who contributed as well providing feedback and comments on numerous jiras. The vote will run for seven days and will end on March 5, 6:00PM PST.
Re: [Vote] Merge branch-trunk-win to trunk
+1 non-binding. I have extensively tested this on both Windows and Linux over the last few months. Thanks, -Arpit On Wed, Feb 27, 2013 at 11:56 AM, Eli Collins e...@cloudera.com wrote: Bobby raises some good questions. A related one, since most current developers won't add Windows support for new features that are platform specific is it assumed that Windows development will either lag or will people actively work on keeping Windows up with the latest? And vice versa in case Windows support is implemented first. Is there a jira for resolving the outstanding TODOs in the code base (similar to HDFS-2148)? Looks like this merge doesn't introduce many which is great (just did a quick diff and grep). Thanks, Eli On Wed, Feb 27, 2013 at 8:17 AM, Robert Evans ev...@yahoo-inc.com wrote: After this is merged in is Windows still going to be a second class citizen but happens to work for more than just development or is it a fully supported platform where if something breaks it can block a release? How do we as a community intend to keep Windows support from breaking? We don't have any Jenkins slaves to be able to run nightly tests to validate everything still compiles/runs. This is not a blocker for me because we often rely on individuals and groups to test Hadoop, but I do think we need to have this discussion before we put it in. --Bobby On 2/26/13 4:55 PM, Suresh Srinivas sur...@hortonworks.com wrote: I had posted heads up about merging branch-trunk-win to trunk on Feb 8th. I am happy to announce that we are ready for the merge. Here is a brief recap on the highlights of the work done: - Command-line scripts for the Hadoop surface area - Mapping the HDFS permissions model to Windows - Abstracted and reconciled mismatches around differences in Path semantics in Java and Windows - Native Task Controller for Windows - Implementation of a Block Placement Policy to support cloud environments, more specifically Azure. 
- Implementation of Hadoop native libraries for Windows (compression codecs, native I/O) - Several reliability issues, including race-conditions, intermittent test failures, resource leaks. - Several new unit test cases written for the above changes Please find the details of the work in CHANGES.branch-trunk-win.txt - Common changes http://bit.ly/Xe7Ynv, HDFS changes http://bit.ly/13QOSo9, and YARN and MapReduce changes http://bit.ly/128zzMt. This is the work ported from branch-1-win to a branch based on trunk. For details of the testing done, please see the thread - http://bit.ly/WpavJ4. Merge patch for this is available on HADOOP-8562 https://issues.apache.org/jira/browse/HADOOP-8562. This was a large undertaking that involved developing code, testing the entire Hadoop stack, including scale tests. This is made possible only with the contribution from many many folks in the community. Following people contributed to this work: Ivan Mitic, Chuan Liu, Ramya Sunil, Bikas Saha, Kanna Karanam, John Gordon, Brandon Li, Chris Nauroth, David Lao, Sumadhur Reddy Bolli, Arpit Agarwal, Ahmed El Baz, Mike Liddell, Jing Zhao, Thejas Nair, Steve Maine, Ganeshan Iyer, Raja Aluri, Giridharan Kesavan, Ramya Bharathi Nimmagadda, Daryn Sharp, Arun Murthy, Tsz-Wo Nicholas Sze, Suresh Srinivas and Sanjay Radia. There are many others who contributed as well providing feedback and comments on numerous jiras. The vote will run for seven days and will end on March 5, 6:00PM PST. Regards, Suresh On Thu, Feb 7, 2013 at 6:41 PM, Mahadevan Venkatraman mah...@microsoft.com wrote: It is super exciting to look at the prospect of these changes being merged to trunk. Having Windows as one of the supported Hadoop platforms is a fantastic opportunity both for the Hadoop project and Microsoft customers. This work began around a year back when a few of us started with a basic port of Hadoop on Windows. 
Ever since, the Hadoop team in Microsoft have made significant progress in the following areas: (PS: Some of these items are already included in Suresh's email, but including again for completeness) - Command-line scripts for the Hadoop surface area - Mapping the HDFS permissions model to Windows - Abstracted and reconciled mismatches around differences in Path semantics in Java and Windows - Native Task Controller for Windows - Implementation of a Block Placement Policy to support cloud environments, more specifically Azure. - Implementation of Hadoop native libraries for Windows (compression codecs, native I/O) - Several reliability issues, including race-conditions, intermittent test failures, resource leaks. - Several new unit test cases written for the above changes In the process, we have closely engaged with the Apache open source community and have got great support and assistance from the community in terms of contributing fixes, code review comments and commits.
Re: [Vote] Merge branch-trunk-win to trunk
On Wed, Feb 27, 2013 at 2:54 PM, Suresh Srinivas sur...@hortonworks.com wrote: With that we need to decide how our precommit process looks. My inclination is to wait for +1 from precommit builds on both the platforms to ensure no issues are introduced. Thoughts? 2. Feature development impact: Some questions have been raised about whether new features would need to be supported on both platforms. Yes. I do not see a reason why features cannot work on both the platforms, with the exception of platform specific optimizations. This is what Java gives us. I'm concerned about the above. Personally, I don't have access to any Windows boxes with development tools, and I know nothing about developing on Windows. The only Windows I run is an 8GB VM with 1 GB RAM allocated, for powerpoint :) If I submit a patch and it gets -1 tests failed on the Windows slave, how am I supposed to proceed? I think a reasonable compromise would be that the tests should always *build* on Windows before commit, and contributors should do their best to look at the test logs for any Windows-specific failures. But, beyond looking at the logs, a -1 Tests failed on windows should not block a commit. Those contributors who are interested in Windows being a first-class platform should be responsible for watching the Windows builds and debugging/fixing any regressions that might be Windows-specific. I also think the KDE model that Harsh pointed out is an interesting one -- i.e., the idea that we would not merge windows support to trunk, but rather treat it as a parallel code line which lives in the ASF and has its own builds and releases. The windows team would periodically merge trunk-win to pick up any new changes, and do a separate test/release process. I'm not convinced this is the best idea, but worth discussion of pros and cons. -Todd On Wed, Feb 27, 2013 at 11:56 AM, Eli Collins e...@cloudera.com wrote: Bobby raises some good questions. 
A related one, since most current developers won't add Windows support for new features that are platform specific is it assumed that Windows development will either lag or will people actively work on keeping Windows up with the latest? And vice versa in case Windows support is implemented first. Is there a jira for resolving the outstanding TODOs in the code base (similar to HDFS-2148)? Looks like this merge doesn't introduce many which is great (just did a quick diff and grep). Thanks, Eli On Wed, Feb 27, 2013 at 8:17 AM, Robert Evans ev...@yahoo-inc.com wrote: After this is merged in is Windows still going to be a second class citizen but happens to work for more than just development or is it a fully supported platform where if something breaks it can block a release? How do we as a community intend to keep Windows support from breaking? We don't have any Jenkins slaves to be able to run nightly tests to validate everything still compiles/runs. This is not a blocker for me because we often rely on individuals and groups to test Hadoop, but I do think we need to have this discussion before we put it in. --Bobby On 2/26/13 4:55 PM, Suresh Srinivas sur...@hortonworks.com wrote: I had posted heads up about merging branch-trunk-win to trunk on Feb 8th. I am happy to announce that we are ready for the merge. Here is a brief recap on the highlights of the work done: - Command-line scripts for the Hadoop surface area - Mapping the HDFS permissions model to Windows - Abstracted and reconciled mismatches around differences in Path semantics in Java and Windows - Native Task Controller for Windows - Implementation of a Block Placement Policy to support cloud environments, more specifically Azure. - Implementation of Hadoop native libraries for Windows (compression codecs, native I/O) - Several reliability issues, including race-conditions, intermittent test failures, resource leaks. 
- Several new unit test cases written for the above changes Please find the details of the work in CHANGES.branch-trunk-win.txt - Common changes http://bit.ly/Xe7Ynv, HDFS changes http://bit.ly/13QOSo9, and YARN and MapReduce changes http://bit.ly/128zzMt. This is the work ported from branch-1-win to a branch based on trunk. For details of the testing done, please see the thread - http://bit.ly/WpavJ4. Merge patch for this is available on HADOOP-8562 https://issues.apache.org/jira/browse/HADOOP-8562. This was a large undertaking that involved developing code, testing the entire Hadoop stack, including scale tests. This is made possible only with the contribution from many many folks in the community. Following people contributed to this work: Ivan Mitic, Chuan Liu, Ramya Sunil, Bikas Saha, Kanna Karanam, John Gordon, Brandon Li, Chris Nauroth, David Lao, Sumadhur Reddy Bolli, Arpit Agarwal, Ahmed El Baz, Mike Liddell, Jing Zhao, Thejas Nair, Steve Maine,
RE: [Vote] Merge branch-trunk-win to trunk
+1 (non-binding) I am really glad to see this happening! As people already mentioned, this has been a great engineering effort involving many people! Folks raised some valid concerns below and I thought it would be good to share my 2 cents. In my opinion, we don't have to solve all these problems right now. As we move forward with two platforms, we can start addressing one problem at a time and incrementally improve. In the first iteration, maintaining Hadoop on Windows could be just everyone making a best effort (making sure the Jenkins build succeeds at least). We already have people who are building/running trunk on Windows daily, so they would jump in and fix problems as needed (we've been doing this in branch-trunk-win for a while now). Although I see that problems could arise with platform specific features/optimizations, I don't think these are frequent, so in most cases everything will just work. Merging the two branches sooner rather than later does seem like the right thing to do if the ultimate goal is to have Hadoop on both platforms. Now that the port has been completed, we will have people in Microsoft (and elsewhere) wanting to contribute features/improvements to the trunk branch. A separate branch would just make things more difficult and confusing for everyone :) Hope this makes sense. -----Original Message----- From: Todd Lipcon [mailto:t...@cloudera.com] Sent: Wednesday, February 27, 2013 3:43 PM To: common-...@hadoop.apache.org Cc: yarn-dev@hadoop.apache.org; hdfs-...@hadoop.apache.org; mapreduce-...@hadoop.apache.org Subject: Re: [Vote] Merge branch-trunk-win to trunk On Wed, Feb 27, 2013 at 2:54 PM, Suresh Srinivas sur...@hortonworks.com wrote: With that we need to decide how our precommit process looks. My inclination is to wait for +1 from precommit builds on both the platforms to ensure no issues are introduced. Thoughts? 2. 
Feature development impact: Some questions have been raised about whether new features would need to be supported on both platforms. Yes. I do not see a reason why features cannot work on both the platforms, with the exception of platform specific optimizations. This is what Java gives us. I'm concerned about the above. Personally, I don't have access to any Windows boxes with development tools, and I know nothing about developing on Windows. The only Windows I run is an 8GB VM with 1 GB RAM allocated, for powerpoint :) If I submit a patch and it gets -1 tests failed on the Windows slave, how am I supposed to proceed? I think a reasonable compromise would be that the tests should always *build* on Windows before commit, and contributors should do their best to look at the test logs for any Windows-specific failures. But, beyond looking at the logs, a -1 Tests failed on windows should not block a commit. Those contributors who are interested in Windows being a first-class platform should be responsible for watching the Windows builds and debugging/fixing any regressions that might be Windows-specific. I also think the KDE model that Harsh pointed out is an interesting one -- i.e., the idea that we would not merge windows support to trunk, but rather treat it as a parallel code line which lives in the ASF and has its own builds and releases. The windows team would periodically merge trunk-win to pick up any new changes, and do a separate test/release process. I'm not convinced this is the best idea, but worth discussion of pros and cons. -Todd On Wed, Feb 27, 2013 at 11:56 AM, Eli Collins e...@cloudera.com wrote: Bobby raises some good questions. A related one, since most current developers won't add Windows support for new features that are platform specific is it assumed that Windows development will either lag or will people actively work on keeping Windows up with the latest? And vice versa in case Windows support is implemented first. 
Is there a jira for resolving the outstanding TODOs in the code base (similar to HDFS-2148)? Looks like this merge doesn't introduce many which is great (just did a quick diff and grep). Thanks, Eli On Wed, Feb 27, 2013 at 8:17 AM, Robert Evans ev...@yahoo-inc.com wrote: After this is merged in is Windows still going to be a second class citizen but happens to work for more than just development or is it a fully supported platform where if something breaks it can block a release? How do we as a community intend to keep Windows support from breaking? We don't have any Jenkins slaves to be able to run nightly tests to validate everything still compiles/runs. This is not a blocker for me because we often rely on individuals and groups to test Hadoop, but I do think we need to have this discussion before we put it in. --Bobby On 2/26/13 4:55 PM, Suresh Srinivas sur...@hortonworks.com wrote: I had posted heads up about merging branch-trunk-win to trunk on Feb 8th. I am happy