Re: hadoop-integration-tests jenkins job

2016-03-20 Thread Colin P. McCabe
It seems pretty reasonable to make this part of the nightly build. cheers, Colin On Fri, Mar 18, 2016 at 1:17 PM, Allen Wittenauer wrote: > Given that we’re down to half our capacity in build-land, it’s > interesting to see what everyone’s running

Re: CHANGES.txt is gone from trunk, branch-2, branch-2.8

2016-03-08 Thread Colin P. McCabe
+1 Thanks, Andrew. This will avoid so many spurious conflicts when cherry-picking changes, and so much wasted time on commit. best, Colin On Thu, Mar 3, 2016 at 9:11 PM, Andrew Wang wrote: > Hi all, > > With the inclusion of HADOOP-12651 going back to branch-2.8,

Re: node.js and more as dependencies

2016-03-02 Thread Colin P. McCabe
Just like Java code of >> Hadoop, no user will try to get source code from a running cluster :). >> >> I will make sure integration to Maven is as minimal as possible, we should >> only need one single sub module, and limit all changes in that module only. >>

Re: node.js and more as dependencies

2016-02-29 Thread Colin P. McCabe
Hmm. Devil's advocate here: Do we really need to have a "JS build"? The main use-cases for "JS builds" seem to be if you want to minify or obfuscate your JS. Given that this is open source code, obfuscation seems unnecessary. Given that it's a low-traffic management interface, minifying the

Re: Looking to a Hadoop 3 release

2016-02-22 Thread Colin P. McCabe
ther / when it can be released in 2.9. I think we > should rather concentrate our EC dev efforts to harden key features under > the follow-on umbrella HDFS-8031 and make it solid for a 3.0 release. > > Sincerely, > Zhe > > On Mon, Feb 22, 2016 at 9:25 AM Colin P. McCabe <c

Re: [crypto][chimera] Next steps

2016-02-22 Thread Colin P. McCabe
On Mon, Feb 22, 2016 at 2:20 PM, Steve Loughran <ste...@hortonworks.com> wrote: > >> On 22 Feb 2016, at 17:28, Colin P. McCabe <cmcc...@apache.org> wrote: >> >> I would highly recommend shading this library when it is used in >> Hadoop and/or Spark, to preve

Re: [crypto][chimera] Next steps

2016-02-22 Thread Colin P. McCabe
performance on modern hardware. best, Colin On Mon, Feb 22, 2016 at 10:02 AM, Jochen Wiedmann <jochen.wiedm...@gmail.com> wrote: > On Mon, Feb 22, 2016 at 6:28 PM, Colin P. McCabe <cmcc...@apache.org> wrote: > >> What is the strategy for handling JNI components? > > Wron

Re: [crypto][chimera] Next steps

2016-02-22 Thread Colin P. McCabe
I would highly recommend shading this library when it is used in Hadoop and/or Spark, to prevent version skew problems between Hadoop and Spark like we have had in the past. What is the strategy for handling JNI components? I think at a minimum, we should include the version number in the native

Re: Looking to a Hadoop 3 release

2016-02-22 Thread Colin P. McCabe
+1 for a release of 3.0. There are a lot of significant, compatibility-breaking, but necessary changes in this release... we've touched on some of them in this thread. +1 for a parallel release of 2.8 as well. I think we are pretty close to this, barring a dozen or so blockers. best, Colin On

Re: Replacing Commons-httpclient and bumping httpclient version

2016-02-16 Thread Colin P. McCabe
+1 for updating the dependencies in trunk. best, Colin On Tue, Feb 16, 2016 at 9:20 AM, Wei-Chiu Chuang wrote: > Fellow Hadoop developers, > > Hadoop codebase depends on commons-httpclient, and its latest version, 3.1.2, > is EOL nearly 5 years ago. But because its API is

Re: [DISCUSS} why is checkstyle harassing me on indentation?

2016-01-18 Thread Colin P. McCabe
On Mon, Jan 18, 2016 at 11:34 AM, Steve Loughran wrote: > > Yetus checkstyle is going a bit overboard on indentation policy > > https://builds.apache.org/job/PreCommit-HADOOP-Build/8434/artifact/patchprocess/diff-checkstyle-hadoop-common-project_hadoop-common.txt > > I am

Re: Disable some of the Hudson integration comments on JIRA

2015-11-30 Thread Colin P. McCabe
+1. Colin On Mon, Nov 30, 2015 at 11:33 AM, Andrew Wang wrote: > Good point Allen. So I guess the broader question is, do we find the > per-commit tracking build and test useful? With our current flakiness > levels, there isn't much signal from a FAILED on one of these

Re: Jenkins stability and patching

2015-11-23 Thread Colin P. McCabe
I agree that our tests are in a bad state. It would help if we could maintain a list of "flaky tests" somewhere in git and have Yetus consider the flakiness of a test before -1ing a patch. Right now, we pretty much all have that list in our heads, and we're not applying it very consistently.
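The flaky-test list idea above could be sketched roughly as follows. This is only an illustration under stated assumptions: the class name FlakyTestFilter, the list-file format (one test class per line, '#' for comments), the vote values, and the test names are all hypothetical and are not taken from Yetus or Hadoop.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; not Yetus code. Class, method, and test names are made up.
final class FlakyTestFilter {
    private final Set<String> flaky = new HashSet<>();

    // Each line of the list file names one test class; blank lines and
    // lines starting with '#' are ignored.
    FlakyTestFilter(List<String> listFileLines) {
        for (String line : listFileLines) {
            String name = line.trim();
            if (!name.isEmpty() && !name.startsWith("#")) {
                flaky.add(name);
            }
        }
    }

    // +1: nothing failed. 0: only known-flaky tests failed. -1: otherwise.
    int vote(List<String> failedTests) {
        if (failedTests.isEmpty()) {
            return 1;
        }
        for (String test : failedTests) {
            if (!flaky.contains(test)) {
                return -1;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        FlakyTestFilter filter = new FlakyTestFilter(Arrays.asList(
            "# known flaky tests (illustrative names)",
            "org.example.TestSometimesTimesOut"));
        // Only a listed flaky test failed: neutral vote.
        System.out.println(filter.vote(
            Arrays.asList("org.example.TestSometimesTimesOut")));
        // An unlisted test failed: the patch still gets a -1.
        System.out.println(filter.vote(
            Arrays.asList("org.example.TestRealRegression")));
    }
}
```

Keeping the list as a plain file in git would let it be reviewed and updated like any other change, instead of living in each committer's head.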

Re: Jenkins stability and patching

2015-11-23 Thread Colin P. McCabe
On Mon, Nov 23, 2015 at 1:53 PM, Colin P. McCabe <cmcc...@apache.org> wrote: > I agree that our tests are in a bad state. It would help if we could > maintain a list of "flaky tests" somewhere in git and have Yetus > consider the flakiness of a test before -1ing a patch

Re: Github integration for Hadoop

2015-11-23 Thread Colin P. McCabe
HowToContribute#Naming_your_patch >> > >> > Yetus will actually support a little bit more than that guide suggests. >> If >> > a project doesn't define a URL to point people at for help in naming >> > patches we default to this guide: >> > >&

Re: Github integration for Hadoop

2015-11-13 Thread Colin P. McCabe
Thanks, Allen, I wasn't aware that Yetus now supported testing for other branches. Is there documentation about how to name the branch so it gets tested? best, Colin On Fri, Nov 13, 2015 at 7:52 AM, Allen Wittenauer <a...@altiscale.com> wrote: > >> On Nov 12, 2015, at 10:55 AM,

Re: Github integration for Hadoop

2015-11-12 Thread Colin P. McCabe
gerrit has a button on the UI to cherry-pick to different branches. The button creates separate "gerrit changes" which you can then commit. Eventually we could hook those up to Jenkins-- something which we've never been able to do for different branches with the patch-file-based workflow. best,

Re: Github integration for Hadoop

2015-11-02 Thread Colin P. McCabe
+1 for setting up a gerrit instance to try it out. cheers, Colin On Mon, Nov 2, 2015 at 7:13 AM, Zhe Zhang wrote: >> >> So I think Gerrit/Crucible/whatever are on the table, with some work. >> Does anyone want to take the token for asking other projects and >> assembling

Re: Github integration for Hadoop

2015-10-31 Thread Colin P. McCabe
of each of those is enough :) Colin On Sat, Oct 31, 2015 at 4:40 AM, Steve Loughran <ste...@hortonworks.com> wrote: > >> On 30 Oct 2015, at 17:15, Colin P. McCabe <cmcc...@apache.org> wrote: >> >> I think the Spark guys eventually built some kind of UI on top of >

Re: Github integration for Hadoop

2015-10-30 Thread Colin P. McCabe
I am -1 on the current GH stuff, just because I feel like there wasn't enough discussion, testing, and documentation of it. The initial proposal had no details and was implemented before any of the people who had had misgivings on the previous email thread had a chance to even see it. It's a big

Re: Github integration for Hadoop

2015-10-30 Thread Colin P. McCabe
I think we should take more than 24 hours of discussion to make a big, fundamental change to our code review process. I used github while working on the Spark project, and frankly I didn't like it. I didn't like the way it split the discussion between JIRA and github. Also, often people would

Re: FYI: Major, long-standing Issue with trunk's test-patch

2015-10-30 Thread Colin P. McCabe
Thanks for finding this issue, Allen. This is somewhat of a tangent, but I'm curious who still uses hadoop-pipes. Certainly development has more or less stopped on it. I think the last 5 years of commits on it have been things like updating version numbers, fixing missing #includes, moving to

Re: Java 8 + Jersey updates

2015-10-26 Thread Colin P. McCabe
Looks like a good idea. I assume you are targeting this only at trunk / 3.0 based on the "target version" and the incompatibility discussion? best, Colin On Mon, Oct 26, 2015 at 7:07 AM, Tsuyoshi Ozawa wrote: > Hi Steve, > > Thanks for your help. > > > 2. it's "significant"

Re: hadoop-hdfs-client splitoff is going to break code

2015-10-19 Thread Colin P. McCabe
Thanks for being proactive here, Steve. I think this is a good example of why this change should have been done in a branch rather than having been done directly in trunk. regards, Colin On Wed, Oct 14, 2015 at 10:36 AM, Steve Loughran wrote: > just an FYI, the split

Re: DomainSocket issues on Solaris

2015-10-07 Thread Colin P. McCabe
On Wed, Oct 7, 2015 at 9:35 AM, Alan Burlison wrote: > On 06/10/2015 10:52, Steve Loughran wrote: > >> HADOOP-11127, "Improve versioning and compatibility support in native >> library for downstream hadoop-common users." says "we need to do >> better here", which is

Re: DomainSocket issues on Solaris

2015-10-05 Thread Colin P. McCabe
Hi Alan, As Chris commented earlier, the main use of DomainSocket is to transfer file descriptors from the DataNode to the DFSClient. As you know, this is something that can only be done through domain sockets, not through inet sockets. We do support passing data over domain sockets, but in

Re: [DISCUSS] Looking to a 2.8.0 release

2015-10-05 Thread Colin P. McCabe
I think it makes sense to have a 2.8 release since there are a tremendous number of JIRAs in 2.8 that are not in 2.7. Doing a 3.x release seems like something we should consider separately since it would not have the same compatibility guarantees as a 2.8 release. There's a pretty big delta

Re: CHANGES.TXT

2015-09-23 Thread Colin P. McCabe
icate with each other, regardless of the issue, is intentional or just > a special talent. > > On Sep 22, 2015, at 8:52 AM, Colin P. McCabe <cmcc...@apache.org> wrote: > >> I think it's extremely unrealistic to expect Hadoop to ever follow a >> branchless developme

Re: CHANGES.TXT

2015-09-22 Thread Colin P. McCabe
peter out like the rest of them. Please, let's fix this, finally. Autogenerate this file. best, Colin On Mon, Sep 14, 2015 at 7:10 PM, Allen Wittenauer <a...@altiscale.com> wrote: > > On Sep 14, 2015, at 5:15 PM, Colin P. McCabe <cmcc...@apache.org> wrote: > >> Let'

Re: CHANGES.TXT

2015-09-14 Thread Colin P. McCabe
Let's stay focused on the title of the thread-- CHANGES.txt-- and discuss issues surrounding releasing trunk in a separate thread. Colin On Mon, Sep 14, 2015 at 3:59 PM, Allen Wittenauer wrote: > > Given that we haven’t had a single minor release in the “stable” era

Re: CHANGES.TXT

2015-09-14 Thread Colin P. McCabe
On Sat, Sep 12, 2015 at 11:28 AM, Haohui Mai wrote: > CHANGES.txt is always a pain. *sigh* > > It seems that relying on human efforts to maintain the CHANGES.txt is > error-prone and not sustainable. It is always a pain to fix them. > > I think aw has some scripts for option

Re: [DISCUSS] fixing jenkins; policing the build

2015-09-14 Thread Colin P. McCabe
I think building on Jenkins with jdk8 (even with source version = 1.7) would help prevent the jdk8 javadoc failures from creeping in. With regard to the unit test failures, maybe it's time for another fix-it day? cheers, Colin On Sun, Sep 13, 2015 at 6:58 AM, Steve Loughran

Re: [VOTE] Using rebase and merge for feature branch development

2015-08-24 Thread Colin P. McCabe
+1 cheers, Colin On Mon, Aug 24, 2015 at 10:04 AM, Steve Loughran ste...@hortonworks.com wrote: +1 (binding) On 21 Aug 2015, at 13:44, Andrew Wang andrew.w...@cloudera.com wrote: Hi common-dev, As promised, here is an official vote thread. Let's run it for the standard 7 days, closing on

Re: Doubts about review and validation process

2015-08-19 Thread Colin P. McCabe
Hi Augusto, Sorry that we haven't gotten back to you in a while. This isn't my area of expertise, but hopefully someone will step forward to look at the MR stuff. It seems like maybe you need a design document for the preemption stuff. Each JIRA gives a small piece of the puzzle, but it's hard

Re: Does repository work correctly?

2015-07-06 Thread Colin P. McCabe
I am getting the same error now. Did we ever find the root cause of this problem? cmccabe@keter:~/hadoop2 git push Counting objects: 76, done. Delta compression using up to 4 threads. Compressing objects: 100% (12/12), done. Writing objects: 100% (15/15), 1.24 KiB | 0 bytes/s, done. Total 15

Re: Does repository work correctly?

2015-07-06 Thread Colin P. McCabe
Thanks. This seems to have been resolved, for me at least. best, Colin On Mon, Jul 6, 2015 at 1:16 PM, Sean Busbey bus...@cloudera.com wrote: there appears to be an outage currently. https://issues.apache.org/jira/browse/INFRA-9934 On Mon, Jul 6, 2015 at 3:02 PM, Colin P. McCabe cmcc

Re: [DISCUSS] project for pre-commit patch testing (was Re: upstream jenkins build broken?)

2015-06-22 Thread Colin P. McCabe
+1 for making this a separate project. We've always struggled with a lot of forks of the test-patch code and perhaps this project can help create something that works well for multiple projects. Bypassing the incubator seems kind of weird (I didn't know that was an option) but I will let other

Re: [DISCUSS] More Maintenance Releases

2015-06-22 Thread Colin P. McCabe
+1 for creating a maintenance release with a more rapid release cadence and more effort put into stability backports. I think this would really be great for the project. Colin On Mon, Jun 22, 2015 at 2:43 AM, Akira AJISAKA ajisa...@oss.nttdata.co.jp wrote: Hi everyone, In Hadoop Summit, I

Re: Protocol Buffers version

2015-06-15 Thread Colin P. McCabe
On Mon, Jun 15, 2015 at 7:24 AM, Allen Wittenauer a...@altiscale.com wrote: On Jun 12, 2015, at 1:03 PM, Alan Burlison alan.burli...@oracle.com wrote: On 14/05/2015 18:41, Chris Nauroth wrote: As a reminder though, the community probably would want to see a strong justification for the

Re: What is the limit to the number of properties set in the configuration object

2015-06-15 Thread Colin P. McCabe
Much like zombo.com, the only limit is yourself. But huge Configuration objects are going to be really inefficient, so I would look elsewhere for storing lots of data. best, Colin On Fri, Jun 12, 2015 at 7:30 PM, Sitaraman Vilayannur vrsitaramanietfli...@gmail.com wrote: Thanks Allen, what is

Re: CMake and CMAKE_LD_FLAGS

2015-06-15 Thread Colin P. McCabe
Hi Alan, I think you are right that CMAKE_LD_FLAGS has never done anything, and setting it was always a mistake. The uses of CMAKE_LD_FLAGS in JNIFlags.cmake are harmless, since the -m32 option is not needed for the linker. However, it's more concerning that hadoop-mapreduce-client-nativetask

Re: Maven always detects changes - Is this a Docker 'feature'?

2015-06-15 Thread Colin P. McCabe
Hi Darrell, Sorry, I'm not familiar with this feature of Maven. Perhaps try asking on the Apache Maven mailing list? best, Colin On Fri, May 22, 2015 at 8:34 AM, Darrell Taylor darrell.tay...@gmail.com wrote: Hi, Is it normal behaviour for maven to detect changes when I run tests with no

Re: Protocol Buffers version

2015-05-19 Thread Colin P. McCabe
I agree that the protobuf 2.4.1 -> 2.5.0 transition could have been handled a lot better by Google. Specifically, since it was an API-breaking upgrade, it should have been a major version bump for the Java library version. I also feel that removing the download links for the old versions of the

Re: JNIFlags.cmake versus FindJNI.cmake

2015-05-19 Thread Colin P. McCabe
Thanks for looking at this code, Alan. I appreciate your desire to clean things up. However, as I commented on HADOOP-11987, JNIFlags.cmake contains many fixes not in the standard FindJNI.cmake. I very much doubt that we will be able to remove it without causing significant regressions. I guess

Re: Jenkins precommit-*-build

2015-05-05 Thread Colin P. McCabe
Thanks, Allen. This has long been a thorn in our side, and it's really good to see someone work on it. cheers, Colin On Tue, May 5, 2015 at 2:59 PM, Allen Wittenauer a...@altiscale.com wrote: TL;DR: Heads up: I’m going to hack on these scripts to fix the race conditions.

Re: Checkstyle 80 char limit

2015-05-05 Thread Colin P. McCabe
Historical issues aside, the reasons are: * When doing vertical, side-by-side diffs, you need some reasonable line length standard or else it Just Doesn't Work. And vertical, side-by-side diffs are how most people prefer to do code reviews. * Really long lines are visually difficult to

Re: [VOTE] Release Apache Hadoop 2.7.0 RC0

2015-04-17 Thread Colin P. McCabe
, at 2:27 AM, Colin P. McCabe cmcc...@apache.org wrote: I would like to fix HDFS-8070, which just came to light. The impact is that if this isn't fixed, 2.6 clients will be unable to do short-circuit reads against 2.7 datanodes. best, Colin On Wed, Apr 15, 2015 at 8:19 PM, Brahma Reddy

Re: [VOTE] Release Apache Hadoop 2.7.0 RC0

2015-04-16 Thread Colin P. McCabe
I would like to fix HDFS-8070, which just came to light. The impact is that if this isn't fixed, 2.6 clients will be unable to do short-circuit reads against 2.7 datanodes. best, Colin On Wed, Apr 15, 2015 at 8:19 PM, Brahma Reddy Battula brahmareddy.batt...@huawei.com wrote: Need Jcardar

Re: Hadoop Common: Why not re-use the Security model offered by SELINUX?

2015-03-30 Thread Colin P. McCabe
As ATM and Steve have already commented, selinux isn't really comparable to the existing Hadoop security framework. These are just two things that have different functions. The Hadoop security framework needs to deal with authenticating users over the network, managing Kerberos and active

Re: about CHANGES.txt

2015-03-18 Thread Colin P. McCabe
of the commit message Either way I think would be an improvement. Thanks for your ideas folks On Monday, March 16, 2015 11:51 AM, Colin P. McCabe cmcc...@apache.org wrote: +1 for generating CHANGES.txt from JIRA and/or git as part of making a release. Or just dropping it altogether

Re: Hadoop - Major releases

2015-03-17 Thread Colin P. McCabe
Thanks, Andrew and Joep. +1 for maintaining wire and API compatibility, but moving to JDK8 in 3.0 best, Colin On Mon, Mar 16, 2015 at 3:22 PM, Andrew Wang andrew.w...@cloudera.com wrote: I took the liberty of adding line breaks to Joep's mail. Thanks for the great feedback Joep. The goal

Re: about CHANGES.txt

2015-03-16 Thread Colin P. McCabe
+1 for generating CHANGES.txt from JIRA and/or git as part of making a release. Or just dropping it altogether. Keeping it under version control creates a lot of false conflicts whenever submitting a patch and generally makes committing minor changes unpleasant. Colin On Sat, Mar 14, 2015 at

upstream jenkins build broken?

2015-03-10 Thread Colin P. McCabe
Hi all, A very quick (and not thorough) survey shows that I can't find any jenkins jobs that succeeded from the last 24 hours. Most of them seem to be failing with some variant of this message: [ERROR] Failed to execute goal org.apache.maven.plugins:maven-clean-plugin:2.5:clean (default-clean)

Re: Hadoop 3.x: what about shipping trunk as a 2.x release in 2015?

2015-03-10 Thread Colin P. McCabe
Er, that should read as Allen commented C. On Tue, Mar 10, 2015 at 11:55 AM, Colin P. McCabe cmcc...@apache.org wrote: Hi Arun, Not all changes which are incompatible can be fixed-- sometimes an incompatibility is a necessary part of a change. For example, taking a really old library

Re: Hadoop 3.x: what about shipping trunk as a 2.x release in 2015?

2015-03-10 Thread Colin P. McCabe
on hadoop-3.x right? So, I don't see the difference? Arun From: Colin P. McCabe cmcc...@apache.org Sent: Monday, March 09, 2015 3:05 PM To: hdfs-...@hadoop.apache.org Cc: mapreduce-...@hadoop.apache.org; common-dev@hadoop.apache.org; yarn

Re: Hadoop 3.x: what about shipping trunk as a 2.x release in 2015?

2015-03-09 Thread Colin P. McCabe
Java 7 will be end-of-lifed in April 2015. I think it would be unwise to plan a new Hadoop release against a version of Java that is almost obsolete and (soon) no longer receiving security updates. I think people will be willing to roll out a new version of Java for Hadoop 3.x. Similarly, the

Re: timsort bug in the JDK

2015-03-04 Thread Colin P. McCabe
Tsuyoshi Ozawa sent out an email to the common-dev list about this recently. It seems like the bug only bites when the number of elements is larger than 67108864, which may limit its impact (to state it mildly). Also, the flawed sorting algorithm is not used on arrays of primitives, just on

Re: DISCUSSION: Patch commit criteria.

2015-03-02 Thread Colin P. McCabe
I agree with Andrew and Konst here. I don't think the language is unclear in the rule, either... consensus with a minimum of one +1 clearly indicates that _other people_ are involved, not just one person. I would also mention that we created the branch committer role specifically to make it

Re: TimSort bug and its workaround

2015-03-02 Thread Colin P. McCabe
Thanks for bringing this up. If you can find any place where an array might realistically be larger than 67 million elements, then I guess file a JIRA for it. Also this array needs to be of objects, not of primitives (quicksort is used for those in jdk7, apparently). I can't think of any such
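As a rough illustration of the two conditions mentioned in these timsort threads (an array of objects rather than primitives, and more than 67108864 elements), a check along the following lines could be used when auditing call sites. The class and method names are hypothetical, and the threshold is simply the figure quoted in the thread, not something verified against the JDK source here.

```java
// Hypothetical helper; names are illustrative and not from the JDK or Hadoop.
final class TimSortRisk {
    // Element count quoted in the thread; equal to 2^26.
    static final long THRESHOLD = 67108864L;

    // True only when both conditions from the thread hold: the elements are
    // objects (not primitives) and there are more than THRESHOLD of them.
    static boolean mayBeAffected(Class<?> componentType, long length) {
        if (componentType.isPrimitive()) {
            return false;
        }
        return length > THRESHOLD;
    }

    // Convenience overload for an actual object array.
    static boolean mayBeAffected(Object[] array) {
        return mayBeAffected(array.getClass().getComponentType(), array.length);
    }

    public static void main(String[] args) {
        System.out.println(mayBeAffected(String.class, THRESHOLD + 1));
        System.out.println(mayBeAffected(int.class, THRESHOLD + 1));
        System.out.println(mayBeAffected(new String[] {"a", "b"}));
    }
}
```

The first overload takes a component type and a length so that the check can be exercised without allocating an array of that size.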

Re: Erratic Jenkins behavior

2015-02-18 Thread Colin P. McCabe
Hortonworks http://hortonworks.com/ On 2/12/15, 2:00 PM, Colin P. McCabe cmcc...@apache.org wrote: We could potentially use different .m2 directories for each executor. I think this has been brought up in the past as well. I'm not sure how maven handles concurrent access to the .m2

Re: Erratic Jenkins behavior

2015-02-12 Thread Colin P. McCabe
, Colin P. McCabe (cmcc...@apache.org) wrote: I'm sorry, I don't have any insight into this. With regard to HADOOP-11084, I thought that $BUILD_URL would be unique for each concurrent build, which would prevent build artifacts from getting mixed up between jobs. Based

Re: Erratic Jenkins behavior

2015-02-09 Thread Colin P. McCabe
I'm sorry, I don't have any insight into this. With regard to HADOOP-11084, I thought that $BUILD_URL would be unique for each concurrent build, which would prevent build artifacts from getting mixed up between jobs. Based on the value of PATCHPROCESS that Kihwal posted, perhaps this is not the

Re: Patch review process

2015-02-09 Thread Colin P. McCabe
at 2:55 AM, Steve Loughran ste...@hortonworks.com wrote: On 8 February 2015 at 09:55:42, Karthik Kambatla (ka...@cloudera.com) wrote: On Fri, Feb 6, 2015 at 6:14 PM, Colin P. McCabe cmcc...@apache.org wrote: I think it's healthy to have lots of JIRAs that are patch

Re: Patch review process

2015-02-06 Thread Colin P. McCabe
should also close issues that require too much work to fix, or at least mark them for Later. Not every idea needs to end in a commit, but silence is frustrating for contributors. -C +1. On Wed, Feb 4, 2015 at 10:24 AM, Colin P. McCabe cmcc...@apache.org wrote: I wonder if this work

Re: Patch review process

2015-02-04 Thread Colin P. McCabe
I wonder if this work logically falls under the release manager role. During a release, we generally spend a little bit of time thinking about what new features we added, systems we stabilized, interfaces we changed, etc. etc. This gives us some perspective to look backwards at old JIRAs and

Re: Patch review process

2015-01-29 Thread Colin P. McCabe
I really do not think it's worth looking at Reviewboard at reviews.apache.org again. We have used it in the past, and it has all the downsides of gerrit and none of the upsides. And some extra downsides of its own. * Splits the conversation into two places * No way to search the split out

Re: NFSv3 Filesystem Connector

2015-01-14 Thread Colin P. McCabe
Hi Niels, I agree that direct-attached storage seems more economical for many users. As an HDFS developer, I certainly have a dog in this fight as well :) But we should be respectful towards people trying to contribute code to Hadoop and evaluate the code on its own merits. It is up to our