Re: Erratic Jenkins behavior
I'm pretty sure there is no guarantee of isolation on a shared .m2/repository directory for multiple concurrent Maven processes. I've had a theory for a while that one build running "mvn install" can overwrite the snapshot artifact that was just installed by another concurrent build. This can create bizarre problems, for example if a patch introduces a new class in hadoop-common and then references that class from hadoop-hdfs.

I expect using completely separate work directories for .m2/repository, the patch directory, and the Jenkins workspace could resolve this. The typical cost for this kind of change is increased disk consumption and increased build time, since Maven would need to download dependencies fresh every time.

Chris Nauroth
Hortonworks
http://hortonworks.com/

On 2/12/15, 2:00 PM, Colin P. McCabe cmcc...@apache.org wrote:

We could potentially use different .m2 directories for each executor. I think this has been brought up in the past as well. I'm not sure how Maven handles concurrent access to the .m2 directory... if it's not using flock or fcntl then it's not really safe. This might explain some of our missing class error issues.

Colin

On Tue, Feb 10, 2015 at 2:13 AM, Steve Loughran ste...@hortonworks.com wrote:

Mvn is a dark mystery to us all. I wouldn't trust it not to pick up things from other builds if they ended up published to ~/.m2/repository during the process.

On 9 February 2015 at 19:29:06, Colin P. McCabe (cmcc...@apache.org) wrote:

I'm sorry, I don't have any insight into this. With regard to HADOOP-11084, I thought that $BUILD_URL would be unique for each concurrent build, which would prevent build artifacts from getting mixed up between jobs. Based on the value of PATCHPROCESS that Kihwal posted, perhaps this is not the case? Perhaps someone can explain how this is supposed to work (I am a Jenkins newbie).

regards,
Colin

On Thu, Feb 5, 2015 at 10:42 AM, Yongjun Zhang yzh...@cloudera.com wrote:

Thanks Kihwal for bringing this up. Seems related to: https://issues.apache.org/jira/browse/HADOOP-11084

Hi Andrew/Arpit/Colin/Steve, you guys worked on this jira before; any insight about the issue Kihwal described? Thanks.

--Yongjun

On Thu, Feb 5, 2015 at 9:49 AM, Kihwal Lee kih...@yahoo-inc.com.invalid wrote:

I am sure many of us have seen strange Jenkins behavior out of the precommit builds:
- build artifacts missing
- serving a build artifact belonging to another build, which also causes wrong precommit results to be posted on the bug
- etc.

The latest one I saw is the disappearance of the unit test stdout/stderr file during a build. After a successful run of unit tests, the file vanished, so the script could not cat it. It looked like another build process had deleted it while this build was in progress. It might have something to do with the fact that the patch dir is set like the following:

PATCHPROCESS=/home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/../patchprocess

I don't have access to the Jenkins build configs or the build machines, so I can't debug it further, but I think we need to take care of it sooner rather than later. Can anyone help?

Kihwal
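Chris's suggestion of completely separate work directories could be sketched as a precommit build step roughly like the following. This is only a hypothetical sketch, assuming Jenkins's standard WORKSPACE and BUILD_TAG environment variables and Maven's standard -Dmaven.repo.local property; the path names are illustrative, not what the actual build scripts use.

```shell
# Give each precommit run its own local Maven repository, so a
# concurrent "mvn install" in another build cannot overwrite the
# snapshot artifacts this build just installed.
# BUILD_TAG is unique per Jenkins build; fall back to "local" when
# running outside Jenkins.
REPO="${WORKSPACE:-/tmp}/m2-repo-${BUILD_TAG:-local}"
mkdir -p "$REPO"
echo "using isolated repository: $REPO"

# The build itself would then run entirely against that repository:
#   mvn -Dmaven.repo.local="$REPO" clean install
# Removing it afterwards bounds the disk cost Chris mentions:
#   rm -rf "$REPO"
```

The trade-off is exactly the one described above: every build starts from an empty repository, so every dependency is downloaded fresh.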
[jira] [Resolved] (HADOOP-11611) fix TestHTracedRESTReceiver unit test failures
[ https://issues.apache.org/jira/browse/HADOOP-11611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Colin Patrick McCabe resolved HADOOP-11611.
-------------------------------------------
Resolution: Fixed

wrong project

fix TestHTracedRESTReceiver unit test failures
----------------------------------------------

Key: HADOOP-11611
URL: https://issues.apache.org/jira/browse/HADOOP-11611
Project: Hadoop Common
Issue Type: Bug
Affects Versions: 3.2
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Critical

Fix some issues with HTracedRESTReceiver that are resulting in unit test failures. So there were two main issues:
* better way to launch htraced
* fixes to the HTracedRESTReceiver logic

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Created] (HADOOP-11611) fix TestHTracedRESTReceiver unit test failures
Colin Patrick McCabe created HADOOP-11611:
------------------------------------------

Summary: fix TestHTracedRESTReceiver unit test failures
Key: HADOOP-11611
URL: https://issues.apache.org/jira/browse/HADOOP-11611
Project: Hadoop Common
Issue Type: Bug
Affects Versions: 3.2
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Critical

Fix some issues with HTracedRESTReceiver that are resulting in unit test failures. So there were two main issues:
* better way to launch htraced
* fixes to the HTracedRESTReceiver logic

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
Re: Erratic Jenkins behavior
Hmm. I guess my thought would be that we would have a fixed number of slots (i.e., executors on a single node with associated .m2 directories). Then we wouldn't clear each .m2 in between runs, but we would ensure that only one slot at a time had access to each directory. In that case, build times wouldn't increase that much (or really at all, until a dependency changed... right?). When a dependency changed we'd have to do O(N_slots) amount of work, but dependencies don't change that often.

Of course, the current situation also generates a lot of extra work, because people need to re-kick builds that failed for mystery reasons.

cheers,
Colin

On Wed, Feb 18, 2015 at 9:53 AM, Chris Nauroth cnaur...@hortonworks.com wrote:

I'm pretty sure there is no guarantee of isolation on a shared .m2/repository directory for multiple concurrent Maven processes. I've had a theory for a while that one build running "mvn install" can overwrite the snapshot artifact that was just installed by another concurrent build. This can create bizarre problems, for example if a patch introduces a new class in hadoop-common and then references that class from hadoop-hdfs.

I expect using completely separate work directories for .m2/repository, the patch directory, and the Jenkins workspace could resolve this. The typical cost for this kind of change is increased disk consumption and increased build time, since Maven would need to download dependencies fresh every time.

Chris Nauroth
Hortonworks
http://hortonworks.com/
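Colin's slot scheme could be wired up using Jenkins's per-executor environment. A rough sketch, assuming the standard EXECUTOR_NUMBER variable (which Jenkins exports on each executor) and Maven's -Dmaven.repo.local property; the directory layout is illustrative:

```shell
# One persistent local repository per executor slot. Repositories are
# reused across builds, so dependencies are only re-downloaded when
# they actually change, but no two concurrently running builds ever
# share one, because a Jenkins executor runs at most one build at a time.
SLOT="${EXECUTOR_NUMBER:-0}"
REPO="${HOME:-/tmp}/.m2/repository-slot-${SLOT}"
mkdir -p "$REPO"
echo "executor ${SLOT} -> ${REPO}"

# Each build on this executor would then invoke:
#   mvn -Dmaven.repo.local="$REPO" install
```

When a dependency changes, each of the N slots re-downloads it once, which matches the O(N_slots) cost Colin estimates above.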
[jira] [Created] (HADOOP-11613) Remove httpclient dependency from hadoop-azure
Akira AJISAKA created HADOOP-11613:
-----------------------------------

Summary: Remove httpclient dependency from hadoop-azure
Key: HADOOP-11613
URL: https://issues.apache.org/jira/browse/HADOOP-11613
Project: Hadoop Common
Issue Type: Sub-task
Reporter: Akira AJISAKA
Priority: Minor

Remove httpclient dependency from MockStorageInterface.java.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Created] (HADOOP-11614) Remove httpclient dependency from hadoop-openstack
Akira AJISAKA created HADOOP-11614:
-----------------------------------

Summary: Remove httpclient dependency from hadoop-openstack
Key: HADOOP-11614
URL: https://issues.apache.org/jira/browse/HADOOP-11614
Project: Hadoop Common
Issue Type: Sub-task
Reporter: Akira AJISAKA
Priority: Minor

Remove httpclient dependency from hadoop-openstack.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Created] (HADOOP-11612) Workaround for Curator's ChildReaper requiring Guava 15+
Robert Kanter created HADOOP-11612:
-----------------------------------

Summary: Workaround for Curator's ChildReaper requiring Guava 15+
Key: HADOOP-11612
URL: https://issues.apache.org/jira/browse/HADOOP-11612
Project: Hadoop Common
Issue Type: Task
Affects Versions: 2.8.0
Reporter: Robert Kanter
Assignee: Robert Kanter

HADOOP-11492 upped the Curator version to 2.7.1, which makes the {{ChildReaper}} class use a method that only exists in newer versions of Guava (we have 11.0.2, and it needs 15+). As a workaround, we can copy the {{ChildReaper}} class into hadoop-common and make a minor modification to allow it to work with Guava 11. The {{ChildReaper}} is used by Curator to clean up old lock znodes. Curator locks are needed by YARN-2942.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Created] (HADOOP-11615) Remove MRv1-specific terms from ServiceLevelAuth.md
Akira AJISAKA created HADOOP-11615:
-----------------------------------

Summary: Remove MRv1-specific terms from ServiceLevelAuth.md
Key: HADOOP-11615
URL: https://issues.apache.org/jira/browse/HADOOP-11615
Project: Hadoop Common
Issue Type: Bug
Components: documentation
Reporter: Akira AJISAKA
Priority: Minor

JobTracker should be ResourceManager, and {{hadoop mradmin}} should be {{yarn rmadmin}} in ServiceLevelAuth.md.

{code}
The service-level authorization configuration for the NameNode and JobTracker can be changed without restarting either of the Hadoop master daemons. The cluster administrator can change `$HADOOP_CONF_DIR/hadoop-policy.xml` on the master nodes and instruct the NameNode and JobTracker to reload their respective configurations via the `-refreshServiceAcl` switch to `dfsadmin` and `mradmin` commands respectively.

Refresh the service-level authorization configuration for the NameNode:

    $ bin/hadoop dfsadmin -refreshServiceAcl

Refresh the service-level authorization configuration for the JobTracker:

    $ bin/hadoop mradmin -refreshServiceAcl
{code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
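A sketch of how the corrected passage might read, substituting the YARN-era daemon and commands for the MRv1 ones as the JIRA suggests. The exact wording in ServiceLevelAuth.md is up to the patch; `hdfs dfsadmin -refreshServiceAcl` and `yarn rmadmin -refreshServiceAcl` are the current command forms:

```
The service-level authorization configuration for the NameNode and
ResourceManager can be changed without restarting either of the Hadoop
master daemons. The cluster administrator can change
`$HADOOP_CONF_DIR/hadoop-policy.xml` on the master nodes and instruct the
NameNode and ResourceManager to reload their respective configurations via
the `-refreshServiceAcl` switch to `dfsadmin` and `rmadmin` commands
respectively.

Refresh the service-level authorization configuration for the NameNode:

    $ bin/hdfs dfsadmin -refreshServiceAcl

Refresh the service-level authorization configuration for the ResourceManager:

    $ bin/yarn rmadmin -refreshServiceAcl
```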