Re: Erratic Jenkins behavior

2015-02-18 Thread Chris Nauroth
I'm pretty sure there is no guarantee of isolation on a shared
.m2/repository directory for multiple concurrent Maven processes.  I've
had a theory for a while that one build running "mvn install" can
overwrite the snapshot artifact that was just installed by another
concurrent build.  This can create bizarre problems, for example if a
patch introduces a new class in hadoop-common and then references that
class from hadoop-hdfs.

I expect using completely separate work directories for .m2/repository,
the patch directory, and the Jenkins workspace could resolve this.  The
typical cost for this kind of change is increased disk consumption and
increased build time, since Maven would need to download dependencies
fresh every time.
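For what it's worth, Maven already supports pointing a single build at a private repository via the -Dmaven.repo.local system property, so per-executor repositories wouldn't require any Maven changes. A rough shell sketch of the idea — the directory layout is an illustrative assumption, not the current job config; EXECUTOR_NUMBER and WORKSPACE are standard Jenkins build variables:

```shell
# Derive a per-executor local Maven repository so concurrent "mvn install"
# runs cannot clobber each other's freshly installed snapshot artifacts.
WORKSPACE="${WORKSPACE:-$PWD}"
EXECUTOR_NUMBER="${EXECUTOR_NUMBER:-0}"
REPO="$WORKSPACE/.m2-executor-$EXECUTOR_NUMBER/repository"
mkdir -p "$REPO"
echo "$REPO"
# The actual build step would then be:
# mvn -Dmaven.repo.local="$REPO" clean install
```

The trade-off is exactly the one described above: each executor's repository starts empty, so the first build per executor re-downloads everything.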

Chris Nauroth
Hortonworks
http://hortonworks.com/






On 2/12/15, 2:00 PM, Colin P. McCabe cmcc...@apache.org wrote:

We could potentially use different .m2 directories for each executor.
I think this has been brought up in the past as well.

I'm not sure how maven handles concurrent access to the .m2
directory... if it's not using flock or fcntl then it's not really
safe.  This might explain some of our missing class error issues.

Colin

On Tue, Feb 10, 2015 at 2:13 AM, Steve Loughran ste...@hortonworks.com
wrote:
 Mvn is a dark mystery to us all. I wouldn't trust it not to pick up things
from other builds if they ended up published to ~/.m2/repository during
the process



 On 9 February 2015 at 19:29:06, Colin P. McCabe
(cmcc...@apache.orgmailto:cmcc...@apache.org) wrote:

 I'm sorry, I don't have any insight into this. With regard to
 HADOOP-11084, I thought that $BUILD_URL would be unique for each
 concurrent build, which would prevent build artifacts from getting
 mixed up between jobs. Based on the value of PATCHPROCESS that Kihwal
 posted, perhaps this is not the case? Perhaps someone can explain how
 this is supposed to work (I am a Jenkins newbie).

 regards,
 Colin

 On Thu, Feb 5, 2015 at 10:42 AM, Yongjun Zhang yzh...@cloudera.com
wrote:
 Thanks Kihwal for bringing this up.

 Seems related to:

 https://issues.apache.org/jira/browse/HADOOP-11084

 Hi Andrew/Arpit/Colin/Steve, you guys worked on this jira before, any
 insight about the issue Kihwal described?

 Thanks.

 --Yongjun


 On Thu, Feb 5, 2015 at 9:49 AM, Kihwal Lee
kih...@yahoo-inc.com.invalid
 wrote:

 I am sure many of us have seen strange jenkins behavior out of the
 precommit builds.

 - build artifacts missing
 - serving build artifact belonging to another build. This also causes
 wrong precommit results to be posted on the bug.
 - etc.

 The latest one I saw is disappearance of the unit test stdout/stderr
file
 during a build. After a successful run of unit tests, the file
vanished, so
 the script could not cat it. It looked like another build process had
 deleted it, while this build was in progress.

 It might have something to do with the fact that the patch-dir is set
like
 following:

 
PATCHPROCESS=/home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/../patchprocess

 I don't have access to the jenkins build configs or the build machines, so I
 can't debug it further, but I think we need to take care of it sooner rather
 than later. Can anyone help?

 Kihwal




[jira] [Resolved] (HADOOP-11611) fix TestHTracedRESTReceiver unit test failures

2015-02-18 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe resolved HADOOP-11611.
---
Resolution: Fixed

wrong project

 fix TestHTracedRESTReceiver unit test failures
 --

 Key: HADOOP-11611
 URL: https://issues.apache.org/jira/browse/HADOOP-11611
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 3.2
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Critical

 Fix some issues with HTracedRESTReceiver that are resulting in unit test 
 failures.
 There were two main issues:
 * better way to launch htraced
 * fixes to the HTracedRESTReceiver logic



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HADOOP-11611) fix TestHTracedRESTReceiver unit test failures

2015-02-18 Thread Colin Patrick McCabe (JIRA)
Colin Patrick McCabe created HADOOP-11611:
-

 Summary: fix TestHTracedRESTReceiver unit test failures
 Key: HADOOP-11611
 URL: https://issues.apache.org/jira/browse/HADOOP-11611
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 3.2
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Critical


Fix some issues with HTracedRESTReceiver that are resulting in unit test 
failures.

There were two main issues:
* better way to launch htraced
* fixes to the HTracedRESTReceiver logic



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Erratic Jenkins behavior

2015-02-18 Thread Colin P. McCabe
Hmm.  I guess my thought would be that we would have a fixed number of
slots (i.e. executors on a single node with associated .m2
directories).  Then we wouldn't clear each .m2 in between runs, but we
would ensure that only one slot at a time had access to each
directory.

In that case, build times wouldn't increase that much (or really at
all, until a dependency changed... right?).  When a dependency changed
we'd have to do O(N_slots) amount of work, but dependencies don't
change that often.
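The fixed-slot scheme sketched above could be prototyped with flock(1) from util-linux, whose -n flag fails immediately if another process holds the lock. Everything here is illustrative (slot count, lock-file paths), not an actual Jenkins configuration:

```shell
# Claim the first free .m2 slot with a non-blocking flock. The lock is held
# for the duration of the wrapped command, so no two builds can use the same
# slot's repository at the same time. In a real job the wrapped command
# would be something like:
#   mvn -Dmaven.repo.local="$HOME/.m2-slot-$slot/repository" clean install
N_SLOTS=4
slot=0
while [ "$slot" -lt "$N_SLOTS" ]; do
  if flock -n "/tmp/m2-slot-$slot.lock" echo "claimed slot $slot"; then
    break
  fi
  slot=$((slot + 1))
done
```

Because the lock serializes whole builds per slot rather than individual repository writes, it is coarse-grained, but that matches the "only one slot at a time had access" idea.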

Of course, the current situation also generates a lot of extra work
because people need to rekick builds that failed for mystery reasons.

cheers.
Colin

On Wed, Feb 18, 2015 at 9:53 AM, Chris Nauroth cnaur...@hortonworks.com wrote:
 I'm pretty sure there is no guarantee of isolation on a shared
 .m2/repository directory for multiple concurrent Maven processes.  I've
 had a theory for a while that one build running "mvn install" can
 overwrite the snapshot artifact that was just installed by another
 concurrent build.  This can create bizarre problems, for example if a
 patch introduces a new class in hadoop-common and then references that
 class from hadoop-hdfs.

 I expect using completely separate work directories for .m2/repository,
 the patch directory, and the Jenkins workspace could resolve this.  The
 typical cost for this kind of change is increased disk consumption and
 increased build time, since Maven would need to download dependencies
 fresh every time.

 Chris Nauroth
 Hortonworks
 http://hortonworks.com/






 On 2/12/15, 2:00 PM, Colin P. McCabe cmcc...@apache.org wrote:

We could potentially use different .m2 directories for each executor.
I think this has been brought up in the past as well.

I'm not sure how maven handles concurrent access to the .m2
directory... if it's not using flock or fcntl then it's not really
safe.  This might explain some of our missing class error issues.

Colin

On Tue, Feb 10, 2015 at 2:13 AM, Steve Loughran ste...@hortonworks.com
wrote:
 Mvn is a dark mystery to us all. I wouldn't trust it not to pick up things
from other builds if they ended up published to ~/.m2/repository during
the process



 On 9 February 2015 at 19:29:06, Colin P. McCabe
(cmcc...@apache.orgmailto:cmcc...@apache.org) wrote:

 I'm sorry, I don't have any insight into this. With regard to
 HADOOP-11084, I thought that $BUILD_URL would be unique for each
 concurrent build, which would prevent build artifacts from getting
 mixed up between jobs. Based on the value of PATCHPROCESS that Kihwal
 posted, perhaps this is not the case? Perhaps someone can explain how
 this is supposed to work (I am a Jenkins newbie).

 regards,
 Colin

 On Thu, Feb 5, 2015 at 10:42 AM, Yongjun Zhang yzh...@cloudera.com
wrote:
 Thanks Kihwal for bringing this up.

 Seems related to:

 https://issues.apache.org/jira/browse/HADOOP-11084

 Hi Andrew/Arpit/Colin/Steve, you guys worked on this jira before, any
 insight about the issue Kihwal described?

 Thanks.

 --Yongjun


 On Thu, Feb 5, 2015 at 9:49 AM, Kihwal Lee
kih...@yahoo-inc.com.invalid
 wrote:

 I am sure many of us have seen strange jenkins behavior out of the
 precommit builds.

 - build artifacts missing
 - serving build artifact belonging to another build. This also causes
 wrong precommit results to be posted on the bug.
 - etc.

 The latest one I saw is disappearance of the unit test stdout/stderr
file
 during a build. After a successful run of unit tests, the file
vanished, so
 the script could not cat it. It looked like another build process had
 deleted it, while this build was in progress.

 It might have something to do with the fact that the patch-dir is set
like
 following:


PATCHPROCESS=/home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/../patchprocess

 I don't have access to the jenkins build configs or the build machines, so I
 can't debug it further, but I think we need to take care of it sooner rather
 than later. Can anyone help?

 Kihwal




[jira] [Created] (HADOOP-11613) Remove httpclient dependency from hadoop-azure

2015-02-18 Thread Akira AJISAKA (JIRA)
Akira AJISAKA created HADOOP-11613:
--

 Summary: Remove httpclient dependency from hadoop-azure
 Key: HADOOP-11613
 URL: https://issues.apache.org/jira/browse/HADOOP-11613
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Akira AJISAKA
Priority: Minor


Remove httpclient dependency from MockStorageInterface.java.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HADOOP-11614) Remove httpclient dependency from hadoop-openstack

2015-02-18 Thread Akira AJISAKA (JIRA)
Akira AJISAKA created HADOOP-11614:
--

 Summary: Remove httpclient dependency from hadoop-openstack
 Key: HADOOP-11614
 URL: https://issues.apache.org/jira/browse/HADOOP-11614
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Akira AJISAKA
Priority: Minor


Remove httpclient dependency from hadoop-openstack.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HADOOP-11612) Workaround for Curator's ChildReaper requiring Guava 15+

2015-02-18 Thread Robert Kanter (JIRA)
Robert Kanter created HADOOP-11612:
--

 Summary: Workaround for Curator's ChildReaper requiring Guava 15+
 Key: HADOOP-11612
 URL: https://issues.apache.org/jira/browse/HADOOP-11612
 Project: Hadoop Common
  Issue Type: Task
Affects Versions: 2.8.0
Reporter: Robert Kanter
Assignee: Robert Kanter


HADOOP-11492 upped the Curator version to 2.7.1, which makes the 
{{ChildReaper}} class use a method that only exists in newer versions of Guava 
(we have 11.0.2, and it needs 15+).  As a workaround, we can copy the 
{{ChildReaper}} class into hadoop-common and make a minor modification to allow 
it to work with Guava 11.

The {{ChildReaper}} is used by Curator to clean up old lock znodes.  Curator 
locks are needed by YARN-2942.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HADOOP-11615) Remove MRv1-specific terms from ServiceLevelAuth.md

2015-02-18 Thread Akira AJISAKA (JIRA)
Akira AJISAKA created HADOOP-11615:
--

 Summary: Remove MRv1-specific terms from ServiceLevelAuth.md
 Key: HADOOP-11615
 URL: https://issues.apache.org/jira/browse/HADOOP-11615
 Project: Hadoop Common
  Issue Type: Bug
  Components: documentation
Reporter: Akira AJISAKA
Priority: Minor


JobTracker should be ResourceManager, and {{hadoop mradmin}} should be {{yarn 
rmadmin}} in ServiceLevelAuth.md.
{code}
The service-level authorization configuration for the NameNode and JobTracker 
can be changed without restarting either of the Hadoop master daemons. The 
cluster administrator can change `$HADOOP_CONF_DIR/hadoop-policy.xml` on the 
master nodes and instruct the NameNode and JobTracker to reload their 
respective configurations via the `-refreshServiceAcl` switch to `dfsadmin` and 
`mradmin` commands respectively.

Refresh the service-level authorization configuration for the NameNode:

   $ bin/hadoop dfsadmin -refreshServiceAcl

Refresh the service-level authorization configuration for the JobTracker:

   $ bin/hadoop mradmin -refreshServiceAcl
{code}
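Applying the substitutions described above (JobTracker to ResourceManager, {{hadoop mradmin}} to {{yarn rmadmin}}), the passage would presumably read along these lines — a sketch of the intended fix, not the committed patch:
{code}
The service-level authorization configuration for the NameNode and ResourceManager 
can be changed without restarting either of the Hadoop master daemons. The 
cluster administrator can change `$HADOOP_CONF_DIR/hadoop-policy.xml` on the 
master nodes and instruct the NameNode and ResourceManager to reload their 
respective configurations via the `-refreshServiceAcl` switch to `dfsadmin` and 
`rmadmin` commands respectively.

Refresh the service-level authorization configuration for the NameNode:

   $ bin/hadoop dfsadmin -refreshServiceAcl

Refresh the service-level authorization configuration for the ResourceManager:

   $ bin/yarn rmadmin -refreshServiceAcl
{code}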



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)