Re: Adding a node with Queue.withLock in Jenkins

2016-10-27 Thread Surya Gaddipati
Thanks Stephen. I appreciate your quick responses.

On Monday, October 17, 2016 at 9:53:23 AM UTC-5, Surya Gaddipati wrote:
>
> Hi all, 
>
> I am working on  plugin 
>  that 
> creates a single use computer/node whose lifecycle is tied to a single 
> build .  I am currently adding the node like this 
>
> Jenkins.getInstance().addNode(node); 
>
>
> However that method requires multiple Queue locks while doing so. 
>
>
> I believe in my particular case there is no need for queue locking since 
> only a single build can ever be scheduled on the computer via 
> LabelAssignmentAction. 
>
> I wanted to check ,
>
> 1. if that assumption is correct
>
> 2. If the team is open to accepting a patch to jenkins.instance for  a new 
> method which adds a node without the Queue lock. 
>
>
> Thank you.
>
>
> --Surya
>

-- 
You received this message because you are subscribed to the Google Groups 
"Jenkins Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to jenkinsci-dev+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/jenkinsci-dev/eb791424-a592-4262-b9c5-8c45e1485ba0%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Adding a node with Queue.withLock in Jenkins

2016-10-27 Thread Stephen Connolly
We had a hot fix plugin that would run on a timer and look for a specific
deadlock in a specific class with a specific stacktrace and then interrupt
a specific thread that we analysed as safe to interrupt.

It's not to hard to write one... the hard part is the analysis to identify
the criteria for being the deadlock to identify and then the criteria for
*safely* breaking the deadlock

I don't think any of our hotfixes are valid against current versions of
Jenkins (in fact their auto-deactivate logic would turn them off anyway as
they are typically targeted for precise specific versions of Jenkins -
safety reasons again)

On Thursday 27 October 2016, Surya Gaddipati 
wrote:

> Stephen,
>
> I remember you mentioning it a JIRA ( which I can't seem to find now for
> the life of me)  that Cloudbees Jenkins runs some sort of 'deadlock killer'
> on a timer .
>
> Is my memory serving me right about this? Do you have that in a plugin
> somewhere if so?
>
> Surya
>
> On Monday, October 17, 2016 at 9:53:23 AM UTC-5, Surya Gaddipati wrote:
>>
>> Hi all,
>>
>> I am working on  plugin
>>  that
>> creates a single use computer/node whose lifecycle is tied to a single
>> build .  I am currently adding the node like this
>>
>> Jenkins.getInstance().addNode(node);
>>
>>
>> However that method requires multiple Queue locks while doing so.
>>
>>
>> I believe in my particular case there is no need for queue locking since
>> only a single build can ever be scheduled on the computer via
>> LabelAssignmentAction.
>>
>> I wanted to check ,
>>
>> 1. if that assumption is correct
>>
>> 2. If the team is open to accepting a patch to jenkins.instance for  a
>> new method which adds a node without the Queue lock.
>>
>>
>> Thank you.
>>
>>
>> --Surya
>>
> --
> You received this message because you are subscribed to the Google Groups
> "Jenkins Developers" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to jenkinsci-dev+unsubscr...@googlegroups.com
> 
> .
> To view this discussion on the web visit https://groups.google.com/d/
> msgid/jenkinsci-dev/3d00dd3a-15cd-485c-937b-86f99182ecf4%
> 40googlegroups.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>


-- 
Sent from my phone

-- 
You received this message because you are subscribed to the Google Groups 
"Jenkins Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to jenkinsci-dev+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/jenkinsci-dev/CA%2BnPnMzR%3DgrahqycykA8AWT86zS6d8yuZMhBpEnXUBfbMVrL-g%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Adding a node with Queue.withLock in Jenkins

2016-10-26 Thread Surya Gaddipati
Stephen, 

I remember you mentioning it a JIRA ( which I can't seem to find now for 
the life of me)  that Cloudbees Jenkins runs some sort of 'deadlock killer' 
on a timer . 

Is my memory serving me right about this? Do you have that in a plugin 
somewhere if so? 

Surya 

On Monday, October 17, 2016 at 9:53:23 AM UTC-5, Surya Gaddipati wrote:
>
> Hi all, 
>
> I am working on  plugin 
>  that 
> creates a single use computer/node whose lifecycle is tied to a single 
> build .  I am currently adding the node like this 
>
> Jenkins.getInstance().addNode(node); 
>
>
> However that method requires multiple Queue locks while doing so. 
>
>
> I believe in my particular case there is no need for queue locking since 
> only a single build can ever be scheduled on the computer via 
> LabelAssignmentAction. 
>
> I wanted to check ,
>
> 1. if that assumption is correct
>
> 2. If the team is open to accepting a patch to jenkins.instance for  a new 
> method which adds a node without the Queue lock. 
>
>
> Thank you.
>
>
> --Surya
>

-- 
You received this message because you are subscribed to the Google Groups 
"Jenkins Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to jenkinsci-dev+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/jenkinsci-dev/3d00dd3a-15cd-485c-937b-86f99182ecf4%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Adding a node with Queue.withLock in Jenkins

2016-10-23 Thread Kanstantsin Shautsou
I saw some stories (i.e. job auth context) that looks like a roadmap where 
everybody can join and help. Something the same.

> On Oct 24, 2016, at 00:02, Stephen Connolly  
> wrote:
> 
> I'm waiting for you to provide your usual interjection as to how to get this 
> on the roadmap.
> 
> More seriously, there is possibly some work coming up for trying to 
> rearchitect the queue... but I remain skeptical at present!
> 
> On Sunday 23 October 2016, Kanstantsin Shautsou  > wrote:
> Is there any roadmap for it?
> 
> On Tuesday, October 18, 2016 at 7:48:43 PM UTC+3, Jesse Glick wrote:
> On Tue, Oct 18, 2016 at 12:10 PM, Surya Gaddipati
> > wrote:
> > neither concerrentmodificationexception nor scheduling
> > on zombie node are applicable here .
> 
> If you are bypassing a lock, a CME seems like a risk.
> 
> > a patch to core that adds nodes to jenkins
> > without unnecessary queue locking.
> 
> Sounds like the wrong approach to me. For the short term Stephen’s advice was
> 
> > For now, use the queue lock methods, when we remove the need for a lock 
> > they will become no-ops that the JVM will inline away for plugins compiled 
> > against current cores
> 
> The real fix would be to bypass `Queue` altogether for these cases and
> inline all the launching and remoting into the lifecycle of the build
> itself. The main issue is that there are some places in Jenkins core
> where it is assumed that a valid `Node`/`Computer` is also in the
> global lists, which is not something you want here. You can create a
> `FilePath` and `Launcher` not tied to a `Node` or `Computer`, but that
> is also poorly supported in various places. So I think a larger
> redesign is necessary. Adding a one-off method which is not in general
> safe to call would just make a compatible transition harder.
> 
> --
> You received this message because you are subscribed to the Google Groups 
> "Jenkins Developers" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to  <>jenkinsci-dev+unsubscribe@ 
> googlegroups.com 
> .
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/jenkinsci-dev/7123adbe-bae2-4a49-aec2-8826b8fc0e0a%40googlegroups.com
>  
> .
> For more options, visit https://groups.google.com/d/optout 
> .
> 
> 
> --
> Sent from my phone
> 
> --
> You received this message because you are subscribed to a topic in the Google 
> Groups "Jenkins Developers" group.
> To unsubscribe from this topic, visit 
> https://groups.google.com/d/topic/jenkinsci-dev/Z2hKvdBDHbw/unsubscribe 
> .
> To unsubscribe from this group and all its topics, send an email to 
> jenkinsci-dev+unsubscr...@googlegroups.com 
> .
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/jenkinsci-dev/CA%2BnPnMw1rQHbB%3Dr1eFOKM5%2BZXmbOKXaB9J%2B%2Bd0GhurrTpUnV1Q%40mail.gmail.com
>  
> .
> For more options, visit https://groups.google.com/d/optout 
> .

-- 
You received this message because you are subscribed to the Google Groups 
"Jenkins Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to jenkinsci-dev+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/jenkinsci-dev/5F8B9191-C695-4377-9B95-D5ED7895FDA7%40gmail.com.
For more options, visit https://groups.google.com/d/optout.


signature.asc
Description: Message signed with OpenPGP using GPGMail


Re: Adding a node with Queue.withLock in Jenkins

2016-10-23 Thread Stephen Connolly
I'm waiting for you to provide your usual interjection as to how to get
this on the roadmap.

More seriously, there is possibly some work coming up for trying to
rearchitect the queue... but I remain skeptical at present!

On Sunday 23 October 2016, Kanstantsin Shautsou 
wrote:

> Is there any roadmap for it?
>
> On Tuesday, October 18, 2016 at 7:48:43 PM UTC+3, Jesse Glick wrote:
>>
>> On Tue, Oct 18, 2016 at 12:10 PM, Surya Gaddipati
>>  wrote:
>> > neither concerrentmodificationexception nor scheduling
>> > on zombie node are applicable here .
>>
>> If you are bypassing a lock, a CME seems like a risk.
>>
>> > a patch to core that adds nodes to jenkins
>> > without unnecessary queue locking.
>>
>> Sounds like the wrong approach to me. For the short term Stephen’s advice
>> was
>>
>> > For now, use the queue lock methods, when we remove the need for a lock
>> they will become no-ops that the JVM will inline away for plugins compiled
>> against current cores
>>
>> The real fix would be to bypass `Queue` altogether for these cases and
>> inline all the launching and remoting into the lifecycle of the build
>> itself. The main issue is that there are some places in Jenkins core
>> where it is assumed that a valid `Node`/`Computer` is also in the
>> global lists, which is not something you want here. You can create a
>> `FilePath` and `Launcher` not tied to a `Node` or `Computer`, but that
>> is also poorly supported in various places. So I think a larger
>> redesign is necessary. Adding a one-off method which is not in general
>> safe to call would just make a compatible transition harder.
>>
> --
> You received this message because you are subscribed to the Google Groups
> "Jenkins Developers" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to jenkinsci-dev+unsubscr...@googlegroups.com
> 
> .
> To view this discussion on the web visit https://groups.google.com/d/
> msgid/jenkinsci-dev/7123adbe-bae2-4a49-aec2-8826b8fc0e0a%
> 40googlegroups.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>


-- 
Sent from my phone

-- 
You received this message because you are subscribed to the Google Groups 
"Jenkins Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to jenkinsci-dev+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/jenkinsci-dev/CA%2BnPnMw1rQHbB%3Dr1eFOKM5%2BZXmbOKXaB9J%2B%2Bd0GhurrTpUnV1Q%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Adding a node with Queue.withLock in Jenkins

2016-10-23 Thread Kanstantsin Shautsou
Is there any roadmap for it? 

On Tuesday, October 18, 2016 at 7:48:43 PM UTC+3, Jesse Glick wrote:
>
> On Tue, Oct 18, 2016 at 12:10 PM, Surya Gaddipati 
>  wrote: 
> > neither concerrentmodificationexception nor scheduling 
> > on zombie node are applicable here . 
>
> If you are bypassing a lock, a CME seems like a risk. 
>
> > a patch to core that adds nodes to jenkins 
> > without unnecessary queue locking. 
>
> Sounds like the wrong approach to me. For the short term Stephen’s advice 
> was 
>
> > For now, use the queue lock methods, when we remove the need for a lock 
> they will become no-ops that the JVM will inline away for plugins compiled 
> against current cores 
>
> The real fix would be to bypass `Queue` altogether for these cases and 
> inline all the launching and remoting into the lifecycle of the build 
> itself. The main issue is that there are some places in Jenkins core 
> where it is assumed that a valid `Node`/`Computer` is also in the 
> global lists, which is not something you want here. You can create a 
> `FilePath` and `Launcher` not tied to a `Node` or `Computer`, but that 
> is also poorly supported in various places. So I think a larger 
> redesign is necessary. Adding a one-off method which is not in general 
> safe to call would just make a compatible transition harder. 
>

-- 
You received this message because you are subscribed to the Google Groups 
"Jenkins Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to jenkinsci-dev+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/jenkinsci-dev/7123adbe-bae2-4a49-aec2-8826b8fc0e0a%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Adding a node with Queue.withLock in Jenkins

2016-10-18 Thread Surya Gaddipati
>. Adding a one-off method which is not in general 
safe to call would just make a compatible transition harder. 

ok yea this  sounds  like a reasonable concern. 
I would be glad to help with lock-free queue implementation  which would be 
ideal for  1 computer <=> 1 build  type of situation. 

On Monday, October 17, 2016 at 9:53:23 AM UTC-5, Surya Gaddipati wrote:
>
> Hi all, 
>
> I am working on  plugin 
>  that 
> creates a single use computer/node whose lifecycle is tied to a single 
> build .  I am currently adding the node like this 
>
> Jenkins.getInstance().addNode(node); 
>
>
> However that method requires multiple Queue locks while doing so. 
>
>
> I believe in my particular case there is no need for queue locking since 
> only a single build can ever be scheduled on the computer via 
> LabelAssignmentAction. 
>
> I wanted to check ,
>
> 1. if that assumption is correct
>
> 2. If the team is open to accepting a patch to jenkins.instance for  a new 
> method which adds a node without the Queue lock. 
>
>
> Thank you.
>
>
> --Surya
>

-- 
You received this message because you are subscribed to the Google Groups 
"Jenkins Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to jenkinsci-dev+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/jenkinsci-dev/e11a97dc-a149-43ca-af52-f8799f1233ba%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Adding a node with Queue.withLock in Jenkins

2016-10-18 Thread Jesse Glick
On Tue, Oct 18, 2016 at 12:10 PM, Surya Gaddipati
 wrote:
> neither concerrentmodificationexception nor scheduling
> on zombie node are applicable here .

If you are bypassing a lock, a CME seems like a risk.

> a patch to core that adds nodes to jenkins
> without unnecessary queue locking.

Sounds like the wrong approach to me. For the short term Stephen’s advice was

> For now, use the queue lock methods, when we remove the need for a lock they 
> will become no-ops that the JVM will inline away for plugins compiled against 
> current cores

The real fix would be to bypass `Queue` altogether for these cases and
inline all the launching and remoting into the lifecycle of the build
itself. The main issue is that there are some places in Jenkins core
where it is assumed that a valid `Node`/`Computer` is also in the
global lists, which is not something you want here. You can create a
`FilePath` and `Launcher` not tied to a `Node` or `Computer`, but that
is also poorly supported in various places. So I think a larger
redesign is necessary. Adding a one-off method which is not in general
safe to call would just make a compatible transition harder.

-- 
You received this message because you are subscribed to the Google Groups 
"Jenkins Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to jenkinsci-dev+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/jenkinsci-dev/CANfRfr21MKnN3sBY6j%3DkrvSWOfCt6Mrk23Y6TW-tK-65hUqf8g%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Adding a node with Queue.withLock in Jenkins

2016-10-18 Thread Surya Gaddipati
>thus encapsulated the API calls you are struggling with.

Oh yea I remember seeing that somewhere too. I remember seeing a commit on 
jenkins core for this too. 


But thats not the issue I am talking about here. 

Let me summarize this discussion,  

Jenkins.getInstance().addNode(node);


This code locks the queue. Locking the queue for adding single-use node is 
not necessary.  I am still not clear what Stephen is inferring with 
''issues start cropping up' , neither concerrentmodificationexception nor 
scheduling on zombie node are applicable here .

I am asking if you'd accept a patch to core that adds nodes to 
jenkins without unnecessary queue locking. 


-- Surya





On Tuesday, October 18, 2016 at 10:20:06 AM UTC-5, Jesse Glick wrote:
>
> On Mon, Oct 17, 2016 at 5:47 PM, Surya Gaddipati 
>  wrote: 
> > yes like that, there are a couple of differences 
>
> IIRC there was a more abstract plugin which purported to provide an 
> API to allocate a node for the duration of one build only, and which 
> thus encapsulated the API calls you are struggling with. I do not 
> recall offhand where it is. @Nicolas? 
>

-- 
You received this message because you are subscribed to the Google Groups 
"Jenkins Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to jenkinsci-dev+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/jenkinsci-dev/76a7fd8f-25d8-4a60-8a86-2936ce59c4db%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Adding a node with Queue.withLock in Jenkins

2016-10-18 Thread Jesse Glick
On Mon, Oct 17, 2016 at 5:47 PM, Surya Gaddipati
 wrote:
> yes like that, there are a couple of differences

IIRC there was a more abstract plugin which purported to provide an
API to allocate a node for the duration of one build only, and which
thus encapsulated the API calls you are struggling with. I do not
recall offhand where it is. @Nicolas?

-- 
You received this message because you are subscribed to the Google Groups 
"Jenkins Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to jenkinsci-dev+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/jenkinsci-dev/CANfRfr3g_XUX496%3DLyTBY3VMe3r28ndrDV3chqUC%2BNzAYmKJQw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Adding a node with Queue.withLock in Jenkins

2016-10-18 Thread Stephen Connolly
On 18 October 2016 at 14:00, Oliver Gondža  wrote:

> On 2016-10-18 14:52, Surya Gaddipati wrote:
>
> In theory, add should be safe and only remove requiring the lock...

>>>
>> Would you be open to  accepting the following patch to jenkins core
>>
>> |
>> Jenkins.getInstance().addNodeWithoutQueueLock(node)
>>
>
> This will further expose the nasty implementation detail we failed to
> hide: scheduling will choke once nodes are manipulated.
>
> Silly question, can not make the Queue/scheduling immune to Nodes changes
> - and get rid of this abomination? The idea that different parts of
> codebase needs to be aware of this and we even rely on plugins to play nice
> to protect scheduling consistency is frighting me.
>

As I said, I have identified a paper which should enable an approach that
would give us a lock free Queue/LoadBalancer... but I have not had the time
to put effort into it.

Part of the issue here is also the queue listeners. Last time we tried to
unpick the queue locking behaviour we ended up causing regressions in the
queue listeners.

I'll raise it internally with our PM to see if they'll be ok allocating
some time for me to experiment with queue de-locking...

-Stephen


> --
> oliver
>
> --
> You received this message because you are subscribed to the Google Groups
> "Jenkins Developers" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to jenkinsci-dev+unsubscr...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/ms
> gid/jenkinsci-dev/4f7b6a22-047e-3911-a0bc-74638a788d5d%40gmail.com.
>
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"Jenkins Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to jenkinsci-dev+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/jenkinsci-dev/CA%2BnPnMzTeqJXC28eP2BNoih14wE1NDYok-1%2B6mi7vWz%2BSPu%3DAA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Adding a node with Queue.withLock in Jenkins

2016-10-18 Thread Oliver Gondža

On 2016-10-18 14:52, Surya Gaddipati wrote:


In theory, add should be safe and only remove requiring the lock...


Would you be open to  accepting the following patch to jenkins core

|
Jenkins.getInstance().addNodeWithoutQueueLock(node)


This will further expose the nasty implementation detail we failed to 
hide: scheduling will choke once nodes are manipulated.


Silly question, can not make the Queue/scheduling immune to Nodes 
changes - and get rid of this abomination? The idea that different parts 
of codebase needs to be aware of this and we even rely on plugins to 
play nice to protect scheduling consistency is frighting me.


--
oliver

--
You received this message because you are subscribed to the Google Groups "Jenkins 
Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to jenkinsci-dev+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/jenkinsci-dev/4f7b6a22-047e-3911-a0bc-74638a788d5d%40gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Adding a node with Queue.withLock in Jenkins

2016-10-18 Thread Surya Gaddipati
Hi Stephen, 

I really appreciate your quick responses,

>> It's when you then end up down at 
https://github.com/jenkinsci/jenkins/blob/master/core/src/main/java/hudson/model/Queue.java#L1525-L1526
 that 
the issues start cropping up.

I am not quite sure what you mean 'issues start cropping up' , what would 
happen?. I've walked through the code but I still don't see where 
'concurrent mod exception' could be happening, perhaps you mean something 
else? For my plugin there is no chance that build would be assigned to a 
phantom node, if you mean that.

>> In theory, add should be safe and only remove requiring the lock... 

Would you be open to  accepting the following patch to jenkins core

Jenkins.getInstance().addNodeWithoutQueueLock( node) 


Thanks again. 

Surya





-- 
You received this message because you are subscribed to the Google Groups 
"Jenkins Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to jenkinsci-dev+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/jenkinsci-dev/fe06e639-e501-4a97-ad58-74e701747733%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Adding a node with Queue.withLock in Jenkins

2016-10-18 Thread Pavel Janousek
On the other hand, is it really necessary to call all associated Listeners 
(like in [1].update() - L222, L236, L242) when the Queue is still locked (is it 
guaranteed and documented for a ProvisioningListener contract)? It can be a 
quite time spent operation/call...

[1] 
https://github.com/jenkinsci/jenkins/blob/master/core/src/main/java/hudson/slaves/NodeProvisioner.java

--
Pavel Janousek
Senior Jenkins QA Engineer


- Original Message -
> From: "Stephen Connolly" 
> To: jenkinsci-dev@googlegroups.com
> Sent: Monday, October 17, 2016 11:51:10 PM
> Subject: Re: Adding a node with Queue.withLock in Jenkins
> 
> On 17 October 2016 at 22:42, Surya Gaddipati 
> wrote:
> 
> > Hi Stephen,
> >
> > Thank you for your quick response. I don't want to belabor this more than
> > it needs to be .
> >
> > >If you change the list of nodes during that time, the queue thread will
> > get a concurrent modification exception (best case) and die
> >
> > I am guessing you are referring to this line
> > 
> >
> >
> > for (Computer c : Jenkins.getInstance().getComputers())
> >
> > This is not same collection that addnode modifies, its a copy . So it
> > seems unlikely that would cause a concurrent modification exception.
> >
> >
> It's when you then end up down at
> https://github.com/jenkinsci/jenkins/blob/master/core/src/main/java/hudson/model/Queue.java#L1525-L1526
> that the issues start cropping up.
> 
> In theory, add should be safe and only remove requiring the lock... in
> practice we found that a lot of the cloud plugins blow up if we remove the
> add lock. If you want to remove the add lock for your Jenkins and have
> complete control over the plugins installed, you may get lucky
> 
> 
> >
> > > it will assign work to a node that no longer exists, except for a
> > phantom object reference that it held onto... or worse
> >
> > This won't happen in this specific case I am describing here since a
> > computer is tied to *one* particular build , it cannot be assigned to a
> > phantom node.
> >
> >
> > My main motivation is to remove queue locking as much as possible as  our
> > Jenkins instance has major scalability issues, almost *all* of it stemming
> > from overzealous queue locking.
> >
> >
> > > It is a long term goal to remove the queue lock
> >
> > I would love for queue.lock to go away  but do you think it realistic to
> > expect that to go away anytime soon given how deeply embedded it is into
> > the core structure of the code. eg: Here it looks like we are acquiring a
> > queue lock
> > within
> > a queue lock
> > 
> > ?
> >
> >
> >
> >
> >
> > On Monday, October 17, 2016 at 1:54:11 PM UTC-5, Stephen Connolly wrote:
> >
> >>
> >>
> >> On Monday 17 October 2016, Surya Gaddipati  wrote:
> >>
> >>> Thanks Stephen for your quick response.
> >>>
> >>> >  as otherwise the scheduling will blow up in your face.
> >>>
> >>> Curious, What do you mean by this ?
> >>>
> >>
> >> When the queue starts scheduling it has to iterate the list of nodes to
> >> build up the candidate nodes for the load balancing algorithm.
> >>
> >> If you change the list of nodes during that time, the queue thread will
> >> get a concurrent modification exception (best case) and die... or it will
> >> assign work to a node that no longer exists, except for a phantom object
> >> reference that it held onto... or worse
> >>
> >> It is a long term goal to remove the queue lock as it impacts scalability
> >> when you have 1000's of executors that can match a job... but even then
> >> the
> >> impact is not so large that removing the lock would be a priority.
> >>
> >> For now, use the queue lock methods, when we remove the need for a lock
> >> they will become no-ops that the JVM will inline away for plugins compiled
> >> against current cores
> >>
> >>
> >>> thanks again.
> >>>
> >>> On Monday, October 17, 2016 at 9:53:23 AM UTC-5, Surya Gaddipati wrote:
> 
>  Hi all,
> 
>  I am working on  plugin
>   that
>  creates a single use computer/node whose lifecycle is tied to a single
>  build .  I am currently adding the node like this
> 
>  Jenkins.getInstance().addNode(node);
> 
> 
>  However that method requires multiple Queue locks while doing so.
> 
> 
>  I believe in my particular case there is no need for queue locking
>  since only a single build can ever be scheduled on the computer via
>  LabelAssignmentAction.
> 
>  I wanted to check ,
> 
>  1. if that assumption is correct
> 
>  2. If the team is open to 

Re: Adding a node with Queue.withLock in Jenkins

2016-10-17 Thread Stephen Connolly
On 17 October 2016 at 22:42, Surya Gaddipati 
wrote:

> Hi Stephen,
>
> Thank you for your quick response. I don't want to belabor this more than
> it needs to be .
>
> >If you change the list of nodes during that time, the queue thread will
> get a concurrent modification exception (best case) and die
>
> I am guessing you are referring to this line
> 
>
>
> for (Computer c : Jenkins.getInstance().getComputers())
>
> This is not same collection that addnode modifies, its a copy . So it
> seems unlikely that would cause a concurrent modification exception.
>
>
It's when you then end up down at
https://github.com/jenkinsci/jenkins/blob/master/core/src/main/java/hudson/model/Queue.java#L1525-L1526
that the issues start cropping up.

In theory, add should be safe and only remove requiring the lock... in
practice we found that a lot of the cloud plugins blow up if we remove the
add lock. If you want to remove the add lock for your Jenkins and have
complete control over the plugins installed, you may get lucky


>
> > it will assign work to a node that no longer exists, except for a
> phantom object reference that it held onto... or worse
>
> This won't happen in this specific case I am describing here since a
> computer is tied to *one* particular build , it cannot be assigned to a
> phantom node.
>
>
> My main motivation is to remove queue locking as much as possible as  our
> Jenkins instance has major scalability issues, almost *all* of it stemming
> from overzealous queue locking.
>
>
> > It is a long term goal to remove the queue lock
>
> I would love for queue.lock to go away  but do you think it realistic to
> expect that to go away anytime soon given how deeply embedded it is into
> the core structure of the code. eg: Here it looks like we are acquiring a
> queue lock
> within
> a queue lock
> 
> ?
>
>
>
>
>
> On Monday, October 17, 2016 at 1:54:11 PM UTC-5, Stephen Connolly wrote:
>
>>
>>
>> On Monday 17 October 2016, Surya Gaddipati  wrote:
>>
>>> Thanks Stephen for your quick response.
>>>
>>> >  as otherwise the scheduling will blow up in your face.
>>>
>>> Curious, What do you mean by this ?
>>>
>>
>> When the queue starts scheduling it has to iterate the list of nodes to
>> build up the candidate nodes for the load balancing algorithm.
>>
>> If you change the list of nodes during that time, the queue thread will
>> get a concurrent modification exception (best case) and die... or it will
>> assign work to a node that no longer exists, except for a phantom object
>> reference that it held onto... or worse
>>
>> It is a long term goal to remove the queue lock as it impacts scalability
>> when you have 1000's of executors that can match a job... but even then the
>> impact is not so large that removing the lock would be a priority.
>>
>> For now, use the queue lock methods, when we remove the need for a lock
>> they will become no-ops that the JVM will inline away for plugins compiled
>> against current cores
>>
>>
>>> thanks again.
>>>
>>> On Monday, October 17, 2016 at 9:53:23 AM UTC-5, Surya Gaddipati wrote:

 Hi all,

 I am working on  plugin
  that
 creates a single use computer/node whose lifecycle is tied to a single
 build .  I am currently adding the node like this

 Jenkins.getInstance().addNode(node);


 However that method requires multiple Queue locks while doing so.


 I believe in my particular case there is no need for queue locking
 since only a single build can ever be scheduled on the computer via
 LabelAssignmentAction.

 I wanted to check ,

 1. if that assumption is correct

 2. If the team is open to accepting a patch to jenkins.instance for  a
 new method which adds a node without the Queue lock.


 Thank you.


 --Surya

>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "Jenkins Developers" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to jenkinsci-dev+unsubscr...@googlegroups.com.
>>> To view this discussion on the web visit https://groups.google.com/d/ms
>>> gid/jenkinsci-dev/6a9cb43d-8eab-4413-89e9-f85775f5bc03%40goo
>>> glegroups.com
>>> 
>>> .
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>>
>> --
>> Sent from my phone
>>
> --
> You received this message because you are subscribed to the Google Groups

Re: Adding a node with Queue.withLock in Jenkins

2016-10-17 Thread Surya Gaddipati
Hi Jesse, 

yes like that, there are a couple of differences

My plugin is geared towards swarm( understands swarm resource availability) 
, has docker-cache driver 

 
for build specific cache on overlayfs, labels are defined on global instead 
of local, dashboards ect.  

On Monday, October 17, 2016 at 2:14:37 PM UTC-5, Jesse Glick wrote:
>
> On Mon, Oct 17, 2016 at 10:53 AM, Surya Gaddipati 
>  wrote: 
> > I am working on  plugin that creates a single use computer/node whose 
> > lifecycle is tied to a single build . 
>
> Kind of like https://github.com/jenkinsci/docker-slaves-plugin 
> 
>  
> ? 
>

-- 
You received this message because you are subscribed to the Google Groups 
"Jenkins Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to jenkinsci-dev+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/jenkinsci-dev/11716e2b-d6ba-44be-a7ea-c6ad52dbb05a%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Adding a node with Queue.withLock in Jenkins

2016-10-17 Thread Surya Gaddipati
Hi Stephen, 

Thank you for your quick response. I don't want to belabor this more than 
it needs to be . 

>If you change the list of nodes during that time, the queue thread will 
get a concurrent modification exception (best case) and die

I am guessing you are referring to this line 



for (Computer c : Jenkins.getInstance().getComputers())

This is not same collection that addnode modifies, its a copy . So it seems 
unlikely that would cause a concurrent modification exception. 


> it will assign work to a node that no longer exists, except for a phantom 
object reference that it held onto... or worse

This won't happen in this specific case I am describing here since a 
computer is tied to *one* particular build , it cannot be assigned to a 
phantom node. 


My main motivation is to remove queue locking as much as possible as  our 
Jenkins instance has major scalability issues, almost *all* of it stemming 
from overzealous queue locking. 


> It is a long term goal to remove the queue lock 

I would love for queue.lock to go away  but do you think it realistic to 
expect that to go away anytime soon given how deeply embedded it is into 
the core structure of the code. eg: Here it looks like we are acquiring a 
queue lock 
within
 
a queue lock 

?





On Monday, October 17, 2016 at 1:54:11 PM UTC-5, Stephen Connolly wrote:
>
>
>
> On Monday 17 October 2016, Surya Gaddipati  wrote:
>
>> Thanks Stephen for your quick response. 
>>
>> >  as otherwise the scheduling will blow up in your face.
>>
>> Curious, What do you mean by this ? 
>>
>
> When the queue starts scheduling it has to iterate the list of nodes to 
> build up the candidate nodes for the load balancing algorithm.
>
> If you change the list of nodes during that time, the queue thread will 
> get a concurrent modification exception (best case) and die... or it will 
> assign work to a node that no longer exists, except for a phantom object 
> reference that it held onto... or worse
>
> It is a long term goal to remove the queue lock as it impacts scalability 
> when you have 1000's of executors that can match a job... but even then the 
> impact is not so large that removing the lock would be a priority.
>
> For now, use the queue lock methods, when we remove the need for a lock 
> they will become no-ops that the JVM will inline away for plugins compiled 
> against current cores
>
>
>> thanks again.
>>
>> On Monday, October 17, 2016 at 9:53:23 AM UTC-5, Surya Gaddipati wrote:
>>>
>>> Hi all, 
>>>
>>> I am working on  plugin 
>>>  that 
>>> creates a single use computer/node whose lifecycle is tied to a single 
>>> build .  I am currently adding the node like this 
>>>
>>> Jenkins.getInstance().addNode(node); 
>>>
>>>
>>> However that method requires multiple Queue locks while doing so. 
>>>
>>>
>>> I believe in my particular case there is no need for queue locking since 
>>> only a single build can ever be scheduled on the computer via 
>>> LabelAssignmentAction. 
>>>
>>> I wanted to check ,
>>>
>>> 1. if that assumption is correct
>>>
>>> 2. If the team is open to accepting a patch to jenkins.instance for  a 
>>> new method which adds a node without the Queue lock. 
>>>
>>>
>>> Thank you.
>>>
>>>
>>> --Surya
>>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "Jenkins Developers" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to jenkinsci-dev+unsubscr...@googlegroups.com.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/jenkinsci-dev/6a9cb43d-8eab-4413-89e9-f85775f5bc03%40googlegroups.com
>>  
>> 
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>
> -- 
> Sent from my phone
>

-- 
You received this message because you are subscribed to the Google Groups 
"Jenkins Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to jenkinsci-dev+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/jenkinsci-dev/8daeff83-4cbb-4654-8886-c78586655d91%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Adding a node with Queue.withLock in Jenkins

2016-10-17 Thread Jesse Glick
On Mon, Oct 17, 2016 at 10:53 AM, Surya Gaddipati
 wrote:
> I am working on  plugin that creates a single use computer/node whose
> lifecycle is tied to a single build .

Kind of like https://github.com/jenkinsci/docker-slaves-plugin ?

-- 
You received this message because you are subscribed to the Google Groups 
"Jenkins Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to jenkinsci-dev+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/jenkinsci-dev/CANfRfr03aHxc8kmCR8rXm3-Z%2Bt%3D3W5wYj_%3D_F_R4%3Drvv_kbzww%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Adding a node with Queue.withLock in Jenkins

2016-10-17 Thread Stephen Connolly
On Monday 17 October 2016, Surya Gaddipati  wrote:

> Thanks Stephen for your quick response.
>
> >  as otherwise the scheduling will blow up in your face.
>
> Curious, What do you mean by this ?
>

When the queue starts scheduling it has to iterate the list of nodes to
build up the candidate nodes for the load balancing algorithm.

If you change the list of nodes during that time, the queue thread will get
a concurrent modification exception (best case) and die... or it will
assign work to a node that no longer exists, except for a phantom object
reference that it held onto... or worse

It is a long term goal to remove the queue lock as it impacts scalability
when you have 1000's of executors that can match a job... but even then the
impact is not so large that removing the lock would be a priority.

For now, use the queue lock methods, when we remove the need for a lock
they will become no-ops that the JVM will inline away for plugins compiled
against current cores


> thanks again.
>
> On Monday, October 17, 2016 at 9:53:23 AM UTC-5, Surya Gaddipati wrote:
>>
>> Hi all,
>>
>> I am working on  plugin
>>  that
>> creates a single use computer/node whose lifecycle is tied to a single
>> build .  I am currently adding the node like this
>>
>> Jenkins.getInstance().addNode(node);
>>
>>
>> However that method requires multiple Queue locks while doing so.
>>
>>
>> I believe in my particular case there is no need for queue locking since
>> only a single build can ever be scheduled on the computer via
>> LabelAssignmentAction.
>>
>> I wanted to check ,
>>
>> 1. if that assumption is correct
>>
>> 2. If the team is open to accepting a patch to jenkins.instance for  a
>> new method which adds a node without the Queue lock.
>>
>>
>> Thank you.
>>
>>
>> --Surya
>>
> --
> You received this message because you are subscribed to the Google Groups
> "Jenkins Developers" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to jenkinsci-dev+unsubscr...@googlegroups.com
> 
> .
> To view this discussion on the web visit https://groups.google.com/d/
> msgid/jenkinsci-dev/6a9cb43d-8eab-4413-89e9-f85775f5bc03%
> 40googlegroups.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>


-- 
Sent from my phone

-- 
You received this message because you are subscribed to the Google Groups 
"Jenkins Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to jenkinsci-dev+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/jenkinsci-dev/CA%2BnPnMw1%3Dv3VnWRcr9xaX5kv-yOdhz%2B_0bUTpUPwXpd20nE0zw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Adding a node with Queue.withLock in Jenkins

2016-10-17 Thread Surya Gaddipati
Thanks Stephen for your quick response. 

>  as otherwise the scheduling will blow up in your face.

Curious, What do you mean by this ? 

thanks again.

On Monday, October 17, 2016 at 9:53:23 AM UTC-5, Surya Gaddipati wrote:
>
> Hi all, 
>
> I am working on  plugin 
>  that 
> creates a single use computer/node whose lifecycle is tied to a single 
> build .  I am currently adding the node like this 
>
> Jenkins.getInstance().addNode(node); 
>
>
> However that method requires multiple Queue locks while doing so. 
>
>
> I believe in my particular case there is no need for queue locking since 
> only a single build can ever be scheduled on the computer via 
> LabelAssignmentAction. 
>
> I wanted to check ,
>
> 1. if that assumption is correct
>
> 2. If the team is open to accepting a patch to jenkins.instance for  a new 
> method which adds a node without the Queue lock. 
>
>
> Thank you.
>
>
> --Surya
>

-- 
You received this message because you are subscribed to the Google Groups 
"Jenkins Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to jenkinsci-dev+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/jenkinsci-dev/6a9cb43d-8eab-4413-89e9-f85775f5bc03%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Adding a node with Queue.withLock in Jenkins

2016-10-17 Thread Stephen Connolly
Adding/removing a node currently requires that the queue lock be held as
otherwise the scheduling will blow up in your face.

Best plan is typically to off-load to a clean thread if your call path is
in any way complex

On Monday 17 October 2016, Surya Gaddipati  wrote:

> ooops forgot to mention. Ditto for  `removeNode` method.
>
> On Monday, October 17, 2016 at 9:53:23 AM UTC-5, Surya Gaddipati wrote:
>>
>> Hi all,
>>
>> I am working on  plugin
>>  that
>> creates a single use computer/node whose lifecycle is tied to a single
>> build .  I am currently adding the node like this
>>
>> Jenkins.getInstance().addNode(node);
>>
>>
>> However that method requires multiple Queue locks while doing so.
>>
>>
>> I believe in my particular case there is no need for queue locking since
>> only a single build can ever be scheduled on the computer via
>> LabelAssignmentAction.
>>
>> I wanted to check ,
>>
>> 1. if that assumption is correct
>>
>> 2. If the team is open to accepting a patch to jenkins.instance for  a
>> new method which adds a node without the Queue lock.
>>
>>
>> Thank you.
>>
>>
>> --Surya
>>
> --
> You received this message because you are subscribed to the Google Groups
> "Jenkins Developers" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to jenkinsci-dev+unsubscr...@googlegroups.com
> 
> .
> To view this discussion on the web visit https://groups.google.com/d/
> msgid/jenkinsci-dev/377a23fc-6043-4e62-b8fa-b9cf58eb6684%
> 40googlegroups.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>


-- 
Sent from my phone

-- 
You received this message because you are subscribed to the Google Groups 
"Jenkins Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to jenkinsci-dev+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/jenkinsci-dev/CA%2BnPnMzL894Rg3szt%2BmpDCHP-oXgON9XZr4M3abJKvfhSdWH5g%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Adding a node with Queue.withLock in Jenkins

2016-10-17 Thread Surya Gaddipati
ooops forgot to mention. Ditto for  `removeNode` method. 

On Monday, October 17, 2016 at 9:53:23 AM UTC-5, Surya Gaddipati wrote:
>
> Hi all, 
>
> I am working on  plugin 
>  that 
> creates a single use computer/node whose lifecycle is tied to a single 
> build .  I am currently adding the node like this 
>
> Jenkins.getInstance().addNode(node); 
>
>
> However that method requires multiple Queue locks while doing so. 
>
>
> I believe in my particular case there is no need for queue locking since 
> only a single build can ever be scheduled on the computer via 
> LabelAssignmentAction. 
>
> I wanted to check ,
>
> 1. if that assumption is correct
>
> 2. If the team is open to accepting a patch to jenkins.instance for  a new 
> method which adds a node without the Queue lock. 
>
>
> Thank you.
>
>
> --Surya
>

-- 
You received this message because you are subscribed to the Google Groups 
"Jenkins Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to jenkinsci-dev+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/jenkinsci-dev/377a23fc-6043-4e62-b8fa-b9cf58eb6684%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.