Re: [PATCH stable-2.15 1/2] KVM: handle gracefully too old psutil versions

2016-06-21 Thread 'Iustin Pop' via ganeti-devel
On 21 June 2016 at 16:28, Brian Foley  wrote:

> On Sat, Jun 18, 2016 at 04:53:25AM +0200, Iustin Pop wrote:
> > On 2016-06-15 10:23:57, Brian Foley wrote:
> > > Additionally, 0.5.0 had a psutil.Process.{get,set}_cpu_affinity() API,
> > > which we use in the kvm code. In 2.0.0 API was renamed to
> cpu_affinity()
> > > and the old API deprecated, and in 3.0 the old {get,set} API was
> removed.
> > >
> > > So we currently need to restrict psutils to 2.x, but maybe it's just
> > > easier to add an __attr__ check for the new affinity API, but perhaps
> > > this should be done in a separate patch.
> >
> > Just to see if I understand: we don't support 3.x+ due to the
> > cpu_affinity API only?
>
> I think so yes. That's the only thing I saw when I looked through the
> psutil
> version history a few months ago.
>
> Thinking a little more about what I wrote above, since we require psutil
> >= 2.0
> for kvm anyway, then it would be perfectly fine to replace the use of the
> 0.5.0
> affinity API in the kvm code with the >= 2.0 API. I don't know if this on
> its
> own would be sufficient to get it to work with psutil >= 3.0, but it would
> be
> a step in the right direction. We can look at this another time though.
>

Yes, that's why I asked—it was kind of surprising to see the use of 0.5.0
API but not actually supporting 0.5.0 :)

thanks,
iustin


Re: [PATCH stable-2.15] KVM: handle gracefully too old/too new psutil versions

2016-06-21 Thread 'Iustin Pop' via ganeti-devel
On 21 June 2016 at 16:29, Brian Foley  wrote:

> On Sat, Jun 18, 2016 at 04:54:51AM +0200, Iustin Pop wrote:
> > From: Iustin Pop 
> >
> > My previous pylint cleanups were done without psutil installed; as soon
> > I installed it, pylint showed that the wheezy's version of psutil is too
> > old (0.5.1), not having the cpu_count() function which was introduced in
> > 2.0.0. Furthermore, thanks to Brian, it turns out that too new versions
> > are also unsupported.
> >
> > This change adds a simple way to detect this and to degrade gracefully
> > by throwing an appropriate exception instead of an unclear one at
> > runtime. Tested with wheezy's 0.5.0 and sid's 4.1.1 versions. It also
> > updates the INSTALL file to note the supported versions and that
> > [2.0.0,2.2.0) had issues with FH leaks.
> >
> > Signed-off-by: Iustin Pop 
>
> LGTM, and sorry for the delay in reviewing.
>

No worries, thanks!


Re: [PATCH stable-2.15 2/2] Cleanup more pylint/pep8/apidoc errors

2016-06-21 Thread 'Brian Foley' via ganeti-devel
On Wed, Jun 15, 2016 at 07:18:39AM +0200, Iustin Pop wrote:
> From: Iustin Pop 
> 
> Hopefully this makes stable-2.15 clean and able to pass a buildbot run.
> The changes should all be self-explanatory, except test/mocks.py one:
> there were more unused arguments, so I added a silence for that at class
> level, and removed the '_' on _ec_id since it was superfluous now.
> 
> Signed-off-by: Iustin Pop 

LGTM, and thanks for the cleanup work.

Brian.


Re: [PATCH stable-2.15] KVM: handle gracefully too old/too new psutil versions

2016-06-21 Thread 'Brian Foley' via ganeti-devel
On Sat, Jun 18, 2016 at 04:54:51AM +0200, Iustin Pop wrote:
> From: Iustin Pop 
> 
> My previous pylint cleanups were done without psutil installed; as soon
> I installed it, pylint showed that the wheezy's version of psutil is too
> old (0.5.1), not having the cpu_count() function which was introduced in
> 2.0.0. Furthermore, thanks to Brian, it turns out that too new versions
> are also unsupported.
> 
> This change adds a simple way to detect this and to degrade gracefully
> by throwing an appropriate exception instead of an unclear one at
> runtime. Tested with wheezy's 0.5.0 and sid's 4.1.1 versions. It also
> updates the INSTALL file to note the supported versions and that
> [2.0.0,2.2.0) had issues with FH leaks.
> 
> Signed-off-by: Iustin Pop 

LGTM, and sorry for the delay in reviewing.

Thanks,
Brian.


Re: [PATCH stable-2.15 1/2] KVM: handle gracefully too old psutil versions

2016-06-21 Thread 'Brian Foley' via ganeti-devel
On Sat, Jun 18, 2016 at 04:53:25AM +0200, Iustin Pop wrote:
> On 2016-06-15 10:23:57, Brian Foley wrote:
> > On Wed, Jun 15, 2016 at 07:18:38AM +0200, Iustin Pop wrote:
> > > From: Iustin Pop 
> > > 
> > > My previous pylint cleanups were done without psutil installed; as soon
> > > I installed it, pylint showed that the wheezy's version of psutil is too
> > > old (0.5.1), not having the cpu_count() function which was introduced in
> > > 2.0.0.
> > > 
> > > This change adds a simple way to detect this and to degrade gracefully
> > > by throwing an appropriate exception instead of an unclear one at
> > > runtime.
> > 
> > This is good, but we could take it a little further. First of all we
> > support any version 2.x version of psutil from 2.0.0 to 2.2.1. However
> > we probably want to warn if the user is running <2.2.0 because 2.2.0
> > fixed a number of file handle leaks.
> 
> Uh, that is hard to do properly, I don't know if many people watch
> configure-time warnings… I'll add a note to INSTALL.

Excellent. Thanks.

> > Additionally, 0.5.0 had a psutil.Process.{get,set}_cpu_affinity() API,
> > which we use in the kvm code. In 2.0.0 API was renamed to cpu_affinity()
> > and the old API deprecated, and in 3.0 the old {get,set} API was removed.
> > 
> > So we currently need to restrict psutils to 2.x, but maybe it's just
> > easier to add an __attr__ check for the new affinity API, but perhaps
> > this should be done in a separate patch.
> 
> Just to see if I understand: we don't support 3.x+ due to the
> cpu_affinity API only?

I think so yes. That's the only thing I saw when I looked through the psutil
version history a few months ago.

Thinking a little more about what I wrote above, since we require psutil >= 2.0
for kvm anyway, then it would be perfectly fine to replace the use of the 0.5.0
affinity API in the kvm code with the >= 2.0 API. I don't know if this on its
own would be sufficient to get it to work with psutil >= 3.0, but it would be
a step in the right direction. We can look at this another time though.

Cheers,
Brian.


Re: [PATCH stable-2.17] Prevent watcher from holding lock on verify group

2016-06-21 Thread 'Brian Foley' via ganeti-devel
On Fri, Jun 17, 2016 at 04:31:55PM +0100, 'Federico Morg Pareschi' via 
ganeti-devel wrote:
> The ganeti-watcher holds the group file lock for too long, until after
> the execution of a group-verify-disk job. This locks for a long time if
> there are other jobs already running and blocking the verify from
> executing. When the lock is held, another ganeti-watcher run cannot be
> scheduled, so this prevents the ganeti-watcher from running for several
> minutes.
> 
> With this commit, the lock is released before running the VerifyDisks
> operation, so even if the submitted job gets stuck in the Job Queue, a
> subsequient ganeti-watcher run would still happen.
> 
> As an additional change, an incorrect docstring was also removed.
> 
> Signed-off-by: Federico Morg Pareschi 

OK, thanks for the patch and the discussion, and sorry for the delay in
reviewing this.

As you say, we'll have to figure out a good way to avoid multiple redundant
VerifyDisks running at once, but we can worry about that later, and for now this
LGTM.

Brian.


Re: [PATCH stable-2.16] Special case WaitForJobChange to reduce heap use

2016-06-21 Thread 'Viktor Bachraty' via ganeti-devel
Hmm, interesting. I also like when things are handled uniformly, but here I
think the issue arises from the fact that resources are allocated even
before taking a look at what does the requestor expect us to do (the result
of decodeLuxiCall). I think this inversion was introduced to reduce code
repetition, to have more uniformity assuming that config loading is cheap.
So we can have 3 approaches fixing it - making config loading cheap (maybe
ShortByteString + aeson would make it cheap enough), branching before doing
the config loading (as it is done in this patch) or releasing the config
when it's not needed (as you suggested). I like the current solution as it
explicitly avoids doing unnecessary steps, but maybe it's just my
insufficient experience with Haskell (let's leave the compiler to decide
what has to be evaluated and what doesn't). Maybe introducing
handleCallWithConfig and handleCallWithoutConfig would make it feel less
like a manual workaround?

On Sat, Jun 18, 2016 at 4:03 AM, Iustin Pop  wrote:

> On 2016-06-15 17:25:18, Ganeti Development List wrote:
> > As we talked offline, I like the fact that we are not going through the
> > generic handleCall if we do not need the config regardless of the
> > optimisation (which is awesome and is the main motivation for this
> patch).
> > I'm wondering if we still need to keep in the expensive code path - I
> don't
> > see a reason why someone would want to query other fields in
> WaitForJobChange
> > than status or logs (I think these are the only ones that change during
> the
> > lifetime of the job). Maybe we could drop the expensive code path in
> later
> > versions.
>
> I discussed this with Brian briefly over chat a while ago. I regard this
> as a good but temporary stop-gap, as it shouldn't be needed to manually
> handle routing differently. Better type system separation should allow
> releasing the config if not needed automatically. It would be good to
> invest into that, rather than (IMHO) expanding the manual workaround.
>
> Just my 2c :)
>
> iustin
>