Re: [Puppet-dev] ensure property is always retrieve, even if useless

2015-07-23 Thread Romain F.
I dug a little more into this. The weird behavior Jeff and Aurélien seen 
was coming from the following conditional statement :
if ensure_prop = property(:ensure) or (self.class.needs_ensure_retrieved and 
self.class.validattr?(:ensure) and ensure_prop = newattr(:ensure))
  # Retrieve ensure
else   

  
  # Don't retrieve ensure
end
Aurélien's patch adds
if ... and ensure_prop.should


Jeff's patch adds 
if ... and (ignoreensure or ensure_prop.should)

And the problem was coming from this ensure_prop.should. During puppet runs 
ensure_prop always exists, but ensure_prop.should only exists if something 
if specified. But, when we are coming from Puppet::Type.to_resource, 
ensure_prop still exists (if the property is ensurable) but ensure_prop 
don't. 
Thus, we are seeing this behavior : 
-  In the puppet resource application, with ignore_ensure=false and an 
ensurable type (Jeff's patch == Aurélien's patch in this case), this 
conditional statement evaluates to false (true or true and false). Because 
*ensure_prop.should 
evaluates to false*.

And, seeing this behavior, Jeff (IMHO) wrongly concluded 

As an aside, I tested Aurelien's original patch with the mcollective and 
 datacat_collector modules and noticed the same breakage my patch 
 exhibited: 
 *if ensure is not present in the resources returned from Type.retrieve, 
 then modules using collected resources will break*. 


But it's works, by removing the ensure_prop.should out of the conditional 
statement. No need to check for ensure_prop.should

Please correct me if I'm wrong.

Cheers,
  




Le mercredi 22 juillet 2015 14:44:01 UTC+2, Felix Frank a écrit :

 Hi, 

 thanks for picking this up. 

 I think what really needs discussing is 
 https://github.com/puppetlabs/puppet/pull/4038#issuecomment-119290626 

 I'm not sure what kind of breakage specifically Jeff noticed in 
 https://groups.google.com/forum/#!msg/puppet-dev/P8ReCHvmd2o/tuhzAM86wbYJ 

 Perhaps this is about collections of exported resources using Service| 
 | ? Could you make sure that still works with and without ensure 
 attribute set in those? 

 Any other ideas what might have been the issue? For all we know, the 
 cause might have been resolved upstream in the meantime, but it would be 
 nice to be sure. 

 Thanks, 
 Felix 

 On 07/16/2015 09:06 AM, Romain F. wrote: 
  Hello, 
  
  Necrobumping again this thread. Felix's wishes have been granted in 
  PR-4038 https://github.com/puppetlabs/puppet/pull/4038 (PUP-4760 
  https://tickets.puppetlabs.com/browse/PUP-4760) but this change is 
 bit 
  tricky/risky apparently. 
  
  The original goal was to not retrieve ensure property status when we 
  don't ask to. This need a change in Puppet::Type's retrieve method. 
  Before the change in PR-4038 
  https://github.com/puppetlabs/puppet/pull/4038, it was 
  programmatically creating a ensure property when nothing is specified 
  and if the type is ensurable, so it was always retrieving the ensure 
  status. 
  
  The change in the beginning of this thread was adding another condition 
  to this : if the type is ensurable and if the ensure property have a 
  should attribute. 
  
  The change in PR-4038 https://github.com/puppetlabs/puppet/pull/4038 
  is just skipping the creation of the ensure property when nothing is 
  specified and if the type can continue without a ensure property (it's 
  the case for Services, not for Files for example). 
  
  Like Felix's patch, it doesn't break any tests and it doesn't break 
  puppet resource ... (which uses collected resources) 
  
  Can you confirm that this would work ? Do think it can be extended to 
  some other types ? 
  
  Cheers, 
  
  Romain 
  
  
  
  
  Le lundi 5 mai 2014 02:16:33 UTC+2, Felix Frank a écrit : 
  
  Hi, 
  
  necro-bumping yet another thread, I took another stab at that old 
  problem. 
  
  I molded Jeff's approach into a form that I hope to be more 
 palatable. 
  It does not break any tests that I have tried (which is not saying 
 it 
  won't break any whatsoever). 
  
  
 https://github.com/ffrank/puppet/tree/dont-always-retrieve-service-ensure 
  
  
  So, if you guys could give it a spin, that would be awesome. 
  
  Also, a ticket would be helpful, but if you can confirm that this 
 works 
  and helps, I can open one on your behalf so we can make a PR for 
 this. 
  
  Cheers, 
  Felix 
  
  On 12/19/2013 11:39 PM, Jeff Bachtel wrote: 
   
   In the end, even just the behavior change to puppet resource 
  makes 
   the patch a non-starter because it is a widely used feature. 
   
   I understand this feature should be kept, but that a pity this 
  should 
   impact other even more useful feature like apply or agent. 
   
   Could it be possible that puppet resource

Re: [Puppet-dev] ensure property is always retrieve, even if useless

2015-07-16 Thread Romain F.
Hello,

Necrobumping again this thread. Felix's wishes have been granted in PR-4038 
https://github.com/puppetlabs/puppet/pull/4038 (PUP-4760 
https://tickets.puppetlabs.com/browse/PUP-4760) but this change is bit 
tricky/risky apparently.

The original goal was to not retrieve ensure property status when we don't 
ask to. This need a change in Puppet::Type's retrieve method. Before the 
change in PR-4038 https://github.com/puppetlabs/puppet/pull/4038, it was 
programmatically creating a ensure property when nothing is specified and 
if the type is ensurable, so it was always retrieving the ensure status. 

The change in the beginning of this thread was adding another condition to 
this : if the type is ensurable and if the ensure property have a should 
attribute.

The change in PR-4038 https://github.com/puppetlabs/puppet/pull/4038 is 
just skipping the creation of the ensure property when nothing is specified 
and if the type can continue without a ensure property (it's the case for 
Services, not for Files for example).

Like Felix's patch, it doesn't break any tests and it doesn't break puppet 
resource ... (which uses collected resources)

Can you confirm that this would work ? Do think it can be extended to some 
other types ?

Cheers,

Romain




Le lundi 5 mai 2014 02:16:33 UTC+2, Felix Frank a écrit :

 Hi, 

 necro-bumping yet another thread, I took another stab at that old problem. 

 I molded Jeff's approach into a form that I hope to be more palatable. 
 It does not break any tests that I have tried (which is not saying it 
 won't break any whatsoever). 

 https://github.com/ffrank/puppet/tree/dont-always-retrieve-service-ensure 

 So, if you guys could give it a spin, that would be awesome. 

 Also, a ticket would be helpful, but if you can confirm that this works 
 and helps, I can open one on your behalf so we can make a PR for this. 

 Cheers, 
 Felix 

 On 12/19/2013 11:39 PM, Jeff Bachtel wrote: 
  
  In the end, even just the behavior change to puppet resource makes 
  the patch a non-starter because it is a widely used feature. 
  
  I understand this feature should be kept, but that a pity this should 
  impact other even more useful feature like apply or agent. 
  
  Could it be possible that puppet resource and other like apply or 
  agent retrieves only what they need? In apply/agent case, this come 
  from a transaction being applied. For puppet resource I assume this 
  is not the case. Could method parameter solve this case? And this 
  could even keep the compat if this param is not specified 
  
  
  I spent all day (because my Ruby is bad) doing a proof of concept with 
  this. It touches type.rb and indirector/resource/ral.rb to add a 
  seemingly transparent method variable flagging whether ensure should 
  be ignored for the purpose of retrieving resources. It defaults to 
  false (don't ignore ensure), which should cause the speedup Aurelien 
  is needing. The resource RAL indirector is aware of the method 
  parameter and sets it to true (ignoreensure), thereby exhibiting the 
  old behavior when puppet resource is called from the command line. 
  
  There is a nasty bit that I'm unsure of in the retrieve_resource 
  method. I discovered that when running puppet agent -t --noop, if I 
  tried to use my newly defined method parameter that parsing would 
  choke - apparently in that state the retrieve method is targeting 
  another method. I worked around it by putting in a rescue statement 
  and falling back to the old way of calling retrieve which should, 
  eventually, still hit the retrieve with Aurelien's improved 
  conditional logic. 
  
  Anyway, please find attached a diff containing the changes in 
  question. Feel free to refine and submit it as a PR, my Ruby really 
  isn't up for my doing so myself. 
  
  Jeff 
  
  



-- 
You received this message because you are subscribed to the Google Groups 
Puppet Developers group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to puppet-dev+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/puppet-dev/d846f39c-753f-4aaa-99f0-947822791c50%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


[Puppet-dev] Re: Catalog Deserialization performance

2015-07-01 Thread Romain F.


Le mercredi 1 juillet 2015 00:04:55 UTC+2, henrik lindberg a écrit :

 On 2015-30-06 16:17, Romain F. wrote: 
  I've already benchmarked and profiled Catalog's from_data_hash and 
  to_data_hash methods using the benchmark framework. 
  Most of the time is spent in the from_data_hash (we already knew it) but 
  there is no big pitfalls where Ruby loses his time. 
  
  My callgrind file shows that the top 5 (in self time) is : 
  - Array.flatten (55000 calls) 
  - Array..each (115089 calls) 
  - Puppet::Resource.initialize (15000 calls) 
  - String.=~ (65045 calls) 
  - Hash[]= (115084 calls) 
  
  This top 5 is taking ~30% of the total time . 
  
  As you can see, it can be dificult to optimize this. IMHO, the 
  benchmark - tweak - benchmark way of optimizing is not sufficient 
  here. I think the way it (de)serialize a catalog needs a deep refactor. 
  

 There is probably lots of duplicated work going on at the levels above 
 and those are causing those generic methods to light up (except 
 Puppet::Resource.initialize). 

 There is both the deserialization process as such to optimize, but also 
 the Resource implementation itself which is far from optimal. 

 The next thing would be to focus on Resource.initialize/from_data_hash 


Agreed, but I can't do that on my own in timely fashion. We just want 
puppet devs  to be aware that Puppet::Resource gets more features but 
performances is not following anyhow, and it's getting worse in puppet 4. I 
don't really now how it can be adressed.


 I think it is also relevant to establish some kind of world record - 
 say serializing and deserializing a hash using MsgPack; a hash of data 
 cannot be transported faster across the wire than that (unless also not 
 using Ruby objects to represent the data - with a lot of extra 
 complexity). 

 I mean, a hash of some complexity will always consume quite a bit of 
 processing and memory to get across the wire. Is it hitting the world 
 record enough? 

 That's not the point, like I said, this performance gap is coming from the 
creation the Graph itself, not the deserialization from PSON or MsgPack. In 
my case, in the 4 sec deserialization, 3.5 secs is the from_data_hash. 


Romain

 - henrik 

  Cheers, 
  
  Le mardi 30 juin 2015 04:23:42 UTC+2, henrik lindberg a écrit : 
  
  On 2015-29-06 22:41, Trevor Vaughan wrote: 
If you get a profiling suite together (aka, bunch of random 
 patches) 
could you release it? 

  
  It is not difficult actually. Look at the benchmarks in the puppet 
 code 
  base. Many of them are suitable for profiling with a ruby profiler. 
  I don't think we have any benchmarks targeting the agent side 
  though, so 
  the first thing to do (for someone) is to write one. 
  
  What is more difficult is coming up with a benchmark that does not 
  involve real/complex resources - but deserialization and up to 
 actually 
  applying should be possible to work with in a simple way. 
  
  Profiling is then just running that benchmark with the ruby profiler 
  turned on and analyzing the result, make changes, run again... 
 (repeat 
  until happy). 
  
  - henrik 
  
  
I've been curious about this for quite some time but never quite 
 got 
around to dealing with it. 

My concern is very much client side performance since the more 
 you 
managing a client, the less the client gets to do it's actual 
 job. 

Thanks, 

Trevor 

On Mon, Jun 29, 2015 at 4:35 PM, Henrik Lindberg 
henrik.@cloudsmith.com javascript: 
  mailto:henrik@cloudsmith.com javascript: 
wrote: 

On 2015-29-06 16 tel:2015-29-06%2016:48, Romain F. wrote: 

Hi everyone, 

I try to optimize our Puppet runs by running some 
  benchmarks and 
patching the puppet core (if possible).. But I have some 
  difficulties 
around the catalog serialization/deserialization. 

In fact, in 3.7.5 or 3.8.x, the Config Retrieval takes 
  roughly 
7secs and 
only 4 secs is on the master side. Same fact in 4.2 but 
  with 9 
secs of 
config retrieval and still 4 secs on the master side. 

My first thoughts was Okay, time to try MsgPack. No 
  improvements. 

I've instrumented a bit the code in the master branch 
 around 
this, and 
I've found out that, on my 9secs of config retrieval, 
  3.61secs 
is lost 
in catalog deserialization, 2 secs is the catalog 
  conversion.. 
But it's 
not the real deserialization (PSON to Hash) that takes 
  ages, 
it's the 
creation

[Puppet-dev] Re: Catalog Deserialization performance

2015-06-30 Thread Romain F.
I've already benchmarked and profiled Catalog's from_data_hash and 
to_data_hash methods using the benchmark framework.
Most of the time is spent in the from_data_hash (we already knew it) but 
there is no big pitfalls where Ruby loses his time.

My callgrind file shows that the top 5 (in self time) is :
- Array.flatten (55000 calls)
- Array.each (115089 calls)
- Puppet::Resource.initialize (15000 calls)
- String.=~ (65045 calls)
- Hash[]= (115084 calls)

This top 5 is taking ~30% of the total time .

As you can see, it can be dificult to optimize this. IMHO, the benchmark 
- tweak - benchmark way of optimizing is not sufficient here. I think 
the way it (de)serialize a catalog needs a deep refactor. 

Cheers,

Le mardi 30 juin 2015 04:23:42 UTC+2, henrik lindberg a écrit :

 On 2015-29-06 22:41, Trevor Vaughan wrote: 
  If you get a profiling suite together (aka, bunch of random patches) 
  could you release it? 
  

 It is not difficult actually. Look at the benchmarks in the puppet code 
 base. Many of them are suitable for profiling with a ruby profiler. 
 I don't think we have any benchmarks targeting the agent side though, so 
 the first thing to do (for someone) is to write one. 

 What is more difficult is coming up with a benchmark that does not 
 involve real/complex resources - but deserialization and up to actually 
 applying should be possible to work with in a simple way. 

 Profiling is then just running that benchmark with the ruby profiler 
 turned on and analyzing the result, make changes, run again... (repeat 
 until happy). 

 - henrik 


  I've been curious about this for quite some time but never quite got 
  around to dealing with it. 
  
  My concern is very much client side performance since the more you 
  managing a client, the less the client gets to do it's actual job. 
  
  Thanks, 
  
  Trevor 
  
  On Mon, Jun 29, 2015 at 4:35 PM, Henrik Lindberg 
  henrik@cloudsmith.com javascript: mailto:
 henrik@cloudsmith.com javascript: 
  wrote: 
  
  On 2015-29-06 16 tel:2015-29-06%2016:48, Romain F. wrote: 
  
  Hi everyone, 
  
  I try to optimize our Puppet runs by running some benchmarks and 
  patching the puppet core (if possible). But I have some 
 difficulties 
  around the catalog serialization/deserialization. 
  
  In fact, in 3.7.5 or 3.8.x, the Config Retrieval takes roughly 
  7secs and 
  only 4 secs is on the master side. Same fact in 4.2 but with 9 
  secs of 
  config retrieval and still 4 secs on the master side. 
  
  My first thoughts was Okay, time to try MsgPack. No 
 improvements. 
  
  I've instrumented a bit the code in the master branch around 
  this, and 
  I've found out that, on my 9secs of config retrieval, 3.61secs 
  is lost 
  in catalog deserialization, 2 secs is the catalog conversion.. 
  But it's 
  not the real deserialization (PSON to Hash) that takes ages, 
  it's the 
  creation of the Catalog object itself (Hash to catalog). 
 Benchmarks 
  shows that the time to deserialize MsgPack (or PSON) is 
 negligible 
  compared to the catalog deserialization time. 
  
  So here is my question : Is that a known issue ? Is there any 
  reason of 
  the regression in 4.x (Future parser creating more objects, ...) 
 ? 
  
  The parser=future setting only makes a difference when compiling the 
  catalog - the catalog itself does not contain more or different data 
  (except possibly using numbers instead of strings for some 
 attributes). 
  
  The best way to optimize this is to write a benchmark using the 
  benchmark framework and measure the time it takes to deserialize a 
  given catalog. Then run that benchmark with Ruby profiling turned 
 on. 
  
  There are quite a few things going on at the agent side in addition 
  to taking the catalog PSON and turning it into a catalog that it can 
  apply (loading types, resolving providers, etc). Make sure to 
  benchmark these separately if possible. 
  
  Regards 
  - henrik 
  
  Cheers, 
  
  -- 
  You received this message because you are subscribed to the 
 Google 
  Groups Puppet Developers group. 
  To unsubscribe from this group and stop receiving emails from 
  it, send 
  an email to puppet-dev+...@googlegroups.com javascript: 
  mailto:puppet-dev%2bunsubscr...@googlegroups.com javascript: 

  mailto:puppet-dev+unsubscr...@googlegroups.com javascript: 
  mailto:puppet-dev%2bunsubscr...@googlegroups.com javascript:. 

  To view this discussion on the web visit 
  
 https://groups.google.com/d/msgid/puppet-dev/a5bf7422-6119-43ee-ba11-44001c1ce097%40googlegroups.com
  
  
 https://groups.google.com/d/msgid/puppet-dev/a5bf7422-6119-43ee-ba11

[Puppet-dev] Catalog Deserialization performance

2015-06-29 Thread Romain F.
Hi everyone,

I try to optimize our Puppet runs by running some benchmarks and patching 
the puppet core (if possible). But I have some difficulties around the 
catalog serialization/deserialization. 

In fact, in 3.7.5 or 3.8.x, the Config Retrieval takes roughly 7secs and 
only 4 secs is on the master side. Same fact in 4.2 but with 9 secs of 
config retrieval and still 4 secs on the master side. 

My first thoughts was Okay, time to try MsgPack. No improvements.

I've instrumented a bit the code in the master branch around this, and I've 
found out that, on my 9secs of config retrieval, 3.61secs is lost in 
catalog deserialization, 2 secs is the catalog conversion.. But it's not 
the real deserialization (PSON to Hash) that takes ages, it's the 
creation of the Catalog object itself (Hash to catalog). Benchmarks shows 
that the time to deserialize MsgPack (or PSON) is negligible compared to 
the catalog deserialization time.

So here is my question : Is that a known issue ? Is there any reason of the 
regression in 4.x (Future parser creating more objects, ...) ?

Cheers,

-- 
You received this message because you are subscribed to the Google Groups 
Puppet Developers group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to puppet-dev+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/puppet-dev/a5bf7422-6119-43ee-ba11-44001c1ce097%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


[Puppet-dev] Adding metrics in agent report - PUP-4634

2015-05-22 Thread Romain F.
Hi everyone,
Following the suggestion of Henrik L. we'll continue the discussion of 
PUP-4634 here.
It began with my suggestion of adding some timings in agent code in order 
to check steps which can take some time. And, thus, being able to 
troubleshoot pretty ugly things like this :

Info: Caching catalog for bulbi1.c-bulbi.tgcc.ccc.cea.fr in 6.48 seconds
Info: Convert catalog in 6.18 seconds
Info: Applying configuration version '1432224909'
...
Notice: Finished catalog run in 88.45 seconds
Notice: Send report in 11.17 seconds



These 3 metrics have been added using the Puppet::Util.benchmark tool. So 
it's only visible on the agent side, nothing transfered back in reports.

As R. I. Pienaar suggested (and he's probably right), it would be a big win 
to put those metrics in reports. And then, add a way to show them in agent 
logs (like --evaltrace).

So here's my questions : Which metrics to gather ? Does puppetDB need to be 
aware of those new metrics ?

Any feedback appreciated.

Cheers,

---
Romain F.

-- 
You received this message because you are subscribed to the Google Groups 
Puppet Developers group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to puppet-dev+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/puppet-dev/9afac8e1-fbdb-4c6a-9a4e-33079b4aacb0%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [Puppet-dev] Adding metrics in agent report - PUP-4634

2015-05-22 Thread Romain F.

Le vendredi 22 mai 2015 14:53:10 UTC+2, R.I. Pienaar a écrit :

 these would be good to add to both last_run_summary.yaml and the reports 
 indeed, 
 I mentioned this in the ticket but the report already contains a wealth of 
 perf 
 data, I have a report parser you can use on the CLI that produce output 
 like this: 

 https://github.com/ripienaar/puppet-reportprint/blob/master/SAMPLE.txt 

 So already there you have config retrieval for example, adding more 
 metrics around 
 the saving and re-loading of the catalog would be handy.  As well as 
 firming up 
 the docs around these and explain exactly what they are - this might exist 
 now but 
 last time I checked it didnt and it was a bit of guess work what they all 
 do and 
 mean esp some in last_run_summary.yaml. 

  
It sounds like it shows metrics of the catalog application, not really 
about catalog manipulation, facter or Indirection caching. And those steps 
can take a while.
This is what we wanted to monitor originally.


  Notice: Send report in 11.17 seconds 

 obv this could not be in the report but be handy in last_run_summary.yaml 

 It would be ideal if like the existing agent perf data this is always 
 collected 
 and always stored in reports but optionally shown to the console. 

  
Adding a bunch of report.add_times(:step_to_report, thinmark
{block_of_something}) would do the job right ?
 

 I can't right now think of specific additions but whoever implements it 
 should just 
 go through every major life cycle event and make sure its reported.  One 
 thing that 
 might be handy is some details around the number of HTTP requests that are 
 being made 
 to measure and observe the impact of things like the HTTP connection pool. 


+1 
Too much handshakes can put heavy pressure on masters, this would implement 
a way to monitor this.

-- 
You received this message because you are subscribed to the Google Groups 
Puppet Developers group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to puppet-dev+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/puppet-dev/d0283a1c-6681-4475-a644-caba3f61542c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.