Re: [Puppet-dev] ensure property is always retrieve, even if useless
I dug a little more into this. The weird behavior Jeff and Aurélien seen was coming from the following conditional statement : if ensure_prop = property(:ensure) or (self.class.needs_ensure_retrieved and self.class.validattr?(:ensure) and ensure_prop = newattr(:ensure)) # Retrieve ensure else # Don't retrieve ensure end Aurélien's patch adds if ... and ensure_prop.should Jeff's patch adds if ... and (ignoreensure or ensure_prop.should) And the problem was coming from this ensure_prop.should. During puppet runs ensure_prop always exists, but ensure_prop.should only exists if something if specified. But, when we are coming from Puppet::Type.to_resource, ensure_prop still exists (if the property is ensurable) but ensure_prop don't. Thus, we are seeing this behavior : - In the puppet resource application, with ignore_ensure=false and an ensurable type (Jeff's patch == Aurélien's patch in this case), this conditional statement evaluates to false (true or true and false). Because *ensure_prop.should evaluates to false*. And, seeing this behavior, Jeff (IMHO) wrongly concluded As an aside, I tested Aurelien's original patch with the mcollective and datacat_collector modules and noticed the same breakage my patch exhibited: *if ensure is not present in the resources returned from Type.retrieve, then modules using collected resources will break*. But it's works, by removing the ensure_prop.should out of the conditional statement. No need to check for ensure_prop.should Please correct me if I'm wrong. Cheers, Le mercredi 22 juillet 2015 14:44:01 UTC+2, Felix Frank a écrit : Hi, thanks for picking this up. I think what really needs discussing is https://github.com/puppetlabs/puppet/pull/4038#issuecomment-119290626 I'm not sure what kind of breakage specifically Jeff noticed in https://groups.google.com/forum/#!msg/puppet-dev/P8ReCHvmd2o/tuhzAM86wbYJ Perhaps this is about collections of exported resources using Service| | ? Could you make sure that still works with and without ensure attribute set in those? Any other ideas what might have been the issue? For all we know, the cause might have been resolved upstream in the meantime, but it would be nice to be sure. Thanks, Felix On 07/16/2015 09:06 AM, Romain F. wrote: Hello, Necrobumping again this thread. Felix's wishes have been granted in PR-4038 https://github.com/puppetlabs/puppet/pull/4038 (PUP-4760 https://tickets.puppetlabs.com/browse/PUP-4760) but this change is bit tricky/risky apparently. The original goal was to not retrieve ensure property status when we don't ask to. This need a change in Puppet::Type's retrieve method. Before the change in PR-4038 https://github.com/puppetlabs/puppet/pull/4038, it was programmatically creating a ensure property when nothing is specified and if the type is ensurable, so it was always retrieving the ensure status. The change in the beginning of this thread was adding another condition to this : if the type is ensurable and if the ensure property have a should attribute. The change in PR-4038 https://github.com/puppetlabs/puppet/pull/4038 is just skipping the creation of the ensure property when nothing is specified and if the type can continue without a ensure property (it's the case for Services, not for Files for example). Like Felix's patch, it doesn't break any tests and it doesn't break puppet resource ... (which uses collected resources) Can you confirm that this would work ? Do think it can be extended to some other types ? Cheers, Romain Le lundi 5 mai 2014 02:16:33 UTC+2, Felix Frank a écrit : Hi, necro-bumping yet another thread, I took another stab at that old problem. I molded Jeff's approach into a form that I hope to be more palatable. It does not break any tests that I have tried (which is not saying it won't break any whatsoever). https://github.com/ffrank/puppet/tree/dont-always-retrieve-service-ensure So, if you guys could give it a spin, that would be awesome. Also, a ticket would be helpful, but if you can confirm that this works and helps, I can open one on your behalf so we can make a PR for this. Cheers, Felix On 12/19/2013 11:39 PM, Jeff Bachtel wrote: In the end, even just the behavior change to puppet resource makes the patch a non-starter because it is a widely used feature. I understand this feature should be kept, but that a pity this should impact other even more useful feature like apply or agent. Could it be possible that puppet resource
Re: [Puppet-dev] ensure property is always retrieve, even if useless
Hello, Necrobumping again this thread. Felix's wishes have been granted in PR-4038 https://github.com/puppetlabs/puppet/pull/4038 (PUP-4760 https://tickets.puppetlabs.com/browse/PUP-4760) but this change is bit tricky/risky apparently. The original goal was to not retrieve ensure property status when we don't ask to. This need a change in Puppet::Type's retrieve method. Before the change in PR-4038 https://github.com/puppetlabs/puppet/pull/4038, it was programmatically creating a ensure property when nothing is specified and if the type is ensurable, so it was always retrieving the ensure status. The change in the beginning of this thread was adding another condition to this : if the type is ensurable and if the ensure property have a should attribute. The change in PR-4038 https://github.com/puppetlabs/puppet/pull/4038 is just skipping the creation of the ensure property when nothing is specified and if the type can continue without a ensure property (it's the case for Services, not for Files for example). Like Felix's patch, it doesn't break any tests and it doesn't break puppet resource ... (which uses collected resources) Can you confirm that this would work ? Do think it can be extended to some other types ? Cheers, Romain Le lundi 5 mai 2014 02:16:33 UTC+2, Felix Frank a écrit : Hi, necro-bumping yet another thread, I took another stab at that old problem. I molded Jeff's approach into a form that I hope to be more palatable. It does not break any tests that I have tried (which is not saying it won't break any whatsoever). https://github.com/ffrank/puppet/tree/dont-always-retrieve-service-ensure So, if you guys could give it a spin, that would be awesome. Also, a ticket would be helpful, but if you can confirm that this works and helps, I can open one on your behalf so we can make a PR for this. Cheers, Felix On 12/19/2013 11:39 PM, Jeff Bachtel wrote: In the end, even just the behavior change to puppet resource makes the patch a non-starter because it is a widely used feature. I understand this feature should be kept, but that a pity this should impact other even more useful feature like apply or agent. Could it be possible that puppet resource and other like apply or agent retrieves only what they need? In apply/agent case, this come from a transaction being applied. For puppet resource I assume this is not the case. Could method parameter solve this case? And this could even keep the compat if this param is not specified I spent all day (because my Ruby is bad) doing a proof of concept with this. It touches type.rb and indirector/resource/ral.rb to add a seemingly transparent method variable flagging whether ensure should be ignored for the purpose of retrieving resources. It defaults to false (don't ignore ensure), which should cause the speedup Aurelien is needing. The resource RAL indirector is aware of the method parameter and sets it to true (ignoreensure), thereby exhibiting the old behavior when puppet resource is called from the command line. There is a nasty bit that I'm unsure of in the retrieve_resource method. I discovered that when running puppet agent -t --noop, if I tried to use my newly defined method parameter that parsing would choke - apparently in that state the retrieve method is targeting another method. I worked around it by putting in a rescue statement and falling back to the old way of calling retrieve which should, eventually, still hit the retrieve with Aurelien's improved conditional logic. Anyway, please find attached a diff containing the changes in question. Feel free to refine and submit it as a PR, my Ruby really isn't up for my doing so myself. Jeff -- You received this message because you are subscribed to the Google Groups Puppet Developers group. To unsubscribe from this group and stop receiving emails from it, send an email to puppet-dev+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/puppet-dev/d846f39c-753f-4aaa-99f0-947822791c50%40googlegroups.com. For more options, visit https://groups.google.com/d/optout.
[Puppet-dev] Re: Catalog Deserialization performance
Le mercredi 1 juillet 2015 00:04:55 UTC+2, henrik lindberg a écrit : On 2015-30-06 16:17, Romain F. wrote: I've already benchmarked and profiled Catalog's from_data_hash and to_data_hash methods using the benchmark framework. Most of the time is spent in the from_data_hash (we already knew it) but there is no big pitfalls where Ruby loses his time. My callgrind file shows that the top 5 (in self time) is : - Array.flatten (55000 calls) - Array..each (115089 calls) - Puppet::Resource.initialize (15000 calls) - String.=~ (65045 calls) - Hash[]= (115084 calls) This top 5 is taking ~30% of the total time . As you can see, it can be dificult to optimize this. IMHO, the benchmark - tweak - benchmark way of optimizing is not sufficient here. I think the way it (de)serialize a catalog needs a deep refactor. There is probably lots of duplicated work going on at the levels above and those are causing those generic methods to light up (except Puppet::Resource.initialize). There is both the deserialization process as such to optimize, but also the Resource implementation itself which is far from optimal. The next thing would be to focus on Resource.initialize/from_data_hash Agreed, but I can't do that on my own in timely fashion. We just want puppet devs to be aware that Puppet::Resource gets more features but performances is not following anyhow, and it's getting worse in puppet 4. I don't really now how it can be adressed. I think it is also relevant to establish some kind of world record - say serializing and deserializing a hash using MsgPack; a hash of data cannot be transported faster across the wire than that (unless also not using Ruby objects to represent the data - with a lot of extra complexity). I mean, a hash of some complexity will always consume quite a bit of processing and memory to get across the wire. Is it hitting the world record enough? That's not the point, like I said, this performance gap is coming from the creation the Graph itself, not the deserialization from PSON or MsgPack. In my case, in the 4 sec deserialization, 3.5 secs is the from_data_hash. Romain - henrik Cheers, Le mardi 30 juin 2015 04:23:42 UTC+2, henrik lindberg a écrit : On 2015-29-06 22:41, Trevor Vaughan wrote: If you get a profiling suite together (aka, bunch of random patches) could you release it? It is not difficult actually. Look at the benchmarks in the puppet code base. Many of them are suitable for profiling with a ruby profiler. I don't think we have any benchmarks targeting the agent side though, so the first thing to do (for someone) is to write one. What is more difficult is coming up with a benchmark that does not involve real/complex resources - but deserialization and up to actually applying should be possible to work with in a simple way. Profiling is then just running that benchmark with the ruby profiler turned on and analyzing the result, make changes, run again... (repeat until happy). - henrik I've been curious about this for quite some time but never quite got around to dealing with it. My concern is very much client side performance since the more you managing a client, the less the client gets to do it's actual job. Thanks, Trevor On Mon, Jun 29, 2015 at 4:35 PM, Henrik Lindberg henrik.@cloudsmith.com javascript: mailto:henrik@cloudsmith.com javascript: wrote: On 2015-29-06 16 tel:2015-29-06%2016:48, Romain F. wrote: Hi everyone, I try to optimize our Puppet runs by running some benchmarks and patching the puppet core (if possible).. But I have some difficulties around the catalog serialization/deserialization. In fact, in 3.7.5 or 3.8.x, the Config Retrieval takes roughly 7secs and only 4 secs is on the master side. Same fact in 4.2 but with 9 secs of config retrieval and still 4 secs on the master side. My first thoughts was Okay, time to try MsgPack. No improvements. I've instrumented a bit the code in the master branch around this, and I've found out that, on my 9secs of config retrieval, 3.61secs is lost in catalog deserialization, 2 secs is the catalog conversion.. But it's not the real deserialization (PSON to Hash) that takes ages, it's the creation
[Puppet-dev] Re: Catalog Deserialization performance
I've already benchmarked and profiled Catalog's from_data_hash and to_data_hash methods using the benchmark framework. Most of the time is spent in the from_data_hash (we already knew it) but there is no big pitfalls where Ruby loses his time. My callgrind file shows that the top 5 (in self time) is : - Array.flatten (55000 calls) - Array.each (115089 calls) - Puppet::Resource.initialize (15000 calls) - String.=~ (65045 calls) - Hash[]= (115084 calls) This top 5 is taking ~30% of the total time . As you can see, it can be dificult to optimize this. IMHO, the benchmark - tweak - benchmark way of optimizing is not sufficient here. I think the way it (de)serialize a catalog needs a deep refactor. Cheers, Le mardi 30 juin 2015 04:23:42 UTC+2, henrik lindberg a écrit : On 2015-29-06 22:41, Trevor Vaughan wrote: If you get a profiling suite together (aka, bunch of random patches) could you release it? It is not difficult actually. Look at the benchmarks in the puppet code base. Many of them are suitable for profiling with a ruby profiler. I don't think we have any benchmarks targeting the agent side though, so the first thing to do (for someone) is to write one. What is more difficult is coming up with a benchmark that does not involve real/complex resources - but deserialization and up to actually applying should be possible to work with in a simple way. Profiling is then just running that benchmark with the ruby profiler turned on and analyzing the result, make changes, run again... (repeat until happy). - henrik I've been curious about this for quite some time but never quite got around to dealing with it. My concern is very much client side performance since the more you managing a client, the less the client gets to do it's actual job. Thanks, Trevor On Mon, Jun 29, 2015 at 4:35 PM, Henrik Lindberg henrik@cloudsmith.com javascript: mailto: henrik@cloudsmith.com javascript: wrote: On 2015-29-06 16 tel:2015-29-06%2016:48, Romain F. wrote: Hi everyone, I try to optimize our Puppet runs by running some benchmarks and patching the puppet core (if possible). But I have some difficulties around the catalog serialization/deserialization. In fact, in 3.7.5 or 3.8.x, the Config Retrieval takes roughly 7secs and only 4 secs is on the master side. Same fact in 4.2 but with 9 secs of config retrieval and still 4 secs on the master side. My first thoughts was Okay, time to try MsgPack. No improvements. I've instrumented a bit the code in the master branch around this, and I've found out that, on my 9secs of config retrieval, 3.61secs is lost in catalog deserialization, 2 secs is the catalog conversion.. But it's not the real deserialization (PSON to Hash) that takes ages, it's the creation of the Catalog object itself (Hash to catalog). Benchmarks shows that the time to deserialize MsgPack (or PSON) is negligible compared to the catalog deserialization time. So here is my question : Is that a known issue ? Is there any reason of the regression in 4.x (Future parser creating more objects, ...) ? The parser=future setting only makes a difference when compiling the catalog - the catalog itself does not contain more or different data (except possibly using numbers instead of strings for some attributes). The best way to optimize this is to write a benchmark using the benchmark framework and measure the time it takes to deserialize a given catalog. Then run that benchmark with Ruby profiling turned on. There are quite a few things going on at the agent side in addition to taking the catalog PSON and turning it into a catalog that it can apply (loading types, resolving providers, etc). Make sure to benchmark these separately if possible. Regards - henrik Cheers, -- You received this message because you are subscribed to the Google Groups Puppet Developers group. To unsubscribe from this group and stop receiving emails from it, send an email to puppet-dev+...@googlegroups.com javascript: mailto:puppet-dev%2bunsubscr...@googlegroups.com javascript: mailto:puppet-dev+unsubscr...@googlegroups.com javascript: mailto:puppet-dev%2bunsubscr...@googlegroups.com javascript:. To view this discussion on the web visit https://groups.google.com/d/msgid/puppet-dev/a5bf7422-6119-43ee-ba11-44001c1ce097%40googlegroups.com https://groups.google.com/d/msgid/puppet-dev/a5bf7422-6119-43ee-ba11
[Puppet-dev] Catalog Deserialization performance
Hi everyone, I try to optimize our Puppet runs by running some benchmarks and patching the puppet core (if possible). But I have some difficulties around the catalog serialization/deserialization. In fact, in 3.7.5 or 3.8.x, the Config Retrieval takes roughly 7secs and only 4 secs is on the master side. Same fact in 4.2 but with 9 secs of config retrieval and still 4 secs on the master side. My first thoughts was Okay, time to try MsgPack. No improvements. I've instrumented a bit the code in the master branch around this, and I've found out that, on my 9secs of config retrieval, 3.61secs is lost in catalog deserialization, 2 secs is the catalog conversion.. But it's not the real deserialization (PSON to Hash) that takes ages, it's the creation of the Catalog object itself (Hash to catalog). Benchmarks shows that the time to deserialize MsgPack (or PSON) is negligible compared to the catalog deserialization time. So here is my question : Is that a known issue ? Is there any reason of the regression in 4.x (Future parser creating more objects, ...) ? Cheers, -- You received this message because you are subscribed to the Google Groups Puppet Developers group. To unsubscribe from this group and stop receiving emails from it, send an email to puppet-dev+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/puppet-dev/a5bf7422-6119-43ee-ba11-44001c1ce097%40googlegroups.com. For more options, visit https://groups.google.com/d/optout.
[Puppet-dev] Adding metrics in agent report - PUP-4634
Hi everyone, Following the suggestion of Henrik L. we'll continue the discussion of PUP-4634 here. It began with my suggestion of adding some timings in agent code in order to check steps which can take some time. And, thus, being able to troubleshoot pretty ugly things like this : Info: Caching catalog for bulbi1.c-bulbi.tgcc.ccc.cea.fr in 6.48 seconds Info: Convert catalog in 6.18 seconds Info: Applying configuration version '1432224909' ... Notice: Finished catalog run in 88.45 seconds Notice: Send report in 11.17 seconds These 3 metrics have been added using the Puppet::Util.benchmark tool. So it's only visible on the agent side, nothing transfered back in reports. As R. I. Pienaar suggested (and he's probably right), it would be a big win to put those metrics in reports. And then, add a way to show them in agent logs (like --evaltrace). So here's my questions : Which metrics to gather ? Does puppetDB need to be aware of those new metrics ? Any feedback appreciated. Cheers, --- Romain F. -- You received this message because you are subscribed to the Google Groups Puppet Developers group. To unsubscribe from this group and stop receiving emails from it, send an email to puppet-dev+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/puppet-dev/9afac8e1-fbdb-4c6a-9a4e-33079b4aacb0%40googlegroups.com. For more options, visit https://groups.google.com/d/optout.
Re: [Puppet-dev] Adding metrics in agent report - PUP-4634
Le vendredi 22 mai 2015 14:53:10 UTC+2, R.I. Pienaar a écrit : these would be good to add to both last_run_summary.yaml and the reports indeed, I mentioned this in the ticket but the report already contains a wealth of perf data, I have a report parser you can use on the CLI that produce output like this: https://github.com/ripienaar/puppet-reportprint/blob/master/SAMPLE.txt So already there you have config retrieval for example, adding more metrics around the saving and re-loading of the catalog would be handy. As well as firming up the docs around these and explain exactly what they are - this might exist now but last time I checked it didnt and it was a bit of guess work what they all do and mean esp some in last_run_summary.yaml. It sounds like it shows metrics of the catalog application, not really about catalog manipulation, facter or Indirection caching. And those steps can take a while. This is what we wanted to monitor originally. Notice: Send report in 11.17 seconds obv this could not be in the report but be handy in last_run_summary.yaml It would be ideal if like the existing agent perf data this is always collected and always stored in reports but optionally shown to the console. Adding a bunch of report.add_times(:step_to_report, thinmark {block_of_something}) would do the job right ? I can't right now think of specific additions but whoever implements it should just go through every major life cycle event and make sure its reported. One thing that might be handy is some details around the number of HTTP requests that are being made to measure and observe the impact of things like the HTTP connection pool. +1 Too much handshakes can put heavy pressure on masters, this would implement a way to monitor this. -- You received this message because you are subscribed to the Google Groups Puppet Developers group. To unsubscribe from this group and stop receiving emails from it, send an email to puppet-dev+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/puppet-dev/d0283a1c-6681-4475-a644-caba3f61542c%40googlegroups.com. For more options, visit https://groups.google.com/d/optout.