[Puppet-dev] Re: Puppet environments implementation question - get_conf() location and result caching
On 2015-16-04 18:34, Bostjan Skufca wrote: On Thursday, 16 April 2015 14:01:29 UTC+2, henrik lindberg wrote: On 2015-16-04 4:30, Bostjan Skufca wrote: Just recently I was looking at the environments part of puppet implementation, and I have stumbled upon this curious gimmick. There is Puppet::Environments class, which is a class for general Environments management (searching, loaders, caching, etc). Then there is Puppet::Node::Environment which is instantiated for every environent found. Logical up to this point. But, contrary to all OOP notions, get_conf() function is not a method of Puppet::Node::Environment class. Rather, it is a lookup function in Puppet::Environments class which takes environment name as an argument and fetches Puppet::Settings::EnvironmentConf instance. Puppet::Environments has two function for environment and environment data retrieval: - get() fetches Puppet::Node::Environment instance - get_conf() fetches Puppet::Settings::EnvironmentConf instance The curious thing is: - get() method operates with cache - get_conf() does not include any caching I tested this on a catalog of 1633 resources, and: - get() was called around 220 times - get_conf() was called around 25.000 times These are the comments above get_conf() in environments.rb: # This implementation evicts the cache, and always gets the current # configuration of the environment # # TODO: While this is wasteful since it # needs to go on a search for the conf, it is too disruptive to optimize # this. My questions are: - why does get_conf() method belong to Environments instead of Node::Environment class? - why are get_conf() results not cached? Why is it too disruptive? It should be fine to cache the env conf in an instance of an environment, and it should be possible to obtain the config given an instantiated environment. This is safe since the conf will then be evicted when the environment is evicted (and while the environment is cached it will not reload or change). It would be more logical that way too, for the user. If there is environment timeout, user would expect that all parts of environment (env configuration, code, even manifests) are cached. At least that was my understanding until I actually looked at the code. Shall I create a Jira ticket and pull request with working code change? Yes please. A ticket for fix-version PUP 4.x (we then update it with the actual release it will go into). I don't think it is worth the trouble of fixing this on 3.x. Target the code to the master branch. I believe that the large number of calls to get the conf could almost exclusively be for an already loaded environment (I should be measured though). Yes, this is true. All the calls are for the environment where the node/agent resides. I tested this with single node only, though. Good to know. Then this is worth doing. All other calls would be for information purposes and those cannot be cached unless the environment.conf file is watched (which is difficult since it may not exist, come into existence, be removed etc.). For those calls, the cost of checking if the file is up to date is almost as expensive as loading the conf. All other calls - do you mean calls for other environments? If so, as said above, I did not notice this on my setup (could not). But I believe that if env caching follows the same pattern for all environments, it should not be a problem. I do mean calls for other environments than the one(s) currently loaded and in use/cached by the master. This happens for instance when commands/services query for available environments, asks which classes that are in an environment etc. These calls when made for environments that are loaded, should read the conf directly (they are almost never interested in loading the environment they want information about). Since the 3.x implementation is very complex due to the dynamic environments, I suggest that this should be fixed for 4.x. This is probably for the best, true. If one wants to refrain from using dynamic environments and wishes to use 3.7.x, a separate patch may be provided. Cheers, and thanks for taking a stab at improving this! - henrik -- Visit my Blog Puppet on the Edge http://puppet-on-the-edge.blogspot.se/ -- You received this message because you are subscribed to the Google Groups Puppet Developers group. To unsubscribe from this group and stop receiving emails from it, send an email to puppet-dev+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/puppet-dev/mgr0bs%24b1c%241%40ger.gmane.org. For more options, visit
[Puppet-dev] Re: Puppet environments implementation question - get_conf() location and result caching
On 2015-16-04 4:30, Bostjan Skufca wrote: Just recently I was looking at the environments part of puppet implementation, and I have stumbled upon this curious gimmick. There is Puppet::Environments class, which is a class for general Environments management (searching, loaders, caching, etc). Then there is Puppet::Node::Environment which is instantiated for every environent found. Logical up to this point. But, contrary to all OOP notions, get_conf() function is not a method of Puppet::Node::Environment class. Rather, it is a lookup function in Puppet::Environments class which takes environment name as an argument and fetches Puppet::Settings::EnvironmentConf instance. Puppet::Environments has two function for environment and environment data retrieval: - get() fetches Puppet::Node::Environment instance - get_conf() fetches Puppet::Settings::EnvironmentConf instance The curious thing is: - get() method operates with cache - get_conf() does not include any caching I tested this on a catalog of 1633 resources, and: - get() was called around 220 times - get_conf() was called around 25.000 times These are the comments above get_conf() in environments.rb: # This implementation evicts the cache, and always gets the current # configuration of the environment # # TODO: While this is wasteful since it # needs to go on a search for the conf, it is too disruptive to optimize # this. My questions are: - why does get_conf() method belong to Environments instead of Node::Environment class? - why are get_conf() results not cached? Why is it too disruptive? The path to the new environments implementation (supporting both directory environments, and the now deprecated (and in 4.0 removed) dynamic environments) was paved with problems as it affected more or less the entire code base. It should be fine to cache the env conf in an instance of an environment, and it should be possible to obtain the config given an instantiated environment. This is safe since the conf will then be evicted when the environment is evicted (and while the environment is cached it will not reload or change). I believe that the large number of calls to get the conf could almost exclusively be for an already loaded environment (I should be measured though). All other calls would be for information purposes and those cannot be cached unless the environment.conf file is watched (which is difficult since it may not exist, come into existence, be removed etc.). For those calls, the cost of checking if the file is up to date is almost as expensive as loading the conf. Since the 3.x implementation is very complex due to the dynamic environments, I suggest that this should be fixed for 4.x. - henrik Thanks for answers, b. -- You received this message because you are subscribed to the Google Groups Puppet Developers group. To unsubscribe from this group and stop receiving emails from it, send an email to puppet-dev+unsubscr...@googlegroups.com mailto:puppet-dev+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/puppet-dev/53b560d5-1c8f-4b00-944c-3156343a175f%40googlegroups.com https://groups.google.com/d/msgid/puppet-dev/53b560d5-1c8f-4b00-944c-3156343a175f%40googlegroups.com?utm_medium=emailutm_source=footer. For more options, visit https://groups.google.com/d/optout. -- Visit my Blog Puppet on the Edge http://puppet-on-the-edge.blogspot.se/ -- You received this message because you are subscribed to the Google Groups Puppet Developers group. To unsubscribe from this group and stop receiving emails from it, send an email to puppet-dev+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/puppet-dev/mgo8ac%24hc5%241%40ger.gmane.org. For more options, visit https://groups.google.com/d/optout.
[Puppet-dev] Re: Puppet environments implementation question - get_conf() location and result caching
On Thursday, 16 April 2015 14:01:29 UTC+2, henrik lindberg wrote: On 2015-16-04 4:30, Bostjan Skufca wrote: Just recently I was looking at the environments part of puppet implementation, and I have stumbled upon this curious gimmick. There is Puppet::Environments class, which is a class for general Environments management (searching, loaders, caching, etc). Then there is Puppet::Node::Environment which is instantiated for every environent found. Logical up to this point. But, contrary to all OOP notions, get_conf() function is not a method of Puppet::Node::Environment class. Rather, it is a lookup function in Puppet::Environments class which takes environment name as an argument and fetches Puppet::Settings::EnvironmentConf instance. Puppet::Environments has two function for environment and environment data retrieval: - get() fetches Puppet::Node::Environment instance - get_conf() fetches Puppet::Settings::EnvironmentConf instance The curious thing is: - get() method operates with cache - get_conf() does not include any caching I tested this on a catalog of 1633 resources, and: - get() was called around 220 times - get_conf() was called around 25.000 times These are the comments above get_conf() in environments.rb: # This implementation evicts the cache, and always gets the current # configuration of the environment # # TODO: While this is wasteful since it # needs to go on a search for the conf, it is too disruptive to optimize # this. My questions are: - why does get_conf() method belong to Environments instead of Node::Environment class? - why are get_conf() results not cached? Why is it too disruptive? It should be fine to cache the env conf in an instance of an environment, and it should be possible to obtain the config given an instantiated environment. This is safe since the conf will then be evicted when the environment is evicted (and while the environment is cached it will not reload or change). It would be more logical that way too, for the user. If there is environment timeout, user would expect that all parts of environment (env configuration, code, even manifests) are cached. At least that was my understanding until I actually looked at the code. Shall I create a Jira ticket and pull request with working code change? I believe that the large number of calls to get the conf could almost exclusively be for an already loaded environment (I should be measured though). Yes, this is true. All the calls are for the environment where the node/agent resides. I tested this with single node only, though. All other calls would be for information purposes and those cannot be cached unless the environment.conf file is watched (which is difficult since it may not exist, come into existence, be removed etc.). For those calls, the cost of checking if the file is up to date is almost as expensive as loading the conf. All other calls - do you mean calls for other environments? If so, as said above, I did not notice this on my setup (could not). But I believe that if env caching follows the same pattern for all environments, it should not be a problem. Since the 3.x implementation is very complex due to the dynamic environments, I suggest that this should be fixed for 4.x. This is probably for the best, true. If one wants to refrain from using dynamic environments and wishes to use 3.7.x, a separate patch may be provided. b. -- You received this message because you are subscribed to the Google Groups Puppet Developers group. To unsubscribe from this group and stop receiving emails from it, send an email to puppet-dev+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/puppet-dev/57e71df8-0a83-4cd7-a851-6f2fc2e0954c%40googlegroups.com. For more options, visit https://groups.google.com/d/optout.