On Mar 19, 2009, at 5:09 PM, Brice Figureau wrote:

>
> On 19/03/09 22:50, Luke Kanies wrote:
>> Hi all,
>>
>> I've been thinking a lot about file recursion and why it's so darn
>> complicated, and I think one reason is the recursion happening in the
>> same resource type doing the managing.  As a result, I've been
>> thinking of moving the file recursion into a Fileset resource type.
>>
>> Currently, the file type generates new file resources during
>> recursion; this basic model would be the same, except that the
>> fileset resource type would be generating files.
>
> I've also been thinking a lot about local file recursion lately, but
> for performance reasons.
>
> I understand your idea and the benefits of your proposal in
> terms of clarity, code concision and such.
>
> Right now, the main performance issue with local recursive file
> resources is creating one newchild file resource per managed sub file,
> which in turn will be managed by the system.
> Ruby seems particularly slow at creating tons of objects, and it uses
> memory for something that is really transient.
>
> My idea on the subject (but I didn't research if that's doable) was
> that we don't really need to create those objects, if we consider a
> recursive resource as an "opaque" system which manages its own
> sub-resources itself. This behavior could be supported by the puppet
> ral system by defining a kind of recursive manager system that could
> offer programmatic resource management instead of being object based.
> I'm not sure I'm perfectly clear, it's late here and my brain needs
> some rest :-)
>
> I think this violates the current puppet contract, but I'm sure we
> could implement the recursive behavior outside of the file resource
> while still being able to manage sub-resources procedurally instead of
> having to generate them.
>
> Maybe that's what you are proposing (still late here).
> If not, please try to think about it and see if that could make sense.

It's not actually what I'm proposing - I'd say it's a parallel and  
possibly competing proposal.

I've been thinking about something similar.  I think there are at  
least three ways one could do what you're asking (in inverse order of  
overhead).  I'm describing them here with simple names so it's easier  
to refer back to them; the names aren't perfect, but hopefully they'll  
do.

1) Transient resources: Continue to create the resources but create  
and destroy them one at a time

2) Set resources: Use a single recursive operation that somehow  
manages to retain transactional integrity

3) Set operations: Perform a recursive operation that loses  
transactional integrity

I think you're essentially proposing something like #3.

I'll provide some more detail on each, but there are a couple of  
points of complexity that are worth noting.  In particular, the choice  
here has a significant effect on logging and events.  You can actually  
think of logs and events as isomorphic, and they're only going to get  
more so:  I hope by 0.26 or so all transaction logs are actually  
generated by events.

Obviously logging is critical so you know what's happening on your  
system.  Events are critical so that you can react to those changes.   
E.g., if you need to restart a service if any file in a fileset  
changes, then an event from a file deep in the hierarchy needs to be  
routed appropriately and then it needs to be able to be logged as  
being from that location.

We solve that right now using proxy resources - the recursing resource  
is the proxy for the event-generating resource.  This will likely work  
for any of the other solutions, too, but it's worth thinking about  
here because I've found it to be the major source of complexity, and I  
think that would continue.
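
As a rough illustration of the routing problem (not how Puppet does it
today), here is a small Ruby sketch.  Event, EventRouter and the resource
names are all made up for the example; the only idea taken from the text
above is that an event is delivered via its proxy but logged against the
file that actually changed.

```ruby
# Hypothetical event: remembers the file that changed (source) and the
# recursing resource that stands in for it (proxy).
Event = Struct.new(:name, :source, :proxy)

# Hypothetical router: subscribers register against the proxy resource,
# while the log line keeps the real location of the change.
class EventRouter
  attr_reader :log

  def initialize
    @subscriptions = Hash.new { |hash, key| hash[key] = [] }
    @log = []
  end

  def subscribe(resource, &callback)
    @subscriptions[resource] << callback
  end

  def dispatch(event)
    @log << "#{event.source}: #{event.name}"
    @subscriptions[event.proxy].each { |callback| callback.call(event) }
  end
end

router = EventRouter.new
restarts = []

# The service only knows about the top-level recursing resource.
router.subscribe("File[/etc/app]") { |_event| restarts << "Service[app]" }

# A change deep in the hierarchy is routed through the proxy.
router.dispatch(
  Event.new(:file_changed, "/etc/app/conf.d/deep/site.conf", "File[/etc/app]")
)
```

The subscriber never needs to know the deep file exists, yet the log
still names it.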

So, on to descriptions:

Transient resources:

Currently, we create the whole list of resources, add them to the  
graph, and then iterate over them.  Instead, we could essentially  
process them one at a time and then discard them.  Our current method  
(loosely) is:

   file.eval_generate.each do |resource|
     add_to_catalog(resource)
     eval_resource(resource)
   end

Instead, it would become:

   file.eval_generate { |resource| eval_resource(resource) }

Ridiculous pseudo-code, obviously, but hopefully you get the idea.   
The optimization here is that 1) we aren't adding to the catalog and  
2) we aren't building a list at all.
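
A slightly less ridiculous Ruby sketch of the block-yielding form, in
case it helps.  ChildResource and RecursingFile are hypothetical
stand-ins, not Puppet's actual classes; only the eval_generate name comes
from the pseudo-code above, and the real method does not necessarily
behave this way.

```ruby
# Hypothetical stand-in for a generated child file resource.
ChildResource = Struct.new(:path)

# Hypothetical recursing resource.  Given a block, eval_generate yields
# each child as it is created and keeps no reference to it, so nothing
# is added to a catalog and no list is built.
class RecursingFile
  def initialize(paths)
    @paths = paths
  end

  def eval_generate
    return enum_for(:eval_generate) unless block_given?

    @paths.each { |path| yield ChildResource.new(path) }
  end
end

evaluated = []
file = RecursingFile.new(%w[/tmp/a /tmp/a/b /tmp/a/c])

# Each child is evaluated and then becomes garbage immediately.
file.eval_generate { |resource| evaluated << resource.path }
```

Only one child resource needs to be live at any moment, which is the
whole point of the optimization.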

Set resources:

We could have some kind of resource that didn't use instance variables  
for any of the values or comparisons, such that a single resource  
instance could be used to do all of the operations.  The main thing is  
that it kicks out change instances for each of the things that needs  
to be done, with all of the appropriate information for logging and  
events.

We're still paying a per-change overhead, but I don't think you can  
get away from that and retain transactional integrity.
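
Here is a minimal Ruby sketch of what I mean, under the assumption that
both the current and desired values are passed in on every call.  Change
and FileSetResource are invented names for illustration and don't exist
in Puppet.

```ruby
# Hypothetical change record carrying what logging and events would need.
Change = Struct.new(:path, :property, :is, :should)

# Hypothetical stateless resource: it holds no per-file instance
# variables, so one instance can be reused for every file in the set.
class FileSetResource
  def changes_for(path, is, should)
    should.each_with_object([]) do |(property, wanted), changes|
      current = is[property]
      changes << Change.new(path, property, current, wanted) unless current == wanted
    end
  end
end

resource = FileSetResource.new
should = { mode: 0o644, owner: "root" }

# Made-up current state for two files.
current_state = {
  "/etc/app/a.conf" => { mode: 0o600, owner: "root" },
  "/etc/app/b.conf" => { mode: 0o644, owner: "root" },
}

changes = current_state.flat_map do |path, is|
  resource.changes_for(path, is, should)
end

changes.each do |change|
  puts "#{change.path}: #{change.property} #{change.is.inspect} -> #{change.should.inspect}"
end
```

We allocate one small object per change rather than one full resource
per file, so files that are already in sync cost nothing extra.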

Set operations:

This is pretty much just a big, painful chmod -R.  This would be a bit  
more difficult because we'd have to skip any resources that are  
managed anywhere else.
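
Roughly, in Ruby, and assuming the set of separately managed paths is
already known.  chmod_unmanaged is a hypothetical helper, and the
temporary directory is only there to make the example runnable.

```ruby
require "find"
require "set"
require "tmpdir"
require "fileutils"

# Hypothetical helper: chmod every regular file under root, skipping any
# path that is managed elsewhere.  No per-file resources, changes or
# events are produced, which is where transactional integrity is lost.
def chmod_unmanaged(root, mode, managed)
  changed = []
  Find.find(root) do |path|
    next unless File.file?(path)
    next if managed.include?(path)

    File.chmod(mode, path)
    changed << path
  end
  changed
end

# Throwaway tree with one unmanaged and one separately managed file.
root = Dir.mktmpdir
at_exit { FileUtils.remove_entry(root) }

unmanaged = File.join(root, "a.txt")
managed = File.join(root, "sub", "b.txt")
FileUtils.mkdir_p(File.join(root, "sub"))
[unmanaged, managed].each do |path|
  File.write(path, "x")
  File.chmod(0o600, path)
end

changed = chmod_unmanaged(root, 0o644, Set[managed])
```

All the caller gets back is a list of paths, with nothing to roll back
and nothing to attach events to.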

In writing this description, I think the second option is the best,  
even though I was leaning toward the first, initially.  I think it's  
doable - the big limitation right now is the use of instance variables  
in files.  If 'should' weren't set anywhere, then you'd be passing it  
in each time, and if you're passing it in, then you could use the same  
instance for every file you needed to operate on.

One of the big changes we made (but no one noticed) in the fall of '07  
was that we switched 'is' from being an instance variable to being  
transient, only maintained by the transaction.  I've always wanted to  
switch 'should', too, but I haven't known how.

If we were to combine this with my goal of splitting resource types  
into a Resource class (which already exists in 0.25) and a  
ResourceType class (whose instances will model individual resource  
types), then these resource types could be written to operate with no  
instance variables, essentially.  This would, I think, enable the set  
resources pretty easily.  (I suppose I should open this as a ticket,  
so people know wtf I'm talking about.)

Well, 'easily' once you refactored the RAL entirely and broke backward  
compatibility.

-- 
Learning is not attained by chance, it must be sought for with ardor and
attended to with diligence. -- Abigail Adams
---------------------------------------------------------------------
Luke Kanies | http://reductivelabs.com | http://madstop.com

