On Fri, May 11, 2012 at 8:06 PM, Philip Brown <[email protected]> wrote:
>
> On Friday, May 11, 2012 6:40:19 PM UTC-7, Jeff McCune wrote:
>>
>> And to jump on this...
>>
>> We absolutely have to make sure synchronizing plugins is fast and
>> efficient. If something like stdlib is a performance issue by its nature of
>> containing lots of additional functionality then we consider that a bug and
>> we'll fix it.
>>
>> For me, hundreds of file resources aren't really a concern in puppet
>> today. Even thousands should be fine.
>>
>> It's large files that are a concern. If you have a hundred files of about
>> a meg each then that's where we have concerns.
>>
>
> It's interesting... that you are concerned about one aspect, but not at all
> about another.
>
> To give an extreme example of why I care: We used to have a large rsync job,
> to transfer files between one host and another, that ran every night. It was
> on a relatively large filesystem. A full resync, where the other side was
> wiped, took something like 1 hour.
>
> However, a run where everything was already in sync... took *half an hour*.
> Half the time of the sync was just checking the file dates and sizes.
>
> File comparisons are a small resource cost, but they are non-zero; and
> that's when you're only doing stat(). Actually reading the things and
> checksumming them is significantly worse.
> Have a lot of them, and they add up.
> If modules get more popular, then you will potentially find it commonplace
> to have hundreds of files that need syncing. *per client*.
>
> puppet already has a reputation of having difficulty scaling with a single
> master server. It would be a shame to have deliberate design choices make
> that worse.
> 1,000 system farms are becoming commonplace. For some admins, if a product
> cannot reliably scale to handle that number of nodes from a single master,
> then they view the product as not designed for their standards of scaling,
> and they seek elsewhere.
>
> Are you giving up that area as a design target?
>
This would certainly be wonderful to tackle, but it's not especially
critical, imo. Puppet isn't primarily a tool for syncing files; there are
plenty of tools which *are* designed to do that well, and they'll work just
fine used in conjunction with Puppet. You can absolutely use an rsync exec
resource (as many users do) if you have a number of files that Puppet can't
handle. But, as Jeff said, for only hundreds or thousands of files, Puppet
should be fine. If *that* isn't the case, it's definitely something we
should address. For tens or hundreds of thousands, on the other hand, I
would advise using a more specialized tool.

And from a more technical point of view, bulk operations simply aren't
something Puppet is really capable of handling today. With few exceptions,
Puppet manages resources only on an individual level. We want to do bulk
operations, but it entails significant engineering effort.

I do think the ability for Puppet itself to use a tool like rsync would be
cool. Partly because then you don't have to do it yourself, but also
because we could still report on what changed, which is what's somewhat
lost by an exec resource. This could either be as the implementation of
bulk file sourcing or, as I would prefer, an "rsync" (or similar) resource
type.

As for the concerns specifically about plugin syncing, I agree that we
probably ought to be properly using timestamps there. I disagree, however,
that timestamps would be the correct implementation for general files.
Anyway, currently pluginsync is just using file resources with source set,
which use file content rather than timestamps by default. It should be
fairly simple to use timestamps instead when syncing plugins. Again,
though, it's also probably not realistically going to cause much of a
performance issue as it is.

>
> --
> You received this message because you are subscribed to the Google Groups
> "Puppet Developers" group.
> To view this discussion on the web visit
> https://groups.google.com/d/msg/puppet-dev/-/GThkaIhY6RYJ.
>
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/puppet-dev?hl=en.
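
To make the content-versus-timestamp trade-off discussed in this thread concrete, here is a minimal Python sketch of the two kinds of comparison. It is not Puppet's implementation; the function names (same_by_stat, md5_of, same_by_content) are hypothetical and chosen only for illustration, and MD5 is assumed as the content checksum.

```python
import hashlib
import os


def same_by_stat(src, dst):
    """Cheap check: one stat() per file, comparing size and mtime only."""
    a, b = os.stat(src), os.stat(dst)
    return a.st_size == b.st_size and int(a.st_mtime) == int(b.st_mtime)


def md5_of(path, chunk_size=1 << 20):
    """Read the whole file in chunks and return its MD5 hex digest."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_by_content(src, dst):
    """Expensive check: every byte of both files is read and hashed."""
    return md5_of(src) == md5_of(dst)
```

The metadata check costs a stat() call per file regardless of file size, while the content check has to read every byte. The metadata check can also report two files as identical when an edit keeps the same size and the modification time is preserved, which the content check would catch.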
