----- "Luke Kanies" <[email protected]> wrote:

> > Particularly when it comes to debugging, keeping track of state,
> > scaling the service, and leveraging existing HTTP tooling and
> > infrastructure.  There's also widespread SSL support and well
> > understood architecture designs that allow SSL offloading while
> > passing through authentication tokens, much like we currently do with
> > the certificate DN and verification status.
> 
> I think you'd find that a bus would actually be *far* easier to debug
> than HTTP, because it can track both ends of the conversation much
> more easily.  Ask RI about the auditing capabilities of mcollective,
> and compare that to what we have now, especially as you get a pool of
> HTTP responders behind a load balancer - you have to audit the client,
> load balancer, and every server.  This might seem easy because you're
> used to it, but I think it's actually a big stumbling block for
> people.

Yes, debugging traffic on buses is easier, but buses do bring their own challenges.

The big thing is that a single instance of a middleware bus can handle hundreds
of thousands of messages a second - throughput drops with bigger messages and
more complex networks, etc., but these things scale pretty awesomely.  They
were designed to handle the internal traffic of banks and stock exchanges.

With the ability to cheaply generate large volumes of messages, we can do very
fine-grained auditing of every step along the way.

Today, from a client perspective using Puppet reporting, we more or less only
know what happened right at the end of a run - and then only if the master is
up, only if the network between client and master is up, and only if its disk
isn't full.  We won't know about syntax errors, and we won't know about any of
the internal events that led to the compiled artifact, etc.

Contrast this with a system where every component emits an ongoing audit trail -
not into a more or less useless syslog as today, but as structured data over a
middleware layer:

- compiler: received compile request from x
- compiler: compiling catalog for x
- compiler: failed to compile catalog for x due to ...

etc. - streams of events, all the time, for all the little bits.  You can
fairly easily construct systems that emit advisory-style messages in large
volumes and send them to Splunk, syslog, secure audit servers, dashboards,
wherever.  We'd not only get the ability to debug the whole stream of events
associated with a specific client, but also gain far more flexible and less
fragile auditing and tracking capabilities than we have today.  Specifically,
unlike our current reporting capability, the events are persistent on the bus
as soon as they are generated: even if the server that initiated the compile
dies halfway through a run - due to a fire or a simple exception - you'll
still have the audit events, and the middleware will store them securely until
someone consumes them.  So even if all we did was put our current report
objects on a middleware system, we'd be gaining a lot.
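To make that concrete, here's a minimal sketch of what one such structured
audit event might look like as JSON on the bus.  Everything here - the helper,
the field names, the destination - is illustrative, not an actual Puppet or
middleware API:

```ruby
require 'json'
require 'time'

# Hypothetical helper that builds one structured audit event.  The schema
# shown (component/severity/client/message/timestamp) is purely illustrative.
def audit_event(component, severity, message, client)
  {
    "component" => component,          # which part of the system emitted it
    "severity"  => severity,           # lets consumers filter what they want
    "client"    => client,             # ties the event to a specific node
    "message"   => message,
    "timestamp" => Time.now.utc.iso8601
  }
end

# Instead of an opaque syslog line, the bus carries self-describing JSON:
event   = audit_event("compiler", "info", "compiling catalog for web01", "web01")
payload = JSON.generate(event)
# A publisher would then hand this to the middleware, e.g. something like:
#   stomp.publish("/topic/puppet.audit", payload)
puts payload
```

Because the payload is structured data rather than free text, consumers such
as Splunk or a dashboard can index and query it without any text scraping.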

Not only can you send these events to a single entity - like syslog - that
then requires some kind of post-processing of text, you can send rich data -
objects - to any number of consumers of the audit events.  Audit events can be
tagged with severity and so forth, and interested parties can decide what
kinds of events to receive.
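The "interested parties decide" part might look like this.  On a real bus the
filtering would typically happen server-side (e.g. JMS selectors in ActiveMQ);
this sketch just shows the idea with made-up severity levels:

```ruby
# Illustrative severity ranking - the actual levels would be a design decision.
SEVERITY_RANK = { "debug" => 0, "info" => 1, "warning" => 2, "error" => 3 }

# A consumer declares the minimum severity it cares about.
def wants?(event, minimum)
  SEVERITY_RANK.fetch(event["severity"], 0) >= SEVERITY_RANK.fetch(minimum)
end

events = [
  { "component" => "compiler", "severity" => "info",
    "message" => "compiling catalog for x" },
  { "component" => "compiler", "severity" => "error",
    "message" => "failed to compile catalog for x" }
]

# A pager/alerting consumer only sees the errors; a Splunk feed could
# subscribe to everything with a different minimum.
urgent = events.select { |e| wants?(e, "error") }
urgent.each { |e| puts "#{e["component"]}: #{e["message"]}" }
```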

I think this level of auditing and debugging will take us to a whole new level
of usability and flexibility, and enable us to satisfy a whole new class of
compliance requirements.

The challenge is how to present or consume this volume of data in a way that
makes sense to a human.

> At this point, the compelling argument is that you could get one
> drop-in technology that would (at least, according to what I've seen,
> but not yet tested) scale pretty easily both up and out - it could
> handle the largest network I've seen, but also handle multiple
> locations very easily.  These are both very difficult in the existing
> HTTP infrastructure, and message buses were essentially invented to
> solve this problem and they do it well.

From the scaling perspective, you'd have a queue of jobs for compile requests,
fact saves, and so forth.

You could graph the size of these queues and know immediately what kind of
workload goes through the system; you'd know pretty easily where your master
infrastructure is under stress, and you'd have the ability to scale up or out
just that component.  If you see that you are not handling your compile
workload, you simply start more compilers, perhaps on another server.

Additionally, it would remove a lot of our thundering-herd problems.  If you
have specced out that you can support 10 compiles at a time and you get 100
requests, that doesn't mean 100 HTTP connections will be hammering your
servers - it only means there will be a queue of 100 requests being served 10
at a time.  The backlog is something that can be monitored easily, and you
could essentially auto-scale more compilers, within reason, if that's what you
want.
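A quick sketch of that queueing behaviour using plain Ruby threads - 100
requests arrive at once, but only 10 workers ever run concurrently, and the
backlog depth is the number you'd graph.  The names and the in-process Queue
are illustrative stand-ins for the middleware:

```ruby
require 'thread'

requests = Queue.new
100.times { |i| requests << "compile request #{i}" }

backlog_peak = requests.size   # this is the metric you'd monitor and graph

completed = Queue.new
workers = 10.times.map do
  Thread.new do
    loop do
      begin
        job = requests.pop(true)   # non-blocking pop; raises when drained
      rescue ThreadError
        break                      # queue is empty, worker exits
      end
      completed << job             # a real worker would compile a catalog here
    end
  end
end
workers.each(&:join)

puts "peak backlog: #{backlog_peak}, completed: #{completed.size}"
```

The masters never see more than 10 concurrent jobs no matter how many clients
check in at once; the rest simply wait in the queue.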

There are some other interesting far-off possibilities: there's no problem
with a compiler process listening on multiple sources of requests, so we could
have queues for requests with different priorities.  Today, if your daemon
runs are consuming all the resources on your master and you need to get an
urgent change out - perhaps by initiating puppet runs using an orchestration
tool - your urgent requests will simply land at the back of the queue and
fight for resources.

What if we had normal-priority and high-priority compile queues?  What if, in
a place with a very small maintenance window, you could instruct the puppetd's
to drop certain compile requests - say, all the ones with --test or whatever -
into a high-priority compile queue where the compilers will serve them before
the rest?
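A compiler worker serving two queues could be as simple as checking the
high-priority queue first on every cycle.  This is a sketch of the idea, not a
proposed implementation - the queue names and payloads are made up:

```ruby
require 'thread'

high   = Queue.new   # e.g. requests initiated with --test during a window
normal = Queue.new   # routine daemon runs

5.times { |i| normal << "daemon run #{i}" }
2.times { |i| high   << "urgent --test run #{i}" }

served = []
# Single illustrative worker: drain high-priority jobs before touching the
# normal queue.  Real brokers can express this as consumer priorities too.
until high.empty? && normal.empty?
  job = high.empty? ? normal.pop(true) : high.pop(true)
  served << job      # a real worker would compile a catalog here
end

puts served.inspect
```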

We're thinking about large and complex sites with this work.  I don't imagine
a small two-site setup will run or need all of this, but when you do need it,
this infrastructure will let you handle customer demands that today are pretty
challenging.

> > I don't think using REST as the primary communication mechanism is
> > mutually exclusive with employing a bus, though perhaps I
> > misunderstand the design.  If we implement a bus, couldn't all of the
> > puppet processes involved talk to that bus over REST?
> 
> Essentially, no - the bus protocols are very different from HTTP.  I
> mean, you could have a listener that translated HTTP calls and
> responses to the bus and back, but then the client's still speaking
> HTTP and only the server-side uses the bus.

It's not impossible; many middleware systems provide a REST bridge natively.
ActiveMQ has a REST interface, and I think they recently added WebSockets too;
I don't think Rabbit has its own REST bridge, but I might be wrong.

We would, though, need to write our own such bridge if, say, we wanted to
support 0.25.x clients talking to the new system.

-- 
You received this message because you are subscribed to the Google Groups 
"Puppet Developers" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/puppet-dev?hl=en.