On Fri, Mar 08, 2013 at 05:14:35PM +0100, Johannes Renner wrote:
> Hello,
> 
> Here is a proposal for a new Spacewalk feature that we would like to merge
> into master. An initial set of patches is attached, but let me explain
> 'quickly' first:
> 
> It happens that there are systems that should be managed, but they are
> located in a DMZ or some other subnet with restrictive access to the
> internal company network. Such systems might technically not be allowed to
> call back to the server in order to ask for scheduled actions and such. One
> way of getting around this problem is to open the connection from the
> server instead, and call back to the server ('rhn_check') using a secured
> tunnel.
> 
> Therefore we would in the first place propose to allow different contact
> methods to be configured for a registered system. Instead of the
> traditional default "Pull" we would offer "SSH Push via Tunnel" as well as
> "SSH Push" (which is the same just without the tunnel). The contact method
> can be chosen on the activation key level, which means that all systems
> registered with a certain activation key will inherit the respective
> contact method (this is all in patches #1 and #2).
> 
> Further there is a job in taskomatic which runs once every minute to find
> systems that are configured for SSH push and have actions scheduled to be
> executed at that moment. These will be contacted via
> 
> ssh -R <high_port>:<server>:443 <client> rhn_check
> 
> All scheduled actions will be fetched to the client, while the connection
> is established by the server. To make this work it is however necessary to
> reconfigure the client in two places:
> 
> - /etc/hosts needs to contain <server> in the localhost line
> - /etc/sysconfig/rhn/up2date needs to point to <server>:<high_port> instead of
>   only <server>
> 
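
A minimal sketch of how the two push variants quoted above could be assembled as command lines. The function name and the example host names and port are hypothetical; only the `ssh -R <high_port>:<server>:443 <client> rhn_check` form and the statement that plain "SSH Push" is the same without the tunnel come from the proposal.

```python
def build_push_command(client, server, high_port, tunnel=True):
    """Return the argv list for contacting one client (illustrative only)."""
    cmd = ["ssh"]
    if tunnel:
        # Reverse tunnel: <high_port> on the client forwards to <server>:443.
        cmd += ["-R", "{}:{}:443".format(high_port, server)]
    # rhn_check runs on the client and calls back to the server.
    cmd += [client, "rhn_check"]
    return cmd


# Hypothetical host names and port, for illustration:
tunnel_cmd = build_push_command("client.example.com", "sw.example.com", 1234)
plain_cmd = build_push_command("client.example.com", "sw.example.com", 1234,
                               tunnel=False)
```

With the tunnel, the client has to reach the server through its own end of the forwarded port, which is why the two reconfiguration steps listed above are needed.
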
> This reconfiguration is currently done during system registration. Since
> registration of such a system needs to be done from the server as well (via
> tunnel), we provide a dedicated script, namely 'spacewalk-push-register',
> see patch #4. Using this script, a client can be registered from the
> server's commandline:
> 
> spacewalk-push-register <client> <path_to_bootstrap_script>
> 
> Further we would want to prevent a system from not checking in, just in
> case there are no actions scheduled for a certain period of time. Therefore
> we should contact such systems before the respective threshold of
> inactiveness is reached (default is 1 day). In order to prevent many
> systems from checking in again at the same time, randomly generated
> thresholds are used to determine if a system should check in or not. The
> result is that all inactive systems will eventually check in between 12 and
> 24 hours of inactiveness, but you do not know when exactly it will happen.
> 
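
One possible reading of the randomized threshold described above, as a small simulation. This is an assumption about the approach, not the algorithm from the patches: on every run of the job a fresh threshold is drawn uniformly between 12 and 24 hours, and the system is contacted once its inactivity reaches that threshold. All names are hypothetical.

```python
import random

MIN_MINUTES = 12 * 60   # earliest forced check-in (12 hours)
MAX_MINUTES = 24 * 60   # inactivity limit (1 day)


def should_check_in(inactive_minutes, rng):
    """Draw a random threshold and compare it to the current inactivity."""
    threshold = rng.uniform(MIN_MINUTES, MAX_MINUTES)
    return inactive_minutes >= threshold


def simulate(num_clients, seed=0):
    """Return the minute at which each idle client is first contacted."""
    rng = random.Random(seed)
    checkin_minutes = []
    for _ in range(num_clients):
        minute = 0
        # The job runs once per minute until the client is picked.
        while not should_check_in(minute, rng):
            minute += 1
        checkin_minutes.append(minute)
    return checkin_minutes
```

Under this reading every client is contacted somewhere between 12 and 24 hours of inactivity. Because the threshold is redrawn every minute, the contacts cluster near the 12 hour mark rather than spreading evenly over the window.
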
> I can explain the implementation in more detail if you want, but you could
> also run the included unit test, which actually performs a simulation with
> 100000 clients to record their checkin times using the implemented
> algorithm.
> 
> In comparison to the existing method of pushing actions using osad, the
> proposed SSH Push should be more reliable in general and could therefore
> serve as a valid alternative (even without the tunneling). Further it
> scales better to a high number of client systems, since the number of
> threads opening connections to clients at the same time can be configured
> and therefore limited (default is 2). This is however not the case with
> osad. All clients will be pinged and will call back to the server at the
> same time, which might cause a server to break down under some
> circumstances.
> 
> Things to be improved:
> 
> - Client registration: enable/disable a client for either SSH push with or
>   without tunnel.
> - UI integration for reconfiguring clients when the contact method is
>   changed for a system.
> - "Push via osad" could be another contact method or at least we should
>   somehow integrate with the push status indication in the UI.
> 
> Your feedback and comments are more than welcome!

Johannes,

let me summarize some big picture impressions, without commenting on
every detail.

You propose to address two situations:

1) clients in DMZ that cannot reach the server, with server able to
   reach the clients;

2) overloading Spacewalk when multiple actions get (auto)scheduled for
   many clients and they get woken up by osa-dispatcher/jabberd/osad
   all at the same time.

Frankly, the first scenario does not sound that interesting to me.
Access to and from a DMZ is typically closed from/to all other networks
as well and only opened in a very targeted fashion. The IT of that
organization would still need to allow access _to_ the DMZ to sshd
ports on those machines. You can always have Spacewalk Proxy in
DMZ2, with clients talking to the proxy and the proxy talking to
Spacewalk, if your IT does not want to open the ports in the DMZ
configuration directly. In both cases, the HTTP requests run by
rhn_check / yum will end up on that Spacewalk server and if there is a
way to compromise that server that way, it will happen. I would still
need to check the patches in detail to see how you solve the problem
of the client IP addresses as seen by the Spacewalk server being
127.0.0.1 for all those requests, which is hardly something you'd like
to see in production.

You propose a new scheduling service in taskomatic to initiate the SSH
Push for clients ... but we already have such functionality, it's
osa-dispatcher. So either we should get rid of osa-dispatcher and do
even the jabber/osad based notifications from taskomatic, or we should
stick with osa-dispatcher and not create a very similar solution in
Java.

The second scenario is however much more important and interesting
-- yes, the server will get overloaded if you use many osad-enabled
clients, and we had Spacewalk users complaining about this on the
mailing list in the past. However, is the cure really to allow ssh
access from server to the clients? How about clients that are behind
NAT, roaming, or in general unavailable?

I would assume that the majority of Spacewalk (and downstream
products') installations have the server accessible from the clients
because otherwise things would currently not work. If you completely
reverse the style of operation, it will cause disruption in our users'
setups. And still, clients for which you won't be able or willing
to enable the SSH Push functionality will not get the improvement
of timely actions that don't bring the server to its knees.

I would very much love to see improvements to the second problem which
could be used by all *existing* clients of Spacewalk, even those that
are behind NAT, independent of the SSH Push feature.

Can the logic you propose to put into taskomatic be put into
osa-dispatcher instead, to throttle the number of clients which get
"invited" to rhn_check?
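
To make the throttling idea concrete, here is a rough sketch that caps how many clients are notified at the same time. It is not osa-dispatcher code: `dispatch_throttled` and the `notify` callable are hypothetical names, and the default of 2 simply mirrors the thread limit mentioned in the proposal.

```python
from concurrent.futures import ThreadPoolExecutor


def dispatch_throttled(clients, notify, max_parallel=2):
    """Call notify(client) for every client, at most max_parallel at once.

    notify stands in for whatever wakes a client up (a jabber ping, an
    ssh call, ...). Results come back in the same order as clients.
    """
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # pool.map queues all clients but only max_parallel workers run,
        # so the server never sees more than that many wake-ups in flight.
        return list(pool.map(notify, clients))
```

The remaining clients simply wait in the queue, so a burst of scheduled actions turns into a steady trickle of rhn_check calls instead of all clients calling back at once.
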

Can we move the osad functionality (which we use purely for
notification) to rhnsd and maybe have rhnsd keep a connection open
(WebSockets, maybe?) so that the server can wake the clients up in
a timely (yet well managed) fashion?

I'd be interested to hear your thoughts,

-- 
Jan Pazdziora
Principal Software Engineer, Satellite Engineering, Red Hat

_______________________________________________
Spacewalk-devel mailing list
Spacewalk-devel@redhat.com
https://www.redhat.com/mailman/listinfo/spacewalk-devel
