On Tue, 2015-02-10 at 08:33 -0600, Dave Caplinger wrote:
> I'm surprised you need to bother with the HUP if you are using dynafiles
> to write to distinct quarter-hour output files.  Does lsof indicate that
> rsyslog really does still have those older files open?  Even so, I wouldn't
> expect the extra HUP once every 15 minutes to have much of an impact unless
> you have very high log volume since the HUP would pause rsyslog briefly to
> flush all output queues and close/reopen files.  (Maybe HUP doesn't flush
> the queues; I don't recall...?)

I don't believe a HUP flushes the queues; all it does is close open
file handles. And yes, with dynamic file naming rsyslog will keep a
file open for quite a while (if not indefinitely), because nothing
"knows" that the file will never be written to again. As far as I can
tell, dynamic file naming has no notion of chronology; it simply picks
a file name based on arbitrary conditions, one of which can be time.
In any case, the HUP is not what is causing the problem here.
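
For anyone who wants to check the lsof question without lsof, here is a
minimal Python sketch that reads the same information from /proc on
Linux. The function name, the example PID and the /var/log/remote/
prefix are illustrative assumptions, not anything from our setup, and
it needs enough privilege to inspect the rsyslogd process.

```python
import os


def open_files(pid, prefix="", proc_root="/proc"):
    """Return sorted paths the process holds open that start with prefix.

    Reads the symlinks under <proc_root>/<pid>/fd, which is where lsof
    gets its file list on Linux. proc_root is a parameter only so the
    function can be pointed at a test directory.
    """
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    paths = set()
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # descriptor was closed while we were scanning
        if target.startswith(prefix):
            paths.add(target)
    return sorted(paths)
```

Calling open_files(1234, "/var/log/remote/") before and after sending
the HUP (1234 standing in for the rsyslogd PID) should show whether the
older quarter-hour files are still held open.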

> What I suspect is more likely is that something else is contending for
> bandwidth over the link the VPN uses.  If you can catch it in the act, does
> running iperf between the two sites reveal lower throughput?  (You'll need
> to do a similar test outside of the problem window as a baseline of course.)
> We've run into a similar situation where a remote rsyslog forwarder was
> sending data via TCP through a VPN connection back to the central rsyslog
> collector, and the outbound Internet bandwidth available for rsyslog to use
> was often very small.  The input data rate of new logs at the remote
> forwarder simply exceeded the achievable output rate over the VPN link, so it
> would start falling behind and spool to disk.  Sometimes it would be able to
> recover as more bandwidth became available, but often not, and this would
> lead to the main queue filling up and ultimately dropping (all visible in
> impstats output).  This was with the VPN link doing inline compression at
> around 2:1, but it still wasn't sufficient.
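
To put rough numbers on the falling-behind scenario described above,
here is a small Python sketch of the arithmetic. It assumes constant
input and output rates in messages per second, which a real link will
not give you, and the function names and figures are made up for
illustration only.

```python
def seconds_until_full(capacity, in_rate, out_rate, backlog=0):
    """Seconds until a queue of `capacity` messages fills up.

    Returns None when the output keeps up with the input, in which
    case the queue never fills.
    """
    net = in_rate - out_rate
    if net <= 0:
        return None
    return max(capacity - backlog, 0) / net


def seconds_to_drain(backlog, in_rate, out_rate):
    """Seconds to clear `backlog` spooled messages once the link recovers.

    Returns None when the output is still not faster than the input.
    """
    net = out_rate - in_rate
    if net <= 0:
        return None
    return backlog / net
```

With a 100,000-message queue, 5,000 msg/s coming in and 4,000 msg/s
going out, seconds_until_full(100_000, 5_000, 4_000) gives 100 seconds
before the queue is full. Draining is just as slow in the other
direction: a backlog only shrinks at the difference between the two
rates.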

After more investigation last night, the cause appears to have been a
problem with the VPN device combined with congestion on the ISP end.
Temporarily disabling encryption on the device (none of the traffic
we're sending is sensitive in this case, so that was an option) and
switching to our alternative ISP reduced the backpressure on the link,
and we were eventually able to clear the spooled backlog.

> Our solution to the "remote sender with low outbound bandwidth" issue was to
> 1) make sure we weren't forwarding things that we really didn't need, and 2)
> enable stream compression.  We initially set up a 60-second compressed
> batching solution using output dynafiles from rsyslog that were gzip'd and
> transferred by another process (then consumed on the other side, uncompressed,
> and re-injected into the log stream), and it was effective.  However, that
> solution was complex (with many moving parts that could break), and stream
> compression in rsyslog performs even better than this, especially for
> long-lived TCP connections between the two rsyslog nodes.
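
One plausible reason a long-lived compressed stream beats compressing
batches separately is that the compressor keeps its history across
messages instead of starting cold for every batch. The Python sketch
below illustrates that with plain zlib. It is not rsyslog's
implementation, and the synthetic log lines and batch size are
assumptions chosen only for the demonstration.

```python
import zlib


def make_lines(n):
    """Build n synthetic, repetitive syslog-style lines (made-up data)."""
    return [
        (
            f"<134>Feb 10 08:{i // 60 % 60:02d}:{i % 60:02d} web{i % 5} "
            f"app[{1000 + i % 7}]: connection from 10.0.0.{i % 250} accepted\n"
        ).encode()
        for i in range(n)
    ]


def batch_compressed_size(lines, batch_size):
    """Total bytes when every batch is compressed independently."""
    total = 0
    for start in range(0, len(lines), batch_size):
        total += len(zlib.compress(b"".join(lines[start:start + batch_size])))
    return total


def stream_compress(lines, batch_size):
    """Compress all batches through one compressor, as a single stream.

    Z_SYNC_FLUSH after each batch pushes the data out immediately while
    keeping the compression history for the batches that follow.
    """
    comp = zlib.compressobj()
    out = []
    for start in range(0, len(lines), batch_size):
        out.append(comp.compress(b"".join(lines[start:start + batch_size])))
        out.append(comp.flush(zlib.Z_SYNC_FLUSH))
    out.append(comp.flush())  # finish the stream so it decompresses cleanly
    return b"".join(out)
```

Comparing len(stream_compress(lines, 20)) against
batch_compressed_size(lines, 20) for lines = make_lines(2000) shows the
stream coming out smaller. The gap narrows as batches get larger, so
treat it as an illustration of the effect rather than a measurement.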

We were aware that we were reaching the limits of our VPN, so we've
already got plans to switch to another solution, in this case shipping
logs over the open internet (where we have a 10 Gbit connection, as
opposed to the VPN's 1 Gbit) using rsyslog with TLS, but that is not
something we are quite ready to implement yet. I'll likely need to
escalate that on the priority list in light of this problem.

> Note that enabling stream compression on the receiving side is all-or-nothing
> for an input.  You cannot have multiple remote devices sending a mix of
> compressed and uncompressed data to the same input queue on the receiving
> server.  But if your case is literally just one sender and one receiver,
> that should not be an issue for you.  (You can always move the compressed
> TCP stream to another port and give it its own input queue and even
> ruleset.)

I'm already working out how to handle this kind of handoff for a switch
from TCP to RELP and to TLS, so I've got a fairly simple migration plan
cooked up for such changes.

