Chris,

Yes, I'm running Ubuntu Server 10.10 64-bit...
--Steven.

On 9/30/11 2:31 PM, "Steven M Lichti" <[email protected]> wrote:

>Chris,
>
>A fine point, but I can't help but wonder whether they are related. I'll
>leave a window attached to the machine and see what I can figure out.
>
>I'm still working on getting syslog-ng going so I can use Splunk to see
>what's really going on, so I won't have answers for a while...
>
>--Steven.
>
>--
>Steven Lichti
>Academic Technologies
>Northwestern University
>[email protected]
>(847) 467-7805
>
>
>On 9/30/11 12:34 PM, "Christopher Brooks" <[email protected]> wrote:
>
>>Steven,
>>
>>Matterhorn doesn't do anything with SSH, so if the machines are not
>>under a really high load, SSH should respond. Can you check if the
>>machines are under high load? How long does it take for a machine to
>>get inaccessible, and is it fairly repeatable? Can you stay SSH'ed
>>into the machine and just watch top to see if the load average jumps up?
>>
>>Unless MH is causing a high load, I think this is unrelated to MH.
>>
>>(Ubuntu 10.10?)
>>
>>Chris
>>
>>On Fri, 30 Sep 2011 03:40:03 +0000
>>Steven M Lichti <[email protected]> wrote:
>>
>>> Chris,
>>>
>>> I'm having a problem sort of like this. My capture agents are
>>> dropping off the air, and while they are marked as offline, they are
>>> still inaccessible. I can ping them, but not ssh to them. I'm at a
>>> complete loss as to why these machines stop responding. I've taken to
>>> restarting them a couple of times per morning to make sure they're
>>> alright, and that has seemed to help a bit.
>>>
>>> I've also checked the system log files, but haven't found anything
>>> useful...
>>>
>>> --Steven.
>>>
>>> From: Rubén Pérez <[email protected]>
>>> Reply-To: Matterhorn Users <matterhorn-users@opencastproject.org>
>>> Date: Fri, 30 Sep 2011 01:38:53 +0200
>>> To: Matterhorn Users <matterhorn-users@opencastproject.org>
>>> Subject: Re: [Matterhorn-users] Heartburn
>>>
>>> Hi Chris,
>>>
>>> We do have the same problem around here and it has been driving us
>>> crazy in our new pilot preliminary test. Can you elaborate on what
>>> the "heartbeat" is? I understand it is some kind of "keep-alive" to
>>> let the system know the machine is operative. What is the method you
>>> used to disable it?
>>>
>>> Thanks for your answers.
>>>
>>> Best regards
>>> Rubenciño
>>>
>>> 2011/9/29 Christopher Brooks <[email protected]>
>>>
>>> Hi,
>>>
>>> Our machines constantly get marked as offline. Seems like under load
>>> the heartbeat isn't getting through (for whatever reason). We're
>>> disabling the heartbeat on our local system to make up for this.
>>>
>>> Anyone else having these issues on a distributed deployment?
>>>
>>> Looking for people who might also be running into this, to help test
>>> potential patches for 1.2.1.
>>>
>>> Chris
>>>
>>> --
>>> Christopher Brooks, BSc, MSc
>>> ARIES Laboratory, University of Saskatchewan
>>>
>>> Web: http://www.cs.usask.ca/~cab938
>>> Phone: 1.306.966.1442
>>> Mail: Advanced Research in Intelligent Educational Systems Laboratory
>>>       Department of Computer Science
>>>       University of Saskatchewan
>>>       176 Thorvaldson Building
>>>       110 Science Place
>>>       Saskatoon, SK
>>>       S7N 5C9
>>> _______________________________________________
>>> Matterhorn-users mailing list
>>> Matterhorn-users@opencastproject.org
>>> http://lists.opencastproject.org/mailman/listinfo/matterhorn-users
>
>_______________________________________________
>Matterhorn mailing list
>[email protected]
>http://lists.opencastproject.org/mailman/listinfo/matterhorn
>
>
>To unsubscribe please email
>[email protected]
>_______________________________________________
