Hi Rob, thanks for the response/info. Yes, as you might remember, I implemented your heartbeat suggestion months ago: I send heartbeats from a chain of hekad instances to Influx and use Kapacitor's deadman watch to catch a missing heartbeat, and it works perfectly fine.
The problem I'm trying to solve is eliminating the manual USR1/USR2 intervention, hence my question about what can cause wedgedness. These are my current channel/pool settings (I know they don't mean much without knowing what is passing through the pipeline):

[hekad]
maxprocs = 4
base_dir = "/local/heka/run"
pid_file = "/var/run/heka-edge/heka-edge.pid"
poolsize = 500
plugin_chansize = 500

Do you have any heuristics for the values I should be using?

Ramin

On Wed, Mar 30, 2016 at 3:01 PM, Rob Miller <[email protected]> wrote:

> Heka can get wedged when there's a deadlock in the pipeline, usually
> related to pack exhaustion. For example, a filter might block because it's
> waiting for an empty pack with which to inject a new message, but there are
> no packs available because they're all tied up in the input channels for
> the filter. Tweaking the channel and pool sizes is usually helpful here.
>
> There are no verbosity levels to Heka's logging, sorry. When Heka is
> wedged, you can get a dump to stdout of the current internal state of the
> pipeline by sending SIGUSR1 to the Heka process. You can get the dump,
> followed by serializing all of the sandboxes and exiting the process by
> sending SIGUSR2.
>
> You can monitor for wedgedness by setting up a filter to emit a message
> every N seconds, and then setting up an output to catch those messages. You
> could write them to a file and monitor that the file is growing, or send
> them to a downstream listener that will notice if the heartbeats stop
> coming.
>
> -r
>
>
> On 03/30/2016 11:13 AM, Ramin Ali Dousti wrote:
>
>> Hi Timur,
>>
>> 1) wedged:
>> In my case, my hekad gets wedged for no reason (or I think i cannot
>> pinpoint as to why)
>>
>> 2) logging
>> Yes, I could do a LogOutput (which I'm doing when I'm troubleshooting)
>> but I was looking for some kind of application logging with different
>> severity so that i could up or down the severity and get to see less or
>> more log entries about what was going on. I'm not talking about the
>> actual messages that are passing through the system. For example when a
>> TcpInput drops a connection you will see a line indicating that, I'd
>> like to see a similar thing about "important events" within the system.
>> For example "I've just become wedged because ...".
>>
>> Thanks for the answers though.
>>
>> Best,
>> Ramin
>>
>> On Wed, Mar 30, 2016 at 7:34 AM, Timur Batyrshin <[email protected]> wrote:
>>
>>     Hi Ramin,
>>
>>     When heka is running it emits HeapAlloc, HeapSys and few other
>>     metrics which you can graph or alert based on them.
>>     I've seen heka freezing only when I send TERM signal to it and some
>>     Lua plugin fails to stop correctly or something like that.
>>     May be someone else will give you insights on other conditions.
>>
>>     For checking how heka is processing I know 3 ways:
>>     * dashboard
>>     (http://hekad.readthedocs.org/en/v0.10.0/config/outputs/dashboard.html)
>>     -- it doesn't display individual messages but displays number of
>>     messages processed, queues, lags, etc
>>     * if you send messages over TcpOutput you can enable buffering and
>>     use "heka-cat" command to browse/tail protobuf logs in /var/cache/heka
>>     * the most usable for me way is creating additional output like the
>>     following:
>>
>>     [debug_encoder]
>>     type="RstEncoder"
>>
>>     [LogOutput]
>>     encoder="debug_encoder"
>>     message_matcher="TRUE"
>>
>>     This makes heka dump all messages to stdout. If you store that in
>>     logs hope you rotate them based on size reached.
>>
>>
>>     Best regards,
>>     Timur
>>
>>     On Wed, Mar 30, 2016 at 2:22 AM, Ramin Ali Dousti <[email protected]> wrote:
>>
>>         Hi,
>>
>>         I have a nagging question. I read in the documentations and I
>>         also experienced it myself that hekad might become "wedged" and
>>         there would be no activity within the system while the process
>>         seems up from the outside. My question is what are the
>>         conditions that this might happen? The reason I ask is that my
>>         hekad instances might be running for weeks with no problem but
>>         they could end up "wedged" for no obvious reason. Knowing what
>>         gets them into this mode, might help me prevent that situation.
>>
>>         Also, hekad seems a very quiet process log-wise. How can I have
>>         it log "important" information about its doing?
>>
>>         I really appreciate any insight.
>>
>>         --
>>         Ramin
>>
>>         _______________________________________________
>>         Heka mailing list
>>         [email protected]
>>         https://mail.mozilla.org/listinfo/heka
>>
>>
>> --
>> Ramin
>>
>>
>> _______________________________________________
>> Heka mailing list
>> [email protected]
>> https://mail.mozilla.org/listinfo/heka
>>
>

--
Ramin
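
For illustration only, here is a rough Python sketch of how the manual USR1/USR2 step could be automated along the lines Rob describes (write heartbeats to a file and watch that it keeps growing). None of this comes from Heka itself. It assumes a heartbeat output appends to a file (the HEARTBEAT_FILE path is hypothetical), uses the file's modification time as a stand-in for "is growing", reads the PID from the pid_file in the [hekad] config above, and sends SIGUSR2, which per Rob's description dumps the pipeline state, serializes the sandboxes and exits. It also assumes that some supervisor restarts hekad afterwards; the STALE_AFTER threshold is likewise an assumed value.

```python
import os
import signal
import time

# Hypothetical path: wherever a heartbeat output appends its messages.
HEARTBEAT_FILE = "/local/heka/run/heartbeat.log"
# Taken from the [hekad] config in this thread.
PID_FILE = "/var/run/heka-edge/heka-edge.pid"
# Assumed threshold: seconds without a heartbeat before hekad counts as wedged.
STALE_AFTER = 60


def is_stale(path, max_age, now=None):
    """Return True if the file is missing or was last modified more than
    max_age seconds ago."""
    now = time.time() if now is None else now
    try:
        return now - os.stat(path).st_mtime > max_age
    except FileNotFoundError:
        return True


def read_pid(pid_file):
    """Read the process ID that hekad wrote to its pid file."""
    with open(pid_file) as fh:
        return int(fh.read().strip())


def check_once(heartbeat_file, pid_file, max_age, kill=os.kill):
    """Send SIGUSR2 to hekad if the heartbeat file is stale.

    Returns True if a signal was sent. The kill parameter exists only so
    the function can be exercised without signalling a real process.
    """
    if not is_stale(heartbeat_file, max_age):
        return False
    kill(read_pid(pid_file), signal.SIGUSR2)
    return True
```

Something like check_once(HEARTBEAT_FILE, PID_FILE, STALE_AFTER) could then be run periodically, for example from cron. Sending SIGUSR1 instead would only produce the dump and leave the process running.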

