Re: [heka] measure Heka load

Timur Batyrshin Mon, 16 Nov 2015 22:47:41 -0800

Hi Rob,


On 16 Nov 2015 at 23:40:33, Rob Miller ([email protected]) wrote:

> Parsing heka.all-report messages looks fine as well as having 
> DashboardOutput. 
> I was just hoping that there is already something ready for that :-) 
No, alas. Should be pretty easy w/ cjson in a SandboxFilter, though. I'd 
happily add such a filter to the Heka repo to make it available for everyone. 

No problem, this looks to be easy to do.
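
As a rough illustration of the report-parsing idea, here is a minimal sketch in Python (standing in for the Lua/cjson SandboxFilter mentioned above, not a replacement for it). The payload layout, the field names (`InChanCapacity`, `InChanLength`, `Name`) and the `channel_load` helper are assumptions for illustration and should be checked against a real heka.all-report message.

```python
import json

# Hypothetical sample shaped like a heka.all-report payload; verify the real
# layout and field names against an actual report before relying on this.
SAMPLE_REPORT = json.dumps({
    "outputs": [
        {"Name": "HttpOutput",
         "InChanCapacity": {"value": 100, "representation": "count"},
         "InChanLength": {"value": 75, "representation": "count"}},
    ],
    "filters": [
        {"Name": "StatFilter",
         "InChanCapacity": {"value": 100, "representation": "count"},
         "InChanLength": {"value": 0, "representation": "count"}},
    ],
})


def channel_load(payload):
    """Return {plugin name: input channel fill ratio} for each plugin entry."""
    report = json.loads(payload)
    load = {}
    for section in report.values():
        if not isinstance(section, list):
            continue  # ignore anything that is not a list of plugin entries
        for plugin in section:
            capacity = plugin.get("InChanCapacity", {}).get("value")
            length = plugin.get("InChanLength", {}).get("value")
            # Skip plugins that report no capacity (or zero) or no length.
            if capacity and length is not None:
                load[plugin.get("Name", "unknown")] = length / capacity
    return load
```

Calling `channel_load(SAMPLE_REPORT)` gives a per-plugin fill ratio; a ratio that stays close to 1.0 would suggest the plugin is not keeping up.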

> So the question is: 
> Is there a way to make HttpOutput cache outgoing traffic in the same way 
> TcpOutput does? 
Heka v0.9 supports buffering only for the TcpOutput and ElasticSearchOutput. 
Heka v0.10 adds support for disk buffering for *all* filter and output plugins. 
Unfortunately, this has had some stability issues, and I've been too busy doing 
things other than working on the Heka core to yet resolve these issues. One of 
the known problems (https://github.com/mozilla-services/heka/issues/1738) seems 
to have been resolved and will hopefully be merged to the versions/0.10 branch 
this week, but there's another issue where Heka is generating idle pack 
diagnostic messages that I'm pretty sure is related to the new disk buffering 
(https://github.com/mozilla-services/heka/issues/1699) that I haven't yet had a 
chance to debug. That's the last blocker I know about for a 0.10.0 final 
release. I wish I could give you a time-table for resolving it, but I can't 
beyond saying that getting the 0.10.0 final release out the door is on my list 
of 2015 Q4 goals. 

No worries, please take your time.

On the other hand, I was referring to 0.10.0b1, in which I still didn’t see 
buffering for HttpOutput. Or do you mean it is still waiting to be merged?

> Or is there any other way to add persistence to it (without running 
> external services)? 
The upcoming buffering is probably your best choice. There are hackish, fairly 
painful manual options, such as also writing your data out to a FileOutput, or 
maybe even using a SandboxFilter with `preserve_data = true` to hold on to a 
sliding window of the latest set of records, but there's no support for knowing 
what was missed and automatically retrying it, you'd have to basically 
cross-reference what arrived with what didn't and then manually extract the 
record from your backup storage and add it to the end data store by hand. 
Pretty painful. Sorry. :P 
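
The manual cross-referencing step described above could look roughly like the following Python sketch. It assumes each backed-up record carries a unique identifier under a `Uuid` key and that the identifiers that reached the end data store can be listed; the `find_missing` helper and the record shape are hypothetical, not part of Heka.

```python
def find_missing(backup_records, delivered_ids):
    """Return the backed-up records whose Uuid never reached the destination.

    backup_records: iterable of dicts read back from the backup storage
                    (assumed to contain a "Uuid" key).
    delivered_ids:  iterable of Uuid strings found in the end data store.
    """
    delivered = set(delivered_ids)
    # Preserve the backup order so records can be replayed in sequence.
    return [rec for rec in backup_records if rec["Uuid"] not in delivered]
```

The returned records would still have to be re-sent to the end data store by hand, which is the painful part.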

No problem, I’ll wait for the buffering to become available in mainline. Large 
lags within a single AWS region are not very likely, so buffering on the 
collector hosts would be OK.
I was also thinking of doing cross-datacenter data transfer, which I’ll 
probably put on hold for a while.

BTW, I’ve just thought of one more use case for disk buffering: for debugging 
purposes heka-cat looks handy (vs. LogOutput with RstEncoder), and you’ll be 
able to use it once the buffers are on disk.


Timur

_______________________________________________
Heka mailing list
[email protected]
https://mail.mozilla.org/listinfo/heka
