Thanks, David. I did it this way: the collector is scheduled to collect metrics periodically. The collected metrics are shipped to Heka in one shot, and then the connection is closed. This repeats at every collector interval (open connection -> ship metrics -> close connection). Does this make sense?
I thought Heka already handles pooling and queuing, so implementing connection pooling on the client side seems unnecessary. The same goes for queuing on the client side. Please correct me if my understanding is wrong.

Thanks,
Emily

On Mon, Jan 11, 2016 at 12:01 AM, David Birdsong <[email protected]> wrote:

> On Sun, Jan 10, 2016 at 9:46 PM, Emily Gu <[email protected]> wrote:
>
>> Hi Timur,
>>
>> Thanks for sharing your experience and data points. I got it working now
>> by using TCP. The performance has not been tested yet. I'll keep you guys
>> posted. For each message sent I can see one decoder stopped message as
>> below. From what Rob explained, it seems to work as expected.
>
> but not ideal. you should hold the tcp connection open and send many
> messages over a single connection.
>
>> *"2016/01/10 16:28:38 Decoder 'tcp_in:3242-ProtobufDecoder-127.0.0.1':
>> stopped"*
>>
>> Also, I can hear the pounding sound for each sent when volume is on. I'm
>> not sure if this is normal.
>>
>> thanks.
>>
>> Emily
>>
>> On Sun, Jan 10, 2016 at 2:12 AM, Timur Batyrshin <[email protected]>
>> wrote:
>>
>>> Hi Emily,
>>>
>>> Not sure what is your exact case for sending out data to Heka.
>>> Usually I find it much more easy to use JSON or similar plain text
>>> format to sending messages to Heka unless you have tight requirements for
>>> throughput.
>>>
>>> In my tests I've seen throughputs of ~1K messages/second (~10Mbit/s) on
>>> c4.large instance on AWS using stock lua JSON decoder/encoder and HTTP
>>> output/input.
>>> If you are expecting smaller throughputs you should probably look into
>>> that -- at least until you get used to Heka and to how it works.
>>>
>>> Best regards,
>>> Timur
>>>
>>> On Fri, Jan 8, 2016 at 3:22 AM, Emily Gu <[email protected]> wrote:
>>>
>>>> This is working. Thanks!
>>>>
>>>> I'm confusing on the two instances parts and also others.
>>>>
>>>> Yes, I need to send our custom data into Heka.
>>>> I want to see if I need to write my own custom Heka plugin or leverage
>>>> existing Heka plugins. My custom data is a slice of metrics can send into
>>>> Heka through TCP.
>>>>
>>>> Your suggestion is very much appreciated.
>>>>
>>>> Thanks,
>>>> Emily
>>>>
>>>> On Thu, Jan 7, 2016 at 4:10 PM, Rob Miller <[email protected]> wrote:
>>>>
>>>>> From what I can tell (and it's not very clear), it looks like you've
>>>>> got one Heka instance running that has only a TcpInput, nothing else. That
>>>>> will accept data, but it's not going to do anything with that data.
>>>>>
>>>>> Then you've got a separate Heka config that contains no inputs, but
>>>>> only a TcpOutput (pointing at the input that's specified in the other
>>>>> config) and a FileOutput. These outputs might conceivably send data
>>>>> somewhere, but there are no inputs, so it's not clear where that data
>>>>> would come from.
>>>>>
>>>>> Drop the TcpOutput altogether, and combine the TcpInput and the
>>>>> FileOutput into a single config:
>>>>>
>>>>> [hekad]
>>>>> maxprocs = 1
>>>>> share_dir = "/Users/egu/heka/share/heka"
>>>>>
>>>>> [tcp_in:3242]
>>>>> type = "TcpInput"
>>>>> splitter = "HekaFramingSplitter"
>>>>> decoder = "ProtobufDecoder"
>>>>> address = ":3242"
>>>>>
>>>>> [tcp_heka_output_log]
>>>>> type = "FileOutput"
>>>>> message_matcher = "TRUE"
>>>>> path = "/tmp/output.log"
>>>>> perm = "664"
>>>>> encoder = "tcp_heka_output_encoder"
>>>>>
>>>>> [tcp_heka_output_encoder]
>>>>> type = "PayloadEncoder"
>>>>> append_newlines = false
>>>>>
>>>>> Once you've done that, you should be able to use `heka-inject` to send
>>>>> a message into your running Heka:
>>>>>
>>>>> $ heka-inject -heka 127.0.0.1:3242 -payload "1212 this is just a test"
>>>>>
>>>>> If you want to send custom data in through that TcpInput, then you'll
>>>>> have to switch to using a different splitter and a different decoder, the
>>>>> default setup you're using will only know how to handle Heka protobuf
>>>>> streams.
>>>>>
>>>>> -r
>>>>>
>>>>> On 01/07/2016 03:48 PM, Emily Gu wrote:
>>>>>
>>>>>> Thanks you both Rob and David very much!
>>>>>>
>>>>>> Not sure where I need to define "base_dir"?
>>>>>>
>>>>>> I'm going to write a Heka plugin to pass our metrics data into Heka.
>>>>>>
>>>>>> For now, I have a hard time to see the data I send in through
>>>>>> TCP programmatically through TcpInput in the output.log file.
>>>>>> I don't see any output. The configs are:
>>>>>>
>>>>>> tcp_input.toml
>>>>>> ============
>>>>>>
>>>>>> [hekad]
>>>>>> maxprocs = 1
>>>>>> share_dir = "/Users/egu/heka/share/heka"
>>>>>>
>>>>>> [tcp_in:3242]
>>>>>> type = "TcpInput"
>>>>>> splitter = "HekaFramingSplitter"
>>>>>> decoder = "ProtobufDecoder"
>>>>>> address = ":3242"
>>>>>>
>>>>>> tcp_output.toml
>>>>>> ==============
>>>>>>
>>>>>> [hekad]
>>>>>> maxprocs = 1
>>>>>> share_dir = "/Users/egu/heka/share/heka"
>>>>>>
>>>>>> [tcp_out:3242]
>>>>>> type = "TcpOutput"
>>>>>> message_matcher = "TRUE"
>>>>>> address = "127.0.0.1:3242"
>>>>>>
>>>>>> [tcp_heka_output_log]
>>>>>> type = "FileOutput"
>>>>>> message_matcher = "TRUE"
>>>>>> path = "/tmp/output.log"
>>>>>> perm = "664"
>>>>>> encoder = "tcp_heka_output_encoder"
>>>>>>
>>>>>> [tcp_heka_output_encoder]
>>>>>> type = "PayloadEncoder"
>>>>>> append_newlines = false
>>>>>>
>>>>>> The client:
>>>>>>
>>>>>> package main
>>>>>>
>>>>>> import (
>>>>>>     "fmt"
>>>>>>     "github.com/mozilla-services/heka/client"
>>>>>> )
>>>>>>
>>>>>> func main() {
>>>>>>     message_bytes := []byte {100}
>>>>>>
>>>>>>     sender, err := client.NewNetworkSender("tcp", "127.0.0.1:3242")
>>>>>>     if err != nil {
>>>>>>         fmt.Println("Could not connect to", "127.0.0.1:3242")
>>>>>>         return
>>>>>>     }
>>>>>>     fmt.Println("Connected")
>>>>>>
>>>>>>     var i int
>>>>>>     for i = 0; i < 10; i++ {
>>>>>>         fmt.Println("message byte:", string(message_bytes))
>>>>>>         err = sender.SendMessage(message_bytes)
>>>>>>         if err != nil {
>>>>>>             break
>>>>>>         }
>>>>>>     }
>>>>>>     fmt.Println("sent", i, "messages")
>>>>>> }
>>>>>>
>>>>>> Please let me know what else I need to change.
>>>>>>
>>>>>> Thanks,
>>>>>> Emily
>>>>>>
>>>>>> On Thu, Jan 7, 2016 at 3:28 PM, David Birdsong <[email protected]> wrote:
>>>>>>
>>>>>>     On Thu, Jan 7, 2016 at 3:22 PM, Rob Miller <[email protected]> wrote:
>>>>>>
>>>>>>         On 01/07/2016 03:09 PM, Emily Gu wrote:
>>>>>>
>>>>>>             Thanks David for all the help! I'll give it a try.
>>>>>>
>>>>>>             Please bear with me as some parts I still not understand.
>>>>>>
>>>>>>             1. Why do I have to run two Heka instances where one for
>>>>>>             input and another for output?
>>>>>>
>>>>>>         Because if you send the output from a Heka instance back into
>>>>>>         itself, then you're likely setting up an infinite loop of
>>>>>>         traffic that will spin out of control.
>>>>>>
>>>>>>             2. Did you mean I need to specify different share_dirs in
>>>>>>             input and output Toml configs?
>>>>>>
>>>>>>         If you're running multiple Heka instances on a single machine,
>>>>>>         it *should* be fine for them to use the same share_dir, which
>>>>>>         is read-only. It's very important that each specifies a unique
>>>>>>         base_dir, however, since that's used by Heka for internal
>>>>>>         bookkeeping data. Two Heka's using the same base_dir is asking
>>>>>>         for trouble.
>>>>>>
>>>>>>             3. Do I need both TcpOutput and FileOutput in order for me
>>>>>>             to see messages inside an output file? What if I didn't
>>>>>>             specify TcpOutput?
>>>>>>
>>>>>>         Um, TcpOutput sends output data over a TCP connection. It
>>>>>>         expects that there is a listener on the other side which will
>>>>>>         accept that TCP connection, and will know how to correctly
>>>>>>         handle the data that Heka is sending over the TCP connection.
>>>>>>
>>>>>>         FileOutput sends data to a file on the local file system.
>>>>>>
>>>>>>         It's of course fine to specify a FileOutput without
>>>>>>         specifying a TcpOutput.
>>>>>>
>>>>>>         -r
>>>>>>
>>>>>>     whoops, yes I meant base_dir for where heka writes various
>>>>>>     internal state information to.
>>>>>>
>>>>>>     Emily,
>>>>>>
>>>>>>     Maybe you could share what data you're trying to read into heka
>>>>>>     and what you would like to do with it and we could help get you
>>>>>>     going.
>>>>>>
>>>>>>     Heka intended to a uni-directional pipeline. It can read data in
>>>>>>     from many places into various formats, aggregate into interesting
>>>>>>     new formats, and finally emit data somewhere else.
>>>>
>>>> _______________________________________________
>>>> Heka mailing list
>>>> [email protected]
>>>> https://mail.mozilla.org/listinfo/heka

