Hi Timur,

Thanks for sharing your experience and data points. I got it working now by
using TCP. Performance has not been tested yet; I'll keep you guys posted.

For each message sent I see one "decoder stopped" message like the one below.
From what Rob explained, that seems to be the expected behavior.
*"2016/01/10 16:28:38 Decoder 'tcp_in:3242-ProtobufDecoder-127.0.0.1': stopped"*

Also, I hear a pounding sound for each message sent when the volume is on.
I'm not sure if this is normal.

Thanks,
Emily

On Sun, Jan 10, 2016 at 2:12 AM, Timur Batyrshin <[email protected]> wrote:

> Hi Emily,
>
> Not sure what your exact case is for sending data to Heka.
> Usually I find it much easier to use JSON or a similar plain-text format
> to send messages to Heka unless you have tight requirements for
> throughput.
>
> In my tests I've seen throughputs of ~1K messages/second (~10Mbit/s) on
> a c4.large instance on AWS using the stock Lua JSON decoder/encoder and
> HTTP output/input.
> If you are expecting smaller throughputs you should probably look into
> that -- at least until you get used to Heka and to how it works.
>
> Best regards,
> Timur
>
> On Fri, Jan 8, 2016 at 3:22 AM, Emily Gu <[email protected]> wrote:
>
>> This is working. Thanks!
>>
>> I'm confused about the two-instances part, and about some other parts too.
>>
>> Yes, I need to send our custom data into Heka. I want to see if I need to
>> write my own custom Heka plugin or leverage existing Heka plugins. My
>> custom data is a slice of metrics that I can send into Heka through TCP.
>>
>> Your suggestion is very much appreciated.
>>
>> Thanks,
>> Emily
>>
>> On Thu, Jan 7, 2016 at 4:10 PM, Rob Miller <[email protected]> wrote:
>>
>>> From what I can tell (and it's not very clear), it looks like you've got
>>> one Heka instance running that has only a TcpInput, nothing else. That will
>>> accept data, but it's not going to do anything with that data.
>>>
>>> Then you've got a separate Heka config that contains no inputs, but only
>>> a TcpOutput (pointing at the input that's specified in the other config)
>>> and a FileOutput. These outputs might conceivably send data somewhere, but
>>> there are no inputs, so it's not clear where that data would come from.
>>>
>>> Drop the TcpOutput altogether, and combine the TcpInput and the
>>> FileOutput into a single config:
>>>
>>> [hekad]
>>> maxprocs = 1
>>> share_dir = "/Users/egu/heka/share/heka"
>>>
>>> [tcp_in:3242]
>>> type = "TcpInput"
>>> splitter = "HekaFramingSplitter"
>>> decoder = "ProtobufDecoder"
>>> address = ":3242"
>>>
>>> [tcp_heka_output_log]
>>> type = "FileOutput"
>>> message_matcher = "TRUE"
>>> path = "/tmp/output.log"
>>> perm = "664"
>>> encoder = "tcp_heka_output_encoder"
>>>
>>> [tcp_heka_output_encoder]
>>> type = "PayloadEncoder"
>>> append_newlines = false
>>>
>>> Once you've done that, you should be able to use `heka-inject` to send a
>>> message into your running Heka:
>>>
>>> $ heka-inject -heka 127.0.0.1:3242 -payload "1212 this is just a test"
>>>
>>> If you want to send custom data in through that TcpInput, then you'll
>>> have to switch to using a different splitter and a different decoder; the
>>> default setup you're using will only know how to handle Heka protobuf
>>> streams.
>>>
>>> -r
>>>
>>> On 01/07/2016 03:48 PM, Emily Gu wrote:
>>>
>>>> Thank you both, Rob and David, very much!
>>>>
>>>> Not sure where I need to define "base_dir"?
>>>>
>>>> I'm going to write a Heka plugin to pass our metrics data into Heka.
>>>>
>>>> For now, I'm having a hard time seeing the data I send in
>>>> programmatically over TCP through TcpInput in the output.log file.
>>>> I don't see any output.
>>>> The configs are:
>>>>
>>>> tcp_input.toml
>>>> ==============
>>>>
>>>> [hekad]
>>>> maxprocs = 1
>>>> share_dir = "/Users/egu/heka/share/heka"
>>>>
>>>> [tcp_in:3242]
>>>> type = "TcpInput"
>>>> splitter = "HekaFramingSplitter"
>>>> decoder = "ProtobufDecoder"
>>>> address = ":3242"
>>>>
>>>> tcp_output.toml
>>>> ===============
>>>>
>>>> [hekad]
>>>> maxprocs = 1
>>>> share_dir = "/Users/egu/heka/share/heka"
>>>>
>>>> [tcp_out:3242]
>>>> type = "TcpOutput"
>>>> message_matcher = "TRUE"
>>>> address = "127.0.0.1:3242"
>>>>
>>>> [tcp_heka_output_log]
>>>> type = "FileOutput"
>>>> message_matcher = "TRUE"
>>>> path = "/tmp/output.log"
>>>> perm = "664"
>>>> encoder = "tcp_heka_output_encoder"
>>>>
>>>> [tcp_heka_output_encoder]
>>>> type = "PayloadEncoder"
>>>> append_newlines = false
>>>>
>>>> The client:
>>>>
>>>> package main
>>>>
>>>> import (
>>>>     "fmt"
>>>>
>>>>     "github.com/mozilla-services/heka/client"
>>>> )
>>>>
>>>> func main() {
>>>>     message_bytes := []byte{100}
>>>>
>>>>     sender, err := client.NewNetworkSender("tcp", "127.0.0.1:3242")
>>>>     if err != nil {
>>>>         fmt.Println("Could not connect to", "127.0.0.1:3242")
>>>>         return
>>>>     }
>>>>     fmt.Println("Connected")
>>>>
>>>>     var i int
>>>>     for i = 0; i < 10; i++ {
>>>>         fmt.Println("message byte:", string(message_bytes))
>>>>         err = sender.SendMessage(message_bytes)
>>>>         if err != nil {
>>>>             break
>>>>         }
>>>>     }
>>>>     fmt.Println("sent", i, "messages")
>>>> }
>>>>
>>>> Please let me know what else I need to change.
>>>>
>>>> Thanks,
>>>> Emily
>>>>
>>>> On Thu, Jan 7, 2016 at 3:28 PM, David Birdsong <[email protected]> wrote:
>>>>
>>>>     On Thu, Jan 7, 2016 at 3:22 PM, Rob Miller <[email protected]> wrote:
>>>>
>>>>         On 01/07/2016 03:09 PM, Emily Gu wrote:
>>>>
>>>>             Thanks David for all the help! I'll give it a try.
>>>>
>>>>             Please bear with me, as there are some parts I still
>>>>             don't understand.
>>>>
>>>>             1. Why do I have to run two Heka instances, one for
>>>>             input and another for output?
>>>>
>>>>         Because if you send the output from a Heka instance back into
>>>>         itself, then you're likely setting up an infinite loop of
>>>>         traffic that will spin out of control.
>>>>
>>>>             2. Did you mean I need to specify different share_dirs in
>>>>             the input and output TOML configs?
>>>>
>>>>         If you're running multiple Heka instances on a single machine,
>>>>         it *should* be fine for them to use the same share_dir, which is
>>>>         read-only. It's very important that each specifies a unique
>>>>         base_dir, however, since that's used by Heka for internal
>>>>         bookkeeping data. Two Hekas using the same base_dir is asking
>>>>         for trouble.
>>>>
>>>>             3. Do I need both TcpOutput and FileOutput in order for me
>>>>             to see messages inside an output file? What if I didn't
>>>>             specify TcpOutput?
>>>>
>>>>         Um, TcpOutput sends output data over a TCP connection. It
>>>>         expects that there is a listener on the other side which will
>>>>         accept that TCP connection, and will know how to correctly
>>>>         handle the data that Heka is sending over the TCP connection.
>>>>
>>>>         FileOutput sends data to a file on the local file system.
>>>>
>>>>         It's of course fine to specify a FileOutput without specifying a
>>>>         TcpOutput.
>>>>
>>>>         -r
>>>>
>>>>     Whoops, yes, I meant base_dir, which is where Heka writes various
>>>>     internal state information.
>>>>
>>>>     Emily,
>>>>
>>>>     Maybe you could share what data you're trying to read into Heka and
>>>>     what you would like to do with it, and we could help get you going.
>>>>
>>>>     Heka is intended to be a uni-directional pipeline. It can read data in
>>>>     from many places into various formats, aggregate it into interesting
>>>>     new formats, and finally emit data somewhere else.
>>>>
>>
>> _______________________________________________
>> Heka mailing list
>> [email protected]
>> https://mail.mozilla.org/listinfo/heka
>>
>
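
Timur's suggestion in this thread (JSON over HTTP rather than framed protobuf
over TCP) could look roughly like the sketch below on the client side. This is
only an illustration under stated assumptions, not a tested Heka setup: the
Metric struct, the field names, and the address http://127.0.0.1:8325/ are
hypothetical, and it assumes the Heka config has an HTTP input (such as
HttpListenInput) listening at that address together with a decoder that
understands this JSON shape. It uses only the Go standard library.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"time"
)

// Metric is a hypothetical shape for one custom data point; the thread does
// not say what the real metrics look like.
type Metric struct {
	Name  string  `json:"name"`
	Value float64 `json:"value"`
}

// encodeMetrics serializes a slice of metrics as a single JSON array.
func encodeMetrics(metrics []Metric) ([]byte, error) {
	return json.Marshal(metrics)
}

// postMetrics sends the JSON-encoded metrics to url in one HTTP POST and
// treats any non-2xx response as an error.
func postMetrics(client *http.Client, url string, metrics []Metric) error {
	body, err := encodeMetrics(metrics)
	if err != nil {
		return err
	}
	resp, err := client.Post(url, "application/json", bytes.NewReader(body))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode < 200 || resp.StatusCode > 299 {
		return fmt.Errorf("unexpected status: %s", resp.Status)
	}
	return nil
}

func main() {
	// Assumed address of an HTTP input; change it to match your own config.
	const url = "http://127.0.0.1:8325/"

	client := &http.Client{Timeout: 5 * time.Second}
	metrics := []Metric{
		{Name: "requests", Value: 42},
		{Name: "latency_ms", Value: 12.5},
	}
	if err := postMetrics(client, url, metrics); err != nil {
		fmt.Println("send failed:", err)
		return
	}
	fmt.Println("sent", len(metrics), "metrics")
}
```

Sending the whole slice in one request keeps the client simple; whether one
POST per batch or one per metric is better depends on how the receiving
decoder is written.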

