Re: [heka] heka statsd [multicore] performance

Tom Cameron Thu, 06 Nov 2014 09:01:41 -0800

>
> i was thinking of having N workers, where every worker is responsible for
> a certain subset of the key space.
> e.g. using consistent hashing, the processing for all metric lines
> relating to an individual key requires no outside data.
> But at the same time, I think you're right that reading from UDP is the
> slowest aspect (which still surprises me) so this optimization would
> probably be pointless.
>
>
That's certainly an interesting approach. I'm not sure how you'd route that
internally without doing the basic work of dissecting the data within the
packet anyway, though. I suppose you could hash on the source address, then
the net listener could send the packet data off to a channel for that hash
which could be backed by N listening goroutines. Were you doing lots of
analysis on each received message, this could certainly be the way to go.
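
To make the routing idea above concrete, here is a minimal sketch, assuming an FNV-1a hash of the source address picks one of N channels, each drained by its own goroutine. The names packet, shardFor and dispatch are hypothetical and are not taken from Heka. The packets are simulated so the sketch runs without a socket; in a real listener they would come from the UDP read loop.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sync"
)

// packet is a hypothetical container for one received datagram.
type packet struct {
	src  string // source address, e.g. "10.0.0.1:5000"
	data []byte // raw payload, not parsed by the dispatcher
}

// shardFor maps a source address onto one of n workers using FNV-1a.
// The same address always lands on the same worker.
func shardFor(src string, n int) int {
	h := fnv.New32a()
	h.Write([]byte(src))
	return int(h.Sum32() % uint32(n))
}

// dispatch starts n worker goroutines, each reading from its own channel,
// routes every packet by source address, and returns what each worker saw.
func dispatch(n int, pkts []packet) [][]packet {
	chans := make([]chan packet, n)
	handled := make([][]packet, n)
	var wg sync.WaitGroup
	for i := range chans {
		chans[i] = make(chan packet, 16)
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			// Each worker only touches its own slot, so no lock is needed.
			for p := range chans[i] {
				handled[i] = append(handled[i], p)
			}
		}(i)
	}
	for _, p := range pkts {
		chans[shardFor(p.src, n)] <- p
	}
	for _, c := range chans {
		close(c)
	}
	wg.Wait()
	return handled
}

func main() {
	pkts := []packet{
		{src: "10.0.0.1:5000", data: []byte("foo:1|c")},
		{src: "10.0.0.2:5000", data: []byte("bar:1|c")},
		{src: "10.0.0.1:5000", data: []byte("foo:2|c")},
	}
	for i, w := range dispatch(4, pkts) {
		for _, p := range w {
			fmt.Printf("worker %d got %q from %s\n", i, p.data, p.src)
		}
	}
}
```

Hashing on the source address avoids parsing the payload before routing, but it only keeps a given metric key on one worker if each key is always sent from the same address, which is an assumption about the clients.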


> hmm, what do you mean with this? are udp packets slower to process than
> tcp? how so?
> i thought reassembling a tcp stream was more expensive.
>

No, no. UDP isn't slower to process than TCP. What I'm referring to is the
fact that UDP isn't a persistent connection between client and server. A
single packet is what you get (from the networking point of view). TCP on
the other hand, has an entire session around it which is persistent until
one end or the other closes that connection. So, with TCP, once you do the
setup (handshake), you can continue passing messages down that connection.
They will be re-assembled in the case where data is too big for a single
packet (segmentation), re-transmitted in the case where no ACK was
received, and re-ordered so you have some basic guarantee that data is
received in the order it was sent. UDP has none of this at all. One packet,
no acknowledgement, no frills. Fire-and-forget.

This is why, in most examples of TCP servers in any language, a thread (or
goroutine) is spun up to handle each TCP client. The server can service
many TCP clients concurrently, and the persistence of the session with the
client lends itself to just having an idle thread listening for more data.
With UDP, you can't do any of this. Well, you can't practically do any of
this.
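
As a rough illustration of the two shapes described above, here is a minimal sketch using only the standard net package. serveTCP hands each accepted connection to its own goroutine, while serveUDP has a single loop reading every datagram from every sender off one socket. The function names, the demo helper, the loopback addresses and the sample messages are all assumptions made for the example.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
	"time"
)

// serveTCP accepts connections and gives each one its own goroutine,
// which sits in a read loop for the life of that session.
func serveTCP(ln net.Listener, out chan<- string) {
	for {
		conn, err := ln.Accept()
		if err != nil {
			return // listener closed
		}
		go func(c net.Conn) {
			defer c.Close()
			sc := bufio.NewScanner(c)
			for sc.Scan() { // one line per message
				out <- "tcp:" + sc.Text()
			}
		}(conn)
	}
}

// serveUDP has no sessions to hand off: one loop reads every datagram
// from every sender off the same socket.
func serveUDP(pc net.PacketConn, out chan<- string) {
	buf := make([]byte, 65535)
	for {
		n, _, err := pc.ReadFrom(buf)
		if err != nil {
			return // socket closed
		}
		out <- "udp:" + string(buf[:n])
	}
}

// recv waits briefly for one message so the demo cannot hang forever.
func recv(ch <-chan string) (string, error) {
	select {
	case s := <-ch:
		return s, nil
	case <-time.After(2 * time.Second):
		return "", fmt.Errorf("timed out waiting for message")
	}
}

// demo starts both servers on loopback, sends one message to each,
// and returns what each server received.
func demo() (string, string, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", "", err
	}
	defer ln.Close()
	pc, err := net.ListenPacket("udp", "127.0.0.1:0")
	if err != nil {
		return "", "", err
	}
	defer pc.Close()

	tcpOut, udpOut := make(chan string, 1), make(chan string, 1)
	go serveTCP(ln, tcpOut)
	go serveUDP(pc, udpOut)

	tc, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		return "", "", err
	}
	defer tc.Close()
	fmt.Fprintln(tc, "foo:1|c")

	uc, err := net.Dial("udp", pc.LocalAddr().String())
	if err != nil {
		return "", "", err
	}
	defer uc.Close()
	if _, err := uc.Write([]byte("bar:1|c")); err != nil {
		return "", "", err
	}

	gotTCP, err := recv(tcpOut)
	if err != nil {
		return "", "", err
	}
	gotUDP, err := recv(udpOut)
	return gotTCP, gotUDP, err
}

func main() {
	gotTCP, gotUDP, err := demo()
	fmt.Println(gotTCP, gotUDP, err)
}
```

The UDP loop is the only place datagrams enter the process, so any fan-out to more goroutines has to happen after the read rather than at accept time.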

> do you mean that the golang net library could be made more performant?
>

Absolutely. This isn't to say that it's bad, just to say that optimization
hasn't been the focus. Additionally, there are several recent improvements
added to the Linux kernel which will require modifications to the net
library in Go to utilize. Again, not a problem. It just takes work.

I find myself reaching for Go every time I need to write a network service
rather than Python these days. It's more performant generally, and I find I
can wrap my head around marshalling and de-marshalling structs easier with
Go than Python lately. Heka has been an amazing project since I found it,
and has been an example I refer to for ways to solve problems in my own
applications. BIG thumbs up to these people!

-- 
Tom Cameron
System Engineer
Dyn, Inc.

P. 603-668-4998
M. 603-289-0124

_______________________________________________
Heka mailing list
[email protected]
https://mail.mozilla.org/listinfo/heka
