I am currently working on a reliable UDP transport, and I am writing it exclusively with buffered channels.
It works like this: there are three goroutines. One accepts new input and forwards it to an intermediate 'incoming' worker, which checks whether the new data can be bundled (the data is Reed-Solomon encoded shards, of which 3 valid pieces are required). It either bundles the data or pushes it back through a 'return' channel, which simply loads it back into the incoming channel.

I actually drafted this design some time ago, but only just now tried to make it work, and after several hours chasing dead loops in the workers I realised I first needed to fully understand every aspect of how one implements this thing. So I wrote the simple version of a batch processor shown below, which sequentially pushes incrementing integers into the channels, and each time it has collected three of them it announces it has a batch. And voila, it works. A few small details had to be worked out properly along the way:

1. You always have to start the goroutines that respond to the channels first, and only once all the goroutines are running do you start the main receiving loop that puts the items into the channel.

2. In the 'pusher' goroutine, notice the 'work complete' check: I observed that it has to be set one past the intended bound, here 7, when I want to see 6 items completely batched. If batches were atomic and a shutdown request happened mid-operation, you would need something that notifies the other end that the last batch won't be processed. I note this mainly to highlight that the pusher goroutine is first in the processing scheme and the returner is one step further down the track, so when the receiving thread is at 7, the returner is pushing item 6 back to incoming, where it will be bundled before the pusher shuts down.
A last general observation: when working with goroutines, you really need to keep very close track of the number of selectors, pushers and pullers on each channel. Pushers and pullers have to be paired, or one can be starved and deadlock. For selectors, for example when implementing a shutdown/cleanup response in a goroutine, you can use a nonblocking select (one with a default clause) at the start of the loop to check the quit notification before blocking on the work channel. When only one channel is being selected, as in the returner, you don't want a default clause to unblock the select: control falls through to the enclosing for loop on every pass and you get a busy spin. So an empty default clause is something you should generally omit until you are sure you need it there, otherwise you get boring old infinite-loop problems.

On the shutdown semantics, I had some of this backwards at first, so to get it straight: a receive from a closed channel does not block, it returns immediately with the zero value (and ok == false in the two-value form), and this is what lets blocked receivers notice a shutdown. A send or receive on a nil channel, by contrast, blocks forever, and a nil channel case in a select is simply never ready. A send on a closed channel panics. So closing is the actual notification mechanism, and setting the channel variable to nil afterwards, which my example does as part of the shutdown, merely disables further use of it. This is not a clean shutdown, of course: assigning nil to a shared channel variable while other goroutines still read it is a data race, and I might be throwing out more options than I need to make sure I hit it. The practical rule that follows is that only the sender should close a channel, and closing should happen before any nilling; if you nil first, blocked receivers never wake, and if another sender is still live when you close, you get a send-on-closed-channel panic, or a sleeping-goroutines deadlock, one or the other.

I might also be confused about how this works out for a packet receiving/routing system. I have worked out from this exercise that the channel size does not need to be greater than the batch size for the 'pusher' (in the transport I am writing, this is the socket listener), and that the buffer sizes on the incoming/return channels will be subject to tuning, to balance the latency caused when several of the channels fill up and block the sender that loads them, but...
So I figure that if a computer were doing nothing except receiving, batching and relaying packets, as in this example of bundling sequential values, you would not need any more buffering than this. But if the input is more unpredictable, by some amount that depends partly on the performance characteristics of the computers and networks involved, then there needs to be some headroom so that several of the channels can fill up without stopping the whole show.

Keeping that in mind, one characteristic of the algorithm below is that it preserves ordering. If you run it on the playground you can see that each batch comes out in the order the items were sent (thus, for my transport, received). But since the transport will be receiving many batches of up to 9 packets each from many peers, I think that for this case I would need to add code that sizes a set of buffers based on the number of peers, plus however much data turns out, in testing, to be latent in the channels most of the time. In my transport, if it doesn't receive 3 uncorrupted packets within 250ms, it dumps them. That might be a bit too short, maybe it should be 400 or 500ms, but the amount of time the transport waits for three valid shards to be delivered is another factor in deciding how big to make the buffers: a longer staleness threshold naturally means that potentially many more of these batched packets can be received at once before they either expire or are completed.

Anyhow, I am still quite a beginner with channels, but like most Go programmers I am quite in love with them, as far as they go. In this particular use case, one of the major benefits of the language is very clear: once a bundle of packets is successfully gathered, it can immediately be bounced somewhere else or handed to another goroutine for processing.
There is no polling period, and the latency overhead of gathering the batches together is very, very low. In this use case that is extremely beneficial, as the batches the transport works on are short message packets with a very tight time-window of value, and Go can help a lot in minimising it.

*Just one thing, though. Looking at elliptic curve algorithms has brought up the subject of so-called 'constant time' operations. Almost all of the potentially data-dependent timing in the stdlib's crypto primitives is made constant time, which I suspect is partly current fashion. 'Constant time' strikes me as a euphemism for 'spoofing timing data', and in my opinion no processing algorithm should be deliberately delaying its work. I think that is working on the wrong end of the beast, especially since the need for constant-time defence against timing side-channel attacks arises in specific cases, elliptic curve signatures and HMAC for TLS/SSL and the like, where the throughput cost is low anyway: a typical SSL session performs maybe one or two signatures and verifications per second at most.*

*I am writing cryptocurrency software, and especially during replay I see no sane reason to mask the timing of processing when no other party is involved. The same goes for memory wiping, since replay performs no actual signing, only verification. I have also become aware that many of these stdlib functions bake in assumptions about the use case: constant time, as I mentioned, for HMAC and for reducing side-channel information in general, but another is memory utilisation. Some processes are very memory-bus heavy, some are very cache heavy.
The differences have stunning effects on performance when they are badly mismatched; consider that typical cache memory is 3-4x the speed of main memory, so running code optimised for filling the memory bus through functions optimised for cache locality, or vice versa, is naturally going to cause some frustrating latency issues.*

*I'll of course keep studying this area, as it touches a lot of what I'm working on right now, but I would like to hear the justifications for constant-time-by-default in the Go stdlib, because in those cases where timing exposes no potential exploit it is quite simply throttling the processing for no reason, adding latency and working against all the good things goroutines enable in this area.*

https://play.golang.org/p/lEfaDdKfKvf

package main

import (
	"fmt"
	"time"
)

var packetchan = make(chan int, 3)
var incomingchan = make(chan int, 3)
var returnchan = make(chan int, 3)

func main() {
	// Start the responders first, then run the feeding loop.
	go incomer()
	go returner()
	pusher()
}

func pusher() {
	counter := 0
	done := false
	for !done {
		if counter > 7 { // one past the bound: 6 items fully batched
			fmt.Println("stopping")
			done = true
			// Not a clean shutdown: close() is what wakes any blocked
			// receivers (they see ok == false), while the nil
			// assignments just disable further use of the variables
			// (and racing other goroutines' reads of them).
			close(incomingchan)
			incomingchan = nil
			close(packetchan)
			packetchan = nil
			close(returnchan)
			returnchan = nil
		} else {
			packetchan <- counter
			fmt.Println("OUT", counter, "-> packetchan")
			time.Sleep(time.Second)
			counter++
		}
	}
}

func incomer() {
	counter := 0
	bundled := []int{}
	for {
		select {
		case packet, ok := <-packetchan:
			if !ok {
				return // channel closed: shutting down
			}
			if counter < 3 {
				fmt.Println("INC", "packetchan <-", packet)
				bundled = append(bundled, packet)
				returnchan <- packet
				fmt.Println("INC", packet, "-> returnchan")
				counter++
			} else {
				// Announce the batch; note that in this toy version the
				// packet that triggers the announcement is not itself
				// bundled.
				fmt.Println("INC", "batch {", bundled, "}")
				counter = 0
				bundled = []int{}
			}
		}
	}
}

func returner() {
	for {
		select {
		case ret, ok := <-returnchan:
			if !ok {
				return // channel closed: shutting down
			}
			fmt.Println("RET", "returnchan <-", ret)
			incomingchan <- ret
			fmt.Println("RET", ret, "-> incomingchan")
		}
	}
}

--
You received this message because you are subscribed to the
Google Groups "golang-nuts" group. To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/d/optout.