I am currently working on a reliable UDP transport, and I am writing it 
exclusively with buffered channels.

The way it works: there are three goroutines. One accepts new input and 
forwards it to an intermediate 'incoming' worker, which checks whether the 
new data can be bundled (the data is Reed-Solomon encoded shards, of which 
3 valid pieces are required), then either bundles it or pushes it back 
through a 'return' channel, which simply loads it back into the incoming 
channel.

I actually drafted this design some time ago, but only just now tried to 
make it work, and after some hours with several dead loops appearing in 
the workers, I realised I needed to first fully understand every aspect of 
how one implements this thing.

So I wrote the simple version of a batch processor below. It just 
sequentially pushes incrementing integers into the channels, collecting 
them as it goes, and each time it gets three it announces it has a batch. 
Voila, it works.

I had a few small details to properly work out, and the code below shows 
the basic algorithm, but in brief:

1. Firstly, you always have to start the goroutines that respond to the 
channels being loaded, and only after all the goroutines are started do 
you run the main receiving loop that gets the batch items and puts them 
into the channel.

2. In the 'pusher' goroutine, notice the 'work complete' check: I observed 
that the bound has to be set one past the intended limit, here 7, when I 
want to see 6 items completely batched. If batches were atomic and a 
shutdown request happened during an operation, you would need something 
that notifies the other end that the last batch won't be processed. I note 
this mainly to highlight that the pusher goroutine is first in the 
processing scheme and the returner one step further down the track, so 
when the receiving thread is at 7, the returner is pushing item 6 back to 
incoming, where it will be bundled before the pusher shuts down.

3. Last general observation: when working with goroutines, you really need 
to keep very close track of the number of selectors, pushers and pullers 
for each channel. Pushers and pullers have to be paired, or one can be 
starved and deadlock. As for selects, for example when implementing a 
shutdown/cleanup response in a goroutine, you want a nonblocking select at 
the start to check the quit notification (that is, one with a default 
clause), and then the second select, the one that waits for work, has to 
block (no default clause). When there is only one channel being selected 
on, as in the returner, you don't want a default to unblock it, as that 
falls through to the enclosing for loop on every pass, or terminates the 
goroutine. So the empty default clause is something you would generally 
want to omit until you are sure you need it there, otherwise you get 
boring old infinite-loop problems. Note also what nilling a channel does: 
a select case on a nil channel never fires, so once the channel is nilled, 
which you can see is done as part of the shutdown in my example, the 
goroutine parks on the select forever rather than exiting. This is not a 
clean shutdown, of course, and I might be throwing out more options than 
needed to make sure I hit it, but I think all that matters is that once 
the channel is nil, the goroutine stops doing work. (A minimal sketch of 
the quit-channel pattern is below.)
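
To make that select discipline concrete, here is a minimal sketch of the 
quit-channel pattern from point 3. All the names (work, quit, done) are 
invented for the sketch; the first select is nonblocking because of its 
default clause, and the second deliberately has no default so it parks 
instead of spinning:

package main

import "fmt"

// worker illustrates the pattern from point 3: a nonblocking quit check
// (select with default) followed by a blocking select that deliberately
// omits the default clause, so the loop parks instead of spinning.
func worker(work <-chan int, quit <-chan struct{}, done chan<- struct{}) {
    defer close(done)
    for {
        select {
        case <-quit: // nonblocking check; default prevents blocking here
            return
        default:
        }
        select {
        case v := <-work: // blocks until work or quit arrives
            fmt.Println("worker got", v)
        case <-quit:
            return
        }
    }
}

func main() {
    work := make(chan int, 3)
    quit := make(chan struct{})
    done := make(chan struct{})
    go worker(work, quit, done)
    for i := 0; i < 3; i++ {
        work <- i
    }
    // Signal shutdown via the quit channel instead of closing the work
    // channel. Note the worker may exit before draining work; a real
    // shutdown would drain first.
    close(quit)
    <-done // wait for the worker to exit
}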

Just one tiny question about that last bit, more to get it clear in my 
mind. The rules, as I understand them: operations on a nil channel block 
forever, a receive on a closed channel returns the zero value immediately, 
a send on a closed channel panics, and close() on a nil channel also 
panics. So one should close a channel first and then nil it; done the 
other way around, the close itself panics, and in the meantime any 
goroutine still using the channel can hit a send-on-closed-channel panic 
or sleep forever in a deadlock. Is that right?
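
Here is a tiny program I used to convince myself of those rules (this is 
all standard channel behaviour, nothing project-specific):

package main

import "fmt"

func main() {
    ch := make(chan int, 1)
    ch <- 42
    close(ch)

    // Receiving from a closed channel never blocks: it drains any
    // buffered values, then yields the zero value with ok == false.
    v, ok := <-ch
    fmt.Println(v, ok) // 42 true
    v, ok = <-ch
    fmt.Println(v, ok) // 0 false

    // Sending on a closed channel panics (recovered here so the
    // program can continue).
    func() {
        defer func() { fmt.Println("recovered:", recover()) }()
        ch <- 1
    }()

    // A nil channel blocks forever on send and receive, and close(nil)
    // panics, which is why close-then-nil is the only workable order.
    // Demonstrated safely via a select with a default:
    var nilch chan int
    select {
    case <-nilch:
        fmt.Println("unreachable: a nil channel case never fires")
    default:
        fmt.Println("nil channel not ready; default taken")
    }
}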

Also, I might be confused about how this would work for a packet 
receiving/routing system. What I have worked out from this is that the 
channel size does not need to be greater than the batch size for the 
'pusher' (in the transport I am writing, this is the socket listener), and 
that the buffer sizes on the incoming/return channels will be subject to 
tuning, to balance the latency that comes from several channels filling up 
and blocking the sender that loads them. So I figure that if a computer 
were doing nothing but receiving, batching and relaying packets, you would 
not need any more buffer than this (as here, bundling sequential values). 
But if the input is more unpredictable, by some amount that depends partly 
on the performance characteristics of the computers and networks involved, 
then there needs to be some headroom so that several channels can fill up 
without stopping the whole show. (A rough sizing sketch follows.)
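
As a placeholder for that tuning, this is roughly how I imagine expressing 
the headroom; batchSize is from the design above, but burstFactor is a 
made-up number to be replaced by measurement:

package main

import "fmt"

// Assumed sizing rule: steady state needs about one batch of capacity,
// and bursty input gets a headroom multiplier on top. burstFactor is a
// placeholder to be tuned by testing, not a measured value.
const (
    batchSize   = 3 // Reed-Solomon shards needed per bundle
    burstFactor = 4 // headroom for several channels filling at once
)

func main() {
    incoming := make(chan int, batchSize*burstFactor)
    fmt.Println("incoming capacity:", cap(incoming))
}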

However, one characteristic of the algorithm below is that it preserves 
ordering. You can see, if you run it on play, that each batch comes out in 
the order it was sent (thus, for my transport, received). But since the 
transport will be receiving many batches of up to 9 packets each from many 
peers, I think that for this case I would need to add code that creates a 
set of buffers based on the number of peers, plus however much data turns 
out, in testing, to sit latent in the channels most of the time. In my 
transport, if it doesn't receive 3 uncorrupted packets within 250ms, it 
dumps them. That might be a bit too short, maybe it should be 400 or 
500ms, but the amount of time the transport waits for three valid shards 
to be delivered is another factor in deciding how big to make the buffers: 
a longer staleness threshold naturally means that many more of these 
batched packets can be pending at once before they either expire or are 
completed. (A sketch of that per-peer bookkeeping is below.)
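
A rough sketch of that per-peer bookkeeping, with made-up names (shard, 
pending, staleAfter) and the 250ms threshold from above; the real 
transport would key bundles on whatever identifies a Reed-Solomon batch:

package main

import (
    "fmt"
    "time"
)

// shard is a stand-in for one Reed-Solomon piece from a peer.
type shard struct {
    peer  string
    batch int
    data  int
}

// pending collects shards for one (peer, batch) key until either enough
// valid pieces arrive or the staleness deadline passes.
type pending struct {
    parts    []shard
    deadline time.Time
}

const required = 3                        // valid shards needed per bundle
const staleAfter = 250 * time.Millisecond // candidate threshold from above

func main() {
    pendings := map[string]*pending{}
    in := make(chan shard, 16)

    go func(ch chan<- shard) {
        // Simulated input: peerA completes a bundle, peerB does not.
        ch <- shard{"peerA", 1, 10}
        ch <- shard{"peerB", 1, 20}
        ch <- shard{"peerA", 1, 11}
        ch <- shard{"peerA", 1, 12}
        close(ch)
    }(in)

    tick := time.NewTicker(50 * time.Millisecond)
    defer tick.Stop()
    for {
        select {
        case s, ok := <-in:
            if !ok {
                in = nil // nil channel: this case never fires again
                continue
            }
            key := fmt.Sprintf("%s/%d", s.peer, s.batch)
            p := pendings[key]
            if p == nil {
                p = &pending{deadline: time.Now().Add(staleAfter)}
                pendings[key] = p
            }
            p.parts = append(p.parts, s)
            if len(p.parts) == required {
                fmt.Println("bundle complete:", key, p.parts)
                delete(pendings, key)
            }
        case now := <-tick.C:
            // Dump any bundle that missed its staleness deadline.
            for key, p := range pendings {
                if now.After(p.deadline) {
                    fmt.Println("dumping stale bundle:", key)
                    delete(pendings, key)
                }
            }
            if in == nil && len(pendings) == 0 {
                return // input closed and nothing left pending
            }
        }
    }
}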

Anyhow, I am still quite a beginner with channels, but like most Go 
programmers, I am quite in love with them. In this particular use case, 
one of the major benefits of the language is very clear: the moment a 
bundle of packets is gathered, it can immediately be bounced somewhere 
else or handed to another goroutine for processing. There is no polling 
period, so the latency overhead of gathering the batches together is very 
low. That is extremely beneficial here, as the batches the transport works 
on are short message packets with a very tight time-window of value, and 
Go helps a lot in minimising it.

*Just one thing though: as I have also been looking at elliptic curve 
algorithms, that brings up the subject of so-called 'constant time' 
operations. For reasons of current fashion (in my opinion), almost 
everything in the stdlib with potentially input-dependent timing on 
primitive data types is implemented in constant time. 'Constant time' is a 
terrible euphemism, in my opinion, for 'spoofing timing data', and 
seriously, no competent processing algorithm should be deliberately 
delaying processing.*
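
*For concreteness, this is the kind of thing the stdlib offers, in 
crypto/subtle: the constant-time compare examines every byte instead of 
bailing at the first mismatch. A minimal contrast:*

package main

import (
    "bytes"
    "crypto/subtle"
    "fmt"
)

func main() {
    a := []byte("expected-mac")
    b := []byte("expected-mac")

    // bytes.Equal can return as soon as a byte differs, so its running
    // time can leak how many leading bytes matched. That leak is what
    // timing side-channel attacks exploit.
    fmt.Println(bytes.Equal(a, b))

    // subtle.ConstantTimeCompare examines every byte regardless, so its
    // timing is independent of where a mismatch occurs. It returns 1
    // for equal, 0 otherwise.
    fmt.Println(subtle.ConstantTimeCompare(a, b) == 1)
}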

*I think that's working on the wrong end of the beast. It seems especially 
misguided to me given that the need for constant-time protection against 
timing side-channel attacks comes down to one family of use cases, 
elliptic curve signatures and HMAC for TLS/SSL and the like, and in those 
cases the throughput cost is very low anyway: a typical SSL session 
performs maybe one or two signatures and verifications per second at most.*

*I am writing cryptocurrency software, and especially during replay I see 
absolutely no sane reason to spoof the timing of processing when no other 
party is involved. Nor any reason for memory wiping, since replay isn't 
even performing actual signatures, only verification. There are other 
aspects I have become aware of too: many of these stdlib functions bake 
assumptions about the use case into themselves, constant time (as 
mentioned, for HMAC) and side-channel reduction in general, but also 
memory utilisation. Some processes are very memory-bus heavy, some very 
cache heavy, and the differences have stunning effects on performance when 
they are badly matched. Just consider that typical cache memory is 3-4x 
the speed of main memory; running code optimised for filling the memory 
bus through functions optimised for cache locality, or vice versa, is 
naturally going to cause some frustrating latency issues.*

*I'll of course keep studying this area, as it touches a lot of what I'm 
working on right now, but I would like to hear the justifications for 
constant-time-by-default in the Go stdlib, because in those cases where 
timing exposes no potential exploit, it is quite simply throttling the 
processing for no reason, which surely adds latency and works against all 
the good things goroutines enable in this area.*

https://play.golang.org/p/lEfaDdKfKvf

package main

import (
    "fmt"
    "time"
)

// Three buffered channels, each sized to the batch: the pusher loads
// packetchan, the incomer bundles from it, and the returner cycles items
// through returnchan back into incomingchan.
var packetchan = make(chan int, 3)
var incomingchan = make(chan int, 3)
var returnchan = make(chan int, 3)

func main() {
    // Per point 1: start the responding goroutines first, then run the
    // pusher's loop in the main goroutine.
    go incomer()
    go returner()
    pusher()
}

func pusher() {
    counter := 0
    done := false
    for !done {
        if counter > 7 {
            // The dirty shutdown from points 2 and 3: close each
            // channel, then nil it so the other goroutines' selects
            // stop firing. Note that reassigning these globals while
            // other goroutines read them is racy; it only passes as a
            // playground demo.
            fmt.Println("stopping")
            done = true
            close(incomingchan)
            incomingchan = nil
            close(packetchan)
            packetchan = nil
            close(returnchan)
            returnchan = nil
        } else {
            packetchan <- counter
            fmt.Println("OUT", counter, "-> packetchan")
            time.Sleep(time.Second)
            counter++
        }
    }
}

func incomer() {
    counter := 0
    bundled := []int{}
    done := false
    for !done {
        select {
        case packet := <-packetchan:
            if counter < 3 {
                fmt.Println("INC", "packetchan <-", packet)
                bundled = append(bundled, packet)
                returnchan <- packet
                fmt.Println("INC", packet, "-> returnchan")
                counter++
            } else {
                // Every fourth packet arrives with the bundle already
                // full; it triggers the batch announcement and is
                // itself dropped (see point 2).
                fmt.Println("INC", "-> packetchan {", bundled, "}")
                counter = 0
                bundled = []int{}
            }
        }
    }
}

func returner() {
    counter := 0
    done := false
    for !done {
        // A single-channel select with no default (point 3): it blocks
        // until returnchan delivers or is closed. Note that nothing
        // drains incomingchan in this demo, so once its buffer fills,
        // the send below blocks until the shutdown abandons it.
        select {
        case ret := <-returnchan:
            fmt.Println("RET", "returnchan <-", ret)
            incomingchan <- ret
            fmt.Println("RET", ret, "-> incomingchan")
            counter++
        }
    }
}
