On Wednesday 14 April 2004 22.27, John Lazzaro wrote:
> If what you mean by "operating at the ethernet level" means
> "no Cobra-like hardware to help, but putting data directly
> into Etherframes w/o IP/RTP headers", then its unclear to me that
> working at the RTP/IP level is going to hurt you much. The
> simplest implementation would have RTP/IP header overhead,
> but there are nice header compression schemes that get rid of it:
>
> http://www.ietf.org/rfc/rfc2508.txt
>
> and its improved versions. By using RTP, you get a lot of
> protocol design you might otherwise need to do,
> within RTP (like RTP MIDI) and surrounding it
> (session management, etc).

Perhaps RTP is good enough. However, at several hundred megabits per
second of throughput, it could be valuable not to have to calculate
checksums in software. I guess IP and UDP checksums are usually
calculated in software. Accessing the ethernet layer directly might also
bypass other layers that cost CPU time or add latency, and might give
better control over things like jumbo frames. RTP is not designed for
this purpose though, so it would be a bit overkill. Using RTP would also
give the impression that the solution works over a large routed network,
although it will only work on a tight, isolated ethernet carrying no
traffic but audio.

> One big thing you need to worry about are clocks -- unlike
> a protocol like AES/EBU or SPIDF, packet-based media is
> not sending an implicit clock along with the data. So, the
> nominal "sender sampling rate" can't be precisely linked to
> the nominal "receiver sampling rate" in a simple way. The
> consequence is either too much data piles up at the receiver,
> or not enough. One solution to this problem is to continuously
> running a sample-rate converter at the receiver in software,
> to keep the two sampling rates locked. See:

The sample clock is passed separately, for example with wordclock. I
don't expect to use more than one computer with a sound card in most
cases though, so then it is a non-issue. The idea is basically to take
the inputs on the sound card and broadcast them to the convolver nodes,
which convolve them and unicast the results back to the machine with the
sound card, which mixes the inputs from all nodes and puts the result
out on the sound card outputs. For WFS it could for example be
20 megabit/s of broadcast and a total of 400 megabit/s of unicast.

> A separate issue for your "many streams" case is synchronizing
> the streams to each other, in the case where not all share the
> same nominal clock. RTP has tools for this, based on
> associating NTP timestamps from a common clock to each
> independent stream, that get used for audio/video lipsync,
> and can be repurposed here as well.

RTP is designed to transport realtime data over a routed network, where
packets can be lost, re-ordered and so on. In the controlled ethernet
environment for this system there will be no packet losses (I hope), and
no reordering. It would probably work like this: the roundtrip time is
measured and tuned at startup, and then added, with some margin, to the
I/O delay in the sound card machine. That is probably quite easy.

A small challenge will be to synchronise dynamic commands, such as
filter changes, so that they take effect at the same block index on all
nodes. By specifying the block index in the command, synchronisation is
easily kept; the problem is then just to make the command reach all
nodes in time. The simple way is to accept quite high latency for the
commands, but one would want to minimise that.

/Anders
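
A minimal sketch of the block-index scheme for dynamic commands
described above, assuming each node counts audio blocks from a common
start and that a command is stamped a few blocks ahead so it can reach
every node in time. The names Command, Node, schedule and lead_blocks
are made up for illustration, and the per-node delivery delays are
invented numbers; this is not an existing implementation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Command:
    apply_at_block: int  # block index at which every node should apply it
    action: str          # hypothetical payload, e.g. "set_filter:hall_b"


@dataclass
class Node:
    name: str
    pending: list = field(default_factory=list)  # received, not yet applied
    applied: list = field(default_factory=list)  # (block index, action) log

    def receive(self, cmd: Command, current_block: int) -> bool:
        """Queue a command; reject it if its target block has already passed."""
        if cmd.apply_at_block < current_block:
            return False  # arrived too late to stay in sync
        self.pending.append(cmd)
        return True

    def process_block(self, block_index: int) -> None:
        """Apply every command stamped for this block, then process audio."""
        for cmd in self.pending:
            if cmd.apply_at_block == block_index:
                self.applied.append((block_index, cmd.action))
        self.pending = [c for c in self.pending
                        if c.apply_at_block != block_index]
        # ... convolve this block of audio here ...


def schedule(current_block: int, lead_blocks: int, action: str) -> Command:
    """Stamp a command far enough ahead that it can reach every node."""
    return Command(current_block + lead_blocks, action)


def simulate():
    nodes = [Node("a"), Node("b"), Node("c")]
    # Assumed per-node delivery delay, in blocks (made-up values).
    arrival_delay = {"a": 0, "b": 1, "c": 2}
    cmd = schedule(current_block=10, lead_blocks=3,
                   action="set_filter:hall_b")
    for block in range(10, 16):
        for node in nodes:
            if block == 10 + arrival_delay[node.name]:
                node.receive(cmd, current_block=block)
            node.process_block(block)
    return nodes
```

Here lead_blocks is the command latency: it has to cover the slowest
delivery to any node, and making it smaller is exactly the minimisation
problem mentioned above. A node that gets a command after its target
block can only reject it, since applying it late would break the
synchronisation.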