GC should not impact general performance in such a scenario, but it is likely to hurt latency, which pretty much matches what you are observing. Could you provide information about the network layout, h/w, and server code used for testing? There are a lot of possible oversights that can make the results harder to interpret; it would be nice to verify those ;)
I'd have to dig in and see the data to be convinced it's the GC making the latency worse than in the C implementation, but from what I know you've got more experience with this, so who knows :)
Doing this on a real network was waaay more work than I was willing to put up with for something a colleague said during lunch break, so it was done on localhost. The h/w is a Core i7 Lenovo laptop (W530) with 8GB of RAM running Arch Linux. My server code can be found at https://github.com/atilaneves/mqtt; the C implementation has its own website: http://mosquitto.org/.
Atila
