Hi there!

TL;DR: a simple test program appears to show that Go's (*os.File).Write is 
more than 10x slower than C's fputs (on macOS).

While doing some benchmarking for my Lua implementation in Go [1], I found 
very big differences between C Lua and golua for benchmarks that do a lot 
of output to stdout.  Using pprof, I found that my implementation spends a 
lot of its time in syscall (a sketch of how I collect such a profile is 
included after the Go listing below).  I couldn't see an obvious reason 
why, so I decided to make a minimal example.  It is a program that writes 
the string "Hello There" ten million times to stdout:

-------- test.go --------
package main

import "os"

func main() {
    hello := []byte("Hello There\n") // convert to []byte once, outside the loop, to keep the comparison fair
    for i := 0; i < 10000000; i++ {
        os.Stdout.Write(hello)
    }
}
-------- /test.go --------
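
For reference, this is roughly how I collect a CPU profile for a small 
program like this.  It is only a sketch: the cpu.prof file name is 
arbitrary, and it is not the exact harness I use for golua.

-------- test_pprof.go --------
package main

import (
    "os"
    "runtime/pprof"
)

func main() {
    // Write a CPU profile to cpu.prof (arbitrary file name) so it can be
    // inspected afterwards with `go tool pprof cpu.prof`.
    f, err := os.Create("cpu.prof")
    if err != nil {
        panic(err)
    }
    if err := pprof.StartCPUProfile(f); err != nil {
        panic(err)
    }
    defer pprof.StopCPUProfile()

    hello := []byte("Hello There\n")
    for i := 0; i < 10000000; i++ {
        os.Stdout.Write(hello)
    }
}
-------- /test_pprof.go --------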

To compare with, here is what I think is the equivalent program in C:

-------- test.c --------
#include <stdio.h>

int main() {
    for (int i = 0; i < 10000000; i++) {
        fputs("Hello There\n", stdout);
    }
    return 0;
}
-------- /test.c --------

I compared the two programs with multitime [2], building the Go version 
with both go 1.15.6 and the go 1.16 beta1 release, following the steps 
below (I use gvm to switch between Go versions).

- Compile the Go version with go 1.15.6 and go 1.16beta1, and the C version 
with clang.

$ gvm use go1.16beta1
Now using version go1.16beta1
$ go version && go build -o test-go1.16 test.go
go version go1.16beta1 darwin/amd64

$ gvm use go1.15.6
Now using version go1.15.6
$ go version && go build -o test-go1.15 test.go
go version go1.15.6 darwin/amd64

$ clang -o test-c test.c

- Check that the C version and the Go version output the same amount of 
data to stdout:

$ ./test-c | wc -c
 120000000
$ ./test-go1.15 | wc -c
 120000000

- Run each executable 5 times

$ cat >cmds <<EOF
> -q ./test-c
> -q ./test-go1.15
> -q ./test-go1.16
> EOF
$ multitime -b cmds -n 5
===> multitime results
1: -q ./test-c
            Mean        Std.Dev.    Min         Median      Max
real        0.524       0.070       0.476       0.492       0.662       
user        0.475       0.011       0.465       0.472       0.495       
sys         0.011       0.002       0.009       0.011       0.014       

2: -q ./test-go1.15
            Mean        Std.Dev.    Min         Median      Max
real        5.986       0.125       5.861       5.947       6.186       
user        3.717       0.040       3.677       3.715       3.788       
sys         2.262       0.034       2.221       2.260       2.314       

3: -q ./test-go1.16
            Mean        Std.Dev.    Min         Median      Max
real        5.958       0.160       5.781       5.941       6.213       
user        3.706       0.094       3.624       3.638       3.855       
sys         2.258       0.069       2.200       2.215       2.373       

There is no significant difference between 1.15 and 1.16, but both are more 
than 10 times slower than the C version.  Why is that?  Is there something 
I can do to overcome this performance penalty?  Any insights would be 
appreciated.
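
If the difference comes from stdio buffering fputs output in userspace 
while each os.Stdout.Write turns into its own write syscall (that is only 
my current guess, I haven't confirmed it), then a buffered variant along 
these lines might help.  This is just a sketch that I haven't benchmarked 
yet:

-------- test_buffered.go --------
package main

import (
    "bufio"
    "os"
)

func main() {
    // Wrap stdout in a bufio.Writer so that many small writes are coalesced
    // into fewer, larger write syscalls (assuming that is where the time goes).
    w := bufio.NewWriter(os.Stdout)
    defer w.Flush() // flush whatever is left in the buffer before exiting

    hello := []byte("Hello There\n")
    for i := 0; i < 10000000; i++ {
        w.Write(hello)
    }
}
-------- /test_buffered.go --------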

FWIW, I am running these on macOS Catalina:
$ uname -v
Darwin Kernel Version 19.6.0: Thu Oct 29 22:56:45 PDT 2020; 
root:xnu-6153.141.2.2~1/RELEASE_X86_64

(sorry I haven't got easy access to a Linux box to run this on).

-- 
Arnaud Delobelle

[1] https://github.com/arnodel/golua
[2] https://tratt.net/laurie/src/multitime/multitime.1.html
