> Once you have some proper benchmarks, it might be fun to compare GoAWK's
> performance to that of my awk package <https://github.com/spakin/awk>.

I'm not going to do thorough benchmarks at this point, but it looks like GoAWK is significantly faster at present. Using the example in the https://github.com/spakin/awk README, which is equivalent to this AWK script:

    BEGIN { FS = OFS = "," }
    { $3 = $1+$2; print }

On a file with 1M lines of random numbers, with the example as is (no stdout buffering), GoAWK takes about 1.1 seconds and spakin/awk takes 36 seconds! However, most of this is due to the unbuffered writes to os.Stdout. GoAWK automatically wraps os.Stdout in a bufio.Writer (though I'd forgotten to do this at first as well). When I added this line (before s.Run):

    s.Output = bufio.NewWriterSize(os.Stdout, 64*1024)

it sped up spakin/awk by a factor of about 10, to 3.6 seconds. So GoAWK is about 3x as fast for this simple (but not unrealistic) benchmark.

I generated the 1M-line random file using this Python script (guess I should have used AWK :-):

    import random, sys

    for _ in range(int(sys.argv[1])):
        n = random.randrange(1000000)
        m = random.randrange(1000000)
        print('%d,%d' % (n, m))

So my main suggestion (for spakin/awk) would be to wrap os.Stdout with bufio.NewWriter by default (and be sure to call Flush before Run finishes). If the user wants to pass an unbuffered writer, they still can, but at least the default is performant.

I also added CPU profiling to the spakin/awk script, and it looks like it's doing a bunch more garbage collection than GoAWK, as well as some regexp stuff. I suspect NewValue() is probably quite slow, as it takes an interface{} and does type checking. Also, strings are converted to numbers using a regex, which is probably slower than a dedicated conversion/check function (see parseFloatPrefix in goawk/interp/value.go).

See more optimization ideas in my post at https://benhoyt.com/writings/goawk/

-Ben

On Thu, Nov 22, 2018 at 11:24 PM Tong Sun <suntong...@gmail.com> wrote:

> On Tuesday, August 28, 2018 at 9:06:22 AM UTC-4, Ben Hoyt wrote:
>>
>>> Once you have some proper benchmarks, it might be fun to compare GoAWK's
>>> performance to that of my awk package <https://github.com/spakin/awk>.
>>
>> Nice -- will do!
>
> Please post back when you've done that.
>
> I'm interested to know. Thx.
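
For anyone who wants to see the effect of the buffering suggestion in isolation, here is a minimal, self-contained sketch. It does not use spakin/awk or GoAWK at all: countingWriter, writeSums and writeSumsBuffered are hypothetical names made up for this example, and the counter only stands in for write calls that would otherwise hit os.Stdout. The 64 KB buffer size is the one from the line quoted above.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"os"
)

// countingWriter is a hypothetical stand-in for os.Stdout. It discards the
// data but records how many Write calls and how many bytes reach it.
type countingWriter struct {
	writes int
	bytes  int
}

func (c *countingWriter) Write(p []byte) (int, error) {
	c.writes++
	c.bytes += len(p)
	return len(p), nil
}

// writeSums emits n lines of the form "a,b,a+b", one Fprintf per line,
// loosely imitating the per-record "print" of the AWK script.
func writeSums(out io.Writer, n int) error {
	for i := 0; i < n; i++ {
		if _, err := fmt.Fprintf(out, "%d,%d,%d\n", i, i, i+i); err != nil {
			return err
		}
	}
	return nil
}

// writeSumsBuffered does the same work through a 64 KB bufio.Writer and
// flushes before returning, so no output is left sitting in the buffer.
func writeSumsBuffered(out io.Writer, n int) error {
	buf := bufio.NewWriterSize(out, 64*1024)
	if err := writeSums(buf, n); err != nil {
		return err
	}
	return buf.Flush()
}

func main() {
	direct := &countingWriter{}
	buffered := &countingWriter{}

	if err := writeSums(direct, 1000); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	if err := writeSumsBuffered(buffered, 1000); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}

	fmt.Printf("direct:   %d writes, %d bytes\n", direct.writes, direct.bytes)
	fmt.Printf("buffered: %d writes, %d bytes\n", buffered.writes, buffered.bytes)
}
```

With 1000 short lines the output fits in the buffer, so the buffered path should reach the underlying writer once (at Flush) while the direct path reaches it once per line. On a real os.Stdout each of those writes is a system call, which is presumably where most of the 36 seconds went.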

-- 
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.