If your program is spending that much time in GC, then it is likely
spending a lot of time with the pointer write barriers enabled, which slows
execution (that is the price of letting your code run at all during GC).

What happens if you run with GODEBUG=gcstoptheworld=1? That disables
concurrent collection, so the GC should finish faster and stop slowing down
your code's execution, at the cost of some latency spikes (which I don't
think matter at all for your use case).

Consider using `sync.Pool`s to avoid so many allocations iff your objects
have an easily determinable lifetime; for example, you can reuse
`bufio.Reader`s by taking advantage of their `Reset` method.

`processFromChannel` writes to the ack channel for every file; is that
really necessary? I would instead send a single ack per worker (after the
for loop ends), which avoids the per-file communication overhead. Also,
ackChannel is unbuffered, so every send blocks until the receiver is ready,
which makes that communication even more expensive. You could rewrite it to
use a `sync.WaitGroup`, which I think yields more readable code, but I don't
think that's part of the performance problem (provided you change it to send
only one ack per worker).

On Sun, Oct 30, 2016 at 6:37 AM, Florian Weimer <f...@deneb.enyo.de> wrote:

> * John Morrice:
>
> > Thought I'd be kind and illustrate your bug vs my solution with code.  I
> > use WaitGroups in both examples to keep things clearer.
> >
> > Your bug:
> > https://play.golang.org/p/yoNPmbXnlW
> >
> > I.e.
> >
> > Ringo wrote Yellow Submarine
> > Ringo wrote I am the Walrus
> > Ringo wrote Eleanor Rigby
> > Ringo wrote Come Together
>
> This is misleading because there is too little work involved.
>
> With a larger work queue, I get a distribution which is pretty even:
>
> sum: 41385055007
> sum: 40815802419
> sum: 41898216012
> sum: 41672900827
> sum: 41802902931
> sum: 40815422395
> sum: 42163589241
> sum: 41552235636
> sum: 41989489360
> sum: 42218461137
> sum: 41673625671
> sum: 42011799364
> total: 499999500000
>       (499999500000)
>
> You are right that the specification permits that a receive operation
> drains an arbitrary number of values from the channel into a
> goroutine-local buffer (and draining more than one value at a time may
> be beneficial for performance reasons).  But I don't think the current
> implementation does this.
>
> I haven't bothered to convert this example to the
> one-task-per-goroutine style because creating one goroutine per task
> is too wasteful for such small tasks.
>
> package main
>
> import (
>         "fmt"
> )
>
> func summation(source chan int, result chan int) {
>         sum := 0
>         for i := range source {
>                 sum += i
>         }
>         result <- sum
> }
>
> func main() {
>         integerCount := 1000000
>         integers := make(chan int, integerCount)
>         for i := 0; i < integerCount; i++ {
>                 integers <- i
>         }
>         close(integers)
>
>         threads := 12
>         result := make(chan int)
>
>         for i := 0; i < threads; i++ {
>                 go summation(integers, result)
>         }
>
>         total := 0
>         for i := 0; i < threads; i++ {
>                 sum := <-result
>                 fmt.Printf("sum: %d\n", sum)
>                 total += sum
>         }
>         fmt.Printf("total: %d\n", total)
>         fmt.Printf("      (%d)\n",
>                 (integerCount - 1) * integerCount / 2)
> }
>
> --
> You received this message because you are subscribed to the Google Groups
> "golang-nuts" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to golang-nuts+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>
