Re: [go-nuts] pprof: what are runtime.scanobject/gentraceback/pcvalue/etc

2016-09-21 Thread mhhcbon
... Sorry for the publishing issues..


// BytesByLine emits chunks of []byte splitted by line (\n).
func BytesByLine() *Transform {
  var buf []byte
  var EOL = []byte("\n")

  byLine := func () [][]byte {
ret := make([][]byte, 0)
lines := bytes.Split(buf, EOL)
isEOL := len(buf)>0 && buf[len(buf)-1:][0]==EOL[0]
for i, line := range lines {
  if i==len(lines)-1 {
if isEOL && len(line)>0 {
  buf=buf[:0]
  ret = append(ret, line)
} else {
  buf = line
}
  } else {
ret = append(ret, line)
  }
}
return ret
  }

  var TT TransformFunc
  TT = func (s Stream, data interface{}) {
if val, ok := data.([]byte); ok {
  buf = append(buf, val...)
  for _, line := range byLine() {
s.Push(line)
  }
} else {
  s.Emit(errors.New("Value must be of type []byte"))
}
  }

  var FF FlushFunc
  FF = func (s Stream) {
for _, line := range byLine() {
  s.Push(line)
}
if len(buf)>0 {
  s.Push(buf)
}
  }

  return NewTransformFlush(TT, FF)
}

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [go-nuts] pprof: what are runtime.scanobject/gentraceback/pcvalue/etc

2016-09-21 Thread mhhcbon
Hi,

thanks again for all the explanations.

I kept digging, and now i m having the same behavior as this thread

https://groups.google.com/forum/#!topic/golang-nuts/P1EFc4kMBfo

I run test with memprofile enabled, then use pprof to see the results, but 
the numbers are all 0.

$ go test -v -bench=. -cpuprofile=cpu.out -memprofile mem.out -benchmem -
short stream/*
=== RUN   TestReader
--- PASS: TestReader (0.00s)
=== RUN   TestReader2
--- PASS: TestReader2 (0.00s)
BenchmarkByLine-4 10 14654 ns/op2248 B/op   
   39 allocs/op
PASS
ok  command-line-arguments1.635s


$ go tool pprof stream.test mem.out 
Entering interactive mode (type "help" for commands)
(pprof) top10
0 of 0 total (0%)
  flat  flat%   sum%cum   cum%
 0 0% 0%  0 0%  bytes.(*Buffer).WriteString
 0 0% 0%  0 0%  bytes.(*Buffer).grow
 0 0% 0%  0 0%  bytes.Split
 0 0% 0%  0 0%  bytes.genSplit
 0 0% 0%  0 0%  bytes.makeSlice
 0 0% 0%  0 0%  command-line-arguments.(*
Readable).Close
 0 0% 0%  0 0%  command-line-arguments.(*
Readable).Pipe
 0 0% 0%  0 0%  command-line-arguments.(*
Readable).Push
 0 0% 0%  0 0%  command-line-arguments.(*
Readable).Read
 0 0% 0%  0 0%  command-line-arguments.(*
Readable).ReadClose


I tried to put b.ReportAllocs() in front of the benchmark loop. It did not 
help.

At some points I added variations in the benchmark

$ go test -v -bench=. -cpuprofile=cpu.out -memprofile mem.out -benchmem -
short stream/*
=== RUN   TestReader
--- PASS: TestReader (0.00s)
=== RUN   TestReader2
--- PASS: TestReader2 (0.00s)
BenchmarkReader1-4 50  2337 ns/op 976 B/op 
 12 allocs/op
BenchmarkByLine-4  10 16560 ns/op2248 B/op 
 39 allocs/op
BenchmarkByLine2-4100  2113 ns/op 848 B/op 
 13 allocs/op
PASS
ok  command-line-arguments5.153s


which gives

$ go tool pprof stream.test mem.out
Entering interactive mode (type "help" for commands)
(pprof) top50
512.20kB of 512.20kB total (  100%)
Dropped 29 nodes (cum <= 2.56kB)
  flat  flat%   sum%cum   cum%
  512.20kB   100%   100%   512.20kB   100%  runtime.malg
 0 0%   100%   512.20kB   100%  runtime.allocm
 0 0%   100%   512.20kB   100%  runtime.newm
 0 0%   100%   512.20kB   100%  runtime.startTheWorldWithSema
 0 0%   100%   512.20kB   100%  runtime.systemstack


$ go tool pprof -inuse_objects stream.test mem.out
Entering interactive mode (type "help" for commands)
(pprof) top100
1260 of 1260 total (  100%)
Dropped 29 nodes (cum <= 6)
  flat  flat%   sum%cum   cum%
  1260   100%   100%   1260   100%  runtime.malg
 0 0%   100%   1260   100%  runtime.allocm
 0 0%   100%   1260   100%  runtime.newm
 0 0%   100%   1260   100%  runtime.startTheWorldWithSema
 0 0%   100%   1260   100%  runtime.systemstack


I m unclear because the tested func is definitely allocating,

// BytesByLine emits chunks of []byte split by line (\n).
func BytesByLine() *Transform {
  var buf []byte
  var EOL = []byte("\n")

  byLine := func () [][]byte {
ret := make([][]byte, 0)
lines := bytes.Split(buf, EOL)
isEOL := len(buf)>0 && buf[len(buf)-1:][0]==EOL[0]
for i, line := range lines {
  if i==len(lines)-1 {
if isEOL && len(line)>0 {
  buf=buf[:0]
  ret = append(ret, line)
} else {
  buf = line
}
  } else {
ret = append(ret, line)
  }
}
return ret
  }

  var TT TransformFunc
  TT = func (s Stream, data interface{}) {
if val, ok := data.([]byte); ok {
  buf = append(buf, val...)
  for _, line := range byLine() {
s.Push(line)
  }
} else {
  s.Emit(errors.New("Value must be of type []byte"))
}
  }

  var FF FlushFunc
  FF = func (s Stream) {
for _, line := range byLine() {
  s.Push(line)
}
if len(buf)>0 {
  s.Push(buf)
}
  }

  return NewTransformFlush(TT, FF)
}



Le mardi 20 septembre 2016 18:37:55 UTC+2, Ian Lance Taylor a écrit :
>
> On Tue, Sep 20, 2016 at 8:17 AM, mhhcbon  > wrote: 
> > 
> > on my way to learn to use pprof, i now face some intriguing results with 
> > lots of runtime.* calls, 
> > 
> > can you help me to understand what those call are and how i should care 
> > about ? 
> > 
> > (pprof) top30 
> > 8720ms of 14380ms total (60.64%) 
> > Dropped 118 nodes (cum <= 71.90ms) 
> > Showing top 30 nodes out of 143 (cum >= 3800ms) 
> >   flat  flat%   sum%cum   cum% 
> >  720ms  5.01%  5.01% 1050ms  7.30%  runtime.scanobject 
> >  700ms  4.87%  9.87% 3640m

Re: [go-nuts] pprof: what are runtime.scanobject/gentraceback/pcvalue/etc

2016-09-20 Thread Ian Lance Taylor
On Tue, Sep 20, 2016 at 8:17 AM, mhhcbon  wrote:
>
> on my way to learn to use pprof, i now face some intriguing results with
> lots of runtime.* calls,
>
> can you help me to understand what those call are and how i should care
> about ?
>
> (pprof) top30
> 8720ms of 14380ms total (60.64%)
> Dropped 118 nodes (cum <= 71.90ms)
> Showing top 30 nodes out of 143 (cum >= 3800ms)
>   flat  flat%   sum%cum   cum%
>  720ms  5.01%  5.01% 1050ms  7.30%  runtime.scanobject
>  700ms  4.87%  9.87% 3640ms 25.31%  runtime.gentraceback
>  670ms  4.66% 14.53% 1780ms 12.38%  runtime.pcvalue
>  490ms  3.41% 17.94%  490ms  3.41%  runtime.readvarint
>  420ms  2.92% 20.86%  910ms  6.33%  runtime.step
>  420ms  2.92% 23.78%  420ms  2.92%  runtime/internal/atomic.Xchg
>  380ms  2.64% 26.43%  380ms  2.64%  runtime.heapBitsForObject
>  340ms  2.36% 28.79% 1580ms 10.99%  runtime.mallocgc
>  310ms  2.16% 30.95%  360ms  2.50%  runtime.gopark
>  270ms  1.88% 32.82%  270ms  1.88%  runtime.heapBitsSetType
>  260ms  1.81% 34.63%  510ms  3.55%  runtime.scanblock
>  250ms  1.74% 36.37%  250ms  1.74%  runtime/internal/atomic.Cas
>  240ms  1.67% 38.04% 4040ms 28.09%
> github.com/mh-cbon/chan/stream.(*Transform).Push
>  240ms  1.67% 39.71% 1210ms  8.41%  runtime.schedule
>  230ms  1.60% 41.31%  230ms  1.60%  runtime.newdefer
>  220ms  1.53% 42.84%  320ms  2.23%  runtime.deferreturn
>  210ms  1.46% 44.30%  260ms  1.81%  bytes.genSplit
>  210ms  1.46% 45.76%  260ms  1.81%  runtime.greyobject
>  210ms  1.46% 47.22% 6080ms 42.28%  runtime.systemstack
>  200ms  1.39% 48.61%  210ms  1.46%  runtime.acquireSudog
>  200ms  1.39% 50.00% 1760ms 12.24%  runtime.scanframeworker
>  190ms  1.32% 51.32%  790ms  5.49%  sync.(*Mutex).Lock
>  180ms  1.25% 52.57%  180ms  1.25%  runtime.gogo
>  180ms  1.25% 53.82%  180ms  1.25%  runtime.memmove
>  170ms  1.18% 55.01%  790ms  5.49%  runtime.chanrecv
>  170ms  1.18% 56.19%  530ms  3.69%  runtime.lock
>  170ms  1.18% 57.37%  170ms  1.18%  runtime.usleep
>  160ms  1.11% 58.48%  220ms  1.53%  runtime.siftdownTimer
>  160ms  1.11% 59.60%  160ms  1.11%  runtime/internal/atomic.Load
>  150ms  1.04% 60.64% 3800ms 26.43%  main.main.func2
>
> Also, I m not totally comfortable with flat Vs cum meanings,
> and those basics are not covered in
> https://blog.golang.org/profiling-go-programs
>
> Can you take a moment to explain those two notions ?

All the top functions in this profile are being run by the garbage collector.

The "flat" numbers are the time used by just that function.  The "cum"
numbers are the time used by that function and all the functions that
it calls.


> One more Q. In this pprof output
>
> (pprof) list Push
> Total: 14.38s
> ROUTINE 
> github.com/mh-cbon/chan/stream.(*Readable).Push in
> /home/mh-cbon/gow/src/github.com/mh-cbon/chan/stream/reader.go
>  0  1.65s (flat, cum) 11.47% of Total
>  .  . 79:func (s *Readable) Push(b interface{}) {
>  .  . 80:  for s.Paused() {
>  .  . 81:<- time.After(time.Millisecond * 1)
>  .  . 82:  }
>  .  . 83:  s.writeMutex.Lock()
>  .   10ms 84:  defer s.writeMutex.Unlock()
>  .  . 85:  for _, o := range s.Pipes {
>  .  1.64s 86:o.Write(b)
>  .  . 87:  }
>  .  . 88:}
>  .  . 89:func (s *Readable) Catch(fn ErrorFunc) Stream {
>  .  . 90:  s.E = append(s.E, fn)
>  .  . 91:  return s
> ROUTINE 
> github.com/mh-cbon/chan/stream.(*Transform).Push in
> /home/mh-cbon/gow/src/github.com/mh-cbon/chan/stream/transform.go
>  240ms  4.13s (flat, cum) 28.72% of Total
>  .  . 72:  } else {
>  .  . 73:s.Push(b)
>  .  . 74:  }
>  .  . 75:}
>  .  . 76:// Push writes data down in the stream.
>   40ms   40ms 77:func (s *Transform) Push(b interface{}) {
>  .  . 78:  // s.pipeMutex.Lock()
>  .  . 79:  // defer s.pipeMutex.Unlock()
>  190ms  190ms 80:  for _, o := range s.Pipes {
>   10ms  3.90s 81:o.Write(b)
>  .  . 82:  }
>  .  . 83:}
>  .  . 84:func (s *Transform) Catch(fn ErrorFunc) Stream
> {
>  .  . 85:  s.E = append(s.E, fn)
>  .  . 86:  return s
>
>
>
> The tow methods are very similar, but Transform.Push shows 190ms for the for
> loop, where Readable.Push does not show anything.
> How to understand this ?

Without more information, I wo