Ping for review

On Thu, Dec 24, 2015 at 07:01:52PM +0800, [email protected] wrote:
> Date: Thu, 24 Dec 2015 19:01:52 +0800
> From: [email protected]
> To: [email protected]
> Subject: [PATCH 00/18] Enable profiling by line number.
> X-Mailer: git-send-email 1.7.9.5
>
> From: Junyan He <[email protected]>
>
> This patch set lets the user specify the line numbers in the source
> code at which to insert profiling watch points.
> As a first step, we just use the env var OCL_PROFILING_LINES to control
> the kernel name and line numbers. The format is:
>   KERNEL_NAME:PROFILING_MODE:LINE_NUMBER0,LINE_NUMBER1,LINE_NUMBER2,...
> for example:
>   export OCL_PROFILING_LINES="builtin_atanpi_float8:2:2,6,7,8,15"
> will insert watch points at lines 2, 6, 7, 8 and 15 in the kernel named
> builtin_atanpi_float8.
> We have 3 PROFILING_MODE levels:
> level 1: just a brief timestamp with line number.
>   Total log number is 6
>   Line   2:  Timestamp:   190   Thread Exec:6
>   Line   6:  Timestamp:  1174   Thread Exec:6
>   Line   7:  Timestamp:  3092   Thread Exec:6
>   Line   8:  Timestamp:  3105   Thread Exec:6
>   Line  15:  Timestamp:  3241   Thread Exec:6
>
> level 2: timestamp plus source:
> Format: Average Timestamp   Exec number   Source
>                         | __kernel void builtin_atanpi_float8(__global float *dst, __global float *src1, __global int *vector) {
> TS:  190  Num: 6  ----> |   int i = get_global_id(0);
>                         |   float8 x1 = (float8) (src1[i * (*vector) + 0],src1[i * (*vector) + 1],src1[i * (*vector) + 2],src1[i * (*vector) + 3],src1[i * (*vector) + 4],src1[i * (*vector) + 5],src1[i * (*vector) + 6],src1[i * (*vector) + 7]);
>                         |
>                         |   float8 ret;
> TS: 1174  Num: 6  ----> |   ret = atanpi(x1);
> TS: 3092  Num: 6  ----> |   dst[i * (*vector) + 0] = ret[0];
> TS: 3105  Num: 6  ----> |   dst[i * (*vector) + 1] = ret[1];
>                         |   dst[i * (*vector) + 2] = ret[2];
>                         |   dst[i * (*vector) + 3] = ret[3];
>                         |   dst[i * (*vector) + 4] = ret[4];
>                         |   dst[i * (*vector) + 5] = ret[5];
>                         |   dst[i * (*vector) + 6] = ret[6];
>                         |   dst[i * (*vector) + 7] = ret[7];
> TS: 3241  Num: 6  ----> | };
>
>
> level 3: output the detailed logs, with all logs printed as:
> ------------------------ Log 0 -----------------------
> | fix functions id: 7  simd: 16  kernel id: 0         |
> | thread id: 0  EU id: 8  sub slice id: 0  slice id 0 |
> | dispatch Mask: 1  prolog: 6860  epilog: 19548       |
> | globalX: 3~ 3  globalY: 0~ 0  globalZ: 0~ 0         |
> | ts0 : 201       | ts1 : 1180      | ts2 : 12417     |
> | ts3 : 12430     | ts4 : 12637     | ts5 : 0         |
> | ts6 : 0         | ts7 : 0         | ts8 : 0         |
> | ts9 : 0         | ts10: 0         | ts11: 0         |
> | ts12: 0         | ts13: 0         | ts14: 0         |
> | ts15: 0         | ts16: 0         | ts17: 0         |
> | ts18: 0         | ts19: 0         |                 |
> ------------------------ Log 1 -----------------------
> | fix functions id: 7  simd: 16  kernel id: 0         |
> | thread id: 0  EU id: 8  sub slice id: 1  slice id 0 |
> | dispatch Mask: 1  prolog: 6877  epilog: 19569       |
> | globalX: 4~ 4  globalY: 0~ 0  globalZ: 0~ 0         |
> | ts0 : 209       | ts1 : 1190      | ts2 : 12423     |
> | ts3 : 12436     | ts4 : 12643     | ts5 : 0         |
> | ts6 : 0         | ts7 : 0         | ts8 : 0         |
> | ts9 : 0         | ts10: 0         | ts11: 0         |
> | ts12: 0         | ts13: 0         | ts14: 0         |
> | ts15: 0         | ts16: 0         | ts17: 0         |
> | ts18: 0         | ts19: 0         |                 |
> .....
> .....
>
>
>
> Some problems:
> 1. On BDW, the timestamp sometimes gives an invalid huge value.
>    It may be a HW issue or feature; we need to check it further.
> 2. Sometimes the line number of an instruction differs from the
>    source code. This is caused by optimization, and we can notice
>    and analyse it via Gen IR or ASM. I will send a patch to set
>    the optimization level later.
> 3. Some line numbers are missing when there are lots of inline
>    function calls and macros. I will investigate llvm and clang to
>    find a better solution.
>
> I will write a detailed doc about the profiling later.
>
> With this patch set, please just ignore the previous two patchsets
> about debug info.
>
>
> Signed-off-by: Junyan He <[email protected]>
> ---
>  backend/src/backend/gen_context.cpp               |  20 +-
>  backend/src/backend/gen_encoder.cpp               |   9 +-
>  backend/src/backend/gen_insn_selection.cpp        |  10 +-
>  backend/src/backend/gen_insn_selection_output.cpp |   3 +
>  backend/src/backend/gen_program.cpp               |  11 +-
>  backend/src/backend/gen_register.hpp              |   4 +-
>  backend/src/backend/program.cpp                   |  34 ++--
>  backend/src/backend/program.h                     |   3 +-
>  backend/src/backend/program.hpp                   |   2 +-
>  backend/src/ir/context.cpp                        |   4 +-
>  backend/src/ir/function.cpp                       |   6 +-
>  backend/src/ir/function.hpp                       |  10 +
>  backend/src/ir/instruction.cpp                    |  11 +-
>  backend/src/ir/instruction.hpp                    |   3 +
>  backend/src/ir/lowering.cpp                       |  15 +-
>  backend/src/ir/profiling.cpp                      |  72 +++++++-
>  backend/src/ir/profiling.hpp                      |  35 +++-
>  backend/src/ir/structurizer.cpp                   |  55 ++++--
>  backend/src/ir/structurizer.hpp                   |   6 +-
>  backend/src/ir/unit.cpp                           |   9 +-
>  backend/src/ir/unit.hpp                           |  12 +-
>  backend/src/llvm/ExpandConstantExpr.cpp           |   6 +-
>  backend/src/llvm/ExpandLargeIntegers.cpp          | 173 +++++++++--------
>  backend/src/llvm/llvm_gen_backend.cpp             |  11 +-
>  backend/src/llvm/llvm_gen_backend.hpp             |  11 +-
>  backend/src/llvm/llvm_intrinsic_lowering.cpp      |   8 +-
>  backend/src/llvm/llvm_passes.cpp                  |  13 +-
>  backend/src/llvm/llvm_profiling.cpp               | 215 +++++++++++++++++-----
>  backend/src/llvm/llvm_sampler_fix.cpp             |   7 +
>  backend/src/llvm/llvm_scalarize.cpp               |   5 +-
>  backend/src/llvm/llvm_to_gen.cpp                  |   4 +-
>  backend/src/llvm/llvm_to_gen.hpp                  |   2 +-
>  src/cl_program.c                                  |   3 +-
>  33 files changed, 535 insertions(+), 257 deletions(-)
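
For anyone scripting around the OCL_PROFILING_LINES format quoted above
(KERNEL_NAME:PROFILING_MODE:LINE_NUMBER0,LINE_NUMBER1,...), here is a
minimal sketch of how such a string could be split. It is not the parser
from the patch set: ProfilingSpec, parsePositiveInt, parseProfilingLines
and profilingSpecFromEnv are names invented for illustration, the code
assumes C++17, and it assumes the mode is limited to 1..3 as the cover
letter describes.

```cpp
#include <cstdlib>
#include <optional>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical holder for one parsed OCL_PROFILING_LINES value.
struct ProfilingSpec {
  std::string kernelName;  // e.g. "builtin_atanpi_float8"
  int mode = 0;            // PROFILING_MODE, assumed to be 1, 2 or 3
  std::vector<int> lines;  // source line numbers to instrument
};

// Parse a strictly positive decimal integer made of digits only.
static std::optional<int> parsePositiveInt(const std::string &s) {
  if (s.empty()) return std::nullopt;
  int value = 0;
  for (char c : s) {
    if (c < '0' || c > '9') return std::nullopt;
    if (value > 100000000) return std::nullopt;  // crude overflow guard
    value = value * 10 + (c - '0');
  }
  if (value <= 0) return std::nullopt;
  return value;
}

// Split "KERNEL:MODE:L0,L1,..." into its three fields.
// Returns std::nullopt for anything that does not match that shape.
std::optional<ProfilingSpec> parseProfilingLines(const std::string &text) {
  const std::size_t first = text.find(':');
  if (first == std::string::npos || first == 0) return std::nullopt;
  const std::size_t second = text.find(':', first + 1);
  if (second == std::string::npos) return std::nullopt;

  ProfilingSpec spec;
  spec.kernelName = text.substr(0, first);

  const auto mode = parsePositiveInt(text.substr(first + 1, second - first - 1));
  if (!mode || *mode > 3) return std::nullopt;
  spec.mode = *mode;

  // The remainder is a comma-separated list of line numbers.
  std::istringstream list(text.substr(second + 1));
  std::string item;
  while (std::getline(list, item, ',')) {
    const auto line = parsePositiveInt(item);
    if (!line) return std::nullopt;
    spec.lines.push_back(*line);
  }
  if (spec.lines.empty()) return std::nullopt;
  return spec;
}

// Convenience wrapper that reads the environment variable named in the
// cover letter; returns std::nullopt when it is unset or malformed.
std::optional<ProfilingSpec> profilingSpecFromEnv() {
  const char *value = std::getenv("OCL_PROFILING_LINES");
  if (value == nullptr) return std::nullopt;
  return parseProfilingLines(value);
}
```

With the example from the cover letter,
parseProfilingLines("builtin_atanpi_float8:2:2,6,7,8,15") yields kernel
name builtin_atanpi_float8, mode 2 and lines 2, 6, 7, 8, 15. Returning
std::optional keeps "unset" and "malformed" from being confused with a
valid empty request; how the real patches handle bad input is not stated
in the cover letter.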
