Public bug reported:
Description: Ubuntu 18.04 LTS
Release: 18.04
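Expected behavior: nvprof prints a profiling summary when run as a regular user
Actual behavior: nvprof reports "Internal profiling error 4168:999" and "CUDA profiling error" unless it is run with sudo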
Reproduce as follows:
cd NVIDIA_CUDA-9.1_Samples/0_Simple/matrixMul
nvcc -I ../../common/inc matrixMul.cu -o matrixMul
# check the exe works
./matrixMul
[Matrix Multiply Using CUDA] - Starting...
GPU Device 0: "GeForce GTX 1080" with compute capability 6.1
MatrixA(320,320), MatrixB(640,320)
Computing result using CUDA Kernel...
done
Performance= 1137.23 GFlop/s, Time= 0.115 msec, Size= 131072000 Ops,
WorkgroupSize= 1024 threads/block
Checking computed result for correctness: Result = PASS
NOTE: The CUDA Samples are not meant for performance measurements.
Results may vary when GPU Boost is enabled.
# now try nvprof
nvprof ./matrixMul
[Matrix Multiply Using CUDA] - Starting...
==4775== NVPROF is profiling process 4775, command: ./matrixMul
GPU Device 0: "GeForce GTX 1080" with compute capability 6.1
MatrixA(320,320), MatrixB(640,320)
Computing result using CUDA Kernel...
done
==4775== Error: Internal profiling error 4168:999.
Performance= 1130.40 GFlop/s, Time= 0.116 msec, Size= 131072000 Ops,
WorkgroupSize= 1024 threads/block
Checking computed result for correctness: Result = PASS
NOTE: The CUDA Samples are not meant for performance measurements. Results may
vary when GPU Boost is enabled.
======== Error: CUDA profiling error.
# run with sudo
sudo nvprof ./matrixMul
[Matrix Multiply Using CUDA] - Starting...
==4797== NVPROF is profiling process 4797, command: ./matrixMul
GPU Device 0: "GeForce GTX 1080" with compute capability 6.1
MatrixA(320,320), MatrixB(640,320)
Computing result using CUDA Kernel...
done
Performance= 1132.95 GFlop/s, Time= 0.116 msec, Size= 131072000 Ops,
WorkgroupSize= 1024 threads/block
Checking computed result for correctness: Result = PASS
NOTE: The CUDA Samples are not meant for performance measurements. Results may
vary when GPU Boost is enabled.
==4797== Profiling application: ./matrixMul
==4797== Profiling result:
           Type  Time(%)      Time  Calls       Avg       Min       Max  Name
GPU activities:   99.54%  34.644ms    301  115.10us  114.15us  116.07us  void matrixMulCUDA<int=32>(float*, float*, float*, int, int)
                   0.28%  98.465us      2  49.232us  32.960us  65.505us  [CUDA memcpy HtoD]
                   0.18%  62.944us      1  62.944us  62.944us  62.944us  [CUDA memcpy DtoH]
     API calls:   74.77%  110.27ms      3  36.757ms  3.4300us  110.26ms  cudaMalloc
                  22.45%  33.105ms      1  33.105ms  33.105ms  33.105ms  cudaEventSynchronize
                   0.93%  1.3780ms      3  459.33us  427.70us  478.26us  cudaGetDeviceProperties
                   0.81%  1.1874ms    301  3.9440us  3.7260us  18.511us  cudaLaunch
                   0.36%  536.51us      3  178.84us  56.346us  363.23us  cudaMemcpy
                   0.31%  451.50us     94  4.8030us     301ns  228.31us  cuDeviceGetAttribute
                   0.11%  156.37us      1  156.37us  156.37us  156.37us  cudaDeviceSynchronize
                   0.09%  132.82us   1505      88ns      79ns     289ns  cudaSetupArgument
                   0.07%  100.43us      3  33.475us  4.3440us  83.746us  cudaFree
                   0.06%  82.848us      1  82.848us  82.848us  82.848us  cuDeviceTotalMem
                   0.02%  35.673us    301     118ns     110ns     801ns  cudaConfigureCall
                   0.02%  33.788us      1  33.788us  33.788us  33.788us  cuDeviceGetName
                   0.00%  5.3080us      2  2.6540us  2.2050us  3.1030us  cudaEventRecord
                   0.00%  3.2350us      2  1.6170us  1.0960us  2.1390us  cudaEventCreate
                   0.00%  2.8120us      1  2.8120us  2.8120us  2.8120us  cudaSetDevice
                   0.00%  2.0920us      1  2.0920us  2.0920us  2.0920us  cudaEventElapsedTime
                   0.00%  1.7410us      3     580ns     292ns  1.0710us  cuDeviceGetCount
                   0.00%  1.0230us      2     511ns     353ns     670ns  cuDeviceGet
                   0.00%     658ns      1     658ns     658ns     658ns  cudaGetDeviceCount
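For anyone re-testing this, the difference between the two nvprof runs above can be checked mechanically. The following is only a rough sketch: nvprof_log_status is a hypothetical helper name (not part of nvprof or the CUDA toolkit), and it assumes the profiler errors always appear on lines of the form "==PID== Error:" or "======== Error:", as they do in the transcript above.

```shell
#!/bin/sh
# Hypothetical helper, not part of nvprof: reports whether a captured
# nvprof log contains profiler error lines like the ones quoted above.
# Assumption: errors appear as "==<pid>== Error: ..." or "======== Error: ...".
nvprof_log_status() {
    # $1: path to a file holding the captured nvprof output
    if grep -Eq '^=+[0-9]*=+ Error:' "$1"; then
        echo "profiling-error"
    else
        echo "ok"
    fi
}
```

To use it, capture both output streams of a run into a file, for example with "nvprof ./matrixMul > nvprof.log 2>&1", and then call "nvprof_log_status nvprof.log". For the runs shown in this report, the first (non-sudo) log would be classified as profiling-error and the sudo log as ok.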
ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: nvidia-profiler 9.1.85-3
ProcVersionSignature: Ubuntu 4.15.0-20.21-generic 4.15.17
Uname: Linux 4.15.0-20-generic x86_64
NonfreeKernelModules: nvidia_modeset nvidia
ApportVersion: 2.20.9-0ubuntu7
Architecture: amd64
Date: Thu Apr 26 17:28:48 2018
Dependencies:
gcc-8-base 8-20180414-1ubuntu2
libc6 2.27-3ubuntu1
libcuinj64-9.1 9.1.85-3
libgcc1 1:8-20180414-1ubuntu2
InstallationDate: Installed on 2018-04-21 (5 days ago)
InstallationMedia: Ubuntu 18.04 LTS "Bionic Beaver" - Alpha amd64 (20180421)
ProcEnviron:
TERM=xterm-256color
PATH=(custom, no user)
XDG_RUNTIME_DIR=<set>
LANG=en_US.UTF-8
SHELL=/bin/bash
SourcePackage: nvidia-cuda-toolkit
UpgradeStatus: No upgrade log present (probably fresh install)
** Affects: nvidia-cuda-toolkit (Ubuntu)
Importance: Undecided
Status: New
** Tags: amd64 apport-bug bionic
https://bugs.launchpad.net/bugs/1767205
Title:
nvprof does not complete without sudo