On Monday, 13 November 2017 15:12:56 GMT Daniel Frey wrote:
> On 11/13/17 02:59, Peter Humphrey wrote:
> > Hello list,
> >
> > I'm hunting a problem with cooling in this box, and I've got as far as
> > suspecting my new AMD WX 5100 GPU.
> >
> > One of my BOINC projects causes the GPU temperature, as shown by
> > gkrellm, to shoot up to 75C or more and cause intolerable system
> > cooling noise. If I suspend that project but leave the other seven
> > running, the temperature returns to what I hope is a normal 55C. Those
> > seven projects are supposed to use the GPU, but I'm not sure whether
> > they do in fact.
> >
> > Is there any way I can monitor what is using the GPU, to find out?
>
> I don't know if there's a utility for consumer level cards that can do
> this. I do remember for Nvidia there's nvidia-smi but I don't think it
> will list processes for desktop cards.

This isn't consumer grade (look it up in your local shops ;-) ):

# lspci -v -s 01:00.0
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon Pro WX 5100] (prog-if 00 [VGA controller])
        Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon Pro WX 5100]
        Flags: bus master, fast devsel, latency 0, IRQ 34, NUMA node 0
        Memory at c0000000 (64-bit, prefetchable) [size=256M]
        Memory at d0000000 (64-bit, prefetchable) [size=2M]
        I/O ports at e000 [size=256]
        Memory at fbe00000 (32-bit, non-prefetchable) [size=256K]
        Expansion ROM at 000c0000 [disabled] [size=128K]
        Capabilities: [48] Vendor Specific Information: Len=08 <?>
        Capabilities: [50] Power Management version 3
        Capabilities: [58] Express Legacy Endpoint, MSI 00
        Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
        Capabilities: [150] Advanced Error Reporting
        Capabilities: [200] #15
        Capabilities: [270] #19
        Capabilities: [2b0] Address Translation Service (ATS)
        Capabilities: [2c0] Page Request Interface (PRI)
        Capabilities: [2d0] Process Address Space ID (PASID)
        Capabilities: [320] Latency Tolerance Reporting
        Capabilities: [328] Alternative Routing-ID Interpretation (ARI)
        Capabilities: [370] L1 PM Substates
        Kernel driver in use: amdgpu

> The only other generic ones I can think of are cuda-z and gputop. Have
> you tried one of those? Although I don't think it'll give you the
> information you need either.

As the card is AMD, not Nvidia, nvidia-smi and CUDA-based tools such as
cuda-z aren't suitable. I hadn't heard of GPU Top - thanks. I'll have a
look at it.

I forgot to add that I'm using the proprietary dev-libs/amdgpu-pro-opencl
package because mesa hasn't caught up yet.
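
One generic way to see which processes have the GPU open is to scan /proc
for file descriptors pointing at the DRM device nodes. What follows is only
a minimal sketch under assumptions, not a tested tool for this card: the
function name gpu_users and the device prefixes /dev/dri/ and /dev/kfd are
illustrative choices, and whether the amdgpu-pro OpenCL runtime opens those
nodes on this box is an assumption. It reports only which processes hold a
device node open, not how much load each one generates, and it needs root
to inspect other users' processes.

```python
import os


def gpu_users(proc_root="/proc", prefixes=("/dev/dri/", "/dev/kfd")):
    """Return {pid: (command name, sorted GPU device nodes held open)}.

    proc_root and prefixes are parameters so the scan can be pointed at
    a different tree or a different set of device nodes.
    """
    users = {}
    for pid in os.listdir(proc_root):
        if not pid.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        fd_dir = os.path.join(proc_root, pid, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission to look
        nodes = set()
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # descriptor closed while we were scanning
            if target.startswith(prefixes):
                nodes.add(target)
        if nodes:
            try:
                with open(os.path.join(proc_root, pid, "comm")) as f:
                    name = f.read().strip()
            except OSError:
                name = "?"
            users[int(pid)] = (name, sorted(nodes))
    return users


if __name__ == "__main__":
    for pid, (name, nodes) in sorted(gpu_users().items()):
        print(pid, name, " ".join(nodes))
```

Run as root with the BOINC projects active, any project whose process never
appears in the output is presumably not touching the GPU at all.
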
--
Regards,
Peter.