Hello,

On Mon, Nov 13, 2017 at 9:46 AM, Peter Humphrey <[email protected]> wrote:
> On Monday, 13 November 2017 15:12:56 GMT Daniel Frey wrote:
>> On 11/13/17 02:59, Peter Humphrey wrote:
>> > Hello list,
>> >
>> > I'm hunting a problem with cooling in this box, and I've got as far as
>> > suspecting my new AMD WX 5100 GPU.
>> >
>> > One of my BOINC projects causes the GPU temperature, as shown by
>> > gkrellm, to shoot up to 75C or more and cause intolerable system
>> > cooling noise. If I suspend that project but leave the other seven
>> > running, the temperature returns to what I hope is a normal 55C. Those
>> > seven projects are supposed to use the GPU, but I'm not sure whether
>> > they do in fact.
>> >
>> > Is there any way I can monitor what is using the GPU, to find out?
>>
>> I don't know if there's a utility for consumer level cards that can do
>> this. I do remember for Nvidia there's nvidia-smi but I don't think it
>> will list processes for desktop cards.
>
> This isn't consumer grade (look it up in your local shops ;-) ):
>
> # lspci -v -s 01:00.0
> 01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon Pro WX 5100] (prog-if 00 [VGA controller])
>         Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon Pro WX 5100]
>         Flags: bus master, fast devsel, latency 0, IRQ 34, NUMA node 0
>         Memory at c0000000 (64-bit, prefetchable) [size=256M]
>         Memory at d0000000 (64-bit, prefetchable) [size=2M]
>         I/O ports at e000 [size=256]
>         Memory at fbe00000 (32-bit, non-prefetchable) [size=256K]
>         Expansion ROM at 000c0000 [disabled] [size=128K]
>         Capabilities: [48] Vendor Specific Information: Len=08 <?>
>         Capabilities: [50] Power Management version 3
>         Capabilities: [58] Express Legacy Endpoint, MSI 00
>         Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
>         Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
>         Capabilities: [150] Advanced Error Reporting
>         Capabilities: [200] #15
>         Capabilities: [270] #19
>         Capabilities: [2b0] Address Translation Service (ATS)
>         Capabilities: [2c0] Page Request Interface (PRI)
>         Capabilities: [2d0] Process Address Space ID (PASID)
>         Capabilities: [320] Latency Tolerance Reporting
>         Capabilities: [328] Alternative Routing-ID Interpretation (ARI)
>         Capabilities: [370] L1 PM Substates
>         Kernel driver in use: amdgpu
>
>> The only other generic ones I can think of are cuda-z and gputop. Have
>> you tried one of those? Although I don't think it'll give you the
>> information you need either.
>
> As it's AMD, not nVidia, nvidia-smi and cuda aren't suitable. I hadn't heard
> of GPU Top - thanks. I'll have a look at it.
>
> I forgot to add that I'm using the proprietary dev-libs/amdgpu-pro-opencl
> because mesa hasn't caught up yet.
>

Attributing GPU load to individual processes at that level of detail
will likely require a GPU debugger or profiler. AMD provides CodeXL at
https://gpuopen.com/compute-product/codexl/; I suggest starting with
its profiling features.
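
As a coarser first check, you can list which processes hold a GPU
device node open by walking /proc. The sketch below is only an
illustration: the function names are my own, and I am assuming your
OpenCL stack opens nodes under /dev/dri/ (or /dev/kfd), which I have
not verified for amdgpu-pro. It shows who has the device open, not how
busy each process keeps it.

```python
import os


def processes_with_gpu_open(proc_root="/proc",
                            prefixes=("/dev/dri/", "/dev/kfd")):
    """Map each PID to the GPU device nodes it currently has open.

    The device prefixes are an assumption about where the driver
    exposes its nodes; adjust them for your system.
    """
    users = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self"
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission
        devices = set()
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # descriptor closed while we were looking
            if target.startswith(prefixes):
                devices.add(target)
        if devices:
            users[int(entry)] = sorted(devices)
    return users


def process_name(pid, proc_root="/proc"):
    """Return the short command name of a PID, or "?" if unreadable."""
    try:
        with open(os.path.join(proc_root, str(pid), "comm")) as f:
            return f.read().strip()
    except OSError:
        return "?"
```

Run as root, something like
`for pid, devs in processes_with_gpu_open().items(): print(pid, process_name(pid), devs)`
should show whether the other seven BOINC tasks have the GPU open at
all. Expect the X server to appear in the list as well.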

You may want to communicate your findings to the relevant BOINC projects.

Cheers,
     R0b0t1
