Yes, it's meant to be like a spec sheet. We are not interested in the current bandwidth utilization.
Marek On Wed, Jan 4, 2023 at 10:33 AM Lazar, Lijo <lijo.la...@amd.com> wrote: > [AMD Official Use Only - General] > > To clarify, with DPM in place, the current bandwidth will be changing > based on the load. > > If apps/umd already has a way to know the current bandwidth utilisation, > then possible maximum also could be part of the same API. Otherwise, this > only looks like duplicate information. We have the same information in > sysfs DPM nodes. > > BTW, I don't know to what extent app/umd really makes use of this. Take > that memory frequency as an example (I'm reading it as 16GHz). It only > looks like a spec sheet. > > Thanks, > Lijo > ------------------------------ > *From:* Marek Olšák <mar...@gmail.com> > *Sent:* Wednesday, January 4, 2023 8:40:00 PM > *To:* Lazar, Lijo <lijo.la...@amd.com> > *Cc:* amd-gfx@lists.freedesktop.org <amd-gfx@lists.freedesktop.org> > *Subject:* Re: [PATCH 1/2] drm/amdgpu: return the PCIe gen and lanes from > the INFO > > On Wed, Jan 4, 2023 at 9:19 AM Lazar, Lijo <lijo.la...@amd.com> wrote: > > > > On 1/4/2023 7:43 PM, Marek Olšák wrote: > > On Wed, Jan 4, 2023 at 6:50 AM Lazar, Lijo <lijo.la...@amd.com > > <mailto:lijo.la...@amd.com>> wrote: > > > > > > > > On 1/4/2023 4:11 AM, Marek Olšák wrote: > > > I see. Well, those sysfs files are not usable, and I don't think > it > > > would be important even if they were usable, but for completeness: > > > > > > The ioctl returns: > > > pcie_gen = 1 > > > pcie_num_lanes = 16 > > > > > > Theoretical bandwidth from those values: 4.0 GB/s > > > My DMA test shows this write bandwidth: 3.5 GB/s > > > It matches the expectation. > > > > > > Let's see the devices (there is only 1 GPU Navi21 in the system): > > > $ lspci |egrep '(PCI|VGA).*Navi' > > > 0a:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] Navi > > 10 XL > > > Upstream Port of PCI Express Switch (rev c3) > > > 0b:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] Navi > > 10 XL > > > Downstream Port of PCI Express Switch > > > 0c:00.0 VGA compatible controller: Advanced Micro Devices, Inc. > > > [AMD/ATI] Navi 21 [Radeon RX 6800/6800 XT / 6900 XT] (rev c3) > > > > > > Let's read sysfs: > > > > > > $ cat /sys/bus/pci/devices/0000:0a:00.0/current_link_width > > > 16 > > > $ cat /sys/bus/pci/devices/0000:0b:00.0/current_link_width > > > 16 > > > $ cat /sys/bus/pci/devices/0000:0c:00.0/current_link_width > > > 16 > > > $ cat /sys/bus/pci/devices/0000:0a:00.0/current_link_speed > > > 2.5 GT/s PCIe > > > $ cat /sys/bus/pci/devices/0000:0b:00.0/current_link_speed > > > 16.0 GT/s PCIe > > > $ cat /sys/bus/pci/devices/0000:0c:00.0/current_link_speed > > > 16.0 GT/s PCIe > > > > > > Problem 1: None of the speed numbers match 4 GB/s. > > > > US bridge = 2.5GT/s means operating at PCIe Gen 1 speed. Total > > theoretical bandwidth is then derived based on encoding and total > > number > > of lanes. > > > > > Problem 2: Userspace doesn't know the bus index of the bridges, > > and it's > > > not clear which bridge should be used. > > > > In general, modern ones have this arch= US->DS->EP. US is the one > > connected to physical link. > > > > > Problem 3: The PCIe gen number is missing. > > > > Current link speed is based on whether it's Gen1/2/3/4/5. > > > > BTW, your patch makes use of capabilities flags which gives the > maximum > > supported speed/width by the device. It may not necessarily reflect > the > > current speed/width negotiated. I guess in NV, this info is already > > obtained from PMFW and made available through metrics table. > > > > > > It computes the minimum of the device PCIe gen and the motherboard/slot > > PCIe gen to get the final value. These 2 lines do that. The low 16 bits > > of the mask contain the device PCIe gen mask. The high 16 bits of the > > mask contain the slot PCIe gen mask. > > + pcie_gen_mask = adev->pm.pcie_gen_mask & (adev->pm.pcie_gen_mask >> > 16); > > + dev_info->pcie_gen = fls(pcie_gen_mask); > > > > With DPM in place on some ASICs, how much does this static info help for > upper level apps? > > > It helps UMDs make better decisions if they know the maximum achievable > bandwidth. UMDs also compute the maximum memory bandwidth and compute > performance (FLOPS). Right now it's printed by Mesa to give users detailed > information about their GPU. For example: > > $ AMD_DEBUG=info glxgears > Device info: > name = NAVI21 > marketing_name = AMD Radeon RX 6800 > num_se = 3 > num_rb = 12 > num_cu = 60 > max_gpu_freq = 2475 MHz > max_gflops = 19008 GFLOPS > l0_cache_size = 16 KB > l1_cache_size = 128 KB > l2_cache_size = 4096 KB > l3_cache_size = 128 MB > memory_channels = 16 (TCC blocks) > memory_size = 16 GB (16384 MB) > memory_freq = 16 GHz > memory_bus_width = 256 bits > memory_bandwidth = 512 GB/s > pcie_gen = 1 > pcie_num_lanes = 16 > pcie_bandwidth = 4.0 GB/s > > Marek >