Bug#1032899: unblock: rocm-hipamd/5.2.3-6
Hi Paul,

On 2023-04-30 07:59, Paul Gevers wrote:
> Please go ahead with your 04_ proposal and please remove the moreinfo
> tag once the upload happened.

In -7, there was a typo that broke installability for hipcc.
Specifically, a version contained a second colon where a dot was
expected:

> [...] | libclang-rt-15-dev (>= 1:15:0.6-5~exp1),
                                    ^^^

I went ahead with an -8 upload that changes just that one typo, and I
successfully tested all install/upgrade paths for hipcc:

  bookworm: none -> -8
  bookworm: -1 -> -8
  unstable: none -> -8
  unstable: -7 -> -8

I'm sorry for the noise. I'm more than puzzled how this could have snuck
in, as I tested the above upgrade paths before proposing the change.

Best,
Christian
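A typo like the one described in this message is easy to miss by eye. The following is a minimal sketch of a lint that flags version constraints containing more than one colon. The function name `suspicious_versions` and the sample fields are illustrative assumptions, not part of the package or of any Debian tool, and the check is only a heuristic rather than a full validator of the Debian version syntax.

```python
import re

# Matches a version relation such as "(>= 1:15.0.6-5~exp1)".
_CONSTRAINT = re.compile(r"\(\s*(<<|<=|=|>=|>>)\s*([^)\s]+)\s*\)")


def suspicious_versions(depends_field):
    """Return the versions in a dependency field that contain more than one colon.

    One colon separates the epoch from the rest of the version; a second
    one is treated here as a likely typo (heuristic, hypothetical helper).
    """
    flagged = []
    for _operator, version in _CONSTRAINT.findall(depends_field):
        if version.count(":") > 1:
            flagged.append(version)
    return flagged


# Sample fields for illustration only: the first carries the typo, the second does not.
broken = ("libclang-common-15-dev (<< 1:15.0.6-5~exp1) | "
          "libclang-rt-15-dev (>= 1:15:0.6-5~exp1)")
fixed = ("libclang-common-15-dev (<< 1:15.0.6-5~exp1) | "
         "libclang-rt-15-dev (>= 1:15.0.6-5~exp1)")

print(suspicious_versions(broken))  # ['1:15:0.6-5~exp1']
print(suspicious_versions(fixed))   # []
```

A check of this kind would only catch the colon-for-dot slip; it says nothing about whether the resulting dependency is actually installable, which still needs the upgrade-path tests listed in the message.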
Bug#1032899: unblock: rocm-hipamd/5.2.3-6
Control: tags -1 - moreinfo

Hi Paul,

On 2023-04-30 07:59, Paul Gevers wrote:
>> I may be misunderstanding something here. I interpreted your t-p-u hint
>> for the case where a fix via unstable wouldn't be possible because of
>> the dependency issue. The proposal, however would work via unstable.
>
> It was for *me*. I reviewed the version in unstable and had some
> concerns. Due to the size of the debdiff, it's easier to look at the
> proposed delta to unstable (that I reviewed), than to review from scratch.

Got it.

> Please go ahead with your 04_ proposal and please remove the moreinfo
> tag once the upload happened.

Best,
Christian
Bug#1032899: unblock: rocm-hipamd/5.2.3-6
Control: tags -1 confirmed moreinfo

Hi Christian,

On 29-04-2023 00:17, Christian Kastner wrote:
>> I asked you to *also* provide the diff between *current* unstable and
>> your proposal (via unstable), because "I was about to propose to upload
>> it to tpu" (2023-04-20).
>
> Sure, the 04_ attachment is the debdiff between unstable -6 and the
> proposed update -7, which removes all of the less important changes that
> I marked with (+) in my previous log.
>
> I may be misunderstanding something here. I interpreted your t-p-u hint
> for the case where a fix via unstable wouldn't be possible because of
> the dependency issue. The proposal, however would work via unstable.

It was for *me*. I reviewed the version in unstable and had some
concerns. Due to the size of the debdiff, it's easier to look at the
proposed delta to unstable (that I reviewed), than to review from scratch.

Please go ahead with your 04_ proposal and please remove the moreinfo
tag once the upload happened.

Paul
Bug#1032899: unblock: rocm-hipamd/5.2.3-6
Hi Paul,

On 2023-04-28 17:48, Paul Gevers wrote:
> On 28-04-2023 00:58, Christian Kastner wrote:
>> So I split that diff into 02 (patches) and 03 (NOT-patches), also
>> attached.
>
> I think you forgot to add them.

I did, sorry.

>> Would a package with just the patches and the (*) changes be acceptable?
>
> I asked you to *also* provide the diff between *current* unstable and
> your proposal (via unstable), because "I was about to propose to upload
> it to tpu" (2023-04-20).

Sure, the 04_ attachment is the debdiff between unstable -6 and the
proposed update -7, which removes all of the less important changes that
I marked with (+) in my previous log.

I may be misunderstanding something here. I interpreted your t-p-u hint
for the case where a fix via unstable wouldn't be possible because of
the dependency issue. The proposal, however would work via unstable.

Best,
Christian

diff -Nru rocm-hipamd-5.2.3/debian/patches/0001-Clang-version-munging.patch rocm-hipamd-5.2.3/debian/patches/0001-Clang-version-munging.patch
--- rocm-hipamd-5.2.3/debian/patches/0001-Clang-version-munging.patch 2022-10-20 21:20:33.0 +0200
+++ rocm-hipamd-5.2.3/debian/patches/0001-Clang-version-munging.patch 2023-04-25 19:50:14.0 +0200
@@ -1,9 +1,9 @@
 From: Maxime Chambonnet
 Date: Sat, 11 Feb 2022 11:28:54 +0100
 Subject: Clang version munging
- https://github.com/ROCm-Developer-Tools/HIP/pull/2451
-Forwarded: yes
+Forwarded: https://github.com/ROCm-Developer-Tools/HIP/pull/2451
+Applied-Upstream: https://github.com/ROCm-Developer-Tools/HIP/commit/0c443d12011da16a036057e0472ae59c68bc901f
 ---
  hip/bin/hipcc | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
diff -Nru rocm-hipamd-5.2.3/debian/patches/0002-fix-cmake-library-notfound-check.patch rocm-hipamd-5.2.3/debian/patches/0002-fix-cmake-library-notfound-check.patch
--- rocm-hipamd-5.2.3/debian/patches/0002-fix-cmake-library-notfound-check.patch 1970-01-01 01:00:00.0 +0100
+++ rocm-hipamd-5.2.3/debian/patches/0002-fix-cmake-library-notfound-check.patch 2023-04-25 19:50:14.0 +0200
@@ -0,0 +1,47 @@
+From: Cordell Bloor
+Date: Mon, 24 Oct 2022 00:07:40 -0400
+Subject: fix cmake library notfound check
+
+If find_library does not find the library, the given variable is
+set with a value that has a -NOTFOUND suffix. For example, the
+CLANGRT_BUILTINS variable will be set with the value
+CLANGRT_BUILTINS-NOTFOUND.
+
+Applied-Upstream: https://github.com/ROCm-Developer-Tools/HIP/commit/d12d0ebc578601de138765ee4b1ddd2dcbc79edf
+
+---
+diff --git a/hip-config.cmake.in b/hip-config.cmake.in
+index ba3e75c..a27badc 100755
+--- a/hip-config.cmake.in
++++ b/hip-config.cmake.in
+@@ -287,7 +287,7 @@ if(HIP_COMPILER STREQUAL "clang")
+     ${HIP_CLANG_INCLUDE_PATH}/../lib/linux)
+ 
+ # Add support for __fp16 and _Float16, explicitly link with compiler-rt
+-if(CLANGRT_BUILTINS-NOTFOUND)
++if(NOT CLANGRT_BUILTINS)
+     message(FATAL_ERROR "clangrt builtins lib not found")
+ else()
+     set_property(TARGET hip::host APPEND PROPERTY INTERFACE_LINK_LIBRARIES "${CLANGRT_BUILTINS}")
+diff --git a/hip/hip-lang-config.cmake.in b/hip/hip-lang-config.cmake.in
+index 1a72643..07f24f9 100644
+--- a/hip/hip-lang-config.cmake.in
++++ b/hip/hip-lang-config.cmake.in
+@@ -94,7 +94,7 @@ find_path(HSA_HEADER hsa/hsa.h
+     /opt/rocm/include
+ )
+ 
+-if (HSA_HEADER-NOTFOUND)
++if (NOT HSA_HEADER)
+     message (FATAL_ERROR "HSA header not found! ROCM_PATH environment not set")
+ endif()
+ 
+@@ -136,7 +136,7 @@ set_property(TARGET hip-lang::device APPEND PROPERTY
+ )
+ 
+ # Add support for __fp16 and _Float16, explicitly link with compiler-rt
+-if(CLANGRT_BUILTINS-NOTFOUND)
++if(NOT CLANGRT_BUILTINS)
+     message(FATAL_ERROR "clangrt builtins lib not found")
+ else()
+     set_property(TARGET hip-lang::device APPEND PROPERTY
diff -Nru rocm-hipamd-5.2.3/debian/patches/0003-hip-config.cmake.patch rocm-hipamd-5.2.3/debian/patches/0003-hip-config.cmake.patch
--- rocm-hipamd-5.2.3/debian/patches/0003-hip-config.cmake.patch 2022-10-20 21:20:33.0 +0200
+++ rocm-hipamd-5.2.3/debian/patches/0003-hip-config.cmake.patch 2023-04-25 19:50:14.0 +0200
@@ -2,6 +2,7 @@
 Date: Thu, 27 Jan 2022 18:47:04 +0100
 Subject: hip-config.cmake
 
+Forwarded: no
 ---
  hip-config.cmake.in | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
diff -Nru rocm-hipamd-5.2.3/debian/patches/0004-hip-cmake-install.patch rocm-hipamd-5.2.3/debian/patches/0004-hip-cmake-install.patch
--- rocm-hipamd-5.2.3/debian/patches/0004-hip-cmake-install.patch 2022-10-20 21:20:33.0 +0200
+++ rocm-hipamd-5.2.3/debian/patches/0004-hip-cmake-install.patch 2023-04-25 19:50:14.0 +0200
@@ -2,6 +2,7 @@
 Date: Tue, 8 Feb 2022 12:41:33 +0100
 Subject: hip cmake install
 
+Applied-Upstream: https://github.com/ROCm-Developer-Tools/hipamd/commit/f892306e227983a7c1943992ba70bf4e4b189105
 ---
  src/CMakeLists.txt | 10 +-
  1 file changed, 5 insertions(+), 5 deletions(-)
@@
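The 0002 patch in the diff above replaces `if(CLANGRT_BUILTINS-NOTFOUND)` with `if(NOT CLANGRT_BUILTINS)`. Why the old check could never fire can be illustrated with a small Python model of how CMake's `if()` treats a single argument: constants are recognised first, and anything ending in `-NOTFOUND` is a false constant. This is a simplified sketch written for illustration only; the function names are hypothetical, the library path is made up, and the model ignores quoting, policies and most other `if()` forms.

```python
TRUE_CONSTANTS = {"1", "ON", "YES", "TRUE", "Y"}
FALSE_CONSTANTS = {"0", "OFF", "NO", "FALSE", "N", "IGNORE", "NOTFOUND", ""}


def is_false_constant(text):
    """False constants include the empty string and anything ending in -NOTFOUND."""
    return text.upper() in FALSE_CONSTANTS or text.endswith("-NOTFOUND")


def is_true_constant(text):
    """True constants are the named ones and non-zero numbers."""
    if text.upper() in TRUE_CONSTANTS:
        return True
    try:
        return float(text) != 0
    except ValueError:
        return False


def cmake_if(argument, variables):
    """Simplified model of if(<constant>) falling back to if(<variable>)."""
    if is_true_constant(argument):
        return True
    if is_false_constant(argument):
        return False
    # Not a constant: treat the argument as a variable name.
    value = variables.get(argument)
    return value is not None and not is_false_constant(value)


# State after a failed find_library(CLANGRT_BUILTINS ...).
missing = {"CLANGRT_BUILTINS": "CLANGRT_BUILTINS-NOTFOUND"}
# State after a successful lookup (hypothetical path).
found = {"CLANGRT_BUILTINS": "/usr/lib/example/libclang_rt.builtins.a"}

for state in (missing, found):
    old_check = cmake_if("CLANGRT_BUILTINS-NOTFOUND", state)
    new_check = not cmake_if("CLANGRT_BUILTINS", state)
    print(old_check, new_check)
# prints "False True" for the missing library, then "False False"
```

In this model the old condition is false in both states, so the `FATAL_ERROR` branch is unreachable, while the patched condition is true exactly when the lookup failed.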
Bug#1032899: unblock: rocm-hipamd/5.2.3-6
Hi Christian,

On 28-04-2023 00:58, Christian Kastner wrote:
> So I split that diff into 02 (patches) and 03 (NOT-patches), also
> attached.

I think you forgot to add them.

> Would a package with just the patches and the (*) changes be acceptable?

I asked you to *also* provide the diff between *current* unstable and
your proposal (via unstable), because "I was about to propose to upload
it to tpu" (2023-04-20).

Paul
Bug#1032899: unblock: rocm-hipamd/5.2.3-6
Control: tags -1 - moreinfo

Hi Paul,

sorry this took a while.

On 2023-04-22 13:34, Paul Gevers wrote:
> On 21-04-2023 23:43, Christian Kastner wrote:
>> The only way to do that with llvm-toolchain-15 from testing is by
>> changing the dependency libclang-rt-15-dev back to
>> libclang-common-15-dev (the pre-split version).
>
> Hmm, so this complicates things. Can you do this change in unstable, or
> would it be broken in unstable?

Luckily, the newer llvm-toolchain-15 is only needed for building tests.
These aren't run (cannot be run) by buildds, so by dropping them for now,
we can drop the problematic build dependency.

And for bin:hipcc, the only binary package affected, I believe the
dependency on libclang-rt-15-dev was wrong anyway: there's a broken
upgrade path for the files that moved in the dependency. The correct
specification should be:

  libclang-common-15-dev (<< 1:15.0.6-5~exp1) | libclang-rt-15-dev (>= 1:15:0.6-5~exp1)

>> If that is an option, I could prepare an upload, and also reduce out
>> whatever other changes you don't feel comfortable with in the larger
>> diff.
>
> That would be good. Can you also share the minimal delta with the
> current version in unstable? I'll check if that's acceptable.

I've attached the new diff as 01 (FULL) but its d/changelog is noisy,
reflecting the ongoing development process we had in this younger
library. So I split that diff into 02 (patches) and 03 (NOT-patches),
also attached.

02_rocm-hipamd-patches.diff

  There were 5 patches added (Jan:4 Feb:1), and these represent fixes
  that really must be in the package, but were held up by our
  dependency. One patch was dropped. Many others just got DEP3 headers.

03_rocm-NOT-hipamd-patches.diff

  The diff is not as large as d/changelog suggests. I've summarized all
  the changes below, with (*) marking changes that really should get
  into testing, and (+) marking changes that aren't strictly needed.

  * Build Depends added: llvm-15, file
  * (RC #1032677) Depends fixed: bin:libamdhip64-5, bin:libamdhip64-dev
  * Depends fixed: bin:hipcc (as described above)
  * *.install files fixed (+ one d/rules change), not-installed added
  * Build flags fixed in d/rules
  * Another RPATH removed
  * Updates to d/copyright
  + Build Depends added: rocminfo (just for tests)
  + Reduce architectures to amd64, arm64, ppc64el (the only platforms
    with the necessary drivers)
  + Update Standards-Version from 4.6.1 to 4.6.2
  + autopkgtest added

Would a package with just the patches and the (*) changes be acceptable?

Best,
Christian

diff -Nru rocm-hipamd-5.2.3/debian/changelog rocm-hipamd-5.2.3/debian/changelog
--- rocm-hipamd-5.2.3/debian/changelog 2022-10-20 21:20:33.0 +0200
+++ rocm-hipamd-5.2.3/debian/changelog 2023-04-25 19:50:14.0 +0200
@@ -1,3 +1,85 @@
+rocm-hipamd (5.2.3-7) UNRELEASED; urgency=medium
+
+  * hipcc: Fix Depends to enable transition from split clang package
+  * Drop building of tests, and libclang-rt-15-dev dependency
+
+ -- Christian Kastner Tue, 25 Apr 2023 19:50:14 +0200
+
+rocm-hipamd (5.2.3-6) unstable; urgency=medium
+
+  * Reduce arch to amd64, arm64, ppc64el
+  * libamdhip64-5: Add dependency on libamd-comgr2 (Closes: #1032677)
+  * Add myself to Uploaders
+  * Fix Maintainer (same list, different name)
+
+ -- Christian Kastner Fri, 10 Mar 2023 23:38:51 +0100
+
+rocm-hipamd (5.2.3-5) unstable; urgency=medium
+
+  * d/{libamdhip64-dev,rules}: fix version file
+    Closes: #1031264
+  * add d/p/0020-replace-x86_64-with-variables.patch
+    to fix build on aarch64
+  * d/control: add file to hipcc dependencies
+  * d/control: add dependencies for find_package(hip)
+    Closes: #1031538
+  * add d/p/0021-fix-default-cmake-build-on-unsupported-gpus.patch
+    to enable gpu arch autodetection with find_package(hip)
+  * d/not-installed: ignore doxygen docs
+  * d/p/000{4,8,9}*.patch: change hip-lang cmake files,
+    to partially fix #1031540
+  * d/copyright: update copyright date
+  * d/control: add self to uploaders
+  * cleanup patch metadata
+
+ -- Cordell Bloor Sun, 19 Feb 2023 03:51:26 -0700
+
+rocm-hipamd (5.2.3-4) unstable; urgency=medium
+
+  * d/t/hipcc: also skip when no kfd driver is loaded.
+
+ -- Étienne Mollier Sat, 21 Jan 2023 12:54:49 +0100
+
+rocm-hipamd (5.2.3-3) unstable; urgency=medium
+
+  * d/control: build depends on libclang-rt-15-dev.
+  * d/control: hipcc depends on libclang-rt-15-dev.
+  * d/t/hipcc: add; basic script testing hipcc.
+  * d/t/hipconfig: add; script skipping hipconfig if no amdgpu is available.
+  * d/t/control: add hipcc to superficial autopkgtests.
+  * d/t/control: run the d/t/hipconfig test script instead of the command;
+    this allows us to trigger conditions for when hardware is not available
+    and the script has to be skipped.
+
+ -- Étienne Mollier Wed, 18 Jan 2023 20:35:17 +0100
+
+rocm-hipamd (5.2.3-2) unstable; urgency=medium
+
+  [ Cordell Bloor ]
+  * d/patches: add 0020-hipcc-remove-rpath-flags.patch
+
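The alternative dependency proposed for bin:hipcc in this message is meant to be satisfiable both before and after the clang package split. A rough sketch of that logic follows, assuming dpkg-style version ordering as described in Debian Policy. All function names are hypothetical, the installed versions (`1:15.0.6-4`, `1:15.0.7-1`) are made-up examples, and the bound is written with a dot (`1:15.0.6-5~exp1`), which is the form the later -8 upload in this thread settled on.

```python
import operator


def _order(ch):
    """Sort weight of one character inside a non-digit run."""
    if ch == "~":
        return -1          # tilde sorts before everything, even the end
    if ch == "" or ch.isdigit():
        return 0
    if ch.isalpha():
        return ord(ch)
    return ord(ch) + 256   # punctuation sorts after letters


def _compare_part(a, b):
    """Compare two upstream-version or revision strings."""
    i = j = 0
    while i < len(a) or j < len(b):
        # Non-digit run: compare character by character.
        while (i < len(a) and not a[i].isdigit()) or (j < len(b) and not b[j].isdigit()):
            ca = a[i] if i < len(a) else ""
            cb = b[j] if j < len(b) else ""
            diff = _order(ca) - _order(cb)
            if diff:
                return diff
            i += 1
            j += 1
        # Digit run: compare numerically (an empty run counts as 0).
        start_i, start_j = i, j
        while i < len(a) and a[i].isdigit():
            i += 1
        while j < len(b) and b[j].isdigit():
            j += 1
        num_a = int(a[start_i:i] or "0")
        num_b = int(b[start_j:j] or "0")
        if num_a != num_b:
            return num_a - num_b
    return 0


def _split(version):
    """Split into (epoch, upstream, revision): first colon, last hyphen."""
    epoch, sep, rest = version.partition(":")
    if not sep:
        epoch, rest = "0", version
    upstream, sep, revision = rest.rpartition("-")
    if not sep:
        upstream, revision = rest, ""
    return int(epoch), upstream, revision


def compare_versions(a, b):
    """Negative if a sorts before b, zero if equal, positive if after."""
    ea, ua, ra = _split(a)
    eb, ub, rb = _split(b)
    return (ea - eb) or _compare_part(ua, ub) or _compare_part(ra, rb)


_OPS = {"<<": operator.lt, "<=": operator.le, "=": operator.eq,
        ">=": operator.ge, ">>": operator.gt}


def satisfied(alternatives, installed):
    """True if any (package, op, version) alternative matches `installed`."""
    for package, op, version in alternatives:
        have = installed.get(package)
        if have is not None and _OPS[op](compare_versions(have, version), 0):
            return True
    return False


HIPCC_DEPENDS = [
    ("libclang-common-15-dev", "<<", "1:15.0.6-5~exp1"),
    ("libclang-rt-15-dev", ">=", "1:15.0.6-5~exp1"),
]
TYPO_DEPENDS = [
    ("libclang-common-15-dev", "<<", "1:15.0.6-5~exp1"),
    ("libclang-rt-15-dev", ">=", "1:15:0.6-5~exp1"),
]

# Hypothetical systems, for illustration only.
pre_split = {"libclang-common-15-dev": "1:15.0.6-4"}
post_split = {"libclang-common-15-dev": "1:15.0.7-1",
              "libclang-rt-15-dev": "1:15.0.7-1"}

print(satisfied(HIPCC_DEPENDS, pre_split))    # True
print(satisfied(HIPCC_DEPENDS, post_split))   # True
print(satisfied(TYPO_DEPENDS, post_split))    # False
```

Under this comparison model, the colon in `15:0.6` sorts after the dot in `15.0.7`, so the mistyped bound is higher than the example post-split version and the second alternative cannot match it.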
Bug#1032899: unblock: rocm-hipamd/5.2.3-6
Hi Paul,

just wanted to say sorry, this is taking a while.

On 2023-04-22 13:34, Paul Gevers wrote:
>> The only way to do that with llvm-toolchain-15 from testing is by
>> changing the dependency libclang-rt-15-dev back to
>> libclang-common-15-dev (the pre-split version).
>
> Hmm, so this complicates things. Can you do this change in unstable, or
> would it be broken in unstable?

I did not think of that, and you are right, of course. The build breaks
in unstable; the relevant files have all been moved to
libclang-rt-15-dev.

However: unless I'm utterly mistaken, these files are only needed for
building tests -- which we don't run on buildds anyway. The package
builds fine without this dependency if test building is skipped, so this
could be a solution when going through unstable.

However-however: libclang-rt-15-dev is also a dependency of the produced
binary package hipcc. That makes sense, since I may want to compile a
test skipped above on my own machine, for example.

It's this dependency that makes things tricky (I'm pretty sure there's a
versioned Depends missing anyway) and I'd like to be 100% confident
before suggesting any change to this.

I'm leaving the moreinfo tag for now, and I'll remove it once this is
solved and tested thoroughly.

>> If that is an option, I could prepare an upload, and also reduce out
>> whatever other changes you don't feel comfortable with in the larger
>> diff.
>
> That would be good. Can you also share the minimal delta with the
> current version in unstable? I'll check if that's acceptable.

Best,
Christian
Bug#1032899: unblock: rocm-hipamd/5.2.3-6
Control: tags -1 moreinfo

Hi,

On 21-04-2023 23:43, Christian Kastner wrote:
> In the event that llvm-toolchain-15 will not be allowed to migrate:

I would be surprised if llvm-toolchain-15 gets updated in bookworm.

> there are some fixes in the current version of rocm-hipamd that really
> should get into bookworm, most notably the missing libamd-comgr-dev
> dependency, and the added patches.
>
> The only way to do that with llvm-toolchain-15 from testing is by
> changing the dependency libclang-rt-15-dev back to
> libclang-common-15-dev (the pre-split version).

Hmm, so this complicates things. Can you do this change in unstable, or
would it be broken in unstable?

> If that is an option, I could prepare an upload, and also reduce out
> whatever other changes you don't feel comfortable with in the larger
> diff.

That would be good. Can you also share the minimal delta with the
current version in unstable? I'll check if that's acceptable.

Paul
Bug#1032899: unblock: rocm-hipamd/5.2.3-6
Control: tags -1 - moreinfo

Hi Paul,

On 2023-04-20 08:58, Paul Gevers wrote:
> Sorry for taking so long to respond (the moreinfo tag was still attached
> to the bug, so it didn't show up in my regular bts view, so please
> remove it when you reply).

Done.

> On 16-03-2023 11:40, Christian Kastner wrote:
>>> Overall, the diff is a bit long (and has some irrelevant stuff), so
>>> I'm hesitant to offer t-p-u now (to avoid waiting for
>>> llvm-toolchain-15).
>>
>> Understood. Yeah, the diff is long, unfortunately, as the packaging
>> fixes accumulated over time.
>
> That's why (especially around the freeze) we expect maintainers to keep
> track of migration and ensure they happen. You got stuck behind
> llvm-toolchain-15, but that's very unlikely to be fixed before the release.

We were actually well aware of the migration issue (it was, after all,
preventing our own migration). But that blocking RC bug appeared like an
isolated issue in llvm-toolchain-15, so we were kind of speculating on
the idea that it would eventually resolve itself in time. That bug got
overlooked out of sheer bad luck, though.

In the event that llvm-toolchain-15 will not be allowed to migrate:
there are some fixes in the current version of rocm-hipamd that really
should get into bookworm, most notably the missing libamd-comgr-dev
dependency, and the added patches.

The only way to do that with llvm-toolchain-15 from testing is by
changing the dependency libclang-rt-15-dev back to
libclang-common-15-dev (the pre-split version).

If that is an option, I could prepare an upload, and also reduce out
whatever other changes you don't feel comfortable with in the larger
diff.

>> Is this something that you could consider at a later point in time, if I
>> also break down the diff into more reviewable fragments (dependencies,
>> build, metadata, ...)? Because I do think that most changes are just
>> fixes of one sort or another - no features added.
>
> I checked the diff again and I was about to propose to upload it to tpu,
> but I saw the following:
>
> diff -Nru rocm-hipamd-5.2.3/debian/rules rocm-hipamd-5.2.3/debian/rules
> --- rocm-hipamd-5.2.3/debian/rules 2022-10-20 19:20:33.0 +
> +++ rocm-hipamd-5.2.3/debian/rules 2023-03-10 22:38:51.0 +
>
> [...]
> + -DHIP_PLATFORM=amd
>
> Is that correct for the arm64 builds?

Thanks for checking! Yes, that refers to the GPU arch, not the CPU arch.
HIP code is portable in the sense that it can work with both AMD and
Nvidia GPUs.

Best,
Christian
Bug#1032899: unblock: rocm-hipamd/5.2.3-6
Hi,

Sorry for taking so long to respond (the moreinfo tag was still attached
to the bug, so it didn't show up in my regular bts view, so please
remove it when you reply).

On 16-03-2023 11:40, Christian Kastner wrote:
>> Overall, the diff is a bit long (and has some irrelevant stuff), so
>> I'm hesitant to offer t-p-u now (to avoid waiting for
>> llvm-toolchain-15).
>
> Understood. Yeah, the diff is long, unfortunately, as the packaging
> fixes accumulated over time.

That's why (especially around the freeze) we expect maintainers to keep
track of migration and ensure they happen. You got stuck behind
llvm-toolchain-15, but that's very unlikely to be fixed before the
release.

> Is this something that you could consider at a later point in time, if I
> also break down the diff into more reviewable fragments (dependencies,
> build, metadata, ...)? Because I do think that most changes are just
> fixes of one sort or another - no features added.

I checked the diff again and I was about to propose to upload it to tpu,
but I saw the following:

diff -Nru rocm-hipamd-5.2.3/debian/rules rocm-hipamd-5.2.3/debian/rules
--- rocm-hipamd-5.2.3/debian/rules 2022-10-20 19:20:33.0 +
+++ rocm-hipamd-5.2.3/debian/rules 2023-03-10 22:38:51.0 +

[...]
+ -DHIP_PLATFORM=amd

Is that correct for the arm64 builds?

Paul
Bug#1032899: unblock: rocm-hipamd/5.2.3-6
Hi all,

I feel responsible for several of the issues listed by Paul, as my
earlier activity matches the time frame of some of the changes and
problems.

Christian Kastner, on 2023-03-16:
> On 2023-03-16 10:31, Paul Gevers wrote:
> > Control: tags -1 moreinfo On 16-03-2023 00:16, Christian Kastner
> > wrote: For next time, can you please contact us earlier? We could
> > have solved the earlier problems in testing-proposed-updates (in
> > January), then we would now be in a better position.
>
> I didn't think of that solution as the RC-blocked dependency was only
> available in unstable, and admittedly because I thought this would
> resolve itself in time.
>
> But in any case: yes, earlier contact would have been helpful, and I'll
> do so in future.

Acknowledged. I must admit I had a similar perception of the situation
when I sloppily checked migration status two months ago, and it didn't
occur immediately to me that it would become an entangled migration
problem during hard freeze. I'm sorry about that.

> > By the way, I checked, but none of the ci.d.n host will run any of
> > your tests, as none of them has an amdgpu (is that a thing you could
> > expect on non-amd architectures by the way?).
>
> Correct! Tests will be skipped on official infra.
>
> It's not just a matter of the missing hardware (we have it, but DSA has
> understandable concerns), it's also about how to even express that a
> package needs a GPU to run its tests (build-time or autopkgtest).

Some kernel and hardware combinations may cause a host hangup, e.g. the
rocm-hipamd package version in testing doesn't properly serialize tests,
and this causes a number of bus contention errors when running the test
suite, eventually leading to a hangup. I also have a more concerning
case of a test item running into a potential kernel bug on rx6800, which
I'm long overdue to investigate in depth with competent kernel people
(actually I'm unable to tell whether the hardware or the kernel is at
fault thus far, as the crash occurs in amdgpu ecc functions).

There are other technical concerns regarding maintenance of virtual
machines and binding them to physical hardware due to having to pass the
GPU through the hardware. A third issue is that passing the tests almost
always requires running with non-free firmware that cannot be freely
audited. The current combination of skippable tests with a check on the
availability of the kfd device is the best we have managed to achieve
thus far.

> I recently initiated a discussion about this [3]. For now, the idea to
> run parallel debci infra with guaranteed GPU presence, gather
> experience, and to eventually share proposals on how a GPU dependency
> could be expressed in d/control and d/tests/control.

(I'm overdue in answering [3], but overall I was mostly fine with the
ideas and haven't spotted anything of concern yet.)

> > One thing I spotted along the way; the (Build-)Depends on llvm
> > related packages use the *versioned* ones. Is there a reason not to
> > use the unversioned ones from src:llvm-defaults? That would make llvm
> > transitions a bit easier.
>
> I'd have to check with the co-maintainers who added it, but from what I
> gather so far, the ROCm stack needs a very recent llvm because of many
> changes being upstreamed there.

The ROCm stack is actually developed against a fork of llvm (the
rocm-llvm). To avoid having to package more or less a code copy of the
native llvm, we instead target the next llvm-toolchain version which
contains upstreamed changes from rocm-llvm. Even that requires extensive
patching; thankfully we have benefited from the substantial help of
people from AMD thus far on that front.

> > [1] https://lists.debian.org/debian-devel/2022/09/msg00105.html and
> > follow-up
>
> [2] https://github.com/torvalds/linux/blob/v6.2/drivers/gpu/drm/amd/amdkfd/Kconfig#L6-L8
> [3] https://lists.debian.org/debian-ai/2023/03/msg00038.html

Thank you for your work on putting together Debian 12 bookworm!

Have a nice day, :)
-- 
  .''`.  Étienne Mollier
 : :' :  gpg: 8f91 b227 c7d6 f2b1 948c 8236 793c f67e 8f0d 11da
 `. `'   sent from /dev/tty1, please excuse my verbosity
   `-    on air: Status Minor - Feel My Hunger
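The "skippable tests with a check on the kfd device" approach mentioned in this message might look roughly like the sketch below. It is not the package's actual d/t/hipcc script; the function names and the demonstration path are assumptions. What it relies on is that `/dev/kfd` is the device node exposed by the amdkfd driver, and that autopkgtest treats exit status 77 as "skipped" for tests declared with `Restrictions: skippable`.

```python
import os

# Exit status that autopkgtest reports as "skipped" for skippable tests.
SKIP_EXIT_CODE = 77


def gpu_available(kfd_path="/dev/kfd"):
    """Return True if the kfd device node exists (hypothetical helper)."""
    return os.path.exists(kfd_path)


def run_test(kfd_path="/dev/kfd"):
    """Return the exit status the test script should finish with."""
    if not gpu_available(kfd_path):
        print("SKIP: no kfd device at", kfd_path)
        return SKIP_EXIT_CODE
    # The real test body (e.g. compiling and running a HIP sample) would go here.
    print("kfd device present, running tests")
    return 0


# Demonstration with a path that is assumed not to exist on any machine.
demo_status = run_test("/nonexistent/kfd")
print(demo_status)  # 77
```

A standalone script would finish with `sys.exit(run_test())`, so that hosts without a GPU report a skip instead of a failure.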
Bug#1032899: unblock: rocm-hipamd/5.2.3-6
Hi Paul,

On 2023-03-16 10:31, Paul Gevers wrote:
> Control: tags -1 moreinfo On 16-03-2023 00:16, Christian Kastner
> wrote: For next time, can you please contact us earlier? We could
> have solved the earlier problems in testing-proposed-updates (in
> January), then we would now be in a better position.

I didn't think of that solution as the RC-blocked dependency was only
available in unstable, and admittedly because I thought this would
resolve itself in time.

But in any case: yes, earlier contact would have been helpful, and I'll
do so in future.

> + * Reduce arch to amd64, arm64, ppc64el
>
> But it fails on ppc64el; so why this selection?

Because those are the only architectures for which the required amdgpu
kernel driver is available [2].

> Also, as the other architectures FTBFS, we prefer in Debian to *not*
> limit the architectures, but just let them fail [1]. This eases
> porter efforts.

Thanks for pointing this out, I thought it was the other way around
(prefer *to* limit to avoid failures). Well, with ppc64el, we followed
that strategy.

> If the packages really don't make sense on some architectures,
> consider using some of the "properties" provided by
> bin:architecture-properties in your Build-Depends.

I wasn't aware of this package and I don't think it'll help us here
because we're specifically tracking [2]. But it'll be very useful to
some of my other packages, thanks!

> By the way, I checked, but none of the ci.d.n host will run any of
> your tests, as none of them has an amdgpu (is that a thing you could
> expect on non-amd architectures by the way?).

Correct! Tests will be skipped on official infra.

It's not just a matter of the missing hardware (we have it, but DSA has
understandable concerns), it's also about how to even express that a
package needs a GPU to run its tests (build-time or autopkgtest).

I recently initiated a discussion about this [3]. For now, the idea is
to run parallel debci infra with guaranteed GPU presence, gather
experience, and eventually share proposals on how a GPU dependency
could be expressed in d/control and d/tests/control.

> One thing I spotted along the way; the (Build-)Depends on llvm
> related packages use the *versioned* ones. Is there a reason not to
> use the unversioned ones from src:llvm-defaults? That would make llvm
> transitions a bit easier.

I'd have to check with the co-maintainers who added it, but from what I
gather so far, the ROCm stack needs a very recent llvm because of many
changes being upstreamed there.

> Overall, the diff is a bit long (and has some irrelevant stuff), so
> I'm hesitant to offer t-p-u now (to avoid waiting for
> llvm-toolchain-15).

Understood. Yeah, the diff is long, unfortunately, as the packaging
fixes accumulated over time.

Is this something that you could consider at a later point in time, if I
also break down the diff into more reviewable fragments (dependencies,
build, metadata, ...)? Because I do think that most changes are just
fixes of one sort or another - no features added.

Best,
Christian

> [1] https://lists.debian.org/debian-devel/2022/09/msg00105.html and follow-up

[2] https://github.com/torvalds/linux/blob/v6.2/drivers/gpu/drm/amd/amdkfd/Kconfig#L6-L8
[3] https://lists.debian.org/debian-ai/2023/03/msg00038.html
Bug#1032899: unblock: rocm-hipamd/5.2.3-6
Control: tags -1 moreinfo

Hi,

On 16-03-2023 00:16, Christian Kastner wrote:
> On 2023-03-13 18:28, Christian Kastner wrote:
>> [ Impact ]
>> The new versions are in far better shape: they've caught missing
>> dependencies, added patches, improved the build process, etc.
>
> Apologies, I was only thinking of the more recent releases.
>
> Revision -2 fixed an RC bug in January, but never got the chance to
> migrate because of an RC bug in a dependency. Revision -6 fixed another
> RC bug.
>
> All releases after -2 were incremental improvements that basically never
> got the chance to migrate because of a dependency not migrating.

For next time, can you please contact us earlier? We could have solved
the earlier problems in testing-proposed-updates (in January), then we
would now be in a better position.

+ * Reduce arch to amd64, arm64, ppc64el

But it fails on ppc64el; so why this selection? Also, as the other
architectures FTBFS, we prefer in Debian to *not* limit the
architectures, but just let them fail [1]. This eases porter efforts. If
the packages really don't make sense on some architectures, consider
using some of the "properties" provided by bin:architecture-properties
in your Build-Depends.

By the way, I checked, but none of the ci.d.n host will run any of your
tests, as none of them has an amdgpu (is that a thing you could expect
on non-amd architectures by the way?).

One thing I spotted along the way; the (Build-)Depends on llvm related
packages use the *versioned* ones. Is there a reason not to use the
unversioned ones from src:llvm-defaults? That would make llvm
transitions a bit easier.

Overall, the diff is a bit long (and has some irrelevant stuff), so I'm
hesitant to offer t-p-u now (to avoid waiting for llvm-toolchain-15).

Paul

[1] https://lists.debian.org/debian-devel/2022/09/msg00105.html and follow-up
Bug#1032899: unblock: rocm-hipamd/5.2.3-6
On 2023-03-13 18:28, Christian Kastner wrote:
> [ Impact ]
> The new versions are in far better shape: they've caught missing
> dependencies, added patches, improved the build process, etc.

Apologies, I was only thinking of the more recent releases.

Revision -2 fixed an RC bug in January, but never got the chance to
migrate because of an RC bug in a dependency. Revision -6 fixed another
RC bug.

All releases after -2 were incremental improvements that basically never
got the chance to migrate because of a dependency not migrating.
Bug#1032899: unblock: rocm-hipamd/5.2.3-6
Package: release.debian.org
Severity: normal
User: release.debian@packages.debian.org
X-Debbugs-Cc: debian...@lists.debian.org
Usertags: unblock
Control: affects -1 + src:rocm-hipamd

Please unblock package rocm-hipamd

rocm-hipamd 5.2.3-1 has been in testing for a few months now, so have
the following -2 and -3 revisions. The three revisions since January
were blocked from migrating by their dependency src:llvm-toolchain-15,
where a package split was introduced to unstable, and one of the new
packages was not allowed to migrate because of an RC bug. This bug was
recently fixed.

[ Reason ]
The changes in -2 to -6 are all just added patches, or packaging fixes.

[ Impact ]
The new versions are in far better shape: they've caught missing
dependencies, added patches, improved the build process, etc.

[ Tests ]
Manual tests, on the workstations of multiple maintainers. These
packages cannot be tested on debci because the tests require GPUs to
work.

[ Risks ]
Given that there are no upstream changes other than added patches for
fixing this, the risks are minimal.
[ Checklist ]
  [x] all changes are documented in the d/changelog
  [x] I reviewed all changes and I approve them
  [x] attach debdiff against the package in testing

unblock rocm-hipamd/5.2.3-6

diff -Nru rocm-hipamd-5.2.3/debian/changelog rocm-hipamd-5.2.3/debian/changelog
--- rocm-hipamd-5.2.3/debian/changelog 2022-10-20 21:20:33.0 +0200
+++ rocm-hipamd-5.2.3/debian/changelog 2023-03-10 23:38:51.0 +0100
@@ -1,3 +1,78 @@
+rocm-hipamd (5.2.3-6) unstable; urgency=medium
+
+  * Reduce arch to amd64, arm64, ppc64el
+  * libamdhip64-5: Add dependency on libamd-comgr2 (Closes: #1032677)
+  * Add myself to Uploaders
+  * Fix Maintainer (same list, different name)
+
+ -- Christian Kastner Fri, 10 Mar 2023 23:38:51 +0100
+
+rocm-hipamd (5.2.3-5) unstable; urgency=medium
+
+  * d/{libamdhip64-dev,rules}: fix version file
+    Closes: #1031264
+  * add d/p/0020-replace-x86_64-with-variables.patch
+    to fix build on aarch64
+  * d/control: add file to hipcc dependencies
+  * d/control: add dependencies for find_package(hip)
+    Closes: #1031538
+  * add d/p/0021-fix-default-cmake-build-on-unsupported-gpus.patch
+    to enable gpu arch autodetection with find_package(hip)
+  * d/not-installed: ignore doxygen docs
+  * d/p/000{4,8,9}*.patch: change hip-lang cmake files,
+    to partially fix #1031540
+  * d/copyright: update copyright date
+  * d/control: add self to uploaders
+  * cleanup patch metadata
+
+ -- Cordell Bloor Sun, 19 Feb 2023 03:51:26 -0700
+
+rocm-hipamd (5.2.3-4) unstable; urgency=medium
+
+  * d/t/hipcc: also skip when no kfd driver is loaded.
+
+ -- Étienne Mollier Sat, 21 Jan 2023 12:54:49 +0100
+
+rocm-hipamd (5.2.3-3) unstable; urgency=medium
+
+  * d/control: build depends on libclang-rt-15-dev.
+  * d/control: hipcc depends on libclang-rt-15-dev.
+  * d/t/hipcc: add; basic script testing hipcc.
+  * d/t/hipconfig: add; script skipping hipconfig if no amdgpu is available.
+  * d/t/control: add hipcc to superficial autopkgtests.
+  * d/t/control: run the d/t/hipconfig test script instead of the command;
+    this allows us to trigger conditions for when hardware is not available
+    and the script has to be skipped.
+
+ -- Étienne Mollier Wed, 18 Jan 2023 20:35:17 +0100
+
+rocm-hipamd (5.2.3-2) unstable; urgency=medium
+
+  [ Cordell Bloor ]
+  * d/patches: add 0020-hipcc-remove-rpath-flags.patch
+    Closes: #1021642
+  * d/rules: trim unnecessary rules
+  * d/rules: strip RUNPATH from libamdhip64.so
+  * debian/patches: backport 56b3260 from upstream
+    Closes: #1021643
+  * d/rules: disable creation of duplicate files
+  * d/patches: fix search paths when building with g++
+  * d/patches: add 0002-fix-cmake-library-notfound-check.patch
+  * d/libamdhip64-dev.install: install /usr/share/hip/version
+
+  [ Étienne Mollier ]
+  * 0005-clang-15.patch: also adjust llc postfix.
+    Thanks to Jakub Jaszewski
+  * d/t/control: check hipconfig doesn't output error messages.
+  * d/control: hipcc depends on rocminfo.
+  * d/control: declare compliance to standards version 4.6.2.
+  * d/copyright: update copyright year.
+  * d/rules: build tests in parallel.
+  * d/rules: set library path to find the freshly built library.
+  * d/rules: force run tests sequentially; avoid bus contention on the GPU.
+
+ -- Étienne Mollier Sat, 14 Jan 2023 11:16:01 +0100
+
 rocm-hipamd (5.2.3-1) unstable; urgency=medium
 
   * Migrate ROCm 5.2.3 to unstable.
diff -Nru rocm-hipamd-5.2.3/debian/control rocm-hipamd-5.2.3/debian/control
--- rocm-hipamd-5.2.3/debian/control 2022-10-20 21:20:33.0 +0200
+++ rocm-hipamd-5.2.3/debian/control 2023-03-10 23:38:51.0 +0100
@@ -6,12 +6,14 @@
 Section: devel
 Homepage: https://github.com/rocm-developer-tools/hipamd
 Priority: optional
-Standards-Version: 4.6.1
+Standards-Version: 4.6.2
 Vcs-Git: https://salsa.debian.org/rocm-team/rocm-hipamd.git
 Vcs-Browser: