Bug#1032899: unblock: rocm-hipamd/5.2.3-6

2023-04-30 Thread Christian Kastner
Hi Paul,

On 2023-04-30 07:59, Paul Gevers wrote:
> Please go ahead with your 04_ proposal and please remove the moreinfo
> tag once the upload happened.

In -7, there was a typo that broke installability for hipcc,
specifically a version contained a second colon where a dot was expected:

> [...] | libclang-rt-15-dev (>= 1:15:0.6-5~exp1),
                                    ^^^

I went ahead with an -8 upload that fixes just that one typo, and I
successfully tested all install/upgrade paths for hipcc:
  bookworm: none -> -8
  bookworm:  -1  -> -8
  unstable: none -> -8
  unstable:  -7  -> -8

I'm sorry for the noise. I'm more than puzzled how this could have snuck
in, as I tested the above upgrade paths before proposing the change.
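
A typo of this kind can be caught mechanically before an upload. The following is only a minimal sketch, not part of the package or of any Debian tool: it assumes that a version inside a dependency relation should contain at most one colon (the epoch separator), and the helper name `find_suspicious_versions` is made up for illustration.

```python
import re

# Matches the operator and version inside a relation such as
# "(>= 1:15.0.6-5~exp1)".
_RELATION = re.compile(r"\(\s*(<<|<=|=|>=|>>)\s*([^)\s]+)\s*\)")


def find_suspicious_versions(depends_field):
    """Return versions from a Depends-style field with more than one colon.

    Assumption: only the epoch separator should be a colon, so a second
    colon is most likely a typo for a dot.
    """
    return [
        version
        for _operator, version in _RELATION.findall(depends_field)
        if version.count(":") > 1
    ]


if __name__ == "__main__":
    field = (
        "libclang-common-15-dev (<< 1:15.0.6-5~exp1) | "
        "libclang-rt-15-dev (>= 1:15:0.6-5~exp1)"
    )
    print(find_suspicious_versions(field))
```

Run against the hipcc relation from -7, this flags only the second alternative; the correctly written first alternative passes.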

Best,
Christian



Bug#1032899: unblock: rocm-hipamd/5.2.3-6

2023-04-30 Thread Christian Kastner
Control: tags -1 - moreinfo

Hi Paul,

On 2023-04-30 07:59, Paul Gevers wrote:
>> I may be misunderstanding something here. I interpreted your t-p-u hint
>> for the case where a fix via unstable wouldn't be possible because of
>> the dependency issue. The proposal, however, would work via unstable.
> 
> It was for *me*. I reviewed the version in unstable and had some
> concerns. Due to the size of the debdiff, it's easier to look at the
> proposed delta to unstable (that I reviewed), than to review from scratch.

Got it.

> Please go ahead with your 04_ proposal and please remove the moreinfo
> tag once the upload happened.

Best,
Christian



Bug#1032899: unblock: rocm-hipamd/5.2.3-6

2023-04-30 Thread Paul Gevers

Control: tags -1 confirmed moreinfo

Hi Christian,

On 29-04-2023 00:17, Christian Kastner wrote:
>> I asked you to *also* provide the diff between *current* unstable and
>> your proposal (via unstable), because "I was about to propose to upload
>> it to tpu" (2023-04-20).
>
> Sure, the 04_ attachment is the debdiff between unstable -6 and the
> proposed update -7, which removes all of the less important changes that
> I marked with (+) in my previous log.
>
> I may be misunderstanding something here. I interpreted your t-p-u hint
> for the case where a fix via unstable wouldn't be possible because of
> the dependency issue. The proposal, however, would work via unstable.

It was for *me*. I reviewed the version in unstable and had some
concerns. Due to the size of the debdiff, it's easier to look at the
proposed delta to unstable (that I reviewed), than to review from scratch.

Please go ahead with your 04_ proposal and please remove the moreinfo
tag once the upload happened.


Paul




Bug#1032899: unblock: rocm-hipamd/5.2.3-6

2023-04-28 Thread Christian Kastner
Hi Paul,

On 2023-04-28 17:48, Paul Gevers wrote:
> On 28-04-2023 00:58, Christian Kastner wrote:
>> So I split that diff into 02 (patches) and 03 (NOT-patches), also
>> attached.
> 
> I think you forgot to add them.

I did, sorry.

>> Would a package with just the patches and the (*) changes be acceptable?
> 
> I asked you to *also* provide the diff between *current* unstable and
> your proposal (via unstable), because "I was about to propose to upload
> it to tpu" (2023-04-20).

Sure, the 04_ attachment is the debdiff between unstable -6 and the
proposed update -7, which removes all of the less important changes that
I marked with (+) in my previous log.

I may be misunderstanding something here. I interpreted your t-p-u hint
for the case where a fix via unstable wouldn't be possible because of
the dependency issue. The proposal, however, would work via unstable.

Best,
Christian
diff -Nru rocm-hipamd-5.2.3/debian/patches/0001-Clang-version-munging.patch rocm-hipamd-5.2.3/debian/patches/0001-Clang-version-munging.patch
--- rocm-hipamd-5.2.3/debian/patches/0001-Clang-version-munging.patch	2022-10-20 21:20:33.0 +0200
+++ rocm-hipamd-5.2.3/debian/patches/0001-Clang-version-munging.patch	2023-04-25 19:50:14.0 +0200
@@ -1,9 +1,9 @@
 From: Maxime Chambonnet 
 Date: Sat, 11 Feb 2022 11:28:54 +0100
 Subject: Clang version munging
- https://github.com/ROCm-Developer-Tools/HIP/pull/2451
 
-Forwarded: yes
+Forwarded: https://github.com/ROCm-Developer-Tools/HIP/pull/2451
+Applied-Upstream: https://github.com/ROCm-Developer-Tools/HIP/commit/0c443d12011da16a036057e0472ae59c68bc901f
 ---
  hip/bin/hipcc | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
diff -Nru rocm-hipamd-5.2.3/debian/patches/0002-fix-cmake-library-notfound-check.patch rocm-hipamd-5.2.3/debian/patches/0002-fix-cmake-library-notfound-check.patch
--- rocm-hipamd-5.2.3/debian/patches/0002-fix-cmake-library-notfound-check.patch	1970-01-01 01:00:00.0 +0100
+++ rocm-hipamd-5.2.3/debian/patches/0002-fix-cmake-library-notfound-check.patch	2023-04-25 19:50:14.0 +0200
@@ -0,0 +1,47 @@
+From: Cordell Bloor 
+Date: Mon, 24 Oct 2022 00:07:40 -0400
+Subject: fix cmake library notfound check
+
+If find_library does not find the library, the given variable is
+set with a value that has a -NOTFOUND suffix. For example, the
+CLANGRT_BUILTINS variable will be set with the value
+CLANGRT_BUILTINS-NOTFOUND.
+
+Applied-Upstream: https://github.com/ROCm-Developer-Tools/HIP/commit/d12d0ebc578601de138765ee4b1ddd2dcbc79edf
+
+---
+diff --git a/hip-config.cmake.in b/hip-config.cmake.in
+index ba3e75c..a27badc 100755
+--- a/hip-config.cmake.in
++++ b/hip-config.cmake.in
+@@ -287,7 +287,7 @@ if(HIP_COMPILER STREQUAL "clang")
+ ${HIP_CLANG_INCLUDE_PATH}/../lib/linux)
+ 
+ # Add support for __fp16 and _Float16, explicitly link with compiler-rt
+-if(CLANGRT_BUILTINS-NOTFOUND)
++if(NOT CLANGRT_BUILTINS)
+   message(FATAL_ERROR "clangrt builtins lib not found")
+ else()
+   set_property(TARGET hip::host APPEND PROPERTY INTERFACE_LINK_LIBRARIES "${CLANGRT_BUILTINS}")
+diff --git a/hip/hip-lang-config.cmake.in b/hip/hip-lang-config.cmake.in
+index 1a72643..07f24f9 100644
+--- a/hip/hip-lang-config.cmake.in
++++ b/hip/hip-lang-config.cmake.in
+@@ -94,7 +94,7 @@ find_path(HSA_HEADER hsa/hsa.h
+ /opt/rocm/include
+ )
+ 
+-if (HSA_HEADER-NOTFOUND)
++if (NOT HSA_HEADER)
+   message (FATAL_ERROR "HSA header not found! ROCM_PATH environment not set")
+ endif()
+ 
+@@ -136,7 +136,7 @@ set_property(TARGET hip-lang::device APPEND PROPERTY
+ )
+ 
+ # Add support for __fp16 and _Float16, explicitly link with compiler-rt
+-if(CLANGRT_BUILTINS-NOTFOUND)
++if(NOT CLANGRT_BUILTINS)
+ message(FATAL_ERROR "clangrt builtins lib not found")
+ else()
+   set_property(TARGET hip-lang::device APPEND PROPERTY
diff -Nru rocm-hipamd-5.2.3/debian/patches/0003-hip-config.cmake.patch rocm-hipamd-5.2.3/debian/patches/0003-hip-config.cmake.patch
--- rocm-hipamd-5.2.3/debian/patches/0003-hip-config.cmake.patch	2022-10-20 21:20:33.0 +0200
+++ rocm-hipamd-5.2.3/debian/patches/0003-hip-config.cmake.patch	2023-04-25 19:50:14.0 +0200
@@ -2,6 +2,7 @@
 Date: Thu, 27 Jan 2022 18:47:04 +0100
 Subject: hip-config.cmake
 
+Forwarded: no
 ---
  hip-config.cmake.in | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
diff -Nru rocm-hipamd-5.2.3/debian/patches/0004-hip-cmake-install.patch rocm-hipamd-5.2.3/debian/patches/0004-hip-cmake-install.patch
--- rocm-hipamd-5.2.3/debian/patches/0004-hip-cmake-install.patch	2022-10-20 21:20:33.0 +0200
+++ rocm-hipamd-5.2.3/debian/patches/0004-hip-cmake-install.patch	2023-04-25 19:50:14.0 +0200
@@ -2,6 +2,7 @@
 Date: Tue, 8 Feb 2022 12:41:33 +0100
 Subject: hip cmake install
 
+Applied-Upstream: https://github.com/ROCm-Developer-Tools/hipamd/commit/f892306e227983a7c1943992ba70bf4e4b189105
 ---
  src/CMakeLists.txt | 10 +-
  1 file changed, 5 insertions(+), 5 deletions(-)
@@ 
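
The 0002 patch in the diff above replaces `if(CLANGRT_BUILTINS-NOTFOUND)` with `if(NOT CLANGRT_BUILTINS)`. As far as CMake's documented rules for `if()` go, any unquoted argument ending in `-NOTFOUND` is itself a false constant, so the old condition could never be true and the FATAL_ERROR branch was unreachable. The sketch below is a simplified Python model of those rules, covering only the cases relevant to the patch (numeric constants other than 0 and 1 are left out); the function names and the library path are made up for illustration.

```python
# Simplified model of how CMake's if() treats a single unquoted argument.
# This is not a reimplementation of CMake's full condition syntax.

_TRUE_CONSTANTS = {"1", "ON", "YES", "TRUE", "Y"}
_FALSE_CONSTANTS = {"0", "OFF", "NO", "FALSE", "N", "IGNORE", "NOTFOUND", ""}


def is_false_constant(value):
    """True if the string is one of CMake's false constants."""
    return value.upper() in _FALSE_CONSTANTS or value.endswith("-NOTFOUND")


def cmake_if(argument, variables):
    """Evaluate if(<argument>) against a dict of defined variables."""
    if is_false_constant(argument):
        return False
    if argument.upper() in _TRUE_CONSTANTS:
        return True
    # Otherwise the argument names a variable: true only if it is defined
    # to a value that is not a false constant.
    return argument in variables and not is_false_constant(variables[argument])


if __name__ == "__main__":
    # What find_library() leaves behind when the library is missing.
    missing = {"CLANGRT_BUILTINS": "CLANGRT_BUILTINS-NOTFOUND"}
    # Hypothetical path, only to represent a successful lookup.
    found = {"CLANGRT_BUILTINS": "/usr/lib/libclang_rt.builtins.a"}

    for variables in (missing, found):
        old_check = cmake_if("CLANGRT_BUILTINS-NOTFOUND", variables)
        new_check = not cmake_if("CLANGRT_BUILTINS", variables)
        print("old check fires:", old_check, "| new check fires:", new_check)
```

In this model the old check stays false whether or not the library was found, while the new check fires exactly when the lookup failed.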

Bug#1032899: unblock: rocm-hipamd/5.2.3-6

2023-04-28 Thread Paul Gevers

Hi Christian,

On 28-04-2023 00:58, Christian Kastner wrote:
> So I split that diff into 02 (patches) and 03 (NOT-patches), also attached.

I think you forgot to add them.

> Would a package with just the patches and the (*) changes be acceptable?

I asked you to *also* provide the diff between *current* unstable and
your proposal (via unstable), because "I was about to propose to upload
it to tpu" (2023-04-20).


Paul




Bug#1032899: unblock: rocm-hipamd/5.2.3-6

2023-04-27 Thread Christian Kastner
Control: tags -1 - moreinfo

Hi Paul,

sorry this took a while.

On 2023-04-22 13:34, Paul Gevers wrote:
> On 21-04-2023 23:43, Christian Kastner wrote:
>> The only way to do that with llvm-toolchain-15 from testing is by
>> changing the dependency libclang-rt-15-dev back to
>> libclang-common-15-dev (the pre-split version).
> 
> Hmm, so this complicates things. Can you do this change in unstable, or
> would it be broken in unstable?

Luckily, the newer llvm-toolchain-15 is only needed for building tests.
These aren't run (cannot be run) by buildds, so by dropping them for
now, we can drop the problematic build dependency.

And for bin:hipcc, the only binary package affected, I believe the
dependency on libclang-rt-15-dev was wrong anyway: there's a broken
upgrade path for the files that moved in the dependency. The correct
specification should be:

libclang-common-15-dev (<< 1:15.0.6-5~exp1) | libclang-rt-15-dev (>=
1:15.0.6-5~exp1)
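
To make the intent of this alternative concrete, here is a rough Python sketch of how it resolves before and after the package split, with the bound written in its dotted form `1:15.0.6-5~exp1`. It is an approximation of dpkg's version ordering as described in deb-version(7), not dpkg itself; the installed versions in the example are illustrative, and `first_satisfied` is a hypothetical helper.

```python
DIGITS = "0123456789"


def split_version(version):
    """Split '[epoch:]upstream[-revision]' into (epoch, upstream, revision)."""
    epoch, colon, rest = version.partition(":")
    if not colon:
        epoch, rest = "0", version
    upstream, hyphen, revision = rest.rpartition("-")
    if not hyphen:
        upstream, revision = rest, "0"
    return int(epoch), upstream, revision


def _order(char):
    """Sort weight of a non-digit: '~' first, then letters, then the rest."""
    if char == "~":
        return -1
    if char.isalpha():
        return ord(char)
    return ord(char) + 256


def _compare_part(a, b):
    """Compare two upstream or revision strings; returns -1, 0 or 1."""
    i = j = 0
    while i < len(a) or j < len(b):
        # Leading non-digit run, compared character by character.
        while (i < len(a) and a[i] not in DIGITS) or (j < len(b) and b[j] not in DIGITS):
            ac = _order(a[i]) if i < len(a) and a[i] not in DIGITS else 0
            bc = _order(b[j]) if j < len(b) and b[j] not in DIGITS else 0
            if ac != bc:
                return -1 if ac < bc else 1
            i += 1
            j += 1
        # Following digit run, compared numerically (empty counts as 0).
        start_i, start_j = i, j
        while i < len(a) and a[i] in DIGITS:
            i += 1
        while j < len(b) and b[j] in DIGITS:
            j += 1
        na, nb = int(a[start_i:i] or "0"), int(b[start_j:j] or "0")
        if na != nb:
            return -1 if na < nb else 1
    return 0


def compare_versions(left, right):
    """Return a negative, zero or positive number, like a cmp() function."""
    l_epoch, l_up, l_rev = split_version(left)
    r_epoch, r_up, r_rev = split_version(right)
    if l_epoch != r_epoch:
        return -1 if l_epoch < r_epoch else 1
    return _compare_part(l_up, r_up) or _compare_part(l_rev, r_rev)


_OPERATORS = {
    "<<": lambda c: c < 0,
    "<=": lambda c: c <= 0,
    "=": lambda c: c == 0,
    ">=": lambda c: c >= 0,
    ">>": lambda c: c > 0,
}


def first_satisfied(installed, alternatives):
    """Return the first alternative's package name satisfied by `installed`."""
    for package, operator, bound in alternatives:
        if package in installed and _OPERATORS[operator](
            compare_versions(installed[package], bound)
        ):
            return package
    return None


HIPCC_ALTERNATIVES = [
    ("libclang-common-15-dev", "<<", "1:15.0.6-5~exp1"),
    ("libclang-rt-15-dev", ">=", "1:15.0.6-5~exp1"),
]

if __name__ == "__main__":
    # Illustrative package states before and after the split.
    pre_split = {"libclang-common-15-dev": "1:15.0.6-4"}
    post_split = {
        "libclang-common-15-dev": "1:15.0.7-1",
        "libclang-rt-15-dev": "1:15.0.7-1",
    }
    print(first_satisfied(pre_split, HIPCC_ALTERNATIVES))
    print(first_satisfied(post_split, HIPCC_ALTERNATIVES))
```

With the pre-split state the first alternative is chosen, and with the post-split state the second, which is the upgrade path the relation is meant to allow.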

>> If that is an option, I could prepare an upload, and also reduce out
>> whatever other changes you don't feel comfortable with in the larger
>> diff.
> 
> That would be good. Can you also share the minimal delta with the
> current version in unstable? I'll check if that's acceptable.

I've attached the new diff as 01 (FULL) but its d/changelog is noisy,
reflecting the ongoing development process we had in this younger library.

So I split that diff into 02 (patches) and 03 (NOT-patches), also attached.

02_rocm-hipamd-patches.diff
There were 5 patches added (Jan:4 Feb:1), and these represent fixes that
really must be in the package, but were held up by our dependency. One
patch was dropped. Many others just got DEP3 headers.

03_rocm-NOT-hipamd-patches.diff
The diff is not as large as d/changelog suggests. I've summarized all
the changes below, with (*) marking changes that really should get into
testing, and (+) marking changes that aren't strictly needed.

  * Build Depends added: llvm-15, file
  * (RC #1032677) Depends fixed: bin:libamdhip64-5, bin:libamdhip64-dev
  * Depends fixed: bin:hipcc (as described above)
  * *.install files fixed (+ one d/rules change), not-installed added
  * Build flags fixed in d/rules
  * Another RPATH removed
  * Updates to d/copyright

  + Build Depends added: rocminfo (just for tests)
  + Reduce architectures to amd64, arm64, ppc64el (the only platforms
    with the necessary drivers)
  + Update Standards-Version from 4.6.1 to 4.6.2
  + autopkgtest added

Would a package with just the patches and the (*) changes be acceptable?

Best,
Christian
diff -Nru rocm-hipamd-5.2.3/debian/changelog rocm-hipamd-5.2.3/debian/changelog
--- rocm-hipamd-5.2.3/debian/changelog  2022-10-20 21:20:33.0 +0200
+++ rocm-hipamd-5.2.3/debian/changelog  2023-04-25 19:50:14.0 +0200
@@ -1,3 +1,85 @@
+rocm-hipamd (5.2.3-7) UNRELEASED; urgency=medium
+
+  * hipcc: Fix Depends to enable transition from split clang package
+  * Drop building of tests, and libclang-rt-15-dev dependency
+
+ -- Christian Kastner   Tue, 25 Apr 2023 19:50:14 +0200
+
+rocm-hipamd (5.2.3-6) unstable; urgency=medium
+
+  * Reduce arch to amd64, arm64, ppc64el
+  * libamdhip64-5: Add dependency on libamd-comgr2 (Closes: #1032677)
+  * Add myself to Uploaders
+  * Fix Maintainer (same list, different name)
+
+ -- Christian Kastner   Fri, 10 Mar 2023 23:38:51 +0100
+
+rocm-hipamd (5.2.3-5) unstable; urgency=medium
+
+  * d/{libamdhip64-dev,rules}: fix version file
+Closes: #1031264
+  * add d/p/0020-replace-x86_64-with-variables.patch
+to fix build on aarch64
+  * d/control: add file to hipcc dependencies
+  * d/control: add dependencies for find_package(hip)
+Closes: #1031538
+  * add d/p/0021-fix-default-cmake-build-on-unsupported-gpus.patch
+to enable gpu arch autodetection with find_package(hip)
+  * d/not-installed: ignore doxygen docs
+  * d/p/000{4,8,9}*.patch: change hip-lang cmake files,
+to partially fix #1031540
+  * d/copyright: update copyright date
+  * d/control: add self to uploaders
+  * cleanup patch metadata
+
+ -- Cordell Bloor   Sun, 19 Feb 2023 03:51:26 -0700
+
+rocm-hipamd (5.2.3-4) unstable; urgency=medium
+
+  * d/t/hipcc: also skip when no kfd driver is loaded.
+
+ -- Étienne Mollier   Sat, 21 Jan 2023 12:54:49 +0100
+
+rocm-hipamd (5.2.3-3) unstable; urgency=medium
+
+  * d/control: build depends on libclang-rt-15-dev.
+  * d/control: hipcc depends on libclang-rt-15-dev.
+  * d/t/hipcc: add; basic script testing hipcc.
+  * d/t/hipconfig: add; script skipping hipconfig if no amdgpu is available.
+  * d/t/control: add hipcc to superficial autopkgtests.
+  * d/t/control: run the d/t/hipconfig test script instead of the command;
+this allows us to trigger conditions for when hardware is not available
+and the script has to be skipped.
+
+ -- Étienne Mollier   Wed, 18 Jan 2023 20:35:17 +0100
+
+rocm-hipamd (5.2.3-2) unstable; urgency=medium
+
+  [ Cordell Bloor ]
+  * d/patches: add 0020-hipcc-remove-rpath-flags.patch
+

Bug#1032899: unblock: rocm-hipamd/5.2.3-6

2023-04-25 Thread Christian Kastner
Hi Paul,

just wanted to say sorry, this is taking a while.

On 2023-04-22 13:34, Paul Gevers wrote:
>> The only way to do that with llvm-toolchain-15 from testing is by
>> changing the dependency libclang-rt-15-dev back to
>> libclang-common-15-dev (the pre-split version).
> 
> Hmm, so this complicates things. Can you do this change in unstable, or
> would it be broken in unstable?

I did not think of that, and you are right, of course. The build breaks
in unstable; the relevant files have all been moved to libclang-rt-15-dev.

However: unless I'm utterly mistaken, these files are only needed for
building tests -- which we don't run on buildds anyway. The package
builds fine without this dependency if test building is skipped, so this
could be a solution when going through unstable.

However-however: libclang-rt-15-dev is also a dependency of the produced
binary package hipcc. That makes sense, since I may want to compile a
test skipped above on my own machine, for example.

It's this dependency that makes things tricky (I'm pretty sure there's a
versioned Depends missing anyway) and I'd like to be 100% confident
before suggesting any change to this.

I'm leaving the moreinfo tag for now, and I'll remove it once this is
solved and tested thoroughly.

>> If that is an option, I could prepare an upload, and also reduce out
>> whatever other changes you don't feel comfortable with in the larger
>> diff.
> 
> That would be good. Can you also share the minimal delta with the
> current version in unstable? I'll check if that's acceptable.

Best,
Christian



Bug#1032899: unblock: rocm-hipamd/5.2.3-6

2023-04-22 Thread Paul Gevers

Control: tags -1 moreinfo

Hi,

On 21-04-2023 23:43, Christian Kastner wrote:
> In the event that llvm-toolchain-15 will not be allowed to migrate:

I would be surprised if llvm-toolchain-15 gets updated in bookworm.

> there are some fixes in the current version of rocm-hipamd that really
> should get into bookworm, most notably the missing libamd-comgr-dev
> dependency, and the added patches.
>
> The only way to do that with llvm-toolchain-15 from testing is by
> changing the dependency libclang-rt-15-dev back to
> libclang-common-15-dev (the pre-split version).

Hmm, so this complicates things. Can you do this change in unstable, or
would it be broken in unstable?

> If that is an option, I could prepare an upload, and also reduce out
> whatever other changes you don't feel comfortable with in the larger diff.

That would be good. Can you also share the minimal delta with the
current version in unstable? I'll check if that's acceptable.


Paul




Bug#1032899: unblock: rocm-hipamd/5.2.3-6

2023-04-21 Thread Christian Kastner
Control: tags -1 - moreinfo

Hi Paul,

On 2023-04-20 08:58, Paul Gevers wrote:
> Sorry for taking so long to respond (the moreinfo tag was still attached
> to the bug, so it didn't show up in my regular bts view, so please
> remove it when you reply).

done.

> On 16-03-2023 11:40, Christian Kastner wrote:
>>> Overall, the diff is a bit long (and has some irrelevant stuff), so
>>> I'm hesitant to offer t-p-u now (to avoid waiting for
>>> llvm-toolchain-15).
>>
>> Understood. Yeah, the diff is long, unfortunately, as the packaging
>> fixes accumulated over time.
> 
> That's why (especially around the freeze) we expect maintainers to keep
> track of migration and ensure they happen. You got stuck behind
> llvm-toolchain-15, but that's very unlikely to be fixed before the release.

We were actually well aware of the migration issue (it was, after all,
preventing our own migration). But that blocking RC bug appeared like an
isolated issue in llvm-toolchain-15, so we were kind of speculating on
the idea that it would eventually resolve itself in time. That bug got
overlooked out of sheer bad luck, though.

In the event that llvm-toolchain-15 will not be allowed to migrate:
there are some fixes in the current version of rocm-hipamd that really
should get into bookworm, most notably the missing libamd-comgr-dev
dependency, and the added patches.

The only way to do that with llvm-toolchain-15 from testing is by
changing the dependency libclang-rt-15-dev back to
libclang-common-15-dev (the pre-split version).

If that is an option, I could prepare an upload, and also reduce out
whatever other changes you don't feel comfortable with in the larger diff.

>> Is this something that you could consider at a later point in time, if I
>> also break down the diff into more reviewable fragments (dependencies,
>> build, metadata, ...)? Because I do think that most changes are just
>> fixes of one sort or another - no features added.
> 
> I checked the diff again and I was about to propose to upload it to tpu,
> but I saw the following:
> 
> diff -Nru rocm-hipamd-5.2.3/debian/rules rocm-hipamd-5.2.3/debian/rules
> --- rocm-hipamd-5.2.3/debian/rules  2022-10-20 19:20:33.0 +
> +++ rocm-hipamd-5.2.3/debian/rules  2023-03-10 22:38:51.0 +
> 
> [...]
> +   -DHIP_PLATFORM=amd
> 
> Is that correct for the arm64 builds?

Thanks for checking! Yes, that refers to the GPU arch, not the CPU arch.
HIP code is portable in the sense that it can work with both AMD and
Nvidia GPUs.

Best,
Christian



Bug#1032899: unblock: rocm-hipamd/5.2.3-6

2023-04-20 Thread Paul Gevers

Hi,

Sorry for taking so long to respond (the moreinfo tag was still attached 
to the bug, so it didn't show up in my regular bts view, so please 
remove it when you reply).


On 16-03-2023 11:40, Christian Kastner wrote:
>> Overall, the diff is a bit long (and has some irrelevant stuff), so
>> I'm hesitant to offer t-p-u now (to avoid waiting for
>> llvm-toolchain-15).
>
> Understood. Yeah, the diff is long, unfortunately, as the packaging
> fixes accumulated over time.

That's why (especially around the freeze) we expect maintainers to keep
track of migration and ensure they happen. You got stuck behind
llvm-toolchain-15, but that's very unlikely to be fixed before the release.

> Is this something that you could consider at a later point in time, if I
> also break down the diff into more reviewable fragments (dependencies,
> build, metadata, ...)? Because I do think that most changes are just
> fixes of one sort or another - no features added.

I checked the diff again and I was about to propose to upload it to tpu,
but I saw the following:


diff -Nru rocm-hipamd-5.2.3/debian/rules rocm-hipamd-5.2.3/debian/rules
--- rocm-hipamd-5.2.3/debian/rules  2022-10-20 19:20:33.0 +
+++ rocm-hipamd-5.2.3/debian/rules  2023-03-10 22:38:51.0 +

[...]
+   -DHIP_PLATFORM=amd

Is that correct for the arm64 builds?

Paul




Bug#1032899: unblock: rocm-hipamd/5.2.3-6

2023-03-16 Thread Étienne Mollier
Hi all,

I feel responsible for several of the issues listed by Paul, as
my earlier activity matches the time frame of some of the
changes and problems.

Christian Kastner, on 2023-03-16:
> On 2023-03-16 10:31, Paul Gevers wrote:
> > Control: tags -1 moreinfo
> >
> > For next time, can you please contact us earlier? We could
> > have solved the earlier problems in testing-proposed-updates (in
> > January), then we would now be in a better position.
> 
> I didn't think of that solution as the RC-blocked dependency was only
> available in unstable, and admittedly because I thought this would
> resolve itself in time.
> 
> But in any case: yes, earlier contact would have been helpful, and I'll
> do so in future.

Acknowledged, I must admit I had a similar perception of the
situation when I sloppily checked migration status two months
ago, and it didn't occur immediately to me that it would become
an entangled migration problem during hard freeze.  I'm sorry
about that.

> > By the way, I checked, but none of the ci.d.n hosts will run any of
> > your tests, as none of them has an amdgpu (is that a thing you could
> > expect on non-amd architectures by the way?).
> 
> Correct! Tests will be skipped on official infra.
> 
> It's not just a matter of the missing hardware (we have it, but DSA has
> understandable concerns), it's also about how to even express that a
> package needs a GPU to run its tests (build-time or autopkgtest).

Some kernel and hardware combinations may cause a host hangup,
e.g. the rocm-hipamd package version in testing doesn't properly
serialize tests, and this causes a number of bus contention errors
when running the test suite, eventually leading to a hangup.  I also
have a more concerning case of a test item running into a potential
kernel bug on rx6800, which I'm long overdue to investigate in depth
with competent kernel people (actually I'm unable to tell whether the
hardware or the kernel is at fault thus far, as the crash occurs in
amdgpu ecc functions).

There are other technical concerns regarding maintenance of
virtual machines and binding them to physical hardware, due to
having to pass the GPU through to them.  The third issue is that
passing the tests almost always requires running non-free firmware
that cannot be freely audited.

The current combination of skippable tests with a check on the
availability of the kfd device is the best we have managed to achieve
thus far.
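
As a rough illustration of that combination, a skippable test wrapper could look like the sketch below. It is not the actual d/t/hipcc script: it assumes the check amounts to testing for the `/dev/kfd` device node, and it relies on autopkgtest treating exit status 77 as "skipped" for tests that declare the `skippable` restriction.

```python
import os
import sys

KFD_DEVICE = "/dev/kfd"  # device node assumed to be exposed by the amdkfd driver
SKIP_EXIT_CODE = 77      # autopkgtest reports 77 as "skipped" for skippable tests


def should_skip(device_path=KFD_DEVICE):
    """Return True when the kfd device is absent and GPU tests cannot run."""
    return not os.path.exists(device_path)


def main():
    if should_skip():
        # Printed to stdout so the message alone does not fail the test.
        print("SKIP: no kfd device found, GPU tests cannot run")
        return SKIP_EXIT_CODE
    # A real test would invoke hipcc or hipconfig here.
    return 0


if __name__ == "__main__":
    sys.exit(main())
```

On a host without the device the wrapper exits with 77, so the test shows up as skipped rather than failed.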

> I recently initiated a discussion about this [3]. For now, the idea is to
> run parallel debci infra with guaranteed GPU presence, gather
> experience, and to eventually share proposals on how a GPU dependency
> could be expressed in d/control and d/tests/control.

(I'm overdue to answer to [3], but overall I was mostly fine
with the ideas and haven't spotted anything of concern yet.)

> > One thing I spotted along the way; the (Build-)Depends on llvm 
> > related packages use the *versioned* ones. Is there a reason not to 
> > use the unversioned ones from src:llvm-defaults? That would make llvm
> > transitions a bit easier.
> 
> I'd have to check with the co-maintainers who added it, but from what I
> gather so far, the ROCm stack needs a very recent llvm because of many
> changes being upstreamed there.

The ROCm stack is actually developed against a fork of llvm (the
rocm-llvm).  To avoid having to package more or less a code copy
of the native llvm, we instead target the next llvm-toolchain
version which contains upstreamed changes from rocm-llvm.  Even
that requires extensive patching; thankfully, we have benefited
from the substantial help of people from AMD thus far on that
front.

> > [1] https://lists.debian.org/debian-devel/2022/09/msg00105.html and 
> > follow-up 
> 
> [2] 
> https://github.com/torvalds/linux/blob/v6.2/drivers/gpu/drm/amd/amdkfd/Kconfig#L6-L8
> [3] https://lists.debian.org/debian-ai/2023/03/msg00038.html

Thank you for your work on putting together Debian 12 bookworm!

Have a nice day,  :)
-- 
  .''`.  Étienne Mollier 
 : :' :  gpg: 8f91 b227 c7d6 f2b1 948c  8236 793c f67e 8f0d 11da
 `. `'   sent from /dev/tty1, please excuse my verbosity
   `-on air: Status Minor - Feel My Hunger




Bug#1032899: unblock: rocm-hipamd/5.2.3-6

2023-03-16 Thread Christian Kastner
Hi Paul,

On 2023-03-16 10:31, Paul Gevers wrote:
> Control: tags -1 moreinfo
>
> For next time, can you please contact us earlier? We could
> have solved the earlier problems in testing-proposed-updates (in
> January), then we would now be in a better position.

I didn't think of that solution as the RC-blocked dependency was only
available in unstable, and admittedly because I thought this would
resolve itself in time.

But in any case: yes, earlier contact would have been helpful, and I'll
do so in future.

> + * Reduce arch to amd64, arm64, ppc64el
> 
> But it fails on ppc64el; so why this selection?

Because those are the only architectures for which the required amdgpu
kernel driver is available [2].

> Also, as the other architectures FTBFS, we prefer in Debian to *not*
>  limit the architectures, but just let them fail [1]. This eases 
> porter efforts.

Thanks for pointing this out, I thought it was the other way around
(prefer *to* limit to avoid failures). Well, with ppc64el, we followed
that strategy.

> If the packages really don't make sense on some architectures, 
> consider using some of the "properties" provided by 
> bin:architecture-properties in your Build-Depends.

I wasn't aware of this package and I don't think it'll help us here
because we're specifically tracking [2]. But it'll be very useful to
some of my other packages, thanks!

> By the way, I checked, but none of the ci.d.n hosts will run any of
> your tests, as none of them has an amdgpu (is that a thing you could
> expect on non-amd architectures by the way?).

Correct! Tests will be skipped on official infra.

It's not just a matter of the missing hardware (we have it, but DSA has
understandable concerns), it's also about how to even express that a
package needs a GPU to run its tests (build-time or autopkgtest).

I recently initiated a discussion about this [3]. For now, the idea is to
run parallel debci infra with guaranteed GPU presence, gather
experience, and to eventually share proposals on how a GPU dependency
could be expressed in d/control and d/tests/control.

> One thing I spotted along the way; the (Build-)Depends on llvm 
> related packages use the *versioned* ones. Is there a reason not to 
> use the unversioned ones from src:llvm-defaults? That would make llvm
> transitions a bit easier.

I'd have to check with the co-maintainers who added it, but from what I
gather so far, the ROCm stack needs a very recent llvm because of many
changes being upstreamed there.

> Overall, the diff is a bit long (and has some irrelevant stuff), so 
> I'm hesitant to offer t-p-u now (to avoid waiting for 
> llvm-toolchain-15).

Understood. Yeah, the diff is long, unfortunately, as the packaging
fixes accumulated over time.

Is this something that you could consider at a later point in time, if I
also break down the diff into more reviewable fragments (dependencies,
build, metadata, ...)? Because I do think that most changes are just
fixes of one sort or another - no features added.

Best,
Christian


> [1] https://lists.debian.org/debian-devel/2022/09/msg00105.html and follow-up 

[2] 
https://github.com/torvalds/linux/blob/v6.2/drivers/gpu/drm/amd/amdkfd/Kconfig#L6-L8
[3] https://lists.debian.org/debian-ai/2023/03/msg00038.html



Bug#1032899: unblock: rocm-hipamd/5.2.3-6

2023-03-16 Thread Paul Gevers

Control: tags -1 moreinfo

Hi,

On 16-03-2023 00:16, Christian Kastner wrote:
> On 2023-03-13 18:28, Christian Kastner wrote:
>> [ Impact ]
>> The new versions are in far better shape: they've caught missing
>> dependencies, added patches, improved the build process, etc.
>
> Apologies, I was only thinking of the more recent releases.
>
> Revision -2 fixed an RC bug in January, but never got the chance to
> migrate because of an RC bug in a dependency. Revision -6 fixed another
> RC bug.
>
> All releases after -2 were incremental improvements that basically never
> got the chance to migrate because of a dependency not migrating.


For next time, can you please contact us earlier? We could have solved 
the earlier problems in testing-proposed-updates (in January), then we 
would now be in a better position.


+ * Reduce arch to amd64, arm64, ppc64el

But it fails on ppc64el; so why this selection? Also, as the other 
architectures FTBFS, we prefer in Debian to *not* limit the 
architectures, but just let them fail [1]. This eases porter efforts. If 
the packages really don't make sense on some architectures, consider 
using some of the "properties" provided by bin:architecture-properties 
in your Build-Depends.


By the way, I checked, but none of the ci.d.n hosts will run any of your
tests, as none of them has an amdgpu (is that a thing you could expect
on non-amd architectures by the way?).


One thing I spotted along the way; the (Build-)Depends on llvm related 
packages use the *versioned* ones. Is there a reason not to use the 
unversioned ones from src:llvm-defaults? That would make llvm 
transitions a bit easier.


Overall, the diff is a bit long (and has some irrelevant stuff), so I'm 
hesitant to offer t-p-u now (to avoid waiting for llvm-toolchain-15).


Paul

[1] https://lists.debian.org/debian-devel/2022/09/msg00105.html and 
follow-up




Bug#1032899: unblock: rocm-hipamd/5.2.3-6

2023-03-15 Thread Christian Kastner
On 2023-03-13 18:28, Christian Kastner wrote:
> [ Impact ]
> The new versions are in far better shape: they've caught missing
> dependencies, added patches, improved the build process, etc.

Apologies, I was only thinking of the more recent releases.

Revision -2 fixed an RC bug in January, but never got the chance to
migrate because of an RC bug in a dependency. Revision -6 fixed another
RC bug.

All releases after -2 were incremental improvements that basically never
got the chance to migrate because of a dependency not migrating.



Bug#1032899: unblock: rocm-hipamd/5.2.3-6

2023-03-13 Thread Christian Kastner
Package: release.debian.org
Severity: normal
User: release.debian@packages.debian.org
X-Debbugs-Cc: debian...@lists.debian.org
Usertags: unblock
Control: affects -1 + src:rocm-hipamd

Please unblock package rocm-hipamd

rocm-hipamd 5.2.3-1 has been in testing for a few months now, so have
the following -2 and -3 revisions.

The three revisions since January were blocked from migrating by its
dependency src:llvm-toolchain-15, where a package split was introduced
to unstable, and one of the new packages was not allowed to migrate
because of an RC bug. This bug was recently fixed.

[ Reason ]
The changes in -2 to -6 are all just added patches, or packaging fixes.

[ Impact ]
The new versions are in far better shape: they've caught missing
dependencies, added patches, improved the build process, etc.

[ Tests ]
Manual tests, on the workstations of multiple maintainers. These
packages cannot be tested on debci because the tests require GPUs to work.

[ Risks ]
Given that there are no upstream changes other than added patches for
fixing this, the risks are minimal.

[ Checklist ]
  [x] all changes are documented in the d/changelog
  [x] I reviewed all changes and I approve them
  [x] attach debdiff against the package in testing

unblock rocm-hipamd/5.2.3-6
diff -Nru rocm-hipamd-5.2.3/debian/changelog rocm-hipamd-5.2.3/debian/changelog
--- rocm-hipamd-5.2.3/debian/changelog  2022-10-20 21:20:33.0 +0200
+++ rocm-hipamd-5.2.3/debian/changelog  2023-03-10 23:38:51.0 +0100
@@ -1,3 +1,78 @@
+rocm-hipamd (5.2.3-6) unstable; urgency=medium
+
+  * Reduce arch to amd64, arm64, ppc64el
+  * libamdhip64-5: Add dependency on libamd-comgr2 (Closes: #1032677)
+  * Add myself to Uploaders
+  * Fix Maintainer (same list, different name)
+
+ -- Christian Kastner   Fri, 10 Mar 2023 23:38:51 +0100
+
+rocm-hipamd (5.2.3-5) unstable; urgency=medium
+
+  * d/{libamdhip64-dev,rules}: fix version file
+Closes: #1031264
+  * add d/p/0020-replace-x86_64-with-variables.patch
+to fix build on aarch64
+  * d/control: add file to hipcc dependencies
+  * d/control: add dependencies for find_package(hip)
+Closes: #1031538
+  * add d/p/0021-fix-default-cmake-build-on-unsupported-gpus.patch
+to enable gpu arch autodetection with find_package(hip)
+  * d/not-installed: ignore doxygen docs
+  * d/p/000{4,8,9}*.patch: change hip-lang cmake files,
+to partially fix #1031540
+  * d/copyright: update copyright date
+  * d/control: add self to uploaders
+  * cleanup patch metadata
+
+ -- Cordell Bloor   Sun, 19 Feb 2023 03:51:26 -0700
+
+rocm-hipamd (5.2.3-4) unstable; urgency=medium
+
+  * d/t/hipcc: also skip when no kfd driver is loaded.
+
+ -- Étienne Mollier   Sat, 21 Jan 2023 12:54:49 +0100
+
+rocm-hipamd (5.2.3-3) unstable; urgency=medium
+
+  * d/control: build depends on libclang-rt-15-dev.
+  * d/control: hipcc depends on libclang-rt-15-dev.
+  * d/t/hipcc: add; basic script testing hipcc.
+  * d/t/hipconfig: add; script skipping hipconfig if no amdgpu is available.
+  * d/t/control: add hipcc to superficial autopkgtests.
+  * d/t/control: run the d/t/hipconfig test script instead of the command;
+this allows us to trigger conditions for when hardware is not available
+and the script has to be skipped.
+
+ -- Étienne Mollier   Wed, 18 Jan 2023 20:35:17 +0100
+
+rocm-hipamd (5.2.3-2) unstable; urgency=medium
+
+  [ Cordell Bloor ]
+  * d/patches: add 0020-hipcc-remove-rpath-flags.patch
+Closes: #1021642
+  * d/rules: trim unnecessary rules
+  * d/rules: strip RUNPATH from libamdhip64.so
+  * debian/patches: backport 56b3260 from upstream
+Closes: #1021643
+  * d/rules: disable creation of duplicate files
+  * d/patches: fix search paths when building with g++
+  * d/patches: add 0002-fix-cmake-library-notfound-check.patch
+  * d/libamdhip64-dev.install: install /usr/share/hip/version
+
+  [ Étienne Mollier ]
+  * 0005-clang-15.patch: also adjust llc postfix.
+Thanks to Jakub Jaszewski
+  * d/t/control: check hipconfig doesn't output error messages.
+  * d/control: hipcc depends on rocminfo.
+  * d/control: declare compliance to standards version 4.6.2.
+  * d/copyright: update copyright year.
+  * d/rules: build tests in parallel.
+  * d/rules: set library path to find the freshly built library.
+  * d/rules: force run tests sequentially; avoid bus contention on the GPU.
+
+ -- Étienne Mollier   Sat, 14 Jan 2023 11:16:01 +0100
+
 rocm-hipamd (5.2.3-1) unstable; urgency=medium
 
   * Migrate ROCm 5.2.3 to unstable.
diff -Nru rocm-hipamd-5.2.3/debian/control rocm-hipamd-5.2.3/debian/control
--- rocm-hipamd-5.2.3/debian/control  2022-10-20 21:20:33.0 +0200
+++ rocm-hipamd-5.2.3/debian/control  2023-03-10 23:38:51.0 +0100
@@ -6,12 +6,14 @@
 Section: devel
 Homepage: https://github.com/rocm-developer-tools/hipamd
 Priority: optional
-Standards-Version: 4.6.1
+Standards-Version: 4.6.2
 Vcs-Git: https://salsa.debian.org/rocm-team/rocm-hipamd.git
 Vcs-Browser: