[clang] [llvm] [HIP] add --offload-compression-level= option (PR #83605)

2024-03-09 Thread Yaxun Liu via cfe-commits

https://github.com/yxsamliu closed 
https://github.com/llvm/llvm-project/pull/83605
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [HIP] add --offload-compression-level= option (PR #83605)

2024-03-08 Thread Yaxun Liu via cfe-commits

https://github.com/yxsamliu updated 
https://github.com/llvm/llvm-project/pull/83605

From c46a3ce625a34a497cd0b14631cb755b903e93d6 Mon Sep 17 00:00:00 2001
From: "Yaxun (Sam) Liu" 
Date: Fri, 1 Mar 2024 13:16:45 -0500
Subject: [PATCH] [HIP] add --offload-compression-level= option

Added --offload-compression-level= option to clang and -compression-level=
option to clang-offload-bundler and clang-linker-wrapper for
controlling compression level.

Added support of long distance matching (LDM) for llvm::zstd which is off
by default. Enable it for clang-offload-bundler by default since it
improves compression rate in general.

Change default compression level to 3 for zstd for clang-offload-bundler
since it works well for bundle entry size from 1KB to 32MB, which should
cover most of the clang-offload-bundler usage. Users can still specify
compression level by -compression-level= option if necessary.

Change-Id: I5e2d7abcaa11a2a37a5e798476e3f572bba11cab
---
 clang/include/clang/Driver/OffloadBundler.h   |   6 +-
 clang/include/clang/Driver/Options.td |   4 +
 clang/lib/Driver/OffloadBundler.cpp   | 111 ++
 clang/lib/Driver/ToolChains/Clang.cpp |  11 +-
 clang/lib/Driver/ToolChains/CommonArgs.cpp|  12 ++
 clang/lib/Driver/ToolChains/CommonArgs.h  |   2 +
 clang/lib/Driver/ToolChains/HIPUtility.cpp|   7 +-
 .../test/Driver/clang-offload-bundler-zlib.c  |  21 +++-
 .../test/Driver/clang-offload-bundler-zstd.c  |  19 ++-
 .../test/Driver/hip-offload-compress-zlib.hip |   7 +-
 .../test/Driver/hip-offload-compress-zstd.hip |   5 +-
 clang/test/Driver/linker-wrapper.c|   5 +-
 .../ClangLinkerWrapper.cpp|   3 +
 .../clang-linker-wrapper/LinkerWrapperOpts.td |   2 +
 .../ClangOffloadBundler.cpp   |   5 +
 llvm/include/llvm/Support/Compression.h   |   5 +-
 llvm/lib/Support/Compression.cpp  |  43 +--
 17 files changed, 204 insertions(+), 64 deletions(-)

diff --git a/clang/include/clang/Driver/OffloadBundler.h 
b/clang/include/clang/Driver/OffloadBundler.h
index 84349abe185fa4..65d33bfbd2825f 100644
--- a/clang/include/clang/Driver/OffloadBundler.h
+++ b/clang/include/clang/Driver/OffloadBundler.h
@@ -17,6 +17,7 @@
 #ifndef LLVM_CLANG_DRIVER_OFFLOADBUNDLER_H
 #define LLVM_CLANG_DRIVER_OFFLOADBUNDLER_H
 
+#include "llvm/Support/Compression.h"
 #include "llvm/Support/Error.h"
 #include "llvm/TargetParser/Triple.h"
 #include 
@@ -36,6 +37,8 @@ class OffloadBundlerConfig {
   bool HipOpenmpCompatible = false;
   bool Compress = false;
   bool Verbose = false;
+  llvm::compression::Format CompressionFormat;
+  int CompressionLevel;
 
   unsigned BundleAlignment = 1;
   unsigned HostInputIndex = ~0u;
@@ -116,7 +119,8 @@ class CompressedOffloadBundle {
 
 public:
   static llvm::Expected>
-  compress(const llvm::MemoryBuffer , bool Verbose = false);
+  compress(llvm::compression::Params P, const llvm::MemoryBuffer ,
+   bool Verbose = false);
   static llvm::Expected>
   decompress(const llvm::MemoryBuffer , bool Verbose = false);
 };
diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 5b3d366dbcf91b..2d26a7983f397b 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -1264,6 +1264,10 @@ def fno_gpu_sanitize : Flag<["-"], "fno-gpu-sanitize">, 
Group;
 def offload_compress : Flag<["--"], "offload-compress">,
   HelpText<"Compress offload device binaries (HIP only)">;
 def no_offload_compress : Flag<["--"], "no-offload-compress">;
+
+def offload_compression_level_EQ : Joined<["--"], 
"offload-compression-level=">,
+  Flags<[HelpHidden]>,
+  HelpText<"Compression level for offload device binaries (HIP only)">;
 }
 
 // CUDA options
diff --git a/clang/lib/Driver/OffloadBundler.cpp 
b/clang/lib/Driver/OffloadBundler.cpp
index f9eadfaec88dec..77c89356bc76bb 100644
--- a/clang/lib/Driver/OffloadBundler.cpp
+++ b/clang/lib/Driver/OffloadBundler.cpp
@@ -924,6 +924,17 @@ CreateFileHandler(MemoryBuffer ,
 }
 
 OffloadBundlerConfig::OffloadBundlerConfig() {
+  if (llvm::compression::zstd::isAvailable()) {
+CompressionFormat = llvm::compression::Format::Zstd;
+// Compression level 3 is usually sufficient for zstd since long distance
+// matching is enabled.
+CompressionLevel = 3;
+  } else if (llvm::compression::zlib::isAvailable()) {
+CompressionFormat = llvm::compression::Format::Zlib;
+// Use default level for zlib since higher level does not have significant
+// improvement.
+CompressionLevel = llvm::compression::zlib::DefaultCompression;
+  }
   auto IgnoreEnvVarOpt =
   llvm::sys::Process::GetEnv("OFFLOAD_BUNDLER_IGNORE_ENV_VAR");
   if (IgnoreEnvVarOpt.has_value() && IgnoreEnvVarOpt.value() == "1")
@@ -937,11 +948,41 @@ OffloadBundlerConfig::OffloadBundlerConfig() {
   llvm::sys::Process::GetEnv("OFFLOAD_BUNDLER_COMPRESS");
   if (CompressEnvVarOpt.has_value())
 

[clang] [llvm] [HIP] add --offload-compression-level= option (PR #83605)

2024-03-08 Thread Yaxun Liu via cfe-commits

https://github.com/yxsamliu updated 
https://github.com/llvm/llvm-project/pull/83605

From 906b23c5f8ef815b7727fe2bda852c33f0d9147b Mon Sep 17 00:00:00 2001
From: "Yaxun (Sam) Liu" 
Date: Fri, 1 Mar 2024 13:16:45 -0500
Subject: [PATCH] [HIP] add --offload-compression-level= option

Added --offload-compression-level= option to clang and -compression-level=
option to clang-offload-bundler and clang-linker-wrapper for
controlling compression level.

Added support of long distance matching (LDM) for llvm::zstd which is off
by default. Enable it for clang-offload-bundler by default since it
improves compression rate in general.

Change default compression level to 3 for zstd for clang-offload-bundler
since it works well for bundle entry size from 1KB to 32MB, which should
cover most of the clang-offload-bundler usage. Users can still specify
compression level by -compression-level= option if necessary.

Change-Id: I5e2d7abcaa11a2a37a5e798476e3f572bba11cab
---
 clang/include/clang/Driver/OffloadBundler.h   |   6 +-
 clang/include/clang/Driver/Options.td |   4 +
 clang/lib/Driver/OffloadBundler.cpp   | 111 ++
 clang/lib/Driver/ToolChains/Clang.cpp |  11 +-
 clang/lib/Driver/ToolChains/CommonArgs.cpp|  14 +++
 clang/lib/Driver/ToolChains/CommonArgs.h  |   2 +
 clang/lib/Driver/ToolChains/HIPUtility.cpp|   7 +-
 .../test/Driver/clang-offload-bundler-zlib.c  |  21 +++-
 .../test/Driver/clang-offload-bundler-zstd.c  |  19 ++-
 .../test/Driver/hip-offload-compress-zlib.hip |   7 +-
 .../test/Driver/hip-offload-compress-zstd.hip |   5 +-
 clang/test/Driver/linker-wrapper.c|   5 +-
 .../ClangLinkerWrapper.cpp|   5 +
 .../clang-linker-wrapper/LinkerWrapperOpts.td |   2 +
 .../ClangOffloadBundler.cpp   |   5 +
 llvm/include/llvm/Support/Compression.h   |   5 +-
 llvm/lib/Support/Compression.cpp  |  43 +--
 17 files changed, 208 insertions(+), 64 deletions(-)

diff --git a/clang/include/clang/Driver/OffloadBundler.h 
b/clang/include/clang/Driver/OffloadBundler.h
index 84349abe185fa4..65d33bfbd2825f 100644
--- a/clang/include/clang/Driver/OffloadBundler.h
+++ b/clang/include/clang/Driver/OffloadBundler.h
@@ -17,6 +17,7 @@
 #ifndef LLVM_CLANG_DRIVER_OFFLOADBUNDLER_H
 #define LLVM_CLANG_DRIVER_OFFLOADBUNDLER_H
 
+#include "llvm/Support/Compression.h"
 #include "llvm/Support/Error.h"
 #include "llvm/TargetParser/Triple.h"
 #include 
@@ -36,6 +37,8 @@ class OffloadBundlerConfig {
   bool HipOpenmpCompatible = false;
   bool Compress = false;
   bool Verbose = false;
+  llvm::compression::Format CompressionFormat;
+  int CompressionLevel;
 
   unsigned BundleAlignment = 1;
   unsigned HostInputIndex = ~0u;
@@ -116,7 +119,8 @@ class CompressedOffloadBundle {
 
 public:
   static llvm::Expected>
-  compress(const llvm::MemoryBuffer , bool Verbose = false);
+  compress(llvm::compression::Params P, const llvm::MemoryBuffer ,
+   bool Verbose = false);
   static llvm::Expected>
   decompress(const llvm::MemoryBuffer , bool Verbose = false);
 };
diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 5b3d366dbcf91b..2d26a7983f397b 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -1264,6 +1264,10 @@ def fno_gpu_sanitize : Flag<["-"], "fno-gpu-sanitize">, 
Group;
 def offload_compress : Flag<["--"], "offload-compress">,
   HelpText<"Compress offload device binaries (HIP only)">;
 def no_offload_compress : Flag<["--"], "no-offload-compress">;
+
+def offload_compression_level_EQ : Joined<["--"], 
"offload-compression-level=">,
+  Flags<[HelpHidden]>,
+  HelpText<"Compression level for offload device binaries (HIP only)">;
 }
 
 // CUDA options
diff --git a/clang/lib/Driver/OffloadBundler.cpp 
b/clang/lib/Driver/OffloadBundler.cpp
index f9eadfaec88dec..77c89356bc76bb 100644
--- a/clang/lib/Driver/OffloadBundler.cpp
+++ b/clang/lib/Driver/OffloadBundler.cpp
@@ -924,6 +924,17 @@ CreateFileHandler(MemoryBuffer ,
 }
 
 OffloadBundlerConfig::OffloadBundlerConfig() {
+  if (llvm::compression::zstd::isAvailable()) {
+CompressionFormat = llvm::compression::Format::Zstd;
+// Compression level 3 is usually sufficient for zstd since long distance
+// matching is enabled.
+CompressionLevel = 3;
+  } else if (llvm::compression::zlib::isAvailable()) {
+CompressionFormat = llvm::compression::Format::Zlib;
+// Use default level for zlib since higher level does not have significant
+// improvement.
+CompressionLevel = llvm::compression::zlib::DefaultCompression;
+  }
   auto IgnoreEnvVarOpt =
   llvm::sys::Process::GetEnv("OFFLOAD_BUNDLER_IGNORE_ENV_VAR");
   if (IgnoreEnvVarOpt.has_value() && IgnoreEnvVarOpt.value() == "1")
@@ -937,11 +948,41 @@ OffloadBundlerConfig::OffloadBundlerConfig() {
   llvm::sys::Process::GetEnv("OFFLOAD_BUNDLER_COMPRESS");
   if (CompressEnvVarOpt.has_value())
 

[clang] [llvm] [HIP] add --offload-compression-level= option (PR #83605)

2024-03-08 Thread Yaxun Liu via cfe-commits


@@ -2863,3 +2863,18 @@ void tools::addOutlineAtomicsArgs(const Driver , const 
ToolChain ,
 CmdArgs.push_back("+outline-atomics");
   }
 }
+
+void tools::addOffloadCompressArgs(const llvm::opt::ArgList ,
+   llvm::opt::ArgStringList ) {
+  if (TCArgs.hasFlag(options::OPT_offload_compress,
+ options::OPT_no_offload_compress, false))
+CmdArgs.push_back("-compress");
+  if (TCArgs.hasArg(options::OPT_v))
+CmdArgs.push_back("-verbose");
+  if (auto *Arg =
+  TCArgs.getLastArg(options::OPT_offload_compression_level_EQ)) {
+std::string CompressionLevelArg =
+std::string("-compression-level=") + Arg->getValue();
+CmdArgs.push_back(TCArgs.MakeArgString(CompressionLevelArg));

yxsamliu wrote:

will do

https://github.com/llvm/llvm-project/pull/83605


[clang] [llvm] [HIP] add --offload-compression-level= option (PR #83605)

2024-03-08 Thread Artem Belevich via cfe-commits


@@ -2863,3 +2863,18 @@ void tools::addOutlineAtomicsArgs(const Driver , const 
ToolChain ,
 CmdArgs.push_back("+outline-atomics");
   }
 }
+
+void tools::addOffloadCompressArgs(const llvm::opt::ArgList ,
+   llvm::opt::ArgStringList ) {
+  if (TCArgs.hasFlag(options::OPT_offload_compress,
+ options::OPT_no_offload_compress, false))
+CmdArgs.push_back("-compress");
+  if (TCArgs.hasArg(options::OPT_v))
+CmdArgs.push_back("-verbose");
+  if (auto *Arg =
+  TCArgs.getLastArg(options::OPT_offload_compression_level_EQ)) {
+std::string CompressionLevelArg =
+std::string("-compression-level=") + Arg->getValue();
+CmdArgs.push_back(TCArgs.MakeArgString(CompressionLevelArg));

Artem-B wrote:

This may be collapsed to just
```
CmdArgs.push_back(TCArgs.MakeArgString("-compression-level=" + Arg->getValue()))
```
Maybe with a `Twine` or `StringRef` wrapping the string literal.
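
The reason a wrapper is needed can be sketched without LLVM: adding a string literal to a `const char *` is pointer arithmetic on two pointers and does not compile, so one operand has to be a string-like type first. The sketch below uses `std::string` as a stand-in for `Twine`/`StringRef`; `makeCompressionLevelArg` is a hypothetical helper, not part of the patch.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not in the patch): builds the bundler flag from the
// value of --offload-compression-level=. Value plays the role of
// Arg->getValue(), which returns a const char *.
std::string makeCompressionLevelArg(const char *Value) {
  assert(Value != nullptr && "option value must not be null");
  // "-compression-level=" + Value would try to add two pointers and fail to
  // compile; converting the literal to std::string selects string operator+.
  return std::string("-compression-level=") + Value;
}
```

For example, `makeCompressionLevelArg("9")` yields `-compression-level=9`, which is the string the driver would forward to clang-offload-bundler.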

https://github.com/llvm/llvm-project/pull/83605


[clang] [llvm] [HIP] add --offload-compression-level= option (PR #83605)

2024-03-08 Thread Artem Belevich via cfe-commits

https://github.com/Artem-B approved this pull request.

LGTM.

https://github.com/llvm/llvm-project/pull/83605


[clang] [llvm] [HIP] add --offload-compression-level= option (PR #83605)

2024-03-08 Thread Artem Belevich via cfe-commits

https://github.com/Artem-B edited 
https://github.com/llvm/llvm-project/pull/83605


[clang] [llvm] [HIP] add --offload-compression-level= option (PR #83605)

2024-03-08 Thread Yaxun Liu via cfe-commits

https://github.com/yxsamliu updated 
https://github.com/llvm/llvm-project/pull/83605

From 78ad578a19d2a3585f20ab64d364a46a584ec035 Mon Sep 17 00:00:00 2001
From: "Yaxun (Sam) Liu" 
Date: Fri, 1 Mar 2024 13:16:45 -0500
Subject: [PATCH] [HIP] add --offload-compression-level= option

Added --offload-compression-level= option to clang and -compression-level=
option to clang-offload-bundler and clang-linker-wrapper for
controlling compression level.

Added support of long distance matching (LDM) for llvm::zstd which is off
by default. Enable it for clang-offload-bundler by default since it
improves compression rate in general.

Change default compression level to 3 for zstd for clang-offload-bundler
since it works well for bundle entry size from 1KB to 32MB, which should
cover most of the clang-offload-bundler usage. Users can still specify
compression level by -compression-level= option if necessary.

Change-Id: I5e2d7abcaa11a2a37a5e798476e3f572bba11cab
---
 clang/include/clang/Driver/OffloadBundler.h   |   6 +-
 clang/include/clang/Driver/Options.td |   4 +
 clang/lib/Driver/OffloadBundler.cpp   | 111 ++
 clang/lib/Driver/ToolChains/Clang.cpp |  11 +-
 clang/lib/Driver/ToolChains/CommonArgs.cpp|  15 +++
 clang/lib/Driver/ToolChains/CommonArgs.h  |   2 +
 clang/lib/Driver/ToolChains/HIPUtility.cpp|   7 +-
 .../test/Driver/clang-offload-bundler-zlib.c  |  21 +++-
 .../test/Driver/clang-offload-bundler-zstd.c  |  19 ++-
 .../test/Driver/hip-offload-compress-zlib.hip |   7 +-
 .../test/Driver/hip-offload-compress-zstd.hip |   5 +-
 clang/test/Driver/linker-wrapper.c|   5 +-
 .../ClangLinkerWrapper.cpp|   5 +
 .../clang-linker-wrapper/LinkerWrapperOpts.td |   2 +
 .../ClangOffloadBundler.cpp   |   5 +
 llvm/include/llvm/Support/Compression.h   |   5 +-
 llvm/lib/Support/Compression.cpp  |  40 +--
 17 files changed, 207 insertions(+), 63 deletions(-)

diff --git a/clang/include/clang/Driver/OffloadBundler.h 
b/clang/include/clang/Driver/OffloadBundler.h
index 84349abe185fa4..65d33bfbd2825f 100644
--- a/clang/include/clang/Driver/OffloadBundler.h
+++ b/clang/include/clang/Driver/OffloadBundler.h
@@ -17,6 +17,7 @@
 #ifndef LLVM_CLANG_DRIVER_OFFLOADBUNDLER_H
 #define LLVM_CLANG_DRIVER_OFFLOADBUNDLER_H
 
+#include "llvm/Support/Compression.h"
 #include "llvm/Support/Error.h"
 #include "llvm/TargetParser/Triple.h"
 #include 
@@ -36,6 +37,8 @@ class OffloadBundlerConfig {
   bool HipOpenmpCompatible = false;
   bool Compress = false;
   bool Verbose = false;
+  llvm::compression::Format CompressionFormat;
+  int CompressionLevel;
 
   unsigned BundleAlignment = 1;
   unsigned HostInputIndex = ~0u;
@@ -116,7 +119,8 @@ class CompressedOffloadBundle {
 
 public:
   static llvm::Expected>
-  compress(const llvm::MemoryBuffer , bool Verbose = false);
+  compress(llvm::compression::Params P, const llvm::MemoryBuffer ,
+   bool Verbose = false);
   static llvm::Expected>
   decompress(const llvm::MemoryBuffer , bool Verbose = false);
 };
diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 5b3d366dbcf91b..2d26a7983f397b 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -1264,6 +1264,10 @@ def fno_gpu_sanitize : Flag<["-"], "fno-gpu-sanitize">, 
Group;
 def offload_compress : Flag<["--"], "offload-compress">,
   HelpText<"Compress offload device binaries (HIP only)">;
 def no_offload_compress : Flag<["--"], "no-offload-compress">;
+
+def offload_compression_level_EQ : Joined<["--"], 
"offload-compression-level=">,
+  Flags<[HelpHidden]>,
+  HelpText<"Compression level for offload device binaries (HIP only)">;
 }
 
 // CUDA options
diff --git a/clang/lib/Driver/OffloadBundler.cpp 
b/clang/lib/Driver/OffloadBundler.cpp
index f9eadfaec88dec..77c89356bc76bb 100644
--- a/clang/lib/Driver/OffloadBundler.cpp
+++ b/clang/lib/Driver/OffloadBundler.cpp
@@ -924,6 +924,17 @@ CreateFileHandler(MemoryBuffer ,
 }
 
 OffloadBundlerConfig::OffloadBundlerConfig() {
+  if (llvm::compression::zstd::isAvailable()) {
+CompressionFormat = llvm::compression::Format::Zstd;
+// Compression level 3 is usually sufficient for zstd since long distance
+// matching is enabled.
+CompressionLevel = 3;
+  } else if (llvm::compression::zlib::isAvailable()) {
+CompressionFormat = llvm::compression::Format::Zlib;
+// Use default level for zlib since higher level does not have significant
+// improvement.
+CompressionLevel = llvm::compression::zlib::DefaultCompression;
+  }
   auto IgnoreEnvVarOpt =
   llvm::sys::Process::GetEnv("OFFLOAD_BUNDLER_IGNORE_ENV_VAR");
   if (IgnoreEnvVarOpt.has_value() && IgnoreEnvVarOpt.value() == "1")
@@ -937,11 +948,41 @@ OffloadBundlerConfig::OffloadBundlerConfig() {
   llvm::sys::Process::GetEnv("OFFLOAD_BUNDLER_COMPRESS");
   if (CompressEnvVarOpt.has_value())
 

[clang] [llvm] [HIP] add --offload-compression-level= option (PR #83605)

2024-03-08 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 commented:

Looks fine to me. I'll wait a bit to see if Artem or Fangrui have anything to
add.

https://github.com/llvm/llvm-project/pull/83605


[clang] [llvm] [HIP] add --offload-compression-level= option (PR #83605)

2024-03-08 Thread Yaxun Liu via cfe-commits

yxsamliu wrote:

> Should an option like in #84337 be added for the new driver?

Added the option to the linker wrapper.

https://github.com/llvm/llvm-project/pull/83605


[clang] [llvm] [HIP] add --offload-compression-level= option (PR #83605)

2024-03-08 Thread Yaxun Liu via cfe-commits

https://github.com/yxsamliu updated 
https://github.com/llvm/llvm-project/pull/83605

From 60faf7f657fdcc00edfa0a1813d1e2746c341ef1 Mon Sep 17 00:00:00 2001
From: "Yaxun (Sam) Liu" 
Date: Fri, 1 Mar 2024 13:16:45 -0500
Subject: [PATCH] [HIP] add --offload-compression-level= option

Added --offload-compression-level= option to clang and -compression-level=
option to clang-offload-bundler and clang-linker-wrapper for
controlling compression level.

Added support of long distance matching (LDM) for llvm::zstd which is off
by default. Enable it for clang-offload-bundler by default since it
improves compression rate in general.

Change default compression level to 3 for zstd for clang-offload-bundler
since it works well for bundle entry size from 1KB to 32MB, which should
cover most of the clang-offload-bundler usage. Users can still specify
compression level by -compression-level= option if necessary.

Change-Id: I5e2d7abcaa11a2a37a5e798476e3f572bba11cab
---
 clang/include/clang/Driver/OffloadBundler.h   |   6 +-
 clang/include/clang/Driver/Options.td |   4 +
 clang/lib/Driver/OffloadBundler.cpp   | 113 ++
 clang/lib/Driver/ToolChains/Clang.cpp |  11 +-
 clang/lib/Driver/ToolChains/CommonArgs.cpp|  15 +++
 clang/lib/Driver/ToolChains/CommonArgs.h  |   2 +
 clang/lib/Driver/ToolChains/HIPUtility.cpp|   7 +-
 .../test/Driver/clang-offload-bundler-zlib.c  |  21 +++-
 .../test/Driver/clang-offload-bundler-zstd.c  |  19 ++-
 .../test/Driver/hip-offload-compress-zlib.hip |   7 +-
 .../test/Driver/hip-offload-compress-zstd.hip |   5 +-
 clang/test/Driver/linker-wrapper.c|   5 +-
 .../ClangLinkerWrapper.cpp|   5 +
 .../clang-linker-wrapper/LinkerWrapperOpts.td |   2 +
 .../ClangOffloadBundler.cpp   |   5 +
 llvm/include/llvm/Support/Compression.h   |   5 +-
 llvm/lib/Support/Compression.cpp  |  40 +--
 17 files changed, 209 insertions(+), 63 deletions(-)

diff --git a/clang/include/clang/Driver/OffloadBundler.h 
b/clang/include/clang/Driver/OffloadBundler.h
index 84349abe185fa4..65d33bfbd2825f 100644
--- a/clang/include/clang/Driver/OffloadBundler.h
+++ b/clang/include/clang/Driver/OffloadBundler.h
@@ -17,6 +17,7 @@
 #ifndef LLVM_CLANG_DRIVER_OFFLOADBUNDLER_H
 #define LLVM_CLANG_DRIVER_OFFLOADBUNDLER_H
 
+#include "llvm/Support/Compression.h"
 #include "llvm/Support/Error.h"
 #include "llvm/TargetParser/Triple.h"
 #include 
@@ -36,6 +37,8 @@ class OffloadBundlerConfig {
   bool HipOpenmpCompatible = false;
   bool Compress = false;
   bool Verbose = false;
+  llvm::compression::Format CompressionFormat;
+  int CompressionLevel;
 
   unsigned BundleAlignment = 1;
   unsigned HostInputIndex = ~0u;
@@ -116,7 +119,8 @@ class CompressedOffloadBundle {
 
 public:
   static llvm::Expected>
-  compress(const llvm::MemoryBuffer , bool Verbose = false);
+  compress(llvm::compression::Params P, const llvm::MemoryBuffer ,
+   bool Verbose = false);
   static llvm::Expected>
   decompress(const llvm::MemoryBuffer , bool Verbose = false);
 };
diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 5b3d366dbcf91b..2d26a7983f397b 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -1264,6 +1264,10 @@ def fno_gpu_sanitize : Flag<["-"], "fno-gpu-sanitize">, 
Group;
 def offload_compress : Flag<["--"], "offload-compress">,
   HelpText<"Compress offload device binaries (HIP only)">;
 def no_offload_compress : Flag<["--"], "no-offload-compress">;
+
+def offload_compression_level_EQ : Joined<["--"], 
"offload-compression-level=">,
+  Flags<[HelpHidden]>,
+  HelpText<"Compression level for offload device binaries (HIP only)">;
 }
 
 // CUDA options
diff --git a/clang/lib/Driver/OffloadBundler.cpp 
b/clang/lib/Driver/OffloadBundler.cpp
index f9eadfaec88dec..a077a9648b0e9b 100644
--- a/clang/lib/Driver/OffloadBundler.cpp
+++ b/clang/lib/Driver/OffloadBundler.cpp
@@ -924,6 +924,17 @@ CreateFileHandler(MemoryBuffer ,
 }
 
 OffloadBundlerConfig::OffloadBundlerConfig() {
+  if (llvm::compression::zstd::isAvailable()) {
+CompressionFormat = llvm::compression::Format::Zstd;
+// Compression level 3 is usually sufficient for zstd since long distance
+// matching is enabled.
+CompressionLevel = 3;
+  } else if (llvm::compression::zlib::isAvailable()) {
+CompressionFormat = llvm::compression::Format::Zlib;
+// Use default level for zlib since higher level does not have significant
+// improvement.
+CompressionLevel = llvm::compression::zlib::DefaultCompression;
+  }
   auto IgnoreEnvVarOpt =
   llvm::sys::Process::GetEnv("OFFLOAD_BUNDLER_IGNORE_ENV_VAR");
   if (IgnoreEnvVarOpt.has_value() && IgnoreEnvVarOpt.value() == "1")
@@ -937,11 +948,41 @@ OffloadBundlerConfig::OffloadBundlerConfig() {
   llvm::sys::Process::GetEnv("OFFLOAD_BUNDLER_COMPRESS");
   if (CompressEnvVarOpt.has_value())
 

[clang] [llvm] [HIP] add --offload-compression-level= option (PR #83605)

2024-03-08 Thread Yaxun Liu via cfe-commits

yxsamliu wrote:

> > Should an option like in #84337 be added for the new driver?
> 
> Yes please

Oh. I can add it

https://github.com/llvm/llvm-project/pull/83605


[clang] [llvm] [HIP] add --offload-compression-level= option (PR #83605)

2024-03-08 Thread Yaxun Liu via cfe-commits

yxsamliu wrote:

> Should an option like in #84337 be added for the new driver?

Yes please

https://github.com/llvm/llvm-project/pull/83605


[clang] [llvm] [HIP] add --offload-compression-level= option (PR #83605)

2024-03-08 Thread Yaxun Liu via cfe-commits

https://github.com/yxsamliu edited 
https://github.com/llvm/llvm-project/pull/83605


[clang] [llvm] [HIP] add --offload-compression-level= option (PR #83605)

2024-03-08 Thread Joseph Huber via cfe-commits

jhuber6 wrote:

Should an option like in https://github.com/llvm/llvm-project/pull/84337 be 
added for the new driver?

https://github.com/llvm/llvm-project/pull/83605


[clang] [llvm] [HIP] add --offload-compression-level= option (PR #83605)

2024-03-08 Thread Yaxun Liu via cfe-commits

yxsamliu wrote:

The zstd developers suggest enabling long distance matching (LDM), i.e. the
`--long` option. I updated the PR with the change and tested that it works
well for bundle entry sizes ranging from 1KB to 20MB, for both compression
ratio and compression/decompression speed.

https://github.com/llvm/llvm-project/pull/83605


[clang] [llvm] [HIP] add --offload-compression-level= option (PR #83605)

2024-03-08 Thread Yaxun Liu via cfe-commits

https://github.com/yxsamliu updated 
https://github.com/llvm/llvm-project/pull/83605

From 16796bc8eb3b32436903db4b689d4cb9cfc348d8 Mon Sep 17 00:00:00 2001
From: "Yaxun (Sam) Liu" 
Date: Fri, 1 Mar 2024 13:16:45 -0500
Subject: [PATCH] [HIP] add --offload-compression-level= option

Added --offload-compression-level= option to clang and -compression-level=
option to clang-offload-bundler for controlling compression level.

Added support of long distance matching (LDM) for llvm::zstd which is off
by default. Enable it for clang-offload-bundler by default since it
improves compression rate in general.

Change default compression level to 3 for zstd for clang-offload-bundler
since it works well for bundle entry size from 1KB to 32MB, which should
cover most of the clang-offload-bundler usage. Users can still specify
compression level by -compression-level= option if necessary.
---
 clang/include/clang/Driver/OffloadBundler.h   |   6 +-
 clang/include/clang/Driver/Options.td |   4 +
 clang/lib/Driver/OffloadBundler.cpp   | 113 ++
 clang/lib/Driver/ToolChains/Clang.cpp |  20 +++-
 clang/lib/Driver/ToolChains/Clang.h   |   2 +
 clang/lib/Driver/ToolChains/HIPUtility.cpp|   7 +-
 .../test/Driver/clang-offload-bundler-zlib.c  |  21 +++-
 .../test/Driver/clang-offload-bundler-zstd.c  |  19 ++-
 .../test/Driver/hip-offload-compress-zlib.hip |   7 +-
 .../test/Driver/hip-offload-compress-zstd.hip |   5 +-
 .../ClangOffloadBundler.cpp   |   5 +
 llvm/include/llvm/Support/Compression.h   |   5 +-
 llvm/lib/Support/Compression.cpp  |  40 +--
 13 files changed, 197 insertions(+), 57 deletions(-)

diff --git a/clang/include/clang/Driver/OffloadBundler.h 
b/clang/include/clang/Driver/OffloadBundler.h
index 84349abe185fa4..65d33bfbd2825f 100644
--- a/clang/include/clang/Driver/OffloadBundler.h
+++ b/clang/include/clang/Driver/OffloadBundler.h
@@ -17,6 +17,7 @@
 #ifndef LLVM_CLANG_DRIVER_OFFLOADBUNDLER_H
 #define LLVM_CLANG_DRIVER_OFFLOADBUNDLER_H
 
+#include "llvm/Support/Compression.h"
 #include "llvm/Support/Error.h"
 #include "llvm/TargetParser/Triple.h"
 #include 
@@ -36,6 +37,8 @@ class OffloadBundlerConfig {
   bool HipOpenmpCompatible = false;
   bool Compress = false;
   bool Verbose = false;
+  llvm::compression::Format CompressionFormat;
+  int CompressionLevel;
 
   unsigned BundleAlignment = 1;
   unsigned HostInputIndex = ~0u;
@@ -116,7 +119,8 @@ class CompressedOffloadBundle {
 
 public:
   static llvm::Expected>
-  compress(const llvm::MemoryBuffer , bool Verbose = false);
+  compress(llvm::compression::Params P, const llvm::MemoryBuffer ,
+   bool Verbose = false);
   static llvm::Expected>
   decompress(const llvm::MemoryBuffer , bool Verbose = false);
 };
diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 5b3d366dbcf91b..2d26a7983f397b 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -1264,6 +1264,10 @@ def fno_gpu_sanitize : Flag<["-"], "fno-gpu-sanitize">, 
Group;
 def offload_compress : Flag<["--"], "offload-compress">,
   HelpText<"Compress offload device binaries (HIP only)">;
 def no_offload_compress : Flag<["--"], "no-offload-compress">;
+
+def offload_compression_level_EQ : Joined<["--"], 
"offload-compression-level=">,
+  Flags<[HelpHidden]>,
+  HelpText<"Compression level for offload device binaries (HIP only)">;
 }
 
 // CUDA options
diff --git a/clang/lib/Driver/OffloadBundler.cpp 
b/clang/lib/Driver/OffloadBundler.cpp
index f9eadfaec88dec..a077a9648b0e9b 100644
--- a/clang/lib/Driver/OffloadBundler.cpp
+++ b/clang/lib/Driver/OffloadBundler.cpp
@@ -924,6 +924,17 @@ CreateFileHandler(MemoryBuffer ,
 }
 
 OffloadBundlerConfig::OffloadBundlerConfig() {
+  if (llvm::compression::zstd::isAvailable()) {
+CompressionFormat = llvm::compression::Format::Zstd;
+// Compression level 3 is usually sufficient for zstd since long distance
+// matching is enabled.
+CompressionLevel = 3;
+  } else if (llvm::compression::zlib::isAvailable()) {
+CompressionFormat = llvm::compression::Format::Zlib;
+// Use default level for zlib since higher level does not have significant
+// improvement.
+CompressionLevel = llvm::compression::zlib::DefaultCompression;
+  }
   auto IgnoreEnvVarOpt =
   llvm::sys::Process::GetEnv("OFFLOAD_BUNDLER_IGNORE_ENV_VAR");
   if (IgnoreEnvVarOpt.has_value() && IgnoreEnvVarOpt.value() == "1")
@@ -937,11 +948,41 @@ OffloadBundlerConfig::OffloadBundlerConfig() {
   llvm::sys::Process::GetEnv("OFFLOAD_BUNDLER_COMPRESS");
   if (CompressEnvVarOpt.has_value())
 Compress = CompressEnvVarOpt.value() == "1";
+
+  auto CompressionLevelEnvVarOpt =
+  llvm::sys::Process::GetEnv("OFFLOAD_BUNDLER_COMPRESSION_LEVEL");
+  if (CompressionLevelEnvVarOpt.has_value()) {
+llvm::StringRef CompressionLevelStr = CompressionLevelEnvVarOpt.value();
+int Level;
+if 
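
The hunk above is cut off in the archive right where the `OFFLOAD_BUNDLER_COMPRESSION_LEVEL` value is parsed. As a rough stand-alone sketch of that kind of override, assuming an invalid value simply leaves the default in place (the patch's actual handling is not visible here), with `parseCompressionLevel` and `levelFromEnv` as hypothetical helpers:

```cpp
#include <cassert>
#include <cerrno>
#include <climits>
#include <cstdlib>
#include <string>

// Hypothetical helper: parses a decimal integer compression level.
// Returns true and sets Level on success; leaves Level untouched otherwise.
bool parseCompressionLevel(const std::string &Text, int &Level) {
  errno = 0;
  char *End = nullptr;
  long Value = std::strtol(Text.c_str(), &End, 10);
  // Reject empty input, trailing garbage, and values that do not fit in int.
  if (End == Text.c_str() || *End != '\0' || errno == ERANGE ||
      Value < INT_MIN || Value > INT_MAX)
    return false;
  Level = static_cast<int>(Value);
  return true;
}

// Returns the level from the environment variable named in the patch, or
// Default when the variable is unset or not a valid integer (assumed policy).
int levelFromEnv(int Default) {
  int Level = Default;
  if (const char *Env = std::getenv("OFFLOAD_BUNDLER_COMPRESSION_LEVEL"))
    parseCompressionLevel(Env, Level);
  return Level;
}
```

A caller would use `levelFromEnv(3)` to start from the zstd default and let the environment variable override it.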

[clang] [llvm] [HIP] add --offload-compression-level= option (PR #83605)

2024-03-07 Thread Yaxun Liu via cfe-commits


@@ -906,6 +906,16 @@ CreateFileHandler(MemoryBuffer ,
 }
 
 OffloadBundlerConfig::OffloadBundlerConfig() {
+  if (llvm::compression::zstd::isAvailable()) {
+CompressionFormat = llvm::compression::Format::Zstd;
+// Use a high zstd compress level by default for better size reduction.

yxsamliu wrote:

> Also, I've just discovered that zstd already has `--long` option: 
> https://github.com/facebook/zstd/blob/b293d2ebc3a5d29309390a70b3e7861b6f5133ec/lib/zstd.h#L394
> 
> ```
> ZSTD_c_enableLongDistanceMatching=160, /* Enable long distance matching.
>  * This parameter is designed to improve compression ratio
>  * for large inputs, by finding large matches at long distance.
>  * It increases memory usage and window size.
>  * Note: enabling this parameter increases default ZSTD_c_windowLog to 128 MB
>  * except when expressly set to a different value.
>  * Note: will be enabled by default if ZSTD_c_windowLog >= 128 MB and
>  * compression strategy >= ZSTD_btopt (== compression level 16+) */
> ```
> 
> This sounds like something we could use here.

Thanks, this option is promising. Here are some benchmark results for a fat
binary containing 13 code objects, each about 2.7MB.

The following data are without `--long`. The columns are compression level,
original size -> compressed size (compression ratio), compression speed, and
decompression speed.
```
$ zstd -b1 -e22 -f --ultra tmp.o
 1#tmp.o :  34864866 ->   9169246 (3.802), 657.0 MB/s ,1691.0 MB/s 
 2#tmp.o :  34864866 ->   7352667 (4.742), 626.3 MB/s ,1903.8 MB/s 
 3#tmp.o :  34864866 ->   6885718 (5.063), 488.1 MB/s ,1900.2 MB/s 
 4#tmp.o :  34864866 ->   6700508 (5.203), 416.7 MB/s ,1897.2 MB/s 
 5#tmp.o :  34864866 ->   6405252 (5.443), 236.4 MB/s ,1918.8 MB/s 
 6#tmp.o :  34864866 ->   6336706 (5.502), 211.8 MB/s ,1941.4 MB/s 
 7#tmp.o :  34864866 ->   6170409 (5.650), 153.5 MB/s ,2032.5 MB/s 
 8#tmp.o :  34864866 ->   6121226 (5.696), 131.1 MB/s ,2071.5 MB/s 
 9#tmp.o :  34864866 ->   6098948 (5.717), 124.9 MB/s ,2080.4 MB/s 
10#tmp.o :  34864866 ->   299 (13.64), 179.4 MB/s ,3504.2 MB/s 
11#tmp.o :  34864866 ->   2545375 (13.70), 119.4 MB/s ,3516.8 MB/s 
12#tmp.o :  34864866 ->   2542711 (13.71), 107.2 MB/s ,3518.4 MB/s 
13#tmp.o :  34864866 ->   2601619 (13.40),  58.4 MB/s ,3507.6 MB/s 
14#tmp.o :  34864866 ->   2590656 (13.46),  46.2 MB/s ,3520.4 MB/s 
15#tmp.o :  34864866 ->   2518599 (13.84),  28.4 MB/s ,3557.4 MB/s 
16#tmp.o :  34864866 ->   2527122 (13.80),  20.8 MB/s ,3348.5 MB/s 
17#tmp.o :  34864866 ->   2277125 (15.31),  19.0 MB/s ,3370.6 MB/s 
18#tmp.o :  34864866 ->   2138918 (16.30),  15.0 MB/s ,3182.2 MB/s 
19#tmp.o :  34864866 ->   2118238 (16.46),  8.82 MB/s ,3194.5 MB/s 
20#tmp.o :  34864866 ->   2041007 (17.08),  8.31 MB/s ,3178.4 MB/s 
21#tmp.o :  34864866 ->   2039075 (17.10),  5.21 MB/s ,3170.6 MB/s 
22#tmp.o :  34864866 ->   2038568 (17.10),  3.60 MB/s ,3171.5 MB/s 
```
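As a sanity check on the ratio column, the number in parentheses is just the original size divided by the compressed size. A minimal sketch (the helper name `compressionRatio` is made up for illustration and is not part of zstd or of this patch):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: the ratio column reported by `zstd -b` is
// original size / compressed size.
double compressionRatio(std::uint64_t originalBytes,
                        std::uint64_t compressedBytes) {
  assert(compressedBytes > 0 && "compressed size must be non-zero");
  return static_cast<double>(originalBytes) /
         static_cast<double>(compressedBytes);
}
```

For the level 3 row, `compressionRatio(34864866, 6885718)` comes out at roughly 5.06, which agrees with the 5.063 shown in the table.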
The following data are with `--long`:

```
$ zstd --long -b1 -e22 -f --ultra tmp.o
 1#tmp.o :  34864866 ->   3281430 (10.62), 375.0 MB/s ,3531.9 MB/s 
 2#tmp.o :  34864866 ->   2854143 (12.22), 360.6 MB/s ,3536.7 MB/s 
 3#tmp.o :  34864866 ->   2648807 (13.16), 325.4 MB/s ,3462.7 MB/s 
 4#tmp.o :  34864866 ->   2548618 (13.68), 309.6 MB/s ,3345.9 MB/s 
 5#tmp.o :  34864866 ->   2540406 (13.72), 265.8 MB/s ,3297.8 MB/s 
 6#tmp.o :  34864866 ->   2518788 (13.84), 251.9 MB/s ,3296.0 MB/s 
 7#tmp.o :  34864866 ->   2451360 (14.22), 206.5 MB/s ,3446.9 MB/s 
 8#tmp.o :  34864866 ->   2421083 (14.40), 186.5 MB/s ,3522.7 MB/s 
 9#tmp.o :  34864866 ->   2406717 (14.49), 172.0 MB/s ,3472.2 MB/s 
10#tmp.o :  34864866 ->   2392819 (14.57), 139.6 MB/s ,3439.4 MB/s 
11#tmp.o :  34864866 ->   2386599 (14.61), 113.0 MB/s ,3415.2 MB/s 
12#tmp.o :  34864866 ->   2385088 (14.62), 104.5 MB/s ,3430.0 MB/s 
13#tmp.o :  34864866 ->   2389264 (14.59),  69.5 MB/s ,3422.9 MB/s 
14#tmp.o :  34864866 ->   2382705 (14.63),  61.2 MB/s ,3428.6 MB/s 
15#tmp.o :  34864866 ->   2372640 (14.69),  51.2 MB/s ,3446.7 MB/s 
16#tmp.o :  34864866 ->   2209022 (15.78),  20.5 MB/s ,3483.3 MB/s 
17#tmp.o :  34864866 ->   2168474 (16.08),  18.2 MB/s ,3381.5 MB/s 
18#tmp.o :  34864866 ->   2065724 (16.88),  14.2 MB/s 

[clang] [llvm] [HIP] add --offload-compression-level= option (PR #83605)

2024-03-07 Thread Yaxun Liu via cfe-commits

yxsamliu wrote:

> It may be worth asking on https://github.com/facebook/zstd/ . I am sure zstd 
> maintainers are happy to see more adoption:)

Posted a question to zstd https://github.com/facebook/zstd/issues/3932

https://github.com/llvm/llvm-project/pull/83605


[clang] [llvm] [HIP] add --offload-compression-level= option (PR #83605)

2024-03-07 Thread Yaxun Liu via cfe-commits

yxsamliu wrote:

Here is the size distribution of individual code object files. Each code object
file is for one GPU arch, and a fat binary contains a bunch of code object
files, so the optimal compression parameter is mostly related to code object
file size.

| Bin Size | Count | Percentage | Cumulative Percentage | Example File |
|---|---|---|---|---|
| 0-16K | 961 | 12.31% | 12.31% | `librocrand.so#offset=27172864=0` |
| 16K-32K | 602 | 7.71% | 20.03% | `librocalution_hip.so#offset=35438592=27264` |
| 32K-64K | 1463 | 18.75% | 38.77% | `librocalution_hip.so#offset=32391168=37808` |
| 64K-128K | 1134 | 14.53% | 53.31% | `libMIOpen.so#offset=566800384=98984` |
| 128K-256K | 897 | 11.49% | 64.80% | `libMIOpen.so#offset=562827264=141624` |
| 256K-512K | 977 | 12.52% | 77.32% | `libMIOpen.so#offset=659791872=504120` |
| 512K-1M | 482 | 6.18% | 83.50% | `libMIOpen.so#offset=567713792=545032` |
| 1M-2M | 443 | 5.68% | 89.17% | `libMIOpen.so#offset=569909248=1134632` |
| 2M-4M | 412 | 5.28% | 94.45% | `librocrand.so#offset=27172864=2650696` |
| 4M-8M | 251 | 3.22% | 97.67% | `librocblas.so#offset=1671168=5344160` |
| 8M-16M | 136 | 1.74% | 99.41% | `librocblas.so#offset=389632000=15117200` |
| 16M-32M | 41 | 0.53% | 99.94% | `librccl.so#offset=135168=20252464` |
| 32M-64M | 1 | 0.01% | 99.95% | `TensileLibrary_Type_HH_HPA_Contraction_l_Alik_Bljk_Cijk_Dijk_gfx90a.co` |
| 64M-128M | 4 | 0.05% | 100.00% | `TensileLibrary_Type_HH_HPA_ExperimentalGrid_Contraction_l_Ailk_Bjlk_Cijk_Dijk_CU104_gfx90a.co` |

From the table we can see that 99.9% of code object files are below 32MB. Also,
all code object files are below 128MB.
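
The cumulative column can be reproduced from the per-bin counts alone. The following is only a sketch: the names `cumulativeShare` and `kCodeObjectCounts` are invented for illustration, and the counts are copied from the table above, smallest bin first.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: fraction of all files that fall into the first
// `numBins` bins of a histogram given as per-bin counts.
double cumulativeShare(const std::vector<unsigned> &counts,
                       std::size_t numBins) {
  assert(numBins <= counts.size() && "numBins exceeds number of bins");
  unsigned long long total = 0, covered = 0;
  for (std::size_t i = 0; i < counts.size(); ++i) {
    total += counts[i];
    if (i < numBins)
      covered += counts[i];
  }
  assert(total > 0 && "histogram must not be empty");
  return static_cast<double>(covered) / static_cast<double>(total);
}

// Per-bin counts from the table above: 0-16K, 16K-32K, ..., 64M-128M.
const std::vector<unsigned> kCodeObjectCounts = {
    961, 602, 1463, 1134, 897, 977, 482, 443, 412, 251, 136, 41, 1, 4};
```

The first 12 bins are the ones below 32M, so `cumulativeShare(kCodeObjectCounts, 12)` is 7799 out of 7804 files, i.e. slightly above 0.999.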


https://github.com/llvm/llvm-project/pull/83605


[clang] [llvm] [HIP] add --offload-compression-level= option (PR #83605)

2024-03-04 Thread Artem Belevich via cfe-commits

https://github.com/Artem-B edited 
https://github.com/llvm/llvm-project/pull/83605


[clang] [llvm] [HIP] add --offload-compression-level= option (PR #83605)

2024-03-04 Thread Artem Belevich via cfe-commits


@@ -906,6 +906,16 @@ CreateFileHandler(MemoryBuffer ,
 }
 
 OffloadBundlerConfig::OffloadBundlerConfig() {
+  if (llvm::compression::zstd::isAvailable()) {
+CompressionFormat = llvm::compression::Format::Zstd;
+// Use a high zstd compress level by default for better size reduction.

Artem-B wrote:

Also, I've just discovered that zstd already has a `--long` option:
https://github.com/facebook/zstd/blob/b293d2ebc3a5d29309390a70b3e7861b6f5133ec/lib/zstd.h#L394

```
ZSTD_c_enableLongDistanceMatching=160, /* Enable long distance matching.
 * This parameter is designed to improve compression ratio
 * for large inputs, by finding large matches at long distance.
 * It increases memory usage and window size.
 * Note: enabling this parameter increases default ZSTD_c_windowLog to 128 MB
 * except when expressly set to a different value.
 * Note: will be enabled by default if ZSTD_c_windowLog >= 128 MB and
 * compression strategy >= ZSTD_btopt (== compression level 16+) */
```

This sounds like something we could use here.

https://github.com/llvm/llvm-project/pull/83605


[clang] [llvm] [HIP] add --offload-compression-level= option (PR #83605)

2024-03-04 Thread Fangrui Song via cfe-commits

MaskRay wrote:

It may be worth asking on https://github.com/facebook/zstd/ . I am sure zstd 
maintainers are happy to see more adoption:)

https://github.com/llvm/llvm-project/pull/83605


[clang] [llvm] [HIP] add --offload-compression-level= option (PR #83605)

2024-03-04 Thread David Blaikie via cfe-commits

dwblaikie wrote:

> level 20 is a sweet spot for both compression rate and compression time

I wonder how much this is overfitting for kernels of a particular size, though? 
(is it making the window just large enough that there's some "memory" from one 
kernel to the next - but a slightly larger kernel would cause it to fail to see 
the similarities of two or more kernels - and equally, would a lower level have 
a smaller window which would be adequate for smaller kernels?)

If level does imply window size and that does imply the size of the kernels 
that can be efficiently compressed across multiple similar copies - it might be 
interesting to know what range of kernel sizes fit in the new default, and if 
that size is representative of the majority of kernels.

https://github.com/llvm/llvm-project/pull/83605


[clang] [llvm] [HIP] add --offload-compression-level= option (PR #83605)

2024-03-04 Thread Artem Belevich via cfe-commits


@@ -906,6 +906,16 @@ CreateFileHandler(MemoryBuffer ,
 }
 
 OffloadBundlerConfig::OffloadBundlerConfig() {
+  if (llvm::compression::zstd::isAvailable()) {
+CompressionFormat = llvm::compression::Format::Zstd;
+// Use a high zstd compress level by default for better size reduction.

Artem-B wrote:

I'd add more details here. While higher compression levels usually do improve
the compression ratio, in the typical use case it's an incremental improvement.
Here, we do it to achieve a dramatic increase in compression ratio by
exploiting the fact that we carry multiple sets of very similar large bitcode
blobs, and that we need a compression level high enough to fit one complete
blob into the compression window. At least that's the theory.

Should we print a warning (or just document it?) when the compression level
ends up being below what we'd expect? Considering that good compression starts
at zstd-20, I suspect that the compression ratio will go back to ~2.5x if the
binary for one GPU doubles in size and no longer fits. On top of that,
compression time will also increase, a lot. That will be a rather unpleasant
surprise for whoever runs into it.

ZSTD's current compression parameters are set this way:
https://github.com/facebook/zstd/blob/dev/lib/compress/clevels.h#L47

```
{ 23, 24, 22,  7,  3,256, ZSTD_btultra2},  /* level 19 */
{ 25, 25, 23,  7,  3,256, ZSTD_btultra2},  /* level 20 */
```
The first three numbers are log2 of the largest match distance (windowLog), the
fully searched segment (chainLog), and the dispatch table (hashLog).

2^25 = 32MB which happens to be about the size of the single GPU binary in your 
example. I'm pretty sure this explains why `zstd-20` works so well on it, while 
zstd-19 does not. It will work well for the smaller binaries, but I'm pretty 
sure it will regress for a slightly larger binary.

I think it may be worth experimenting with fine-tuning compression settings and 
instead of blindly setting `zstd-20`, consider the size of the binary we need 
to deal with, and adjust only windowLog/chainLog appropriately.
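
To make the size-based idea concrete, here is a minimal sketch of picking the smallest windowLog whose window spans a given binary. The helper names `minWindowLogFor` and `fitsInWindow` are hypothetical (they exist neither in zstd nor in this patch), and the sketch assumes the only requirement is that the window be at least as long as one code object.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: smallest windowLog such that a window of
// 2^windowLog bytes is at least `sizeBytes` long.
unsigned minWindowLogFor(std::uint64_t sizeBytes) {
  assert(sizeBytes > 0 && "size must be non-zero");
  unsigned windowLog = 0;
  // Stop at 63 so the shift never reaches the width of the type.
  while (windowLog < 63 && (std::uint64_t{1} << windowLog) < sizeBytes)
    ++windowLog;
  return windowLog;
}

// Hypothetical helper: true if `sizeBytes` fits in a 2^windowLog byte window.
bool fitsInWindow(std::uint64_t sizeBytes, unsigned windowLog) {
  assert(windowLog < 64 && "windowLog out of range for a 64-bit shift");
  return sizeBytes <= (std::uint64_t{1} << windowLog);
}
```

For example, `minWindowLogFor(32 * 1024 * 1024)` is 25, matching the level 20 entry quoted above, while a binary that is one byte larger already needs 26.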

Or we could set the default to lower compression level + large windowLog. This 
should still give us most of the compression benefits for the binaries that 
would fit into the window, but would avoid the performance cliff if the binary 
is too large.

I may be overcomplicating this, too. If someone does run into the problem, they
now have a way to work around it by tweaking the compression level.


https://github.com/llvm/llvm-project/pull/83605


[clang] [llvm] [HIP] add --offload-compression-level= option (PR #83605)

2024-03-04 Thread Artem Belevich via cfe-commits


@@ -942,20 +942,28 @@ CompressedOffloadBundle::compress(const 
llvm::MemoryBuffer ,
   Input.getBuffer().size());
 
   llvm::compression::Format CompressionFormat;
+  int Level;
 
-  if (llvm::compression::zstd::isAvailable())
+  if (llvm::compression::zstd::isAvailable()) {
 CompressionFormat = llvm::compression::Format::Zstd;
-  else if (llvm::compression::zlib::isAvailable())
+// Use a high zstd compress level by default for better size reduction.
+const int DefaultZstdLevel = 20;

Artem-B wrote:

> compiling kernels to bitcode for 6 GPU takes 30s. compression with zstd level 
> 20 takes 2s.

This looks acceptable to me.

> unless zstd can be parallelized.

zstd does support multithreaded compression, but enabling it would run into the 
same issue we had with enabling multi-threaded compilation -- it will interfere 
with the build system's idea of resource usage. 


https://github.com/llvm/llvm-project/pull/83605


[clang] [llvm] [HIP] add --offload-compression-level= option (PR #83605)

2024-03-03 Thread Yaxun Liu via cfe-commits

https://github.com/yxsamliu edited 
https://github.com/llvm/llvm-project/pull/83605

