This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/main by this push:
     new d5c0f32a87 Fix reproducibility of source-tarballs prepared as part of release (#36819)
d5c0f32a87 is described below

commit d5c0f32a87775984d156f0a3827f457808db7afc
Author: Jarek Potiuk <[email protected]>
AuthorDate: Tue Jan 16 22:02:40 2024 +0100

    Fix reproducibility of source-tarballs prepared as part of release (#36819)
    
    The source tarball that was supposed to be reproducible was only
    "almost" reproducible. It turned out that we forgot that group
    permissions change depending on the umask that is configured in
    the system (because using umask is the default git configuration
    when files are checked out). This means that the packages were
    reproducible only if the people who built them had the same
    umask set.
    
    Since the default umask is unlikely to remove any of the owner's
    rwx permissions, tools that aim to provide reproducibility
    typically set the umask so that it clears all permissions for
    group and other. This way the files in the archive have only
    owner permissions set.
    
    This is what this PR does: we set tar.umask to 0077 when running
    the `git archive` command, which effectively clears all the group
    and other permissions.
    
    In this PR we also fix the FutureWarning raised in newer versions
    of Python, where GzipFile should receive its mode as a keyword
    parameter rather than a positional one.
---
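
Note on the tar.umask change described above: a minimal sketch, not part
of this commit, of why masking with 0077 removes the difference between
builders. `apply_umask` is a hypothetical helper, and the 0666 starting
mode and the 0022/0002 umasks are assumptions chosen for illustration.

```python
def apply_umask(mode: int, umask: int) -> int:
    """Clear every permission bit that is set in the umask."""
    return mode & ~umask & 0o777


# Two builders with different (assumed) umasks end up with different modes.
builder_a = apply_umask(0o666, 0o022)
builder_b = apply_umask(0o666, 0o002)

# Masking with 0077 keeps only the owner bits, so both become identical.
normalized_a = apply_umask(builder_a, 0o077)
normalized_b = apply_umask(builder_b, 0o077)

print(oct(builder_a), oct(builder_b))        # 0o644 0o664
print(oct(normalized_a), oct(normalized_b))  # 0o600 0o600
```

The owner bits survive in both cases, which is why clearing group and
other is enough to make the recorded modes agree.
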
 dev/breeze/src/airflow_breeze/commands/release_candidate_command.py | 2 ++
 dev/breeze/src/airflow_breeze/utils/reproducible.py                 | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)
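
Note on the GzipFile change: a small sketch, not taken from the
repository, of writing a gzip stream with an explicit `mode` keyword and
`mtime=0` so that repeated runs give identical bytes. `gzip_bytes` is a
hypothetical helper and the in-memory buffer is an assumption made to
keep the example self-contained.

```python
import gzip
import io


def gzip_bytes(payload: bytes) -> bytes:
    """Compress payload into a gzip stream with a zeroed timestamp."""
    buffer = io.BytesIO()
    # mode is passed by keyword: the first positional parameter of
    # GzipFile is filename, not mode.
    with gzip.GzipFile(fileobj=buffer, mode="wb", mtime=0) as gzip_file:
        gzip_file.write(payload)
    return buffer.getvalue()


first = gzip_bytes(b"reproducible")
second = gzip_bytes(b"reproducible")
print(first == second)  # True
```

With `mtime=0` the four timestamp bytes in the gzip header are zero, so
the output does not depend on when the archive was built.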

diff --git a/dev/breeze/src/airflow_breeze/commands/release_candidate_command.py b/dev/breeze/src/airflow_breeze/commands/release_candidate_command.py
index 11cba8d9f5..5408420a30 100644
--- a/dev/breeze/src/airflow_breeze/commands/release_candidate_command.py
+++ b/dev/breeze/src/airflow_breeze/commands/release_candidate_command.py
@@ -88,6 +88,8 @@ def tarball_release(version: str, version_without_rc: str, source_date_epoch: in
         run_command(
             [
                 "git",
+                "-c",
+                "tar.umask=0077",
                 "archive",
                 "--format=tar.gz",
                 f"{version}",
diff --git a/dev/breeze/src/airflow_breeze/utils/reproducible.py b/dev/breeze/src/airflow_breeze/utils/reproducible.py
index a85d871a3c..ca272ea9ec 100644
--- a/dev/breeze/src/airflow_breeze/utils/reproducible.py
+++ b/dev/breeze/src/airflow_breeze/utils/reproducible.py
@@ -100,7 +100,7 @@ def archive_deterministically(dir_to_archive, dest_archive, prepend_path=None, t
         # packaging (in case of exceptional situations like running out of disk space).
         temp_file = f"{dest_archive}.temp~"
         with os.fdopen(os.open(temp_file, os.O_WRONLY | os.O_CREAT, 0o644), "wb") as out_file:
-            with gzip.GzipFile("wb", fileobj=out_file, mtime=0) as gzip_file:
+            with gzip.GzipFile(fileobj=out_file, mtime=0, mode="wb") as gzip_file:
                 with tarfile.open(fileobj=gzip_file, mode="w:") as tar_file:
                     for entry in file_list:
                         arcname = entry
