kbendick commented on a change in pull request #1227:
URL: https://github.com/apache/iceberg/pull/1227#discussion_r487659408



##########
File path: dev/source-release.sh
##########
@@ -66,7 +66,12 @@ tarball=$tag.tar.gz
 
 # be conservative and use the release hash, even though git produces the same
 # archive (identical hashes) using the scm tag
-git archive $release_hash --prefix $tag/ -o $tarball .baseline api arrow bundled-guava common core data dev flink gradle gradlew hive mr orc parquet pig spark spark2 spark-runtime spark3 spark3-runtime LICENSE NOTICE README.md build.gradle baseline.gradle deploy.gradle tasks.gradle jmh.gradle gradle.properties settings.gradle versions.lock versions.props version.txt
+adds=" .baseline"  # prefixed with a blank space for each file name including the first one
+excludes="build|examples|jitpack.yml|python|site"  # excluded as they are not of use for releasing jars
+echo "Excluded files and directories: ${excludes}"
+archives=$(git ls-tree --name-only -r HEAD | cut -d"/" -f1 | uniq | grep -vE ${excludes} | tr '\n' ' ')${adds}
+echo "Included files and directories: ${archives}"
+git archive $release_hash --prefix $tag/ -o $tarball ${archives}

Review comment:
       Ok. So I've come up with another solution that generates the proper 
tarball, akin to my second command there, but still broken up into chunks (vs 
doing WAY too much shell scripting on a single line).
   
   Instead of doing `git archive $release_hash --prefix $tag/ -o $tarball $(git 
ls-tree --name-only -r HEAD | cut -d"/" -f1 | uniq | grep -vE ${excludes} | tr 
'\n' ' ')`, where we inline what was previously the $archives variable, we have 
two options. We can write $archives to a file and then cat that file as part of 
the git archive command, or we can just echo the $archives variable inline 
(which seems to handle the weird string issues in my zsh shell, which is a 
relatively common shell).
   
   If we wrote it to a file, that file wouldn't get included as it wouldn't be 
in the `git ls-tree`. We'd just want to `rm` it at the end to do some clean up. 
Alternatively, we could create the `/tmp` folder somewhat higher up in the 
script so that we can place the archives file there and then it wouldn't be in 
danger of accidentally making it into the git tree as `/tmp` is part of the 
gitignore.
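
A minimal sketch of the temporary-file idea, assuming `mktemp` is available (it is on common Linux and macOS systems, though it is not part of POSIX). The variable name `to_include` and the sample list are placeholders, not part of the script under review. The `trap` removes the file even if a later step fails:

```bash
# Create a scratch file outside the working tree so it can never be archived
to_include=$(mktemp)
# Remove the scratch file when the script exits, whether it succeeds or fails
trap 'rm -f "$to_include"' EXIT

# Placeholder list; the real script would write $archives here
echo ".baseline api core" > "$to_include"

# Unquoted command substitution splits the file contents into separate words
set -- $(cat "$to_include")
echo "Would archive $# entries: $*"
```

Because the file lives in the system temporary directory, nothing needs to be added to the gitignore for it.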
   
   So here's my proposed solution. It's a little more complex than using 
$archives inline, but my zsh shell interprets the substituted variable as a 
single string (unlike bash, zsh does not word-split unquoted variables by 
default), and either of these approaches gets around that.
   
   ```bash
   # Note: I am removing the adds line, as it is no longer needed because
   # `git ls-tree` will include .baseline.
   excludes="build|examples|jitpack.yml|python|site"  # excluded as they are not of use for releasing jars
   # Alternatively, we might add the `.github` directory to the excludes list,
   # as it's not needed for releasing.
   echo "Excluded files and directories: ${excludes}"
   # Generate the list of files to archive.
   # Note that we no longer need to append $adds, as the `.baseline` file
   # comes from git ls-tree.
   archives=$(git ls-tree --name-only -r HEAD | cut -d"/" -f1 | uniq | grep -vE "${excludes}" | tr '\n' ' ')
   echo "Included files and directories: ${archives}"
   
   # Two possible ways of achieving this.
   # Way 1: echo the variable inside a command substitution so that it is split
   # into separate words rather than passed as a single string. This worked for me.
   git archive $release_hash --prefix $tag/ -o $tarball $(echo $archives)
   
   # Way 2: write the archives to a file and then cat that file where we are
   # currently placing $archives.
   echo $archives > to_include.txt
   git archive $release_hash --prefix $tag/ -o $tarball $(cat to_include.txt)  # also generates the tarball
   rm to_include.txt
   ```
   
   I'm still looking to see if there's a cleaner way to do this, but either of 
the above solutions will work across multiple shells. I think excluding zsh 
shell portability would be short sighted as it's now the default on the latest 
versions of macOS if I'm not mistaken.
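
One candidate for a cleaner approach, offered as a sketch rather than something tested against this script: pipe the names into `xargs`, which builds the argument list itself and so does not depend on how the shell splits variables. `list_tracked` is a hypothetical stand-in for `git ls-tree --name-only -r HEAD`, and the hash, tag, and file names are placeholders. `echo` is placed in front of `git archive` so that the sketch only prints the command:

```bash
# Hypothetical stand-in for `git ls-tree --name-only -r HEAD`
list_tracked() {
  printf '%s\n' .baseline/config api/a.java api/b.java core/c.java python/setup.py site/index.md
}

excludes="build|examples|jitpack.yml|python|site"
release_hash=abc1234   # placeholder value
tag=example-tag        # placeholder value
tarball=$tag.tar.gz

# Same filtering pipeline, minus the tr step; xargs reads one name per line
select_archives() {
  list_tracked | cut -d"/" -f1 | uniq | grep -vE "${excludes}"
}

# Remove the leading echo to run the real command
select_archives | xargs echo git archive "$release_hash" --prefix "$tag/" -o "$tarball"
```

This assumes no top-level name contains whitespace, since `xargs` splits its input on blanks and newlines.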
   
   Both of my solutions, however, generated tarballs with exactly the same 
contents when inspected via `tar -tf $tarball`.
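
That check can be scripted. The sketch below compares the listings of two tarballs. `same_listing` and the throwaway archives built here are illustrative only, standing in for the tarballs produced by the two approaches, and `mktemp -d` is assumed to be available:

```bash
# Succeeds when both tarballs list exactly the same entries
same_listing() {
  [ "$(tar -tf "$1" | sort)" = "$(tar -tf "$2" | sort)" ]
}

# Build two throwaway tarballs with identical contents for demonstration
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
mkdir -p "$workdir/src/api"
echo demo > "$workdir/src/api/file.txt"
tar -czf "$workdir/way1.tar.gz" -C "$workdir/src" api
tar -czf "$workdir/way2.tar.gz" -C "$workdir/src" api

if same_listing "$workdir/way1.tar.gz" "$workdir/way2.tar.gz"; then
  echo "listings match"
else
  echo "listings differ"
fi
```

Note that this compares entry names only, not file contents.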
   
   Let me know what you think @rdblue. I'm personally fond of writing it out to 
a file.






