kbendick commented on a change in pull request #1227:
URL: https://github.com/apache/iceberg/pull/1227#discussion_r487671520
##########
File path: dev/source-release.sh
##########
@@ -66,7 +66,12 @@ tarball=$tag.tar.gz
# be conservative and use the release hash, even though git produces the same
# archive (identical hashes) using the scm tag
-git archive $release_hash --prefix $tag/ -o $tarball .baseline api arrow bundled-guava common core data dev flink gradle gradlew hive mr orc parquet pig spark spark2 spark-runtime spark3 spark3-runtime LICENSE NOTICE README.md build.gradle baseline.gradle deploy.gradle tasks.gradle jmh.gradle gradle.properties settings.gradle versions.lock versions.props version.txt
+adds=" .baseline" # prefixed with a blank space for each file name including the first one
+excludes="build|examples|jitpack.yml|python|site" # excluded as they are not of use for releasing jars
+echo "Excluded files and directories: ${excludes}"
+archives=$(git ls-tree --name-only -r HEAD | cut -d"/" -f1 | uniq | grep -vE ${excludes} | tr '\n' ' ')${adds}
+echo "Included files and directories: ${archives}"
+git archive $release_hash --prefix $tag/ -o $tarball ${archives}
Review comment:
Scratch that, we can tag certain files or folders with `export-ignore`
in the `.gitattributes` file. This is the proper way to exclude them from a
`git archive` command, as it's built into git itself. This would also allow us
to skip the admittedly large amount of shell commands that are going into this.
And while some might not be familiar with `.gitattributes`, I would expect
a release engineer to be.
Here's the documentation on the git archive command:
https://www.linux.org/docs/man1/git-archive.html
And here's a post discussing the usage of `export-ignore` in the
`.gitattributes` file to properly ignore files.
https://feeding.cloud.geek.nz/posts/excluding-files-from-git-archive/
I think this is the best approach, as it uses git's built-in tooling for
generating the archive (so hopefully it should be better understood by release
engineers), and it involves far less shell munging than the current approach.
An example .gitattributes file for the excludes we want would be something
like
```
/build export-ignore
/examples export-ignore
/jitpack.yml export-ignore
# though do we really want to continue to exclude the python directory from the release?
/python export-ignore
/site export-ignore
```
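One way to sanity-check such entries before cutting a release is `git check-attr`, which reports the attribute value git resolves for a given path. The following is only a minimal sketch in a throwaway repository; the `api` and `examples` paths are hypothetical stand-ins, not the real Iceberg layout. Note that the attribute is queried on the directory path itself, since that is the path the `/examples` pattern matches.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway repository so the check never touches a real checkout.
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Hypothetical layout: one directory to keep, one to exclude from archives.
mkdir -p api examples
echo "keep" > api/file.txt
echo "drop" > examples/file.txt
printf '/examples export-ignore\n' > .gitattributes

# Ask git which value of export-ignore it resolves for each top-level path.
checked=$(git check-attr export-ignore -- api examples)
echo "$checked"
```

A path reported as `set` should be skipped by `git archive`, while `unspecified` means it will be included.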
The above complicated set of `excludes`, `archives`, etc. could instead be
replaced by archiving directly from the tag that is already pushed earlier in
the script.
```bash
# Assuming the .gitattributes file is set up with the appropriate
# export-ignore entries
git archive --prefix "$tag/" -o "$tarball" --worktree-attributes "$tag^{tree}"
```
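To see the effect end to end, here is a rough sketch that builds a scratch repository, commits a `.gitattributes` with one `export-ignore` entry, and lists what ends up in the tarball. The tag name, directory names, and committer identity are all made up for illustration. Unlike the command above, it archives the tag directly and relies on git reading the `.gitattributes` committed in that tree, so `--worktree-attributes` is not needed here.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Scratch repository; nothing here touches a real checkout.
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Hypothetical layout: "api" should ship, "site" should not.
mkdir -p api site
echo "keep" > api/file.txt
echo "drop" > site/index.md
printf '/site export-ignore\n' > .gitattributes

git add .
git -c user.name=example -c user.email=example@example.invalid \
    commit -q -m "initial commit"

# Hypothetical tag name, standing in for the release tag from the script.
tag=example-release-0.0.0
git tag "$tag"
tarball="$repo/$tag.tar.gz"

# Archive from the tag; export-ignore entries committed in the tree are honored.
git archive --prefix "$tag/" -o "$tarball" "$tag"

# List the archive contents to confirm what was included.
listing=$(tar -tzf "$tarball")
echo "$listing"
```

The listing should contain the `api` directory under the tag prefix and nothing under `site`. The `.gitattributes` file itself is still archived unless it is also marked `export-ignore`.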
I am going to open a PR for adding a `.gitattributes` file so we can test
with that. You can feel free to add it to this branch if you prefer, but I
can't push to this branch so it's easier for me to create a separate PR.