kbendick commented on a change in pull request #1227:
URL: https://github.com/apache/iceberg/pull/1227#discussion_r487671520



##########
File path: dev/source-release.sh
##########
@@ -66,7 +66,12 @@ tarball=$tag.tar.gz
 
 # be conservative and use the release hash, even though git produces the same
 # archive (identical hashes) using the scm tag
-git archive $release_hash --prefix $tag/ -o $tarball .baseline api arrow bundled-guava common core data dev flink gradle gradlew hive mr orc parquet pig spark spark2 spark-runtime spark3 spark3-runtime LICENSE NOTICE README.md build.gradle baseline.gradle deploy.gradle tasks.gradle jmh.gradle gradle.properties settings.gradle versions.lock versions.props version.txt
+adds=" .baseline"  # prefixed with a blank space for each file name including the first one
+excludes="build|examples|jitpack.yml|python|site"  # excluded as they are not of use for releasing jars
+echo "Excluded files and directories: ${excludes}"
+archives=$(git ls-tree --name-only -r HEAD | cut -d"/" -f1 | uniq | grep -vE ${excludes} | tr '\n' ' ')${adds}
+echo "Included files and directories: ${archives}"
+git archive $release_hash --prefix $tag/ -o $tarball ${archives}

Review comment:
       Scratch that, we can tag certain files or folders with `export-ignore` 
in the `.gitattributes` file. This is the proper way to exclude them from a 
`git archive` command, as it's built into git itself. This would also allow us 
to skip the admittedly large amount of shell commands going into this. And 
while some might not be familiar with `.gitattributes`, I would expect a 
release engineer to be.
   
   Here's the documentation on the git archive command: 
https://www.linux.org/docs/man1/git-archive.html
   And here's a post discussing the usage of `export-ignore` in the 
`.gitattributes` file to properly ignore files. 
https://feeding.cloud.geek.nz/posts/excluding-files-from-git-archive/
   
   I think this is the best approach, as it uses git's built-in tools for 
generating the archive (so it should be better understood by release 
engineers) and it involves far less shell munging than the current approach.
   
   An example .gitattributes file for the excludes we want would be something 
like
   ```
   /build export-ignore
   /examples export-ignore
   /jitpack.yml export-ignore
   /python export-ignore
   /site export-ignore
   ```
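   
   One way to sanity-check entries like these before cutting a release is 
`git check-attr`, which reports whether a path picks up a given attribute. The 
following is a minimal sketch in a throwaway repository, not part of the 
release script; the path names mirror the `excludes` list in the diff, and 
`core` is included only as an example of a path that should stay in the 
archive.
   
   ```bash
   # Sketch: confirm which top-level paths carry export-ignore.
   # Runs in a temporary, empty repository so nothing in a real checkout changes.
   set -eu
   repo=$(mktemp -d)
   cd "$repo"
   git init -q .
   
   # Same style of entries as the example above (jitpack.yml as named in the diff).
   printf '%s\n' \
     '/build export-ignore' \
     '/examples export-ignore' \
     '/jitpack.yml export-ignore' \
     '/python export-ignore' \
     '/site export-ignore' > .gitattributes
   
   # check-attr works on path names; the paths do not need to exist on disk.
   check_output=$(git check-attr export-ignore -- build examples jitpack.yml python site core)
   echo "$check_output"
   ```
   
   Paths that match an entry are reported as `set`, and everything else as 
`unspecified`.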
   
   The above complicated set of `excludes`, `archives`, etc. could instead be 
replaced by archiving directly from the tag that is already pushed earlier in 
the script.
   
   ```bash
   # Assuming the .gitattributes file is set up with the appropriate export-ignore entries
   git archive --prefix "$tag/" -o "$tarball" --worktree-attributes "$tag^{tree}"
   ```
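   
   To see the effect end to end, here is a small sketch that builds a 
throwaway repository, commits a `.gitattributes` with one `export-ignore` 
entry, tags it, and lists what ends up in the tarball. The tag name, file 
names, and committer identity are all made up for illustration. Because the 
attributes are committed here, `git archive` reads them from the tree being 
archived, so this sketch does not need `--worktree-attributes`.
   
   ```bash
   # Sketch: show that export-ignore keeps a directory out of `git archive` output.
   set -eu
   repo=$(mktemp -d)
   cd "$repo"
   git init -q .
   git config user.email "release@example.invalid"   # placeholder identity
   git config user.name "Release Sketch"
   
   mkdir core site
   echo 'kept' > core/kept.txt
   echo 'dropped' > site/dropped.txt
   printf '%s\n' '/site export-ignore' > .gitattributes
   
   git add .
   git commit -q -m "sketch: add export-ignore"
   
   tag=sketch-0.0.1            # hypothetical tag name
   tarball=$tag.tar.gz
   git tag "$tag"
   
   # The tar.gz format is inferred from the output file extension.
   git archive --prefix="$tag/" -o "$tarball" "$tag"
   
   listing=$(tar -tzf "$tarball")
   echo "$listing"
   ```
   
   The listing should contain `core/kept.txt` under the tag prefix and nothing 
from `site`.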
   
   I am going to open a PR for adding a `.gitattributes` file so we can test 
with that. You can feel free to add it to this branch if you prefer, but I 
can't push to this branch so it's easier for me to create a separate PR.






