StefanKarpinski opened a new pull request #8448:
URL: https://github.com/apache/arrow/pull/8448
This pull request merges a synthetic history of the Arrow.jl Julia package
into the main arrow monorepo under the `julia` top-level directory. The history
of Arrow.jl has been rewritten so that it appears that all development was done
in this directory, retaining only a commit for each published version of the
Arrow.jl package. Preserving this history (specifically the git tree objects
associated with each commit) allows Julia's package manager to continue to
install historical versions of Arrow.jl while having the arrow monorepo as the
git repository of record going forward.
I'm making this pull request on behalf of the Arrow.jl project (cc @quinnj,
@ExpandingMan) as the resident git mage. Let me know if there's anything I
should change about this PR to integrate better into the arrow project.
---
For my own record (in case I need to do this again), here's the code I used
to generate the synthetic history:
```jl
using TOML
data = TOML.parse("""
["0.1.2"]
git-tree-sha1 = "5cab061e3fcf0d78291f9c4b3db1f58c8f5e1bc5"
["0.2.0"]
git-tree-sha1 = "5081382c0e5c78c1849b9841b9d8941437060b48"
["0.2.1"]
git-tree-sha1 = "ecfe11bd0874ab41b78be0ca8d0f680ba37978dc"
["0.2.2"]
git-tree-sha1 = "c66fc3e71747c99a3e3940ade685c0d8ea66c0ae"
["0.2.3"]
git-tree-sha1 = "d3c36842140057276f6f8348afa08f0f7dae2d1e"
["0.2.4"]
git-tree-sha1 = "c86df6ed41b3bd192d663e5e0e7cac0d11fd4375"
["0.3.0"]
git-tree-sha1 = "76641f71ac332cd4d3cf54b98234a0f597bd7a2f"
""")
trees = Dict(VersionNumber(k) => v["git-tree-sha1"] for (k, v) in data)
ENV["GIT_AUTHOR_NAME"] = "Jacob Quinn"
ENV["GIT_AUTHOR_EMAIL"] = "[email protected]"
let commit = "16b729db74d78ecb010efab855c9e46c8052f59e"
    for (ver, tree) in sort!(collect(trees), by=first)
        message = """
        ARROW-10228: [Julia] Arrow.jl v$ver
        Co-authored-by: Michael Savastio <[email protected]>
        """
        ENV["GIT_AUTHOR_DATE"] = readchomp(`git show -s --format=%ai v$ver`)
        commit = readchomp(`git commit-tree -p $commit -m $message $tree`)
    end
    run(`git branch -f sk/synthetic $commit`)
end
run(`git filter-repo --force --to-subdirectory-filter julia`)
```
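The core of the script above is chaining `git commit-tree` over tree objects that already exist. Here is a minimal shell sketch of that one step in a throwaway repository. The file name, version strings, branch name, and identity values are placeholders, not taken from Arrow.jl:

```sh
# Toy repository in a temporary directory (illustrative only).
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Placeholder identity for the synthetic commits.
export GIT_AUTHOR_NAME="Example Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="Example Author" GIT_COMMITTER_EMAIL="author@example.com"

# Write two tree objects standing in for two published versions.
echo "version 0.1.0" > VERSION
git add VERSION
tree1=$(git write-tree)
echo "version 0.2.0" > VERSION
git add VERSION
tree2=$(git write-tree)

# One commit per tree: the first has no parent, the second points at the first.
c1=$(git commit-tree -m "v0.1.0" "$tree1")
c2=$(git commit-tree -p "$c1" -m "v0.2.0" "$tree2")
git branch -f synthetic "$c2"

# Each commit's root tree is exactly the tree it was built from.
git rev-parse "synthetic^{tree}"    # same hash as $tree2
git rev-parse "synthetic~1^{tree}"  # same hash as $tree1
```

Because `git commit-tree` takes an existing tree rather than snapshotting the index, the tree hashes are carried into the new history unchanged.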
Then I used the following `.git/config` in a clone of `arrow`:
```gitconfig
[core]
	repositoryformatversion = 0
	filemode = true
	bare = false
	logallrefupdates = true
	ignorecase = true
	precomposeunicode = true
[remote "origin"]
	url = https://github.com/apache/arrow.git
	fetch = +refs/heads/*:refs/remotes/origin/*
[remote "StefanKarpinski"]
	url = https://github.com/StefanKarpinski/arrow.git
	fetch = +refs/heads/*:refs/remotes/StefanKarpinski/*
[remote "Arrow.jl"]
	url = ../Arrow.jl
	fetch = +refs/heads/*:refs/remotes/Arrow.jl/*
[branch "master"]
	remote = origin
	merge = refs/heads/master
```
With that setup, you just do this in the `arrow` clone:
```sh
git fetch Arrow.jl --no-tags
git merge Arrow.jl/sk/synthetic
```
Enter the merge commit message when prompted.
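One way to sanity-check the result is to look for each published tree as the `julia` subdirectory of some commit in the merged history. This rests on an assumption: git trees are content-addressed, so moving a tree into a subdirectory should not change its hash, and the `git-tree-sha1` values listed earlier should therefore resolve as `<commit>:julia`. The sketch below does that lookup. The helper name `find_subtree_commit` is hypothetical, and walking every commit will be slow on a repository the size of arrow:

```sh
# find_subtree_commit TREE [SUBDIR]
# Print the first commit reachable from HEAD whose SUBDIR tree has hash TREE.
# Returns non-zero if no such commit is found. SUBDIR defaults to "julia".
find_subtree_commit() {
    _want=$1
    _subdir=${2:-julia}
    for _c in $(git rev-list HEAD); do
        # Commits without SUBDIR make rev-parse fail quietly; skip them.
        _got=$(git rev-parse --verify -q "$_c:$_subdir" 2>/dev/null) || continue
        if [ "$_got" = "$_want" ]; then
            echo "$_c"
            return 0
        fi
    done
    return 1
}

# Usage inside the merged arrow clone, with the 0.3.0 hash from the TOML data:
# find_subtree_commit 76641f71ac332cd4d3cf54b98234a0f597bd7a2f
```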