Our current strategy for staying in sync with illumos is to merge
changes from illumos into the OpenZFS repo on a recurring basis
(daily, weekly, monthly, whatever). The current workflow looks a
bit like this:
1. We land pull requests onto OpenZFS, slightly diverging from
illumos
2. illumos lands patches submitted to them by non-OpenZFS folks
(usually patches unrelated to the illumos ZFS source), slightly
diverging from OpenZFS
3. We push patches unique to OpenZFS (i.e. the patches we land via
our pull requests) back to illumos via the normal illumos RTI
process, reducing the divergence between the illumos ZFS and
OpenZFS implementations.
4. We merge in the entire illumos tree periodically, to pull in
changes that were landed in illumos (again, usually non-ZFS
changes) to reduce the divergence even more.
As a result of this type of work flow, the following subtleties arise:
- OpenZFS will rarely be completely "in sync" with illumos. Even
after a merge with illumos, it'll likely be the case that we have
outstanding patches in OpenZFS that still remain to land in
illumos. So, best case, we only have "extra" changes that need to
be pushed to illumos; worst case, we have extra changes to push
and illumos also has changes that we need to pull.
- Since we're pushing changes to illumos via the normal RTI process,
and then merging their tree in with OpenZFS, there will inevitably
be "two commits" for each change that we land in OpenZFS: one that
landed in our tree first, and a second that landed in
illumos and that we inherit via the merge with illumos. To me, this is
a non-issue since the two patches will likely be identical (or at
least very close to it). Is there a specific reason why you think
this is an issue? Maybe I'm overlooking something?
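To make the "two commits" case concrete, here's a toy Python model of the history after one merge. It is only a sketch: the commit ids and patch labels are hypothetical, and a plain string stands in for each commit's diff. It shows the same change appearing twice, and how comparing by patch content rather than by ancestry narrows the list down to what still needs to go to illumos.

```python
# Toy model: each commit has parents and a "patch" string standing in for its diff.
# All commit ids and patch labels are hypothetical.
commits = {
    "base":  {"parents": [], "patch": "initial import"},
    # Landed in OpenZFS via pull requests.
    "oz-1":  {"parents": ["base"], "patch": "fix A"},
    "oz-2":  {"parents": ["oz-1"], "patch": "fix B"},
    # illumos: an unrelated change, then "fix A" arriving via RTI.
    "il-1":  {"parents": ["base"], "patch": "non-ZFS change"},
    "il-2":  {"parents": ["il-1"], "patch": "fix A"},
    # Periodic merge of illumos into OpenZFS (a merge carries no patch here).
    "merge": {"parents": ["oz-2", "il-2"], "patch": None},
}


def reachable(tip):
    """Return every commit reachable from tip, including tip itself."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(commits[commit]["parents"])
    return seen


openzfs = reachable("merge")
illumos = reachable("il-2")

# By ancestry alone: every non-merge commit OpenZFS has that illumos lacks.
only_in_openzfs = sorted(
    c for c in openzfs - illumos if commits[c]["patch"] is not None
)

# By patch content: drop commits whose change illumos already contains.
illumos_patches = {commits[c]["patch"] for c in illumos}
still_to_push = [
    c for c in only_in_openzfs if commits[c]["patch"] not in illumos_patches
]

# The "two commits" for the same change.
fix_a_copies = sorted(c for c in openzfs if commits[c]["patch"] == "fix A")

print("only in OpenZFS by ancestry:", only_in_openzfs)  # ['oz-1', 'oz-2']
print("still to push by content:", still_to_push)       # ['oz-2']
print("copies of fix A:", fix_a_copies)                 # ['il-2', 'oz-1']
```

On a real repository, I believe `git cherry` does a similar patch-content comparison, though it only matches when the two patches really are identical.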
One reason I'm against rebasing the OpenZFS repository on top of
illumos, instead of keeping our current merge workflow, is that it'll
rewrite the history and prevent a simple "pull" from working. Each time
a downstream consumer of OpenZFS wants to pull our changes, they'll have
to do a "force" pull since we will have rewritten the history of the
branch.
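To sketch why, here's a toy Python model of the commit graph. It is not real git: the commit names and the `can_fast_forward` helper are made up for illustration. It assumes a plain pull works without forcing only when the commit a consumer already has is still an ancestor of the new branch tip.

```python
# Toy model of a commit graph: each commit maps to the list of its parents.
# Commit names are hypothetical.

def ancestors(graph, tip):
    """Return the set of commits reachable from tip, including tip itself."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(graph[commit])
    return seen


def can_fast_forward(graph, old_tip, new_tip):
    """A plain pull can fast-forward only if old_tip is an ancestor of new_tip."""
    return old_tip in ancestors(graph, new_tip)


# "base" is shared history, "ours" is an OpenZFS-only commit,
# and "theirs" is an illumos-only commit.
graph = {
    "base": [],
    "ours": ["base"],
    "theirs": ["base"],
}
downstream_tip = "ours"  # what a downstream consumer last pulled

# Merge workflow: a new merge commit with both tips as parents.
merged = dict(graph)
merged["merge"] = ["ours", "theirs"]

# Rebase workflow: "ours" is replayed on top of "theirs" as a new commit,
# so the original "ours" is no longer part of the branch.
rebased = dict(graph)
rebased["ours-rebased"] = ["theirs"]

merge_ok = can_fast_forward(merged, downstream_tip, "merge")
rebase_ok = can_fast_forward(rebased, downstream_tip, "ours-rebased")

print("plain pull works after merge:", merge_ok)    # True
print("plain pull works after rebase:", rebase_ok)  # False
```

In the merge case the consumer's old tip is still in the history, so they just move forward; in the rebase case it isn't, which is where the "force" comes in.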
I've seen this "rebase" workflow work relatively well for small projects
in which the "master" branch isn't actually rebased. Instead, a new
branch is created that is a rebased version of the previous branch.
Thus, no re-writing of the history, but at the same time the original
branch is "abandoned" so it's effectively the same thing.
Conflicts are also much more painful to resolve when rebasing many commits
onto a new tree. But, practically, I don't think this will be a real issue
for us since we plan to continually RTI our "local" commits to illumos,
so we shouldn't have conflicts if we were to rebase instead of merge.
So, I'm in favor of merges because they don't rewrite the history;
rewriting it makes things more difficult for downstream consumers of the
repository. But, I don't think I fully understand your concerns about
the "two commits" issue, so it's hard for me to weigh that against the
merge workflow. Can you elaborate a bit more about why this concerns
you?
Additionally, if we end up getting a lot of pull requests and
contributions from the community, I think the cost of doing an RTI will
become unreasonable. In that case, I think it'll make sense to
work with the illumos advocates to expedite the process of
syncing our OpenZFS changes with illumos. One method I envision for that
would be to, instead of RTI'ing each commit into illumos, simply merge
OpenZFS into illumos. In that world, we wouldn't run into this "two
commits for each pull request" scenario, as we would merge our local
commits into illumos via a single merge commit to that repository, much
like we do with changes made on illumos. On the surface, it doesn't seem
like our merging with illumos vs. rebasing onto illumos affects this
long-term idea, but it might be worth considering.
Ultimately, to me, the most important things to maintain, regardless of
which workflow we choose, are these:
1. It needs to be easy for developers to push changes to OpenZFS.
This is definitely the most important thing, and the main reason
we created this repository to begin with. Getting all platforms
sharing code and reviewing other developers' code is good for
everybody using OpenZFS on any platform. Pull requests seem like
the best method to facilitate this.
2. It needs to be easy for OpenZFS maintainers to accept changes
pushed to OpenZFS and quickly and painlessly get these integrated
into illumos. Which is why we require most of the same
integration "rules" as illumos, and also why we continually keep
OpenZFS "in sync" with illumos as much as possible.
3. It needs to be easy and intuitive for downstream folks to pull
the latest version of OpenZFS, and merge our repository into
whatever work they may be doing. Which is why we've adopted a
strategy to merge with illumos instead of rebase.
I think using rebase instead of merge would probably help with point 2
(much easier to get a definitive list of what remains to be pushed
upstream, as you said), but I worry that it'll hurt point 3 (due to
requiring "force" updates). I'm open to new ideas, though, and really
appreciate the involvement; I just want to make sure we don't hurt an
important use case while trying to cater to another.