D2096: infinitepush: move the extension to core from fb-hgext
indygreg accepted this revision. indygreg added a comment. This revision is now accepted and ready to land. OK. I applied this series locally and looked at the final code. There are some minor issues with the current series. But I can fix those in flight. I've also captured a number of other issues in the review of this commit. IMO they can be addressed as follow-ups. A larger issue that I think may be worth pursuing is dropping the code to support pre-bundle2. I think it is reasonable for a server operator to say that a repo using this feature requires bundle2. Because attempting to handle e.g. bookmark or phases updates pre-bundle2 is a recipe for disaster since these updates are performed on a separate wire protocol command, after changegroup data is exchanged. Clients pushing without bundle2 may see errors on `pushkey` operations because nodes aren't readily available. And, having a server update bookmarks, phases, etc in multiple bundles isn't reasonable. I would make it so enabling this extension also implies `server.bundle1=false`. INLINE COMMENTS > README:1 > +## What is it? > + I think we can delete this file since it is redundant with info in the docstring of the extension. > __init__.py:1409-1428 > +def _asyncsavemetadata(root, nodes): > +'''starts a separate process that fills metadata for the nodes > + > +This function creates a separate process and doesn't wait for it's > +completion. This was done to avoid slowing down pushes > +''' > + This code can be deleted since `hg debugfilleinfinitepushmetadata` was deleted. > indexapi.py:10 > + > +class indexapi(object): > +"""Class that manages access to infinitepush index. We'll want to port this to `zope.interface`. > sqlindexapi.py:15 > +import warnings > +import mysql.connector > + I think we should rename this module since it is MySQL specific. Various database packages do conform to a similar API. So we could potentially make the SQL layer implementation agnostic by coding to an interface. > sqlindexapi.py:64 > +self.sqlconn = mysql.connector.connect( > +force_ipv6=True, **self.sqlargs) > + `force_ipv6=True`. Kudos to Facebook for getting away with that in their network. But this won't fly in the real world. > sqlindexapi.py:84-85 > +self.sqlcursor = self.sqlconn.cursor() > +self.sqlcursor.execute("SET wait_timeout=%s" % waittimeout) > +self.sqlcursor.execute("SET innodb_lock_wait_timeout=%s" % > + self._locktimeout) And here we have some MySQL specific functionality :/ REPOSITORY rHG Mercurial REVISION DETAIL https://phab.mercurial-scm.org/D2096 To: pulkit, #hg-reviewers, indygreg Cc: indygreg, mercurial-devel ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
D2096: infinitepush: move the extension to core from fb-hgext
indygreg added inline comments. INLINE COMMENTS > __init__.py:136 > +common, > +infinitepushcommands, > +) This deleted module isn't dropped later in the series. > __init__.py:266 > +self._repo = repo > +storetype = self._repo.ui.config('infinitepush', 'storetype', '') > +if storetype == 'disk': Default value can be dropped. > __init__.py:276 > + > +indextype = self._repo.ui.config('infinitepush', 'indextype', '') > +if indextype == 'disk': Default value can be dropped. REPOSITORY rHG Mercurial REVISION DETAIL https://phab.mercurial-scm.org/D2096 To: pulkit, #hg-reviewers, indygreg Cc: indygreg, mercurial-devel ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
D2096: infinitepush: move the extension to core from fb-hgext
indygreg added inline comments. INLINE COMMENTS > __init__.py:258 > +if common.isremotebooksenabled(ui): > +hoist = ui.config('remotenames', 'hoist') + '/' > +if remotebookmark.startswith(hoist): This will need updated to `hoistedpeer`. REPOSITORY rHG Mercurial REVISION DETAIL https://phab.mercurial-scm.org/D2096 To: pulkit, #hg-reviewers, indygreg Cc: indygreg, mercurial-devel ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
D2096: infinitepush: move the extension to core from fb-hgext
pulkit added a comment. In https://phab.mercurial-scm.org/D2096#43544, @indygreg wrote: > Per discussion at sprint, Pulkit will follow up by deleting yet more code around workspaces and scratch branches. We're going to focus on the "try server" use case for the initial landing. > > We will land the full code initially. So if we want to land what we have now and delete code later (rather than waiting for all the commits before landing anything), I'm OK with that. The last commit of this 20 patch series get us to the state where we can easily use it for "try server" use case with no client side wrapping required. REPOSITORY rHG Mercurial REVISION DETAIL https://phab.mercurial-scm.org/D2096 To: pulkit, #hg-reviewers, indygreg Cc: indygreg, mercurial-devel ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
D2096: infinitepush: move the extension to core from fb-hgext
indygreg requested changes to this revision. indygreg added a comment. This revision now requires changes to proceed. Per discussion at sprint, Pulkit will follow up by deleting yet more code around workspaces and scratch branches. We're going to focus on the "try server" use case for the initial landing. We will land the full code initially. So if we want to land what we have now and delete code later (rather than waiting for all the commits before landing anything), I'm OK with that. REPOSITORY rHG Mercurial REVISION DETAIL https://phab.mercurial-scm.org/D2096 To: pulkit, #hg-reviewers, indygreg Cc: indygreg, mercurial-devel ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
D2096: infinitepush: move the extension to core from fb-hgext
indygreg added a comment. I'll try to look at this tomorrow. (I was on a mini vacation last week and didn't spend much time looking at a computer.) REPOSITORY rHG Mercurial REVISION DETAIL https://phab.mercurial-scm.org/D2096 To: pulkit, #hg-reviewers Cc: indygreg, mercurial-devel ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
D2096: infinitepush: move the extension to core from fb-hgext
pulkit added a comment. In https://phab.mercurial-scm.org/D2096#35158, @indygreg wrote: > I see the following high-level potential use cases for infinitepush: > > 1. A hosting/repo management alternative to //forks// > 2. A central repository for submitting changesets to a service (e.g. code review, a "try" repo to trigger CI, etc) > 3. As a backup mechanism I think infinitepush can serve as a very good base for solving these problems. > Let's elaborate. > > As I wrote under the //Forks aren't the Model You are Looking For// section at https://gregoryszorc.com/blog/2017/12/11/high-level-problems-with-git-and-how-to-fix-them/, using clones / separate repos to manage //forks// really isn't a great model. I think you want something far less heavyweight where derivations from the main repo are //hung off// that main repo somehow. I think you want a repo-like primitive - let's call it a //workspace// - that is attached to the main repo. Conceptually, imagine that every push goes to a central store. But instead of exposing every changeset with a traditional repository, we instead have different //views// of that data. A //view// in Mercurial technical terms would be defined by a set of heads, bookmarks, phases, obsolescence markers, etc. Although the rules for what data is in that set isn't clear: I think it is valid to have a view that is independent of the main repo as well as a view that is additive to the main repo (that would allow your //fork// to automatically //track// changes to the main repo). > > I think the Mercurial server should grow the ability to host //workspaces// instead of //forks//. What this looks like and what role infinitepush plays, I'm not sure. But I do know infinitepush dabbles in this space, so we need to consider it. That sounds like a very great idea. If we introduce user namespace in infinitpush bundlestore, we can have per user store, where bundles of that user are stored. Then we can introduce a new namespace like `remote workspace` which can be used to push to that workspace. For example: pulkit$ hg push # should push to my workspace pulkit$ hg pull# should pull from my workspace pulkit$ hg pull --workspace indygreg # should pull form your workspace pulkit$ hg push --workspace indygreg# should raise an error initially This will make all the pushes default to bundlestore. Since this thing sounds like a big BC, we can have all this as a part of extension. > Another use case is a central repository to push/upload to in order to trigger events. For example, you may want to have a central server that receives pushes to trigger code review, trigger a test run of CI, run static analysis, etc. This is different than the //workspaces// use case because the data being pushed is more ephemeral and ad-hoc. There is a difference between pushing something to your //workspace// so you can save and share it versus pushing something to a service to trigger an action on it. To me, this use case sounds best served by server functionality that stashes standalone bundles somewhere and provides access to individual revisions on demand. I think this can be done using a server which stores all the incoming pushes in the bundlestore unless specified otherwise. The current infinitepush extension can handle that well. > Our third use case is backups. If a changeset is produced, it gets uploaded/saved somewhere. This is actually a hybrid between the above 2 items. It initially looks a lot like the central repository storing bundles that can accessed on demand. But if you think of each client using the backup service as an individual entity, then modeling each client as having a //workspace// starts to make sense as well. What if each client's state were automatically backed up / kept in sync with a dedicated //workspace// on the server? I deleted the backup commands, once the initial series get in, will add that back with appending 'debug' in front of command names. > I think we need to decide which of these use cases we're targeting and what is the pathway (if any) for achieving the remaining use cases in the future. I suspect there is potential to integrate remotenames into our vision here. We definitely need to think about how we'd //address// these offshoot stores. Infinitepush currently uses bookmarks. I'm not sure if we want to be forcing people to use bookmarks or if bookmarks are the best way to //route// pushes/pulls to specific areas of the repository. > > @pulkit: I know you have been doing a lot of work with remotenames. I'd be very curious about your thoughts on how you'd integrate //workspaces//, infinitepush, and remotenames. You've also looked at this infinitepush code. So I'm curious about your general thoughts on what you think we should include in core and what our plan should be for adding potentially missing features
D2096: infinitepush: move the extension to core from fb-hgext
indygreg added a comment. In https://phab.mercurial-scm.org/D2096#35153, @pulkit wrote: > In https://phab.mercurial-scm.org/D2096#35123, @indygreg wrote: > > > I can take review of this series since this feature is of extreme interest to Mozilla. I //might// get around to looking at it this weekend. But no promises. > > > Do you have thoughts on which functionality from the following you want in core? > > - the --bundle-store flag to push command > - functionality to pull from bundlestore using hg pull > - functionality to pull changesets from bundlestore if a changeset is not found locally on hg update > - logic around sql store > - interaction with the hoisting functionality of remotenames extension which is also being moved to core I haven't looked at the full code in detail yet. But I think we should take a step back and think about what are the fundamental features/workflows we're trying to accomplish with infinitepush. I see the following high-level potential use cases for infinitepush: 1. A hosting/repo management alternative to //forks// 2. A central repository for submitting changesets to a service (e.g. code review, a "try" repo to trigger CI, etc) 3. As a backup mechanism Let's elaborate. As I wrote under the //Forks aren't the Model You are Looking For// section at https://gregoryszorc.com/blog/2017/12/11/high-level-problems-with-git-and-how-to-fix-them/, using clones / separate repos to manage //forks// really isn't a great model. I think you want something far less heavyweight where derivations from the main repo are //hung off// that main repo somehow. I think you want a repo-like primitive - let's call it a //workspace// - that is attached to the main repo. Conceptually, imagine that every push goes to a central store. But instead of exposing every changeset with a traditional repository, we instead have different //views// of that data. A //view// in Mercurial technical terms would be defined by a set of heads, bookmarks, phases, obsolescence markers, etc. Although the rules for what data is in that set isn't clear: I think it is valid to have a view that is independent of the main repo as well as a view that is additive to the main repo (that would allow your //fork// to automatically //track// changes to the main repo). I think the Mercurial server should grow the ability to host //workspaces// instead of //forks//. What this looks like and what role infinitepush plays, I'm not sure. But I do know infinitepush dabbles in this space, so we need to consider it. Another use case is a central repository to push/upload to in order to trigger events. For example, you may want to have a central server that receives pushes to trigger code review, trigger a test run of CI, run static analysis, etc. This is different than the //workspaces// use case because the data being pushed is more ephemeral and ad-hoc. There is a difference between pushing something to your //workspace// so you can save and share it versus pushing something to a service to trigger an action on it. To me, this use case sounds best served by server functionality that stashes standalone bundles somewhere and provides access to individual revisions on demand. Our third use case is backups. If a changeset is produced, it gets uploaded/saved somewhere. This is actually a hybrid between the above 2 items. It initially looks a lot like the central repository storing bundles that can accessed on demand. But if you think of each client using the backup service as an individual entity, then modeling each client as having a //workspace// starts to make sense as well. What if each client's state were automatically backed up / kept in sync with a dedicated //workspace// on the server? I think we need to decide which of these use cases we're targeting and what is the pathway (if any) for achieving the remaining use cases in the future. I suspect there is potential to integrate remotenames into our vision here. We definitely need to think about how we'd //address// these offshoot stores. Infinitepush currently uses bookmarks. I'm not sure if we want to be forcing people to use bookmarks or if bookmarks are the best way to //route// pushes/pulls to specific areas of the repository. My hunch is that teaching the server to accept incoming pushes and store them in standalone bundles (instead of as part of the main repo store) is a much easier problem to solve than providing //workspaces//. I don't think it requires nearly as many changes and will introduce few - if any - user-facing changes. But the feature isn't as valuable as persisted //workspaces// would be. @pulkit: I know you have been doing a lot of work with remotenames. I'd be very curious about your thoughts on how you'd integrate //workspaces//, infinitepush, and remotenames. You've also looked at this infinitepush code. So I'm curious about
D2096: infinitepush: move the extension to core from fb-hgext
pulkit added a comment. In https://phab.mercurial-scm.org/D2096#35123, @indygreg wrote: > I can take review of this series since this feature is of extreme interest to Mozilla. I //might// get around to looking at it this weekend. But no promises. Do you have thoughts on which functionality from the following you want in core? - the --bundle-store flag to push command - functionality to pull from bundlestore using hg pull - functionality to pull changesets from bundlestore if a changeset is not found locally on hg update - logic around sql store - interaction with the hoisting functionality of remotenames extension which is also being moved to core REPOSITORY rHG Mercurial REVISION DETAIL https://phab.mercurial-scm.org/D2096 To: pulkit, #hg-reviewers Cc: indygreg, mercurial-devel ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
D2096: infinitepush: move the extension to core from fb-hgext
indygreg added a comment. I can take review of this series since this feature is of extreme interest to Mozilla. I //might// get around to looking at it this weekend. But no promises. REPOSITORY rHG Mercurial REVISION DETAIL https://phab.mercurial-scm.org/D2096 To: pulkit, #hg-reviewers Cc: indygreg, mercurial-devel ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel