On Sat, May 26, 2018 at 2:23 AM, Daniel Alley <[email protected]> wrote:
> @Brian
>
> I agree with a lot of those points, but I would say that we're not just
> competing against hodgepodge collections of "scripts", but also against
> writing small microservice-y Flask apps that only implement the API for one
> content type.
>
> Also, rollback is not something Pulp would necessarily be able to offer with
> respect to history-sensitive content and metadata, like git repositories, or
> the Cargo example I provided. It's still something the plugin writer would
> have to implement themselves in this case.
>
> @Jeff
>
>> perhaps a new component of a Publication like PublishedDirectory that
>> references an OSTree/Git repository created in /var/lib/pulp/published.
>
> I like the idea generally, but I don't think it would be able to be a
> component of a Publication. I think it would need to be an alternative to a
> Publication which fulfills a similar function.
>
> The fundamental problem is this scenario:
>
> 1. You upload a git repository with a git repository plugin.
> 2. You publish and distribute version 1 of the git repository.
> 3. You publish and distribute version 2 of the git repository.
> 4. A client downloads the git repository.
> 5. You notice a problem and decide to roll back to version 1. A publication
>    of version 1 already exists, which you distribute.
> 6. Clients have a broken git history. New clients can download the old
>    version, but anyone who has already downloaded version 2 will not be
>    able to roll back to version 1 by pulling from Pulp.
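
To make the scenario above concrete, here is a minimal sketch in Python. It is
not Pulp or git code: the commit ids (c1..c4), the PUBLISHED_TIP mapping and
both helper functions are hypothetical names, and history is assumed to be a
simple linear chain. It only illustrates that a pull is clean when the
client's tip is still contained in the history being served.

```python
# Hypothetical model of a linear history: each commit maps to its parent
# (None for the root commit). None of these names come from Pulp or git.
PARENTS = {"c1": None, "c2": "c1", "c3": "c2"}

# Assumption: the publication of version 1 ends at c2, version 2 ends at c3.
PUBLISHED_TIP = {1: "c2", 2: "c3"}


def ancestors(commit, parents):
    """Return the commit followed by all of its ancestors, newest first."""
    chain = []
    while commit is not None:
        chain.append(commit)
        commit = parents[commit]
    return chain


def can_fast_forward(local_tip, remote_tip, parents):
    """A pull is a clean fast-forward only if the client's tip is part of
    the history that the server currently distributes."""
    return local_tip in ancestors(remote_tip, parents)


# A client that cloned version 1 can pull version 2 without trouble.
upgrade_ok = can_fast_forward(PUBLISHED_TIP[1], PUBLISHED_TIP[2], PARENTS)

# A client that cloned version 2 cannot pull the re-distributed version 1:
# its tip (c3) is not part of the history that ends at c2.
rollback_ok = can_fast_forward(PUBLISHED_TIP[2], PUBLISHED_TIP[1], PARENTS)

# Rolling back with a revert commit (c4 on top of c3) keeps history linear,
# so the same client can pull it.
PARENTS_WITH_REVERT = dict(PARENTS, c4="c3")
revert_ok = can_fast_forward(PUBLISHED_TIP[2], "c4", PARENTS_WITH_REVERT)
```

In this model only the re-distribution of version 1 fails the check; the
upgrade and the revert-on-top case both pass.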

Just trying to understand the situation: is that because the rollback
actually creates a version #3 that's "newer" but lacks the rolled-back
commits? So there would be a "merge" conflict if folks who cloned #2 want to
pull from version #3, because their branch contains a commit the origin now
lacks? Or rather that the published bits of version #2 don't exist anymore
at all?

>
> We need to prevent step 5 from happening.
>
> There are a couple of possible solutions to this problem:
>
> 1. As a Pulp admin, you ignore Pulp's rollback functionality. Instead of
>    using Pulp to roll back, you manually revert the commits using git, and
>    upload a new version of the repository to Pulp as "version 3". You then
>    distribute version 3 instead of version 1. You understand that if you
>    were to publish an old version using Pulp, it would misbehave for
>    clients that tried to pull / update instead of cloning.

In my opinion, folks needing Pulp to track a git(-like) repo are probably
interested in more workflows than just the clone.

> 2. As a Pulp admin / plugin writer / user, you know that the client for
>    the content type will never try to pull or update, only clone.
>    Therefore it is not a problem for you and can be ignored.

The cloning might be equivalent to just snapshotting the tree at a
particular commit and publishing a plain tar.gz without the git structures.
Limiting but clean?

> 3. As a Plugin writer, whenever you publish a new version of the git
>    repository, you delete or invalidate every publication for previous
>    versions for the distribution base path. If a Pulp admin wants to roll
>    back, they need to create a new Publication. The Plugin knows to apply
>    revert commits on top of the repository to keep history linear.
>
>    But really we've just pushed the problem forwards. What happens when
>    you want to upload future versions?
>    Now the history of the git repository in Pulp is different from the
>    Pulp admin's git repo history.
>
>    This is only acceptable for content types where the history is
>    immaterial to the content itself. Probably viable for Cargo, but
>    probably not a Git content type.

Does it mean the publication directory's git tree is built anew every time
a rollback happens? So the Pulp history and the original project history
are meant to be different? Can there ever be conflicts?

> 4. As a Plugin writer, you ignore publications entirely. You don't make it
>    possible to do the wrong thing. You have something along the lines of a
>    "PluginManagedDirectory" which core does not try to mess with. If you
>    want to implement rollback functionality, you do it through your own
>    API where the side effects are more easily controlled and reasoned
>    about.

+1, this seems like the cleanest way to me.

>
> I have doubts about whether Option 3 is viable - it seems like making it
> work reliably would be difficult.

I'd say options #1 and #3 are the same, with #3 adding the complexity of
automating the rollback in Pulp. Options #2 and #4 are the same in the
sense that Pulp stays away from a content type's incompatible workflow
while providing a limited subset of functionality to the consumer. In
addition, #4 allows a Pulp service/host to provide both the Pulp-specific,
limited functionality and the incompatible, content-type-specific workflows
from a "single" point. This might be a benefit to some folks.

Option #5: somehow make core Pulp (content versioning) compatible with the
Git model ;)

--
milan

>
> On Fri, May 25, 2018 at 5:05 PM, Brian Bouterse <[email protected]> wrote:
>>
>> I think Pulp does have enough value proposition over a script-based
>> alternative to make it worthwhile for all of those types of plugins. Here
>> are a few points I think about:
>>
>> * Scalability. A common story users tell is that scripts work well up
>> until a point.
>> Doing it for an entire organization, or when content comes
>> from many places, or with more than a few people involved in maintaining
>> the content, it becomes unmaintainable.
>>
>> * Stacks of content. Often a group of content goes together, but each
>> piece of content is updated separately. For instance with Ansible roles,
>> you may use many of them together to deploy something, but each role may
>> receive changes separately. I think of all this content together as a
>> "stack". Keeping everything up to date can be challenging. Managing that
>> change with scripts can be hard and fragile. Also, the ability to roll
>> back quickly and confidently is something Pulp can offer.
>>
>> * Organizing content is easier. Having an API that you can use to
>> organize content is easier than doing lots and lots of git yourself or
>> with scripts.
>>
>> * Tasking. Long-running tasks (and a lot of them) can be unwieldy, and
>> Pulp keeps them organized and running well.
>>
>> * Static and vulnerability analysis. We're seeing interest in using
>> analysis projects like Clair (https://github.com/arminc/clair-scanner) to
>> scan content in Pulp. By bringing all the content into one place, and
>> with that place having a tasking system, plugin writers can control how
>> their content is analyzed continuously.
>>
>> Also +1 to jortel's idea. I think that's a great idea and exactly what we
>> need.
>>
>>
>> On Thu, May 24, 2018 at 1:33 PM, Jeff Ortel <[email protected]> wrote:
>>>
>>> On 05/17/2018 07:46 AM, Daniel Alley wrote:
>>>
>>> Some content types are not going to be compatible with the normal
>>> sync/publish/distribute Pulp workflows, and will need to be live
>>> API-only. To what degree should Pulp accommodate these use cases?
>>>
>>> Example:
>>>
>>> Pulp makes the assumptions that
>>>
>>> A) the metadata for a repository can be generated in its entirety by the
>>> known set of content in a RepositoryVersion, and
>>>
>>> B) the client wouldn't care if you point it at an older version of the
>>> same repository.
>>>
>>> Cargo, the package manager for the Rust programming language, expects
>>> the registry url to be a git repository. When a user does a
>>> "cargo update", cargo essentially does a "git pull" to update a local
>>> copy of the registry.
>>>
>>> Both of those assumptions are false in this case. You cannot generate
>>> the git history just from the set of content, and you cannot "roll back"
>>> the state of the repository without either breaking it for clients, or
>>> adding new commits on top.
>>>
>>> A theoretical Pulp plugin that worked with Cargo would need to ignore
>>> almost all of the existing Pulp primitives, and very little (if any) of
>>> the normal Pulp workflow could be used.
>>>
>>> Should Pulp attempt to cater to plugins like these? What could Pulp do
>>> to provide a benefit for such plugins over writing something from
>>> scratch? To what extent would such plugins be able to integrate with the
>>> rest of Pulp, if at all?
>>>
>>>
>>> I think OSTree and Ansible plugins will be in the same boat as Cargo. In
>>> the case of OSTree, libostree does the heavy lifting for sync and
>>> publishing, and I suspect the same is true for Git-based repositories.
>>> We should consider how best to support distributing (serving) content in
>>> core for these content types. I suspect this will mainly entail
>>> something in the content app and perhaps a new component of a
>>> Publication like PublishedDirectory that references an OSTree/Git
>>> repository created in /var/lib/pulp/published. This may benefit Maven as
>>> well.
>>>
>>>
>>> We don't have to commit to anything pre-GA but it is a good thing to keep
>>> in mind.
>>> I'm sure there are other content types out there (not just Cargo)
>>> which would face similar problems. Someone inquired about pulp_git a few
>>> months ago, and it seems like it would share a few of them.
>>>
>>> _______________________________________________
>>> Pulp-dev mailing list
>>> [email protected]
>>> https://www.redhat.com/mailman/listinfo/pulp-dev
