No opinion on the name; Foreman will name it whatever it wants in the
front-end user experience. Devs working on the pulp-2 to pulp-3 Foreman
transition may prefer to keep the existing names.
Yes, I'd say everything but step 14 in that diagram. In addition, I would
ensure that the Squid cache size is configurable to zero so that it is
effectively a straight pull-through. I assume that all pulp-3 content
types will have this as an option as well, if the type supports it? I
want a straight proxy of container images, for example. A straight proxy
of files, etc.
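For reference, the three policy names Brian proposes in the quoted thread
below could be sketched in plain Python like this (the enum and helper
names are invented for illustration; they are not Pulp's actual API):

    # Illustrative only -- not Pulp's real code. Captures which parts of
    # the lazy pipeline each proposed policy exercises.
    from enum import Enum

    class DownloadPolicy(Enum):
        IMMEDIATE = "immediate"            # download while the sync task runs
        CACHE_AND_SAVE = "cache-and-save"  # all diagram steps; bits saved in Pulp
        CACHE = "cache"                    # all steps except 14; Squid cache only

    def requires_streamer(policy: DownloadPolicy) -> bool:
        """Both lazy policies rely on the streamer to fetch bits into Squid."""
        return policy is not DownloadPolicy.IMMEDIATE

    def saves_artifact(policy: DownloadPolicy) -> bool:
        """Only cache-and-save performs step 14 (persisting the Artifact)."""
        return policy is DownloadPolicy.CACHE_AND_SAVE

With the Squid cache size set to zero, policy=cache degenerates into
exactly the straight pull-through proxy described above.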
On Wed, May 30, 2018 at 11:34 AM, Brian Bouterse <bbout...@redhat.com> wrote:

> Actually, what about these as names?
>
> policy=immediate -> downloads now while the task runs (no lazy). Also the
> default if unspecified.
> policy=cache-and-save -> All the steps in the diagram. Content that is
> downloaded is saved so that it's only ever downloaded once.
> policy=cache -> All the steps in the diagram except step 14. If squid
> pushes the bits out of the cache, it will be re-downloaded again to serve
> to other clients requesting the same bits.
>
> If ^ is better I can update the stories. Other naming ideas and use cases
> are welcome.
>
> Thanks,
> Brian
>
> On Wed, May 30, 2018 at 10:50 AM, Brian Bouterse <bbout...@redhat.com> wrote:
>
>> On Wed, May 30, 2018 at 8:57 AM, Tom McKay <thomasmc...@redhat.com> wrote:
>>
>>> I think there is a use case for "proxy only" like is being described
>>> here. Several years ago there was a project called thumbslug[1] that was
>>> used in a version of katello instead of pulp. Its job was to check
>>> entitlements and then proxy content from a CDN. The same functionality
>>> could be implemented in pulp. (Perhaps it's even as simple as telling squid
>>> not to cache anything so the content would never make it from cache to pulp
>>> in current pulp-2.)
>>
>> What would you call this policy?
>> policy=proxy?
>> policy=stream-dont-save?
>> policy=stream-no-save?
>>
>> Are the names 'on-demand' and 'immediate' clear enough? Are there better
>> names?
>>
>>> Overall I'm +1 to the idea of an only-squid version, if others think it
>>> would be useful.
>>
>> I understand describing this as an "only-squid" version, but for clarity,
>> the streamer would still be required because it is what requests the bits
>> with the correctly configured downloader (certs, proxy, etc.). The streamer
>> streams the bits into squid, which provides caching and client multiplexing.
>>
>> To confirm my understanding, this "squid-only" policy would be the same as
>> on-demand except that it would *not* perform step 14 from the diagram here
>> (https://pulp.plan.io/issues/3693). Is that right?
>>
>>> [1] https://github.com/candlepin/thumbslug
>>>
>>> On Wed, May 30, 2018 at 8:34 AM, Milan Kovacik <mkova...@redhat.com> wrote:
>>>
>>>> On Tue, May 29, 2018 at 9:31 PM, Dennis Kliban <dkli...@redhat.com> wrote:
>>>> > On Tue, May 29, 2018 at 11:42 AM, Milan Kovacik <mkova...@redhat.com> wrote:
>>>> >> On Tue, May 29, 2018 at 5:13 PM, Dennis Kliban <dkli...@redhat.com> wrote:
>>>> >> > On Tue, May 29, 2018 at 10:41 AM, Milan Kovacik <mkova...@redhat.com> wrote:
>>>> >> >>
>>>> >> >> Good point!
>>>> >> >> More the second; it might be a bit crazy to utilize Squid for that, but
>>>> >> >> first, let's answer the why ;)
>>>> >> >> So why does Pulp need to store the content here?
>>>> >> >> Why don't we point the users to Squid all the time (for the lazy
>>>> >> >> repos)?
>>>> >> >
>>>> >> > Pulp's Streamer needs to fetch and store the content because that's
>>>> >> > Pulp's primary responsibility.
>>>> >>
>>>> >> Maybe not that much the storing but rather the content views management?
>>>> >> I mean the partitioning into repositories, promoting.
>>>> >
>>>> > Exactly this. We want Pulp users to be able to reuse content that was
>>>> > brought in using the 'on_demand' download policy in other repositories.
>>>>
>>>> I see.
>>>>
>>>> >> > If some of the content lived in Squid and some lived
>>>> >> > in Pulp, it would be difficult for the user to know what content is
>>>> >> > actually available in Pulp and what content needs to be fetched from a
>>>> >> > remote repository.
>>>> >>
>>>> >> I'd say the rule of thumb would be: lazy -> squid, regular -> pulp,
>>>> >> so not that difficult.
>>>> >> Maybe Pulp could have a concept of Origin, where folks upload stuff to
>>>> >> a Pulp repo, vs. Proxy for its repo storage policy?
>>>> >
>>>> > Squid removes things from the cache at some point. You can probably
>>>> > configure it to never remove anything from the cache, but then we would
>>>> > need to implement orphan cleanup that would work across two systems: pulp
>>>> > and squid.
>>>>
>>>> Actually, "remote" units wouldn't need orphan cleaning from the disk;
>>>> just dropping them from the DB would suffice.
>>>>
>>>> > Answering that question would still be difficult. Not all content that is
>>>> > in the repository that was synced using the on_demand download policy will
>>>> > be in Squid - only the content that has been requested by clients. So it's
>>>> > still hard to know which of the content units have been downloaded and
>>>> > which have not been.
>>>>
>>>> But the beauty is exactly in that: we don't have to track whether the
>>>> content is downloaded if it is reverse-proxied[1][2].
>>>> Moreover, this would work both with and without a proxy between Pulp
>>>> and the Origin of the remote unit.
>>>> A "remote" content artifact might just need to carry its URL in a DB
>>>> column for this to work; so the async artifact model, instead of the
>>>> "policy=on-demand" attribute, would have a mandatory remote "URL"
>>>> attribute. I wouldn't say it's more complex than tracking the "policy"
>>>> attribute.
>>>>
>>>> >> > As Pulp downloads an Artifact, it calculates all the checksums and its
>>>> >> > size. It then performs validation based on information that was provided
>>>> >> > from the RemoteArtifact. After validation is performed, the Artifact is
>>>> >> > saved to the database and its final place in
>>>> >> > /var/lib/content/artifacts/.
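As a simplified sketch of the flow described in the quoted paragraph
above (plain Python with invented names, not Pulp's real downloader):
checksums are computed while the bits stream in, compared against the
RemoteArtifact's expected values, and only then is the Artifact moved to
its final location, i.e. step 14.

    import hashlib
    import shutil
    import urllib.request

    def fetch_and_validate(url, expected_sha256, final_path):
        """Download url, verifying the expected checksum before saving."""
        sha256 = hashlib.sha256()
        size = 0
        tmp_path = final_path + ".part"
        with urllib.request.urlopen(url) as response, open(tmp_path, "wb") as tmp:
            while chunk := response.read(1024 * 1024):
                sha256.update(chunk)  # checksums computed as bits stream in
                size += len(chunk)
                tmp.write(chunk)
        if sha256.hexdigest() != expected_sha256:
            raise ValueError("checksum mismatch; artifact not saved")
        shutil.move(tmp_path, final_path)  # step 14: persist the validated bits
        return size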
>>>> >>
>>>> >> This could still be achieved by storing the content just temporarily
>>>> >> in the Squid proxy, i.e. use Squid as the content source, not the disk.
>>>> >>
>>>> >> > Once this information is in the database, Pulp's web server can serve
>>>> >> > the content without having to involve the Streamer or Squid.
>>>> >>
>>>> >> Pulp might serve just the API and the metadata; the content might be
>>>> >> redirected to the Proxy all the time, correct?
>>>> >> Doesn't Crane do that btw?
>>>> >
>>>> > Theoretically we could do this, but in practice we would run into
>>>> > problems when we needed to scale out the Content app. Right now when the
>>>> > Content app needs to be scaled, a user can launch another machine that
>>>> > will run the Content app. Squid does not support that kind of scaling.
>>>> > Squid can only take advantage of additional cores in a single machine.
>>>>
>>>> I don't think I understand; proxies are actually designed to scale[1]
>>>> and are used as tools to scale the web too.
>>>>
>>>> This is all about the How question, but when it comes to my original
>>>> Why, please correct me if I'm wrong, the answer so far has been:
>>>> Pulp always downloads the content because that's what it is supposed
>>>> to do.
>>>>
>>>> Cheers,
>>>> milan
>>>>
>>>> [1] https://en.wikipedia.org/wiki/Reverse_proxy
>>>> [2] https://paste.fedoraproject.org/paste/zkBTyxZjm330FsqvPP0lIA
>>>> [3] https://wiki.squid-cache.org/Features/CacheHierarchy?highlight=%28faqlisted.yes%29
>>>>
>>>> >> Cheers,
>>>> >> milan
>>>> >>
>>>> >> > -dennis
>>>> >> >
>>>> >> >> --
>>>> >> >> cheers
>>>> >> >> milan
>>>> >> >>
>>>> >> >> On Tue, May 29, 2018 at 4:25 PM, Brian Bouterse <bbout...@redhat.com> wrote:
>>>> >> >> >
>>>> >> >> > On Mon, May 28, 2018 at 9:57 AM, Milan Kovacik <mkova...@redhat.com> wrote:
>>>> >> >> >>
>>>> >> >> >> Hi,
>>>> >> >> >>
>>>> >> >> >> Looking at the diagram[1] I'm wondering what's the reasoning behind
>>>> >> >> >> Pulp having to actually fetch the content locally?
>>>> >> >> >
>>>> >> >> > Is the question "why is Pulp doing the fetching and not Squid?" or
>>>> >> >> > "why is Pulp storing the content after fetching it?" or both?
>>>> >> >> >
>>>> >> >> >> Couldn't Pulp just rely on the proxy with regards to the content
>>>> >> >> >> streaming?
>>>> >> >> >>
>>>> >> >> >> Thanks,
>>>> >> >> >> milan
>>>> >> >> >>
>>>> >> >> >> [1] https://pulp.plan.io/attachments/130957
>>>> >> >> >>
>>>> >> >> >> On Fri, May 25, 2018 at 9:11 PM, Brian Bouterse <bbout...@redhat.com> wrote:
>>>> >> >> >> > A mini-team of core devs** met to talk through lazy use cases for
>>>> >> >> >> > Pulp3. It's effectively the same lazy from Pulp2 except:
>>>> >> >> >> >
>>>> >> >> >> > * it's now built into core (not just RPM)
>>>> >> >> >> > * it excludes repo protection use cases because we haven't added
>>>> >> >> >> > repo protection to Pulp3 yet
>>>> >> >> >> > * it excludes the "background" policy, which based on feedback
>>>> >> >> >> > from stakeholders provided very little value
>>>> >> >> >> > * it will no longer depend on Twisted as a dependency. It will
>>>> >> >> >> > use asyncio instead.
>>>> >> >> >> >
>>>> >> >> >> > While it is being built into core, it will require minimal support
>>>> >> >> >> > by a plugin writer to add support for it. Details in the epic below.
>>>> >> >> >> >
>>>> >> >> >> > The current use cases along with a technical plan are written on
>>>> >> >> >> > this epic: https://pulp.plan.io/issues/3693
>>>> >> >> >> >
>>>> >> >> >> > We're putting it out for comment, questions, and feedback before
>>>> >> >> >> > we start into the code. I hope we are able to add this into our
>>>> >> >> >> > next sprint.
>>>> >> >> >> >
>>>> >> >> >> > ** ipanova, jortel, ttereshc, dkliban, bmbouter
>>>> >> >> >> >
>>>> >> >> >> > Thanks!
>>>> >> >> >> > Brian
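Since the announcement above says the streamer will use asyncio instead
of Twisted, here is a bare-bones sketch of the pull-through idea with
asyncio and aiohttp. The handler, route, and upstream URL are invented
for illustration; this is not Pulp's actual streamer code. It proxies a
remote file to the client chunk by chunk, so a Squid sitting in front
can cache and multiplex the response to concurrent clients.

    import aiohttp
    from aiohttp import web

    UPSTREAM = "https://example.com/content/"  # hypothetical remote base URL

    async def stream(request):
        """Pull the requested path from the remote and relay it chunk by chunk."""
        url = UPSTREAM + request.match_info["path"]
        async with aiohttp.ClientSession() as session:
            async with session.get(url) as upstream:
                response = web.StreamResponse(status=upstream.status)
                await response.prepare(request)
                async for chunk in upstream.content.iter_chunked(64 * 1024):
                    await response.write(chunk)  # relay without buffering to disk
                await response.write_eof()
                return response

    app = web.Application()
    app.add_routes([web.get("/{path:.*}", stream)])
    # web.run_app(app, port=8080)  # uncomment to serve

Whether the streamed bits then also get persisted (step 14) is exactly
the policy=cache vs. policy=cache-and-save distinction discussed above.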
_______________________________________________
Pulp-dev mailing list
Pulp-dev@redhat.com
https://www.redhat.com/mailman/listinfo/pulp-dev