* Moinak Ghosh <[EMAIL PROTECTED]> [2008-09-16 15:00]:
> On Mon, Sep 15, 2008 at 10:21 PM, Stephen Hahn <[EMAIL PROTECTED]> wrote:
> > * Moinak Ghosh <[EMAIL PROTECTED]> [2008-09-14 19:31]:
> >
> > Responses to the points that haven't already been responded to in
> > other threads/fora, or aren't already tracked in the bug database.
> > I will make the overall comment that many of these points are
> > insufficiently well-specified to act upon.
> >
> >> 2. One fundamental design approach in IPS is to use an intelligent
> >> package and metadata server. This makes IPS unsuitable for community
> >> distro mirrors. Community distros need to use public mirror services like
> >> say Ibiblio and it will be very rare, if at all for mirrors to run a custom
> >> server on their machines just to mirror a particular distro's packages.
> >
> > Actually, it's always been the plan to have the retrieval side be
> > simple. We had to take a detour when we determined that Python's base
> > HTTP implementation didn't support HTTP/1.1, and thus couldn't
> > pipeline. (Trivia: the logo actually hints at this relationship--the
> > retrieval server is much smaller and simpler than the publication
> > server...)
>
> See Peter's comments. We may even decide to distribute packages
> and metadata over simple anonftp. In addition since metadata is in general
> read-only (as you mention below) with occasional writes during publishing
> I do not see much of a need in having a server side component.
>
> The client does not seem to be that lightweight. It has to do a
> bunch of processing checking package revisions, processing metadata
> and generating package plans which are non-trivial computations.
> I'd tend to think that the processing load is being balanced 50-50
> between server and client. It should be possible to have the
> server-side component being done by the client as well without
> functionality loss.

The client is not going to become lightweight; your CPU processing
estimate is not accurate (nor realistic, if one thinks about scale). In
fact, the client is doing the bulk of the work--the server is only
handling our HTTP/1.0 workaround and the search operation.

> >> 3. Another fundamental restriction is that an IPS repo cannot be
> >> rsync-ed. IPS maintains an index in a huge sparse file rendering rsync
> >> impossible. In addition a running server is continuously accessing/
> >> updating metadata making it unsafe for rsync. Rsync is a tried and
> >> proven and highly optimized algorithm for mirroring used virtually by
> >> every mirroring service on the planet and distros need to support it.
> >
> > Dan pointed out that the index implementation changed some time ago.
> > I am uncertain why you believe that there is continuous change in the
> > metadata; such a belief is incorrect, and the discrete changes at
> > package publication time can be isolated from any rsync service.
>
> This is fine and removes one big problem of the sparse file. However
> rsync is still not straightforward. When rsync-ing from server_a to
> server_b the depotd on the server_b will have to be stopped for the
> duration of the rsync. Alternatively one has to maintain a duplicate
> directory structure on server_b, rsync to that and then cpio it to the
> actual depot to reduce downtime. In any case this is some amount of
> round-about activity and does not fit into the straight zero-complexity
> distribution of content used all over the place today.

On a ZFS-based system, it's easier than that, but two trees and a
symbolic link is sufficient for systems without snapshots. If one is
only mirroring content, then you can simply rsync on top; since that's
the bulk of the install traffic, it's the most likely to have value when
mirroring.

> >> Why is IPS re-inventing mirroring ?
> >
> > I don't believe we are.
>
> What about Pkgrecv ... why do you need that if rsync will suffice ...

pkgrecv(1) is complementary to pkgsend(1)--it allows one to isolate a
package version for later manipulation in transaction form. That folks
are using it for other purposes doesn't invalidate its primary use.

> >> 7. There is no boolean dependency mechanism in IPS though this
> >> may possibly appear at some point.
> >
> > Not certain what you mean by a boolean dependency mechanism.
>
> A dependency relation of the form:
> (PkgA, Version: 1.0) (requires) (PkgB, (version >= 1.0 and version <= 1.5))
>
> Makes it possible to exactly specify software requirements and allows
> computation of the exact limit set of the packages absolutely needed to
> perform an install or upgrade.
>
> In addition there should be a logical separation of package dependencies
> between base OS packages and layered software. For eg. the transitive
> dependency closure for an application package say Gaim should not
> include core OS package like kernel or libc. Makes it possible to cleanly
> separate application package transforms from base OS package transforms.
>
> pkg image-update today does not give an easy way to upgrade my base OS
> without upgrading all the bundled applications.

Yes, the inherited dependencies have problems, which we recognize. We
have a plan to constrain dependencies, but we think it's more precise
than the single interval approach.

> >> 8. IPS metadata is extremely opaque making it impossible for anyone
> >> to understand it and cost of corruption high both on installed system and
> >> on the repository server. With other solutions repairing a corrupt repo can
> >> be as simple as an rsync from a mirror. We believe that simple
> >> human-readable metadata that adequately serves the purpose is enough
> >> and is in fact vital.
> >
> > I'm sure I'm too close to this, so you'll need to explain "extremely
> > opaque" and "impossible for anyone". What specific improvements would
> > lead to simple human-readability?
> >
> I will admit here that my original comment goes a little overboard. I have
> been compiling this list based on wide feedback and did not digest this
> one.
>
> However the approach of naming files in the repo as hashes instead of the
> actual filenames is confusing. One cannot figure out what is what without
> cross-checking with the manifest.

I suppose we'll have to disagree on the importance of this point.

> >> 11. IPS operations are somewhat opaque from the observability point of
> >> view. It is rather difficult for developers.
> >
> > Vague; please expand.
>
> I will point you to an example:
> http://www.thewrittenword.com/www/projects/pkgutils/pkgadd/
>
> Excruciating verbosity yes but I will expect it if I am providing a ' -v '
> argument. It is clear as daylight what the utility is doing underneath. In
> contrast pkg image-update -v for eg. is excruciatingly silent. What if
> fetching pkg.opensolaris.org/catalog/0 is slow due to a network problem
> ... the user won't have a clue.

Sure, that's a bug that the team has acknowledged and is working on.

> >> 13. The download cache in IPS uses hashes instead of filenames making
> >> it impossible for a human to understand. Sometimes esp. in emergencies
> >> human visibility into the guts of a system is critical.
> >
> > At what points would the download cache contents be useful in an
> > emergency, in a way beyond that envisioned by the fix subcommand to
> > pkg(1)?
>
> Ability to see filenames makes it clear what is there in the cache. One
> cannot predict what kind of crooked emergency situation might arise
> requiring hackery beyond one's dreams just to get something critical
> to work. Pkg itself screwed up, hand copying of files and so on. I cannot
> articulate examples right now but everyone including myself have faced
> situations in the past. I remember one case where I desperately needed
> to apply a patch and a bug in patchadd caused me to hand-edit pkginfo
> files of 30 packages to get the patch to install.

I expect we would see snapshots come into play in any recovery process.
I don't agree that the cache is a general solution to such problems--if
anything, the cache is probably as untrustworthy as the image's
contents.

> While in this thread, I will dare to make a few more comments which I
> was able to recollect yesterday:
>
> *) Adopting an existing FOSS packaging framework and working with that
> community would have gone a long way to boost SUN's perception among
> FOSS communities.

And forking from one of those frameworks might have done just as much to
doom how OpenSolaris is perceived.

> *) How does IPS compare to something like Smart (http://labix.org/smart).
> I'd guess IPS still has ways to go to match those features. By that time
> those solutions will have moved further forward.

Perhaps, perhaps we'll catch up.

> *) Why tie every package version into an ON build number. What sense
> it makes to refer to an ON build number for say Thunderbird ? It is
> understandable that one may require tagging as releases are synced to
> ON builds but a separate taglist property should have been more useful.
> This will also allow flexibility in tagging packages for multiple different
> kinds of deliverables like say a Network Appliance focussed distro.

As others have noted, this is an artifact of how we're currently
importing the historical operating system. At some point, I would
expect each major package to have its own revision and branch history,
and those to be assembled by successors to the current "entire" package.

> *) The feature of tagging within a package and filtering is not yet being
> used and the potential to misuse this is already being exploited. Consider
> the monolithic 450M OpenOffice package. There is no way one can install
> say a single or selected components like Writer. One has to install the
> whole hog.
> This increases opaqueness and reduces visibility into what is already there
> in the packages, unless one is prepared to list all files in the package.
> I'd say sub-package tagging makes sense only for multi-architecture
> support. Not the way things like *-devel, *-doc etc. are being collapsed
> into a single package.
> Imagine a small town college student in India sitting with a 128Kbps link
> trying to install OpenOffice, SunStudio and blah, blah, blah on his freshly
> installed OpenSolaris 2008.xx (in his home PC) that he got from a recent
> Bangalore OpenSolaris user group meet. Unfortunately he can't even ask
> someone in Bangalore with say a 2Mbps link to download the packages
> and provide those to him on a DVD!

And yet we've already seen published a repository on a DVD as an
experiment for C`T Magazine--thanks, Detlef--and are planning to refine
that process further.

> *) One final point from my observation, enterprises today have
> heterogeneous environments having Windows, Linux, Solaris and possibly
> other legacy OS-es like say AIX. Leaving aside Windows and legacy, there
> are significant frameworks setup for controlled delivery of software to
> hundreds and thousands of boxes typically involving a package repository.
> The management of all these can become hell of a lot easier if it is
> possible to use a uniform repository across platforms. So the repository
> needs to be modular and extensible to different native packaging systems.
> Unfortunately IPS tightly couples packaging and network repository making
> this use-case impossible. If IPS had defined an independent stable on-disk
> format, had worked with an existing community repository project rather
> than re-doing from scratch, it would have made possible a common
> repository deployment for both Linux and Solaris, reduced administrative
> and maintenance cost and reduced one small barrier to entry for
> OpenSolaris.

I'm sorry we aren't doing the project you wanted, in the order you
wanted. I hope you'll do us the courtesy of letting us do the project
we believe we need to do, in the order we believe is open to us.

- Stephen

--
[EMAIL PROTECTED] http://blogs.sun.com/sch/
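
The "two trees and a symbolic link" arrangement mentioned in the rsync discussion above could look roughly like the following. This is only a sketch under stated assumptions: the helper name switch_live_tree and the directory layout are hypothetical and are not part of IPS, and the thread does not say whether a running depot server picks up the new tree without a restart. The idea is that rsync always writes into the inactive tree, and the live path is a symlink that is repointed in a single rename once the copy has finished.

```python
import os


def switch_live_tree(link_path, new_tree):
    """Repoint the symlink at link_path to new_tree with a single rename.

    Hypothetical helper for illustration only; the name and the layout
    it assumes are not taken from IPS.
    """
    tmp_link = link_path + ".new"
    # Clear any stale temporary link left by an interrupted earlier run.
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    # Build the new link beside the live one, then rename it into place.
    os.symlink(new_tree, tmp_link)
    # rename(2) replaces the old link atomically on POSIX file systems,
    # so a reader resolves either the old tree or the new one.
    os.replace(tmp_link, link_path)
```

A mirror would rsync into whichever tree the link does not currently point at, then call switch_live_tree on the live path with that tree as the new target. The previous tree stays on disk, so switching back is the same call with the other tree.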
