Re: [pkg-discuss] Observations on IPS

Stephen Hahn Tue, 16 Sep 2008 09:34:07 -0700

* Moinak Ghosh <[EMAIL PROTECTED]> [2008-09-16 15:00]:
> On Mon, Sep 15, 2008 at 10:21 PM, Stephen Hahn <[EMAIL PROTECTED]> wrote:
> > * Moinak Ghosh <[EMAIL PROTECTED]> [2008-09-14 19:31]:
> >
> >  Responses to the points that haven't already been responded to in
> >  other threads/fora, or aren't already tracked in the bug database.
> >  I will make the overall comment that many of these points are
> >  insufficiently well-specified to act upon.
> >
> >>    2. One fundamental design approach in IPS is to use an intelligent
> >> package and metadata server. This makes IPS unsuitable for community
> >> distro mirrors. Community distros need to use public mirror services like
> >> say Ibiblio and it will be very rare, if at all for mirrors to run a custom
> >> server on their machines just to mirror a particular distro's packages.
> >
> >  Actually, it's always been the plan to have the retrieval side be
> >  simple.  We had to take a detour when we determined that Python's base
> >  HTTP implementation didn't support HTTP/1.1, and thus couldn't
> >  pipeline.  (Trivia:  the logo actually hints at this relationship--the
> >  retrieval server is much smaller and simpler than the publication
> >  server...)
> 
>    See Peter's comments. We may even decide to distribute packages
>    and metadata over simple anonftp. In addition since metadata is in general
>    read-only (as you mention below) with occasional writes during publishing
>    I do not see much of a need in having a server side component.
> 
>    The client does not seem to be that lightweight. It has to do a
>    bunch of processing checking package revisions, processing metadata
>    and generating package plans which are non-trivial computations.
>    I'd tend to think that the processing load is being balanced 50-50
>    between server and client. It should be possible to have the client
>    do the server-side component's work as well, without loss of
>    functionality.
 
   The client is not going to become lightweight; your CPU processing
   estimate is not accurate (nor realistic, if one thinks about scale).
   In fact, the client is doing the bulk of the work--the server is only
   handling our HTTP/1.0 workaround and the search operation.


> >>    3. Another fundamental restriction is that an IPS repo cannot be
> >> rsync-ed. IPS maintains an index in a huge sparse file rendering rsync
> >> impossible. In addition a running server is continuously accessing/
> >> updating metadata making it unsafe for rsync. Rsync is a tried and
> >> proven and highly optimized algorithm for mirroring used virtually by
> >> every mirroring service on the planet and distros need to support it.
> >
> >  Dan pointed out that the index implementation changed some time ago.
> >  I am uncertain why you believe that there is continuous change in the
> >  metadata; such a belief is incorrect, and the discrete changes at
> >  package publication time can be isolated from any rsync service.
> 
>    This is fine and removes one big problem of the sparse file. However
>    rsync is still not straightforward. When rsync-ing from server_a to
>    server_b, the depotd on server_b will have to be stopped for the
>    duration of the rsync. Alternatively one has to maintain a duplicate
>    directory structure on server_b, rsync to that and then cpio it to the
>    actual depot to reduce downtime. In any case this is some amount of
>    round-about activity and does not fit into the straight zero-complexity
>    distribution of content used all over the place today.
 
  On a ZFS-based system, it's easier than that, but two trees and a
  symbolic link is sufficient for systems without snapshots.  If one is
  only mirroring content, then you can simply rsync on top; since that's
  the bulk of the install traffic, it's the most likely to have value
  when mirroring.
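
The "two trees and a symbolic link" arrangement can be sketched roughly as
follows. This is a minimal illustration and not part of IPS: the function
name publish_tree and the directory names in the usage note are
hypothetical, and it assumes a POSIX filesystem on which rename(2)
atomically replaces an existing symbolic link. The idea is that rsync
fills the idle tree while the depot keeps serving the live one, and the
link is switched only once the sync has finished.

```python
import os


def publish_tree(new_tree: str, live_link: str) -> None:
    """Repoint live_link at new_tree without a window where it is missing.

    Hypothetical helper: a temporary symlink is created next to the live
    one and then renamed over it, so readers see either the old tree or
    the new tree, never a half-updated state.
    """
    link_dir = os.path.dirname(os.path.abspath(live_link))
    tmp_link = os.path.join(
        link_dir, ".{}.tmp.{}".format(os.path.basename(live_link), os.getpid())
    )
    # Clear out a stale temporary link left behind by an interrupted run.
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(new_tree, tmp_link)
    # os.replace() maps to rename(2), which replaces the link itself.
    os.replace(tmp_link, live_link)
```

A mirror job would rsync into whichever tree is currently idle (say
tree_b), call publish_tree("tree_b", "depot"), and use tree_a as the
target of the next run.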

> >> Why is IPS re-inventing mirroring ?
> >
> >  I don't believe we are.
> 
>    What about pkgrecv ... why do you need that if rsync will suffice ...
 
  pkgrecv(1) is complementary to pkgsend(1)--it allows one to isolate a
  package version for later manipulation in transaction form.  That
  folks are using it for other purposes doesn't invalidate its primary
  use.

> >>    7. There is no boolean dependency mechanism in IPS though this
> >> may possibly appear at some point.
> >
> >  Not certain what you mean by a boolean dependency mechanism.
> 
>    A dependency relation of the form:
>    (PkgA, Version: 1.0)  (requires) (PkgB, (version >= 1.0 and version <= 1.5))
> 
>    Makes it possible to exactly specify software requirements and allows
>    computation of the exact limit set of the packages absolutely needed to
>    perform an install or upgrade.
> 
>    In addition there should be a logical separation of package dependencies
>    between base OS packages and layered software. For example, the transitive
>    dependency closure for an application package, say Gaim, should not
>    include core OS packages like the kernel or libc. This makes it possible
>    to cleanly separate application package transforms from base OS package
>    transforms.
> 
>    pkg image-update today does not give an easy way to upgrade my base OS
>    without upgrading all the bundled applications.
 
  Yes, the inherited dependencies have problems, which we recognize.  We
  have a plan to constrain dependencies, but we think it's more precise
  than the single interval approach.
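
For reference, the single-interval form quoted above (PkgB with
version >= 1.0 and version <= 1.5) amounts to a check like the one
below. This is only a sketch of that quoted proposal, not of the
constraint mechanism the IPS team has planned; the function names are
hypothetical and versions are assumed to be plain dotted integers.

```python
from typing import Tuple

Version = Tuple[int, ...]


def parse_version(text: str) -> Version:
    """Turn a dotted version string such as '1.5' into (1, 5)."""
    return tuple(int(part) for part in text.split("."))


def satisfies(candidate: str, minimum: str, maximum: str) -> bool:
    """True if minimum <= candidate <= maximum, compared numerically.

    Comparing integer tuples rather than strings keeps '1.10' above
    '1.5', which a plain string comparison would get wrong.
    """
    return parse_version(minimum) <= parse_version(candidate) <= parse_version(maximum)
```

With these definitions, satisfies("1.2", "1.0", "1.5") holds, while
"1.6" and "1.10" both fall outside the interval.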

> >>    8. IPS metadata is extremely opaque making it impossible for anyone
> >> to understand it and cost of corruption high both on installed system and
> >> on the repository server. With other solutions repairing a corrupt repo can
> >> be as simple as an rsync from a mirror. We believe that simple human-
> >> readable metadata that adequately serves the purpose is enough and is
> >> in fact vital.
> >
> >  I'm sure I'm too close to this, so you'll need to explain "extremely
> >  opaque" and "impossible for anyone".  What specific improvements would
> >  lead to simple human-readability?
> >
>    I will admit here that my original comment goes a little overboard. I have
>    been compiling this list based on wide feedback and did not digest this
>    one.
> 
>    However the approach of naming files in the repo as hashes instead of the
>    actual filenames is confusing. One cannot figure out what is what without
>    cross-checking with the manifest.
 
  I suppose we'll have to disagree on the importance of this point.
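
To make the disagreement concrete, here is a rough sketch of
hash-named storage and the manifest cross-check it requires. It is
illustrative only: the use of SHA-1, the dictionary standing in for a
manifest, and both function names are assumptions, not the actual IPS
repository format.

```python
import hashlib
from typing import Dict, List


def content_hash(data: bytes) -> str:
    """Name a file by the hex digest of its contents (SHA-1 assumed here)."""
    return hashlib.sha1(data).hexdigest()


def paths_by_hash(manifest: Dict[str, str]) -> Dict[str, List[str]]:
    """Invert a path -> hash manifest into hash -> sorted list of paths.

    This is the cross-check a person has to do to learn what a
    hash-named file in the repository actually is.
    """
    index: Dict[str, List[str]] = {}
    for path, digest in sorted(manifest.items()):
        index.setdefault(digest, []).append(path)
    return index
```

Identical content stored under two install paths yields one hash with
two entries in the inverted index, which is where content-named
storage saves space and transfer, and also why the stored name alone
says nothing about the installed filename.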

> >>   11. IPS operations are somewhat opaque from the observability point of
> >> view. It is rather difficult for developers.
> >
> >  Vague; please expand.
> 
>    I will point you to an example:
>    http://www.thewrittenword.com/www/projects/pkgutils/pkgadd/
> 
>    Excruciating verbosity, yes, but I would expect it if I am providing a ' -v '
>    argument. It is clear as daylight what the utility is doing underneath. In
>    contrast pkg image-update -v, for example, is excruciatingly silent. What if
>    fetching  pkg.opensolaris.org/catalog/0  is slow due to a network problem
>    ... the user won't have a clue.
 
  Sure, that's a bug that the team has acknowledged and is working on.

> >>   13. The download cache in IPS uses hashes instead of filenames making
> >> it impossible for a human to understand. Sometimes esp. in emergencies
> >> human visibility into the guts of a system is critical.
> >
> >  At what points would the download cache contents be useful in an
> >  emergency, in a way beyond that envisioned by the fix subcommand to
> >  pkg(1)?
> 
>    The ability to see filenames makes it clear what is in the cache. One
>    cannot predict what kind of crooked emergency situation might arise
>    requiring hackery beyond one's dreams just to get something critical
>    to work: pkg itself screwed up, hand copying of files and so on. I cannot
>    articulate examples right now but everyone, including myself, has faced
>    such situations in the past. I remember one case where I desperately needed
>    to apply a patch and a bug in patchadd caused me to hand-edit pkginfo
>    files of 30 packages to get the patch to install.
 
  I expect we would see snapshots come into play in any recovery
  process.  I don't agree that the cache is a general solution to such
  problems--if anything, the cache is probably as untrustworthy as the
  image's contents.

>    While in this thread, I will dare to make a few more comments which I
>    was able to recollect yesterday:
> 
>    *) Adopting an existing FOSS packaging framework and working with that
>    community would have gone a long way to boost SUN's perception among
>    FOSS communities.
 
  And forking from one of those frameworks might have done just as much
  to doom how OpenSolaris is perceived.

>    *) How does IPS compare to something like Smart (http://labix.org/smart)?
>    I'd guess IPS still has a way to go to match those features. By that time
>    those solutions will have moved further forward.

  Perhaps, perhaps we'll catch up.

>    *) Why tie every package version into an ON build number? What sense
>    does it make to refer to an ON build number for, say, Thunderbird?  It is
>    understandable that one may require tagging as releases are synced to
>    ON builds but a separate taglist property should have been more useful.
>    This will also allow flexibility in tagging packages for multiple different
>    kinds of deliverables like say a Network Appliance focussed distro.

  As others have noted, this is an artifact of how we're currently
  importing the historical operating system.  At some point, I would
  expect each major package to have its own revision and branch history,
  and those to be assembled by successors to the current "entire"
  package.

>    *) The feature of tagging within a package and filtering is not yet being
>    used and the potential to misuse this is already being exploited. Consider
>    the monolithic 450M OpenOffice package. There is no way one can install
>    say a single or selected components like Writer. One has to install the
>    whole hog.
>    This increases opaqueness and reduces visibility into what is already there
>    in the packages, unless one is prepared to list all files in the package.
>    I'd say sub-package tagging makes sense only for multi-architecture
>    support. Not the way things like  *-devel, *-doc etc. are being collapsed
>    into a single package.
>    Imagine a small town college student in India sitting with a 128Kbps link
>    trying to install OpenOffice, SunStudio and blah, blah, blah on his freshly
>    installed OpenSolaris 2008.xx (on his home PC) that he got from a recent
>    Bangalore OpenSolaris user group meet. Unfortunately he can't even ask
>    someone in Bangalore with say a 2Mbps link to download the packages
>    and provide those to him on a DVD!

  And yet we've already seen published a repository on a DVD as an
  experiment for c't Magazine--thanks, Detlef--and are planning to
  refine that process further.

>   *) One final point from my observation: enterprises today have heterogeneous
>   environments with Windows, Linux, Solaris and possibly other legacy
>   OS-es like, say, AIX. Leaving aside Windows and legacy, there are significant
>   frameworks set up for controlled delivery of software to hundreds and
>   thousands of boxes, typically involving a package repository. The
>   management of all these can become a hell of a lot easier if it is possible
>   to use a uniform repository across platforms. So the repository needs to be
>   modular and extensible to different native packaging systems. Unfortunately
>   IPS tightly couples packaging and network repository, making this use-case
>   impossible. If IPS had defined an independent stable on-disk format, and had
>   worked with an existing community repository project rather than re-doing
>   from scratch, it would have made possible a common repository deployment
>   for both Linux and Solaris, reduced administrative and maintenance cost
>   and removed one small barrier to entry for OpenSolaris.

  I'm sorry we aren't doing the project you wanted, in the order you
  wanted.  I hope you'll do us the courtesy of letting us do the project
  we believe we need to do, in the order we believe is open to us.

  - Stephen

-- 
[EMAIL PROTECTED]  http://blogs.sun.com/sch/