Danek Duvall wrote:
Shawn Walker wrote:
http://cr.opensolaris.org/~swalker/pkg-cat-p2-2/
Looks pretty good; most of this is nits, and I'm pretty sure that some of
my questions were already raised.
Bugids should be sort in increasing numerical order, though if there's one
umbrella bug to capture the entire change, it can go up top.
Does this fix bug 7063?
client.py:
- line 356: why not img.PKG_STATE_KNOWN?
So, I've obviously been a bit opaque on package states, so let me
attempt to clarify:
PKG_STATE_KNOWN - I axed this, since, in my mind, 'known' means that the
package system has knowledge *of* the package; and as such, all
'packages' are *known* to the packaging system if they are installed or
available in a package repository's catalog.
The only reason that this still says 'known' is because I wanted to
defer changing the output of pkg list. My thought was to dump the
current 'state' column and add an 'A' column (for available) to the UFIX
set of columns we currently have. I can do so in this changeset.
Either way I need to open a bug for this.
PKG_STATE_AVAILABLE - This state replaces PKG_STATE_KNOWN. This state
is only applied to packages that are currently *available* in a
repository catalog.
PKG_STATE_INSTALLED - This state simply indicates that the package is
installed.
PKG_STATE_PREFERRED - This state is a temporary one that is only used in
combination with PKG_STATE_INSTALLED. It indicates that the version of
the package is installed is from a publisher that was *preferred* at the
time of installation. This status will become obsolete once we start
ranking publishers instead of having the 'preferred' sledgehammer that
we do now.
api.py:
- line 105(?): do you no longer need to call load_config() on the image?
Nope. That's one of those things I wanted to get away from. I want
callers to know as little as possible about the Image object. It can
tell the object to save image configuration, but it seemed silly to
require callers to know that they have to call load_config() to load it.
In fact, I've just changed load_config() to be private since no
outside callers use it now ...
image.py:
- line 134ff: should these just be integers? Bitfields? I'm not sure
I've changed them to integers. Given JSON's representation of python
data types, that seems the best fit.
what some of these mean -- "preferred" state of a package? What's the
difference between the two "installed" states? Should the preferred
state be private (__)?
I've added comments explaining what each state means.
- line 1013: self.__catalogs.pop(name, None)?
Cool. I've been missing perl's "del returns deleted value" behaviour :)
- line 1080: this either/or seems a little sketchy to me. Wouldn't you
care almost exclusively whether the package was installed with the
preferred publisher at the time, and not really care about what the
current preferred publisher is?
The sketchiness is intentional as it mimics exactly what we previously
did. As an example, if a publisher was preferred previously, then its
prefix would not be included when calling get_pkg_stem(). In addition,
when a package was installed, if the publisher it was installed from was
preferred at the time of installation, then the publisher's prefix was
written out to that package's installed file with __PRE__ intact. This
had the effect of causing get_pkg_stem() calls on FMRIs of packages of
the *current* preferred publisher and *previously* preferred publishers
being omitted.
This method's existence is mostly a backwards-compatibility hack for
modules/client/api.py's plan_update_all.
- line 1431: so we'll actually rewrite the installed files, upgrading to
the latest format of those files? Do we ever use it again?
Nice spot; that should have been removed completely.
- line 1480: small suggestion: the if/else could just define generators
that yielded fmristrs, and then the four shared lines could simply draw
from whichever generator was defined.
Actually, only three lines aren't shared that I can see, but I've
consolidated this (though a bit differently).
- line 1606: I'm probably confused, but I thought the installed catalog
was not a subset of the known catalog -- that is, packages which are
installed but not known by any publisher wouldn't show up in the known
catalog.
See state definition far above. The 'installed' catalog is simply a
subset of the 'known' catalog. That makes state management a lot easier
IMO :)
manifest.py:
- line 503: why this change? For short lines, this could end up being
slow.
Some of our manifest files are rather gigantic (10+ megabytes?). I
figured this was a bit more memory friendly, but I'm willing to change
this back.
publisher.py:
- line 1031: when does this get called? Won't the catalogs have all been
At the end of refresh().
converted when the image called __upgrade_image()? Could that call the
appropriate code here? They seem reasonably similar, though obviously
this one doesn't update the installed and known catalogs.
Remember that this is Phase II, so no repository provides v1 catalogs,
only v0 catalogs. So the client has to retrieve the v0 catalog and then
immediately convert it to v1 format. Even when Phase III is complete,
the client will still have to support repositories that only provide v0
catalogs.
- line 1099: I guess I don't understand this, either. Hm. Well, I see
that if it's just not there, then it doesn't do anything, which is
fine. But what happens if both the v0 and the v1 catalog are on the
system? I guess it's unlikely to happen, but if it's the case, then
it'll take the v0 as canonical, which seems wrong to me. If you took
the v1 first, then you could even ignore a failure to remove the v0
catalog.
The picture is incomplete here until Phase III is implemented. See
above. I've clarified the comment on line 1099 to better indicate the
issues here.
XXX line 1134: didn't the get_catalog() call get a v1 catalog?
No. See above. I've added a comment here to make this clearer.
catalog.py:
- line 315: super() is generally considered dangerous; what's the reason
you're using it here? And shouldn't __init__() be called with "self"
as the first arg?
That depends on who you ask. It's still around in py3k, and is in fact
being enhanced further. However, since I now have two detractors and I
don't yet have a need for this, I'll simply remove super(). And no,
when using super() you don't pass self to the function; that's why self
is specified in the super() call [1].
The test suites wouldn't have passed otherwise :)
- line 389: might this be clearer:
return cmp(a[1], b[1]) or cmp(a[0], b[0])
you could even do it as a lambda in the sorted() call, but that might
not be worth it. I'd probably create a generator expression that would
yield pub/stem tuples, and an else clause (if not ordered) to do the
same, and then iterate over those pairs outside the if for both cases.
Hm, now that I look, I don't actually use 'ordered' anywhere, so I've
dumped all the code for this. But I'll use your suggestion later when I
need it. Thanks :)
- line 421: didn't we determine at some point that a try/except setup was
more expensive than testing for existence? You could also just use
.get(name, ()), and iterate over the empty list.
I had forgotten about that; changed.
- it seems that fmris() and entries() could probably be defined in terms
of each other; there's a lot of shared code here.
This was purely a performance guess on my part; that is, that fmris
simply calling entries but not yielding the entry data itself would be
slower than simply doing what it is doing now. However, I'm willing to
change it.
- line 574: using a generator might be a bit less memory-intensive than a
list.
If you mean returning a generator, it was intentional that I did not.
The point is to return the *unique* set of package names for all
publishers within the catalog.
- line 683,684: what are rename and obsolete in this context? Why do we
need anything other than add and remove?
This was a mostly incomplete thought leftover from the old catalog that
had special api() calls for obsolete/rename. However, in retrospect,
you're correct that we only really need add/remove.
- line 1273: I don't see that behavior.
I did while running the test suite on b117 :( I'll check again, but I
know I didn't imagine this.
- line 1281: isn't this likely to get us into exactly the same trouble
that got us into this except clause in the first place? Or is it just
that it won't be for EPERM, and everything else should get raised
anyway?
My assumption (and the previous one) was the latter.
- line 1444: no need for an atomic update here?
Do you mean copy the file somewhere else and then rename it into place?
It's definitely not the intent to rename the file into place
straight-away as that would remove it from the source directory (which
is definitely not intended).
- line 1593: is there any reason not to always have BASE included?
If it's data the caller doesn't need, then I'm just wasting cycles
adding it to the metadata returned and wasting memory. At least, that
was the thought...
- line 1838: this is only used in some tests now; perhaps it can be
retired? Same with extract_matching_fmris() now, too. I see you added
a "reverse" argument, but I don't see it used anywhere.
In my initial prototype, I had re-written image.py:__inventory() to use
get_matching_fmris() instead. However, it ended up being massively
slower for update all because of the simple matching approach it
currently uses (i.e. for each package installed on the system, attempt
to match the package against *all* known packages...ugh).
I'd like to eventually re-write __inventory to use get_matching_fmris()
as it seems useful to have that sort of functionality in the catalog.
I've fixed everything you don't see mentioned. I'll send out a webrev
tomorrow after I've had some time to test and re-review.
Cheers,
--
Shawn Walker
[1] http://docs.python.org/library/functions.html#super
_______________________________________________
pkg-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/pkg-discuss