As it's obviously taking me a while to work my way through this review,
Brock asked if he could see what I have, so here it is.  I've commented on
all the files which are already in the gate, and am making my way through
the new stuff.  I'm partway through search_storage.py right now.

Sorry for the delay ...


On Sat, Jul 05, 2008 at 08:21:38AM -0700, Brock Pytlik wrote:

> http://cr.opensolaris.org/~bpytlik/ips-search/

Is the synopsis of buid 2122 truncated?

Make sure that all python files start with "#!/usr/bin/python2.4".

I don't know where the convention of putting a space at the beginning and
end of every docstring began, but I'd like for it to end now.  No need to
fix up existing usage, but don't introduce any new ones.

I think we also probably settled on not putting spaces around keyword
arguments.  At least be consistent in your own patch.  :)

As discussed on Friday, I screwed the pooch on the distinction between _foo
and __foo.  Please use __foo for symbols that really ought to be private to
a class, and _foo for ones where they should be thought of as private, but
where some outsiders actually need to access them.  Though if you want to
drop the underscores entirely in some places, that's fine, too.  Also feel
free to use properties where that makes sense.

I think we need a text file in the doc directory explaining how all this
works.

client.py:

  - line 123, 132 (and others): put quotes around the command to run.

  - line 126: double space after 'pfexec'.

  - line 129: no need for a temporary variable; just return that string.

  - line 129: Do you need a period after "found"?  Note that this is going
    to be a pretty long line, even if "text" is short.

  - line 1422: usage(), perhaps?

  - line 1424: Perhaps this should go into Image, so that other clients
    (like the gui) end up having the same behavior?

  - line 1555: I'd say "above traceback" rather than "traceback produced".
    Nice idea; thanks.

depot.py:

  - line 154: "omitted"

attributes.py:

  - I'm a bit confused by the way generate_indices() works here.  All the
    dictionaries returned seem to have keys derived from the data, not
    describing what kind of thing they are (a la "path", "basename", or
    "hash" from the file action).  I would expect that a search hit on an
    fmri would come back as

        INDEX           ACTION   VALUE                  PACKAGE
        fmri            set      pkg:/[EMAIL PROTECTED]           pkg:/[EMAIL 
PROTECTED]

    not as

        INDEX           ACTION   VALUE                  PACKAGE
        pkg:/[EMAIL PROTECTED]    set      ["foo", "1.2", None]   pkg:/[EMAIL 
PROTECTED]

    which is what it looks like it'll come back as.  Perhaps I'm just not
    far enough along to see how this makes sense, but given that you didn't
    touch any of the other actions' generate_indices() methods, I guess I'm
    going to remain confused.

  - line 67: I know we don't agree about the parens, but tough.  Lose 'em.

  - line 68, 88: we've been using double-quotes around strings, primarily,
    except when the internal string has double-quotes itself.  There are a
    bunch of other places where you have this, particularly in new files.

  - line 69ff:

        return dict(
            (w, w)
            for w in self.attrs["value"].split()
        )

  - line 76: This functionality should be part of the fmri module.  And
    both this here and the PkgFmri (and Version) constructor should use the
    same parsing methods, whichever one is faster.  Perhaps you don't need
    to construct any objects, but that just means we need a single parser
    that the constructor can use as well as places that don't need the
    object.

  - line 80: space after the colon.  Break the line after the open bracket,
    and put the close bracket in the same column as the first character of
    the line on which the open bracket sits:

        {
            key: [
                listval1, listval2, listval3
            ]
        }

    If the list values don't fit in 80 characters, I'd put each one on a
    separate line.

  - line 86: given the next line, do you need this one?

  - line 97: space after colon

catalog.py:

  - line 107: space after comma

  - line 176: no parens

  - line 179: I would os.fdopen(tmp_num, mode="wb") instead.  Is "binary"
    mode really necessary here?

  - line 184: use double-quotes, and no outer parens.  Probably also want a
    short comment here saying that you're creating an empty file.

  - line 193: couldn't this go in the finally clause?

  - line 205, 600: os.rename().  Or portable.rename(), I guess.

  - line 293: no need for an extra blank line.

  - line 386, 396, 410: excessive indentation?

client_query_engine.py:

  - How confusing would it be if this were just client/query_engine.py?

  - line 41: dir_path?

  - line 52: use iteritems()

image.py:

  - line 455: "manifest".  possessive "'s" after "fmri", but I think that
    still doesn't fix the grammar.

  - line 456:

        return [
            (fmri, self.get_manifest_path(fmri))
            for fmri in self.gen_installed_pkgs()
        ]

    Better yet, make it a generator by replacing the brackets with parens,
    and make the caller listify it if they really need that.

  - line 563: why create this temporary object?  A single os.path.join call
    ought to do it.

  - line 564: excessive indentation

  - line 1162ff:

        Since the index directory will not reliably be updated when the
        image root is, this should be called prior to using the index
        directory.

    However, when do we change self.imgdir?  If this is actually necessary,
    shouldn't we just make imgdir a property, and have the setter modify
    index_dir, too (which should be more efficient than the other way
    around)?

  - line 1172: why the extra assignment?

  - line 1356: perhaps call this rebuild_search_indexes?

  - line 1357ff:

        Rebuilds the search databases from scratch.

imageplan.py:

  - line 388: "Check the index ..."

  - line 389: "inconsistent"

  - line 512: I'd put blank lines around this one.

  - line 515: I'd ditch this blank line

  - line 516: I'd put this (indented) after the finally

manifest.py:

  - line 273, 275: lose the parens.  If it helps, "not a in b" can be
    rewritten as "a not in b".

  - line 283: Comment needs an update.

repository.py:

  - line 185: Are you sure this doesn't require a protocol rev?

server/transaction.py:

  - line 188: This will make search results available for packages that
    aren't published yet.  Is that your intent?

query_engine.py:

  - line 46: blank line between class definitions

  - line 57: this should be a four-space indent.

  - line 68: this should be indented at the same level as line 56.

  - line 90: there doesn't seem to be a non-empty return from
    search_internal().  Or should the "return []"s set self._res instead?

  - line 108: if a list comprehension doesn't fit on one line, then say

        foo = [
            items
            for iter in loop
            if condition
        ]

  - line 113: remove blank line

  - line 123ff: please stop the lisp-y madness!

  - line 163: this resets the query results, I guess?  Comment please.

search_errors.py:

  - This needs the standard headers

  - line 9: Exception class names should end in "Exception", or maybe
    "Error".

  - line 13: I think you want "return" here.  Also, a space after "and".

  - line 14: four-space continuation indentation

  - The PartialIndexing exception tells you more explicitly what to do than
    the others.  You should probably do the same in the others.  I'd also
    fix up the grammar and punctuation.

  - lines 37, 45: extraneous blank lines.

  - I'm not sure it's good practice to put so much information in the
    string representation of an exception; I'd let the catcher generate
    that text.  However, if you're keen on being able to catch all the
    exceptions in a single handler and print out an appropriate message,
    then I think what I'd do is to have a class variable with a name common
    across the classes which holds the text, and catchers can print out
    that string.  For i18n purposes, that string should have "%s" in it so
    you can insert e.cause even in a translated message.

search_storage.py:

  - line 37: It seems like this function would be a lot simpler if you used
    a file locking protocol instead.  Also, why do you bail out if some
    index files are missing, but retry if the versions don't match?  Is
    there a substantive difference between the two conditions?

  - line 38: "ensures".  Also, "hitting the files"?

  - line 43: I'm not sure I see the point of this line, or of 79, given
    line 54.  Indeed, you could treat cur_version the same way as
    diff_versions.

  - line 50: this won't let you distinguish between None and 0 for timeout
    (where the former could mean no timeout, and the latter timeout
    immediately)

  - line 51: Suggest

        Consistency check timed out.  Either your indexes are corrupt, or
        you are having very bad luck with thread scheduling.  <What to do
        in either case.>

  - line 52: four-space continuation indentation

  - line 56: I'd put the comments describing the consistency checks here.
    "All index files must have the same version.  In addition, if one index
    file is missing, then all must be missing.  In either case, an
    InconsistentIndexError is raised."

  - line 60: I think the comment would make more sense if it said, "If we
    get here, then the current index file is present.  If we found one
    previously to be missing, we have found an inconsistent index."  Remove
    the test on line 90, change the test on line 92 to explicitly check
    against False, and unconditionally set empty to True after that if
    clause.  Then go back and change "empty" to "missing".  Or
    "missing_index".

  - line 74: "; otherwise, "; "check that the"

  - line 88: "If the index file is missing, ensure that previous files were
    missing as well.  If not, then we found an inconsistent index."

  - line 96: need indentation here.  If you need more room, I'd suggest
    importing * from search_errors.

  - line 107: "cur_version is not None" (or !=)

  - line 111: no old-style classes: derive from object.

  - line 131: excessive indent, want space after colon

  - line 150: No need to be so verbose in the warning.  Let the underscore
    do the talking, and perhaps just say that only child classes should
    call this method.  Alternatively, since all the child classes appear to
    be in this file, you could make this method private to the file; that
    would require getting rid of the one use of self, but that doesn't seem
    terribly hard to do.

  - line 157: Why aren't you using self._file_handle?

  - line 165: "heuristically"

  - line 168: why this assert?

  - line 170: lose the parens (change the tests, turn and into or).  You
    can also say stat_info.st_mtime, etc.  If you wanted, you could even
    say:

        if not (self._mtime, self._size) == \
            (stat_info.st_mtime, stat_info.st_size):

  - line 187: I don't understand "making the reads or writes isn't an
    option".

  - line 211: Could you add to the comment explaining the grammar of the
    file?  This could go in the write method instead.

  - line 220: No need for the first two sets of parens.

  - line 223: It might be useful to assign two names to the return, rather
    than using tmp[0] and tmp[1].  Same on lines 228 and 234:

        tok_type_id, action_id, keyval_id = map(int, tmp[0].split(","))

  - line 226 (&c): spaces around the minus sign, even in an array index.

  - line 243: I'm a bit curious why you didn't just use the normal Python
    syntax for dictionaries and lists.

  - line 263: This seems at odds with the fact that there's a function to
    write stuff out.  Perhaps a comment is necessary?

  - line 275: Is there any reason not to use os.rename()?

  - line 280-283: The first time through, we won't do anything except set
    old_suffix.  Each time after that, we remove the old file.  Why do we
    preserve the old file the first time we call shift_file(), but none of
    the rest?

Danek
_______________________________________________
pkg-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/pkg-discuss

Reply via email to