As it's obviously taking me a while to work my way through this review, Brock asked if he could see what I have, so here it is. I've commented on all the files which are already in the gate, and am making my way through the new stuff. I'm partway through search_storage.py right now.
Sorry for the delay ... On Sat, Jul 05, 2008 at 08:21:38AM -0700, Brock Pytlik wrote: > http://cr.opensolaris.org/~bpytlik/ips-search/ Is the synopsis of buid 2122 truncated? Make sure that all python files start with "#!/usr/bin/python2.4". I don't know where the convention of putting a space at the beginning and end of every docstring began, but I'd like for it to end now. No need to fix up existing usage, but don't introduce any new ones. I think we also probably settled on not putting spaces around keyword arguments. At least be consistent in your own patch. :) As discussed on Friday, I screwed the pooch on the distinction between _foo and __foo. Please use __foo for symbols that really ought to be private to a class, and _foo for ones where they should be thought of as private, but where some outsiders actually need to access them. Though if you want to drop the underscores entirely in some places, that's fine, too. Also feel free to use properties where that makes sense. I think we need a text file in the doc directory explaining how all this works. client.py: - line 123, 132 (and others): put quotes around the command to run. - line 126: double space after 'pfexec'. - line 129: no need for a temporary variable; just return that string. - line 129: Do you need a period after "found"? Note that this is going to be a pretty long line, even if "text" is short. - line 1422: usage(), perhaps? - line 1424: Perhaps this should go into Image, so that other clients (like the gui) end up having the same behavior? - line 1555: I'd say "above traceback" rather than "traceback produced". Nice idea; thanks. depot.py: - line 154: "omitted" attributes.py: - I'm a bit confused by the way generate_indices() works here. All the dictionaries returned seem to have keys derived from the data, not describing what kind of thing they are (a la "path", "basename", or "hash" from the file action). I would expect that a search hit on an fmri would come back as INDEX ACTION VALUE PACKAGE fmri set pkg:/[EMAIL PROTECTED] pkg:/[EMAIL PROTECTED] not as INDEX ACTION VALUE PACKAGE pkg:/[EMAIL PROTECTED] set ["foo", "1.2", None] pkg:/[EMAIL PROTECTED] which is what it looks like it'll come back as. Perhaps I'm just not far enough along to see how this makes sense, but given that you didn't touch any of the other actions' generate_indices() methods, I guess I'm going to remain confused. - line 67: I know we don't agree about the parens, but tough. Lose 'em. - line 68, 88: we've been using double-quotes around strings, primarily, except when the internal string has double-quotes itself. There are a bunch of other places where you have this, particularly in new files. - line 69ff: return dict( (w, w) for w in self.attrs["value"].split() ) - line 76: This functionality should be part of the fmri module. And both this here and the PkgFmri (and Version) constructor should use the same parsing methods, whichever one is faster. Perhaps you don't need to construct any objects, but that just means we need a single parser that the constructor can use as well as places that don't need the object. - line 80: space after the colon. Break the line after the open bracket, and put the close bracket in the same column as the first character of the line on which the open bracket sits: { key: [ listval1, listval2, listval3 ] } If the list values don't fit in 80 characters, I'd put each one on a separate line. - line 86: given the next line, do you need this one? - line 97: space after colon catalog.py: - line 107: space after comma - line 176: no parens - line 179: I would os.fdopen(tmp_num, mode="wb") instead. Is "binary" mode really necessary here? - line 184: use double-quotes, and no outer parens. Probably also want a short comment here saying that you're creating an empty file. - line 193: couldn't this go in the finally clause? - line 205, 600: os.rename(). Or portable.rename(), I guess. - line 293: no need for an extra blank line. - line 386, 396, 410: excessive indentation? client_query_engine.py: - How confusing would it be if this were just client/query_engine.py? - line 41: dir_path? - line 52: use iteritems() image.py: - line 455: "manifest". possessive "'s" after "fmri", but I think that still doesn't fix the grammar. - line 456: return [ (fmri, self.get_manifest_path(fmri)) for fmri in self.gen_installed_pkgs() ] Better yet, make it a generator by replacing the brackets with parens, and make the caller listify it if they really need that. - line 563: why create this temporary object? A single os.path.join call ought to do it. - line 564: excessive indentation - line 1162ff: Since the index directory will not reliably be updated when the image root is, this should be called prior to using the index directory. However, when do we change self.imgdir? If this is actually necessary, shouldn't we just make imgdir a property, and have the setter modify index_dir, too (which should be more efficient than the other way around)? - line 1172: why the extra assignment? - line 1356: perhaps call this rebuild_search_indexes? - line 1357ff: Rebuilds the search databases from scratch. imageplan.py: - line 388: "Check the index ..." - line 389: "inconsistent" - line 512: I'd put blank lines around this one. - line 515: I'd ditch this blank line - line 516: I'd put this (indented) after the finally manifest.py: - line 273, 275: lose the parens. If it helps, "not a in b" can be rewritten as "a not in b". - line 283: Comment needs an update. repository.py: - line 185: Are you sure this doesn't require a protocol rev? server/transaction.py: - line 188: This will make search results available for packages that aren't published yet. Is that your intent? query_engine.py: - line 46: blank line between class definitions - line 57: this should be a four-space indent. - line 68: this should be indented at the same level as line 56. - line 90: there doesn't seem to be a non-empty return from search_internal(). Or should the "return []"s set self._res instead? - line 108: if a list comprehension doesn't fit on one line, then say foo = [ items for iter in loop if condition ] - line 113: remove blank line - line 123ff: please stop the lisp-y madness! - line 163: this resets the query results, I guess? Comment please. search_errors.py: - This needs the standard headers - line 9: Exception class names should end in "Exception", or maybe "Error". - line 13: I think you want "return" here. Also, a space after "and". - line 14: four-space continuation indentation - The PartialIndexing exception tells you more explicitly what to do than the others. You should probably do the same in the others. I'd also fix up the grammar and punctuation. - lines 37, 45: extraneous blank lines. - I'm not sure it's good practice to put so much information in the string representation of an exception; I'd let the catcher generate that text. However, if you're keen on being able to catch all the exceptions in a single handler and print out an appropriate message, then I think what I'd do is to have a class variable with a name common across the classes which holds the text, and catchers can print out that string. For i18n purposes, that string should have "%s" in it so you can insert e.cause even in a translated message. search_storage.py: - line 37: It seems like this function would be a lot simpler if you used a file locking protocol instead. Also, why do you bail out if some index files are missing, but retry if the versions don't match? Is there a substantive difference between the two conditions? - line 38: "ensures". Also, "hitting the files"? - line 43: I'm not sure I see the point of this line, or of 79, given line 54. Indeed, you could treat cur_version the same way as diff_versions. - line 50: this won't let you distinguish between None and 0 for timeout (where the former could mean no timeout, and the latter timeout immediately) - line 51: Suggest Consistency check timed out. Either your indexes are corrupt, or you are having very bad luck with thread scheduling. <What to do in either case.> - line 52: four-space continuation indentation - line 56: I'd put the comments describing the consistency checks here. "All index files must have the same version. In addition, if one index file is missing, then all must be missing. In either case, an InconsistentIndexError is raised." - line 60: I think the comment would make more sense if it said, "If we get here, then the current index file is present. If we found one previously to be missing, we have found an inconsistent index." Remove the test on line 90, change the test on line 92 to explicitly check against False, and unconditionally set empty to True after that if clause. Then go back and change "empty" to "missing". Or "missing_index". - line 74: "; otherwise, "; "check that the" - line 88: "If the index file is missing, ensure that previous files were missing as well. If not, then we found an inconsistent index." - line 96: need indentation here. If you need more room, I'd suggest importing * from search_errors. - line 107: "cur_version is not None" (or !=) - line 111: no old-style classes: derive from object. - line 131: excessive indent, want space after colon - line 150: No need to be so verbose in the warning. Let the underscore do the talking, and perhaps just say that only child classes should call this method. Alternatively, since all the child classes appear to be in this file, you could make this method private to the file; that would require getting rid of the one use of self, but that doesn't seem terribly hard to do. - line 157: Why aren't you using self._file_handle? - line 165: "heuristically" - line 168: why this assert? - line 170: lose the parens (change the tests, turn and into or). You can also say stat_info.st_mtime, etc. If you wanted, you could even say: if not (self._mtime, self._size) == \ (stat_info.st_mtime, stat_info.st_size): - line 187: I don't understand "making the reads or writes isn't an option". - line 211: Could you add to the comment explaining the grammar of the file? This could go in the write method instead. - line 220: No need for the first two sets of parens. - line 223: It might be useful to assign two names to the return, rather than using tmp[0] and tmp[1]. Same on lines 228 and 234: tok_type_id, action_id, keyval_id = map(int, tmp[0].split(",")) - line 226 (&c): spaces around the minus sign, even in an array index. - line 243: I'm a bit curious why you didn't just use the normal Python syntax for dictionaries and lists. - line 263: This seems at odds with the fact that there's a function to write stuff out. Perhaps a comment is necessary? - line 275: Is there any reason not to use os.rename()? - line 280-283: The first time through, we won't do anything except set old_suffix. Each time after that, we remove the old file. Why do we preserve the old file the first time we call shift_file(), but none of the rest? Danek _______________________________________________ pkg-discuss mailing list [email protected] http://mail.opensolaris.org/mailman/listinfo/pkg-discuss
