Re: [caiman-discuss] Code review request: Transfer module

jean.mccormack Tue, 12 Oct 2010 11:00:58 -0700

 Keith,

Thanks for the in depth review. Lots of good comments and I learned yetmore about some of the python calls.

Responses are inline. Note: Talking with Drew after his review yieldedsome restructuring that I reference here.Basically the method of storing and retrieving attributes relative on aper transfer basis is being overhauled for the better

with a huge reduction in complexity and code size.

Ginnie will respond with the items I can't answer.

Jean

On 10/11/10 06:29 PM, Keith Mitchell wrote:

 On 09/30/10 10:18 AM, jean.mccormack wrote:
I'd like to request a code view for the Transfer Module checkpointfor the CUD project.
The webrev is here:
http://cr.opensolaris.org/~jeanm/transfer_module/
Please send comments by 10/12. I realize there are now 3 big codereview outstanding. If you need more time please contact
me and I can be flexible.
Hi Jean,
There's an impressive amount of code here, so pardon the delay inresponding. I've got a fair number of nits and straightforwardPythonizations below; the only functional issue I have with the codeis the error handling. In particular, I feel the error handling was abit too excessive in the use of try/except clauses - a lot ofexceptions were caught and then a new, more generic exception wasraised. The details are below, but I feel a lot of it could bestripped out with no harm to the resultant code - it's perfectly fineto let exceptions propagate up a few levels without catching them.
My rule of thumb is to only catch if (a) the catching code can resolvethe problem, or (b) the catching code can add additional context(either by logging additional data for later debugging, or by wrappingthe exception in a more specific exception that adds details/messagesabout the context in which the original exception was raised, with thegoal of either providing the end user a better idea of how to resolvethe issue, or providing a more discrete exception for a caller toparse out).
I haven't had a chance to look at the test code yet; while I stillwant to take a (less detailed) look, I wanted to send this out so thatit doesn't sit too long in my drafts folder.
Thanks,
Keith

General:
Since the files are going under solaris_install/transfer, I think itmight make more sense to remove the "transfer_" from the .py files.
import solaris_install.transfer.transfer_ips
seems redundant compared to
import solaris_install.transfer.ips

I'm fine with that,no attachment to file names here. If anyone has aviable complaint they should let me know.


(The class names themselves, e.g., TransferCPIO, are fine)

usr/src/lib/Makefile: Nit: Alphabetize

Will fix.

system-library-install.mf:
Should this actually be in system-install.mf? That's where the currenttransfer module is.

I think that all of the checkpoints should go the same place. DOC and MPare in system-install-library so that's where transfer should then go.

transfer_cpio.py:
64: This variable might be more flexible as a tuple of ("-p", "-d","-u", "-m")

Not really. The design spec calls for just passing args through withoutany processing. So holding it as a string is in fact easier.

99-107: It seems to me that this scenario might be better handled bysomething akin to the following:
except SourceNotExistError:
# _parse_input should raise this for cases such as "self.src andnot os.path.exists(self.src)"
    return self.DEFAULT_SIZE
Note that the 'else' clause is purposely left out - theTransferUnknError raised here can't provide a better reason for thefailure, so there's no reason not to simply let the original exceptionpropagate.

Agreed.

119: Use "partition" instead of "split" (partition won't cause anerror if there happens to be a line in the file not of the form"XXX=YYY" at some point in the future)

Will do.

115-124: I think this should be done prior to the 'for' loop

Agreed. In fact, Drew and Ginnie and I were talking about somerefactoring of the code and when looking at that I found this issue.Nice catch.

155: Nit: Would it make sense to use self._cancel_event forconsistency with other checkpoints that don't define a custom cancel()method?

Instead of self.abort? I'm fine with that.

159: I'm not sure what the reason is for this method - can youexplain? Will other checkpoints feasibly need something similar (i.e.,should there be something more global in the engine or logging module)?

If the user had not called the get_progress_estimate function thenreporting progress is stupid. So give_progress is set when they call itand progress logging only occurs in that case.


Will other checkpoints need something similar? Possibly.

184, 188: General: When logging from except clauses, uselogger.exception() rather than logger.error/critical, aslogger.exception automatically dumps traceback data for the currentexception. However, see next comment

183-189: I think it might be best to simply propagate the error(putting cleanup steps in a 'finally' clause and removing both'except' clauses). The engine will take care of logging the uncaughtexception. In particular, re-wrapping the error on line 189 causes theloss of useful data from the original exception, and I don't see anyvalue added specifically by the TransferUnknError here.

I don't have any issues doing this, but awhile ago we had talked aboutcheckpoints only returning known errors. i.e. some type of TransferError in this case.

So if you're going to catch all exceptions I'm fine with it.

209-210: As this is cleanup code, I think it should be wrapped intry/except so that an issue with an unlink doesn't cause an exceptionto get raised (e.g.; if an exception was raised, triggering cleanup,and an exception happened in the cleanup, the original - and moreimportant - exception would get lost)

sure.

235: Some comments explaining what's happening here would be invaluable.

That code is disappearing as a result of Drew's comments. Funny hecouldn't figure it out either.


248: I think this could just be "if num_spec != 1:"

gone like above.

253, 319, 388, 437: I'm generally averse to python functions thataccept either a string or a list as a parameter, as it results in codethat requires this sort of type-checking, rather than taking advantageof Python's dynamic typing a bit better.
If having separate calls for lists and strings doesn't make sensehere, a better alternative is to flip the comparison: check for "ifisinstance(file_list, basestring):", and let the else clause assume aniterable (list/tuple/etc.) - the idea being that in this fashion, ifdir_list is a tuple, or a "list-like" something or other, the functionstill works Pythonically.

Sure. I can flip the comparison.

266-273: Why not sort first in memory, then write to file? Then youdon't have to write to file twice.

Possibly. With some code refactoring that is being done this may fallout or be harder. Will consider.

291, elsewhere: I think Drew had a good suggestion here, but given thebehavior of os.path.join, this could simply be:
fname = os.path.join(self.src, file_list)
os.path.join, upon encountering any arg that is an absolute path,disregards all prior arguments, so there's no need to if/else this.

Really? You know I'll have to test this, super cool if it does. And ifso, will change. Thanks.

427: The validate_xxx_list functions all look very similar; I imaginethey could be readily combined, or use a common sub-function.

Yes. Done as part of Drew's review.

526: Seems more reasonable to just let the OSError bubble up.

Will consider. Need to look at your comment from above first.

544: Should log this and propagate the original error; or capture theoriginal error string, so that there can be some determination of whythe file couldn't be opened.

Sure.

564: If line 559 raised an OSError from os.lstat, this line will raisea NameError because nothing was set. Suggest removing the try/exceptblock (it should be nearly impossible for something to occur thereanyway, given that line 550 succeeded if we reach this point).

Correct. Need to look at this.


604: Line appears to be missing an argument.

Yup. Will fix.


649: log or stash the original exception

OK.


654: Propagate original exception

OK.

667: Would rather not see yet another "run/exec/cmd" function in oursource code. Given the current single use case for this function,simply embed the check_call in _transfer - would even be fine withoutthe try/except clause, I think.

I'd rather leave this as a method in case it needs some beefing up.

704: Use "self.logger.info("Transferring files to %s", self.dst)"
As a general rule, for logging statements, use formatting stringoperations and pass in the formatting arguments as separateparameters. This allows the logging module to defer the (relativelytime intensive) string formatting operation until it determines if thelogging call will actually occur or not (based on the log levels).
While it's only really important for code that is run frequently (in aloop processing a large amount of data, for example), it's a goodpractice to get into.

Will do.

754-760, 960-966: These are init'ed in AbstractCPIO - is there areason to reset them here?

They are actually going away. But there is/was a reason. I wanted aclean attribute every time parse_input is called.

transfer_err.py:
Philosophically speaking, i don't see a need for these exceptionclasses. Uses of "TransferValueError" should definitely be directlyreplaced with the basic Python ValueError, which callers wouldreasonably expect from a function if they pass in invalid parameters.

Which is how it was originally but was changed after the big talk aboutexceptions and ( to be honest) a comment you made at one point aboutValueError.

The rest of the exceptions defined here seem to indicate *where* theexception came from, rather than *what* the exception was - the latterbeing far more valuable to a caller or end-user. The engine alreadyknows what checkpoint is running - and the stacktrace data from anexception will further make that clear. In terms of resolving an issue- either finding a programming bug that raised an exception,presenting adequate information to the user to fix a permissionsissue, or programmatically recovering from an expected error case -it's easier to use an exception that matches what happened (IOError -bad permissions / file not found; TypeError - Function can't handleobjects of type XYZ; etc.).

This all goes back to a discussion at one point about having thecheckpoints return exceptions they have defined. If we have gotten awayfrom that, fine.

But that is probably a larger architectural decision.

transfer_info:
45: ret_value needs to get set whether or not conv_str is None (thatis, what happens if conv_str = "foobar"?)

will do.

47/49: Nit: just compare conv_str.lower() == "false"
(or conv_str.capitalize() == "False" - capitalize will ensure that*only* the first letter of each word is capitalized)

Will do.

58-59: The "SOFTWARE_" prepended is redundant - just use "LABEL" and"NAME_LABEL"
Ditto for the equivalent items in Destination and Dir classes.

That is again something that needs to be decided amongst all of thecheckpoints. MP did it this way. I followed their convention because

consistency is good.

119: Nit: As a one-liner: "return element.tag == Software.LABEL".Ditto for the other classes defined here.

Sure.


301: See above note on capitalize()

will do

482: Just to clarify, the expected action here is that ifself.file_list == self.DEF_FILE_LIST and if self.file_list *can't* beaccessed, then a self.dir_list action would override it? Or shouldthat be an error?


Override is the expected action.

552: Flip the if/else such that the comparison is "ifisinstance(file_name, basestring):" (see above)

will do

604-639: These lists can be stored in memory, right? Would it bebetter to do so? A temporary file seems prone to issues acrosspause/resume, particularly if the system is rebooted in between.

Not really. We are passing things through the DOC and I'm not sure wereally want to carry what could be 1000's of entries around. Doesn'tpause/resume

reread the manifest?

775-777: DTD validation will ensure that action is never anything but"install" or "uninstall", correct?

Yes.

1007: I know this does something in vi, but it seems odd to have it inour source base.

Yes. I've taken it out of some files. Will remove.

transfer_ips.py:
73: My understanding of gettext is that gettext.install should *only*be called by an application, never by a module - modules should usesomething like:
_ = gettext.translation
gettext.install() modifies the global namespace, so running it willaffect all imported modules - which means this line here willoverride, or be overridden by, the call to gettext.install() that willcome from DC.
Similarly, I don't think a checkpoint should set locale.LC_ALL globally.

But the IPS calls don't work without this code. ;-(


80-83: Please add comments for these values

sure.

86: Remove commented code

yes.

110: Nit: (Same as prior nit) - would it make sense to useself._cancel_event for consistency?

Same answer. No issues with that.

134: Can you explain what this math is doing here? It appears to bethat each pkg is assumed to be "roughly" the same size based on apre-generated average, and the number of packages given is correlatedagainst that?

Yes. Eventually we'd like to get package size information from IPS butit's not there yet. Then this would be far more accurate of a calculation.

139-140: Since get_size() appears to only be used here, couldget_size/distro_size be combined into a single @property calleddistro_size? Use an internal self._distro_size variable to cache theresults of the work currently done in get_size(); if the cache is set,return its value; if not, calculate it, set it, and then return thevalue.

Possibly. But get_size is part of the defined interface so it's not onlyused here.

153: Same prior question regarding whether there's a deficiency in thelogging module or engine that needs to be resolved.

Same answer.

178-184: See comments on lines 183-189 of transfer_cpio.py

Same answer.

201/202: Use "self.logger.debug("Destination: %s", self.dst)

yup. will change.

695, 711, 725, 761, 835: Similar comments

Will change.


215-219: Use set operations:
not_allowed = set(["prefix", "repo_uri", "origins", "mirrors"])
image_args = set(self.image_args)
overlap = a & b:
if overlap:
    # error

217-219: Malformed format string

Are you sure? I've actually seen this work I believe.


222-3: Quick nit:
self.prog_tracker = self.img_args.get("progtrack", self.DEF_PROG_TRACKER)
Should achieve what you're going for, I believe.

In which case I wouldn't have to set self.prog_tracker to the default atsome point.

228: A recent change in pkg(5) made it so that publishers withinstalled packages could never be removed - removing them wouldactually simply *disable* them. Will that cause problems with thisapproach, i.e., removing an existing publisher on line 310, thenadding it back in (with potentially different origin/mirrors) on line320/324?

I think we're ok. Remember that email thread?

Would it be better to iterate through all existing publishers;updating any that match what the specified preferred/alternates are,disabling any that don't match, and then adding any new ones necessary?


Seems unnecessarily complex to me.


269, 276, 296: Use logger.exception

OK.


281: This would be more "natural" as a for/else clause:

for pub in pub_list:
    if pub == self._publ:
        # do stuff
        break
else:
    # No "break" clause execute -> Did not find preferred publisher
    # set the new preferred publisher

Yeah.

394: pop() is a function, so the arg should be in parentheses, notbrackets. pop() also removes the item from the dictionary, which mayor may not be desired (e.g. if something fails, but it's recoverable,and this code executes again, "recursive_removal" will be gone - itmay be desirable to instead create a copy of args and use that forthis section of code)

Thanks for catching the brackets. Removal is the desired action in thiscase but the cases you expand upon are definitely ones to be thought about.

An empty dictionary will work just fine with **, so if args is alwaysa dictionary (or possibly None - in which case, add an "if args isNone: args = {}" chunk of code somewhere), then 391-401 can becombined into:

args may be None. So it comes down to is it nicer to have the if else orthe args = {}. 6 of one half dozen of another in my mind.



recursive = args.pop("recursive_removal", False)
self.api_inst.plan_uninstall(pkg_uninstall, recursive, **args)

438-439: Code can be removed

Yup. Obviously there used to be more code that was removed.

451: Will stacktrace if self._origin is None. (some_list[1:] willnever be None - if there are no items there, it will return an emptylist). line 451 could probably just be removed, and line 452 could bepart of the 'if' block from line 449.

I think you can't do that. If I move 452 to follow 449 then ifself._origin[1:] returns an empty list, I have set origins to an emptylist which isn't what I want. I don't want to set it at all in that case.

Actually, I know _origin is not None. It's at worst and empty list sinceit's initialized to that. I think the error is really at line 449 whichshould be

if self._origin[0] is not None:

456: What happens if self.action is neither "use_existing" nor "create"?

Bad things. This should be checked in _validate_input since those arethe only two legal values. Thanks.

504: Include the URI in the exception (or propagate theInvalidDepotResponseException if it has a suitable message

sure.

507: I don't know if this message will make sense to an end user - howwould they end up in this situation, and how might they resolve it?The API Errors from pkg.client.api_errors seem fairly verbose - Ithink it would be possible to simply propagate them, unless we haveadditional context to add to the message, or have a way to recoverfrom the exception.

It certainly made sense to Drew and Alok. They end up in this situationif the pkg version is different from our version. I believe their errorsweren't quite as good in this case. Or they didn't really help Drew andAlok when they saw them.

569: What is this clause catching? The message seems rather specificto a failure at line 563, and I imagine it could be constrained to asingle exception class.

Either 563 or 565 or 566. If any of these raise an exception thedestination hasn't been specified correctly.

575-578: If im_type.zone is a bool, simply set "self.is_zone =im_type.zone". If it's something else, use "self.is_zone =bool(im_type.zone)"

 My C background is showing here. Will change.

642: Nit: Initialize value_str to an empty string, and lines 646-648aren't needed.

This code is moving to DOC so you'll need to consult with Darren on this.


712, 726: A more specific class than "Exception" would be preferred here.

Yeah.

683: In looking at the complexity of the branching, I wonder if a lotof the publisher related code, here and elsewhere, would be simplifiedby having a single list of publishers, stored in "search" order. Thefirst item in the list would be what's known as the "preferred"publisher, and the rest are additional publishers, in order ofprecedence. This would seem to map more naturally to how thepublishers on a system are laid out these days (preferred is nothingmore than a way to say "search this publisher first").
kemit...@kemobile-work->~ 0 $ pkg publisher
PUBLISHER                             TYPE     STATUS   URI
opensolaris.org (non-sticky, preferred) origin onlinehttp://ipkg.sfbay/dev/contrib origin onlinehttp://pkg.opensolaris.org/contrib/extra origin onlinehttp://ipkg.sfbay/extra/kemit...@kemobile-work->~ 0 $ pfexec pkg set-publisher--search-before=opensolaris.org extra
kemit...@kemobile-work->~ 0 $ pkg publisher
PUBLISHER                             TYPE     STATUS   URI
extra (preferred) origin onlinehttp://ipkg.sfbay/extra/opensolaris.org (non-sticky) origin onlinehttp://ipkg.sfbay/dev/contrib origin onlinehttp://pkg.opensolaris.org/contrib/

Actually, it's not too different. We start out with publishers stored inorder, first being preferred. Unfortunately preferred and rest are setin IPS via very different mechanisms so it was easier to break them outfor internal storage.

745: Nit: The "+" and the "\" at the end aren't strictly necessary(The "\" is implied since there is an unclosed parentheses; the "+" isnot required for concatenation of inline string constants)(Saw this in a few places, truly not a big deal, just thought I'dpoint it out for future reference)
837-841: Nit: Enclose in an "ifself.logger.isEnabledFor(logging.DEBUG):" clause, for reasons similarto prior logging comments.

So you're implying that if all the code does is log something to encloseit? Reason being? Seems to me that all you save is the code not beingexecuted but at the expense of readability. And lots of otherlogger.debug statements are being executed.

869-871: Use "elif"?

Sure.

881: Initializing self.args to an empty dictionary would remove theneed for this type of logic.

Yes. Will look to make sure this doesn't break something else first though.


transfer_p5i.py:

77-82: Remove try/except clause, and let exceptions propagate?

Sure.


67, 87: You've popped the first publisher twice - was that intended?

Yes. In this case, the first publisher is the p5i file and the 2nd isthe preferred and I want both of those dealt with and only additionalones left.

transfer_prog.py:
70: Would os.statvfs provide the necessary data, rather than spawninga subprocess?See:http://stackoverflow.com/questions/787776/find-free-disk-space-in-python-on-os-x/787832#787832

will look into this.

77: Needs a "stderr=subprocess.PIPE" argument so that errors aren'tdumped to the console

yeah

80: Should check that the command succeeded, otherwise this may errorout with the IndexError seen on 82.
82-83: Propagate original error

OK to both.

92: Seems odd to override the value for self.endpct that was set online 60 in __init__
(Or should that 2nd equals be a minus sign perhaps?)

It should be a - along with line 103.

101: When would this occur?

The df code (on tree usr/src/cmd/fs.d/df.c allows for the case where -1could be there.

103: This looks odd as well - should the second equals sign be a minussign?

Yup. See above.

118: Could this be moved up to line 98 as the condition on the "while"statement?

The first time through you don't have pct yet so it would involve morework than I think is necessary.

120: This should be a tweakable attribute (i.e., in __init__, setself.sleep_for = 2)

Sure. Why not?


transfer_svr4.py:
87: Nit: Just use "for pkg_install in self._tr_pkg_install:"

Gone due to restructuring per Drew.


99: Could use more specific error handling and messaging.

Ginnie will look into this.

101: Move this comment to the docstring

Sure.

107: Have _parse_input raise a specific exception for this scenario,and catch that; see comment regarding lines 99-107 of transfer_cpio.py

Yes.

103-133, 162-190: This seems to overlap with the same functions inprior files; would a common meta-class be worthwhile, defining thesebasic things?

Possibly? So you're thinking a class like TransferClass whichTransferCPIO, TransferIPS...... are all children of? And they wouldinherit some of these methods from?

154-160: See comment regarding lines 183-189 of transfer_cpio.py

And same answer. ;-)


210/212: Nit: Use "for pkg_install in self._tr_pkg_install:"
(Ditto for line 249/251)


Gone due to some restructuring.


232-245: Use something like:
while pkg_proc.poll() is None:
    self.check_abort()
    # possibly sleep for a little bit
stdout, stderr = pkg_proc.communicate()
# log stdout, check pkg_proc.returncode != 0, etc.


Ginnie will need to answer this one.

245: This seems like it will add a *lot* of output at the info level,which is probably not useful. Perhaps log at the debug level, or onlylog if the pkgadd command failed?

Yes. Should be debug at the most.

248-276: Seems like this could be combined with the "install" block bydoing something like:


if self._tr_pkg_install:
    pkgs = self._tr_pkg_install
    base_cmd = self.PKGADD
elif self._tr_pkg_uninstall:
    pkgs = self._tr_pkg_uninstall
    base_cmd = self.PKGRM
for pkg in pkgs:
    # prep work
    cmd = [base_cmd] + arglist
    if self.datastream:
        cmd.append(datastream)
    pkg_proc = subprocess.Popen(cmd, ...)
    # continue as usual

Will look at this but this code will change quite a bit with therestructuring work.


322: Line not needed, given line 327

Correct.


335: No need to concatenate there

correct.

345, 355: Needs more specific exception(s)

In this case we could use ObjectNotFoundError.

414-425: Seems like this could be condensed similarly to lines232-245. Since this doesn't need to be interruptible, it could besquished even further into:
stdout, stderr = pkg_proc.communicate()
if "package datastream" in stdout:
    # datastream pkg

I'll let Ginnie respond here too.

428-443: Seems like it would be possible to remove this limitation, bycorrelating each pkg with its own set of arguments (datastream ornon-datastream, or uninstall)

I'll let Ginnie field this one too.

477-497: Might be worthwhile to simply change these to normalattributes, and then simply have the _parse_input function raise anexception if both are set?

Gone due to restructuring.

499-577: Seems like this duplicates a lot of the code in the priorclass - perhaps some reorganization is in order?


I'll have Ginnie look at this too.


Jean

_______________________________________________
caiman-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/caiman-discuss

Re: [caiman-discuss] Code review request: Transfer module

Reply via email to