Jean,

Responses below. I removed items where I had no further comments.

It sounds like the data restructuring will help a lot, which is fantastic.

- Keith

On 10/12/10 10:57 AM, jean.mccormack wrote:
 Keith,

Thanks for the in-depth review. Lots of good comments, and I learned yet more about some of the Python calls.

Responses are inline. Note: Talking with Drew after his review yielded some restructuring that I reference here. Basically, the method of storing and retrieving attributes on a per-transfer basis is being overhauled for the better, with a huge reduction in complexity and code size.

Ginnie will respond with the items I can't answer.

Jean

On 10/11/10 06:29 PM, Keith Mitchell wrote:
[...]

system-library-install.mf:
Should this actually be in system-install.mf? That's where the current transfer module is.


I think that all of the checkpoints should go in the same place. DOC and MP are in system-install-library, so that's where transfer should then go.

Ok. I understand that perhaps those 2 packages will be re-evaluated in the future, so I think that's fine then.



transfer_cpio.py:

64: This variable might be more flexible as a tuple of ("-p", "-d", "-u", "-m")

Not really. The design spec calls for just passing args through without any processing. So holding it as a string is in fact easier.

This works for the current set of arguments, but if we ever want to change the default args to use an option that requires an argument (e.g., "-C bufsize"), it will be easier to do so if this variable is a list/tuple - since subprocess.Popen/call/check_call takes a list of arguments, it'll be looking for something of the form:
["cpio", "-i", "-C", "16384", "-p", "-du", "-m"]

"-pdum" works currently because cpio can handle that as a single argument. (Even if this were changed to be a single item list, ["-pdum"], it leaves the door more flexible for future changes).


99-107: It seems to me that this scenario might be better handled by something akin to the following:
except SourceNotExistError:
    # _parse_input should raise this for cases such as
    # "self.src and not os.path.exists(self.src)"
    return self.DEFAULT_SIZE

Note that the 'else' clause is purposely left out - the TransferUnknError raised here can't provide a better reason for the failure, so there's no reason not to simply let the original exception propagate.

Agreed.

119: Use "partition" instead of "split" (partition won't cause an error if there happens to be a line in the file not of the form "XXX=YYY" at some point in the future)

Will do.
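For the record, here's the difference I was getting at (toy strings, not the actual file contents):

line = "NAME=value"

name, _, value = line.partition("=")         # ('NAME', '=', 'value')
name, _, value = "junk line".partition("=")  # ('junk line', '', '') - never raises

# split itself just returns a shorter list for "junk line"; it's the usual
# two-variable unpack that blows up:
# name, value = "junk line".split("=")       # ValueError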

115-124: I think this should be done prior to the 'for' loop

Agreed. In fact, Drew, Ginnie, and I were talking about some refactoring of the code, and while looking at that I found this issue. Nice catch.

155: Nit: Would it make sense to use self._cancel_event for consistency with other checkpoints that don't define a custom cancel() method?

Instead of self.abort? I'm fine with that.

159: I'm not sure what the reason is for this method - can you explain? Will other checkpoints feasibly need something similar (i.e., should there be something more global in the engine or logging module)?

If the user hasn't called the get_progress_estimate function, then reporting progress is pointless. So give_progress is set when they call it, and progress logging only occurs in that case.

Will other checkpoints need something similar? Possibly.

My line of thought here is that, if get_progress_estimate hasn't been called, there probably aren't any ProgressHandlers assigned to the logging module, in which case calls to report_progress() should be filtered out appropriately.

In other words, the logging module is designed to perform filtering of logging calls based on logging level, engine status, what handlers exist, etc. I don't think the checkpoints should have to worry about it - they should just call "report_progress", and if the logging module determines that there's nowhere to send the message, then it just gets ignored - just like if the logging level is set globally to "logging.WARN", logging messages at the debug level are simply ignored by the logger.
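To illustrate with the plain logging module (our install logger layers on top of this, so treat it as a rough sketch):

import logging

logger = logging.getLogger("transfer_example")
logger.setLevel(logging.WARN)

# The logger checks the effective level (and which handlers exist) before
# doing any real work, so this call is cheap and simply disappears; the
# checkpoint doesn't need to guard it.
logger.debug("copied %d of %d files", 10, 400)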



184, 188: General: When logging from except clauses, use logger.exception() rather than logger.error/critical, as logger.exception automatically dumps traceback data for the current exception. However, see the next comment.


183-189: I think it might be best to simply propagate the error (putting cleanup steps in a 'finally' clause and removing both 'except' clauses). The engine will take care of logging the uncaught exception. In particular, re-wrapping the error on line 189 causes the loss of useful data from the original exception, and I don't see any value added specifically by the TransferUnknError here.
I don't have any issues doing this, but a while ago we had talked about checkpoints only returning known errors, i.e., some type of Transfer error in this case.
So if you're going to catch all exceptions I'm fine with it.

[...]

291, elsewhere: I think Drew had a good suggestion here, but given the behavior of os.path.join, this could simply be:
fname = os.path.join(self.src, file_list)
os.path.join, upon encountering any arg that is an absolute path, disregards all prior arguments, so there's no need to if/else this.
Really? You know I'll have to test this, super cool if it does. And if so, will change. Thanks.

Yup, really. It's both a blessing and a pain, depending on whether you're aware of that behavior when you try to use the function...
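A quick demonstration, for when you go to test it:

import os.path

os.path.join("/export/src", "etc/passwd")    # '/export/src/etc/passwd'
os.path.join("/export/src", "/etc/passwd")   # '/etc/passwd' - the absolute
                                             # arg wins; prior args are dropped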


[...]

667: Would rather not see yet another "run/exec/cmd" function in our source code. Given the current single use case for this function, simply embed the check_call in _transfer - would even be fine without the try/except clause, I think.
I'd rather leave this as a method in case it needs some beefing up.

If a special wrapper for subprocess.call/check_call/Popen is really needed, it should probably be implemented in a way that's usable across the entire code base. This is one of my major pet peeves in our slim_source code - every module seems to have 1-3 special wrappers for executing a subprocess. Last time I counted, we had about 8 separate methods for this, most of which do nearly the same thing.

https://defect.opensolaris.org/bz/show_bug.cgi?id=15957
http://opensolaris.org/jive/message.jspa?messageID=474361#474361

Personally, I don't think we need any wrapper around the subprocess functions; but even if we do, 1 or 2 at most seems like it would be enough. I don't think we need 8, 9, or more.

If embedding the subprocess.check_call() call is truly not desired here, would it at least be possible to use one of the existing wrappers, rather than rolling up a new one?


704: Use "self.logger.info("Transferring files to %s", self.dst)"
As a general rule, for logging statements, use a format string and pass the formatting arguments as separate parameters. This allows the logging module to defer the (relatively time-intensive) string formatting operation until it determines whether the logging call will actually occur (based on the log levels).

While it's only really important for code that is run frequently (in a loop processing a large amount of data, for example), it's a good practice to get into.

Will do.
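Thanks. For the record, the difference looks like this (a standalone snippet, not the checkpoint code itself):

import logging

logger = logging.getLogger("transfer_example")
dst = "/a"

# Deferred: the %-substitution only happens if the record is actually emitted.
logger.info("Transferring files to %s", dst)

# Eager: the string is built before logging ever looks at the level.
logger.info("Transferring files to %s" % dst)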

754-760, 960-966: These are init'ed in AbstractCPIO - is there a reason to reset them here?
They are actually going away. But there is/was a reason. I wanted a clean attribute every time parse_input is called.

transfer_err.py:
Philosophically speaking, I don't see a need for these exception classes. Uses of "TransferValueError" should definitely be directly replaced with the basic Python ValueError, which callers would reasonably expect from a function if they pass in invalid parameters.
Which is how it was originally, but it was changed after the big talk about exceptions and (to be honest) a comment you made at one point about ValueError.

Hrm. I find exceptions to be tricky... I've waffled in my philosophies on them at times, that much I know.


The rest of the exceptions defined here seem to indicate *where* the exception came from, rather than *what* the exception was - the latter being far more valuable to a caller or end-user. The engine already knows what checkpoint is running - and the stacktrace data from an exception will further make that clear. In terms of resolving an issue - either finding a programming bug that raised an exception, presenting adequate information to the user to fix a permissions issue, or programmatically recovering from an expected error case - it's easier to use an exception that matches what happened (IOError - bad permissions / file not found; TypeError - Function can't handle objects of type XYZ; etc.).
This all goes back to a discussion at one point about having the checkpoints return exceptions they have defined. If we have gotten away from that, fine.
But that is probably a larger architectural decision.

I think I recall that discussion - it was in reference, initially, to ManifestParser/Writer, as I recall. My memory may be deceiving me, but I think my stance on that was that I would have preferred to have the original exceptions from lxml propagated, but (grudgingly) admitted that for MP/MW it was acceptable to wrap the exceptions, so long as the original exception was stored and available for examination.


[...]

58-59: The "SOFTWARE_" prepended is redundant - just use "LABEL" and "NAME_LABEL"
Ditto for the equivalent items in Destination and Dir classes.
That is again something that needs to be decided amongst all of the checkpoints. MP did it this way. I followed their convention because
consistency is good.

Ok. I'm sometimes inconsistent with whether or not I catch these things (or for that matter, whether or not I care about them). So for consistency's sake, leave it in.

[...]
604-639: These lists can be stored in memory, right? Would it be better to do so? A temporary file seems prone to issues across pause/resume, particularly if the system is rebooted in between.
Not really. We are passing things through the DOC, and I'm not sure we really want to carry what could be thousands of entries around. Doesn't pause/resume reread the manifest?

Out-of-process resume will only reread the manifest if the app chooses to do so. For DC, this is done, since the user may modify the manifest in between. For other apps, this may not be the case.

If there's concern about passing 1000s of entries around, it may be necessary to look into defining whatever special functions are used by the pickle module to load/unload an object, so that when this object is serialized, the entries from the file are stored in the DOC snapshot as well.
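Concretely, I'm thinking of the __getstate__/__setstate__ hooks that pickle uses - something roughly like this (class and attribute names are made up):

class FileListHolder(object):
    """Keeps its entries in a temp file, but snapshots them when pickled."""

    def __init__(self, tmp_path):
        self.tmp_path = tmp_path

    def __getstate__(self):
        # Pull the entries out of the temp file so they end up inside the
        # DOC snapshot instead of only on disk.
        state = self.__dict__.copy()
        with open(self.tmp_path) as tmp:
            state["entries"] = tmp.readlines()
        return state

    def __setstate__(self, state):
        entries = state.pop("entries")
        self.__dict__.update(state)
        # Recreate the temp file on resume.
        with open(self.tmp_path, "w") as tmp:
            tmp.writelines(entries)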

[...]

transfer_ips.py:
73: My understanding of gettext is that gettext.install should *only* be called by an application, never by a module - modules should use something like:
_ = gettext.translation(<domain>, <localedir>, fallback=True).gettext
gettext.install() modifies the global namespace, so running it will affect all imported modules - which means this line here will override, or be overridden by, the call to gettext.install() that will come from DC.

Similarly, I don't think a checkpoint should set locale.LC_ALL globally.
But the IPS calls don't work without this code. ;-(

We'll need to work through this to make sure that our modules and pkg's can be localized properly in the same process-space. The pkg.client code appears* to be violating some of the rules set out by the gettext module for use in libraries:
http://docs.python.org/library/gettext.html#localizing-your-module

* I say appears, because I'm really not sure. Localization is something that eludes my grasp quite frequently, and it's all too easy to blame someone else's code for the problem, but I don't think it should be necessary for a consumer of an API to have to work around the localization code used within that module.

I *suspect* that use of gettext.bindtextdomain("pkg", "/usr/share/locale") will make their code work. If adding that call at the top level of the module causes the pkg.client code to run, then the best approach is probably a context manager that sets the text domain to "pkg" before each pkg.client call, and resets it afterwards.
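Something along these lines is what I have in mind - completely untested, and the domain handling may well need tweaking:

import gettext
from contextlib import contextmanager

@contextmanager
def pkg_textdomain():
    """Temporarily make "pkg" the active gettext domain around a pkg.client call."""
    saved = gettext.textdomain()    # remember the current global domain
    gettext.bindtextdomain("pkg", "/usr/share/locale")
    gettext.textdomain("pkg")
    try:
        yield
    finally:
        gettext.textdomain(saved)

# Usage:
# with pkg_textdomain():
#     ... whatever pkg.client call needs the "pkg" domain ...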

If this isn't easily resolvable, the best approach is probably to file 2 bugs - 1 against pkg.client, and 1 against this code, and try to work together to figure out what needs to be done in order to make both sides localizable in a way that neither side interferes with the other. (Actually, filing those bugs may be needed anyway - even if the bindtextdomain method works, in a perfect world that wouldn't be necessary).

[...]

134: Can you explain what this math is doing here? It appears that each pkg is assumed to be "roughly" the same size based on a pre-generated average, and the number of packages given is correlated against that?
Yes. Eventually we'd like to get package size information from IPS but it's not there yet. Then this would be far more accurate of a calculation.

Got it, that makes sense. Will there be a bug filed to update this code when the relevant pkg API is available?



139-140: Since get_size() appears to only be used here, could get_size/distro_size be combined into a single @property called distro_size? Use an internal self._distro_size variable to cache the results of the work currently done in get_size(); if the cache is set, return its value; if not, calculate it, set it, and then return the value.

Possibly. But get_size is part of the defined interface so it's not only used here.

Ah, got it. Fine as is then.
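For future reference, the caching pattern I had in mind was roughly this (illustrative names only, and it clearly doesn't fit if get_size has to stay in the interface):

class TransferIPSSketch(object):
    def __init__(self):
        self._distro_size = None

    @property
    def distro_size(self):
        if self._distro_size is None:
            # Do the (potentially expensive) work from get_size() exactly once.
            self._distro_size = self._calculate_distro_size()
        return self._distro_size

    def _calculate_distro_size(self):
        return 42    # placeholder for the real calculation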


215-219: Use set operations:
not_allowed = set(["prefix", "repo_uri", "origins", "mirrors"])
image_args = set(self.image_args)
overlap = not_allowed & image_args
if overlap:
    # error

217-219: Malformed format string
Are you sure? I've actually seen this work I believe.

Assuming not_allowed = "repo_uri", the output here will end up being literally:

"%s to use should be repo_urispecified in the Source, not the args"

i.e., the string will be concatenated, not formatted. I think what you want is:

raise ValueError("%s to use should be specified in the Source, not the args" % not_allowed)

If that needs to line wrap, use implicit concatenation - without a "+", i.e.:
raise ValueError("%s to use should be "
                 "specified in the source, "
                 "not the args" % not_allowed)


222-3: Quick nit:
self.prog_tracker = self.img_args.get("progtrack", self.DEF_PROG_TRACKER)
Should achieve what you're going for, I believe.
In which case I wouldn't have to set self.prog_tracker to the default at some point.

Right, self.prog_tracker could simply be initialized to None in __init__ (or left initialized to DEF_PROG_TRACKER - though it needs to be initialized to something to appease Pylint and to aid things like Pydoc that use __init__ to determine what the variables of a class "should" be).



228: A recent change in pkg(5) made it so that publishers with installed packages could never be removed - removing them would actually simply *disable* them. Will that cause problems with this approach, i.e., removing an existing publisher on line 310, then adding it back in (with potentially different origin/mirrors) on line 320/324?
I think we're ok. Remember that email thread?

I do, now that you mention it, and yeah, it should work fine as is. Sometimes I spook too easily...

[...]


394: pop() is a method, so the arg should be in parentheses, not brackets. pop() also removes the item from the dictionary, which may or may not be desired (e.g., if something fails, but it's recoverable, and this code executes again, "recursive_removal" will be gone - it may be desirable to instead create a copy of args and use that for this section of code).
Thanks for catching the brackets. Removal is the desired action in this case but the cases you expand upon are definitely ones to be thought about.

An empty dictionary will work just fine with **, so if args is always a dictionary (or possibly None - in which case, add an "if args is None: args = {}" chunk of code somewhere), then 391-401 can be combined into:

recursive = args.pop("recursive_removal", False)
self.api_inst.plan_uninstall(pkg_uninstall, recursive, **args)

args may be None. So it comes down to whether it's nicer to have the if/else or the "args = {}". Six of one, half a dozen of the other, in my mind.
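And if the retry scenario I mentioned ever matters, copying args first keeps the caller's dictionary intact (sketch only, same caveats as above):

local_args = dict(args) if args else {}
recursive = local_args.pop("recursive_removal", False)
self.api_inst.plan_uninstall(pkg_uninstall, recursive, **local_args)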

438-439: Code can be removed
Yup. Obviously there used to be more code that was removed.

Yup. I've done the same many times...


451: Will stacktrace if self._origin is None. (some_list[1:] will never be None - if there are no items there, it will return an empty list). line 451 could probably just be removed, and line 452 could be part of the 'if' block from line 449.

I think you can't do that. If I move 452 to follow 449 then if self._origin[1:] returns an empty list, I have set origins to an empty list which isn't what I want. I don't want to set it at all in that case.

In that case, the line should be "if self._origin[1:]:". Don't compare to "is None", because a list slice will never be None - it may be empty though.


Actually, I know _origin is not None. It's at worst an empty list, since it's initialized to that. I think the error is really at line 449, which should be:
if self._origin[0] is not None:
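For reference, the slicing behavior in question:

origins = ["file:///repo"]
origins[1:]    # [] - a slice is never None, just possibly empty
[][1:]         # [] - same for an empty list
bool([])       # False, so "if self._origin[1:]:" behaves as intended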


[...]

507: I don't know if this message will make sense to an end user - how would they end up in this situation, and how might they resolve it? The API Errors from pkg.client.api_errors seem fairly verbose - I think it would be possible to simply propagate them, unless we have additional context to add to the message, or have a way to recover from the exception.
It certainly made sense to Drew and Alok. They end up in this situation if the pkg version is different from our version. I believe their errors weren't quite as good in this case. Or they didn't really help Drew and Alok when they saw them.

Ok. From a developer's standpoint, "API version specified" makes sense. I'm just thinking that someone unfamiliar with IPS internals the way we are might see that message and have no idea what they did wrong or how to fix it. If this is an issue that will only ever be seen by developers (i.e., we'll catch API version changes and resolve them before this reaches an end user), then no need to worry about it. And I think that's the case, so carry on, nothing to see here.


569: What is this clause catching? The message seems rather specific to a failure at line 563, and I imagine it could be constrained to a single exception class.
Either 563 or 565 or 566. If any of these raise an exception the destination hasn't been specified correctly.

Sorry, I wasn't clear on that. What exception types are we catching here? They should be listed explicitly, rather than having a bare 'except' clause.


575-578: If im_type.zone is a bool, simply set "self.is_zone = im_type.zone". If it's something else, use "self.is_zone = bool(im_type.zone)"
 My C background is showing here. Will change.

642: Nit: Initialize value_str to an empty string, and lines 646-648 aren't needed.
This code is moving to DOC so you'll need to consult with Darren on this.

Ok, forwarding it on to Darren in a separate email.



712, 726: A more specific class than "Exception" would be preferred here.
Yeah.

683: In looking at the complexity of the branching, I wonder if a lot of the publisher related code, here and elsewhere, would be simplified by having a single list of publishers, stored in "search" order. The first item in the list would be what's known as the "preferred" publisher, and the rest are additional publishers, in order of precedence. This would seem to map more naturally to how the publishers on a system are laid out these days (preferred is nothing more than a way to say "search this publisher first").

kemit...@kemobile-work->~ 0 $ pkg publisher
PUBLISHER                                 TYPE     STATUS   URI
opensolaris.org (non-sticky, preferred)   origin   online   http://ipkg.sfbay/dev/
contrib                                   origin   online   http://pkg.opensolaris.org/contrib/
extra                                     origin   online   http://ipkg.sfbay/extra/
kemit...@kemobile-work->~ 0 $ pfexec pkg set-publisher --search-before=opensolaris.org extra
kemit...@kemobile-work->~ 0 $ pkg publisher
PUBLISHER                                 TYPE     STATUS   URI
extra (preferred)                         origin   online   http://ipkg.sfbay/extra/
opensolaris.org (non-sticky)              origin   online   http://ipkg.sfbay/dev/
contrib                                   origin   online   http://pkg.opensolaris.org/contrib/

Actually, it's not too different. We start out with publishers stored in order, first being preferred. Unfortunately preferred and rest are set in IPS via very different mechanisms so it was easier to break them out for internal storage.

Fair enough. I wasn't sure how the pkg.client.api differed from the CLI in how this is handled, so I was mostly making a blind suggestion there.


745: Nit: The "+" and the "\" at the end aren't strictly necessary (The "\" is implied since there is an unclosed parentheses; the "+" is not required for concatenation of inline string constants) (Saw this in a few places, truly not a big deal, just thought I'd point it out for future reference)

837-841: Nit: Enclose in an "if self.logger.isEnabledFor(logging.DEBUG):" clause, for reasons similar to prior logging comments.

So you're saying that if all the code does is log something, it should be enclosed? Reason being? It seems to me that all you save is the code not being executed, but at the expense of readability. And lots of other logger.debug statements are being executed.

Yeah, that was me being overzealous again. This scenario wasn't needed.

In a hypothetical scenario like the one below, you'd want to enclose the call in such an "isEnabledFor" clause, to avoid doing unnecessary processing, but this code isn't actually processing anything so there's no need here.

if self.logger.isEnabledFor(logging.DEBUG):
    # Takes a few seconds to run, so don't run it if we're not going to
    # end up logging it anyway.
    some_information = self.process_lots_of_information()
    self.logger.debug("Some information:\n%s", some_information)

869-871: Use "elif"?
Sure.

881: Initializing self.args to an empty dictionary would remove the need for this type of logic.
Yes. Will look to make sure this doesn't break something else first though.

Understood.


transfer_p5i.py:

77-82: Remove try/except clause, and let exceptions propagate?
Sure.

67, 87: You've popped the first publisher twice - was that intended?
Yes. In this case, the first publisher is the p5i file and the 2nd is the preferred and I want both of those dealt with and only additional ones left.

Ok. When I reread the code and the comments again, that makes a bit more sense. It may or may not be worth adding a comment around 67, or in a docstring, about the structure of pub_list.
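Something as simple as this near line 67 would probably have saved me the re-read (adjust if I've mis-stated the layout):

# pub_list layout, as handed to this checkpoint:
#   pub_list[0]  - the publisher defined by the .p5i file itself
#   pub_list[1]  - the preferred publisher
#   pub_list[2:] - any additional publishers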



transfer_prog.py: [...]

101: When would this occur?
The df code (in the tree at usr/src/cmd/fs.d/df.c) allows for the case where -1 could be there.

Would it be better to raise the error after line 81 then?


103: This looks odd as well - should the second equals sign be a minus sign?

Yup. See above.
118: Could this be moved up to line 98 as the condition on the "while" statement?
The first time through you don't have pct yet so it would involve more work than I think is necessary.

Ok.

[...]
103-133, 162-190: This seems to overlap with the same functions in prior files; would a common meta-class be worthwhile, defining these basic things?

Possibly? So you're thinking of a class like TransferClass, which TransferCPIO, TransferIPS, etc. are all children of, and from which they would inherit some of these methods?

Yes, something like that. Very lightweight, minimal number of methods and a small set of common attributes.
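A very rough sketch of what I'm picturing (names and the exact method set are placeholders for discussion):

from abc import ABCMeta, abstractmethod

class AbstractTransfer(object):
    """Minimal common base for TransferCPIO, TransferIPS, TransferP5I, ..."""
    __metaclass__ = ABCMeta

    DEFAULT_SIZE = 1000    # shared fallback estimate (placeholder value)

    def __init__(self):
        self.give_progress = False

    def get_progress_estimate(self):
        # Remember that the caller wants progress reporting (see the
        # give_progress discussion above), then defer to the subclass.
        self.give_progress = True
        return self._estimate_size()

    @abstractmethod
    def _estimate_size(self):
        """Subclasses calculate their own size estimate."""

    @abstractmethod
    def execute(self, dry_run=False):
        """Perform the actual transfer."""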


[...]


_______________________________________________
caiman-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/caiman-discuss
