On 08/ 9/10 11:51 AM, [email protected] wrote:
Hi Keith,
Thank you for the sanity-check. More below and to address some comments which came up in talking with you on the phone and in e-mail:
1) The doctest implies one must overload open() to use WANBootInfo(). To
   prevent any confused reading of the doctest, I've explictily commented
   that overloading open() is only for testing. Further, I've tried using
   open("/some_file_which_does_not_exist") however, doctest seems to
   execute in Python restricted mode so open()/file() are not available.

It's good to know that about doctest (though it makes sense, in a way, that it would be restricted). The comment will work.


3) For those who might belive a class for "querying" wanboot.conf might
   also allow writting, I've made an explicit comment that the class does
not allow writting. I tried setting __setattr__ to None but that caused
   other issues which I could not eaisliy resolve.

The comment is valuable; long term, defining __slots__[1] at the class level should prevent any attempts to write arbitrary attributes (but the comment is good enough for now).

[1]http://docs.python.org/reference/datamodel.html#slots


                            Thank you,
                            Clay

On Fri, 30 Jul 2010, Keith Mitchell wrote:

[snip...]
All my prior comments on shell scripts have been addressed. All comments below are from ai_sd.py, which seems to have gotten rather complicated suddenly.

120,129: Should this be... "method"? (Why is method listed at all? It's right above the line, and any pydoc parser should have a reference to the method name)

This was a carry-over I didn't pay it much attention to; but, indeed these are misspelled and redundant. I've removed them.

405, 418: platform.processor() should provide a more parseable result, particularly for sparc.

Ah yes, indeed. Thank you, I'll use processor().

469-509: Would probably be best to move this doctest to a separate PyUnit test. It's a bit verbose to read which significantly reduces its value as a test-that-also-documents-usage - ditto for the __getattr__ function's doctest. Given what's being tested - the opening of a file - there may not really be a need to test this. I trust that the implementation of "open" in Python works.

Ah yes, but the API usage to understand what err and f are and when is useful to know IMO. If anyone doesn't want to see it, even vim supports code folds to hid the doctest - let's keep code and their tests in one place for now.

I don't particularly like to rely on folding to hide the significant number of "noise" lines (especially since a fold will hide everything in the docstring), but I suppose that it will be fine. I think I'm too picky these days...


558-559: Between these lines, and the previous function, that's a lot of effort to: open a file, catch the error, tell the caller there was an error, then have the caller re-raise an error. Let Python's builtin exceptions do what they're made for, and just open the file directly (using a "with" block to automatically close it):

with open(a_file) as the_file:
   my_data = the_file.readlines()
   # process my_data, etc.
#(File closes automatically at end of 'with' block, much like a finally clause)

Yes, but this keeps the actual implementation of the service discovery code consistent be it SPARC or X86 (lines 400-423) which I think is more important. (I'd rather abstract out the differences and present a consistent API than have architecture specific checks for IOError, etc; here if this class has an error it will propagate it and the calling code can then report it as it wishes.

While I wasn't necessarily implying to do that all "in-line" (within the concept of a platform-generic function or class was my thought), it's not enough of a sticking point that I wish to continue pressing on it. I do feel that the nested exception is harder for a caller to parse, but not significantly so in this case given the usage.


563-564, 576: A __getattr__ function that reads a file twice is very expensive for attribute access. I appreciate the magic of Python used here, but given the limited usage of this class, it seems a bit expensive. A simple method (rather than a class), that reads the file and returns a dictionary of the key/value pairs found in the file would probably be far more efficient and valuable. If it makes it more natural to store that data in a class that has a "read from file" function, then that'd be ok too.

If a full fledged class is required, perhaps the data could be cached in __dict__, and only re-read the file if the file has been updated since the last read. (os.stat(filename).st_mtime)

Yes, this is not very efficient of me; I've simplified the function, however, I still believe cacheing is a bit overkill since at this time the function is only called once anyways.

That's fine. I do have a tendency to extrapolate and look for a "better solution" when one is not yet needed. But keep the concept in mind for future reference!


Additionally, regardless of implementation, use str.partition rather than split. It's a far 'safer' function for cases of the form "key=value", as it will always return a tuple of length 3. Also, is it truly technically invalid to have multiple entries of the same name in a wanboot.conf file? bootconfchk(1M) seems to accept it, so for consistency, if it's valid, we should as well.

Hrm, that's an interesting question. I hadn't realized that it would accept multiple values. For the places we use this, it can certainly cause us issues. (For example multiple root_server keys causes my fix to net-fs-root to try to download solaris.zlib, et al. multiple times too). As we generate the wanboot.conf file at this time; this is a potential situation but one I see as very unlikely. I've filed a bug 16751 - "Multiple value per key in wanboot.conf causes AI clients confusion" to track this.

Sounds good.


580: If this stays as a class, please wrap this into the class definition (setting file=None, and then setting wanboot_conf = WANBootInfo.default_wanboot_conf when file is None should do it). Also, use "file_" rather than overriding the builtin "file" (similarly, for readability above when creating a file, use "file_" - or a more descriptive name - rather than "f").

I've added a doc string mention that the file still acquires a value despite being left as None. Certainly since Drew too stumbled on the out-of-class indentation level init() I think you're right, this should stay with the class.

Thanks.


590/600: Perhaps just create a function?

This is to keep parity across SPARC and X86 for discoverying properties; so WANBootInfo and DHCPClientInfo are reasonably similar in API.

There's a very distinct difference between the two classes in that for one, you create an instance and access an attribute, and for the other, you're not allowed to create an instance. If this is meant to be a consistent API, then DHCPClientInfo.boot_server perhaps should be "root_server" to match WANBootInfo.root_server. The DHCPClientInfo should probably also be able to be instantiated and r/boot_server should be a property.

(Those are just nits though, or comments to file away to take note of as these classes evolve)


655-680: Expanding on Drew's comments on use of Popen - if it raises an OSError, that means a required program (dhcpinfo) is missing; if it's missing than either there's something corrupt on the image/system, or there's an IPS dependency error. Both are situations that the ai_sd can't do anything about; so the best course of action is to propagate up (at the highest level, e.g in the if __name__ == __main__ block, all exceptions should be logged and printed to the user with as much information as possible to assist in evaluating the root cause). Masking the OSError is a bad idea - let it show through, as that will be easier to diagnose if the situation ever arises, and has far richer data than the DHCPError will.

No doubt there are some good ideas here to aid in debugging. However, the code is already structured to use an AI_LOG device and report Popen OSErrors this way. But you have a very valid point I should pass the OSError's message through to aid in diagnosis (does dhcpinfo(1) not exist, have restrictive permissions, etc.?).

I still believe that this should be catchable as coming from the DHCPClientInfo class, as if the consumer isn't requiring the DHCP info, it may wish to simply note and continue; if it wishes to propagate it as long as the necessary info is there it can pass it along to the user (as ai_sd.py does). I think it's good hygiene for the class to report that it had an issue, not potentially cause a higher-level consumer to believe it had an issue (i.e. each layer should catch its own errors -- follow an encapsulating design).

I know I said I'd back off on this, but something about this usage in particular bothers me even more than the WANBootInfo case. I'll try to present my case a little more clearly (this is the last time for this code review - even if nothing changes in this code immediately, perhaps I can sway you away from this type of 'encapsulation' for the future).

By lumping all the exceptions that occur during this function call into a single bucket, you're simply begging for bugs to be masked. There's a reason the Python exception hierarchy is not a flat listing of "ListError", "DictError", "TupleError", etc. It's because callers don't care what the class is that originally raised the error - they care what the *cause* is. That's why the hierarchy has "TypeError," "ValueError," "OSError," and so forth.

Callers will handle errors (or not) based on the cause, not based on the source class. They may need to handle a missing binary separately from a failed call into that binary - even if that binary call is wrapped up in a class definition. A consumer of a class needs to know enough about that class to know the different exceptions it could raise and why, and be ready for those cases it can handle (and let those situations it can't handle propagate up).

If a caller truly doesn't care about the cause, Python has a way to handle multiple exceptions in a single clause (or one can handle all exceptions unilaterally with an "except StandardError" though that should really be used sparingly). But the class shouldn't be making the decision for the caller that the caller will handle all error cases the same way - and re-wrapping every exception into a single class is forcing the caller to handle them all the same way.

Finally, any value of lumping exceptions together by class goes away *completely* when you consider that, underneath it all, Python is weakly typed. Between subclasses, duck-typed objects, mixins, metaclasses, etc. The moment you step away from "I created this object" and into "I got an object as a parameter, and it should be a <class> or <class-like> object with a foobar function that raises XYZ for a given error condition", encapsulation of error conditions starts looking really ugly.


For ValueError, it should only arise in one of two cases: programming error, or passing in a dynamic (e.g. user-selected) value. In the former case (which is the case for the usage scenarios in this file), again, masking with a DHCPError would make it hard to debug; in the latter, it would be acceptable to catch, but only by the portion of the program "in control of" the dynamic value.

I appreciate your concern about ValueError being a programming error not a run time issues, and as ValueError can only be returned under 2 cases from subprocess.Popen() -- and we aren't going near them -- I've pulled that exception check out of both the new and old code.

Thank you.


Overall, the original implementation was much cleaner; really, it just needed to check the retcode of the Popen object and assert that stdout returned a usable value (and raise an exception as needed).

No, unfortunately, that was not possible. My original version was not modular enough -- we needed to be able to pull functionality out to test it.

You're correct. Half my brain (simple, direct code) is sometimes at odds with the other half (direct, modular code) - the modular side wins in this case.

- Keith


_______________________________________________
caiman-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/caiman-discuss


_______________________________________________
caiman-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/caiman-discuss

Reply via email to