Re: [caiman-discuss] Code review request: ManifestParser

Dermot McCluskey Tue, 05 Oct 2010 10:08:22 -0700

Keith,

Many thanks for reviewing so thoroughly.


An updated webrev addressing these issues is at:
http://cr.opensolaris.org/~dermot/webrev_mp_mw_02/

Responses below.



On 10/04/10 22:38, Keith Mitchell wrote:

 On 09/30/10 07:40 AM, Dermot McCluskey wrote:
Hi All,

I'd like a code review for the ManifestParser/ManifestWriter
checkpoints for the CUD project.  The changes are here:
   http://cr.opensolaris.org/~dermot/webrev_mp_mw_01/

Please sent comments by 10/12/2010.
Note: this webrev just includes the new files underusr/src/lib/install_manifest.There are also a very small number of changes to higher-levelMakefiles, etc
which have been omitted here for convenience.
Hi Dermot,
I can't quite tell from the webrev - will these be under thesolaris_install python package as a sub-package, e.g.,solaris_install/manifest, solaris_install/manifest/common,solaris_install/manifest/parser?

I've pulled the system-library-install manifest changes from the CUDgate into

my updated webrev.

They will all go in solaris_install/manifest.  The layout will be:
   vendor-packages/solaris_install/manifest/
   vendor-packages/solaris_install/manifest/__init__.py
   vendor-packages/solaris_install/manifest/parser.py
   vendor-packages/solaris_install/manifest/writer.py

import statements would look like:
   from solaris_install.manifest import ManifestError
   from solaris_install.manifest.parser import ManifestParser
   from solaris_install.manifest.writer import ManifestWriter
(although if you're using the InstallEngine, you won't need to
import much.)

Specific comments below.

- Keith

common/__init__.py:
42: I can't recall - will storing the Exception preserve the originaltraceback as well? I seem to remember thinking that the traceback mustbe explicitly grabbed (via, e.g., the traceback module) if desired forlater. It may be worth adding that instrumentation to this class (or,logging the traceback may be sufficient - see below on logging.exception)

I've added the traceback from the original exception to the attributesof ManifestError:

       self.orig_traceback = sys.exc_info()[2]

I haven't added it to the str() output, but I've tested that it can beaccessed from

user code with something like:
      import traceback
      print("Traceback: %s" % traceback.format_tb(error.orig_traceback))

Is that sufficient?

62: As this function is intended for use by modules other than thefile it's defined in, I think it's more appropriate to define withoutthe leading underscore.

I used the underscore to highlight that it is not intended to be apublic API,

but I don't mind removing it it you think that's better.
Fixed.

87: In general, it's better to use logger.exception in 'except'clauses - this will automatically log things like traceback data, etc.(Obviously, it depends on the scenario - sometimes it's not needed -but in general, it's better to have the traceback data than to end upwanting it later)


Fixed throughout.

parser.py:
166-7: Why not just "if validate_from_docinfo not in [None, True,False]:"? Although I'm generally of the opinion that there's no needto explicitly verify that a parameter is a bool or not. Just check for"is (not) None" at the appropriate point, then run "ifvalidate_from_docinfo" at the appropriate point. Type-checking inPython (particularly for bools) shouldn't be needed in most cases.


Fixed - removed all "in [True, False]" checks throughout.


178-186: Same comment regarding type-checking of bools.


Fixed.

231: While most of our code is probably not terribly performancesensitive when it comes to formatting string operations, it's bestpractice, when making logging statements, to use the form:
LOGGER.debug("ManifestParser.parse(doc=%s) called", doc)
(Note that doc is a separate arg, and not cast to a string first,rather than pre-formatted using %)This has the advantage of, if LOGGER is not enabled for the debuglevel, the string formatting doesn't occur. (Note that the call tostr(doc) is unnecessary regardless - '%s' will call str() on thepassed in object)


Fixed throughout.

235: I don't think this should be strictly needed - just pass in_validate_from_docinfo. "None" evaluates to "False" in a booleancontext. Only reason to cast this would be if the dtd_validation argto etree.XMLParser has differing behavior for None vs. False - and ifthat's the case, just pass bool(_validate_from_docinfo) to the_load_manifest() call, no need for the separate if/else block here.


I think at some point in the development this was needed, but agree it's not
needed now.  Removed.


270, 338, 345: Use logging.exception


Fixed.

273: Would it make sense to have parse() return a reference to the xmltree? This wouldn't change the execute behavior, and would allow forparse() to have some additional value when it doesn't get a DOCparameter. (Mostly useful for diagnostics, probably)

I'd rather not. MP/MW are really meant to be closely coupled with CUDand, in

particular, the DOC.  We talked previously to Ethan, for example, about how
other components that don't use the Engine/DOC/etc can very easily parse XML
documents on their own using lxml, without using MP.

If we want MP to be a general purpose XML handler, then we should probably

step back and re-look at the overall design to see what functionality isneeded.


295: See previous comment on string formatting for logging operations.


Fixed.

351: Along the same lines as the string formatting for loggingcomment, logging statements like this should be enclosed in 'if'blocks like so:
if LOGGER.isEnabledFor(logging.DEBUG):
    LOGGER.debug(some_function_with_potentially_long_execution_time())
Again, it probably isn't much of a strain here, but it's good practiceto try to isolate it.


Fixed here and similar call in writer.py.


writer.py:
29-32, 34-37: Nit: Alphabetize imports


fixed


137, 201, 211, 221, 227, 251, 291: Update logging statement


I don't believe 221 and 227 need to be changed because:
- they are calls to LOGGER.error, not LOGGER.debug and we should
 always want to write out errors
- the msg string is used twice, so it definitely needs to be evaluated,
 even if we are not logging it

All others fixed.


149: See previous comment about type-checking bools


fixed


264: This will error out if the open() call on 256 failed.


fixed:
       manifest_file = None
       try:
           manifest_file = open(self._manifest, mode='w')
           manifest_file.write(text)
       except IOError, error:
           <skip>
       finally:
           if manifest_file is not None:
               manifest_file.close()

Test code:
Note: I skipped over the test XML/DTD/XSLT files, unless they needexplicit reviewing, focusing on the actual test code. Also, much as Ilove test code, I find it hard to review (sometimes difficult tocorrelate test cases with tested code if you're not directly workingon either - this isn't a problem with your code, just with my eyes),so if I comment on a "missing" test case that exists, just let me know.
test/common.py:
78-97: lists in Python support negative indices - so this could becondensed to:
try:
    line = lines[lineno].strip()
except IndexError:
    return False
else:
    return (line == string)


D'oh!  Thanks.

test_mp_with_engine.py:
General: This module, in particular, doesn't need much (if anything atall). There's no strict logic in the execute() method other than thedoc verification, so unit tests don't add significant value beyondwhat's more readily testable specifically against the parse() method.

Agreed. But I think of this module as a bit of an integration test -testing that

MP works as expected in relation to the InstallEngine and DOC.  Also, I
tend to think of use via the InstallEngine as the "default" mode of MP and

the non-Engine APIs as "add-on" functionality. So, I've put most testsin the..._with_engine modules, unless it happened to be easier to check theoutcome

in the ...without_engine module.

51: Should be done in tearDown(), so that test modules that run*after* this one can sanely instantiate the Engine if needed.(Additionally, I think somewhere in CUD there is an engine resetfunction for use in test code - use that. I believe it also resets theloggerDict)


fixed.  Using get_new_engine_instance() and reset_engine() throughout now.

78, 100, 137, 153, 171: Use self.assertEqual(status,InstallEngine.EXEC_SUCCESS, message)


fixed throughout.

84: Use assertEqual (better default msg on failure)


fixed.

test_mp_without_engine.py:
Missing: Tests that run _load_manifest() directly (if some of theexisting tests hit that logic specifically, they should be written tocall _load_manifest directly, rather than via parse()). Only a fewtest cases are needed - specifically, maybe 1 success case withxinclude, 1 with dtd_validation=True, 1 with attribute_defaults=True;1 error case where an IOError -> ManifestError occurred, and 1 whereXMLSyntaxError -> ManifestError occurred.


_load_manifest is not a public API, it is only a convenience function.
So, if I can test the code without directly calling this function, should
I not stick to testing the APIs?

btw,  I should have mentioned that the coverage figures I'm getting from
running my tests are:

% ./slim_test lib/install_manifest/test/ --with-cover
...

Name Stmts ExecCover Missing

---------------------------------------------------------------------------------
...
solaris_install.manifest                               30     30   100%
solaris_install.manifest.parser                        78     78   100%

solaris_install.manifest.writer 89 8898% 198

...
solaris_install/manifest/parser                        78     78   100%

solaris_install/manifest/writer 89 8898% 198


(although I have a feeling these are slightly exaggerated.)

test_mw_with_engine.py: Similar comment to test_mp_with_engine. Someof these tests seem to duplicate those in test_mw_without_engine. Thetwo relevant cases unique to the execute() function itself are thedry_run flag, and the "doc is None" error condition. (This file couldbe readily combined with the dry_run test file)


I would have thought that in general, more tests are better, and there's no
need to to attempt to consolidate into fewer unit tests.

Is that not the case?

Other:
is _validate_manifest tested directly anywhere? It appears that someof the writer/parser tests might be covering symptoms of_validate_manifest that would be better served by testing the functiondirectly.

Yes - it is definitely tested, for both positive and negative results,but via calls

to parse() and/or execute_checkpoints(), not by calling it directly.


Thanks,
- Dermot


_______________________________________________
caiman-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/caiman-discuss

Re: [caiman-discuss] Code review request: ManifestParser

Reply via email to