Hi Dermot.

Thanks for taking the time to review.

On 03/22/11 11:11 AM, Dermot McCluskey wrote:
Jack,

Here are my review comment for group [B].



On 03/14/11 11:44 AM, Dermot McCluskey wrote:
(B) The Manifest Input Module, focus on overlay
   usr/src/lib/install_manifest_input/Makefile

All good - no comments.

   usr/src/lib/install_manifest_input/__init__.py

All good - no comments.

   usr/src/lib/install_manifest_input/process_dtd.py

First, a general question: I'm sure you've answered this before, so please
forgive me not recalling the answer:  why do you need to write your
own DTD parser?  While I don't know for sure, I'm surprised that lxml,
or some other common module does not provide some or all of this
type of functionality?
Well, I was surprised too at first... Actually there aren't many occasions when Joe User (a.k.a. me) would want to parse a DTD. An XML file... of course, but not a DTD.

The reason why I needed to do this is because add and overlay need to know where to place a new node with respect to its new siblings and whether more than one node with the same tag as the one being added is allowed. This info is only in the DTD.

I could have tried the node between each set of neighboring siblings until xmllint gave me a status I wanted, but there were a number of drawbacks to this approach, the two nastiest of which were that I had to parse xmllint error messages which were not documented and could change, and that it was very clunky.

I also looked thoroughly through libxml (C and python) for the functionality I needed and couldn't find it. After that, I emailed an external libxml alias and asked, and got back some responses similar to yours, and after about a month.

So I wrote the parser. I only parse what I need of the DTD, to keep it simple. Since order of attributes is not an issue, and there can be only one attribute with a given name, no attribute information is processed, only elements.

If you want more gory details, please ping me.
Specific issues:

  75     while (result != line):
...
 106         line = result

Why do you need a while loop here?  I cannot see any circumstances
where this could loop more than once?
This is why a second pair of eyes is helpful! Now that you point it out, line 106 (line = result) always executes before the test of while (result != line), so this loop never would execute a second time...

 110 # Assume splitchar is not ( or )

Why is this comment here?  This comment is repeated insode the docstring
on line 119
Removed from 110.

 234             self.type = "SEQ"
 247             self.type = "OR"
 256             self.type = "ELEM"

using strings as type identifiers here doesn't seem right. Would it not be better
to do something similar to the Search states, ie

189                TYPE_SEQ, TYPE_OR, TYPE_ELEM = range(3)
...
234                            self.type = DDObj.TYPE_SEQ
etc?
Yeah... Code is easier to read the way I did it, but it really isn't efficient... Changed.

 340                 if (nocomline.startswith("<!ELEMENT")):

Suggest making "<!ELEMENT" a global, like ENTITY_TITLE?
OK.

 397         try:
 398             parts = line.split(None, 2)
 399         except IndexError:
400 raise MimDTDExpnError(MimMsgs.ERR_BAD_ELT_LINE % {"mline":line})
 401
 402         name = parts[1]
 403         children = parts[2]

You are wrapping the wrong section of code in try/except here.
line.split() will not raise IndexError, but accessing
parts[1] and parts[2] might.
Yup.  Thanks, fixed.

424 Note: only an element's immediate children are of concern, since the 425 search function which uses the data structure needs only to return only
 426         immediate children of an element.

Nit: one too many "only"s in this comment!
Thanks.  Fixed.
444 items = items[0:-1] # in the SEQ or OR list that keeps those items.

 491             items = items[0:-1]

Suggest these would be clearer if written as:
items.pop()
Thanks for the suggestion. Actually, items is a string not a proper list. I have updated the method header to clarify this.

636 MimDTDInternalError - _search_element_info: invalid type "%s

Nit: stray double-quote?
Fixed.

 766             return (None)

Should this not return two values, as per lines 776, 778, 780?
Now returns (None, None) as second value is really N/A if first is None.


usr/src/lib/install_manifest_input/test/test_manifest_input_overlay.py

168 Popen.check_call([self.DIFF, self.FULL_XML, self.AIM_MANIFEST_FILE])

Is there an implied assertion here? If there is, perhaps it would be better to explicitly state (in a comment) what is being checked? eg non-zero return
code will raise an exception, etc.
OK.  I added
  # Raises an exception if diff command finds differences from original.




   usr/src/lib/install_manifest_input/test/test_process_dtd.py

All good - no comments.


- Dermot
Thanks for your thorough review.  I really appreciate it.

    Thanks,
    Jack

_______________________________________________
caiman-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/caiman-discuss

Reply via email to