Re: [caiman-discuss] Review: ManifestParser/ManifestWriter Design Spec

Dermot McCluskey Wed, 23 Jun 2010 05:40:03 -0700

Keith,

Thanks for reviewing.  Responses below.


On 06/23/10 00:26, Keith Mitchell wrote:

Hi Dermot,
First comment - the link given points to the wiki "preview" page whichrequires edit privileges. The correct link would be:http://hub.opensolaris.org/bin/view/Project+caiman/OpenSolarisInstaller+-+ManifestParser+and+ManifestWriter


Sorry for the error and thanks for correcting this.

My comments are below. Overall, the design looks good; I have a fewquestions about the behavior of the interfaces, as well as theexceptions exposed. (Thanks for calling out the differences from theprior ManifestParser - that was interesting and a valuable point ofcomparison)
3.4.1.1: The behavior and defaults for validate_from_docinfo leavessomething to be desired in the scenario of the XML not having aDOCINFO section with a listed DTD. Better default behavior could beachieved by setting the default to None, and then, "ifvalidate_from_docinfo is None" AND the supplied XML file has a givenDTD, perform the validation. If the caller explicitly wants validationfrom docinfo, they pass in True (in which case, absence of theDOCINFO/DTD data would be an error, and if they explicitly don't wantvalidation, then pass in False.


I fully expected that this bit would need to be re-worked, so I'm glad you
commented on this.  I was basically trying to map onto the functionality
provided by lxml and allow the document to be parsed and validated in
one pass, instead of having to parse it first and see if it has a DTD before
validating.

In any event, the time taken for ManifestParser is miniscule compared to

other checkpoints, so it's not really significant, but I like yoursuggestion,

where validate_from_docinfo = [None|True|False] which allows for a

sensible default and still supports one-pass parsing -and-validation, ifdesired.


I'll change to do as you suggest.

[btw, the error returned by lxml for DTD validation failure differs slightly
depending of whether single-pass or two-pass parsing and validation is done.

So, perhaps this is a reason not to support single-pass: to keep errorreporting

consistent?]

page 13 - execute/dry_run behavior: I think a dry_run forManifestParser or ManifestWriter would still add items to the DOC.Consider for example, a full dry-run of AI - the subsequentcheckpoints would need the data to perform their dry_run actions.


Makes sense.  I'll change it so dry_run will be ignored for ManifestParser.

page 13: execute/raises: For an application attempting to recover fromsuch an error, how would they differentiate between the differenterror cases and inform the user? If the manifest doesn't validate, howcan the app inform the user exactly where in the XML file the parsingfailed? I think ManifestParser needs a very solid, usable method forinforming the user of how to fix an error in their manifests, whichmeans the errors raised need to be readily understandable by theapplication and provide plenty of data, so I think this section shouldbe expanded. For example, right now, depending on the error one has inan AI or DC manifest, it's possible to end up with a generic, anduseless "could not parse manifest" error with no additional data -that doesn't really help an end-user figure out that they had a typoon line 78 and forgot to close the "<pkg_repo_addl_authority" tag.
In particular, subclasses of ManifestError that group the errors intodifferent cases would be valuable. And for some of those cases, theerror may benefit from providing more than simply string data. (e.g.,line number, etc.)

lxml returns a fairly useful text description for XML syntax errors,etc. My intention was that allManifestParser errors, from cannot-find-file to does-not-validate wouldreturn the same exception,

but the exception's message would give a detailed description, eg:

Unable to read Manifest file [dc.xml]: No such file
or

XML syntax error in Manifest file [dc.xml]: Opening and ending tagmismatch: software_spec line 59 and transfer, line 66, column 12

or

Validation against DTD [dc.dtd] failed: No declaration for attributemod_name of element checkpoint, line 43, column 27


and the application would just present this message to the user.

If more finely grained control is needed, then I can certainly sub-classManifestError as you suggest,although offering the line number in a separate value seems excessive -would we really use this

in our applications?

An indication of how the ManifestErrors wrap around or enhance theerrors raised by lxml would be valuable as well.


The exceptions being caught from lxml are:
   IOError
   etree.XMLSyntaxError
   etree.DTDParseError

In all cases, a ManifestError is raised, taking its message from theabove exception

(plus some additional text).

When DTD validation fails, lxml returns an error code and puts thereason in anerror log. ManifestParser raises ManifestError, using the text from theerror log

(plus some additional text)

I'll add a summary of the above to the document.

page 14: What factors, if any, would affect the value returned fromget_progress_estimate() for this checkpoint? For example, transfer/IPSdepends on the number and size of packages being installed - are thereany similar factors at play here, or is it expected to be static (I'mmostly just curious - I can't think of anything off-hand).


- size of manifest and DTD files, but this would not be significant
- network access is disabled for loading XML and DTD files,
 so that should not be a factor

I can't imagine get_progress_estimate() will return anything other than1, really.

Also, I should add a comment to the spec to state that ManifestParserand ManifestWriterwill not be providing progress updates during their execution.Execution time isexpected to be minimal (c. 0.12 seconds in my prototype), so updatingprogress

would not be very useful.

page 16: ManifestWriter/execute/dry_run behavior: For theManifestWriter, on the one hand it could be argued that a dry_runcould still benefit from presenting the user with an output manifest(perhaps just on stdout?). Depending on how a user or developerrunning an app specifies dry_run and an output file, it may be best tojust always dump to XML.

How about if I write it to the log, but do not create the output file,when dry_run = True?

page 16: execute/raises: Again, more detail would be valuable,although the ManifestWriter is probably better protected from end-usererrors/typos then the parser so this is perhaps less critical.


As per ManifestParser, I can sub-class the exception class if more precise
errors are required?


Thanks,
- Dermot

Thanks,
Keith

On 06/17/10 06:02 AM, Dermot McCluskey wrote:
Hi,

I hope there is still some enthusiasm left for reviewing design specs?
I'd like to request a review for my design for ManifestParser andManifestWriter.
The latest version of the document can be found here:
http://hub.opensolaris.org/bin/preview/Project+caiman/OpenSolarisInstaller+-+ManifestParser+and+ManifestWriter
As this component is slightly less complex than others that haverecently beenreviewed, I'd like to request that comments be sent by Friday June25th. If this
is not practical, I will extend this date.
Please let me know if you intend to review this so I can be sure Ihave enoughreviewers. I would particularly appreciate comments from some of thefollowing:
   Jack, Alok, Karen, Darren, Sarah.


Thanks,
- Dermot

_______________________________________________
caiman-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/caiman-discuss
_______________________________________________
caiman-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/caiman-discuss


_______________________________________________
caiman-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/caiman-discuss

Re: [caiman-discuss] Review: ManifestParser/ManifestWriter Design Spec

Reply via email to