Keith,
Thanks for reviewing. Responses below.
On 06/23/10 00:26, Keith Mitchell wrote:
Hi Dermot,
First comment - the link given points to the wiki "preview" page which
requires edit privileges. The correct link would be:
http://hub.opensolaris.org/bin/view/Project+caiman/OpenSolarisInstaller+-+ManifestParser+and+ManifestWriter
Sorry for the error and thanks for correcting this.
My comments are below. Overall, the design looks good; I have a few
questions about the behavior of the interfaces, as well as the
exceptions exposed. (Thanks for calling out the differences from the
prior ManifestParser - that was interesting and a valuable point of
comparison)
3.4.1.1: The behavior and defaults for validate_from_docinfo leaves
something to be desired in the scenario of the XML not having a
DOCINFO section with a listed DTD. Better default behavior could be
achieved by setting the default to None, and then, "if
validate_from_docinfo is None" AND the supplied XML file has a given
DTD, perform the validation. If the caller explicitly wants validation
from docinfo, they pass in True (in which case, absence of the
DOCINFO/DTD data would be an error, and if they explicitly don't want
validation, then pass in False.
I fully expected that this bit would need to be re-worked, so I'm glad you
commented on this. I was basically trying to map onto the functionality
provided by lxml and allow the document to be parsed and validated in
one pass, instead of having to parse it first and see if it has a DTD before
validating.
In any event, the time taken for ManifestParser is miniscule compared to
other checkpoints, so it's not really significant, but I like your
suggestion,
where validate_from_docinfo = [None|True|False] which allows for a
sensible default and still supports one-pass parsing -and-validation, if
desired.
I'll change to do as you suggest.
[btw, the error returned by lxml for DTD validation failure differs slightly
depending of whether single-pass or two-pass parsing and validation is done.
So, perhaps this is a reason not to support single-pass: to keep error
reporting
consistent?]
page 13 - execute/dry_run behavior: I think a dry_run for
ManifestParser or ManifestWriter would still add items to the DOC.
Consider for example, a full dry-run of AI - the subsequent
checkpoints would need the data to perform their dry_run actions.
Makes sense. I'll change it so dry_run will be ignored for ManifestParser.
page 13: execute/raises: For an application attempting to recover from
such an error, how would they differentiate between the different
error cases and inform the user? If the manifest doesn't validate, how
can the app inform the user exactly where in the XML file the parsing
failed? I think ManifestParser needs a very solid, usable method for
informing the user of how to fix an error in their manifests, which
means the errors raised need to be readily understandable by the
application and provide plenty of data, so I think this section should
be expanded. For example, right now, depending on the error one has in
an AI or DC manifest, it's possible to end up with a generic, and
useless "could not parse manifest" error with no additional data -
that doesn't really help an end-user figure out that they had a typo
on line 78 and forgot to close the "<pkg_repo_addl_authority" tag.
In particular, subclasses of ManifestError that group the errors into
different cases would be valuable. And for some of those cases, the
error may benefit from providing more than simply string data. (e.g.,
line number, etc.)
lxml returns a fairly useful text description for XML syntax errors,
etc. My intention was that all
ManifestParser errors, from cannot-find-file to does-not-validate would
return the same exception,
but the exception's message would give a detailed description, eg:
Unable to read Manifest file [dc.xml]: No such file
or
XML syntax error in Manifest file [dc.xml]: Opening and ending tag
mismatch: software_spec line 59 and transfer, line 66, column 12
or
Validation against DTD [dc.dtd] failed: No declaration for attribute
mod_name of element checkpoint, line 43, column 27
and the application would just present this message to the user.
If more finely grained control is needed, then I can certainly sub-class
ManifestError as you suggest,
although offering the line number in a separate value seems excessive -
would we really use this
in our applications?
An indication of how the ManifestErrors wrap around or enhance the
errors raised by lxml would be valuable as well.
The exceptions being caught from lxml are:
IOError
etree.XMLSyntaxError
etree.DTDParseError
In all cases, a ManifestError is raised, taking its message from the
above exception
(plus some additional text).
When DTD validation fails, lxml returns an error code and puts the
reason in an
error log. ManifestParser raises ManifestError, using the text from the
error log
(plus some additional text)
I'll add a summary of the above to the document.
page 14: What factors, if any, would affect the value returned from
get_progress_estimate() for this checkpoint? For example, transfer/IPS
depends on the number and size of packages being installed - are there
any similar factors at play here, or is it expected to be static (I'm
mostly just curious - I can't think of anything off-hand).
- size of manifest and DTD files, but this would not be significant
- network access is disabled for loading XML and DTD files,
so that should not be a factor
I can't imagine get_progress_estimate() will return anything other than
1, really.
Also, I should add a comment to the spec to state that ManifestParser
and ManifestWriter
will not be providing progress updates during their execution.
Execution time is
expected to be minimal (c. 0.12 seconds in my prototype), so updating
progress
would not be very useful.
page 16: ManifestWriter/execute/dry_run behavior: For the
ManifestWriter, on the one hand it could be argued that a dry_run
could still benefit from presenting the user with an output manifest
(perhaps just on stdout?). Depending on how a user or developer
running an app specifies dry_run and an output file, it may be best to
just always dump to XML.
How about if I write it to the log, but do not create the output file,
when dry_run = True?
page 16: execute/raises: Again, more detail would be valuable,
although the ManifestWriter is probably better protected from end-user
errors/typos then the parser so this is perhaps less critical.
As per ManifestParser, I can sub-class the exception class if more precise
errors are required?
Thanks,
- Dermot
Thanks,
Keith
On 06/17/10 06:02 AM, Dermot McCluskey wrote:
Hi,
I hope there is still some enthusiasm left for reviewing design specs?
I'd like to request a review for my design for ManifestParser and
ManifestWriter.
The latest version of the document can be found here:
http://hub.opensolaris.org/bin/preview/Project+caiman/OpenSolarisInstaller+-+ManifestParser+and+ManifestWriter
As this component is slightly less complex than others that have
recently been
reviewed, I'd like to request that comments be sent by Friday June
25th. If this
is not practical, I will extend this date.
Please let me know if you intend to review this so I can be sure I
have enough
reviewers. I would particularly appreciate comments from some of the
following:
Jack, Alok, Karen, Darren, Sarah.
Thanks,
- Dermot
_______________________________________________
caiman-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/caiman-discuss
_______________________________________________
caiman-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/caiman-discuss
_______________________________________________
caiman-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/caiman-discuss