[OPM] Strict parsing

Joakim Hove Tue, 09 Dec 2014 09:15:20 -0800

Hello;

Andreas currently has the following two PRs against OPM-parser:


https://github.com/OPM/opm-parser/pull/355
https://github.com/OPM/opm-parser/pull/347

One of them has the title "Remove the strict parsing concept" and the other
one is about how to handle defaulted items, particularly defaulted items
without a properly set default value. The "strict parsing" behaviour of the
parser was quite mediocre, and I support that removal. However, both of
these PRs touch upon the general balance which must be found between
strictness and correctness on the one hand and ease of use and flexibility
on the other. Andreas and I have discussed this at some length, but it is
an important topic and I would therefore like it if others joined in with
comments. I have not personally hammered out a vision of exactly how this
should be, so I must admit that my review comments and decisions have not
always been as well thought out as they should have been.

The direction I would like to steer towards for these questions can be
summarized as:


*The parser*: The parser should serve as a glorified tokenizer; i.e. it
should correctly split the contents of the file into keywords, records and
items - and the items should be parsed as the correct fundamental type
(i.e. string/int/double). This means that the parser should raise an
exception in the following situations:


   1. Failed to open include file.
   2. Cannot parse item - e.g. "xxx" is not a numeric type.
   3. Getting confused with record/keyword termination.

These things should be OK:

   1. An unrecognized keyword, which is then ignored (current status: throw)
   2. <?> Too short keywords <?/> [This currently throws - Andreas has a PR
   on that as well]
   3. Missing keywords.
   4. Keywords out of order or in wrong section.
   5. Keywords with invalid values (e.g. PERMX < 0)
   6. Invalid Keyword <-> Keyword interactions; e.g. MULTREGT based on an
   undefined numerical array.

Observe that points 1 & 2 on the white list are potentially in conflict
with point 3 on the black list: relaxing these two points will increase
the chance of confusing the parser, so that it loses track and ends up
with an exception anyway.
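
The split between the two lists can be sketched roughly as below. This is
only an illustration and not the opm-parser API: the names parse_deck,
ParseError and KNOWN_KEYWORDS are made up for the example, and the deck
format is simplified by assuming that every keyword is followed by exactly
one record terminated with "/".

```python
# Hypothetical sketch of a "glorified tokenizer"; not the opm-parser API.
# Assumption: every keyword is followed by exactly one record ending in "/".

# Assumed lookup table: known keyword -> fundamental type of its items.
KNOWN_KEYWORDS = {"DIMENS": int, "PERMX": float}


class ParseError(Exception):
    """Raised only for the black-list situations."""


def parse_deck(text):
    """Return (deck, ignored): parsed (keyword, items) pairs and skipped keywords."""
    tokens = []
    for line in text.splitlines():
        line = line.split("--", 1)[0]  # drop "--" comments
        tokens.extend(line.split())

    deck, ignored = [], []
    i = 0
    while i < len(tokens):
        keyword = tokens[i]
        i += 1
        raw = []
        while i < len(tokens) and tokens[i] != "/":
            raw.append(tokens[i])
            i += 1
        if i == len(tokens):
            # Black list 3: confused about record termination.
            raise ParseError(f"record of {keyword} is not terminated with '/'")
        i += 1  # skip the terminating "/"

        item_type = KNOWN_KEYWORDS.get(keyword)
        if item_type is None:
            # White list 1: unrecognized keyword is skipped, not an error.
            ignored.append(keyword)
            continue
        try:
            items = [item_type(tok) for tok in raw]
        except ValueError as exc:
            # Black list 2: item is not of the expected fundamental type.
            raise ParseError(f"cannot parse item in {keyword}: {exc}") from exc
        # White list 5: values such as PERMX < 0 are not checked here.
        deck.append((keyword, items))
    return deck, ignored
```

The sketch also shows the conflict mentioned above: with the input
"OIL PERMX 1 2 /" the unknown keyword OIL has no record of its own, so the
tokenizer treats "PERMX 1 2" as the record of OIL and silently drops the
PERMX data instead of raising.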


*EclipseState:* For the EclipseState object I want to be strict. As I see
it there are two levels of strictness which can be enforced on the
EclipseState:

   1. We can require that all supplied input is consistent - otherwise we
   throw. I think that is quite close to the current state, although we are
   relaxed when it comes to section ordering.
   2. We can require that all necessary input for a simulation is present -
   which is currently quite far from the situation. For example, the
   EclipseState constructor will happily return a "valid" object even if
   the input deck does not provide any PERMX values - or a directive for
   how the initialization should be performed.

If we are to provide a guarantee of type 2 above I think it should be
done with an abstract CheckEclipseState() class - then different simulators
can provide custom checkers. Furthermore I guess it would be valuable to
make more subcomponents in the EclipseState object - so that e.g.
properties can be fully validated even if you do not have any Schedule
data. This would be even more important if the EclipseState is made even
stricter.
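
A minimal sketch of what such a checker hierarchy could look like follows.
Only the name CheckEclipseState comes from the proposal above; the
EclipseState stand-in, the method names and the two concrete checkers are
hypothetical and chosen just to make the example self-contained.

```python
# Hypothetical sketch; only the name CheckEclipseState is from the proposal.
from abc import ABC, abstractmethod


class EclipseState:
    """Stand-in for the real EclipseState, holding only what the sketch needs."""

    def __init__(self, properties, has_schedule=False):
        self.properties = properties      # e.g. {"PERMX": [100.0, 200.0]}
        self.has_schedule = has_schedule  # True if Schedule data was supplied


class CheckEclipseState(ABC):
    """Abstract completeness check; each simulator supplies its own rules."""

    @abstractmethod
    def missing_input(self, state):
        """Return a list of problem descriptions; empty if nothing is missing."""

    def check(self, state):
        """Raise ValueError if the state lacks input this checker requires."""
        problems = self.missing_input(state)
        if problems:
            raise ValueError("; ".join(problems))


class PropertiesOnlyChecker(CheckEclipseState):
    """Validates the properties subcomponent without requiring Schedule data."""

    def missing_input(self, state):
        if "PERMX" not in state.properties:
            return ["PERMX is required"]
        return []


class FullSimulationChecker(PropertiesOnlyChecker):
    """Additionally requires Schedule data, as a full simulation run would."""

    def missing_input(self, state):
        problems = super().missing_input(state)
        if not state.has_schedule:
            problems.append("Schedule data is required")
        return problems
```

With this layout a state built from a deck that has PERMX but no Schedule
section passes PropertiesOnlyChecker().check(state), while
FullSimulationChecker().check(state) raises. The EclipseState constructor
itself stays unchanged and only enforces consistency (type 1).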


*Logging of warnings & errors:* It is good to have a log - but the log
object/file should not have any role in enforcing/verifying correctness in
the input.


Comments highly appreciated.


Joakim
