G'day,

Ramaswamy wrote:
I am using bison for parsing the ASN.1 spec file. The defined syntax leads to ambiguous parsings. Consider the code fragment below:
myobject MYCLASS ::= { Integer { 1, 2 } }

It gets worse. Consider the following:


v1 T1 ::= {a 1, b 2}

The right-hand side is syntactically equivalent to a SEQUENCE value, and to a DefinedSyntax value.

In fact, a DefinedSyntax value may be syntactically equivalent to a SequenceValue/SetValue, or a SequenceOfValue/SetOfValue, or an ObjectIdentifierValue, or an argument list (there's probably a few more equivalences, but that's off the top of my head).

And any argument list is syntactically equivalent to a DefaultSyntax value: {T{1,2}} may be a valid sentence for the default syntax {&Type &value}, and for the default syntax {&Type}, and for the defined syntax {T &value}.

How can I modify the grammar to accept the input in either one of its forms without the context info?

Short answer: you cannot. In some cases, an ASN.1 value cannot be determined without full knowledge of the associated ASN.1 type. For example, what is "{a 1}" without any further context? Is it an ObjectIdentifierValue? Is it a SequenceValue? Or is it a DefinedSyntaxValue? Without the type, it's just a list of lexical tokens, nothing more.


There are two methods of dealing with this ambiguity:

(a) When parsing, gather up any token sequence of the form "{...}", making sure that nested "{...}" sentences are handled. This will at least make sure that the input is *almost* well-formed. When the type information is known, the token sequence can be parsed using the appropriate production.

(b) If a token "{" appears when parsing a Value, then mark the current point in the parse. Try parsing each Value production alternative that may begin with a "{", rewinding back to the starting point after each attempt. If only one production succeeded, then that's your candidate. If more than one production succeeded, then bundle them up into an "AmbiguousValue" that can be resolved when the full type information is known.

Method (a) allows for easy parsing, but will not catch invalid sentences such as "{1 SEQUENCE , , }" until type information is known. If you want to check whether some ASN.1 is syntactically correct (without performing any linking/semantic analysis), then this method is not adequate.
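To make method (a) concrete, here is a rough sketch in Python rather than bison. It is not taken from either compiler: the lexer is a much-simplified stand-in for the real ASN.1 lexical rules, and the names tokenize and gather_braced are made up for illustration. It only checks that the braces balance and hands back the raw token sequence for parsing later.

```python
import re

# Assumption: a toy lexer covering only "::=", braces, commas, words,
# &fields and numbers. Real ASN.1 has many more lexical items.
TOKEN_RE = re.compile(
    r"\s*(::=|[{},]|[A-Za-z][A-Za-z0-9-]*|&[A-Za-z][A-Za-z0-9-]*|\d+)"
)


def tokenize(text):
    """Split text into a flat list of token strings."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def gather_braced(tokens, start):
    """Capture the balanced "{...}" that begins at tokens[start].

    Returns (captured_tokens, index_of_next_token). Nothing inside the
    braces is interpreted; that is deferred until the type is known.
    """
    if start >= len(tokens) or tokens[start] != "{":
        raise SyntaxError("expected '{'")
    depth = 0
    for i in range(start, len(tokens)):
        if tokens[i] == "{":
            depth += 1
        elif tokens[i] == "}":
            depth -= 1
            if depth == 0:
                return tokens[start:i + 1], i + 1
    raise SyntaxError("unbalanced '{'")
```

Feeding it the tokens of "{1 SEQUENCE , , }" succeeds, which is exactly the weakness described above: the braces balance, so the nonsense inside goes unnoticed until a real production is applied.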

Method (b) will guarantee that the ASN.1 input is syntactically valid after the parse. However, there may be a lot of backtracking. Also, by parsing the same sentence more than once, many syntax errors may be generated, and it is not generally determinable which errors should be discarded (if any).
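A similarly rough Python sketch of method (b) follows. The three "productions" are toy stand-ins that I have cut down drastically (values are bare numbers, and there is no DefinedSyntax alternative), and every function name is hypothetical. The point is only the shape of the technique: every alternative is tried from the same marked position, and more than one success yields an "AmbiguousValue".

```python
def is_ident(tok):
    # ASN.1 identifiers start with a lower-case letter.
    return tok[0].isalpha() and tok[0].islower()


def is_number(tok):
    return tok.isdigit()


def parse_oid(toks, i):
    # Toy ObjectIdentifierValue: "{" (identifier | number)+ "}"
    if i >= len(toks) or toks[i] != "{":
        return None
    i += 1
    count = 0
    while i < len(toks) and (is_ident(toks[i]) or is_number(toks[i])):
        i += 1
        count += 1
    if count and i < len(toks) and toks[i] == "}":
        return i + 1
    return None


def parse_sequence(toks, i):
    # Toy SequenceValue: "{" identifier number ("," identifier number)* "}"
    if i >= len(toks) or toks[i] != "{":
        return None
    i += 1
    while True:
        if not (i + 1 < len(toks) and is_ident(toks[i]) and is_number(toks[i + 1])):
            return None
        i += 2
        if i < len(toks) and toks[i] == ",":
            i += 1
            continue
        break
    return i + 1 if i < len(toks) and toks[i] == "}" else None


def parse_sequence_of(toks, i):
    # Toy SequenceOfValue: "{" number ("," number)* "}"
    if i >= len(toks) or toks[i] != "{":
        return None
    i += 1
    while True:
        if not (i < len(toks) and is_number(toks[i])):
            return None
        i += 1
        if i < len(toks) and toks[i] == ",":
            i += 1
            continue
        break
    return i + 1 if i < len(toks) and toks[i] == "}" else None


ALTERNATIVES = [
    ("ObjectIdentifierValue", parse_oid),
    ("SequenceValue", parse_sequence),
    ("SequenceOfValue", parse_sequence_of),
]


def parse_braced_value(toks, mark=0):
    """Try every alternative from the same marked position."""
    # Passing `mark` unchanged to each parser is the "rewind".
    matches = [name for name, parser in ALTERNATIVES
               if parser(toks, mark) is not None]
    if not matches:
        raise SyntaxError("no Value production matches")
    kind = matches[0] if len(matches) == 1 else "AmbiguousValue"
    return kind, matches
```

With whitespace-separated tokens, parse_braced_value("{ a 1 }".split()) comes back as an AmbiguousValue with the ObjectIdentifierValue and SequenceValue candidates, whereas "{ a 1 , b 2 }" matches only the SequenceValue alternative, and "{ 1 SEQUENCE , , }" is rejected outright. Note that this sketch throws away the errors from the failed attempts, so it sidesteps the error-reporting problem rather than solving it.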

The previous ASN.1 compiler I developed used method (a), but another one I'm messing about with now uses method (b). This is because I wanted to split the compiler into two parts: one part that parses the input and checks that it is syntactically valid, and one part that takes this valid input and performs the necessary semantic analysis. If you're not interested in syntactic validity, then method (a) is sufficient.

[I'm thinking about freely releasing the syntax parser soon, with the semantic analyser to follow later. I'm just in the middle of writing a grammar recognizer that will generate the necessary syntactic test cases. This would, I hope, be part of the asn1.org test suite that was once mooted]

Cheers,
Geoff





