The main issue is that, under the current implementation, we must instantiate 
a simpleType in order to access any information about it. If instantiating a 
type in the global context throws an error, then we cannot inspect it to 
determine whether we needed the global instance in the first place. In theory, 
we should be able to move the necessary information to the 
simpleTypeDefFactory, but I believe we decided that this would be a substantial 
refactoring effort. Now that I think about it, though, the issue is simpler 
than it was before, in that the only property we really need to compute 
globally is "hasRepType". This is still non-trivial, as it potentially involves 
inspecting the children for repTypes as well, but it might be more doable.
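
As a rough illustration of computing "hasRepType" from static definitions alone, 
without instantiating any type, here is a minimal Python sketch. Every name in 
it (SimpleTypeDef, rep_type, children, has_rep_type) is hypothetical and is not 
taken from the Daffodil code base; it also assumes that child types can be 
looked up by QName.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class SimpleTypeDef:
    """Hypothetical static description of a simpleType (never instantiated)."""
    name: str
    rep_type: Optional[str] = None                      # QName of a repType declared directly here
    children: List[str] = field(default_factory=list)   # QNames of child types (assumption)


def has_rep_type(qname: str, defs: Dict[str, SimpleTypeDef],
                 _seen: Optional[Set[str]] = None) -> bool:
    """True if the type, or any type reachable through its children, declares a repType."""
    seen = set() if _seen is None else _seen
    if qname in seen or qname not in defs:
        return False          # already visited (cycle guard) or unknown type
    seen.add(qname)
    d = defs[qname]
    if d.rep_type is not None:
        return True
    return any(has_rep_type(child, defs, seen) for child in d.children)


defs = {
    "ex:outer": SimpleTypeDef("ex:outer", children=["ex:inner"]),
    "ex:inner": SimpleTypeDef("ex:inner", rep_type="xs:int"),
    "ex:plain": SimpleTypeDef("ex:plain"),
}
```

The point of the sketch is only that the answer depends on declared structure, 
not on resolved properties such as lengthKind, so it never needs an element 
context.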


I suspect that this still leaves behind a less than ideal situation, where you 
might need to add meaningless attributes to some simpleTypes which have 
repTypes defined. Those should be straightforward to fix as they come up (e.g. 
replace Assert(foo) with Assert(foo && optRepType.isEmpty)), but it will be 
whack-a-mole to fix these; and given how many corners DFDL has for them to hide 
in, our implementation will probably never be "correct".

________________________________
From: Beckerle, Mike <mbecke...@tresys.com>
Sent: Thursday, April 4, 2019 6:44:18 PM
To: dev@daffodil.apache.org
Subject: Re: Further design difficulties with TypeValueCalculators

The type stMinLength_1 is clearly not suitable for many kinds of reuse.

Not all top level type defs will be suitable for use as a rep type because they 
are just incompletely specified. As you point out here, if used without 
supplying a different lengthKind it will SDE.

This is ok. It is just a schema design error.

For a type to be usable as a rep type it has to have enough properties to stand 
alone.

Or am I missing something?


________________________________
From: Sloane, Brandon <bslo...@tresys.com>
Sent: Thursday, April 4, 2019 5:42:02 PM
To: dev@daffodil.apache.org
Subject: Further design difficulties with TypeValueCalculators

I spoke with Mike about this issue the other day, but after more work on the 
implementation, it appears that the issue is more difficult than we had 
realized.


The enum support proposal allows simpleTypes to define mappings between 
representation values and logical values. Further, this mapping is accessible 
to DPath functions by specifying the QName of the simple type. This means that 
we must have a mapping of QName -> TypeCalculator, where TypeCalculator is a 
property of a GlobalSimpleType.
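
To make the shape of that lookup concrete, here is a small Python sketch of a 
QName -> TypeCalculator table. It is illustrative only: the TypeCalculator 
class, its method names, the ex:colorType type and its values are all made up 
for the example and do not reflect the actual Daffodil API.

```python
class TypeCalculator:
    """Hypothetical two-way mapping between representation and logical values."""

    def __init__(self, rep_to_logical):
        self.rep_to_logical = dict(rep_to_logical)
        # Invert the table so unparsing can go from logical value back to rep value.
        self.logical_to_rep = {v: k for k, v in self.rep_to_logical.items()}

    def to_logical(self, rep_value):
        return self.rep_to_logical[rep_value]

    def to_rep(self, logical_value):
        return self.logical_to_rep[logical_value]


# Registry keyed by the QName of the global simple type (example data, not from a real schema).
type_calculators = {
    "ex:colorType": TypeCalculator({1: "red", 2: "green"}),
}


def lookup_logical(type_qname, rep_value):
    """What a DPath function would need: find the calculator by QName, then apply it."""
    return type_calculators[type_qname].to_logical(rep_value)
```

The registry has to be populated from the global type itself, independent of 
any element that uses it, which is where the difficulty below comes from.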


However, in our current implementation, all GlobalSimpleTypes are constructed 
by inlining them into their element, which means it is impossible to construct 
a global instance. In an attempt to work around this, I created a pseudo-element 
in the compiler to serve as the element for the global instance to attach to.


In the common case, this works. However, consider the following schema (taken 
from section05/facets/Facets.tdml):


    <dfdl:format ref="ex:GeneralFormat" />


    <xs:simpleType name="stMinLength_1">
      <xs:restriction base="xs:string">
        <xs:minLength value="1" />
      </xs:restriction>
    </xs:simpleType>

    <xs:element name="e2_2" dfdl:lengthKind="delimited">
      <xs:simpleType>
        <xs:restriction base="ex:stMinLength_1">
          <xs:minLength value="2" />
        </xs:restriction>
      </xs:simpleType>
    </xs:element>

Note that, in the global context, we are using the format 
ex:GeneralFormat, which defines dfdl:lengthKind="implicit". However, it is a 
schemaDefinitionError to use stMinLength_1 with dfdl:lengthKind="implicit", 
because it does not define a maxLength.

In theory, this is okay, because it is only ever used by the element e2_2, 
which defines dfdl:lengthKind="delimited".

However, it means that attempting to construct a global instance of this type 
will fail unless the compiler knows to override lengthKind; and any solution 
that attempts to solve this by actually overriding troublesome annotations 
seems fraught with peril.
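
The failure mode can be modelled in a few lines of Python. This is a toy model 
under stated assumptions, not the compiler's real logic: GLOBAL_FORMAT, 
compile_simple_type and SchemaDefinitionError are hypothetical names, and the 
only rule modelled is the one described above (lengthKind="implicit" needs a 
maxLength facet).

```python
class SchemaDefinitionError(Exception):
    """Stand-in for an SDE raised at schema compile time."""


# Properties in scope in the global context (from ex:GeneralFormat in the example).
GLOBAL_FORMAT = {"lengthKind": "implicit"}


def compile_simple_type(facets, element_props=None):
    """Resolve properties for a string simple type, then validate them.

    element_props is None when the type is compiled on its own (global instance);
    otherwise it holds the properties of the element the type is inlined into.
    """
    props = {**GLOBAL_FORMAT, **(element_props or {})}  # element properties win
    if props["lengthKind"] == "implicit" and "maxLength" not in facets:
        raise SchemaDefinitionError("lengthKind='implicit' requires maxLength")
    return props


st_min_length_1 = {"minLength": 1}

# Inlined into e2_2: the element supplies lengthKind="delimited", so this compiles.
inlined = compile_simple_type(st_min_length_1, {"lengthKind": "delimited"})

# Compiled as a global instance: only the global format applies, so this fails.
try:
    compile_simple_type(st_min_length_1)
    global_ok = True
except SchemaDefinitionError:
    global_ok = False
```

The same type definition is valid or invalid depending purely on the context 
supplying lengthKind, which is why a global instance cannot simply be compiled 
eagerly.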


Ultimately, it seems to me that the issue here is the same as our fundamental 
performance issue of excessive inlining, as constructing a simple type requires 
the simple type to be inlined in an element. If this is correct, then fixing it 
would require doing at least part of the performance fix (in particular, fixing 
it in the DSOM structure) before it is possible to implement the enum support 
proposal as written (or else finding some much hackier workaround).


Any thoughts?


Brandon T. Sloane

Associate, Services

bslo...@tresys.com | tresys.com
