[ 
https://issues.apache.org/jira/browse/DAFFODIL-2692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17581540#comment-17581540
 ] 

Mike Beckerle commented on DAFFODIL-2692:
-----------------------------------------

Recent email discussion suggests this idea should evolve beyond just a value 
pattern: the pattern should be expressed by way of the XSD facets on an 
element.

See: [https://lists.apache.org/thread/3t3j72k9z6dcpyb4pqsh5mh5lg53mgt9]

The name 'valuePattern' may not be well-chosen given this different direction.

The crux of all this is the realization that value patterns must be regexes 
with longest-match semantics for alternatives. XSD pattern facet matching has 
this behavior; the dfdl:lengthKind 'pattern' regex engine does NOT. It has 
left-to-right order behavior, where the first alternative that matches wins. 

(Also: Apache Xerces has an XSD validator which contains a regex matcher for 
pattern facets that implements this longest-match behavior, so there is at 
least one open-source code base, under an acceptable license, that could 
supply this regex engine.)
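
To illustrate the difference between the two alternation semantics described 
above, here is a minimal sketch. It uses Python's re module only as a stand-in 
for an engine with ordered (left-to-right) alternation; it is not Daffodil's 
engine and not the Xerces matcher. The helper names leftmost_first and 
longest_match are hypothetical, and the longest-match emulation handles only a 
flat list of top-level alternatives matched at the start of the data.

```python
import re


def leftmost_first(alternatives, data):
    """Ordered alternation: the first alternative that matches wins."""
    combined = "|".join(f"(?:{alt})" for alt in alternatives)
    m = re.match(combined, data)
    return m.group(0) if m else None


def longest_match(alternatives, data):
    """Emulate longest-match by trying each alternative on its own
    and keeping the longest result (assumption: flat alternatives only)."""
    best = None
    for alt in alternatives:
        m = re.match(alt, data)
        if m and (best is None or len(m.group(0)) > len(best)):
            best = m.group(0)
    return best


alternatives = ["a", "ab"]
print(leftmost_first(alternatives, "abc"))  # a
print(longest_match(alternatives, "abc"))   # ab
```

With ordered alternation the result depends on the order in which the 
alternatives are written; with longest-match it does not, which is presumably 
why it fits patterns that are meant to describe the set of allowed values.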

> Add lengthKind 'valuePattern' which uses regex to match allowed data values
> ---------------------------------------------------------------------------
>
>                 Key: DAFFODIL-2692
>                 URL: https://issues.apache.org/jira/browse/DAFFODIL-2692
>             Project: Daffodil
>          Issue Type: New Feature
>          Components: Back End, Front End
>    Affects Versions: 3.3.0
>            Reporter: Mike Beckerle
>            Priority: Major
>
> Existing dfdl:lengthKind 'pattern' uses the pattern to determine the length. 
> No match means length 0.
> People want to use regular expression (or regex) matches differently from 
> this. They want to specify the allowed data patterns, with no match meaning 
> parse error. 
> This should be added as a dfdlx experimental feature to develop experience 
> with it. 
> A few design issues: we need to decide if this pattern includes nil values in 
> its syntax, or if those get added as allowed value patterns automatically. It 
> is simpler if we define this to require that the regex pattern specify all 
> possible data patterns that are accepted, whether they become nilled 
> elements, or elements with values. That, however, requires one to redundantly 
> express the dfdl:nilValue information.
> There may also be an interaction with properties like 
> dfdl:emptyValueDelimiterPolicy and the empty representation. I.e., does the 
> pattern have to allow for a zero-length successful match in order for the 
> data to be zero-length?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
