Re: globs and rules and trees, oh my! (was: Re: XPath grammars (Was: Re: globs and trees in Perl6))

Jon Lang Fri, 03 Oct 2008 17:48:31 -0700

Timothy S. Nelson wrote:
>> <TimToady> note to treematching folks: it is envisaged that signatures in
>> a rule will match nodes in a tree
>>
>>        My question is, how is this expected to work?  Can someone give an
>> example?
>
>        I'm assuming that this relates to Jon Lang's comment about using
> rules to match non-strings.


Pretty much - although there are some patterns that one might want to
use that can't adequately be expressed in this way - at least, not
without relaxing some of the constraints on signature definition.
Some examples:

A signature always anchors its "positional parameters" pattern to the
first and last positional parameters (analogous to having implicit '^'
and '$' markup at the start and end of a textual pattern), and does
not provide any sort of "zero or more"/"one or more" qualifiers, other
than a single tail-end "slurpy list" option.  Its "zero or one"
qualifier is likewise constrained in that once you use an optional
positional, you're limited to optionals and slurpies from that point
on.  This makes it difficult to set up a pattern that matches, e.g.,
"any instance within the list of a string followed immediately by a
number".
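
To make the limitation concrete, here is a rough model of the two
behaviours in Python (not Perl 6, since the point is the matching
behaviour rather than any existing syntax).  The names
matches_anchored and find_unanchored are hypothetical, chosen only for
this sketch: the first imitates a signature along the lines of
(Str, Num, *@rest), which can only test the front of the list; the
second is the sliding search that such a signature cannot express.

```python
from numbers import Number

def is_str_then_num(a, b):
    """True when a is a string and b is a number (bools are excluded)."""
    return isinstance(a, str) and isinstance(b, Number) and not isinstance(b, bool)

def matches_anchored(items):
    """Model of a signature-like pattern: only the start of the list is
    tested, and everything after the first two items goes to the slurpy."""
    return len(items) >= 2 and is_str_then_num(items[0], items[1])

def find_unanchored(items):
    """Every index i where items[i] is a string immediately followed by
    a number, wherever in the list that happens."""
    return [i for i in range(len(items) - 1)
            if is_str_then_num(items[i], items[i + 1])]

items = [3, "width", 42, "x", "y", 7]
print(matches_anchored(items))   # the list does not start with a string
print(find_unanchored(items))    # positions of "width" and "y"
```

The anchored form fails on this list even though it contains two
string-then-number pairs, which is exactly the gap described above.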

The other issue that signatures-as-patterns doesn't handle very well
is that of capturing and returning matches.  I suppose that this could
be handled, to a limited extent, by breaking the signature up into
several signatures joined together by <,>, and then indicating which
"sub-signatures" are to be returned; but that doesn't work too well
once hierarchical arrangements are introduced.

Perhaps an approach more compatible with normal rules syntax might be
to introduce a series of XML-like tags:

<[> ... <]> lets you denote a nested list of patterns - analogous to
what [ ... ] does outside of rules.  Within its reach, '^' and '$'
refer to "just before the first element" and "just after the last
element", respectively.  Otherwise, this works just like the "list of
objects and/or strings" patterns currently described in S05.

<{> ... <}> does likewise with a nested hash of values, with standard
pair notation being used within it to link key patterns to value
patterns.  Since hashes are not ordered, '^' and '$' would be
meaningless within this context.  Heck, order in general is
meaningless within this context.
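
As an illustration of order-free matching, the following Python sketch
models a hash pattern as a list of (key test, value test) pairs.  The
function name hash_matches is hypothetical, and so is the rule it
applies: it assumes that every pattern pair must be satisfied by at
least one entry and that entries the pattern does not mention are
ignored.  Nothing above settles that question; it is only one
plausible reading.

```python
def hash_matches(pattern, data):
    """pattern: list of (key_test, value_test) callables; data: a dict.

    Assumption: each pattern pair must be satisfied by at least one
    entry, and entries not mentioned by the pattern are ignored.
    """
    return all(
        any(key_test(k) and value_test(v) for k, v in data.items())
        for key_test, value_test in pattern
    )

pattern = [
    (lambda k: k == "name", lambda v: isinstance(v, str)),
    (lambda k: k.startswith("opt_"), lambda v: isinstance(v, int)),
]

# The same entries in a different order give the same answer.
print(hash_matches(pattern, {"opt_depth": 2, "name": "root"}))
print(hash_matches(pattern, {"name": "root", "opt_depth": 2}))
# A hash with no "opt_" key fails the second pattern pair.
print(hash_matches(pattern, {"name": "root"}))
```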

<item> replaces <elem> as the object-based equivalent of '.' ('elem'
is too list-oriented a term).  I'd recommend doing this even if you
don't take either of the suggestions above.

You might even do a <[[> ... <]]> pairing to denote a list that is
nested perhaps more than one layer down.  Or perhaps that could be
handled by using '<[>+' or the like.

> But how would it be if I wanted to search a tree for all nodes
> whose "readonly" attribute were true, and return an array of
> those nodes?

This can already be done, for the most part:

/ (<.does(ro)>) /

Mind you, this only searches a list; to make it search a tree, you'd
need a drill-down subrule such as I outline above:

/ <[>* (<.does(ro)>) <]>* /
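
The difference between the two patterns can be modelled in Python.
This is only a sketch of the intended behaviour, not of how rules
would implement it: the tree is assumed to be nested lists, the class
Ro is a hypothetical stand-in for the 'ro' role, and find_ro_flat and
find_ro are names invented for the example.

```python
class Ro:
    """Stand-in for the 'ro' role: an item 'does ro' if it is an instance."""

class Plain:
    def __init__(self, name):
        self.name = name

class ReadOnly(Plain, Ro):
    """A node that does the Ro role."""

def find_ro_flat(items):
    """Model of the first pattern: only the top-level list is searched."""
    return [item for item in items if isinstance(item, Ro)]

def find_ro(tree):
    """Model of the drill-down pattern: depth-first walk over nested
    lists, collecting every item that does Ro."""
    found = []
    for item in tree:
        if isinstance(item, list):
            found.extend(find_ro(item))
        elif isinstance(item, Ro):
            found.append(item)
    return found

tree = [Plain("a"), ReadOnly("b"), [Plain("c"), [ReadOnly("d")]], ReadOnly("e")]
print([node.name for node in find_ro_flat(tree)])  # misses the nested node
print([node.name for node in find_ro(tree)])       # finds it at any depth
```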

-- 
Jonathan "Dataweaver" Lang
