Re: globs and rules and trees, oh my! (was: Re: XPath grammars (Was: Re: globs and trees in Perl6))
Timothy S. Nelson wrote:
>> note to treematching folks: it is envisaged that signatures in
>> a rule will match nodes in a tree
>>
>> My question is, how is this expected to work? Can someone give an
>> example?
>
> I'm assuming that this relates to Jon Lang's comment about using
> rules to match non-strings.

Pretty much - although there are some patterns that one might want to use that can't adequately be expressed in this way - at least, not without relaxing some of the constraints on signature definition. Some examples:

A signature always anchors its "positional parameters" pattern to the first and last positional parameters (analogous to having implicit '^' and '$' markup at the start and end of a textual pattern), and does not provide any sort of "zero or more"/"one or more" qualifiers, other than a single tail-end "slurpy list" option. Its "zero or one" qualifier is likewise constrained in that once you use an optional positional, you're limited to optionals and slurpies from that point on. This makes it difficult to set up a pattern that matches, e.g., "any instance within the list of a string followed immediately by a number".

The other issue that signatures-as-patterns doesn't handle very well is that of capturing and returning matches. I suppose that this could be handled, to a limited extent, by breaking the signature up into several signatures joined together by <,>, and then indicating which "sub-signatures" are to be returned; but that doesn't work too well once hierarchical arrangements are introduced.

Perhaps an approach more compatible with normal rules syntax might be to introduce a series of xml-like tags:

<[> ... <]> lets you denote a nested list of patterns - analogous to what [ ... ] does outside of rules. Within its reach, '^' and '$' refer to "just before the first element" and "just after the last element", respectively. Otherwise, this works just like the "list of objects and/or strings" patterns currently described in S05.

<{> ... <}> does likewise with a nested hash of values, with standard pair notation being used within it in order to link key patterns to value patterns. Since hashes are not ordered, '^' and '$' would be meaningless within this context. Heck, order in general is meaningless within this context.

[...] replaces <elem> as the object-based equivalent of '.' ('elem' is too list-oriented of a term). I'd recommend doing this even if you don't take either of the suggestions above.

You might even do a <[[> ... <]]> pairing to denote a list that is nested perhaps more than one layer down. Or perhaps that could be handled by using '<[>+' or the like.

> But how would it be if I wanted to search a tree for all nodes
> whose "readonly" attribute were true, and return an array of
> those nodes?

This can already be done, for the most part:

    / (<.does(ro)>) /

Mind you, this only searches a list; to make it search a tree, you'd need a drill-down subrule such as I outline above:

    / <[>* (<.does(ro)>) <]>* /

-- 
Jonathan "Dataweaver" Lang
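
For comparison, here is a minimal sketch of the drill-down search discussed above, written in Python rather than in rule syntax, since the tag notation proposed in this message is only a suggestion. The `Node` class, its `readonly` field and the `find_readonly` function are hypothetical names made up for illustration; nothing here is part of S05.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Node:
    """Hypothetical tree node; 'readonly' mirrors the attribute in the question."""
    name: str
    readonly: bool = False
    children: List["Node"] = field(default_factory=list)


def find_readonly(node: Node) -> Iterator[Node]:
    """Yield every read-only node in the subtree, in pre-order."""
    if node.readonly:
        yield node
    for child in node.children:
        # Drill down one level; this is the step a flat list match cannot do.
        yield from find_readonly(child)


# Small example tree: two read-only nodes at different depths.
tree = Node("root", children=[
    Node("a", readonly=True),
    Node("b", children=[Node("c", readonly=True), Node("d")]),
])
matches = [n.name for n in find_readonly(tree)]
```

The recursion is what the proposed drill-down markup would have to express: a plain list match only ever sees one level of children, so each nested level needs an explicit descent.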
globs and rules and trees, oh my! (was: Re: XPath grammars (Was: Re: globs and trees in Perl6))
On Fri, 3 Oct 2008, Timothy S. Nelson wrote:
> On Fri, 3 Oct 2008, Timothy S. Nelson wrote:
>> On Thu, 2 Oct 2008, Timothy S. Nelson wrote:
>>> Now that Perl6 is in the mix, though, I think that the best way to
>>> do it is to make roles that model eg. Nodes, Plexes (Documents),
>>> Elements, and the like, and then have operators on them do all the
>>> work (like my idea of using a slash for a combined feed and call
>>> code operator). I could be wrong, but it seems to me that we could
>>> get something that's somewhat like XPath this way, without having
>>> to worry about defining an XPath grammar or anything.
>>
>> I'm talking to myself here :). The guys on IRC convinced me that
>> the way to go might be something like a grammar, but that does trees
>> and tree transformations instead of a text input stream. See the IRC
>> log for details :).
>
> Talking to myself again. I'm not as convinced as I was. I'll write
> up a long post about that if necessary, but want to get something
> else figured out first.
>
> First, a paste from the IRC log:
>
> note to treematching folks: it is envisaged that signatures in
> a rule will match nodes in a tree
>
> My question is, how is this expected to work? Can someone give an
> example?

I'm assuming that this relates to Jon Lang's comment about using rules to match non-strings.

I'm starting to see how *matching* would work now. But how would it be if I wanted to search a tree for all nodes whose "readonly" attribute were true, and return an array of those nodes?

Hmm. Or something like the following XPath...

    /html/body//p/a[@name = "#SampleName"]

(my XPath isn't that great, but I'm assuming this will find all a tags in the html body that are the direct descendant of a paragraph, and have their @name attribute set to "#SampleName").

My guess is something like this:

    $htmlobject = HTML->new();
    $htmlobject.children()
        ==> grep { /html/ }
        ==> map { .children() }
        ==> grep { /body/ }
        ==> map { .children() }
        ==> recursivegrep { /p/ }
        ==> map { .children() }
        ==> grep { /a/ and .name eq "#SampleName" }
        ==> $anchors

I'm guessing that might do it (although the tree role and recursivegrep would also require some work). But the syntax is dreadful :). You can see why I was talking about having a feed operator that did a grep and got children as well as doing the feed.

But no doubt I'm missing something here. And I'm still thinking like a Perl5 programmer :).

:)

---------------------------------------------------------------------
| Name: Tim Nelson            | Because the Creator is,             |
| E-mail: [EMAIL PROTECTED]   | I am                                |
---------------------------------------------------------------------

-----BEGIN GEEK CODE BLOCK-----
Version 3.12
GCS d+++ s+: a- C++$ U+++$ P+++$ L+++ E- W+ N+ w--- V- PE(+) Y+>++
PGP->+++ R(+) !tv b++ DI D G+ e++> h! y-
-----END GEEK CODE BLOCK-----
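
To make the intent of the feed pipeline above concrete, here is a rough Python sketch of the same sequence of steps: take children, filter by tag, descend. It is only an illustration under stated assumptions: `Element`, `children_named` and `descendants_named` are hypothetical helpers (the last one standing in for the imagined recursivegrep), not an existing tree role or library API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Element:
    """Hypothetical element node with a tag, attributes and children."""
    tag: str
    attrs: Dict[str, str] = field(default_factory=dict)
    children: List["Element"] = field(default_factory=list)


def children_named(nodes: List[Element], tag: str) -> List[Element]:
    """One '/' step: direct children of any input node with the given tag."""
    return [c for n in nodes for c in n.children if c.tag == tag]


def descendants_named(nodes: List[Element], tag: str) -> List[Element]:
    """One '//' step: descendants at any depth with the given tag."""
    found = []
    for n in nodes:
        for c in n.children:
            if c.tag == tag:
                found.append(c)
            found.extend(descendants_named([c], tag))
    return found


doc = Element("#document", children=[
    Element("html", children=[
        Element("body", children=[
            Element("div", children=[
                Element("p", children=[
                    Element("a", {"name": "#SampleName"}),
                    Element("a", {"name": "#Other"}),
                ]),
            ]),
            # Same name, but not inside a <p>, so the path must skip it.
            Element("a", {"name": "#SampleName"}),
        ]),
    ]),
])

html = children_named([doc], "html")
body = children_named(html, "body")
paras = descendants_named(body, "p")
anchors = [a for a in children_named(paras, "a")
           if a.attrs.get("name") == "#SampleName"]
```

Each helper bundles "get children" with "filter", which is roughly what a combined feed-and-grep operator would buy: the four map/grep pairs in the pipeline collapse into four calls.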