Re: globs and rules and trees, oh my! (was: Re: XPath grammars (Was: Re: globs and trees in Perl6))

2008-10-03 Thread Jon Lang
Timothy S. Nelson wrote:
>>  note to treematching folks: it is envisaged that signatures in
>> a rule will match nodes in a tree
>>
>>My question is, how is this expected to work?  Can someone give an
>> example?
>
>I'm assuming that this relates to Jon Lang's comment about using
> rules to match non-strings.

Pretty much - although there are some patterns that one might want to
use that can't adequately be expressed in this way - at least, not
without relaxing some of the constraints on signature definition.
Some examples:

A signature always anchors its "positional parameters" pattern to the
first and last positional parameters (analogous to having implicit '^'
and '$' markup at the start and end of a textual pattern), and does
not provide any sort of "zero or more"/"one or more" qualifiers, other
than a single tail-end "slurpy list" option.  Its "zero or one"
qualifier is likewise constrained in that once you use an optional
positional, you're limited to optionals and slurpies from that point
on.  This makes it difficult to set up a pattern that matches, e.g.,
"any instance within the list of a string followed immediately by a
number".
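
For illustration, here is the kind of unanchored match meant above,
sketched in Python rather than Perl 6 because the signature-as-pattern
syntax was still under design.  The function name and sample data are
hypothetical; the code just scans adjacent pairs anywhere in the list,
which is what a signature's implicit anchoring rules out.

```python
from numbers import Number


def string_then_number(items):
    """Return every adjacent (string, number) pair found anywhere in items."""
    return [
        (a, b)
        for a, b in zip(items, items[1:])
        # bool is a subclass of int, so exclude it explicitly
        if isinstance(a, str) and isinstance(b, Number) and not isinstance(b, bool)
    ]


data = [1, "width", 80, "height", "n/a", 3.5, "depth", 12]
pairs = string_then_number(data)
print(pairs)
```

Nothing here is tied to the first or last element of the list, and the
match can occur any number of times.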

The other issue that signatures-as-patterns doesn't handle very well
is that of capturing and returning matches.  I suppose that this could
be handled, to a limited extent, by breaking the signature up into
several signatures joined together by <,>, and then indicating which
"sub-signatures" are to be returned; but that doesn't work too well
once hierarchical arrangements are introduced.

Perhaps an approach more compatible with normal rules syntax might be
to introduce a series of xml-like tags:

<[> ... <]> lets you denote a nested list of patterns - analogous to
what [ ... ] does outside of rules.  Within its reach, '^' and '$'
refer to "just before the first element" and "just after the last
element", respectively.  Otherwise, this works just like the "list of
objects and/or strings" patterns currently described in S05.

<{> ... <}> does likewise with a nested hash of values, with standard
pair notation being used within it in order to link key patterns to value
patterns.  Since hashes are not ordered, '^' and '$' would be
meaningless within this context.  Heck, order in general is
meaningless within this context.

 replaces <elem> as the object-based equivalent of '.' ('elem'
is too list-oriented of a term).  I'd recommend doing this even if you
don't take either of the suggestions above.

You might even do a <[[> ... <]]> pairing to denote a list that is
nested perhaps more than one layer down.  Or perhaps that could be
handled by using '<[>+' or the like.

> But how would it be if I wanted to search a tree for all nodes
> whose "readonly" attribute were true, and return an array of
> those nodes?

This can already be done, for the most part:

/ (<.does(ro)>) /

Mind you, this only searches a list; to make it search a tree, you'd
need a drill-down subrule such as I outline above:

/ <[>* (<.does(ro)>) <]>* /
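
As a point of comparison, the drill-down can also be written as a plain
recursive walk.  A minimal Python sketch, assuming a hypothetical Node
class with a readonly flag (neither comes from S05 or from this thread):

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical tree node; "readonly" stands in for the attribute in question.
    name: str
    readonly: bool = False
    children: list = field(default_factory=list)


def walk(node):
    """Yield node and all of its descendants, depth first."""
    yield node
    for child in node.children:
        yield from walk(child)


def readonly_nodes(root):
    """Collect every node in the tree whose readonly flag is true."""
    return [n for n in walk(root) if n.readonly]


tree = Node("root", children=[
    Node("a", readonly=True, children=[Node("a1"), Node("a2", readonly=True)]),
    Node("b", children=[Node("b1", readonly=True)]),
])
found = [n.name for n in readonly_nodes(tree)]
print(found)
```

The walk plays the part of the drill-down subrule, and the list
comprehension plays the part of the capture.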

-- 
Jonathan "Dataweaver" Lang


globs and rules and trees, oh my! (was: Re: XPath grammars (Was: Re: globs and trees in Perl6))

2008-10-02 Thread Timothy S. Nelson

On Fri, 3 Oct 2008, Timothy S. Nelson wrote:


On Fri, 3 Oct 2008, Timothy S. Nelson wrote:


On Thu, 2 Oct 2008, Timothy S. Nelson wrote:

	Now that Perl6 is in the mix, though, I think that the best way to do 
it is to make roles that model eg. Nodes, Plexes (Documents), Elements, 
and the like, and then have operators on them do all the work (like my 
idea of using a slash for a combined feed and call code operator).  I 
could be wrong, but it seems to me that we could get something that's 
somewhat like XPath this way, without having to worry about defining an 
XPath grammar or anything.


	I'm talking to myself here :).  The guys on IRC convinced me that the 
way to go might be something like a grammar, but that does trees and tree 
transformations instead of a text input stream.  See the IRC log for 
details :).


	Talking to myself again.  I'm not as convinced as I was.  I'll write 
up a long post about that if necessary, but want to get something else 
figured out first.  First, a paste from the IRC log:


 note to treematching folks: it is envisaged that signatures in a
rule will match nodes in a tree

	My question is, how is this expected to work?  Can someone give an 
example?


	I'm assuming that this relates to Jon Lang's comment about using rules 
to match non-strings.


	I'm starting to see how *matching* would work now.  But how would it 
be if I wanted to search a tree for all nodes whose "readonly" attribute were 
true, and return an array of those nodes?  Hmm.  Or something like the 
following XPath...


/html/body//p/a[@name = "#SampleName"]

	(my XPath isn't that great, but I'm assuming this will find all a tags 
in the html body that are direct children of a paragraph, and have their 
@name attribute set to "#SampleName").  My guess is something like this:


my $htmlobject = HTML.new();

$htmlobject.children()
==> grep { /html/ } ==> map { .children() }
==> grep { /body/ } ==> map { .children() }
==> recursivegrep { /p/ } ==> map { .children() }
==> grep { /a/ and .name eq "#SampleName" } ==> $anchors

	I'm guessing that might do it (although the tree role and 
recursivegrep would also require some work).  But the syntax is dreadful :). 
You can see why I was talking about having a feed operator that did a grep 
and got children as well as doing the feed.  But no doubt I'm missing 
something here.  And I'm still thinking like a Perl5 programmer :).
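
	For what it's worth, the same chain of steps can be sketched in Python 
with the standard library's xml.etree.ElementTree, which is only a stand-in 
for the tree role imagined above.  The helper names (children, descendants, 
named) and the sample document are made up for illustration; findall() is 
ElementTree's own limited XPath support, used here as a cross-check.

```python
import xml.etree.ElementTree as ET

DOC = """<html><body>
<div><p><a name="#SampleName">one</a><a name="other">two</a></p></div>
<p><span><a name="#SampleName">nested too deep</a></span></p>
<a name="#SampleName">not inside a paragraph</a>
</body></html>"""


def children(nodes):
    """Direct children of every node (iterating an Element yields its children)."""
    for node in nodes:
        yield from node


def descendants(nodes):
    """All descendants of every node, at any depth (the 'recursivegrep' step)."""
    for node in nodes:
        for d in node.iter():
            if d is not node:
                yield d


def named(nodes, tag):
    """Keep only the nodes with the given tag name (the 'grep' step)."""
    return (n for n in nodes if n.tag == tag)


root = ET.fromstring(DOC)
step = named([root], "html")
step = named(children(step), "body")
step = named(descendants(step), "p")
step = named(children(step), "a")
anchors = [a for a in step if a.get("name") == "#SampleName"]

# The same query through ElementTree's XPath subset, relative to <html>.
via_xpath = root.findall("./body//p/a[@name='#SampleName']")
print([a.text for a in anchors])
```

	Each reassignment of step corresponds to one "grep, then get children" 
stage of the feed chain, which is the pairing a combined operator would fold 
into a single step.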


:)


-
| Name: Tim Nelson | Because the Creator is,|
| E-mail: [EMAIL PROTECTED]| I am   |
-
