Re: [cg] tag order sensitivity in CG

Eckhard Bick Tue, 13 Jun 2017 22:17:17 -0700

Not sure I see the problem ...

I wasn't talking about changing sets on the fly at run time, only a *one-time* rewriting: Tag1·Tag2 tag strings being rewritten as //r reg.ex'es in order to preserve order. This would only be rewritten once, at compile time, not run time. And if we basically have a LINE tag already, the re-written tag strings could be matched against this LINE.

The tag string rewriting is so simple that it could be done by a grammar preprocessor rather than the actual compiler.
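
To make the preprocessor idea concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the function name rewrite_ordered_tags is hypothetical, "·" is taken as the non-breaking separator, and the output follows the /^(.* )?Tag1 Tag2( .*)?$/r shape from my earlier mail. It does not address whether the real grammar parser would need such a regex tag to be quoted.

```python
import re

# Assumption: "·" (U+00B7) is the proposed non-breaking tag separator.
NBSP = "\u00b7"

# An ordered tag string: two or more tags joined by the separator,
# containing no whitespace or parentheses.
ORDERED = re.compile(rf"[^\s(){NBSP}]+(?:{NBSP}[^\s(){NBSP}]+)+")


def rewrite_ordered_tags(grammar_text: str) -> str:
    """Hypothetical one-time preprocessing pass over the grammar source.

    Every Tag1·Tag2(·TagN) string becomes a single regex tag of the form
    /^(.* )?Tag1 Tag2( .*)?$/r, meant to be tested against the LINE tag.
    """
    def to_regex(match: re.Match) -> str:
        tags = match.group(0).split(NBSP)
        # Escape each tag so regex metacharacters in tag names stay literal.
        body = " ".join(re.escape(tag) for tag in tags)
        return f"/^(.* )?{body}( .*)?$/r"

    return ORDERED.sub(to_regex, grammar_text)


rule = "REMOVE (Tag3) IF (*1 (Tag1\u00b7Tag2)) ;"
print(rewrite_ordered_tags(rule))
```

Everything that does not contain the separator passes through untouched, so ordinary tags and sets would be compiled exactly as before.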

What would need to be adapted is only that these ordered-tag reg.ex'es would be recognized as such and tested against LINE rather than against ordinary tags or sets in the cohort. Isn't that just a simple IF branch during target/context matching? A bit like treating $$ unification sets differently?
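
Algorithmically, the branch I have in mind could look like the following Python sketch. It is not a description of the actual CG-3 code: Reading, is_ordered_regex and matches are hypothetical names, and LINE is modelled simply as the word form followed by the space-joined, ordered tags of the reading.

```python
import re
from dataclasses import dataclass


@dataclass
class Reading:
    """Hypothetical stand-in for a reading: word form plus ordered tags."""
    wordform: str
    tags: list

    @property
    def line(self) -> str:
        # Assumption: LINE is the whole reading as ONE string,
        # with the word form at the start and tags kept in order.
        return " ".join([self.wordform, *self.tags])


def is_ordered_regex(tag: str) -> bool:
    # Recognise the rewritten ordered-tag regexes by their fixed frame.
    return tag.startswith("/^(.* )?") and tag.endswith("( .*)?$/r")


def matches(reading: Reading, tag: str) -> bool:
    if is_ordered_regex(tag):
        # The extra IF branch: strip the leading "/" and trailing "/r",
        # then test the pattern against LINE instead of single tags.
        return re.search(tag[1:-2], reading.line) is not None
    # Ordinary tags keep the usual order-insensitive membership test.
    return tag in reading.tags


ordered = "/^(.* )?Tag1 Tag2( .*)?$/r"
print(matches(Reading('"<w>"', ["Tag0", "Tag1", "Tag2"]), ordered))  # adjacent, in order
print(matches(Reading('"<w>"', ["Tag2", "Tag1"]), ordered))          # reversed order
```

The (.* )? and ( .*)? frame forces the match to start and end on a space or on the edge of the line, so a tag such as XTag1 followed by Tag2 would not count as a match for Tag1 Tag2.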

Of course I'm talking algorithmically here, not claiming to predict the complexity of an implementation. But it looks feasible to me.


-- Eckhard


On 06/13/2017 08:46 PM, Tino Didriksen wrote:

Replied inline...
On 10 June 2017 at 07:40, Eckhard Bick <[email protected]> wrote:
    1. We introduce a magic tag LINE, maintained by the compiler,
    constituted by the *whole* reading line (plus the word form at the
    start) as *one* tag, i.e. *without breaking on space*.
That part is easy and basically already done. The reading already stores an ordered list of tags - this is not the problem.
    2. If LIST or on-the-fly definitions use a tag parenthesis with
    space, e.g. (Tag1 Tag2), in a rule with the flag TAGORDER, this
    will be converted internally to /^(.* )?Tag1 Tag2( .*)?$/r.

        REMOVE TAGORDER (Tag3) IF (*1 (Tag1 Tag2)) ;
That's where it breaks. To change how some sets are compiled (or even worse, recompiled for non-inline sets) based on a rule flag is a major change and kludge. There is currently zero interaction between these two parts, and there shouldn't be. Sets don't know they are being parsed in the context of a rule, and it would be messy to add markers to not deduplicate these new kinds of sets.
    In addition to, or instead of, TAGORDER at the rule level, we
    could also introduce the concept of a "nonbreaking space
    character", e.g. · (mini-bullet) or double underscore, to allow
    flexible use of tag order down at the level of individual
    contexts: (Tag1·Tag2) or (Tag1__Tag2).
In the final solution, I will need to introduce regex-like * . .+ .* (or whatever) as placeholders for zero, one, one-or-more, zero-or-more any-tags to let writers express everything.
    Tino, is my intuition correct that it would not be so hard to turn
    this algorithmical idea into code? And what would it cost,
    speed-wise? Given that it would be relevant only for some rules, I
    guess, it can't be too bad.
Code-wise, not worth delaying the actual implementation for. It'd reach far into many corners, without actually getting us any closer to the correct solution.
Speed-wise, it would be bad.

-- Tino Didriksen

--
You received this message because you are subscribed to the Google Groups "Constraint Grammar" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/constraint-grammar.
For more options, visit https://groups.google.com/d/optout.



--
Eckhard Bick,
cand.med., dr.phil.
University of Southern Denmark
e-mail: [email protected]
web: http://beta.visl.sdu.dk

