[REBOL] Re: Complex Series Parsing (Part 2)

sterling Fri, 09 Mar 2001 10:55:37 -0800

Well, before anybody goes further down the "here's something that
works for the last input you posted" followed by "but then there's
this input that doesn't work" path, let's go back to the definition of
input and output.

If you use load/markup and treat the REBOL words you have in your block
as strings, as Andrew suggests (which is a better way to deal with
them), then you have these input elements:
* <text>  -- open text tag
* "???"   -- some arbitrary string
* <???>   -- some other open tag
* </???>  -- some close tag
* </text> -- a close text tag

Your input looks like this:
probe input: load/markup {<tag0></tag0> <text> this and that
<tag1>those </tag1>and > these</text><tag2></tag2><text>There and
then</text>}
== [<tag0> </tag0> " " <text> " this and that^/" <tag1> "those "
</tag1> "and > these" </text> <tag2> </tag2> <text> "There and^/then"
</text>]

If you want to, you can get rid of the whitespace-only strings that
are created by whitespace between the tags.
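
One way to drop those whitespace-only strings might look like the
sketch below.  It is only an illustration: the words `sample` and
`blk` are made up for this example, and it assumes that trim/all
removes spaces, tabs and newlines.

```rebol
; Hypothetical small input, loaded the same way as the real one.
sample: load/markup {<a></a> <text>hi</text>}

; Walk the block and remove every string that is empty once all
; whitespace has been trimmed from a copy of it.
blk: sample
while [not tail? blk] [
    either all [string? first blk  empty? trim/all copy first blk] [
        remove blk      ; drop it; blk now sits on the next element
    ][
        blk: next blk   ; keep this element and move on
    ]
]

probe sample
```

The copy matters here: trim changes its argument in place, so
trimming the original would alter the strings you want to keep.
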
Now write the spec:
* any combination of input elements up to <text>
* open <text>
* any combination of "???", <???>, </???> where <text> would be
inserted in front of each "???"
* </text>
* start the whole process over
Done.

That's all you've told us so far.  Each item above is essentially a
parse rule already.  Some can be joined together:
* [thru <text>]
* [any [
                </text> [thru <text>]
                | tag!
                | string! mark: (insert back mark <text>) string!
                ]
        ]
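
The second string! in that last alternative is easy to overlook, so
here is a small sketch of why it is there.  As far as I know, parse
keeps its place in a block by index, so after the insert the current
position points at the same string again and it has to be matched a
second time to get past it.  The word `blk` and its contents are made
up for this example.

```rebol
blk: ["a" <b> "c"]

parse blk [
    any [
        tag!
        ; match a string, step back over it, put <text> in front,
        ; then match the (now shifted) string again to move past it
        | string! mark: (insert back mark <text>) string!
    ]
]

probe blk
```

Traced by hand, this should leave blk as [<text> "a" <b> <text> "c"].
Without the trailing string!, the same string would be matched again
on every pass and get a new <text> each time.
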

Now we just assemble:
 ; skip the immediate string after <text> so we don't add a second one
start-rule: [thru <text> [string! | none]]
parse input [
        start-rule
        any [
                </text> start-rule ; start over
                | tag! ; eat any random tags
                | string! mark: (insert back mark <text>) string!
        ]
]

probe input

And presto!
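
For reference, working the rule through by hand on the sample input
above (not copied from a console session), the probe should show a
block along these lines:

```rebol
; Hand-traced result: a <text> tag now sits in front of "those " and
; "and > these"; the strings that already followed <text> are untouched.
[<tag0> </tag0> " " <text> " this and that^/" <tag1> <text> "those "
</tag1> <text> "and > these" </text> <tag2> </tag2> <text> "There and^/then"
</text>]
```
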

Sterling

> This is on the right track.  But more complexity would arise... here is an
> advanced XML structure...
> 
> y: [<tag0></tag0> <text> this and that <tag1>those </tag1>and
> these</text><tag2></tag2><text>There and then</text>]
> output would be...
> out: [
> <tag0>
> </tag0>
> <text> this and that
> <tag1>
> <text> those
> </tag1>
> <text> and these
> </text>
> <tag2>
> </tag2>
> <text> There and then
> </text>
> ]
> 
> 
> There is method to the madness, I've got the "madness" part down pat, now if
> I could only come up with "the method".
> 
> Thanks
> Terry Brownell
> 
> ----- Original Message -----
> From: <[EMAIL PROTECTED]>
> To: <[EMAIL PROTECTED]>
> Sent: Thursday, March 08, 2001 4:17 PM
> Subject: [REBOL] Re: Complex Series Parsing (Part 2)
> 
> 
> >
> > I'm not sure I understand what you are really trying to do.  Usually
> > with parse, once you describe the format of what you want to parse and
> > the output you desire, the parse rules just fall out onto the screen.
> > Correct me if I'm wrong:
> >
> > Input is a block with the following format:
> > A <text> tag followed by a series of words with any number of non
> > <text> or </text> tags interspersed and ends with a </text> tag.
> >
> > The desired output is the same block except that every place there is
> > a non <text> tag in the block a <text> tag should be placed after it
> > and before the next series of words.  The ending </text> tag should be
> > removed.
> >
> > For this you don't need parse at all.  Just march through the block
> > and insert the new <text> tag as needed:
> > y: [
> > <text> This is some text <tag> with a tag added </tag> and then some text
> > </text>
> > ]
> >
> > forall y [
> > all [tag? y/1 y/1 <> <text> y/1 <> </text> insert next y <text>]
> > all [y/1 = </text> remove y y: back y]
> > ]
> >
> > probe y: head y
> >
> > Perhaps your rules are a bit more complicated in which case you need
> > to define them and then see what's the best way to do it.  Parse may
> > be necessary but this simple case can be done quickly another way.
> >
> > Sterling
> >
