[REBOL] A novice question Re:

bhandley Sat, 19 Aug 2000 19:03:40 -0700
Hi Vik,

Did you retry your program in a fresh session of Rebol? It may have been
that during your writing/testing of your program you got to a point that
triggered the Rebol GC bug (which I understand is being looked at by RT).

Regarding the keywords, are they tags or text? This might change the
approach. If, say, your keywords are part of the text, appear immediately
before and after your job posting information, and are unique enough, then
you could just ignore the tags completely and parse based on your keywords.
Something like this maybe:

    parse-rules: [
        some [
            thru keyword-one-text
            copy text
            to keyword-two-text
            (print text)
        ]
    ]
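
To try that kind of rule without a live page, here is a minimal sketch. The
sample string, the two keyword values and the word "found" are all made up
for illustration; they are not from your site. It uses parse/all so spaces
are kept, and collects the text between the keywords instead of printing it:

    ; Hypothetical keywords and page text, for illustration only.
    keyword-one-text: "JOB-START"
    keyword-two-text: "JOB-END"
    page: "junk JOB-START Rebol programmer wanted JOB-END more junk JOB-START Tester JOB-END"

    found: copy []
    parse/all page [
        some [
            thru keyword-one-text       ; skip past the opening keyword
            copy text
            to keyword-two-text         ; take everything up to the closing keyword
            (append found trim text)    ; trim strips the surrounding spaces
        ]
    ]
    foreach job found [print job]

Each pass starts with thru, so the input position always moves forward and
the some loop stops once no further opening keyword is found.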

Also, it may not be relevant, but note that the parse function as used in
script examples ignores spaces by default (use parse/all if you want parse
to process spaces).

On a different track, Rebol version 2.3 has the ability to load markup. Like
this,

    >> loaded-page: load/markup http://www.abc.net.au/news

loaded-page is now a block that contains values of type tag! and type
string!.

    foreach item loaded-page [ if not tag? item [ print item ] ]

or use parse with block-mode rules:

    abc-news-headlines: [
        thru <!-- start insert of main story copy -->
        some [ thru <b> copy text to </b> (print text)]
        <!--end insert of copy for top stories-->
        to end
    ]

    >> parse loaded-page abc-news-headlines
    Supply ship approaches rescue site as hopes fade
    Muslim extremists collapse hostage release talks
    Monsoon bus tragedy in central India
    US bushfires not letting up
    Gore pulls ahead in US presidential poll
    Man falls overboard in crocodile-infested waters
    Fighting couple force jumbo jet to land
    Sport news

This is good if you know exactly what the values of some items in the block
are, but not so good if you need to do pattern matching. For example,
finding the title text is easy because we know a tag <title> exists in the
block:

    >> copy/part find/tail loaded-page <title> 1
    == ["ABC Online News - Latest Bulletin"]
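
The same ideas can be tried offline on a small string, which load/markup
should also accept. This is only a sketch: the markup and the words
sample-page, texts, bold-texts and page-title are invented for illustration
and have nothing to do with the real ABC page.

    ; Hypothetical markup, for illustration only.
    sample-page: load/markup {<html><title>Sample Page</title><b>First</b> and <b>Second</b></html>}

    ; Keep only the strings, dropping every tag.
    texts: copy []
    foreach item sample-page [if string? item [append texts item]]

    ; Block-mode parse: collect the string that follows each <b> tag.
    bold-texts: copy []
    parse sample-page [
        some [thru <b> set text string! (append bold-texts text)]
        to end
    ]

    ; The title is the first value after the <title> tag.
    page-title: first find/tail sample-page <title>

Using set with string! in the block rule picks up a single value, which is
enough here because each <b> tag is followed directly by one string.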

Brett.

----- Original Message -----
From: <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Sunday, August 20, 2000 8:27 AM
Subject: [REBOL] A novice question


> I was trying to modify the web parser code from the
> User's Guide. The original code is like this:
> tag-parser: make object! [
> tags: make block! 100
> text: make string! 8000
>  html-code: [
>  copy tag ["<" thru ">"] (append tags tag) |
>  copy txt to "<" (append text txt)
>  ]
>  parse-tags: func [site[url!]] [
>   clear tags clear text
>   parse read site [to "<" some html-code]
>   print text
>  ]
> ]
> My aim is to pick up listings from the web site to
> pick up jobs which begin with "keyword_one" and end
> with "keyword_two", but I would still like to get rid
> of the tags. So I tried this
> html-code:

> copy tag ["<" thru "keyword1"] (append tags tag) |
> copy txt to "keyword2" (append text txt)
> ]
> etc.. and then use tag-parser/parse-tags modified-url.
> But this now hangs.
> Any help welcomed by this novice.
> -Vik