Hi Vik,
Did you retry your program in a fresh session of Rebol? It may have been
that during your writing/testing of your program you got to a point that
triggered the Rebol GC bug (which I understand is being looked at by RT).
Regarding the keywords: are they tags or text? This might change the
approach. If, say, your keywords are part of the text, are immediately before
and after your job posting information, and are unique enough, then you
could just ignore the tags completely and parse based on your keywords.
Something like this maybe:
parse-rules: [
    some [
        thru keyword-one-text
        copy text
        to keyword-two-text
        (print text)
    ]
]
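To make that concrete, here is a minimal self-contained sketch of the keyword approach. The sample string and both keyword values are made up for illustration; substitute the text of your page and your real keywords. I have used parse/all so that spaces are treated as ordinary characters, and added "to end" so the parse consumes the rest of the input after the last match.

```rebol
; Hypothetical keywords and sample text, for illustration only.
keyword-one-text: "JOB-START"
keyword-two-text: "JOB-END"
page: "junk JOB-START Editor wanted JOB-END more junk JOB-START Coder wanted JOB-END tail"

parse-rules: [
    some [
        thru keyword-one-text       ; skip past the opening keyword
        copy text                   ; capture everything...
        to keyword-two-text         ; ...up to, not including, the closing keyword
        (print text)
    ]
    to end                          ; consume whatever follows the last match
]

parse/all page parse-rules
```

Each pass through the inner block should print one chunk of text found between the two keywords. If tags are mixed into that text you would still need to strip them afterwards.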
Also, it may not be relevant, but note that the parse function as used in
script examples ignores spaces by default (use parse/all if you want parse
to process spaces).
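A small sketch of the difference, using a made-up string. As I understand the Rebol 2 behaviour, I would expect the first and third calls to succeed and the second to fail, because with /all the space has to be matched by the rules.

```rebol
; Without /all, whitespace between matches is skipped.
print parse "hello world" ["hello" "world"]

; With /all, the space is ordinary input that the rules must account for.
print parse/all "hello world" ["hello" "world"]
print parse/all "hello world" ["hello" " " "world"]
```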
On a different track, Rebol version 2.3 has the ability to load markup. Like
this,
>> loaded-page: load/markup http://www.abc.net.au/news
loaded-page is now a block that contains values of type tag! and type
string!.
foreach item loaded-page [ if not tag? item [ print item ] ]
or use parse with block-mode rules:
abc-news-headlines: [
thru <!-- start insert of main story copy -->
some [ thru <b> copy text to </b> (print text)]
<!--end insert of copy for top stories-->
to end
]
>> parse loaded-page abc-news-headlines
Supply ship approaches rescue site as hopes fade
Muslim extremists collapse hostage release talks
Monsoon bus tragedy in central India
US bushfires not letting up
Gore pulls ahead in US presidential poll
Man falls overboard in crocodile-infested waters
Fighting couple force jumbo jet to land
Sport news
This is good if you know exactly what the values of some items in the block
are, but not so good if you need to do pattern matching. For example,
finding the title text is easy because we know a tag <title> exists in the
block.
>> copy/part find/tail loaded-page <title> 1
== ["ABC Online News - Latest Bulletin"]
Brett.
----- Original Message -----
From: <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Sunday, August 20, 2000 8:27 AM
Subject: [REBOL] A novice question
> I was trying to modify the web parser code from the
> User's Guide. The original code is like this:
> tag-parser: make object! [
> tags: make block! 100
> text: make string! 8000
> html-code: [
> copy tag ["<" thru ">"] (append tags tag) |
> copy txt to "<" (append text txt)
> ]
> parse-tags: func [site[url!]] [
> clear tags clear text
> parse read site [to "<" some html-code]
> print text
> ]
> ]
> My aim is to pick up listings from the web site to
> pick up jobs which begin with "keyword_one" and end
> with "keyword_two", but I would still like to get rid
> of the tags. So I tried this
> html-code:
> copy tag ["<" thru "keyword1"] (append tags tag) |
> copy txt to "keyword2" (append text txt)
> ]
> etc.. and then use tag-parser/parse-tags modified-url.
> But this now hangs.
> Any help welcomed by this novice.
> -Vik