Could this script evolve in the following ways?
1. Ignore header info like author, date and comments
2. Ignore a difference in URLs, filenames, etc. when
they are the target of a particular verb, i.e. if both
scripts read a URL but the specific value differs, the
two scripts are still doing the same thing. In fact the
coolest thing I can think of in this case would be
reporting that "these 2 scripts are functionally the
same, but the first operates on URL:
http://www.url1.com and the second on
http://www.url2.com..."
I don't mean to put the work on others, but I have
barely had the time to even dabble with Rebol yet
(hey, 'dabble with Rebol' has a cool ring to it!). I
do, however, have a knack for grammars and other
meta/abstraction concepts. And since one can look at
the template of any defined function at run time, it
seems possible to determine which tokens have
significance in a context, and which have little or
none. Sort of like being able to determine that "the
names have been changed to protect the innocent", but
the story's the same. What dost thou think?
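
To make point 2 a bit more concrete, here is a rough sketch of one
way it might be approached. It is not Sterling's method, and it is
deliberately cruder than what I described: it swaps every url! or
file! value for a placeholder word regardless of which verb it
follows, and it assumes the header has already been stripped. The
names normalize, found and target are made up for illustration.

```rebol
REBOL [Title: "Normalized compare (sketch)"]

; Hypothetical helper: build a copy of BLK in which every url! or
; file! value is replaced by the placeholder word 'target. The
; original values are collected, in order, in FOUND.
normalize: func [blk [block!] found [block!] /local out] [
    out: copy []
    foreach item blk [
        either block? :item [
            ; recurse into nested blocks so their targets are caught too
            append/only out normalize item found
        ] [
            either any [url? :item file? :item] [
                append found :item
                append out 'target
            ] [
                append/only out :item
            ]
        ]
    ]
    out
]

a-targets: copy []
b-targets: copy []
a-code: normalize [page: read http://www.url1.com] a-targets
b-code: normalize [page: read http://www.url2.com] b-targets

; If the normalized blocks match, only the targets differed.
if equal? a-code b-code [
    print "These 2 scripts are functionally the same, but"
    print ["the first operates on" mold a-targets]
    print ["and the second on" mold b-targets]
]
```

Restricting the replacement to values that follow a particular verb
(read, write, send and so on) would presumably mean consulting each
function's template, which this sketch does not attempt.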
--- [EMAIL PROTECTED] wrote:
>
> Well, it sounded fun so here's what I've got. The
> output running it
> on the two files you talked about is at the bottom.
> The diff shows a
> list of blocks with tokens and a number which is how
> many times that
> token was in the file. You may see the same token
> listed in the diff
> for each file if the number of appearances is
> different.
>
> Well, enjoy!
>
> Sterling
>
> REBOL [
> Title: "Simple token diff"
> Purpose: {
> I don't know, really. It just tries to
> figure out how many REBOL tokens are different
> between two files. Somebody thought it would
> be neat. ;) Maybe they'll make it complete and
> fix whatever lurking bugs there are in this code.}
> Author: "Sterling Newton"
> ]
>
> a: ask "File or URL #1? "
> b: ask "File or URL #2? "
>
> get-type: func [item [string!]] [
> switch/default true reduce [
> found? find item "://" [item: to-url item]
> found? find item "%" [item: to-file next item]
> ] [item: to-file item] ; default: treat the input as a file name
> item
> ]
>
> a: get-type a
> b: get-type b
>
> ; the unique tokens and totals blocks
> foreach item [a-tokens b-tokens a-totals b-totals] [
> set item copy []
> ]
>
> file1: load/next a
> file2: load/next b
>
> tokenize-block: func [
> blk [block!] tokens [block!] totals [block!]
> /local tmp idx]
> [
> while [not empty? blk] [
> either block? blk/1 [
> tokenize-block load/next form blk/1 tokens totals
> ] [
> either tmp: find tokens blk/1 [
> idx: index? tmp
> totals/:idx/2: totals/:idx/2 + 1
> ] [
> append tokens blk/1
> repend/only totals [blk/1 1]
> ]
> ]
> blk: load/next blk/2
> ]
> ]
>
> tokenize-block load/next file1 a-tokens a-totals
> tokenize-block load/next file2 b-tokens b-totals
>
> print ["The two files differ by:" length? difference
> a-tokens b-tokens "tokens."]
> print ["----- Tokens in" a "not in" b "-----"]
> foreach item intersect diff: difference a-totals
> b-totals a-totals [
> probe item
> ]
>
> print ["----- Tokens in" b "not in" a "-----"]
> foreach item intersect diff b-totals [
> probe item
> ]
> ========== results from the two web page emailing
> scripts ==========
>
> >> do %/home/moses/temp/diff.r
> File or URL #1?
> http://www.rebol.com/library/html/mailpage.html
> File or URL #2?
> http://www.rebol.com/library/html/websend.html
> The two files differ by: 14 tokens.
> ----- Tokens in
> http://www.rebol.com/library/html/mailpage.html not
> in http://www.rebol.com/library/html/websend.html
> -----
> [Email 2]
> [a 2]
> [Page 1]
> [mailpage.r 1]
> [10-Sep-1999 1]
> [page. 1]
> [(simple) 1]
> [http://www.rebol.com/releases.html</font> 1]
> ----- Tokens in
> http://www.rebol.com/library/html/websend.html not
> in http://www.rebol.com/library/html/mailpage.html
> -----
> [Page 2]
> [Emailer 1]
> [websend.r 1]
> [20-May-1999 1]
> [Fetch 1]
> [a 1]
> [and 1]
> [it 1]
> [as 1]
> [email. 1]
> [email 1]
> [http://www.rebol.com</font> 1]
>