Carl:
> And, out of interest, I converted my index-sort to work with the file
> in memory, (since you kindly left out disk-access in the timing:),
> and got a result of 01:48.26. Joel's method still wins though. (:
> I've 32megs, but didn't have any memory problems.
I've incorporated your code into the set of benchmarks I seem to be
gathering.
You are faster than Joel _sometimes_ on my machine -- the results are too
close to call without a properly refereed benchmark environment.
Your code makes fewer assumptions than Joel's -- yours sorts on the first
four characters of a line. That should be in the format "b999" (b=space,
9=digit), but your code will work if it isn't.
Gregg and I make even fewer (or maybe different) assumptions -- we sort on
the first non-whitespace string regardless of length. Whether in practice
that is more future-proof is unknowable.
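
To make the difference concrete, here is a minimal sketch of the two key
extractions side by side. The sample line is made up for illustration (it is
not taken from louis-data.txt), and the sketch assumes Rebol 2, where parse
with a string rule splits its input on whitespace and the given delimiters.

```rebol
rebol []

line: " 123 some text"           ;; hypothetical sample line in "b999" format

fixed-key: copy/part line 4      ;; index-sort style: first four characters
token-key: first parse line " "  ;; parse style: first whitespace-delimited string

probe fixed-key
probe token-key
```

The fixed-width key keeps the leading space and stops at four characters; the
parsed key takes whatever the first token happens to be, however long it is.
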
I've tweaked both my code and Scott's by replacing "second parse/all" with
"first parse". That is 20% faster -- but we still won't win on raw speed.
I ran the benchmark 5 times with:
loop 5 [do %sort-test.r]
It shows some interesting deterioration for long-running Rebol consoles:
41.41 -- Sunanda's parse in the Sort
75.51
78.43
78.43
81.73
8.19 -- Gregg's parse before the sort
8.41
8.41
8.52
8.67
2.25 -- Scott's index sort
2.25
2.30
2.30
2.42
2.31 -- Joel's bridge sort
2.26
2.31
2.30
2.25
The second and subsequent runs of the loop have all the interim data
structures in place, possibly getting in our way.
But look at the way my sort deteriorates!! It is possible that my code gets
slugged by an internal garbage-collection, and if we ran it enough times any
of the benchmarks could be hit by that.
But I've seen something similar when repeatedly doing 'layout on very large
(over 1000) faces. Gradually, treacle gets poured into the fine workings of
the Rebol interpreter, and it slows down. Even the undocumented
recycle/torture doesn't get it back to speed.
All of our methods, except Joel's, show signs of slowing down under repeated
usage. Shows the importance of stress-testing any script that is expected to
run long and hard.
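
For anyone wanting to try this kind of repeated-run check on their own code,
a minimal sketch follows. The helper name time-it and the random sample data
are assumptions made up for illustration; neither is part of the benchmark
script. It also assumes the runs do not straddle midnight, since it
subtracts now/time/precise values.

```rebol
rebol []

;; time-it: hypothetical helper -- evaluates a block and returns elapsed time
time-it: func [body [block!] /local start] [
    start: now/time/precise
    do body
    now/time/precise - start
]

;; made-up data standing in for the real file
sample: copy []
repeat n 10000 [append sample form random 999]

;; run the same work several times in one console session and
;; watch whether the elapsed time creeps upward
repeat run 5 [
    recycle                      ;; request a garbage collection before each run
    print [run time-it [sort copy sample]]
]
```

If the later runs are consistently slower than the first, that is the same
sort of deterioration as in the figures above.
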
I've repeated below the benchmark code I've used here. Apologies to
anyone for whom this email is too long and of no interest.
Sunanda.
rebol []
report-item: func [what [string!] /start /end /local times] [
    times: []
    if start [
        clear times
        append times now/time/precise
        print ["rebol " system/version " -- Started" what times/1]
    ]
    if end [
        append times now/time/precise
        print ["rebol " system/version " -- Ended" what times/2 " elapsed: "
            times/2 - times/1 ]
    ]
]
data: read/lines %louis-data.txt
;; data: copy/part data 100 ;;de-comment for less data
;; loop 5 [append data data] ;;de-comment for more data
;; Sunanda -- Parse every sort compare
;; -----------------------------------
unset 'sorted-data
recycle
report-item/start "parse/stable"
sort/compare sorted-data: copy data func [a b /local a-key b-key ] [
    a-key: first parse a " "
    b-key: first parse b " "
    either a-key < b-key [return -1]
        [either a-key = b-key [return 0] [return +1]]
]
report-item/end "parse/stable"
write/lines %sorted-data-parse.txt sorted-data
;; Scott -- pre-parsed before sort
;; -------------------------------
unset 'sorted-data
recycle
report-item/start "pre-parsed/stable"
data-blk: copy []
foreach datum data [
    append data-blk first parse datum " "
    append data-blk datum
]
sort/skip data-blk 2
sorted-data: copy []
foreach [key value] data-blk [
    append sorted-data value
]
report-item/end "pre-parsed/stable"
write %sorted-data-pre-parsed.txt ""
foreach value sorted-data [
    write/lines/append %sorted-data-pre-parsed.txt value
]
;; Carl -- index sort
;; ------------------
unset 'sorted-data
recycle
report-item/start "index sort"
file-index: array 2 * length? data
ptr: 1
foreach item data [
    poke file-index ptr copy/part item 4
    poke file-index ptr + 1 item
    ptr: ptr + 2
]
sorted-data: extract next sort/skip file-index 2 2
report-item/end "index sort"
write/lines %sorted-data-index.txt sorted-data
;; Joel -- bridge sort
;; -------------------
unset 'sorted-data
recycle
report-item/start "bridge sort"
buffer: []
foreach item data [
    nr: to-integer copy/part next item 3
    while [
        nr > length? buffer
    ][
        insert/only tail buffer copy []
    ]
    append buffer/:nr item
]
sorted-data: copy []
foreach group buffer [
    foreach line group [
        append sorted-data line
    ]
]
report-item/end "bridge sort"
write %sorted-data-bridge.txt ""
foreach value sorted-data [
    write/lines/append %sorted-data-bridge.txt value
]
unset 'sorted-data
recycle
;; Verify sort results
;; ===================
report-item/start "verifying results are the same"
parsed-file: read %sorted-data-parse.txt
pre-parsed-file: read %sorted-data-pre-parsed.txt
bridge-file: read %sorted-data-bridge.txt
indexed-file: read %sorted-data-index.txt
either all [
    parsed-file = pre-parsed-file
    parsed-file = bridge-file
    parsed-file = indexed-file
]
    [print "got same results"]
    [print "bad code in there somewhere"]
report-item/end "verifying results are the same"
print "done"