Hi Holger,

Thanks for your response. It is good that hash works for so many REBOL
datatypes. I note that with a few exceptions the hashable values are of type
any-string, any-word, or the so-called simple types in which the data is
included in the REBOL value.

I have not had a chance to test all of these possiblities. However I
question whether decimal keys in a block of consecutive key-value pairs are
hashable, or more precisely, the benefit of using a hash with decimal keys.
I ran the following test script:

REBOL []
clk: func [][now/time/precise]
;set up blocks and hashes with block values
;and either integer or decimal keys
;measure the time to find the last key in the block

b1: make block! 10000
b2: make block! 10000

repeat j 10000 [
 append b1 j append/only b1 reduce [j]
]
repeat j 10000 [
 append b2 to-decimal j append/only b2 reduce [j]
]

h1: to-hash b1
h2: to-hash b2

t1: clk loop 1000 [find b1 10000] t1: clk - t1
t2: clk loop 1000 [find b2 10000.0] t2: clk - t2

print ["Find integer key in block:" t1]
print ["Find decimal key in block:" t2]


t1: clk loop 1000 [find h1 10000] t1: clk - t1
t2: clk loop 1000 [find h2 10000.0] t2: clk - t2

print ["Find integer key in hash:" t1]
print ["Find decimal key in hash:" t2]
;end of code

Here are my results for the current versions of core and view (450 MHz PII):

system/version 2.5.0.3.1
Find integer key in block: 0:00:02.04
Find decimal key in block: 0:00:13.4
Find integer key in hash: 0:00
Find decimal key in hash: 0:00:14.66

system/version 1.2.1.3.1
Find integer key in block: 0:00:01.92
Find decimal key in block: 0:00:15.27
Find integer key in hash: 0:00
Find decimal key in hash: 0:00:16.59

It is interesting that the search for a decimal key takes 6 to 8 times
longer than for an integer key.

I noticed that if I change the code so that the decimal search keys are of
the form j / 3 instead of to-decimal j, and then search for 10000 / 3 the
times are much better, especially for the hash:

Find integer key in block: 0:00:02.03
Find decimal key in block: 0:00:09.23
Find integer key in hash: 0:00
Find decimal key in hash: 0:00:04.78

This suggests that comparing decimals which are equal to integers (1000 and
1000.0) requires some extra time. But this does not seem to be confirmed by
a direct test:

>> a: 10000
== 10000
>> b: 10000.0
== 10000
>> c: 10000 / 3
== 3333.33333333333
>> t: clk loop 5'000'000 [equal? a a] clk - t
== 0:00:06.26
>> t: clk loop 5'000'000 [equal? b b] clk - t
== 0:00:09.89
>> t: clk loop 5'000'000 [equal? c c] clk - t
== 0:00:10.11

>From the original example with decimal keys equal to integers, it appears
that using hash with decimal keys actually takes more time for FIND than
when using a block. But with integer keys the hash makes FIND so fast that
the loop iterations have to go up by a factor of 100 inorder to get a
measurable time.

I also see this:

>> t: clk loop 1000 [find h2 10000.0] clk - t
== 0:00:14.01
>> t: clk loop 1000 [find b2 10000.0] clk - t
== 0:00:14.83
>> t: clk loop 1000 [find h2 5000.0] clk - t
== 0:00:14.11
>> t: clk loop 1000 [find b2 5000.0] clk - t
== 0:00:07.36                  ;twice as fast as hash

It does seem that the hashes with decimal keys take a time independent of
the location of the search key in the block while FIND on a block searches
linearly from the beginning and thus results in half the time to find a key
half-way through the block. Constant time is certainly a hallmark of a hash,
but the hash time for FIND is so slow for decimal keys equal to integers
that searching for a random key will take on average almost twice as long as
when using a block. For the more representative j / 3 case, the times for
block and hash will be comparable for a random search key, while the worst
case will be a factor of 2 better for the hash.

I guess I don't fully understand REBOL hashes, particularly with respect to
decimal keys. Of the datatypes you mention, which will fail to display the
large speed increase we expect from a hash (and which we see for string and
integer keys) as opposed to a block ?

Any comments are welcome.

-Larry


----- Original Message -----
From: <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Tuesday, January 08, 2002 12:47 PM
Subject: [REBOL] Re: hash questions


> On Tue, Jan 08, 2002 at 12:26:24PM -0800, Larry Palmiter wrote:
> > Hi David,
> >
> > Joel has already given an excellent overview of using the hash!
datatype. I
> > just want to mention an additional issue with the hash! type.
> >
> > AFAIK, Only strings and integers are hashed to provide fast search. So
for a
> > block which contains consecutive  key-value pairs, converting the block
to a
> > hash will be a benefit only when the keys are either integers or
strings.
> > Does anyone know if other REBOL datatypes can be hashed?
>
> string, issue, binary, image, file, email, url, tag, word, set-word,
> get-word, lit-word, refinement, logic, integer, char, decimal, money,
> time, date, tuple, pair, object.
>
> A lot more than just integers and strings :).
>
> --
> Holger Kruse
> [EMAIL PROTECTED]


-- 
To unsubscribe from this list, please send an email to
[EMAIL PROTECTED] with "unsubscribe" in the 
subject, without the quotes.

Reply via email to