Hi Jeff,
       Here is a strategy inspired by the hashing functions in Python and
Julia.  I hope it covers all the bases: integers, floats, rationals, complex
numbers and arbitrary-precision types. Please go through it and give me your
feedback.
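
For reference, the Python behaviour this strategy borrows from can be checked directly. The snippet below is only an illustration using Python's built-in hash and standard numeric types; the variable names are my own.

```python
from decimal import Decimal
from fractions import Fraction

# Numerically equal values of different types share one hash in Python.
one_as_each_type = [1, 1.0, True, Fraction(1, 1), Decimal(1), complex(1, 0)]
hashes_of_one = {hash(v) for v in one_as_each_type}

# The same holds for a non-integral value that every type represents exactly.
hashes_of_half = {hash(0.5), hash(Fraction(1, 2)), hash(Decimal("0.5"))}

print(len(hashes_of_one), len(hashes_of_half))
```

Both sets end up with a single element, which is the property the proposal below tries to reach for Julia's numeric types.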

Please note that the Julia code below is meant as pseudocode and has not been
tested. Also, hash64 is a function already implemented in Julia.

hash(x::Union(Bool,Char,Int8,Uint8,Int16,Uint16,Int32,Uint32,Int64,Uint64)) =
    hash64(uint64(x))

hash(x::Union(Float16, Float32, Rational)) = hash(float64(x))

function hash(x::Union(Int128, Uint128))
    x = uint128(x)
    x_one = uint64(x >> 64)           # high 64 bits
    x_two = uint64((x << 64) >> 64)   # low 64 bits
    hash((x_one, x_two))
end
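
To make the 128-bit split concrete, here is a small Python sketch of the same bit manipulation. The names split_128, MASK64 and MASK128 are hypothetical helpers of mine, and the masking step assumes a two's-complement view of negative values.

```python
MASK64 = (1 << 64) - 1
MASK128 = (1 << 128) - 1

def split_128(x):
    """Return the (high, low) 64-bit halves of x viewed as an unsigned 128-bit value."""
    u = x & MASK128               # two's-complement view of negative inputs
    return (u >> 64, u & MASK64)  # same halves as x_one and x_two above

print(split_128((5 << 64) | 7))
```

Hashing the pair of halves keeps every bit of the value, at the cost that a small Int128 does not automatically hash like the same value stored as an Int64.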

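function hash(x::Float64)
    fracpart, integral = modf(x)  # modf returns (fractional part, integral part)
    if fracpart == 0
        # Convert to an integer first: hash(integral) on a Float64 would
        # dispatch back to this method. Values outside the Int64 range
        # would need the Int128/BigInt path instead.
        hash(int64(integral))
    else
        # custom hashing algorithm for floats, taking fracpart into account
    end
end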
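
The routing idea for floats can be sketched in Python as follows. This is only an illustration: hash_float is a hypothetical name, the tuple-based fallbacks are placeholders for the custom float hashing that is still to be designed, and the treatment of NaN and infinities is my own assumption.

```python
import math

def hash_float(x):
    """Hash a float so that integral values agree with the integer hash."""
    if math.isnan(x) or math.isinf(x):
        # Placeholder choice: non-finite values have no integer equivalent.
        return hash(("non-finite", repr(x)))
    fracpart, integral = math.modf(x)  # the fractional part comes first
    if fracpart == 0:
        return hash(int(integral))     # hash through the integer path
    return hash(("float", x))          # stand-in for the custom float hashing

print(hash_float(4.0) == hash(4))
```

Python integers are arbitrary precision, so the conversion never overflows here; in Julia the integral part would have to go to a wide enough integer type first.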

function hash(x::Complex)
    if imag(x) == 0
        hash(real(x))
    else
        hash((real(x), imag(x)))
    end
end

function hash(x::BigInt)
    if typemin(Int128) <= x <= typemax(Int128)
        return hash(int128(x))
    else
        return hash(string(x))
    end
end

function hash(x::BigFloat)
    integral, fracpart = modf_equivalent_of_bigfloat(x)
    if fracpart == 0
        # Convert to BigInt first; hashing the BigFloat itself would recurse.
        return hash(BigInt(integral))
    else
        return hash(string(x))
    end
end
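
Whatever the final scheme looks like, the property to test is that any two values that compare equal also hash equal. A minimal Python sketch of such a check follows; hash_violations and typed_hash are hypothetical helpers, and the sample values are only examples.

```python
from fractions import Fraction
from itertools import combinations

def hash_violations(values, hash_fn):
    """Return the pairs that compare equal but hash differently under hash_fn."""
    return [(a, b) for a, b in combinations(values, 2)
            if a == b and hash_fn(a) != hash_fn(b)]

def typed_hash(v):
    # Deliberately type-sensitive hash, to show what the check catches.
    return hash((type(v).__name__, v))

samples = [4, 4.0, Fraction(4, 1), complex(4, 0),
           2.5, Fraction(5, 2), 2**100, float(2**100)]

print(hash_violations(samples, hash))
print(len(hash_violations(samples, typed_hash)) > 0)
```

The same kind of pairwise check, run over a mix of Julia numeric types, would quickly show where the methods above still disagree.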




On Tue, Jan 21, 2014 at 10:43 AM, Jeff Bezanson <[email protected]> wrote:

> The main reason is that there are many types of numbers, with more
> added all the time. And for purposes of hash tables, it is difficult
> to ensure that all numerically-equal numbers hash the same. So we had
> isequal(), which is used by dictionaries, distinguish numbers of
> different types. At this point, we would kind of like to change this
> back and make isequal more liberal (although it would still
> distinguish -0.0 and 0.0, and so not be strictly more liberal than
> ==). However, the hashing problem remains. Any ideas are welcome.
>
>
> On Mon, Jan 20, 2014 at 11:54 PM, Sharmila Gopirajan Sivakumar
> <[email protected]> wrote:
> > julia> x = int32(4)
> > 4
> >
> > julia> y = int64(4)
> > 4
> >
> > julia> x == y
> > true
> >
> > julia> x in [y]
> > false
> >
> > I'm not sure if this behaviour is expected. Logically, if x == y evaluates
> > to true, x in [y] should also evaluate to true. The difference arises
> > because the implementation of in uses the 'isequal' function to check for
> > equality instead of ==. 'isequal' is documented as follows.
> >
> > isequal(x, y)
> >
> > True if and only if x and y have the same contents. Loosely speaking, this
> > means x and y would look the same when printed. This is the default
> > comparison function used by hash tables (Dict). New types with a notion of
> > equality should implement this function, except for numbers, which should
> > implement == instead. However, numeric types with special values might need
> > to implement isequal as well. For example, floating point NaN values are
> > not ==, but are all equivalent in the sense of isequal. Numbers of
> > different types are considered unequal. Mutable containers should
> > generally implement isequal by calling isequal recursively on all contents.
> >
> > Why is isequal more liberal than == in some cases (like floating point
> > NaNs) and stricter than == in others (numbers of different types are
> > considered unequal)? In the above case, I believe it should evaluate to
> > true, but I assume there is a good reason for it being the way it is.
> > Will someone please clarify why?
>
