[issue8188] Unified hash for numeric types.

2010-06-11 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

Committed the Decimal-to-Fraction comparisons in r81893.  All numeric types 
should now compare nicely with each other.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8188
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8188] Unified hash for numeric types.

2010-05-23 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

Committed the hash changes in r81486.  This commit just changes the method for 
computing hash values;  it doesn't include the changes to the decimal module 
that make Decimal instances comparable with Fraction instances.

--
resolution:  - accepted
stage: commit review - committed/rejected
status: open - closed

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8188
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8188] Unified hash for numeric types.

2010-05-22 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

Updated patch:

 - make hash(m/P) preserve sign, as discussed earlier
 - add details for computing the hash of a complex number
 - reorganize sys.hash_info
- drop sys.hash_info.bits (the exponent of the Mersenne prime);
  it's not needed in the Python code, and it can be deduced from
  the prime itself if necessary.  This also means that there's no
  public requirement that the prime be a Mersenne prime.
- drop sys.hash_info.ninf;  just use -sys.hash_info.inf instead

- add sys.hash_info.width:  the underlying width in bits for hashes
  of all descriptions;  in other words, it's just the number of bits
  in a C long in the current implementation
- add sys.hash_info.imag, the multiplier used for the imaginary
  part of a complex number

--
Added file: http://bugs.python.org/file17437/numeric_hash8.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8188
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8188] Unified hash for numeric types.

2010-04-21 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

 Regarding a hash function for complex types, I think documenting the 
 existing behavior for PyComplex is sufficient. The magic number 103  
 could be documented in hash_info as 'multiplier' and _PyHASH_MULTIPLIER. 

Seems reasonable;  I'm tempted to call the constant it hash_info.imaginary and 
_PyHASH_IMAGINARY, though.  :)  

There's also an implicit parameter in this algorithm, namely the size of a C 
long;  I think this should go into sys.hash_info, too.

complex_hash does need fixing in one respect:  it currently depends on signed 
overflow wrapping modulo 2**BIT_IN_LONG, but that's undefined behaviour;  it 
should use unsigned arithmetic instead.

 I think hash(m/P) should preserve sign. It just seems more symmetrical. :)

Agreed.  Along similar lines, I think I'm also going to get rid of _PyHASH_NINF 
and just use -PyHASH_INF instead.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8188
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8188] Unified hash for numeric types.

2010-04-20 Thread Case Van Horsen

Case Van Horsen cas...@gmail.com added the comment:

I've spent some time working through the new hash function by re-implementing 
it for gmpy. Very elegant design.

I like _PyHASH_MODULUS better, too.

Regarding a hash function for complex types, I think documenting the existing 
behavior for PyComplex is sufficient. The magic number 103 could be 
documented in hash_info as 'multiplier' and _PyHASH_MULTIPLIER. The same 
constant, but a different algorithm, is also used when hashing a tuple.

I think hash(m/P) should preserve sign. It just seems more symmetrical. :)

--
nosy: +casevh

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8188
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8188] Unified hash for numeric types.

2010-04-13 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

Many thanks for reviewing, Stefan, and for the patch.

Here's an updated patch:
 - specify 32-bit/64-bit C long rather than 32-bit/64-bit machine
 - apply hash-hash_ fix to Python hash recipe
 - use _PyHASH_MODULUS instead of _PyHASH_MASK throughout (this
   was bugging me).
 - reorganize the stdtypes doc entry slightly
 - update against current svn, and remove outdated test_float tests
   for the values of float('inf') and float('nan')

One unresolved issue:  it would probably make sense to specify (publicly) a 
hash algorithm for complex types, too, so that someone implementing e.g. 
Gaussian integers can make their hash function agree with that for the complex 
type where appropriate.

That hash algorithm would probably be as simple as:

  hr = hash(x.real)
  hi = hash(x.imag)
  return some suitably bit-mixing combination of hi and hr

where the algorithm for the combination needs to be specified explicitly, and 
any relevant parameters put into sys.hash_info.
(Unfortunately, -1 doesn't have square roots modulo the prime P used, else we 
could do the cute thing and make use of a square root of -1 modulo P.)

Another tiny detail:  I'm wondering whether hash(m/P) should care about the 
sign of m:  currently it doesn't, which means that the symmetry hash(-x) = 
-hash(x) *almost* always holds for a rational x, but not always.  An 
almost-always-true symmetry seems like a recipe for hard-to-find bugs.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8188
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8188] Unified hash for numeric types.

2010-04-13 Thread Mark Dickinson

Changes by Mark Dickinson dicki...@gmail.com:


--
assignee:  - mark.dickinson

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8188
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8188] Unified hash for numeric types.

2010-04-13 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

... and here's the actual patch...  Forget my own head next. :)

--
Added file: http://bugs.python.org/file16907/numeric_hash7.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8188
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8188] Unified hash for numeric types.

2010-04-13 Thread Mark Dickinson

Changes by Mark Dickinson dicki...@gmail.com:


--
priority:  - normal

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8188
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8188] Unified hash for numeric types.

2010-04-06 Thread Stefan Krah

Stefan Krah stefan-use...@bytereef.org added the comment:

I've finished reviewing the patch and I think it's quite ready to be
applied.

The documentation in stdtypes.rst says that P = 2**61-1 on 64-bit
machines. This should be changed to reflect the fact that actually
sizeof long is the deciding factor. I attach a patch that also fixes
the typo pointed out by Christophe Simonis.

--
Added file: http://bugs.python.org/file16782/doc_stdtypes.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8188
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8188] Unified hash for numeric types.

2010-04-04 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

New patch:

 - document and test sys.hash_info
 - document numeric hash definition (in Doc/library/stdtypes.rst;  I'm
   not sure whether this is the best place for it)
 - document Decimal change (Decimal instances are now comparable
   with instances of float, fraction.Fraction)
 - refresh patch to apply cleanly to current svn.

I think this is close to final form:  I intend to apply this patch (or 
something very much like it) soon;  any review would be appreciated.

--
nosy: +rhettinger
stage:  - commit review
Added file: http://bugs.python.org/file16756/numeric_hash6.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8188
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8188] Unified hash for numeric types.

2010-04-04 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

I've refreshed the Rietveld patch as well:

http://codereview.appspot.com/660042

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8188
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8188] Unified hash for numeric types.

2010-04-04 Thread Stefan Krah

Stefan Krah stefan-use...@bytereef.org added the comment:

Mark, very nice concept! - I'm just starting to review the patch, but I
think the unsigned longs in_Py_HashDouble() and long_hash() should be
uint64_t on a 64-bit OS.

For instance, now on Windows 64-bit:

 hash(2**61-1)
1073741823

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8188
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8188] Unified hash for numeric types.

2010-04-04 Thread Stefan Krah

Stefan Krah stefan-use...@bytereef.org added the comment:

Actually the current long_hash() is affected as well. On Windows 64-bit:

 hash(2**31)
-2147483648

 hash(2**32)
1

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8188
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8188] Unified hash for numeric types.

2010-04-04 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

Yes, hash values are C longs, regardless of platform.  I think that's probably 
too ingrained to consider changing it (we'd have to change hashes of all the 
non-numeric types, too).

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8188
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8188] Unified hash for numeric types.

2010-03-27 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

Here's a version of the patch that adds exact comparisons between the various 
numeric types.  The only slightly tricky comparison is the Fraction - Decimal 
one:  an obvious strategy is to convert the Decimal exactly to a Fraction and 
then use the fraction comparison, but this is inefficient for Decimal instances 
with large exponent.  So instead, we compare a Decimal `x` with a Fraction 
`n/d` by comparing `x*d` with `n` in the Decimal domain.

--
Added file: http://bugs.python.org/file16675/numeric_hash5.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8188
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8188] Unified hash for numeric types.

2010-03-24 Thread Stefan Krah

Changes by Stefan Krah stefan-use...@bytereef.org:


--
nosy: +skrah

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8188
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8188] Unified hash for numeric types.

2010-03-23 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

Another update, partly to address comments raised by Guido on Rietveld.  I'll 
upload these changes to Rietveld later today.

 - rename sys._hash_info to sys.hash_info and make it public rather than 
private (it still needs docs somewhere)

 - add some explanatory comments to long_hash; remove an outdated comment

 - fix missing error check (in previous patch) in slot_tp_hash.  slot_tp_hash 
also now always raises a TypeError if __hash__ returns a non-integer;  this is 
a change from current behaviour, which allows small floats to be returned by 
__hash__, but not large floats (where large means  2**31 or  2**63 in 
absolute value, depending on the system).  I'm assuming this was unintentional 
(the docs specify that __hash__ should return an integer).

 - simplify specification of hash function slightly:  for nonnegative x it 
simply computes the reduction of x;  previously it computed 1 + reduction of 
(x-1) for positive values.  This extra +-1 doesn't really add anything of 
value, and makes it slightly more complicated and error-prone to write your own 
hash function.

--
Added file: http://bugs.python.org/file16629/numeric_hash4.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8188
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8188] Unified hash for numeric types.

2010-03-21 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

Updated patch:

 - put hash parameters into pyport.h, to avoid repetition;  make them
   available to Python code via a private attribute sys._hash_info.

 - use a modulus of 2**61-1 on systems where SIZEOF_LONG = 8, and
   a modulus of 2**31 - 1 otherwise.

 - remove _invmod from fractions module.  It's faster (and easier) to
   use 3-argument pow to compute inverses modulo a prime.

 - add a few more tests.

--
Added file: http://bugs.python.org/file16610/numeric_hash3.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8188
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8188] Unified hash for numeric types.

2010-03-20 Thread Mark Dickinson

New submission from Mark Dickinson dicki...@gmail.com:

Here's a patch that makes hash(x) == hash(y) for any numeric types (int, float, 
complex, Decimal, Fraction, bool) when x and y are numerically equal.

This is a prerequisite for making all numeric types accurately comparable with 
each other.

--
files: numeric_hash.patch
keywords: patch
messages: 101395
nosy: mark.dickinson
severity: normal
status: open
title: Unified hash for numeric types.
type: feature request
versions: Python 3.2
Added file: http://bugs.python.org/file16604/numeric_hash.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8188
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8188] Unified hash for numeric types.

2010-03-20 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

Uploaded to Rietveld:

http://codereview.appspot.com/660042

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8188
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8188] Unified hash for numeric types.

2010-03-20 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

Updated patch, with a bit of cleanup and some comments describing the hashing 
strategy;  I'll update the Rietveld issue as well.

--
Added file: http://bugs.python.org/file16606/numeric_hash2.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8188
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8188] Unified hash for numeric types.

2010-03-20 Thread Mark Dickinson

Changes by Mark Dickinson dicki...@gmail.com:


Removed file: http://bugs.python.org/file16606/numeric_hash2.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8188
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8188] Unified hash for numeric types.

2010-03-20 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

Whoops;  that patch included some accidental Lib/test/test_decimal changes.  
Here's the correct patch.

--
Added file: http://bugs.python.org/file16607/numeric_hash2.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8188
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8188] Unified hash for numeric types.

2010-03-20 Thread Adam Olsen

Adam Olsen rha...@gmail.com added the comment:

Why aren't you using 64-bit hashes on 64-bit architectures?

--
nosy: +Rhamphoryncus

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8188
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8188] Unified hash for numeric types.

2010-03-20 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

 Why aren't you using 64-bit hashes on 64-bit architectures?

Mostly because I haven't got around to putting that in yet.  :)

Ideal would be to use _PyHASH_BITS=61 for 64-bit machines, throughout.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8188
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8188] Unified hash for numeric types.

2010-03-20 Thread Adam Olsen

Adam Olsen rha...@gmail.com added the comment:

I assume you mean 63. ;)

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8188
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8188] Unified hash for numeric types.

2010-03-20 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

No, I mean 61.  2**61 - 1 is prime;  2**63-1 is not.  (So 2 bits of the hash 
get wasted, but that's not a big deal, especially since they're the high-end 
bits and Python mostly cares about the lower-order bits.)

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8188
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8188] Unified hash for numeric types.

2010-03-20 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

Restore tests accidentally omitted from second patch.

--
Added file: http://bugs.python.org/file16608/numeric_hash2.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8188
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8188] Unified hash for numeric types.

2010-03-20 Thread Mark Dickinson

Changes by Mark Dickinson dicki...@gmail.com:


Removed file: http://bugs.python.org/file16607/numeric_hash2.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8188
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8188] Unified hash for numeric types.

2010-03-20 Thread Eric Smith

Changes by Eric Smith e...@trueblade.com:


--
nosy: +eric.smith

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8188
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com