[issue3439] math.frexp and obtaining the bit size of a large integer
Fredrik Johansson [EMAIL PROTECTED] added the comment: Some elaboration (that perhaps could be adapted into the documentation or at least source comments). There are two primary uses for numbits, both of which justify (0).numbits() == 0. The first is that for positive k, n = k.numbits() gives the minimum width of a register that can hold k, where a register can hold the 2**n integers 0, 1, ..., 2**n-1 (inclusive). This definition continues to make sense for k = 0, n = 0 (the empty register holds the 2**0 = 1 values 0). In Python terms, one could say that self.numbits() returns the smallest n such that abs(self) is in range(2**n). Perhaps this would make a clearer docstring? Second, k.numbits() (plus/minus 1, or perhaps multiplied by a constant factor) measures the number of steps required to solve a problem of size k using various divide-and-conquer algorithms. The problem of size k = 0 is trivial and therefore requires (0).numbits() == 0 steps. In particular, if L is a sorted list, then len(L).numbits() exactly gives the maximum number of comparisons required to find an insertion point in L using binary search. Finally, the convention (-k).numbits() == k.numbits() is useful in contexts where the number k itself is the input to a mathematical function. For example, in a function for multiplying two integers, one might want to choose a different algorithm depending on the sizes of the inputs, and this choice is likely to be independent of signs (if not, one probably needs to check signs anyway.) ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3439 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3439] math.frexp and obtaining the bit size of a large integer
STINNER Victor [EMAIL PROTECTED] added the comment: See also issue #3724 which proposes to support long integers for math.log2(). -- nosy: +haypo ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3439 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3439] math.frexp and obtaining the bit size of a large integer
Terry J. Reedy [EMAIL PROTECTED] added the comment: To add support to the proposal: there is currently yet another thread on c.l.p on how to calculate numbits efficiently. The OP needs it for prototyping cryptographic algorithms and found Python-level code slower than he wanted. -- nosy: +tjreedy ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3439 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3439] math.frexp and obtaining the bit size of a large integer
Fredrik Johansson [EMAIL PROTECTED] added the comment: Wow, newbie error. Thanks for spotting! In every case I can think of, I've wanted (0).numbits() to be 0. The explanation in the docstring can probably be improved. What other documentation is needed (where)? ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3439 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3439] math.frexp and obtaining the bit size of a large integer
Mark Dickinson [EMAIL PROTECTED] added the comment: In every case I can think of, I've wanted (0).numbits() to be 0. Me too, in most cases, though I've encountered the occasional case where raising ValueError would be more appropriate and would catch some bugs that might otherwise pass silently. So I agree that (0).numbits() should be 0, but I think there's enough potential ambiguity that there should be a sentence in the documentation making this explicit. Two of the most obvious (wrong) formulas for numbits are: (1) numbits(n) = ceiling(log_2(n)), and (2) numbits(n) = len(bin(n))-2, but neither of these formulas gives the right result for 0, or for negative numbers for that matter. The explanation in the docstring can probably be improved. What other documentation is needed (where)? The docstring looked okay to me. There should be more comprehensive ReST documentation in the Doc/ directory somewhere, probably in Doc/library/stdtypes.rst ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3439 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3439] math.frexp and obtaining the bit size of a large integer
Mark Dickinson [EMAIL PROTECTED] added the comment: With the patch, the following code causes a non-keyboard-interruptible interpreter hang. from sys import maxint (-maxint-1).numbits() [... interpreter hang ...] The culprit is, of course, the statement if (n 0) n = -n; in int_numbits: LONG_MIN is negated to itself (this may even be undefined behaviour according to the C standards). The patch also needs documentation, and that documentation should clearly spell out what happens for zero and for negative numbers. It's not at all clear that everyone will expect (0).numbits() to be 0, though I agree that this is probably the most useful definition in practice. One could make a case for (0).numbits() raising ValueError: for some algorithms, what one wants is an integer k such that 2**(k-1) = abs(n) 2**k; when n == 0 no such integer exists. Other than those two things, I think the patch looks fine. ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3439 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3439] math.frexp and obtaining the bit size of a large integer
Mark Dickinson [EMAIL PROTECTED] added the comment: One possible fix would be to compute the absolute value of n as an unsigned long. I *think* the following is portable and avoids any undefined behaviour coming from signed arithmetic overflow. unsigned long absn; if (n 0) absn = 1 + (unsigned long)(-1-n); else absn = (unsigned long)n; Might this work? Perhaps it would also be worth changing the tests in test_int from e.g. self.assertEqual((-a).numbits(), i+1) to self.assertEqual(int(-a).numbits(), i+1) This would have caught the -LONG_MAX error. ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3439 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3439] math.frexp and obtaining the bit size of a large integer
Mark Dickinson [EMAIL PROTECTED] added the comment: I'd also be interested in having _PyLong_NumBits exposed to Python in some way or another. It's something I've needed many times before, and it's used in the decimal module, for example. My favorite workaround uses hex instead of bin: 4*len('%x'%x) - correction_dictionary[first_hexdigit_of_x] but this is still O(log x) rather than O(1). -- nosy: +marketdickinson ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3439 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3439] math.frexp and obtaining the bit size of a large integer
New submission from Fredrik Johansson [EMAIL PROTECTED]: Python 3.0b2 (r30b2:65106, Jul 18 2008, 18:44:17) [MSC v.1500 32 bit (Intel)] on win32 Type help, copyright, credits or license for more information. import math math.frexp(10**100) (0.5714936956411375, 333) math.frexp(10**1000) Traceback (most recent call last): File stdin, line 1, in module OverflowError: Python int too large to convert to C double (Same behavior in Python 2.5.2, and presumably 2.6 although I haven't checked the latter.) I think it should be easy to make frexp work for large integers by calling PyLong_AsScaledDouble and adding the exponents. It would be logical to fix this since math.log(n) already works for large integers. My reason for requesting this change is that math.frexp is the fastest way I know of to accurately count the number of bits in a Python integer (it is more robust than math.log(n,2) and makes it easy to verify that the computed size is exact) and this is something I need to do a lot. Actually, it would be even more useful to have a standard function to directly obtain the bit size of an integer. If I'm not mistaken, PyLong_NumBits does precisely this, and would just have to be wrapped. Aside from my own needs (which don't reflect those of the Python community), there is at least one place in the standard library where this would be useful: decimal.py contains an inefficient implementation (_nbits) that could removed. -- components: Library (Lib) messages: 70216 nosy: fredrikj severity: normal status: open title: math.frexp and obtaining the bit size of a large integer type: feature request versions: Python 2.6, Python 3.0 ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3439 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3439] math.frexp and obtaining the bit size of a large integer
Martin v. Löwis [EMAIL PROTECTED] added the comment: Would you like to work on a patch? -- nosy: +loewis ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3439 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3439] math.frexp and obtaining the bit size of a large integer
Raymond Hettinger [EMAIL PROTECTED] added the comment: I prefer your idea to expose PyLong_Numbits(). IMO, frexp() is very much a floating point concept and should probably remain that way. -- nosy: +rhettinger ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3439 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3439] math.frexp and obtaining the bit size of a large integer
Raymond Hettinger [EMAIL PROTECTED] added the comment: Another reason to leave frexp() untouched is that it is tightly coupled to ldexp() as its inverse, for a lossless roundtrip: assert ldexp(*frexp(pi)) == pi This relationship is bound to get mucked-up or confused if frexp starts accepting large integers that are no exactly representable as floats (i.e. 2**100+1). ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3439 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3439] math.frexp and obtaining the bit size of a large integer
Fredrik Johansson [EMAIL PROTECTED] added the comment: Raymond, yes, I think that a separate numbits function would better, although exposing this functionality would not prevent also changing the behavior of frexp. As I said, math.log already knows about long integers, so handling long integers similarly in frexp would not be all that unnatural. (On the other hand, it is true that math.sqrt, math.pow, math.cos, etc could all theoretically be fixed to work with larger-than-double input, and that slippery slope is probably better avoided.) Good point about roundtripping, but the problem with that argument is that frexp already accepts integers that cannot be represented exactly, e.g.: ldexp(*frexp(10**100)) == 10**100 False Anyway, if there is support for exposing _PyLong_Numbits, should it be a method or a function? (And if a function, placed where? Should it accept floating-point input?) I'm attaching a patch (for the trunk) that adds a numbits method to the int and long types. My C-fu is limited, and I've never hacked on Python before, so the code is probably broken or otherwise bad in several ways (but in that case you can tell me about it and I will learn something :-). I did not bother to optimize the implementation for int, and the tests may be redundant/incomplete/placed wrongly. A slight problem is that _PyLong_NumBits is restricted to a size_t, so it raises an OverflowError on 32-bit platforms for some easily physically representable numbers: (13*10**9).numbits() Traceback (most recent call last): File stdin, line 1, in module OverflowError: long int too large to convert to int This would need to be fixed somehow. If numbits becomes a method, should it also be added to the Integral ABC? GMPY's mpz type, by the way, defines a method numdigits(self, base). This generalization would possibly be overkill, but it's worth considering. If it's too late to add this method/function for 2.6 and 3.0, please update the issue version field as appropriate. -- keywords: +patch Added file: http://bugs.python.org/file10972/numbits.diff ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3439 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3439] math.frexp and obtaining the bit size of a large integer
Raymond Hettinger [EMAIL PROTECTED] added the comment: numbers.Integral is already way too fat of an API. Am -1 on expanding it further. Recommend sticking with the simplest, least invasive, least pervasive version of your request, a numbits() method for ints. FWIW, in Py2.6 you can already write: def numbits(x): return len(bin(abs(x))) - 2 -- versions: +Python 2.7, Python 3.1 -Python 2.6, Python 3.0 ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3439 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com