On 01/19/2012 07:31 PM, Maciej Fijalkowski wrote:
On Thu, Jan 19, 2012 at 6:46 PM, Dmitrey <[email protected]> wrote:
Hi all,
could you clarify the rules for accepting new numpypy funcs (not only for
me, but for any other possible volunteers)?
The doc I've been directed to says only "You have to test exhaustively your
module", while I would like to know more explicit rules.
For example, "at least 3 tests per func" (however, I guess the number of
tests should be expected to differ for funcs of different complexity and
variability).
Also, are there any strict rules for the testcases to be submitted, or can I,
for example, merely write

if __name__ == '__main__':
    assert array_equal(1, 1)
    assert array_equal([1, 2], [1, 2])
    assert array_equal(N.array([1, 2]), N.array([1, 2]))
    assert array_equal([1, 2], N.array([1, 2]))
    assert array_equal([1, 2], [1, 2, 3]) is False
    print('passed')

We have pretty exhaustive automated testing suites. Look for example
in pypy/module/micronumpy/test directory for the test file style.
They're run with py.test and we require at the very least full code
coverage (every line has to be executed, there are tools to check,
like coverage). Also passing "unusual" input, like sys.maxint etc., is
usually recommended. With your example, you would check if it works
for say views and multidimensional arrays. Also "is False" is not
considered good style.
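As a rough illustration of the points above, here is a minimal sketch of py.test-style tests for array_equal. It assumes stock CPython numpy (imported as np) under Python 3 rather than numpypy, and the test function names are hypothetical. The real tests in pypy/module/micronumpy/test follow that directory's own conventions, which this sketch does not try to reproduce.

```python
import sys

import numpy as np


def test_equal_inputs():
    # Scalars, lists and arrays should all be accepted.
    assert np.array_equal(1, 1)
    assert np.array_equal([1, 2], [1, 2])
    assert np.array_equal([1, 2], np.array([1, 2]))


def test_shape_mismatch():
    # Prefer "assert not ..." over "... is False".
    assert not np.array_equal([1, 2], [1, 2, 3])


def test_multidimensional():
    a = np.arange(12).reshape(3, 4)
    b = a.copy()
    assert np.array_equal(a, b)
    b[2, 3] += 1  # differ only in the very last element
    assert not np.array_equal(a, b)


def test_views():
    a = np.arange(10)
    evens = a[::2]  # strided view, shares memory with a
    assert np.array_equal(evens, [0, 2, 4, 6, 8])
    assert np.array_equal(a[::-1][::-1], a)
    assert not np.array_equal(a[:5], a[5:])


def test_extreme_values():
    # sys.maxint exists only in Python 2; sys.maxsize is used here instead.
    big = np.array([sys.maxsize, -sys.maxsize - 1])
    assert np.array_equal(big, big.copy())
    assert not np.array_equal(big, big[::-1])
```

Saved in a file whose name starts with test_, these functions are collected and run by py.test without any `if __name__ == '__main__'` block.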

Or is there a certain rule for storing files with tests?

If I or someone else submits a func with some tests like in the example
above, will you put the func and tests in the proper files yourselves? I'm
not too lazy to do it myself, but I'm simply not immersed enough in the
numpypy dev process, including mercurial branches and the numpypy file
structure, and can spend only quite limited time diving into it in the near
future.

We generally require people to put their own tests as they go with the
code (in appropriate places) because you also should not break
anything. The usefulness of a patch that has to be sliced and diced
and put into places is very limited and for straightforward
mostly-copied code, like array_equal, plain useless, since it's almost
as much work to just do it.

Well, for this func (array_equal) my docstrings really were copied from CPython numpy (why wouldn't I do this to save some time, while the license allows it?), but
* why wouldn't I go for this, while other programmers are busy with other tasks?
* the engines of my func and the CPython numpy func differ completely. First, in PyPy the CPython code just doesn't work at all (because of the problem with ndarray.flat). Second, I have implemented a workaround - just replaced some code lines by
    Size = a1.size
    f1, f2 = a1.flat, a2.flat
    # TODO: replace xrange by range in Python3
    for i in xrange(Size):
        if f1.next() != f2.next(): return False
    return True
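For context, the fragment above could sit inside a complete function roughly like the following. This is only a sketch for CPython numpy under Python 3, not the code from the pastebin: the np.asarray conversion and the shape check are assumptions added here, and zip() over the two flat iterators stands in for the xrange/.next() loop.

```python
import numpy as np


def array_equal(a1, a2):
    """Return True if a1 and a2 have the same shape and equal elements.

    Hypothetical sketch: stops at the first mismatch and builds no
    intermediate boolean array.
    """
    # Assumption: accept scalars and lists by converting them first.
    a1, a2 = np.asarray(a1), np.asarray(a2)
    # Assumption: arrays of different shape are never equal.
    if a1.shape != a2.shape:
        return False
    # zip() over the flat iterators replaces xrange + .next() from Python 2.
    for x, y in zip(a1.flat, a2.flat):
        if x != y:
            return False
    return True
```

The early return is what makes the "different arrays" case cheap, while the element-by-element Python loop is what makes the "same arrays" case expensive on CPython.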

Here are some results in CPython for the following bench:

from time import time
import numpy as N  # alias used throughout this post
n = 100000
m = 100
a = N.zeros(n)
b = N.ones(n)
t = time()
for i in range(m):
    N.array_equal(a, b)
print('classic numpy array_equal time elapsed (on different arrays): %0.5f' % (time()-t))


t = time()
for i in range(m):
    array_equal(a, b)
print('Alternative array_equal time elapsed (on different arrays): %0.5f' % (time()-t))

b = N.zeros(n)

t = time()
for i in range(m):
    N.array_equal(a, b)
print('classic numpy array_equal time elapsed (on same arrays): %0.5f' % (time()-t))

t = time()
for i in range(m):
    array_equal(a, b)
print('Alternative array_equal time elapsed (on same arrays): %0.5f' % (time()-t))

CPython numpy results:
classic numpy array_equal time elapsed (on different arrays): 0.07728
Alternative array_equal time elapsed (on different arrays): 0.00056
classic numpy array_equal time elapsed (on same arrays): 0.11163
Alternative array_equal time elapsed (on same arrays): 9.09458

PyPy results (I cannot test the "classic" version because it depends on some funcs that are not available yet):
Alternative array_equal time elapsed (on different arrays): 0.00133
Alternative array_equal time elapsed (on same arrays): 0.95038


So, as you see, even in CPython numpy my version is 138 times faster for different arrays (yet roughly 80 times slower for same arrays: 9.09458 vs. 0.11163). However, in the real world it is usually different arrays that come to this func, and equal arrays are encountered only sometimes. For my implementation, the time elapsed in the case of equal arrays depends essentially on their size, but either way I still think my implementation is better than CPython's: it's faster in the common case and doesn't require allocating memory for the boolean array that goes to logical_and.
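As a quick check, the two ratios can be recomputed from the CPython timings listed above. The variable names below are hypothetical; the numbers are copied from the results as posted.

```python
# Timings copied from the CPython numpy results above (seconds).
classic_different = 0.07728
alternative_different = 0.00056
classic_same = 0.11163
alternative_same = 9.09458

# How much faster the alternative is when the arrays differ.
speedup_different = classic_different / alternative_different
# How much slower the alternative is when the arrays are equal.
slowdown_same = alternative_same / classic_same

print('speedup on different arrays: %.0fx' % speedup_different)
print('slowdown on same arrays: %.0fx' % slowdown_same)
```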

I updated my array_equal implementation with the changes mentioned above, added the tests on multidimensional arrays you asked for, and put it at http://pastebin.com/tg2aHE6x (now I'll update the bugs.pypy.org entry with the link).

-----------------------
Regards, D.
http://openopt.org/Dmitrey
_______________________________________________
pypy-dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/pypy-dev
