Re: [pypy-dev] certificate for accepting numpypy new funcs?

Dmitrey Thu, 19 Jan 2012 11:50:00 -0800

On 01/19/2012 07:31 PM, Maciej Fijalkowski wrote:

On Thu, Jan 19, 2012 at 6:46 PM, Dmitrey<[email protected]>  wrote:

Hi all,
could you clarify the rules for accepting new numpypy funcs (not only for
me, but for any other possible volunteers)?
The doc I've been directed to says only "You have to test exhaustively your
module", while I would like to know more explicit rules.
For example, "at least 3 tests per func" (however, I guess the number of
tests should be expected to differ for funcs of different complexity and
variability).
Also, are there any strict rules for the testcases to be submitted, or can I,
for example, merely write


if __name__ == '__main__':
    assert array_equal(1, 1)
    assert array_equal([1, 2], [1, 2])
    assert array_equal(N.array([1, 2]), N.array([1, 2]))
    assert array_equal([1, 2], N.array([1, 2]))
    assert array_equal([1, 2], [1, 2, 3]) is False
    print('passed')

We have pretty exhaustive automated testing suites. Look for example
in pypy/module/micronumpy/test directory for the test file style.
They're run with py.test and we require at the very least full code
coverage (every line has to be executed, there are tools to check,
like coverage). Also passing "unusual" input, like sys.maxint  etc. is
usually recommended. With your example, you would check if it works
for say views and multidimensional arrays. Also "is False" is not
considered good style.
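
As a rough illustration of the test style described above, here is a minimal py.test-style sketch. It is an assumption-laden example: it runs against CPython's numpy.array_equal rather than the numpypy version, the test names are made up, and it does not reproduce the actual layout of the files in pypy/module/micronumpy/test.

```python
import sys

import numpy as np


def test_array_equal_basic():
    # Plain truth-value asserts instead of "is False" comparisons.
    assert np.array_equal([1, 2], np.array([1, 2]))
    assert not np.array_equal([1, 2], [1, 2, 3])


def test_array_equal_views_and_multidim():
    a = np.arange(12).reshape(3, 4)
    view = a[:, ::2]  # non-contiguous view: columns 0 and 2
    assert np.array_equal(view, a[:, [0, 2]])
    assert not np.array_equal(a, a.T)  # shapes (3, 4) and (4, 3) differ


def test_array_equal_extreme_values():
    # "Unusual" input; sys.maxsize is the Python 3 counterpart of sys.maxint.
    big = np.array([sys.maxsize, -sys.maxsize - 1], dtype=np.int64)
    assert np.array_equal(big, big.copy())
```

Running py.test on a file containing these functions collects them automatically; a coverage tool can then be used to check which lines of the function under test were executed.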

Or there is a certain rule for storing files with tests?

If I or someone else submits a func with some tests like in the example
above, will you put the func and tests in the proper files yourself? I'm
not too lazy to do it myself, but I'm simply not involved enough in the
numpypy dev process yet, including mercurial branches and the numpypy file
structure, and can spend only quite limited time diving into it in the near
future.

We generally require people to put their own tests as they go with the
code (in appropriate places) because you also should not break
anything. The usefulness of a patch that has to be sliced and diced
and put into places is very limited and for straightforward
mostly-copied code, like array_equal, plain useless, since it's almost
as much work to just do it.

Well, for this func (array_equal) my docstrings really were copied from
CPython numpy (why not do this to save some time, when the license
allows it?), but

* why wouldn't I go for this while other programmers are busy with other
  tasks?
* the engines of my func and the CPython numpy func differ completely.
  First, in PyPy the CPython code just doesn't work at all (because of the
  problem with ndarray.flat). Second, I have implemented a workaround: I
  just replaced some code lines with

    Size = a1.size
    f1, f2 = a1.flat, a2.flat
    # TODO: replace xrange by range in Python3
    for i in xrange(Size):
        if f1.next() != f2.next(): return False
    return True
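
For readers on Python 3, the same early-exit idea can be sketched as a complete function. This is only an illustration and not the code from the pastebin: the name array_equal_early_exit is hypothetical, and the asarray conversion and shape check are assumptions about what the surrounding lines of the real implementation do.

```python
import numpy as np


def array_equal_early_exit(a1, a2):
    """Return True if both inputs have the same shape and equal elements."""
    a1, a2 = np.asarray(a1), np.asarray(a2)
    if a1.shape != a2.shape:
        return False
    # .flat iterates element by element, also over non-contiguous views.
    for x, y in zip(a1.flat, a2.flat):
        if x != y:
            return False  # stop at the first mismatch
    return True
```

The loop returns as soon as one pair of elements differs and never builds a temporary boolean array, but for equal inputs it has to visit every element from Python code.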

Here are some results in CPython for the following bench:

from time import time
import numpy as N  # alias used throughout this message
n = 100000
m = 100
a = N.zeros(n)
b = N.ones(n)
t = time()
for i in range(m):
    N.array_equal(a, b)

print('classic numpy array_equal time elapsed (on different arrays): %0.5f' % (time()-t))



t = time()
for i in range(m):
    array_equal(a, b)

print('Alternative array_equal time elapsed (on different arrays): %0.5f' % (time()-t))


b = N.zeros(n)

t = time()
for i in range(m):
    N.array_equal(a, b)

print('classic numpy array_equal time elapsed (on same arrays): %0.5f' %(time()-t))


t = time()
for i in range(m):
    array_equal(a, b)

print('Alternative array_equal time elapsed (on same arrays): %0.5f' %(time()-t))


CPython numpy results:
classic numpy array_equal time elapsed (on different arrays): 0.07728
Alternative array_equal time elapsed (on different arrays): 0.00056
classic numpy array_equal time elapsed (on same arrays): 0.11163
Alternative array_equal time elapsed (on same arrays): 9.09458

PyPy results (the "classic" version cannot be tested because it depends on some funcs that are not available yet):

Alternative array_equal time elapsed (on different arrays): 0.00133
Alternative array_equal time elapsed (on same arrays): 0.95038

So, as you see, even in CPython numpy my version is 138 times faster for different arrays (yet about 80 times slower for equal arrays). However, in the real world it is usually different arrays that come to this func, and only sometimes are equal arrays encountered.

Well, for my implementation the time elapsed in the case of equal arrays depends essentially on their size, but either way I still think my implementation is better than CPython's: it's faster in the common case and doesn't require allocating memory for the boolean array that goes to logical_and.
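
One possible middle ground between the two behaviours measured above is to compare the arrays in fixed-size chunks. This is a different technique from the element-by-element loop, it is not something proposed in this thread, and the name array_equal_chunked and the default chunk size are made up for the sketch.

```python
import numpy as np


def array_equal_chunked(a1, a2, chunk=4096):
    """Compare in blocks: early exit without one large temporary array."""
    a1, a2 = np.asarray(a1), np.asarray(a2)
    if a1.shape != a2.shape:
        return False
    for start in range(0, a1.size, chunk):
        stop = min(start + chunk, a1.size)
        # Slicing .flat copies at most `chunk` elements per array.
        if not np.array_equal(a1.flat[start:stop], a2.flat[start:stop]):
            return False  # stop at the first differing block
    return True
```

Each block is compared by vectorized code, so equal arrays do not pay for a Python-level loop over every element, while differing arrays are still rejected at the first block that contains a mismatch. The temporary memory is bounded by the chunk size rather than by the array size.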

I updated my array_equal implementation with the changes mentioned above, added some tests on multidimensional arrays as you asked, and put it at http://pastebin.com/tg2aHE6x (now I'll update the bugs.pypy.org entry with the link).


-----------------------
Regards, D.
http://openopt.org/Dmitrey
_______________________________________________
pypy-dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/pypy-dev
