Hello Denis,
Indeed, OT is now faster when using ot.Sample(sample).
Regarding numba, the code has to use pure Python loops rather than vectorized
numpy operations for it to work efficiently.
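As a quick illustration of that point (a toy sketch, separate from the
benchmark below, not the discrepancy code itself): explicit loops under @njit
compile to machine code, while the equivalent vectorized numpy expression
gains nothing from numba.

import numpy as np
from numba import njit

@njit
def pairwise_l1(x):
    # explicit loops: numba compiles these in nopython mode
    n = x.shape[0]
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += abs(x[i] - x[j])
    return total

x = np.random.random_sample(200)
print(pairwise_l1(x))                 # first call pays the compilation cost
print(np.abs(x[:, None] - x).sum())   # same result, vectorized numpy

The full test case is below: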
import numpy as np
import timeit
import openturns as ot
from numba import jit, njit
def discrepancy(sample):
    # Plain numpy implementation of the centered L2 discrepancy (C2).
    n_sample = len(sample)
    dim = sample.shape[1]
    abs_ = abs(sample - 0.5)
    disc1 = np.sum(np.prod(1 + 0.5 * abs_ - 0.5 * abs_ ** 2, axis=1))
    prod_arr = 1
    for i in range(dim):
        s0 = sample[:, i]
        # broadcasting builds the (n_sample, n_sample) pairwise term
        prod_arr *= (1 +
                     0.5 * abs(s0[:, None] - 0.5) + 0.5 * abs(s0 - 0.5) -
                     0.5 * abs(s0[:, None] - s0))
    disc2 = prod_arr.sum()
    c2 = (13 / 12) ** dim - 2 / n_sample * disc1 + 1 / (n_sample ** 2) * disc2
    return np.sqrt(c2)

@jit
def discrepancy_numba(sample):
    # Same vectorized code, only decorated with @jit: numba cannot compile
    # the numpy broadcasting efficiently, so no speedup (see timings below).
    n_sample = len(sample)
    dim = sample.shape[1]
    abs_ = abs(sample - 0.5)
    disc1 = np.sum(np.prod(1 + 0.5 * abs_ - 0.5 * abs_ ** 2, axis=1))
    prod_arr = 1
    for i in range(dim):
        s0 = sample[:, i]
        prod_arr *= (1 +
                     0.5 * abs(s0[:, None] - 0.5) + 0.5 * abs(s0 - 0.5) -
                     0.5 * abs(s0[:, None] - s0))
    disc2 = prod_arr.sum()
    c2 = (13 / 12) ** dim - 2 / n_sample * disc1 + 1 / (n_sample ** 2) * disc2
    return np.sqrt(c2)

@njit
def discrepancy_faster_numba(sample):
    # Pure Python loops: exactly what numba compiles well in nopython mode.
    disc1 = 0
    n_sample = len(sample)
    dim = sample.shape[1]
    for i in range(n_sample):
        prod = 1
        for item in sample[i]:
            sub = abs(item - 0.5)
            prod *= 1 + 0.5 * sub - 0.5 * sub ** 2
        disc1 += prod
    disc2 = 0
    for i in range(n_sample):
        for j in range(n_sample):
            prod = 1
            for k in range(dim):
                a = 0.5 * abs(sample[i, k] - 0.5)
                b = 0.5 * abs(sample[j, k] - 0.5)
                c = 0.5 * abs(sample[i, k] - sample[j, k])
                prod *= 1 + a + b - c
            disc2 += prod
    c2 = (13 / 12) ** dim - 2 / n_sample * disc1 + 1 / (n_sample ** 2) * disc2
    return np.sqrt(c2)

sample = np.random.random_sample((500, 2))
ot_sample = ot.Sample(sample)
print(discrepancy(sample))
print(discrepancy_numba(sample))
print(discrepancy_faster_numba(sample))
print(ot.SpaceFillingC2().evaluate(sample))
print('Function time: ', timeit.repeat('discrepancy(sample)', number=500,
      repeat=4, setup="from __main__ import discrepancy, sample"))
print('numba time: ', timeit.repeat('discrepancy_numba(sample)', number=500,
      repeat=4, setup="from __main__ import discrepancy_numba, sample"))
print('Fast numba time: ', timeit.repeat('discrepancy_faster_numba(sample)',
      number=500, repeat=4,
      setup="from __main__ import discrepancy_faster_numba, sample"))
print('OT time: ', timeit.repeat('ot.SpaceFillingC2().evaluate(ot_sample)',
      number=500, repeat=4, setup="from __main__ import ot_sample, ot"))
WRN - The configuration file has not been found, using default
parameters. #### IF YOU HAPPEN TO KNOW HOW TO REMOVE THIS BY THE WAY
0.0181493670249
0.0181493670249
0.018149367024149737
0.018149367024149737
Function time: [4.525451728957705, 4.541200206964277, 4.4143504980020225,
4.56408092204947]
numba time: [4.3976798499934375, 4.876463262015022, 5.385470865992829,
5.138608552981168]
Fast numba time: [0.6634743280010298, 0.6538278009975329, 0.7077985780197196,
0.6579875709721819]
OT time: [0.7988348260405473, 0.7220299079781398, 0.7797102630138397,
0.7526425909600221]
[Finished in 53.8s]
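(About the WRN line in the output above: if I am not mistaken, the OpenTURNS
log can be turned off from the Python side, which should also hide that
warning; untested sketch, assuming the ot.Log API:)

import openturns as ot
# possibly silences all OpenTURNS log output, including the
# "configuration file has not been found" warning
ot.Log.Show(ot.Log.NONE)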
So numba is again faster here. Even with a larger sample (1000 points), numba
remains slightly faster.
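To make sure the fast version really ran in nopython mode rather than falling
back to object mode, the dispatcher can be inspected after a first call (a
quick check using numba's introspection; sketch only):

# @njit refuses object-mode fallback, so any recorded signature means the
# function was compiled in nopython mode
discrepancy_faster_numba(sample)
print(discrepancy_faster_numba.signatures)   # e.g. [(array(float64, 2d, C),)]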
Sincerely,
Pamphile ROY
PhD researcher in Uncertainty Quantification
CERFACS - Toulouse (31) - France
+33 (0) 5 61 19 31 57
+33 (0) 7 86 43 24 22
> On 14 June 2017 at 23:20, D. Barbier <[email protected]> wrote:
>
> Hello Pamphile,
>
> The problem is that your sample case is small, so the conversion from
> a numpy array into an OT Sample has a significant cost.
> If you rerun your benchmark on
> otsample = ot.Sample(sample)
> (or directly generate a random sample with OT), you will see that our
> version is much faster.
>
> BTW I was intrigued by your results with numba, but could not achieve
> the same speedup; my gain is almost negligible. Can you please show
> your test case with numba? Did you use a GPU?
>
> Regards,
> Denis
>
> 2017-06-14 10:32 GMT+02:00 roy <[email protected]>:
>> Hi,
>>
>> Thanks for the feedback, indeed that could explain the behaviours.
>>
>>
>> Pamphile ROY
>> PhD researcher in Uncertainty Quantification
>> CERFACS - Toulouse (31) - France
>> +33 (0) 5 61 19 31 57
>> +33 (0) 7 86 43 24 22
>>
>>
>>
>> On 14 June 2017 at 10:15, HADDAD Sofiane <[email protected]> wrote:
>>
>> Hi,
>>
>> It also depends on the sample size.
>>
>> With a sample size of 1000, I get this:
>>
>> 0.00975831343631
>> 0.009758313432154839
>> Function time: [19.408187157008797, 21.296883990988135, 19.92589810100617]
>> OT time: [4.125010760006262, 4.1429947539872956, 4.138353090995224]
>>
>> For small samples, we may spend more time generating small objects than
>> on the evaluation itself.
>>
>> Regards,
>> Sofiane
>>
>>
>> On Wednesday 14 June 2017 at 00:22, D. Barbier <[email protected]> wrote:
>>
>>
>> On 2017-06-13 12:01 GMT+02:00 roy wrote:
>>> Hi everyone,
>>>
>>> I was playing with the centered discrepancy and wrote my own function
>>> before I saw the class SpaceFillingC2.
>>> There is no issue, except that I get a 2x speedup with my Python version.
>>> There might be room for improvement, as I can even get a 10x speedup on
>>> my version using numba.
>> [...]
>>
>> Hello Pamphile,
>>
>> I will have a look, thanks a lot for your feedback.
>> Regards,
>>
>> Denis
>>
_______________________________________________
OpenTURNS users mailing list
[email protected]
http://openturns.org/mailman/listinfo/users