I tried the following code:

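    import numpy as np
    import theano

    floatX = theano.config.floatX  # e.g. 'float32', as set in the Theano config

    def test_speed():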
        print('Computing X and X2...', end='', flush=True)
        X_np = np.random.uniform(0, 100, size=(10000, 1000)).astype(floatX)
        X2_np = np.random.uniform(0, 100, size=(10000, 1000)).astype(floatX)
        print('done!', flush=True)

        print('Moving X and X2 to the GPU...', end='', flush=True)
        X = theano.shared(X_np)
        X2 = theano.shared(X2_np)
        print('done!', flush=True)

        print('Building the graph...', end='', flush=True)
        Y = X
        for _ in range(100):
            # Y = Y * (Y <= X2)
            Y = Y * (Y - X2)
        # NOTE: this result is not assigned, so f() still returns the full matrix Y
        Y.sum(axis=1)
        print('done!', flush=True)

        print('compiling...', end='', flush=True)
        f = theano.function([], Y)
        print('done!', flush=True)

        import time
        # perf_counter() is a wall-clock timer; time.clock() was removed in Python 3.8
        t = time.perf_counter()
        f()
        print(time.perf_counter() - t)

The loop contains two alternative lines, one using '<=' and one using '-'.
Only one of them is active in any given run. Here are the timings in seconds:

            CPU      GPU
    '-'     0.21     0.016
    '<='    0.39     0.019
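
To put the table in relative terms, here is a small sketch that computes how much slower the '<=' variant is than the '-' variant on each device. The numbers are copied from the table; the names `timings` and `slowdown` are only illustrative. It works out to roughly 1.9x on the CPU and 1.2x on the GPU.

```python
# Timings in seconds, copied from the table above.
timings = {
    "CPU": {"-": 0.21, "<=": 0.39},
    "GPU": {"-": 0.016, "<=": 0.019},
}


def slowdown(device):
    """Return how many times slower the '<=' variant is than the '-' variant."""
    t = timings[device]
    return t["<="] / t["-"]


for device in timings:
    print(f"{device}: '<=' takes {slowdown(device):.2f}x the time of '-'")
```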

On the GPU the comparison costs only about 3 ms more per call (0.019 s vs.
0.016 s), so I'd say I don't need to worry about using comparisons.
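
Each figure above comes from a single call, so it may include one-off overhead. A minimal sketch of a more robust measurement, assuming it is acceptable to call the function several times and keep the fastest run: the helper `best_time` and the stand-in `dummy_workload` are hypothetical names, not part of Theano, and in the benchmark the compiled function `f` would be passed instead of the stand-in.

```python
import time


def best_time(fn, repeats=5):
    """Call fn() `repeats` times and return the shortest wall-clock time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


# Stand-in workload so the sketch runs on its own; in the benchmark,
# the compiled Theano function f would be passed instead.
def dummy_workload():
    return sum(i * i for i in range(10000))


print(best_time(dummy_workload))
```

Taking the minimum rather than the mean is one common choice, since slower runs are usually caused by unrelated system noise.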

On Monday, February 6, 2017 at 1:20:13 PM UTC+1, Kiuhnm Mnhuik wrote:
>
> I'm using Theano 0.9.0b1 with the new back-end.
> Should I use float32 for everything (even for bool masks) for maximum 
> speed on GPU (GTX 970)?
>
