Without SIMD, I don't see much use for 32-bit integers on 64-bit 
architectures. However, if Julia adds support for autovectorization using 
SIMD instructions, twice as many 32-bit integers can be packed into the 
same SSE/AVX register and operated on at the same time.
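
To make the lane arithmetic concrete, here is a minimal sketch. `lanes` is a hypothetical helper written for illustration, not a Julia API, and the 128- and 256-bit widths are assumed as the usual SSE and AVX register sizes.

```julia
# Number of elements of type T that fit in one SIMD register of the given width.
# `lanes` is a hypothetical helper for illustration, not part of Julia.
lanes(register_bits::Integer, T::Type) = register_bits ÷ (8 * sizeof(T))

# Assumed register widths: 128 bits (SSE) and 256 bits (AVX).
for (name, bits) in (("128-bit", 128), ("256-bit", 256))
    println(name, " register: ", lanes(bits, Int32), " x Int32 vs ",
            lanes(bits, Int64), " x Int64")
end
```

Whether the compiler actually emits vector instructions for a given loop is a separate question; this only shows the packing ratio.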

On Tuesday, January 14, 2014 7:44:32 PM UTC-6, Stefan Karpinski wrote:
>
> It's primarily because you're using 64-bit integers in Julia and 32-bit 
> integers in all the other cases. If you change the C code to use `long` 
> instead of `int`, the timings are about the same. You're also just summing 
> the result in C but accumulating it in Python and Julia and summing later, 
> which is less efficient – although it ends up not making much difference 
> here.
>
> It's difficult to use 32-bit integer arithmetic in Julia on a 64-bit 
> machine, but you usually don't want to. It's sometimes a performance hit 
> like it is here, but that's pretty rare, and 2 billion really just isn't 
> big enough for a lot of pretty mundane things you might use integers for 
> (Bill Gates can't represent his net worth with a 32-bit int!). 64-bit 
> integers on the other hand are big enough for any mundane counting task – 
> you don't hear people talking about quintillions of things, ever. Still, it 
> would be nice to be able to compile and/or run Julia in 32-bit mode on a 
> 64-bit machine and vice versa.
>
> It's clearly not appropriate for benchmarking, but this Julia one-liner 
> computes the same thing much more efficiently:
>
> julia> test2(n=20000) = sum(primes(ifloor(n*log(n*log(n))))[1:n])
> test2 (generic function with 2 methods)
>
> julia> @time test2()
> elapsed time: 0.004566614 seconds (363112 bytes allocated)
> 2137755325
>
>
> The primes function 
> <https://github.com/JuliaLang/julia/blob/master/base/primes.jl> uses a 
> BitArray and the Sieve of Atkin to find all primes up to the given 
> number. Since ifloor(n*log(n*log(n))) is an upper bound on the nth prime 
> (for n ≥ 6), this always produces at least n primes, and the one-liner 
> returns the sum of the first n.
>
>
>
> On Tue, Jan 14, 2014 at 5:05 PM, Eric Davies wrote:
>
>> Running test2() once before running @time test2() (to force compilation) 
>> results in a 13% performance improvement on my system.
>>
>>
>> On Tuesday, 14 January 2014 15:32:16 UTC-6, Przemyslaw Szufel wrote:
>>>
>>> Dear Julia users,
>>>
>>> I am considering using Julia for computational projects. 
>>> As a first step to get a feel for the new language, I tried to benchmark 
>>> Julia's speed against other popular languages.
>>> I used example code from the Cython tutorial: 
>>> http://docs.cython.org/src/tutorial/cython_tutorial.html (the code for 
>>> finding the first n prime numbers). 
>>>
>>> Rewriting the code in different languages and measuring the times on my 
>>> Windows laptop gave me the following results:
>>>
>>> Language: time in seconds (lower is better)
>>>
>>> Python: 65.5
>>> Cython (with MinGW): 0.82
>>> Java: 0.64
>>> Java (with -server option): 0.64
>>> C (with MinGW): 0.64
>>> Julia (0.2): 2.1
>>> Julia (0.3 nightly build): 2.1
>>>
>>> All the code for my experiments is attached to this post (Cython and 
>>> Python are both run starting from the prim.py file).
>>>
>>> The thing that worries me is that Julia takes much, much longer than 
>>> Cython...
>>> I am a beginner in Julia and would like to kindly ask what I am doing 
>>> wrong with my code. 
>>> I start the Julia console and use the command include("prime.jl") to 
>>> execute it.
>>>
>>> This code looks very simple, and I think the compiler should be able to 
>>> optimise it to at least the speed of Cython?
>>> Maybe my code was written in a non-Julian style and the compiler 
>>> has problems with it?
>>>
>>> I will be grateful for any answers or comments.
>>>
>>> Best regards,
>>> Przemyslaw Szufel
>>>
>>
>
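
The upper-bound trick in the quoted one-liner can be sketched without relying on the `primes` function from Base. This is only an illustration under stated assumptions: it swaps in a plain Sieve of Eratosthenes rather than the Sieve of Atkin mentioned above, uses `floor(Int, x)` in place of `ifloor`, and the names `sieve_primes` and `sum_first_primes` are hypothetical.

```julia
# Sieve of Eratosthenes over a BitArray (a simpler stand-in for the Sieve of
# Atkin mentioned in the quoted message). `sieve_primes` is a hypothetical name.
function sieve_primes(limit::Integer)
    limit < 2 && return Int[]
    is_prime = trues(limit)          # one bit per candidate 1..limit
    is_prime[1] = false
    for i in 2:isqrt(limit)
        if is_prime[i]
            for j in i*i:i:limit     # cross out multiples of i, starting at i^2
                is_prime[j] = false
            end
        end
    end
    return findall(is_prime)         # indices still marked true are the primes
end

# Sum of the first n primes. The bound n*log(n*log(n)) on the nth prime is
# only assumed to hold for n >= 6, so smaller n is rejected.
function sum_first_primes(n::Integer)
    n >= 6 || throw(ArgumentError("the upper bound used here requires n >= 6"))
    limit = floor(Int, n * log(n * log(n)))
    ps = sieve_primes(limit)         # contains at least n primes
    return sum(ps[1:n])
end
```

Calling `sum_first_primes(20000)` gives a value that can be compared against the 2137755325 reported in the quoted message. As with `test2`, timing it is not a fair benchmark against the original trial-division code.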
