Re: [julia-users] Necessary to rebuild Julia after modifying Base?
There is one trick you might try when you're modifying methods in Base: you can define a new method on the same function. For example, you can paste the modified method into your terminal:

    function Base.sin(x::Float64)
        x == 0 && error()
        sqrt(1 - cos(x)^2)
    end

Note that Julia might have inlined the old definition, so in some cases you'll need to restart Julia, or rebuild the system image, for the new definition to be picked up.
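The inlining caveat has a loose analogy in other dynamic languages: a caller that has already captured (or compiled in) the old definition keeps using it even after the name is rebound. A small Python sketch of that idea (the function names here are made up for illustration, not from the thread):

```python
import math

def noisy_sin(x):
    # original definition: |sin(x)| computed the long way
    return math.sqrt(1 - math.cos(x) ** 2)

captured = noisy_sin      # plays the role of an inlined/compiled-in reference

def noisy_sin(x):         # "redefine the method"
    return 0.0

print(noisy_sin(1.0))     # fresh lookups see the new definition: 0.0
print(captured(1.0))      # the captured reference still runs the old code
```

In Julia the fix is, as above, to restart or rebuild the system image so that callers pick up the new method.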
Re: [julia-users] Matlab bench in Julia
Wow, LU is now a little faster on the latest Julia Fedora package than on my locally compiled Julia:

    julia> versioninfo()
    Julia Version 0.3.0
    Platform Info:
      System: Linux (x86_64-redhat-linux)
      CPU: Intel(R) Core(TM) i7-4700MQ CPU @ 2.40GHz
      WORD_SIZE: 64
      BLAS: libopenblas (DYNAMIC_ARCH NO_AFFINITY Haswell)
      LAPACK: libopenblasp.so.0
      LIBM: libopenlibm
      LLVM: libLLVM-3.3

    julia> include("code/julia/bench.jl")
    LU decomposition, elapsed time: 0.07222901 seconds  (was 0.123 seconds with my julia)
    FFT, elapsed time: 0.248571629 seconds

Thanks for making and improving the Fedora package!
[julia-users] Re: How to convert a cell type to array type?
Thank you for the help. I have checked the docs; there is no direct substitute for cell2mat :(

On Sunday, September 21, 2014 1:01:22 PM UTC+8, Don MacMillen wrote:
I haven't seen it, but that doesn't mean it's not there. Maybe time to hit readthedocs?

On Saturday, September 20, 2014 9:48:02 PM UTC-7, Staro Pickle wrote:
There is no automatic way, like the cell2mat command in Matlab?

On Sunday, September 21, 2014 12:42:44 PM UTC+8, Don MacMillen wrote:
The short answer is hvcat((2,2), A...), but make certain it is doing the concatenation in the order you really want; otherwise call it out explicitly, e.g. hvcat((2,2), A[2,1], A[1,1], A[2,2], A[1,2]).

On Saturday, September 20, 2014 8:34:32 PM UTC-7, Staro Pickle wrote:
I define a matrix using cell, like:

    A = cell(2,2)
    b = ones(2,2)
    A[1,1] = b
    A[1,2] = b
    A[2,1] = b
    A[2,2] = b

Then I want to make A a 4x4 2-D array:

    1.0 1.0 1.0 1.0
    1.0 1.0 1.0 1.0
    1.0 1.0 1.0 1.0
    1.0 1.0 1.0 1.0

How do I do this?
[julia-users] Re: How to convert a cell type to array type?
Perhaps this?

    cell2mat(A) = hvcat(size(A), reshape(A, *(size(A)...))...)
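For readers coming from other languages, the same block concatenation can be sketched in plain Python (the helper name is made up for illustration; with NumPy, `numpy.block` does this directly):

```python
# Plain-Python sketch of cell2mat-style block concatenation:
# flatten a 2x2 grid of 2x2 blocks into a single 4x4 matrix.
def cell2mat(cells):
    rows = []
    for block_row in cells:                 # each row of blocks
        for i in range(len(block_row[0])):  # each scalar row inside a block
            rows.append([x for block in block_row for x in block[i]])
    return rows

b = [[1.0, 1.0], [1.0, 1.0]]
A = [[b, b], [b, b]]        # plays the role of the 2x2 cell array
M = cell2mat(A)             # a 4x4 matrix of ones
```

As in the hvcat discussion above, the one subtlety is the order in which blocks are visited: here rows of blocks are concatenated horizontally, then stacked vertically.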
Re: [julia-users] De-serialization and thread-safety?
Hi Erik,

First, one comment: tasks are not true (kernel) threads. Currently a Julia process is single-threaded; tasks are better thought of as a form of cooperative multitasking.

Yes, I've also found that I/O causes task switching, and I don't personally know a great way around it. One option would presumably be to have some form of message queue; I am pretty sure that push!-ing a new message onto it---as long as you don't need to touch I/O to create the message---would not cause a switch. You can also use time() and other markers to indicate the status of control flow.

I haven't been reading things carefully enough to know whether there's any history behind this, but if you haven't said so already... what does gdb (or equivalent) say about the segfault?

--Tim

On Saturday, September 20, 2014 08:24:59 PM Erik Schnetter wrote:
I am trying to track down a segfault in a Julia application. Currently I am zooming in on deserialize: avoiding calling it seems to reliably cure the problem, while calling it (even if not using the result) seems to reliably trigger the segfault. I am using many threads (tasks), and deserialize is called concurrently. Is this safe?

I've been bitten by this in the past; e.g. I've accidentally added an info statement into a sequence of statements that needs to be atomic, and I/O apparently switches threads. Is there a list of known-to-be-safe or known-to-be-unsafe functions? Is deserialization thread-safe in this respect? In particular, I am deserializing function calls and lambda expressions, and I see global variables (lambda_numbers, known_lambda_data). Are the respective data structures (WeakKeyDict and Dict) thread-safe?

Is there a locking mechanism in Julia? This would temporarily allow only a single thread (task) to run, aborting with an error if this thread becomes unrunnable. In other words, calling yield while holding a lock would be a no-op.

-erik
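The hazard under discussion -- a yield point inside a critical region -- can be sketched with Python's cooperative asyncio tasks (an analogy only, not Julia's task implementation):

```python
import asyncio

# An `await` inside a critical region is a yield point, so another task
# can interleave and break atomicity -- much like an I/O call (e.g. an
# info() print) in the middle of a Julia task's critical section.
counter = 0

async def unsafe_increment():
    global counter
    tmp = counter
    await asyncio.sleep(0)   # yield point, like I/O mid-region
    counter = tmp + 1        # both tasks saw the same tmp: one update is lost

async def main():
    await asyncio.gather(unsafe_increment(), unsafe_increment())

asyncio.run(main())
print(counter)   # 1, not 2 -- the region was not atomic
```

The message-queue idea above works precisely because appending to the queue involves no such yield point.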
[julia-users] delete type definition
Hi, suppose I defined

    type Point
        a::Real
        b::Real
    end

but later realized that I would rather have

    abstract GeoObject

    type Point <: GeoObject
        a::Real
        b::Real
    end

Of course Julia gives me an ERROR: invalid redefinition of constant Point. Is working in a module and then reloading it the only workaround?

Best,
Tamas
Re: [julia-users] De-serialization and thread-safety?
I'm aware that Julia's threads are green threads. The issue of thread safety still remains: if one task is suspended in a critical region, another can enter that region. Storing handles in global data structures and incrementing global variables are such actions, and I'm not 100% sure that the respective regions in serialize.jl are yield-free, even without my info output. I was surprised to see that I/O causes task switches -- maybe something else (hashing? dictionaries? creating new lambdas in C?) also causes task switches?

gdb points to memory allocation routines in libc, called from gc.c or array.c. I assume that something overwrites memory, destroying libc malloc's data structures and leading to a crash later.

-erik

On Sun, Sep 21, 2014 at 5:26 AM, Tim Holy tim.h...@gmail.com wrote: [...]

--
Erik Schnetter schnet...@cct.lsu.edu
http://www.perimeterinstitute.ca/personal/eschnetter/
Re: [julia-users] De-serialization and thread-safety?
If you have/find a clean example, posting an issue will certainly make sense. I can't comment on whether the task switch during I/O is inevitable.

--Tim

On Sunday, September 21, 2014 10:25:11 AM Erik Schnetter wrote: [...]
Re: [julia-users] De-serialization and thread-safety?
Unfortunately I don't have a simple example that reproduces the problem. So far, I've managed to whittle it down to an application running in a single process without dependencies on external packages.

-erik

On Sun, Sep 21, 2014 at 1:04 PM, Tim Holy tim.h...@gmail.com wrote: [...]

--
Erik Schnetter schnet...@cct.lsu.edu
http://www.perimeterinstitute.ca/personal/eschnetter/
Re: [julia-users] De-serialization and thread-safety?
I saw a couple of posts back that you are using MPI? Any chance that MPI is issuing a callback on a different thread? This could be an issue with C interop, and it can sometimes be solved by following the steps in the thread-safety section of the manual: http://docs.julialang.org/en/release-0.3/manual/calling-c-and-fortran-code/#thread-safety

On Sunday, September 21, 2014 1:44:23 PM UTC-4, Erik Schnetter wrote: [...]
[julia-users] Re: some Python / Julia comparisons
I got curious, and ended up implementing this myself: https://gist.github.com/jwmerrill/ff422bf00593e006c1a4

On my laptop, your Viterbi benchmark runs in 6.9s, and this new implementation runs in 0.5s, so it's something like 14x faster.

If you wanted to push on performance any more, I'd recommend moving to a longer set of observations, and checking how execution time scales with:

1. The number of observations
2. The size of the state space
3. The size of the observation space

You can end up optimizing the wrong things if you only benchmark a toy example that executes in half a microsecond.

If you do move to a longer set of observations, you might want to switch to storing and manipulating log probabilities instead of probabilities, to avoid underflow. It looks like this algorithm only ever multiplies probabilities (never adds them), which is convenient: in log space, multiplication translates to addition, whereas addition translates to something that requires taking exponents and logs.

A couple of additional comments:

1. Note the lack of type annotations on the viterbi function. They aren't necessary for performance, as long as the input is well typed. Sometimes they can be useful for documentation or error checking, though.

2. I made the problem data const. It occurs in global scope, so this is necessary to make sure that it is precisely typed.

3. Most of the time is now spent allocating V, maxindices, and the output of path, and also inside first_index. I checked this by running

       @profile benchmark_example(10)
       Profile.print()

   Only the most recent column of V is ever used, so you could probably speed things up by storing only one column of V and mutating it. first_index could probably be further optimized by dispatching on the length of the string or its first char, or something like that, but I decided not to mess with that.
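The underflow point is easy to demonstrate in a few lines (a Python sketch, not taken from the gist): multiplying many small probabilities collapses to 0.0, while summing their logs stays well-behaved.

```python
import math

p = 1e-20
probs = [p] * 20

direct = 1.0
for q in probs:
    direct *= q            # eventually drops below the smallest float: 0.0

log_total = sum(math.log(q) for q in probs)   # fine: about -921.03

print(direct)      # 0.0 -- underflow
print(log_total)
```

Since Viterbi only multiplies probabilities, switching to log space costs nothing: every multiply just becomes an add.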
On Saturday, September 20, 2014 11:43:11 AM UTC-7, Jason Merrill wrote:

One thing that's really nice about Julia is that it's often straightforward to transliterate Python code (or Matlab code) in a fairly literal way and end up with working code that has similar, or sometimes better, performance. But another nice thing about Julia is that it lets you move fairly smoothly from Python- or Matlab-like code to C-like code with much better performance, in the places where that matters. There's usually some work to do to achieve this, though. I.e. you can't generally expect that inefficiently vectorized Matlab-style code, or Python-style code that uses tons of Dict lookups, will perform orders of magnitude better in Julia than it did in the original language.

To put it differently, "Julia makes code run fast" is too simplistic, but "Julia lets me write expressive high-level code", "Julia lets me write performant low-level code", and "Julia lets me smoothly move back and forth between high-level code and low-level code" are all true in my experience.

Okay, so finally the reason for this sermon: if you want to, I think you could wring a lot more performance out of that Viterbi code. I haven't actually tried this yet, but I think you could replace a lot of your data structures with arrays and vectors of floats and ints, and then you'll probably see performance improvements like 2x or 10x or 100x instead of 1.26x.

If you treat your states, ["Healthy", "Fever"], and your observations, ["normal", "cold", "dizzy"], like enums, i.e. map those strings onto integers, then your start probability can become a vector of 2 floats, your transition probability a 2x2 float array, and your emission probability a 2x3 float array.

The big improvement will come from treating paths as an int array. Right now it's a Dict{K,Any}(), and you allocate a new dict on each iteration, which is probably really hurting you. You could translate it to an Nx2 array, where N is the number of observations and 2 is the number of states, and allocate it all at the beginning. At each step n, you then fill in the nth row with integers that tell you which column of the previous row to look in to keep following the path backwards.

If you try any of this, I'd love to know how it goes. If you're after performance here, I suspect you'll be impressed by the results.

Best,
Jason

On Saturday, September 20, 2014 9:41:02 AM UTC-7, Jason Trenouth wrote:

Hi, I converted some Python programs to Julia recently. The (probably incorrect) ramblings are here:

http://a-coda.tumblr.com/post/93907978846/julia-dream
http://a-coda.tumblr.com/post/97973293291/a-recurring-dream

tl;dr - similar in style, but Julia can be a lot faster

__Jason
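The array-based reorganization described above -- integer-coded states and observations, flat probability arrays, and an N x S table of integer backpointers instead of per-step dicts -- can be sketched as follows (a Python sketch using the Wikipedia Healthy/Fever numbers mentioned in the thread, not the gist's Julia code):

```python
# States: 0 = Healthy, 1 = Fever.  Observations: 0 = normal, 1 = cold, 2 = dizzy.
start = [0.6, 0.4]                          # P(state at t=0)
trans = [[0.7, 0.3], [0.4, 0.6]]            # trans[i][j] = P(next=j | cur=i)
emit  = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]  # emit[i][k] = P(obs=k | state=i)

def viterbi(obs):
    S = len(start)
    V = [start[s] * emit[s][obs[0]] for s in range(S)]
    back = [[0] * S for _ in obs]           # back[n][s]: best predecessor of s
    for n in range(1, len(obs)):
        newV = [0.0] * S
        for s in range(S):
            # pick the predecessor that maximizes the path probability
            p, prev = max((V[r] * trans[r][s], r) for r in range(S))
            newV[s] = p * emit[s][obs[n]]
            back[n][s] = prev
        V = newV
    # follow backpointers from the best final state
    best = max(range(S), key=lambda s: V[s])
    path = [best]
    for n in range(len(obs) - 1, 0, -1):
        path.append(back[n][path[-1]])
    path.reverse()
    return path, max(V)

path, p = viterbi([0, 1, 2])   # normal, cold, dizzy
print(path, p)                 # [0, 0, 1] 0.01512... (Healthy, Healthy, Fever)
```

Only the most recent column of V is kept, matching the single-column suggestion above; the backpointer table is allocated once up front rather than a fresh dict per step.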
Re: [julia-users] De-serialization and thread-safety?
Yes, I thought the same. I thus removed the dependency on MPI; I'm now serializing and deserializing directly, without using MPI. My current code is at https://bitbucket.org/eschnett/funhpc.jl/branch/memdebug, and running julia Wave.jl triggers the problem reliably in a few seconds. The deserialization call is in Comm.jl in the function recv_item. -erik On Sun, Sep 21, 2014 at 2:00 PM, Jake Bolewski jakebolew...@gmail.com wrote: I saw a couple of posts back that you are using MPI? Any chance that MPI is issuing a callback on a different thread? This could be an issue with c-interop and can be sometimes solved by following the steps in the thread safety section of the manual. On Sunday, September 21, 2014 1:44:23 PM UTC-4, Erik Schnetter wrote: Unfortunately I don't have a simple example that reproduces the problem. So far, I've managed to whittle it down to an application running in a single process without dependencies on external packages. -erik On Sun, Sep 21, 2014 at 1:04 PM, Tim Holy tim@gmail.com wrote: If you have/find a clean example, certainly posting an issue will make sense. I can't comment on whether the task switch during I/O is inevitable. --Tim On Sunday, September 21, 2014 10:25:11 AM Erik Schnetter wrote: I'm aware that Julia's threads are green threads. The issue of thread safety still remains; if one thread is suspended in a critical region, another can enter that region. Storing handles in global data structures and incrementing global variables are such actions, and I'm not 100% sure that the respective region in serialize.jl are yield-free, even without my info output. I was surprised to see that I/O causes task switches -- maybe something else (hashing? dictionaries? creating new lambdas in C?) also causes task switches? gdb points to memory allocation routines in libc, called from gc.c or array.c. I assume that something overwrites memory, destroying libc malloc's data structures, leading to a crash later. 
-erik On Sun, Sep 21, 2014 at 5:26 AM, Tim Holy tim@gmail.com wrote: Hi Erik, First, one comment: tasks are not true (kernel) threads. Currently a julia process is single-threaded. Tasks are better considered as a form of cooperative multitasking. Yes, I've also found that I/O causes task switching. I don't personally know a great way around this. One option would presumably be to have some form of message queue; I am pretty sure that push!ing a new message on it---as long as you don't need to touch I/O to create the message---would not cause a switch. You can also use time() and other markers to indicate the status of control flow. I haven't been reading things carefully enough to know whether there's any history behind this, but if you haven't said so already...what does gdb (or equivalent) say about the segfault? --Tim On Saturday, September 20, 2014 08:24:59 PM Erik Schnetter wrote: I am trying to track down a segfault in a Julia application. Currently I am zooming in on deserialize, as avoiding calling it seems to reliably cure the problem, while calling it (even if not using the result) seems to reliably trigger the segfault. I am using many threads (tasks), and deserialize is called concurrently. Is this safe? I've been bitten in the past by this; e.g. I've accidentally added an info statement into a sequence of statements that needs to be atomic, and I/O apparently switches threads. Is there a list of known-to-be-safe or known-to-be-unsafe functions? Is deserialization thread-safe in this respect? I am in particular deserializing function calls and lambda expressions, and I see global variables (lambda_numbers, known_lambda_data). Are the respective data structures (WeakKeyDict and Dict) thread-safe? Is there a locking mechanism in Julia? This would temporarily only allow a single thread (task) to run, aborting with an error if this thread becomes unrunnable. In other words, calling yield when holding a lock would be a no-op. 
-erik -- Erik Schnetter schn...@cct.lsu.edu http://www.perimeterinstitute.ca/personal/eschnetter/ -- Erik Schnetter schnet...@cct.lsu.edu http://www.perimeterinstitute.ca/personal/eschnetter/
[julia-users] Re: some Python / Julia comparisons
On Sunday, 21 September 2014 19:23:20 UTC+1, Jason Merrill wrote:
I got curious, and ended up implementing this myself:

Hi Jason,

Thanks for this and your previous comment. I might play around with this some more myself, e.g. back-translating to Python to compare again. Note that the original Python code was just taken from the Wikipedia page on Viterbi and was probably only meant for educational purposes. I was trying to see how little I could change the code to speed things up while showing off some aspects of Julia.

__Jason
[julia-users] Re: some Python / Julia comparisons
On Saturday, 20 September 2014 18:45:31 UTC+1, stone...@gmail.com wrote:
Hi Jason, would it be possible for you to create a Julia program to compare with the famous Jake Vanderplas post? http://jakevdp.github.io/blog/2013/06/15/numba-vs-cython-take-2/ For which types of problem does Julia fly much higher or more easily than cython/pypy/numba? ("much" = x3 in my mind)

Hi, as a Julia newbie I'm not sure I'm the right person to ask to do such a bake-off, but there may be others who'll bite.

__Jason
[julia-users] Re: some Python / Julia comparisons
On Sunday, September 21, 2014 12:21:37 PM UTC-7, Jason Trenouth wrote:
I was trying to see how little I could change the code to speed things up while showing off some aspects of Julia.

Yeah, totally, and I think that's valuable. It's nice to know that you (yes, you!) can probably port your Python code to Julia and get it working today. But I'm arguing that you're missing a lot of the benefit (and a lot of the fun!) if you don't then massage your code to take advantage of Julia's strengths.

If I read your post without knowing anything about Julia, I think I might conclude: "oh, great, Julia is another language that's good for writing recursive factorial, but the performance improvements aren't so impressive for real algorithms like Viterbi. I wouldn't bother porting my code to a whole different language for a 26% performance improvement, unless maybe I was trying to get a few more fps in a game or something like that."
Re: [julia-users] delete type definition
"Is working in a module and then reloading that the only workaround?"

There is also `workspace()` to clear everything.

On Sun, Sep 21, 2014 at 9:37 AM, Tamas Papp tkp...@gmail.com wrote: [...]
[julia-users] Re: Current state of 3rd party APIs in Julia?
Please?

On Friday, September 19, 2014 11:37:21 AM UTC-6, Ed wrote:
Hello, I was wondering if anybody knows the current state of the connectors between Julia and Postgres, Amazon S3, Google Storage, and Google BigQuery. Do they just work, or are they buggy, or completely unusable/not built yet?
[julia-users] Re: some Python / Julia comparisons
On Saturday, September 20, 2014 10:45:31 AM UTC-7, stone...@gmail.com wrote: [...]

You should just try it. `pairwise_python` from that post can be translated to Julia quite literally. I didn't succeed in installing numba in my first 5 minutes of trying, so I can't really report on comparative performance, but I can tell you that the Julia version's execution time is measured in milliseconds, and the pure-Python version's execution time is measured in seconds.

BTW, I think in Julia it might be better to represent the points as a 3x1000 array instead of a 1000x3 array, since you want the point coordinates to be stored next to each other in memory for this algorithm. See also pairwise from the Distances.jl package.
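For reference, the kind of triple loop being translated looks like this (a pure-Python sketch in the spirit of the `pairwise_python` function from the linked post; the exact code there may differ):

```python
import math

def pairwise(X):
    # naive all-pairs Euclidean distances over a list of points
    n, d = len(X), len(X[0])
    D = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            s = 0.0
            for k in range(d):
                diff = X[i][k] - X[j][k]
                s += diff * diff
            D[i][j] = math.sqrt(s)
    return D

X = [[0.0, 0.0], [3.0, 4.0]]
D = pairwise(X)     # D[0][1] == 5.0
```

This loop nest is exactly the shape that transliterates to Julia line for line, and it is also where the memory-layout point matters: the innermost loop walks the coordinates of one point, so those coordinates should be contiguous in memory.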
[julia-users] Re: some Python / Julia comparisons
Python with numba and Julia are comparable in speed for the tests I did. This is not surprising, as numba utilizes LLVM as well. For the above example, on my Unix computer, the timings are 15 ms for Python+numba and 20 ms for Julia.

If one gets the data as a 1000x3 matrix, should one really first transpose the matrix and then apply a distance function to the 3x1000 version? I don't think so.

On Sunday, September 21, 2014 10:52:43 PM UTC+2, Jason Merrill wrote: [...]
Re: [julia-users] Re: some Python / Julia comparisons
On Sunday, September 21, 2014 03:02:38 PM Hans W Borchers wrote:
If one gets the data as 1000x3 matrix, should one really first transpose the matrix and then apply a distance function on 3x1000? I don't think so.

It depends on the algorithm that follows. 1000 points is not very much; the whole thing fits into L1 cache, so for 1000 points there's no incentive. But change that to 10^5 points, so the total computational cost is 10^10, and you could get a 5-10 fold speedup even with the cost of that transposition. When it comes to performance, cache is a huge consideration, and an O(N) reordering is totally worth it for an O(N^2) algorithm.

Of course, if you're using the Euclidean distance, you can do it with BLAS matrix multiplication (with some loss in precision), and BLAS already worries about cache for you. So in that particular case, with that particular algorithm, it wouldn't be necessary. But for basically any other metric (or if you are worried about that loss of precision and want to use the higher-precision naive algorithm), you'll want the points arranged along columns.

--Tim
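The matrix-multiplication trick mentioned above rests on the identity ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x.y, so all pairwise squared distances come from one Gram matrix. A small pure-Python sketch (in practice the Gram matrix would be a single BLAS gemm, which is where the cache handling comes for free):

```python
import math

X = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
n = len(X)

norms = [sum(v * v for v in row) for row in X]              # ||x_i||^2
gram = [[sum(a * b for a, b in zip(X[i], X[j]))             # x_i . x_j
         for j in range(n)] for i in range(n)]

# max(..., 0.0) guards against tiny negative values from rounding --
# this is the "loss in precision" caveat in the message above.
D = [[math.sqrt(max(norms[i] + norms[j] - 2.0 * gram[i][j], 0.0))
      for j in range(n)] for i in range(n)]
```

Cancellation in norms[i] + norms[j] - 2*gram[i][j] is exactly why this formulation is less precise than the naive subtract-square-sum loop for nearly coincident points.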
[julia-users] Re: Current state of 3rd party APIs in Julia?
Julia and Postgres work fine together using the ODBC.jl package; I know there is also a Postgres-specific package in the works. I've used the AWS.jl package to download from S3, but haven't used it enough to say it "just works". I'm not aware of any Google API packages.

On Sunday, September 21, 2014 4:25:15 PM UTC-4, Ed wrote: [...]