[julia-users] Re: obtaining pointer to function
Hi Steven,

thank you very much for the clarification. It was indeed possible to pass the length of the vector, and with pointer_to_array it works smoothly.

Tomas

On Tuesday, 2 August 2016 13:46:47 UTC+2, Steven G. Johnson wrote:
>
> On Tuesday, August 2, 2016 at 7:39:16 AM UTC-4, pev...@gmail.com wrote:
>>
>> Hi All,
>> I am trying to bind a fortran library for optimization
>> (http://napsu.karmitsa.fi/lmbm/) to Julia.
>> To do so, I would like to get a pointer to a function written in Julia,
>> which I try to do as
>>
>> function fOpt(x::Array{Float64,1},g::Array{Float64,1})
>>     fill!(g,1.0);
>>     convert(Cdouble,sum(x))::Cdouble
>> end
>> const fOptPtr = cfunction(fOpt, Cdouble, (Ptr{Cdouble}, Ptr{Cdouble}))
>>
> An Array{T} is not the same thing as a Ptr{T}.
>
> You need to declare fOpt to take Ptr{Cdouble} arguments. You can use the
> pointer_to_array function to convert pointers to Arrays. (However, you
> need to know the length of the array; I'm surprised that the length is not
> passed as an argument — are you supposed to use a global?)
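[Editor's note] For other readers of the thread, a minimal sketch of the fix Steven describes, with the vector length passed as an extra argument as Tomas reports doing. Fortran typically passes every argument by reference, so the length is assumed to arrive as a pointer here; the exact signature LMBM expects is a guess, not confirmed by the thread.

    # Assumed Fortran-style callback: all arguments by reference (0.4-era API)
    function fOpt(np::Ptr{Cint}, xp::Ptr{Cdouble}, gp::Ptr{Cdouble})
        n = Int(unsafe_load(np))        # read the length passed by reference
        x = pointer_to_array(xp, n)     # wrap the C memory as a Julia Array, no copy;
        g = pointer_to_array(gp, n)     # valid only for the duration of this call
        fill!(g, 1.0)
        return convert(Cdouble, sum(x))::Cdouble
    end

    const fOptPtr = cfunction(fOpt, Cdouble, (Ptr{Cint}, Ptr{Cdouble}, Ptr{Cdouble}))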
[julia-users] obtaining pointer to function
Hi All,

I am trying to bind a fortran library for optimization (http://napsu.karmitsa.fi/lmbm/) to Julia. To do so, I would like to get a pointer to a function written in Julia, which I try to do as

    function fOpt(x::Array{Float64,1},g::Array{Float64,1})
        fill!(g,1.0);
        convert(Cdouble,sum(x))::Cdouble
    end
    const fOptPtr = cfunction(fOpt, Cdouble, (Ptr{Cdouble}, Ptr{Cdouble}))

The call of cfunction, however, returns an error:

    ERROR: cfunction: no method exactly matched the required type signature (function not yet c-callable)
     in cfunction at c.jl:9

Can anyone suggest what I am doing wrong? I use Julia Version 0.4.5 (2016-03-18 00:58 UTC) on OS X.

Thank you very much for the help.
Tomas
Re: [julia-users] Re: Calling all users of ParallelAccelerator.
Hi Todd,

I have been looking at Latte, and it does not seem to be useful for me, since I need some special constructs and they are just not available. Nevertheless, I would like to ask whether Latte uses parallelization? In my own implementation, I am struggling to exploit multi-core hardware.

Thank you very much.
Best wishes,
Tomas

On Saturday, 23 July 2016 05:42:57 UTC+2, Todd Anderson wrote:
>
> You may also want to look at another IntelLabs project on GitHub called
> Latte. It provides a DSL for deep neural networks in Julia.
>
> Todd
>
> --
> *From:* pev...@gmail.com
> *To:* "julia-users"
> *Sent:* Friday, July 22, 2016 8:07:19 PM
> *Subject:* [julia-users] Re: Calling all users of ParallelAccelerator.
>
> Hi Todd,
> I have tried several times to use ParallelAccelerator to speed up my toy
> Neural Network library, but I never had any significant performance boost.
> I like the idea of the project a lot; sadly, I was never able to fully
> utilise it.
>
> Best wishes,
> Tomas
[julia-users] Re: Calling all users of ParallelAccelerator.
Hi Todd,

I have tried several times to use ParallelAccelerator to speed up my toy Neural Network library, but I never had any significant performance boost. I like the idea of the project a lot; sadly, I was never able to fully utilise it.

Best wishes,
Tomas
[julia-users] Re: asynchronous reading from file
Hi James,

thanks for the reply. In your implementation, though, the reading is not in a separate process / thread, so I expect you are bound by IO operations. In my problem there is computationally intensive post-processing. Should I modify the iotask as

    iotask = @task begin
        info("reading from stdin")
        for i in 1:20
            s = @spawn loaddata()
            produce(s)
        end
    end

Do I need to have the consumer of s wrapped as another task? Meaning, will my stochastic gradient descent loop look like yours, and does the stochastic gradient descent need to produce something? I would like to understand the details.

Thanks for the answer.
Tomas
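[Editor's note] Not an authoritative answer, but a sketch (not from James's post) of one way the two pieces can fit together in the 0.4/0.5 producer/consumer style. Kicking off the next remote read before handing the current batch over lets a worker process parse batch i+1 while the main process crunches batch i; loaddata() being defined on the workers and julia being started with -p 2 are assumptions.

    iotask = @task begin
        next = @spawn loaddata()           # start the first read immediately
        for i in 1:20
            s = fetch(next)
            next = @spawn loaddata()       # prefetch the following batch on a worker
            produce(s)
        end
    end

    for s in iotask                        # iterating the task drives it;
        # do the SGD steps on `s` here     # the consumer needs no produce()
    end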
[julia-users] asynchronous reading from file
Hi All,

I would like to implement asynchronous reading from a file. I am doing stochastic gradient descent, and while I am doing the optimisation, I would like to load the data in the background. Since reading the data is followed by quite complicated parsing, it is not just a simple IO operation that can be done without CPU cycles.

The skeleton of my current implementation looks like this:

    rr = RemoteChannel()
    @async put!(rr, remotecall_fetch(loaddata, 2))
    for ii in 1:maxiter
        # do some steps of the gradient descent
        # check if the data are ready and schedule the next read
        if isready(rr)
            append!(dss[1], take!(rr))
            @async put!(rr, remotecall_fetch(loaddata, 2))
        end
    end

Nevertheless, isready(rr) always returns false, which suggests that the data are never loaded. I start julia as julia -p 2, therefore I expect there will be a spare worker. Can anyone explain to me, please, what I am doing wrong?

Thank you very much.
Tomas
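[Editor's note] One plausible explanation, offered as a guess rather than a confirmed diagnosis: @async tasks only get scheduled when the current task yields, and a tight compute loop never does, so the put! task may never run. A minimal variant of the skeleton above that gives the scheduler a chance:

    rr = RemoteChannel()
    @async put!(rr, remotecall_fetch(loaddata, 2))
    for ii in 1:maxiter
        # ... gradient-descent steps ...
        yield()                    # let the @async reader task actually run
        if isready(rr)
            append!(dss[1], take!(rr))
            @async put!(rr, remotecall_fetch(loaddata, 2))
        end
    end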
Re: [julia-users] poor performance of threads
Dear Sam,

the output of the benchmark is the following:

    105.290122 seconds (31.43 k allocations: 1.442 MB, 0.00% gc time)
    107.445101 seconds (1.37 M allocations: 251.368 MB, 0.12% gc time)

Tomas
Re: [julia-users] poor performance of threads
Thank you very much, Tim. I am using the profiler and your package ProfileView quite extensively, and I know where the Achilles heel of my code is; it is CPU bound. That's why I am so puzzled by the threads. I will try to use @code_warntype; I have never used it before.

Best wishes,
Tomas
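[Editor's note] For readers who have not used these tools, a typical session looks roughly like this (the function name is a placeholder for the hot code):

    Profile.clear()
    @profile two()            # collect samples while the hot code runs

    using ProfileView
    ProfileView.view()        # flame graph of where the time goes

    @code_warntype two()      # non-concrete inferred types show up highlighted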
[julia-users] Bug in daxpy! ???
Hello all,

I was polishing my call and I found the following definition of daxpy!, which I was not aware of:

    function axpy!{Ti<:Integer,Tj<:Integer}(α, x::AbstractArray, rx::AbstractArray{Ti}, y::AbstractArray, ry::AbstractArray{Tj})
        if length(x) != length(y)
            throw(DimensionMismatch("x has length $(length(x)), but y has length $(length(y))"))
        elseif minimum(rx) < 1 || maximum(rx) > length(x)
            throw(BoundsError(x, rx))
        elseif minimum(ry) < 1 || maximum(ry) > length(y)
            throw(BoundsError(y, ry))
        elseif length(rx) != length(ry)
            throw(ArgumentError("rx has length $(length(rx)), but ry has length $(length(ry))"))
        end
        for i = 1:length(rx)
            @inbounds y[ry[i]] += x[rx[i]]*α
        end
        y
    end

Is the first check, length(x) != length(y), really the intended behavior? The multiplication goes over the indexes rx and ry; should the check not be length(rx) != length(ry)?

Thanks for the clarification.
Tomas
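[Editor's note] To make the question concrete, here is a small case the first check rejects even though the indexed update itself is well defined; whether that restriction is by design is exactly what is being asked:

    x = ones(4)
    y = zeros(2)
    # Elementwise this is just y[1:2] += 2.0 * x[1:2], yet the code above
    # throws DimensionMismatch because length(x) != length(y):
    # Base.LinAlg.axpy!(2.0, x, 1:2, y, 1:2)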
Re: [julia-users] poor performance of threads
Thanks a lot for the suggestions. As I have mentioned, it was really a toy problem, but I am not getting a significant speedup on a bigger problem either, where the threads are nicely separated and the problem is very CPU bound. I would be very interested to know about a tool that would point out problems with cache and memory access.

Tomas
[julia-users] poor performance of threads
Hi All,

I would like to ask if someone has experience with Threads as they are implemented at the moment in the master branch. After a successful compilation (put JULIA_THREADS=1 into Make.user) I have played with different levels of granularity, but usually the code was slower or more or less the same speed as the single-threaded version. I have even tried a totally stupid exercise like this:

    using Base.Threads

    function one()
        x = randn(100)
        f = 0
        for i in x
            f += i
        end
    end

    function two()
        x = randn(100)
        f = zeros(nthreads())
        @inbounds @threads for i in 1:length(x)
            f[threadid()] += x[i]
        end
        sum(f)
    end

    one()
    @time one()
    two()
    @time two()

and the times on my 2013 Macbook Air were

    0.068617 seconds (2.00 M allocations: 38.157 MB, 9.72% gc time)
    0.394164 seconds (5.72 M allocations: 99.015 MB, 5.00% gc time)

Wow, that is quite poor. I would expect an overhead, but not one this big. Can anyone suggest what is going wrong? I have been trying the profiler, but it does not help; it seems that it does not work with Threads at the moment. Or is it because Threads are still not really supported? I would like to get the speed-up shown in this video: https://www.youtube.com/watch?v=GvLhseZ4D8M

Any suggestions welcomed.
Tomas
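[Editor's note] Two things commonly bite in exactly this pattern, offered as a hedged sketch rather than a definitive diagnosis: the loop body is far too small to amortize the fork/join cost of @threads, and all threads bump adjacent slots of f in the same cache line (false sharing). Spacing the accumulators a cache line apart (8 Float64s = 64 bytes, an assumption about typical hardware) and using a workload large enough to matter looks like:

    using Base.Threads

    function two_padded(n)
        x = randn(n)
        f = zeros(8 * nthreads())          # one cache line per thread
        @threads for i in 1:length(x)
            @inbounds f[8*threadid() - 7] += x[i]
        end
        sum(f)
    end

    two_padded(10^7)          # compile first
    @time two_padded(10^7)    # timing should now improve with more threads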
[julia-users] cannot compile Julia with threading support
Hi All,

I wanted to try julia with threads, therefore I cloned the GIT repository and checked out version 0.4.3:

    git checkout release-0.4

I have put JULIA_THREADS=1 in place; here is my Make.user:

    CC=/opt/rh/devtoolset-2/root/usr/bin/gcc
    CXX=/opt/rh/devtoolset-2/root/usr/bin/g++
    JULIA_THREADS=1

and compiled julia. Nevertheless, I do not see the thread support, since I do not see the module Base.Threads. Does anyone have experience with this? Can anyone recommend what I should do, or say what I did wrong?

Thank you very much for any help.
Best wishes,
Tomas
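[Editor's note] If it helps a future reader: the experimental threading code of this era lived on master, not on the release-0.4 branch, so a release-0.4 build would ignore the flag (this is the editor's recollection, not stated in the thread). After rebuilding from master with the same Make.user, a quick smoke test in the freshly built ./julia is:

    using Base.Threads    # errors with "Threads not defined" if the build lacks threading
    println(nthreads())   # 1 unless JULIA_NUM_THREADS is set in the environment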
[julia-users] Tutorials and examples on threading
Hi All,

I know that julia does not have native support for threads at the moment, but it exists in an experimental branch. Is there any tutorial on this, such that an average user can try it? I would like to try it to speed up my code. I use separate processes at the moment, and I think the overhead is enormous.

Thanks a lot for the answers.
Tomas
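[Editor's note] For anyone wanting a first taste, a minimal sketch with the experimental API of the time (assumes a source build with threading enabled, as discussed elsewhere in this list):

    using Base.Threads

    a = zeros(Int, nthreads())
    @threads for i in 1:nthreads()
        a[threadid()] = threadid()   # each thread writes its own id
    end
    println(a)                       # e.g. [1, 2, 3, 4] with four threads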
[julia-users] Type of composite type
Hello,

I have a problem to which I have found a dirty solution, and I am keen to know if there is a principled one. I have a composite type defined as

    type Outer{T}
        A::T
        B::T
    end

where A and B are composite types. Then I want to create a constructor

    function Outer(k::Int)
        return Outer(A{T}(k), B{T}(k))
    end

but I have not found a way to put the type information there. The only dirty hack I have come up with is to define the outer constructor as

    function Outer(k::Int; T::DataType=Float32)
        return Outer(A{T}(k), B{T}(k))
    end

But I do not like this solution too much; it is a little awkward. Thanks for suggesting a cleaner solution.

Best wishes,
Tomas
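[Editor's note] One cleaner pattern, sketched in the pre-0.6 syntax of the post (A and B stand for the poster's own parametric types): pass the element type as a positional argument and let a static method parameter pick it up, instead of hiding it in a keyword.

    Outer{T}(::Type{T}, k::Int) = Outer(A{T}(k), B{T}(k))

    # usage: the element type is explicit at the call site
    # Outer(Float32, 5)
    # Outer(Float64, 5)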
[julia-users] samples of different sizes in mocha
Hi,

I have a question related to the use of Mocha, particularly whether I can tweak it to my problem. I want to use the library for multi-instance learning, which means that each sample is composed of multiple instances, but the number of instances differs from sample to sample. You can imagine this as a 2D image, where each image has a different height, but the width of all images is the same. When a pooling operation is performed over the height, we get samples of fixed size, and one can follow the usual Neural Nets pipeline, where each sample has the same size.

What I feel is that the usual data provider cannot be used here, as it assumes that all samples are of fixed size. Am I right? Is there a way this can be fixed?

Thank you very much for the help, and I apologise if this is not the right place to ask my question, though I do not know of any other.
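[Editor's note] To illustrate the reduction being described, independent of Mocha's API (a sketch, not Mocha code): pooling over the variable dimension collapses any height to a fixed-length vector.

    # x is height × width; height varies per sample, width is fixed
    pool_over_height(x::Matrix{Float64}) = vec(maximum(x, 1))  # length == width, always

    pool_over_height(rand(7, 5))   # 5-element vector
    pool_over_height(rand(3, 5))   # also a 5-element vector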
[julia-users] alignement in memory when allocating arrays
Hi all,

I would like to ask if it is possible to enforce memory alignment for large arrays, such that the arrays are aligned to 64 bytes and can be used efficiently with SIMD instructions. I intend to call library functions written in c/c++.

Thanks for the answer.
Tomas
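[Editor's note] One possibility, sketched under the assumption of a POSIX system and 0.4-era APIs: allocate with posix_memalign through ccall and wrap the pointer with pointer_to_array. The caller owns the memory, so it must outlive the Array and be freed manually.

    function aligned_vector(n::Int, alignment::Int = 64)
        p = Ref{Ptr{Void}}(C_NULL)
        ret = ccall(:posix_memalign, Cint, (Ptr{Ptr{Void}}, Csize_t, Csize_t),
                    p, alignment, n * sizeof(Float64))
        ret == 0 || error("posix_memalign failed: $ret")
        # own = false: Julia will not free this; call Libc.free(pointer(a)) yourself
        pointer_to_array(convert(Ptr{Float64}, p[]), n, false)
    end

    a = aligned_vector(1024)
    @assert UInt(pointer(a)) % 64 == 0   # 64-byte aligned, SIMD friendly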