Andrei Zh I'm confused. Have you actually tried?
julia> io = IOBuffer()
IOBuffer(data=UInt8[...], readable=true, writable=true, seekable=true, append=false, size=0, maxsize=Inf, ptr=1, mark=-1)

julia> foo(x) = x + 1
foo (generic function with 1 method)

julia> serialize(io, foo)

julia> seekstart(io)
IOBuffer(data=UInt8[...], readable=true, writable=true, seekable=true, append=false, size=9, maxsize=Inf, ptr=1, mark=-1)

julia> baz = deserialize(io)
foo (generic function with 1 method)

julia> baz(1)
2

The serialization code won't recursively serialize all of the function's
dependencies, so you will have to send/serialize the code that defines the
environment (types, constants, packages, etc.).

On Friday, August 14, 2015 at 6:23:23 AM UTC-4, Andrei Zh wrote:
>
> Yes, but once again, I'm not using Julia workers, but instead completely
> independent Julia processes, running on different machines and ruled by
> Spark, not by Julia's ClusterManager. I.e. workflow looks like this:
>
> 1. Julia process 1 starts JVM and connects to Spark master node.
> 2. Julia process 1 sends serialized function to Spark master node.
> 3. Spark master node notifies Spark worker nodes (say, there are N of
>    them) about upcoming computations.
> 4. Each Spark worker node creates its own Julia process, independent from
>    Julia process 1.
> 5. Each Spark worker node receives serialized function and passes it to
>    its local Julia process.
>
> So with N workers in Spark cluster, there's in total N+1 Julia processes,
> and when function in question is created, Julia processes from 2 to N+1
> don't even exist yet.
>
>
> On Friday, August 14, 2015 at 12:35:18 PM UTC+3, Tim Holy wrote:
>>
>> If you define the function with @everywhere, it will be defined on all
>> existing workers. Likewise, `using MyPackage` loads the package on all
>> workers.
>>
>> --Tim
>>
>> On Thursday, August 13, 2015 03:10:54 PM Andrei Zh wrote:
>> > Ok, after going through serialization code, it's clear that default
>> > implementation doesn't support serializing function code, but only its
>> > name. For example, here's relevant section from
>> > `deserialize(::SerializationState, ::Function)`:
>> >
>> >     mod = deserialize(s)::Module
>> >     name = deserialize(s)::Symbol
>> >     if !isdefined(mod,name)
>> >         return (args...)->error("function $name not defined on process $(myid())")
>> >     end
>> >
>> > This doesn't fit my needs (essentially, semantics of Spark), and I guess
>> > there's no existing solution for full function serialization. Thus I'm
>> > going to write new solution for this.
>> >
>> > So far the best idea I have is to get function's AST and recursively
>> > serialize it, catching calls to the other non-Base function and any bound
>> > variables. But this looks quite complicated. Is there better / easier way
>> > to get portable function's representation?
>> >
>> > On Monday, August 10, 2015 at 11:48:55 PM UTC+3, Andrei Zh wrote:
>> > > Yes, I incorrectly assumed `serialize` / `deserialize` use JLD format. But
>> > > anyway, even when I saved the function into "example.jls" or even plain
>> > > byte array (using IOBuffer and `takebuf_array`), nothing changed. Am I
>> > > missing something obvious?
>> > >
>> > > On Monday, August 10, 2015 at 11:40:03 PM UTC+3, Tim Holy wrote:
>> > >> On Monday, August 10, 2015 01:13:15 PM Tony Kelman wrote:
>> > >> > Should probably use some different extension for that, .jls or
>> > >> > something, to avoid confusion.
>> > >>
>> > >> Yes.
>> > >> That has been sufficiently confusing in the past, we even cover this
>> > >> here:
>> > >>
>> > >> https://github.com/JuliaLang/JLD.jl#saving-and-loading-variables-in-julia-data-format-jld
>> > >>
>> > >> --Tim
>> > >>
>> > >> > On Monday, August 10, 2015 at 12:45:35 PM UTC-7, Stefan Karpinski wrote:
>> > >> > > JLD doesn't support serializing functions but Julia itself does.
>> > >> > >
>> > >> > > On Mon, Aug 10, 2015 at 3:43 PM, Andrei Zh <faithle...@gmail.com> wrote:
>> > >> > >> I'm afraid it's not quite true, and I found simple way to show it.
>> > >> > >> In the next code snippet I define function `f` and serialize it to
>> > >> > >> a file:
>> > >> > >>
>> > >> > >> julia> f(x) = x + 1
>> > >> > >> f (generic function with 1 method)
>> > >> > >>
>> > >> > >> julia> f(5)
>> > >> > >> 6
>> > >> > >>
>> > >> > >> julia> open("example.jld", "w") do io serialize(io, f) end
>> > >> > >>
>> > >> > >> Then I close Julia REPL and in a new session try to load and use
>> > >> > >> this function:
>> > >> > >>
>> > >> > >> julia> f2 = open("example.jld") do io deserialize(io) end
>> > >> > >> (anonymous function)
>> > >> > >>
>> > >> > >> julia> f2(5)
>> > >> > >> ERROR: function f not defined on process 1
>> > >> > >>  in error at error.jl:21
>> > >> > >>  in anonymous at serialize.jl:398
>> > >> > >>
>> > >> > >> So deserialized function still refers to the old definition, which
>> > >> > >> is not available in this new session.
>> > >> > >>
>> > >> > >> Is there any better way to serialize a function and run it on an
>> > >> > >> unrelated Julia process?
>> > >> > >>
>> > >> > >> On Monday, August 10, 2015 at 2:33:11 PM UTC+3, Jeff Waller wrote:
>> > >> > >>>> My question is: does Julia's serialization produce completely
>> > >> > >>>> self-containing code that can be run on workers?
>> > >> > >>>> In other words, is it possible to send serialized function over
>> > >> > >>>> network to another host / Julia process and applied there without
>> > >> > >>>> any additional information from the first process?
>> > >> > >>>>
>> > >> > >>>> I made some tests on a single machine, and when I defined function
>> > >> > >>>> without `@everywhere`, worker failed with a message "function
>> > >> > >>>> myfunc not defined on process 1". With `@everywhere`, my code
>> > >> > >>>> worked, but will it work on multiple hosts with essentially
>> > >> > >>>> independent Julia processes?
>> > >> > >>>
>> > >> > >>> According to Jey here
>> > >> > >>> <https://groups.google.com/forum/#!searchin/julia-users/jey/julia-users/bolLGcSCrs0/fGGVLgNhI2YJ>,
>> > >> > >>> Base.serialize does what we want; it's contained in serialize.jl
>> > >> > >>> <https://github.com/JuliaLang/julia/blob/master/base/serialize.jl>
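
For readers on a current Julia release, here is a minimal sketch of the workaround suggested at the top of the thread: send the code that defines the function rather than the function object itself. It assumes `serialize` / `deserialize` come from the `Serialization` standard library (they lived in Base in the 2015 sessions quoted above). The names `def`, `sandbox`, and `Sandbox` are illustrative only, and a fresh module merely stands in for an unrelated Julia process; this is not the mechanism Spark or any package actually uses.

```julia
using Serialization  # stdlib on Julia 0.7 and later (assumption: a current release)

# Ship the definition as an expression instead of the function object.
def = :(foo(x) = x + 1)        # same toy function as in the thread

io = IOBuffer()
serialize(io, def)
seekstart(io)

# "Receiving side": a fresh module stands in for a process that never saw `foo`.
received = deserialize(io)
sandbox = Module(:Sandbox)     # illustrative name; Base is imported by default
Core.eval(sandbox, received)   # defines `foo` inside the sandbox module

# Call the freshly defined function inside the sandbox.
result = Core.eval(sandbox, :(foo(1)))
println(result)
```

In the Spark workflow described above, the same bytes could be handed to each worker's local Julia process, but anything `foo` depends on (types, constants, packages) would still have to be shipped or loaded there in the same way.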