Andrei Zh

I'm confused.  Have you actually tried?  

julia> io = IOBuffer()
IOBuffer(data=UInt8[...], readable=true, writable=true, seekable=true, append=false, size=0, maxsize=Inf, ptr=1, mark=-1)

julia> foo(x) =  x + 1
foo (generic function with 1 method)

julia> serialize(io, foo)

julia> seekstart(io)
IOBuffer(data=UInt8[...], readable=true, writable=true, seekable=true, append=false, size=9, maxsize=Inf, ptr=1, mark=-1)

julia> baz = deserialize(io)
foo (generic function with 1 method)

julia> baz(1)
2

The serialization code won't recursively serialize all of the function's 
dependencies, so you will have to send/serialize the code that defines the 
environment (types, constants, packages, etc.).
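
One way to send the defining code rather than just the function's name is to serialize the expression that defines it and evaluate that expression on the receiving side. The following is only a rough sketch under some assumptions: it targets Julia 1.x, where `serialize` / `deserialize` live in the `Serialization` standard library (on 0.3/0.4 they are in Base, so the `using` line would be dropped); `inc` is a made-up name; and rewinding the same buffer stands in for an actual transfer to another process.

```julia
using Serialization  # assumption: Julia 1.x; on 0.3/0.4 these functions are in Base

# Hypothetical example: keep the definition as an expression instead of a function.
definition = :(inc(x) = x + 1)

# Sending side: write the expression into a byte buffer.
io = IOBuffer()
serialize(io, definition)

# Receiving side, simulated here by rewinding the same buffer.
seekstart(io)
received = deserialize(io)

# Evaluating the expression at top level defines `inc` in the current module.
eval(received)
result = inc(1)
```

Anything the definition refers to (other functions, types, constants, packages) would have to be shipped and evaluated the same way before the function is called.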

On Friday, August 14, 2015 at 6:23:23 AM UTC-4, Andrei Zh wrote:
>
> Yes, but once again, I'm not using Julia workers, but instead completely 
> independent Julia processes, running on different machines and managed by 
> Spark, not by Julia's ClusterManager. I.e. the workflow looks like this:
>
> 1. Julia process 1 starts a JVM and connects to the Spark master node. 
> 2. Julia process 1 sends the serialized function to the Spark master node. 
> 3. The Spark master node notifies the Spark worker nodes (say, there are N 
> of them) about upcoming computations. 
> 4. Each Spark worker node creates its own Julia process, independent of 
> Julia process 1. 
> 5. Each Spark worker node receives the serialized function and passes it to 
> its local Julia process. 
>
> So with N workers in the Spark cluster, there are N+1 Julia processes in 
> total, and when the function in question is created, Julia processes 2 
> through N+1 don't even exist yet.
>
>
> On Friday, August 14, 2015 at 12:35:18 PM UTC+3, Tim Holy wrote:
>>
>> If you define the function with @everywhere, it will be defined on all 
>> existing workers. Likewise, `using MyPackage` loads the package on all 
>> workers. 
>>
>> --Tim 
>>
>> On Thursday, August 13, 2015 03:10:54 PM Andrei Zh wrote: 
>> > Ok, after going through the serialization code, it's clear that the 
>> > default implementation doesn't support serializing a function's code, 
>> > but only its name. For example, here's the relevant section from 
>> > `deserialize(::SerializationState, ::Function)`: 
>> > mod = deserialize(s)::Module 
>> > name = deserialize(s)::Symbol 
>> > if !isdefined(mod,name) 
>> >     return (args...)->error("function $name not defined on process $(myid())") 
>> > end 
>> > 
>> > 
>> > 
>> > This doesn't fit my needs (essentially, the semantics of Spark), and I 
>> > guess there's no existing solution for full function serialization. Thus 
>> > I'm going to write a new solution for this. 
>> > 
>> > So far the best idea I have is to get the function's AST and recursively 
>> > serialize it, catching calls to other non-Base functions and any bound 
>> > variables. But this looks quite complicated. Is there a better / easier 
>> > way to get a portable representation of a function? 
>> > 
>> > On Monday, August 10, 2015 at 11:48:55 PM UTC+3, Andrei Zh wrote: 
>> > > Yes, I incorrectly assumed `serialize` / `deserialize` use the JLD 
>> > > format. But anyway, even when I saved the function into "example.jls" 
>> > > or even a plain byte array (using IOBuffer and `takebuf_array`), 
>> > > nothing changed. Am I missing something obvious? 
>> > > 
>> > > On Monday, August 10, 2015 at 11:40:03 PM UTC+3, Tim Holy wrote: 
>> > >> On Monday, August 10, 2015 01:13:15 PM Tony Kelman wrote: 
>> > >> > Should probably use some different extension for that, .jls or 
>> > >> > something, to avoid confusion. 
>> > >> 
>> > >> Yes. That has been sufficiently confusing in the past, we even cover 
>> > >> this here: 
>> > >> 
>> > >> https://github.com/JuliaLang/JLD.jl#saving-and-loading-variables-in-julia-data-format-jld 
>> > >> 
>> > >> --Tim 
>> > >> 
>> > >> > On Monday, August 10, 2015 at 12:45:35 PM UTC-7, Stefan Karpinski wrote: 
>> > >> > > JLD doesn't support serializing functions but Julia itself does. 
>> > >> > > 
>> > >> > > On Mon, Aug 10, 2015 at 3:43 PM, Andrei Zh <faithle...@gmail.com> wrote: 
>> > >> > >> I'm afraid it's not quite true, and I found a simple way to show 
>> > >> > >> it. In the next code snippet I define function `f` and serialize 
>> > >> > >> it to a file: 
>> > >> > >> 
>> > >> > >> julia> f(x) = x + 1 
>> > >> > >> f (generic function with 1 method) 
>> > >> > >> 
>> > >> > >> julia> f(5) 
>> > >> > >> 6 
>> > >> > >> 
>> > >> > >> julia> open("example.jld", "w") do io serialize(io, f) end 
>> > >> > >> 
>> > >> > >> 
>> > >> > >> Then I close the Julia REPL and in a new session try to load and 
>> > >> > >> use this function: 
>> > >> > >> 
>> > >> > >> julia> f2 = open("example.jld") do io deserialize(io) end 
>> > >> > >> (anonymous function) 
>> > >> > >> 
>> > >> > >> julia> f2(5) 
>> > >> > >> ERROR: function f not defined on process 1 
>> > >> > >> 
>> > >> > >>  in error at error.jl:21 
>> > >> > >>  in anonymous at serialize.jl:398 
>> > >> > >> 
>> > >> > >> So the deserialized function still refers to the old definition, 
>> > >> > >> which is not available in this new session. 
>> > >> > >> 
>> > >> > >> Is there any better way to serialize a function and run it on an 
>> > >> > >> unrelated Julia process? 
>> > >> > >> 
>> > >> > >> On Monday, August 10, 2015 at 2:33:11 PM UTC+3, Jeff Waller wrote: 
>> > >> > >>>> My question is: does Julia's serialization produce completely 
>> > >> > >>>> self-contained code that can be run on workers? In other words, 
>> > >> > >>>> is it possible to send a serialized function over the network to 
>> > >> > >>>> another host / Julia process and apply it there without any 
>> > >> > >>>> additional information from the first process? 
>> > >> > >>>> 
>> > >> > >>>> I made some tests on a single machine, and when I defined the 
>> > >> > >>>> function without `@everywhere`, the worker failed with the message 
>> > >> > >>>> "function myfunc not defined on process 1". With `@everywhere`, my 
>> > >> > >>>> code worked, but will it work on multiple hosts with essentially 
>> > >> > >>>> independent Julia processes? 
>> > >> > >>> 
>> > >> > >>> According to Jey here 
>> > >> > >>> <https://groups.google.com/forum/#!searchin/julia-users/jey/julia-users/bolLGcSCrs0/fGGVLgNhI2YJ>, 
>> > >> > >>> Base.serialize does what we want; it's contained in serialize.jl 
>> > >> > >>> <https://github.com/JuliaLang/julia/blob/master/base/serialize.jl> 
>>
>>
