Re: [julia-users] Re: Can Julia function be serialized and sent by network?

2015-09-22 Thread Andrei
So far the best way to overcome it is to install all needed Julia packages
on every machine. This is not very convenient, but at least it's not a
blocker, and you need to install some Julia packages manually anyway (though
I'm thinking of creating an API for mass installation of packages on all
Spark workers).

I'm not sure what you mean by "Julia-on-C++", though.

On Tue, Sep 22, 2015 at 5:33 AM, edward zhang  wrote:

>
> hmm, I'm already very interested in the project like
> Julia-on-(Spark/c++?),
> and the ser/des issue is a big obstacle.

Re: [julia-users] Re: Can Julia function be serialized and sent by network?

2015-09-21 Thread edward zhang

Hmm, I'm already very interested in a project like Julia-on-(Spark/C++?),
and the serialization/deserialization issue is a big obstacle.
On Monday, September 21, 2015 at 9:30:05 PM UTC+8, Andrei Zh wrote:
>
> Hi, 
>
> not yet. I made some initial research regarding serialization of ASTs and 
> reconstructing functions from them, but it seems quite a tricky procedure 
> and I have very little time for this project now. I plan to come back to 
> this issue around the beginning of the next month. 

Re: [julia-users] Re: Can Julia function be serialized and sent by network?

2015-09-21 Thread Andrei
Hi,

not yet. I did some initial research on serializing ASTs and
reconstructing functions from them, but it seems to be quite a tricky procedure
and I have very little time for this project now. I plan to come back to
this issue around the beginning of next month.

On Mon, Sep 21, 2015 at 11:25 AM, edward zhang  wrote:

> hi, dear,
>  have you already fixed this problem?


Re: [julia-users] Re: Can Julia function be serialized and sent by network?

2015-09-21 Thread edward zhang
Hi,
have you already fixed this problem?



Re: [julia-users] Re: Can Julia function be serialized and sent by network?

2015-08-14 Thread Jake Bolewski
Andrei Zh

I'm confused. Have you actually tried?

julia> io = IOBuffer()
IOBuffer(data=UInt8[...], readable=true, writable=true, seekable=true,
append=false, size=0, maxsize=Inf, ptr=1, mark=-1)

julia> foo(x) = x + 1
foo (generic function with 1 method)

julia> serialize(io, foo)

julia> seekstart(io)
IOBuffer(data=UInt8[...], readable=true, writable=true, seekable=true,
append=false, size=9, maxsize=Inf, ptr=1, mark=-1)

julia> baz = deserialize(io)
foo (generic function with 1 method)

julia> baz(1)
2

The serialization code won't recursively serialize all of the function's
dependencies, so you will have to send/serialize the code that defines the
environment (types, constants, packages, etc.).


Re: [julia-users] Re: Can Julia function be serialized and sent by network?

2015-08-14 Thread Andrei Zh

Hi Jake, 

your example works because you don't leave the Julia session. `foo` is defined
in this session, so the pair of module name and function name is enough
to get the function object. If you save the serialized function (or just retype
it byte by byte), it won't work. Here's an example:

Session #1:

julia> io = IOBuffer()
IOBuffer(data=Uint8[...], readable=true, writable=true, seekable=true,
append=false, size=0, maxsize=Inf, ptr=1, mark=-1)

julia> foo(x) = x + 1
foo (generic function with 1 method)

julia> serialize(io, foo)

julia> takebuf_array(io)
9-element Array{Uint8,1}:
 0x13
 0x02
 0x23
 0x2f
 0x02
 0x03
 0x66
 0x6f
 0x6f

Session #2:

julia> data = Uint8[0x13, 0x02, 0x23, 0x2f, 0x02, 0x03, 0x66, 0x6f, 0x6f]
9-element Array{Uint8,1}:
 0x13
 0x02
 0x23
 0x2f
 0x02
 0x03
 0x66
 0x6f
 0x6f

julia> io = IOBuffer(data)
IOBuffer(data=Uint8[...], readable=true, writable=false, seekable=true,
append=false, size=9, maxsize=Inf, ptr=1, mark=-1)

julia> bar = deserialize(io)
(anonymous function)

julia> bar(1)
ERROR: function foo not defined on process 1
 in error at error.jl:21
 in anonymous at serialize.jl:398



Re: [julia-users] Re: Can Julia function be serialized and sent by network?

2015-08-14 Thread Tim Holy
If you define the function with @everywhere, it will be defined on all existing 
workers. Likewise, `using MyPackage` loads the package on all workers.
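
For readers on a current Julia, the `@everywhere` approach can be sketched roughly as below. This is only an illustration, assuming Julia 1.x with the `Distributed` standard library; the worker count of 2 is arbitrary, and `foo` is the toy function used elsewhere in this thread.

```julia
using Distributed

# Start two local worker processes (arbitrary count for illustration).
addprocs(2)

# Without @everywhere, this definition would exist only on process 1.
@everywhere foo(x) = x + 1

# Call foo on every worker; each worker resolves `foo` against the
# definition that @everywhere created in its own session.
results = [remotecall_fetch(foo, w, 1) for w in workers()]
println(results)

# Shut the workers down again.
rmprocs(workers())
```

Note that this only helps for workers that Julia itself manages; it says nothing about processes started independently.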

--Tim



Re: [julia-users] Re: Can Julia function be serialized and sent by network?

2015-08-14 Thread Andrei Zh
Yes, but once again, I'm not using Julia workers, but instead completely
independent Julia processes, running on different machines and managed by
Spark, not by Julia's ClusterManager. I.e., the workflow looks like this:

1. Julia process 1 starts JVM and connects to Spark master node. 
2. Julia process 1 sends serialized function to Spark master node. 
3. Spark master node notifies Spark worker nodes (say, there are N of them) 
about upcoming computations. 
4. Each Spark worker node creates its own Julia process, independent from 
Julia process 1. 
5. Each Spark worker node receives serialized function and passes it to its 
local Julia process. 

So with N workers in Spark cluster, there's in total N+1 Julia processes, 
and when function in question is created, Julia processes from 2 to N+1 
don't even exist yet.
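
Steps 4 and 5 can be imitated locally without Spark. The sketch below is only a stand-in under several assumptions: Julia 1.x with the `Serialization` standard library, a temporary file in place of the network, a freshly spawned `julia` process in place of a Spark worker's local Julia, and the function shipped as an expression rather than as a function object. The names `ex`, `path` and `worker_code` are made up for the example.

```julia
using Serialization

# "Julia process 1": describe the function as an expression and write it out.
ex = :(x -> x + 1)
path = tempname()
open(io -> serialize(io, ex), path, "w")

# Code run by the "worker": a brand-new Julia process that did not exist
# when the function was created and shares no state with this one.
worker_code = """
using Serialization
ex = open(deserialize, ARGS[1])
f = eval(ex)
println(Base.invokelatest(f, 1))
"""

# Spawn the independent process and capture what it prints.
out = read(`$(Base.julia_cmd()) -e $worker_code $path`, String)
print(out)
```

This works only because the expression is self-contained; anything it referenced from a package or from the first session would still have to be available on the worker side.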



Re: [julia-users] Re: Can Julia function be serialized and sent by network?

2015-08-13 Thread Andrei Zh
Ok, after going through the serialization code, it's clear that the default
implementation doesn't support serializing a function's code, only its
name. For example, here's the relevant section from
`deserialize(::SerializationState, ::Function)`:

    mod = deserialize(s)::Module
    name = deserialize(s)::Symbol
    if !isdefined(mod,name)
        return (args...)->error("function $name not defined on process $(myid())")
    end

This doesn't fit my needs (essentially, the semantics of Spark), and I guess
there's no existing solution for full function serialization. Thus I'm
going to write a new solution for this.

So far the best idea I have is to get the function's AST and recursively
serialize it, catching calls to other non-Base functions and any bound
variables. But this looks quite complicated. Is there a better / easier way
to get a portable representation of a function?
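
The simplest form of the AST idea, with none of the dependency tracking, might look like the sketch below. It assumes Julia 1.x and the `Serialization` standard library (not the 2015 API shown in this thread), and it only handles an expression that refers to nothing outside Base; `ex`, `bytes` and `f` are illustrative names.

```julia
using Serialization

# Sending side: keep the function as an expression (its AST), not as a
# function object, and serialize that expression to bytes.
ex = :(x -> x + 1)
io = IOBuffer()
serialize(io, ex)
bytes = take!(io)   # this byte vector is what would travel over the network

# Receiving side: rebuild the expression and evaluate it into a function.
ex2 = deserialize(IOBuffer(bytes))
f = eval(ex2)

# invokelatest makes the call safe even if this code is wrapped in a function.
println(Base.invokelatest(f, 1))
```

Calls to non-Base functions and captured variables are exactly what this leaves out, which is the complicated part described above.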


On Monday, August 10, 2015 at 11:48:55 PM UTC+3, Andrei Zh wrote:

> Yes, I incorrectly assumed `serialize` / `deserialize` use JLD format. But
> anyway, even when I saved the function into example.jls or even plain
> byte array (using IOBuffer and `takebuf_array`), nothing changed. Am I
> missing something obvious?





Re: [julia-users] Re: Can Julia function be serialized and sent by network?

2015-08-10 Thread Stefan Karpinski
JLD doesn't support serializing functions but Julia itself does.

On Mon, Aug 10, 2015 at 3:43 PM, Andrei Zh faithlessfri...@gmail.com
wrote:

> I'm afraid it's not quite true, and I found a simple way to show it. In the
> next code snippet I define function `f` and serialize it to a file:
>
> julia> f(x) = x + 1
> f (generic function with 1 method)
>
> julia> f(5)
> 6
>
> julia> open("example.jld", "w") do io serialize(io, f) end
>
> Then I close the Julia REPL and in a new session try to load and use this
> function:
>
> julia> f2 = open("example.jld") do io deserialize(io) end
> (anonymous function)
>
> julia> f2(5)
> ERROR: function f not defined on process 1
>  in error at error.jl:21
>  in anonymous at serialize.jl:398
>
> So the deserialized function still refers to the old definition, which is not
> available in this new session.
>
> Is there any better way to serialize a function and run it on an unrelated
> Julia process?
>
> On Monday, August 10, 2015 at 2:33:11 PM UTC+3, Jeff Waller wrote:
>>
>>> My question is: does Julia's serialization produce completely
>>> self-contained code that can be run on workers? In other words, is it
>>> possible to send a serialized function over the network to another host /
>>> Julia process and apply it there without any additional information from
>>> the first process?
>>>
>>> I made some tests on a single machine, and when I defined a function
>>> without `@everywhere`, the worker failed with the message "function myfunc
>>> not defined on process 1". With `@everywhere`, my code worked, but will it
>>> work on multiple hosts with essentially independent Julia processes?
>>
>> According to Jey here
>> https://groups.google.com/forum/#!searchin/julia-users/jey/julia-users/bolLGcSCrs0/fGGVLgNhI2YJ,
>> Base.serialize does what we want; it's contained in serialize.jl
>> https://github.com/JuliaLang/julia/blob/master/base/serialize.jl




Re: [julia-users] Re: Can Julia function be serialized and sent by network?

2015-08-10 Thread Tim Holy
On Monday, August 10, 2015 01:13:15 PM Tony Kelman wrote:
> Should probably use some different extension for that, .jls or something,
> to avoid confusion.

Yes. That has been confusing enough in the past that we even cover it here:
https://github.com/JuliaLang/JLD.jl#saving-and-loading-variables-in-julia-data-format-jld

--Tim

 



[julia-users] Re: Can Julia function be serialized and sent by network?

2015-08-10 Thread Andrei Zh
I'm afraid that's not quite true, and I found a simple way to show it. In the
next code snippet I define function `f` and serialize it to a file:

julia> f(x) = x + 1
f (generic function with 1 method)

julia> f(5)
6

julia> open("example.jld", "w") do io serialize(io, f) end


Then I close the Julia REPL and, in a new session, try to load and use this
function:

julia> f2 = open("example.jld") do io deserialize(io) end
(anonymous function)

julia> f2(5)
ERROR: function f not defined on process 1
 in error at error.jl:21
 in anonymous at serialize.jl:398


So the deserialized function still refers to the old definition, which is not
available in this new session.

Is there any better way to serialize a function and run it on an unrelated
Julia process?
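
One direction that might sidestep the problem, sketched here purely as an
assumption and not as something confirmed in this thread: serialize the
function's definition as an expression (AST) instead of the function object,
and evaluate it on the receiving side. The sketch assumes a recent Julia 1.x,
where `serialize` / `deserialize` live in the `Serialization` standard library
rather than in Base. The names `definition`, `bytes`, `received` and `g` are
illustrative only, and both sides are assumed to run the same Julia version.

```julia
using Serialization   # assumption: Julia 1.x; in 2015 these functions were in Base

# Sender: keep the definition as an expression (AST), not as a function object.
definition = :(f(x) = x + 1)

io = IOBuffer()
serialize(io, definition)
bytes = take!(io)      # raw bytes that could be written to a file or a socket

# Receiver (hypothetically a separate, unrelated Julia process):
received = deserialize(IOBuffer(bytes))
g = Core.eval(Main, received)   # evaluating the definition returns the new function

# invokelatest guards against world-age errors if this runs inside a function.
println(Base.invokelatest(g, 5))
```

The trade-off is that the receiver evaluates arbitrary code, and anything the
expression refers to (other functions, packages, globals) must still exist on
the receiving side.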

On Monday, August 10, 2015 at 2:33:11 PM UTC+3, Jeff Waller wrote:

  

>> My question is: does Julia's serialization produce completely
>> self-containing code that can be run on workers? In other words, is it
>> possible to send serialized function over network to another host / Julia
>> process and applied there without any additional information from the first
>> process?
>>
>> I made some tests on a single machine, and when I defined function
>> without `@everywhere`, worker failed with a message "function myfunc not
>> defined on process 1". With `@everywhere`, my code worked, but will it work
>> on multiple hosts with essentially independent Julia processes?
>
> According to Jey here
> https://groups.google.com/forum/#!searchin/julia-users/jey/julia-users/bolLGcSCrs0/fGGVLgNhI2YJ,
> Base.serialize does what we want; it's contained in serialize.jl
> https://github.com/JuliaLang/julia/blob/master/base/serialize.jl



Re: [julia-users] Re: Can Julia function be serialized and sent by network?

2015-08-10 Thread Tony Kelman
The above code wasn't using the HDF5-based JLD package/format; it was just
using .jld as a file extension to store the results of serialize(). It should
probably use a different extension for that, .jls or something, to avoid
confusion.


On Monday, August 10, 2015 at 12:45:35 PM UTC-7, Stefan Karpinski wrote:

> JLD doesn't support serializing functions but Julia itself does.





Re: [julia-users] Re: Can Julia function be serialized and sent by network?

2015-08-10 Thread Andrei Zh
Yes, I incorrectly assumed that `serialize` / `deserialize` use the JLD format.
But anyway, even when I saved the function into example.jls or even into a plain
byte array (using IOBuffer and `takebuf_array`), nothing changed. Am I
missing something obvious?


On Monday, August 10, 2015 at 11:40:03 PM UTC+3, Tim Holy wrote:

> On Monday, August 10, 2015 01:13:15 PM Tony Kelman wrote:
>> Should probably use some different extension for that, .jls or something,
>> to avoid confusion.
>
> Yes. That has been sufficiently confusing in the past, we even cover this
> here:
> https://github.com/JuliaLang/JLD.jl#saving-and-loading-variables-in-julia-data-format-jld
>
> --Tim

  



[julia-users] Re: Can Julia function be serialized and sent by network?

2015-08-10 Thread Jeff Waller
 

> My question is: does Julia's serialization produce completely
> self-containing code that can be run on workers? In other words, is it
> possible to send serialized function over network to another host / Julia
> process and applied there without any additional information from the first
> process?
>
> I made some tests on a single machine, and when I defined function without
> `@everywhere`, worker failed with a message "function myfunc not defined on
> process 1". With `@everywhere`, my code worked, but will it work on
> multiple hosts with essentially independent Julia processes?

 
According to Jey here 
https://groups.google.com/forum/#!searchin/julia-users/jey/julia-users/bolLGcSCrs0/fGGVLgNhI2YJ,
 
Base.serialize does what we want; it's contained in serialize.jl 
https://github.com/JuliaLang/julia/blob/master/base/serialize.jl
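
For reference, the single-machine test described in the quoted question can
be sketched roughly as follows. This is a reconstruction under assumptions,
not the original test code: it assumes a recent Julia 1.x with the
`Distributed` standard library, one local worker started with `addprocs`, and
the variable names `w` and `result` are illustrative.

```julia
using Distributed

addprocs(1)                      # start one local worker process
w = first(workers())             # id of that worker

# Define myfunc on the master process and on every worker. Without
# @everywhere, the worker would not know a function with this name.
@everywhere myfunc(x) = x + 1

result = remotecall_fetch(myfunc, w, 5)   # run myfunc(5) on the worker
println(result)

rmprocs(w)                       # shut the worker down again
```

Here the worker only receives a reference to `myfunc` by name, which is why
the definition has to exist on both sides.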