On Sunday, October 19, 2014 5:25:34 PM UTC-4, Greg Plowman wrote:
>
> Hi,
>
> I have several general questions that came up in my first foray into Julia.
>
> Julia seems such a delight to work with, things seem to work magically
> and lots of details are not required or implicitly assumed.
> Whilst this is great for programming, it does mean I'm a little unsure
> about some things, especially about types, efficiency and what I get for
> free.
> In any case, here are some questions:
>
>
> I want to do some parallel simulations and combine/reduce the set of
> results into a single result.
>
> As a minimal starting point, I defined a composite type to hold sim
> results, a single no-argument constructor, and a + function for reducing.
>
>
>
> type Counters
>     freqBase::Array{Int64,1}
>     freqFeature::Array{Int64,1}
>     freqWin::Array{Int64,1}
>     freqPrize::Array{Int64,1}
>     freqCombination::Array{Int64,2}
>
>     # no-argument constructor
>     function Counters()
>         this = new()
>         this.freqBase = zeros(Int64, 100000)
>         this.freqFeature = zeros(Int64, 100000)
>         this.freqWin = zeros(Int64, 100000)
>         this.freqPrize = zeros(Int64, maxPrize)
>         this.freqCombination = zeros(Int64, numSymbols, maxState)
>         return this
>     end
> end
>
>
>
> function +(c1::Counters, c2::Counters)
>     c = Counters()
>     c.freqBase = c1.freqBase + c2.freqBase
>     c.freqFeature = c1.freqFeature + c2.freqFeature
>     c.freqWin = c1.freqWin + c2.freqWin
>     c.freqPrize = c1.freqPrize + c2.freqPrize
>     c.freqCombination = c1.freqCombination + c2.freqCombination
>     return c
> end
>
>
>
> To my surprise this was sufficient to provide the functionality I needed
> to return sim result from pmap and reduce to a single result:
>
>
>
> const numProcessors = 4
>
> if nprocs() < numProcessors
>     addprocs(numProcessors - nprocs())
> end
>
> const numTrials = 100
> const numPlays = 1000000
>
> trialCounts = pmap(Simulation, fill(numPlays, numTrials))
> totalCounts = sum(trialCounts)
>
> PrintCountersSummary(trialCounts) # print summary for each trial
> PrintCounters(totalCounts) # print total combined results
>
>
>
> Surprisingly (to me) all this worked.
>
>
>
> Q1. Why does pmap return Vector{Any} rather than Vector{Counters}, when
> the return type from Simulation() is my user-defined type Counters?
>
>
>
> I inserted an extra line:
>
> trialCounts = convert(Vector{Counters}, trialCounts)
>
>
>
> I was surprised this even worked, because I didn’t define convert.
>
pmap returns a Vector{Any} because the results coming back from remote
workers carry no static type, and pmap does not try to narrow the array's
element type after retrieving the items. Your convert call worked without a
definition of your own because
convert{T,n,S}(::Type{Array{T,n}}, x::Array{S,n}) is already defined here
<https://github.com/JuliaLang/julia/blob/0ceef8e7365b5abf5dda17498878db864a9601a1/base/array.jl#L220>.
(You can find this using which.)
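As a rough sketch of that conversion (written for current Julia 1.x rather
than the version discussed in this thread; the integer array is a made-up
stand-in for the Vector{Any} that pmap returned):

```julia
# Stand-in for untyped results: concrete values stored in a Vector{Any}.
results = Any[10, 20, 30]

# convert copies the elements into an array with a concrete element type.
typed = convert(Vector{Int}, results)

# A typed comprehension has the same effect.
also_typed = Int[r for r in results]

# Ask Julia which convert method handles this pair of argument types.
m = which(convert, (Type{Vector{Int}}, Vector{Any}))
println(typeof(results), " -> ", typeof(typed))
println(m)
```

The conversion fails at run time if any element cannot be converted to the
requested element type.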
> Q2. Is there any advantage to using convert? Is it more efficient? E.g.
> PrintCounters could be defined to accept argument with Vector{Counters}
>
In some cases it will be more efficient since function lookup can be static
instead of dynamic, although if the cost of the function call is small
relative to the cost of running the function it doesn't matter.
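A small illustration of what the element type changes, assuming Julia 1.x
syntax; SimResult and total_wins are hypothetical names standing in for
Counters and PrintCounters:

```julia
# Hypothetical stand-in for the Counters type from the question.
struct SimResult
    wins::Int
end

# This method only accepts a vector whose element type is exactly SimResult.
total_wins(v::Vector{SimResult}) = sum(r.wins for r in v)

untyped = Any[SimResult(2), SimResult(3)]
typed = convert(Vector{SimResult}, untyped)

# With a concrete element type the compiler knows what each element is;
# with Any it has to look up methods at run time.
println(isconcretetype(eltype(untyped)))  # false
println(isconcretetype(eltype(typed)))    # true

println(total_wins(typed))                # 5
# total_wins(untyped) would throw a MethodError: Vector{Any} is not a
# Vector{SimResult}, even when every element happens to be a SimResult.
```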
> Q3. trialCounts is Vector{Any} but sum() works? Presumably sum uses run
> time type of actual elements of vector?
>
sum is basically just doing something like:
    a = x[1]
    for i = 2:length(x)
        a += x[i]
    end
which doesn't actually need type information to work, although type
information can make it faster.
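To make that concrete, here is a runnable version of the loop, assuming
Julia 1.x syntax. Tally and naive_sum are hypothetical names; Tally is a
one-field stand-in for Counters, and naive_sum is only a sketch of the idea,
not the actual implementation of sum:

```julia
# Hypothetical one-field stand-in for Counters.
struct Tally
    n::Int
end

# Extend Base's + for the new type.
Base.:+(a::Tally, b::Tally) = Tally(a.n + b.n)

# The loop only needs + to exist for the run-time types of the elements.
function naive_sum(x)
    a = x[1]
    for i = 2:length(x)
        a += x[i]
    end
    return a
end

untyped = Any[Tally(1), Tally(2), Tally(3)]
typed = Tally[Tally(1), Tally(2), Tally(3)]

println(naive_sum(untyped))  # Tally(6)
println(naive_sum(typed))    # Tally(6)
```

Both calls give the same answer; the typed vector simply gives the compiler
more to work with.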
> Q4. Presumably sum() uses my definition of operator +. I also noted that
> += works. Where are these defined? What else do I get for "free"?
>
x += y is rewritten to x = x + y in the frontend, so it calls your + method;
it is not a separate function.
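A consequence of that rewriting is that += rebinds the variable to a new
object instead of modifying the old one. A minimal sketch with plain arrays
(Julia 1.x assumed; the variable names are only illustrative):

```julia
a = [1, 2, 3]
b = a                 # b refers to the same array as a

a += [10, 10, 10]     # same as a = a + [10, 10, 10]: allocates a new array

println(a)            # [11, 12, 13]
println(b)            # [1, 2, 3], the original array is untouched
println(a === b)      # false, a now names a different object
```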
> It occurred to me that my implementation of + could be improved by
> defining a copy constructor.
>
>
>
> function Counters(c::Counters)
>     this = new()
>     this.freqBase = copy(c.freqBase)
>     this.freqFeature = copy(c.freqFeature)
>     this.freqWin = copy(c.freqWin)
>     this.freqPrize = copy(c.freqPrize)
>     this.freqCombination = copy(c.freqCombination)
>     return this
> end
>
>
>
> Then define + as:
>
>
>
> function +(c1::Counters, c2::Counters)
>     c = Counters(c1)
>     c.freqBase += c2.freqBase
>     ...
>     return c
> end
>
>
>
> This would seem to eliminate first initialising with zeros.
>
> However, there was no improvement in practice. Maybe allocation is
> insignificant compared to addition.
>
Copying still requires allocation. It only avoids initializing with zeros,
which is not usually very expensive.
> Then it occurred to me that summing by my definition creates a new object
> for addition.
> Perhaps a more efficient sum would be to define += as an updating
> function, so that a new object does not need to be created.
> I tried to define += but received an error.
> Instead I defined plusEquals() and this was almost 2x faster than sum(x)
> or s += x[i] loop or s = s + x[i] loop. (~50% gc time)
>
>
>
> Q5 Why can’t I extend +=?
>
See above; it is a special construct that is rewritten to x = x + y in the
frontend.
> Q6 Wouldn’t this be faster for summing, and so sum() could be defined in
> terms of += rather than + (which creates new object for each element)
>
Yes, this would make summing mutable objects faster. There has been a lot
of discussion on this at https://github.com/JuliaLang/julia/issues/249. We
would also like to make garbage collection smarter in general so less
work/magic is necessary here to get good performance.
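In the meantime, an explicit in-place accumulator along the lines of your
plusEquals() avoids the per-element allocation. A minimal sketch, assuming
Julia 1.x syntax (including the .+= broadcast assignment); MiniCounters,
add! and sum_inplace are hypothetical names, and the two fields stand in for
the full Counters type:

```julia
# Hypothetical two-field stand-in for Counters.
struct MiniCounters
    freqBase::Vector{Int}
    freqCombination::Matrix{Int}
end

# Add c into acc without allocating new arrays: .+= writes into the
# existing arrays held by acc.
function add!(acc::MiniCounters, c::MiniCounters)
    acc.freqBase .+= c.freqBase
    acc.freqCombination .+= c.freqCombination
    return acc
end

# Copy the first element once, then accumulate the rest in place.
function sum_inplace(cs)
    acc = deepcopy(first(cs))
    for i in 2:length(cs)
        add!(acc, cs[i])
    end
    return acc
end

trials = [MiniCounters([1, 2], [1 0; 0 1]),
          MiniCounters([3, 4], [1 1; 1 1])]
total = sum_inplace(trials)
println(total.freqBase)         # [4, 6]
println(total.freqCombination)  # [2 1; 1 2]
```

The deepcopy keeps the first trial's counters intact, so the per-trial
results can still be printed afterwards.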
> I noticed that pmap uses nworkers which is (nprocs - 1) unless nprocs==1.
>
>
>
> Q7 For case nprocs==2, wouldn’t it make sense to also use the local
> process as a worker, since the programmer’s intention was to use parallel
> processing (Otherwise for nprocs==2, there is no difference to using map)?
>
> if p != myid() || np == 1
>
> if p != myid() || np <= 2
>
Often the reason to start Julia with a single worker is to debug parallel
code, in which case it's useful to run parallel jobs with only a single
worker.
> Q8 Is there a way to programmatically determine the number of physical
> processors on current machine? Such a function would be useful to use with
> addprocs().
>
Not that I'm aware of.
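For logical rather than physical processors, current Julia 1.x exposes
Sys.CPU_THREADS. On machines with hyper-threading this can be larger than
the physical core count, so it only approximates what you are asking for.
A sketch, with the addprocs call left commented out and the variable names
chosen just for illustration:

```julia
using Distributed  # standard library in Julia 1.x; provides nprocs and addprocs

# Number of logical CPU threads, not physical cores.
target = Sys.CPU_THREADS

# How many more processes would be needed to reach that count.
missing_procs = max(target - nprocs(), 0)
println("logical threads: ", target, ", processes to add: ", missing_procs)

# addprocs(missing_procs)  # uncomment to actually start the workers
```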
Simon