Hi,
I have several general questions that came up in my first foray into Julia.
Julia seems such a delight to work with: things seem to work magically, and
lots of details are either not required or implicitly assumed.
Whilst this is great for programming, it does mean I'm a little unsure
about some things, especially about types, efficiency and what I get for
free.
In any case, here are some questions:
I want to do some parallel simulations and combine/reduce the set of
results into a single result.
As a minimal starting point, I defined a composite type to hold sim
results, a single no-argument constructor, and a + function for reducing.
type Counters
    freqBase::Array{Int64,1}
    freqFeature::Array{Int64,1}
    freqWin::Array{Int64,1}
    freqPrize::Array{Int64,1}
    freqCombination::Array{Int64,2}
    # no-argument constructor
    function Counters()
        this = new()
        this.freqBase = zeros(Int64, 100000)
        this.freqFeature = zeros(Int64, 100000)
        this.freqWin = zeros(Int64, 100000)
        this.freqPrize = zeros(Int64, maxPrize)
        this.freqCombination = zeros(Int64, numSymbols, maxState)
        return this
    end
end
function +(c1::Counters, c2::Counters)
    c = Counters()
    c.freqBase = c1.freqBase + c2.freqBase
    c.freqFeature = c1.freqFeature + c2.freqFeature
    c.freqWin = c1.freqWin + c2.freqWin
    c.freqPrize = c1.freqPrize + c2.freqPrize
    c.freqCombination = c1.freqCombination + c2.freqCombination
    return c
end
To my surprise, this was sufficient to provide the functionality I needed to
return sim results from pmap and reduce them to a single result:
const numProcessors = 4
if nprocs() < numProcessors
    addprocs(numProcessors - nprocs())
end
const numTrials = 100
const numPlays = 1000000
trialCounts = pmap(Simulation, fill(numPlays, numTrials))
totalCounts = sum(trialCounts)
PrintCountersSummary(trialCounts) # print summary for each trial
PrintCounters(totalCounts) # print total combined results
Surprisingly (to me) all this worked.
Q1. Why does pmap return Vector{Any} rather than Vector{Counters}, when the
return type from Simulation() is my user-defined type Counters?
I inserted an extra line:
trialCounts = convert(Vector{Counters}, trialCounts)
I was surprised this even worked, because I didn’t define convert.
Q2. Is there any advantage to using convert? Is it more efficient? For
example, PrintCounters could then be defined to accept an argument of type
Vector{Counters}.
Q3. trialCounts is a Vector{Any}, yet sum() works. Presumably sum uses the
run-time type of the actual elements of the vector?
Q4. Presumably sum() uses my definition of operator +. I also noted that +=
works. Where are these defined? What else do I get for "free"?
It occurred to me that my implementation of + could be improved by defining
a copy constructor.
# copy constructor (placed inside the type block, since new() is only
# available to inner constructors)
function Counters(c::Counters)
    this = new()
    this.freqBase = copy(c.freqBase)
    this.freqFeature = copy(c.freqFeature)
    this.freqWin = copy(c.freqWin)
    this.freqPrize = copy(c.freqPrize)
    this.freqCombination = copy(c.freqCombination)
    return this
end
Then define + as:
function +(c1::Counters, c2::Counters)
    c = Counters(c1)            # start from a copy of c1
    c.freqBase += c2.freqBase   # update the copy, not c1 itself
    ...
    return c
end
This would seem to eliminate first initialising with zeros.
However, there was no improvement in practice. Maybe allocation is
insignificant compared to addition.
Then it occurred to me that summing with my definition of + creates a new
object for each addition.
Perhaps a more efficient sum would be to define += as an updating function,
so that a new object does not need to be created.
I tried to define += but received an error.
Instead I defined plusEquals(), and this was almost 2x faster than sum(x),
an s += x[i] loop, or an s = s + x[i] loop (~50% gc time).
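The in-place idea can be sketched on plain arrays as follows. This is only
an illustration under stated assumptions, not the exact plusEquals() code
timed above: the names plusEquals! and sumInPlace are hypothetical, and a
Counters version would presumably apply the same loop to each field.

```julia
# Hypothetical helper: add src into dest element by element.
# dest is modified; no new array is allocated.
function plusEquals!(dest::Array{Int64}, src::Array{Int64})
    size(dest) == size(src) || error("array sizes differ")
    for i in 1:length(dest)
        dest[i] += src[i]
    end
    return dest
end

# Hypothetical helper: sum a collection of equally sized arrays into one
# freshly allocated accumulator, so the inputs are left untouched.
function sumInPlace(xs)
    total = zeros(Int64, size(xs[1]))
    for x in xs
        plusEquals!(total, x)
    end
    return total
end

# A Vector{Any} of arrays, similar in spirit to what pmap returned above.
xs = Any[[1, 2, 3], [10, 20, 30], [100, 200, 300]]
total = sumInPlace(xs)   # total == [111, 222, 333]
```

Only one accumulator is allocated for the whole reduction, whereas a
sum built on + allocates a new result for every element.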
Q5. Why can’t I extend +=?
Q6. Wouldn’t this be faster for summing, so that sum() could be defined in
terms of += rather than + (which creates a new object for each element)?
I noticed that pmap uses nworkers, which is (nprocs - 1) unless nprocs==1.
Q7. For the case nprocs==2, wouldn’t it make sense to also use the local
process as a worker, since the programmer’s intention was to use parallel
processing? (Otherwise, for nprocs==2, there is no difference from using
map.) For example, the check in pmap could change from
    if p != myid() || np == 1
to
    if p != myid() || np <= 2
Q8. Is there a way to programmatically determine the number of physical
processors on current machine? Such a function would be useful to use with
addprocs().
Thanks
Greg