Hi,

I have several general questions that came up in my first foray into Julia.

Julia seems such a delight to work with: things seem to work magically, and 
lots of details are either not required or implicitly assumed. 
Whilst this is great for programming, it does mean I'm a little unsure 
about some things, especially about types, efficiency and what I get for 
free.   
In any case, here are some questions:


I want to do some parallel simulations and combine/reduce the set of 
results into a single result. 

As a minimal starting point, I defined a composite type to hold sim 
results, a single no-argument constructor, and a + function for reducing.

 

    type Counters
        freqBase::Array{Int64,1}
        freqFeature::Array{Int64,1}
        freqWin::Array{Int64,1}
        freqPrize::Array{Int64,1}
        freqCombination::Array{Int64,2}

        # no-argument constructor
        function Counters()
            this = new()
            this.freqBase = zeros(Int64, 100000)
            this.freqFeature = zeros(Int64, 100000)
            this.freqWin = zeros(Int64, 100000)
            this.freqPrize = zeros(Int64, maxPrize)
            this.freqCombination = zeros(Int64, numSymbols, maxState)
            return this
        end
    end

 

    function +(c1::Counters, c2::Counters)
        c = Counters()
        c.freqBase        = c1.freqBase         + c2.freqBase
        c.freqFeature     = c1.freqFeature      + c2.freqFeature
        c.freqWin         = c1.freqWin          + c2.freqWin
        c.freqPrize       = c1.freqPrize        + c2.freqPrize
        c.freqCombination = c1.freqCombination  + c2.freqCombination
        return c
    end

 

To my surprise, this was sufficient to provide the functionality I needed to 
return sim results from pmap and reduce them to a single result:

 

    const numProcessors = 4

    if nprocs() < numProcessors
        addprocs(numProcessors - nprocs())
    end

    const numTrials = 100
    const numPlays = 1000000

    trialCounts = pmap(Simulation, fill(numPlays, numTrials))
    totalCounts = sum(trialCounts)

    PrintCountersSummary(trialCounts) # print summary for each trial
    PrintCounters(totalCounts)        # print total combined results

 

Surprisingly (to me) all this worked.

 

Q1. Why does pmap return Vector{Any} rather than Vector{Counters}, when the 
return type from Simulation() is my user-defined type Counters? 

 

I inserted an extra line:

    trialCounts = convert(Vector{Counters}, trialCounts)

 

I was surprised this even worked, because I didn’t define convert.

Q2. Is there any advantage to using convert? Is it more efficient? E.g. 
PrintCounters could then be defined to accept an argument of type 
Vector{Counters}.

 

Q3. trialCounts is a Vector{Any}, yet sum() works. Presumably sum uses the 
run-time type of the actual elements of the vector?

 

Q4. Presumably sum() uses my definition of operator +. I also noted that += 
works. Where are these defined? What else do I get for "free"?

 

It occurred to me that my implementation of + could be improved by defining 
a copy constructor.

 

    function Counters(c::Counters)
        this = new()
        this.freqBase = copy(c.freqBase)
        this.freqFeature = copy(c.freqFeature)
        this.freqWin = copy(c.freqWin)
        this.freqPrize = copy(c.freqPrize)
        this.freqCombination = copy(c.freqCombination)
        return this
    end

 

Then define + as:

 

    function +(c1::Counters, c2::Counters)
        c = Counters(c1)             # start from a copy of c1
        c.freqBase += c2.freqBase    # update the copy, leaving c1 untouched
        ...
        return c
    end

 

This would seem to eliminate first initialising with zeros.

However, there was no improvement in practice. Maybe allocation is 
insignificant compared to addition.

 
Then it occurred to me that summing with my definition of + creates a new 
object for every addition. 
Perhaps a more efficient sum would define += as an updating function, so 
that no new object needs to be created.
I tried to define += but received an error. 
Instead I defined plusEquals(), and this was almost 2x faster than sum(x), 
an s += x[i] loop, or an s = s + x[i] loop (~50% gc time).
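
A minimal sketch of what such an updating function might look like (not 
necessarily the exact code timed above; addInPlace is a hypothetical helper 
name, and the arguments of plusEquals are assumed to be Counters objects):

```julia
# Hypothetical helper: add src into dest element by element, in place,
# so that no new array is allocated.
function addInPlace(dest::Array{Int64}, src::Array{Int64})
    length(dest) == length(src) || error("size mismatch")
    for i in 1:length(dest)
        dest[i] += src[i]
    end
    return dest
end

# Updating "sum": accumulate c into total without creating a new object.
# Both arguments are assumed to be Counters (left untyped in this sketch).
function plusEquals(total, c)
    addInPlace(total.freqBase, c.freqBase)
    addInPlace(total.freqFeature, c.freqFeature)
    addInPlace(total.freqWin, c.freqWin)
    addInPlace(total.freqPrize, c.freqPrize)
    addInPlace(total.freqCombination, c.freqCombination)
    return total
end

# Demonstration on bare arrays standing in for a single field.
total = zeros(Int64, 5)
trial = [1, 2, 3, 4, 5]
addInPlace(total, trial)
addInPlace(total, trial)
```

The accumulator is mutated, so it has to start as a fresh Counters() 
rather than as one of the trial results.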

 

Q5. Why can’t I extend +=?

 

Q6. Wouldn’t an updating += be faster for summing, so that sum() could be 
defined in terms of += rather than + (which creates a new object for each 
element)?

 

I noticed that pmap uses nworkers(), which is nprocs() - 1 unless 
nprocs() == 1.

 

Q7. For the case nprocs() == 2, wouldn’t it make sense to also use the local 
process as a worker, since the programmer’s intention was to use parallel 
processing? (Otherwise, for nprocs() == 2, there is no difference from using 
map.) That is, change the first condition below to the second:

      if p != myid() || np == 1

      if p != myid() || np <= 2

 

Q8. Is there a way to programmatically determine the number of physical 
processors on the current machine? Such a function would be useful with 
addprocs().

 
Thanks
Greg 
