Re: Variable Block Distributions

John MacFrenz Wed, 25 Mar 2015 12:16:52 -0700

Hi,

I tried writing a simple test to compare Block with my distribution, but ran 
into several issues. One is that I can't seem to compile working binaries with 
the --static flag enabled (GASNet initialization fails, I think). Another is 
that execution stays on a single locale despite the NUMLOCALES file. There are 
also some compiler/runtime bugs that I can report on later.


However, right now I'm stuck on the following issue: how can I define an array 
or a homogeneous tuple of one-dimensional arrays of varying sizes? All I could 
think of was the following:

record ArrayWrapper {
    param rank;
    type idxType;
    type eltType;
    var dom: domain(rank, idxType);
    var data: [dom] eltType;
}

param count = 3;
var arrays: count*ArrayWrapper(1, int, int);

arrays(2).dom = {1..24};
arrays(2).data = 11;
arrays(2).data(6) = 9;
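
For comparison, here is a minimal sketch of one alternative, offered as an 
assumption rather than a confirmed fix: keep the wrappers in an array instead 
of a tuple and drop the generic fields, so each element owns a domain that can 
be resized on its own. IntArrayWrapper is a hypothetical, simplified (1-D, int 
elements) variant of the record above, used only for illustration.

```chapel
// Hypothetical simplified wrapper: one resizable 1-D domain per element.
record IntArrayWrapper {
  var dom: domain(1);     // starts out empty
  var data: [dom] int;    // resized automatically whenever dom is reassigned
}

// An array (rather than a tuple) of wrappers; each may have a different size.
var arrays: [1..3] IntArrayWrapper;

arrays[1].dom = {1..5};
arrays[2].dom = {1..24};

arrays[2].data = 11;      // fill all 24 elements
arrays[2].data[6] = 9;    // overwrite a single element

writeln(arrays[2].data);
```

Whether this avoids the error described next is not something the sketch 
shows; it only illustrates the varying-size layout.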


However, the ArrayWrapper tuple above causes the following error: "cannot 
access remote data in local block". What does that mean?
This happens on locale 1 (when running with two locales). Here's a backtrace:

Breakpoint 1, gdbShouldBreakHere () at gdb.c:27
27      void gdbShouldBreakHere(void) {printf("%s", "");}
(gdb) bt
#0  gdbShouldBreakHere () at gdb.c:27
#1  0x00000000007b979f in chpl_exit_common (status=1, all=0) at chplexit.c:38
#2  0x00000000007b980d in chpl_exit_any (status=1) at chplexit.c:57
#3  0x00000000007b34b6 in chpl_error (
    message=0x8795f0 "cannot access remote data in local block", lineno=24, 
    filename=0x87a890 
"/home/john/projects/chapel/chapel/modules/dists/VariBlockPolicies.chpl") at 
error.c:85
#4  0x0000000000740750 in chpl_check_local (
    error=0x8795f0 "cannot access remote data in local block", 
    file=0x87a890 
"/home/john/projects/chapel/chapel/modules/dists/VariBlockPolicies.chpl", 
ln=24, node=0)
    at 
/home/john/projects/chapel/chapel/runtime//include/chpl-comm-compiler-macros.h:165
#5  chpl__initCopy16 (x_chpl=0x7ffff5e5a900) at VariBlockPolicies.chpl:24
#6  0x00000000007548cd in StaticCutPolicyIndexer_chpl (
    cutCache_chpl=0x7ffff5e5ac60) at VariBlockPolicies.chpl:163
#7  0x0000000000754016 in makeIndexer_chpl (this_chpl6=0x7ffff5e5c420)
    at VariBlockPolicies.chpl:149
#8  0x00000000007216f8 in VariBlock_chpl4 (other_chpl=0x7ffff5e5cae0, 
    privateData_chpl=0x7ffff5e5caf0, policy_chpl=0x7ffff5e5c420)
    at VariBlockDist.chpl:1055
#9  0x00000000007259c1 in dsiPrivatize_chpl5 (this_chpl6=0x7ffff5e5cae0, 
    privatizeData_chpl=0x7ffff5e5caf0) at VariBlockDist.chpl:1070
#10 0x000000000050f8de in _newPrivatizedClassHelp7 (
    parentValue=0x7ffff5e5cae0, originalValue=0x7ffff5e5cad0, n=0, hereID=0, 
    privatizeData=0x7ffff5e5caf0) at ChapelArray.chpl:52
#11 0x000000000051a89b in on_fn26 (originalValue=0x7ffff5e5cad0, 
    privatizeData=0x7ffff5e5caf0, newValue=0x7ffff5e5cae0, n=0, hereID=0)
    at ChapelArray.chpl:62
#12 0x000000000051b894 in wrapon_fn26 (c=0xae31a8) at ChapelArray.chpl:61
#13 0x00000000007c1bf0 in chpl_ftable_call (arg=<optimized out>, 
    fid=<optimized out>) at ../../../include/chpl-gen-includes.h:40
#14 fork_wrapper (f=0xae3190) at comm-gasnet.c:176
#15 0x00000000007bf7b3 in movedTaskWrapper (a=0xae2fc0) at tasks-fifo.c:799
#16 0x00000000007c0154 in thread_begin (ptask_void=0x7ffff00009f0)
    at tasks-fifo.c:1166
#17 0x00000000007c1786 in pthread_func (arg=0x7ffff00009f0)
    at threads-pthreads.c:397
#18 0x00007ffff7bc4182 in start_thread (arg=0x7ffff5e5d700)
    at pthread_create.c:312
#19 0x00007ffff70d100d in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
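
As background on the message itself: Chapel's `local` statement asserts that 
the enclosed code performs no communication, and with runtime checks enabled 
an access to data stored on another locale inside such a block halts with this 
error. The sketch below is a minimal illustration under that assumption; the 
variable names are hypothetical and it is not taken from VariBlockPolicies.chpl.

```chapel
// Sketch only; meant to be run on two locales (e.g. with -nl 2).
var x = 42;        // stored on the locale where the program starts (locale 0)
var result: int;   // also stored on locale 0

on Locales[numLocales-1] {
  var y = x;       // remote read outside any 'local' block: allowed
  local {
    y += 1;        // y lives on the current locale: allowed
    // y += x;     // x is remote here: would trigger the error above
  }
  result = y;      // remote write outside the 'local' block: allowed
}
writeln(result);
```

In the backtrace, frame #4 (chpl_check_local with node=0) is consistent with 
this kind of check firing while a value that lives on locale 0 is being copied.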


25.03.2015, 19:08, "Vassily Litvinov" <[email protected]>:
> John,
>
> Thank you for a description of what you are doing.
>
> To your questions - we use a subset of our test tree for desktop
> performance testing by running "start_test -performance -numtrials 5" on
> the entire test tree. You can see the results here:
>
>     http://chapel.sourceforge.net/perf/
>
> For multilocale performance benchmarks listed below, it's the same
> except we use a different set of options. (studies/nbody/md.xc-keys
> should read ..../md.chpl)
>
> The way we track performance is via performance graphs (see the link
> above). We track comm diagnostics by comparing against .good files for
> certain tests. We update .good files when we observe an improvement or
> when we agree to accept an increase in the counts. We have talked about
> plotting the counts like performance graphs, and we currently do not do so.
>
> Note that among our multilocale benchmarks only RA and STREAM are tuned
> for performance. Most others currently do not offer a good baseline
> performance to compare against.
>
> Vassily
>
> On 03/20/15 12:29, John MacFrenz wrote:
>>  Hi,
>>>>    I was wondering if someone could help me with writing and running
>>>  some tests for my domain map, both correctness and performance tests?
>>>
>>>  We have a framework that exercises common cases and corner cases of
>>>  using a domain map. See these readmes in the Chapel repository on github:
>>>
>>>      test/distributions/robust/README
>>>      test/distributions/robust/arithmetic/README
>>>
>>>  The second README says how to extend the framework to use a new
>>>  distribution - it is a straightforward modification of:
>>>
>>>      test/distributions/robust/arithmetic/driver.chpl
>>  Thanks, I had totally missed those README's.
>>>>    The code is on github and I'm willing to contribute it if it gets
>>>  developed enough to be included in Chapel.
>>>
>>>  We welcome your contribution! Please see the first paragraph on our
>>>  developer resources page, which summarizes what's required:
>>>
>>>      http://chapel.cray.com/developers.html
>>  Okay, I had a look at those links. I guess it's still way too early to make 
>> a pull request, but I'd appreciate it if someone could have a look at the 
>> code at some point in the near future.
>>>>    I guess that many correctness tests for BlockDist could be easily
>>>  modified for this dist. I'd also like to do performance testing versus
>>>  BlockDist.
>>>
>>>  Sure, any performance test that "uses" BlockDist is great for it. This
>>>  file talks about the support for performance testing that we have:
>>>
>>>      doc/developer/bestPractices/TestSystem.txt
>>>
>>>  The benchmarks whose performance we measure on multiple locales using
>>>  the -perflabel feature of start_test are:
>>>
>>>      npb/ep/mcahir/ep.chpl release/examples/benchmarks/hpcc/fft.chpl
>>>  release/examples/benchmarks/hpcc/hpl.chpl
>>>  release/examples/benchmarks/hpcc/ptrans.chpl
>>>  release/examples/benchmarks/hpcc/ra-atomics.chpl
>>>  release/examples/benchmarks/hpcc/stream-ep.chpl
>>>  release/examples/benchmarks/hpcc/stream.chpl
>>>  release/examples/benchmarks/hpcc/ra.chpl
>>>  release/examples/benchmarks/ssca2/SSCA2_main.chpl
>>>  release/examples/benchmarks/miniMD/miniMD.chpl
>>>  studies/hpcc/HPL/vass/hpl.hpcc2012.chpl
>>>  studies/lulesh/bradc/lulesh-dense.chpl studies/nbody/md.xc-keys
>>  Okay, so there is not a similar framework (as in 
>> test/distributions/robust/arithmetic you pointed out) for performance tests? 
>> Also, how do you usually do comparisons and track the history of comm 
>> diagnostics?
>>>>    Also, some questions...
>>>>
>>>>      - Can I have the assignment of my domain map to be done by
>>>  reference? Or would it be best to just generate compiler error in
>>>  dsiAssign, since I can't figure how assign-by-value should be done...
>>>>      - What is the role of dsiClone? Could I just return "this"?
>>>  Think of dsiClone as a copy constructor for your Block class instances,
>>>  and dsiAssign as an assignment operator. Let me know if you need me to
>>>  say more here.
>>>
>>>  Implementing these two pieces of functionality is important for a
>>>  finished product of a domain map. Afaik however, none of our benchmarks
>>>  use them today. So it is fair to leave them unimplemented for now, and
>>>  focus on the other functionality and performance. Feel free to make them
>>>  compiler errors, for example.
>>  A bit more elaborate description of my problem:
>>
>>  My distribution (for now I call it VariBlock) takes a generic policy object 
>> as a constructor argument. That policy object is essentially responsible 
>> for telling the VariBlock distribution how the data is to be partitioned 
>> among the locales. Importantly, I would also like to try implementing a 
>> mechanism to change how the data is distributed during the execution of the 
>> program. In that process the instance of the policy object wouldn't change 
>> (it is/should be a const). Currently I think that rebalancing should be 
>> initiated by the policy object (and possibly by the user, by calling some 
>> method of the policy object).
>>
>>  The above implies that the policy object is very closely coupled to a 
>> domain map. If inheritance of generic classes were supported in Chapel, I 
>> probably would have dismissed the concept of policy objects in favour of 
>> creating new policies by extending some base class. So, in cloning or in 
>> assignment, the difficult question is what to do with the policy.
>>
>>  One solution would be to create a copy of the policy object in assignment 
>> and cloning. However, when it comes to dynamic balancing of the 
>> distribution, the user would need some way to acquire a reference 
>> to the policy object or the domain map (not the dmap wrapped).
>>
>>  Another solution would be that in assignment and cloning both domain maps 
>> would refer to the same policy object. However, since the behaviour of 
>> a domain map is now controlled by the policy object, in practice those two 
>> resulting domain maps would be/become identical.
>>
>>  As a side note, the reason I decided to use policy objects is that they 
>> make it easier to author distributions in which each locale gets a 
>> single rectangular, dense (or strided) chunk of data. For example, a policy 
>> resulting in the same behaviour as BlockDist is 160 lines of code (and I use 
>> whitespace gratuitously...), whereas a distribution variable along one axis 
>> is 250 lines, and they are simpler to write than new domain maps.
>>>>      - Is there a requirement for targetLocDom to have the same rank as the 
>>>> distribution (see BlockDist)? (I _recall_ Brad saying that at least not 
>>>> intentionally, but I'm not sure...) I tried changing that, but couldn't get 
>>>> it to compile because of some RAD cache functions.
>>>  targetLocDom is an implementation detail that is internal to the Block
>>>  distribution. So I believe there are no requirements about it as far as
>>>  the domain map framework is concerned.
>>>
>>>  Speaking about the internals of the Block distribution the way it is
>>>  designed, I can see, for example, targetLocDom being 1-d while the
>>>  corresponding domains/arrays being 2-d or more. This would mean that the
>>>  domains/arrays are distributed only along a single dimension. Our Block
>>>  distribution supports this scenario when targetLocDom's extent is 1
>>>  along all but one dimension; we did not write code specific to this case.
>>>
>>>  Is this what your question is about? If so, yes you can specialize for
>>>  this case, and there may be a way to have the RAD cache compile. I don't
>>>  know off hand what's needed there. It may be easier to support the
>>>  general case instead.
>>  Yes, this is what I'm talking about, since I derived my new distribution 
>> from BlockDist. What I have done so far is what you described: set all 
>> but one dimension to have a length of 1. However, since I would like 
>> authoring new policies to be as easy as possible, I'd like my distribution 
>> not to care about the rank of targetLocDom and to let the author of a new 
>> policy decide that.

_______________________________________________
Chapel-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/chapel-users
