Hi John --

The comparison that makes the most sense to me is between these two:

Standard block dist, with --fast --cache-remote --no-local :
 
(get = 0, get_nb = 4222, put = 0, put_nb = 0, test_nb = 0, wait_nb = 0,
 try_nb = 0, fork = 0, fork_fast = 0, fork_nb = 1602)
(get = 0, get_nb = 810, put = 0, put_nb = 0, test_nb = 0, wait_nb = 0,
 try_nb = 0, fork = 1604, fork_fast = 0, fork_nb = 0)


My custom dist with --no-local --fast

(get = 4232, get_nb = 0, put = 0, put_nb = 0, test_nb = 0, wait_nb = 0,
 try_nb = 0, fork = 0, fork_fast = 0, fork_nb = 1602)
(get = 9629, get_nb = 0, put = 800, put_nb = 0, test_nb = 0, wait_nb = 0,
 try_nb = 0, fork = 1604, fork_fast = 0, fork_nb = 0)

Though to make it apples-to-apples, I think you should run both without --cache-remote (or both with?). Specifically, I suspect that the conversion of 4222 non-blocking gets (get_nb) in the first case to 4232 blocking gets (get) in the second is due to this flag. Otherwise the first locale's comm statistics look pretty similar between the two, so there don't seem to be any real surprises there.

The second locale's comm results are pretty weird, though: 810 gets are becoming 9629 gets and 800 puts. At a glance, this suggests to me that something that ought to be on locale #1 is actually on locale #0. The fact that the number of puts in the second case (800) is nearly equal to the number of gets in the first (810) seems particularly suspicious. While it may be that --cache-remote is playing a role here, it may also simply be that something isn't stored where you'd expect (or isn't getting optimized as you'd expect). But running both versions in the same --cache-remote mode would remove that question mark.

I know the original authoring of the Block routine took a number of simple-but-ugly steps to localize data that the compiler wouldn't do automatically (most of which, one might expect it ultimately to do). I haven't reviewed those tricks in years to determine whether they are still necessary, but it could be that a seemingly innocuous software engineering refactoring would result in some meta-data being remote rather than local.

Unfortunately, there aren't any high-level tools to help determine this, so the tricks we usually use are to narrow the calipers on the communication count routines and study smaller and smaller sections of code; or to put things like "writeln(x.locale)" or "writeln(here)" or "assert (x.locale == here)" or "writeln(this.locale)" into the key routines (like dsiAccess or the 'these' iterators) to make sure that a given variable is stored where we'd expect, or that we're running where we'd expect, or that the two locations are the same.
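
As a rough illustration of both tricks, here is a minimal sketch. It assumes the CommDiagnostics module's start/stop/reset/get routines; the array A, the config const n, and the summing loop are hypothetical stand-ins for whatever piece of your distribution you are studying, not code from Block or from your custom dist.

```chapel
use CommDiagnostics;

// Hypothetical problem size, only for illustration.
config const n = 100;

// Declared at module level, so A is stored on locale 0.
var A: [1..n] int;

on Locales[numLocales-1] {
  // Locality check: where are we running, and where does A live?
  // With more than one locale these differ, which is the kind of
  // mismatch that "assert(A.locale == here)" would flag.
  writeln("running on locale ", here.id, "; A is stored on locale ", A.locale.id);

  // Narrow the calipers: count communication for just this loop.
  resetCommDiagnostics();
  startCommDiagnostics();

  var sum = 0;
  for i in 1..n do
    sum += A[i];   // remote reads when A is not local to 'here'

  stopCommDiagnostics();

  // Prints one record of counters per locale.
  writeln(getCommDiagnostics());
  writeln("sum = ", sum);
}
```

Moving the start/stop calls inward around smaller and smaller pieces of a routine, and comparing the per-locale counts each time, should show which statement is responsible for the unexpected gets and puts.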

-Brad
_______________________________________________
Chapel-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/chapel-users
