One more note:

> The strided code allocates and frees buffers for pack/unpack which can
> trigger the bug.
>
> One additional work around is to disable the problematic pack/unpack
> implementation by setting env var GASNET_VIS_AMPIPE=0

-Brad


On Wed, 29 Jan 2014, Brad Chamberlain wrote:

>
> Hi Rafael and Akihiro --
>
> I should've been a bit more patient in my response.  After passing along the
> "strided" bit, the response came back:
>
>> The implementation of strided operations is KNOWN to trigger the bug I
>> referenced.  So, have a look at the work-arounds in the bug report.
>
> which is here:
>
>     https://upc-bugs.lbl.gov/bugzilla/show_bug.cgi?id=495
>
> -Brad
>
>
> On Wed, 29 Jan 2014, Brad Chamberlain wrote:
>
>>
>> Hi Rafael and Akihiro --
>>
>> It sounds as though there isn't a known issue w.r.t. large ibv conduit
>> messages (I didn't catch the importance of 'strided', so sent that along in
>> a second message, but don't expect it'll change the response).  They
>> wrote:
>>
>>> We don't have a known error with long messages, but do have one with
>>> respect to free() which might be the problem.  If we (ibv-conduit via
>>> firehose) cache a dynamic memory registration (especially problematic w/
>>> SEGMENT_EVERYTHING), then it is possible that memory is free()d and a later
>>> malloc() gets the same virtual address.  If that happens then ibv may end
>>> up performing RDMA from the physical pages corresponding to the PREVIOUS
>>> association for the virtual address (NOTE: the pages are ref-counted and
>>> thus NOT truly free and NOT mapped into some other process).  See
>>> https://upc-bugs.lbl.gov/bugzilla/show_bug.cgi?id=495 for some details on
>>> that bug.  The work-arounds are to disable mmap-based malloc(), or disable
>>> firehose.
>>
>> To me, this doesn't sound like the same thing, but I'm far enough away from
>> the problem that you may recognize something that I'm not.
>>
>> Assuming it doesn't, how hard would it be to put together a small C+GASNet
>> test that exhibits this issue?  Would it be as simple as sending a large
>> buffer in strided mode?  In a loop?
>>
>> As long as I was bothering them, I also asked whether there was a way to
>> sanity check that an executable was built with debugging on (since there
>> are so many ways that we could get this wrong) and got the response:
>>
>>> As for checking for debugging support, the preprocessor token
>>> GASNET_CONFIG_STRING will tell you a lot about the configuration you've
>>> compiled with.  We do some name-shifting to ensure you can't link with a
>>> library of a different configuration than you've compiled with.
>>>
>>> Alternatively if you have the "ident" utility for finding RCS strings,
>>> applying it to the executable file will extract lots of configuration
>>> bits.  The value of GASNET_CONFIG_STRING will follow "$GASNetConfig:"  You
>>> can fake that with:
>>>   $ perl -n -ln044 -e 'print if /GASNetConfig:/' -- a.out
>>
>> Thanks,
>> -Brad
>>
>>
>
_______________________________________________
Chapel-developers mailing list
https://lists.sourceforge.net/lists/listinfo/chapel-developers
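
For anyone without perl or "ident" handy, the configuration check quoted above can be approximated in Python. This is only a sketch of the same idea as the perl one-liner: split the executable's bytes on '$' and keep the records that contain "GASNetConfig:". The function names are made up for illustration, and the layout of the string after the marker is not assumed here; read whatever comes back by eye.

```python
MARKER = b"GASNetConfig:"


def find_config_strings(data: bytes, marker: bytes = MARKER) -> list:
    """Split data on '$' (as perl's -044 switch does) and return the
    records that contain the marker, decoded and whitespace-stripped."""
    return [
        record.decode("latin-1").strip()
        for record in data.split(b"$")
        if marker in record
    ]


def scan_executable(path: str) -> list:
    """Read an executable as raw bytes and return its config records."""
    with open(path, "rb") as f:
        return find_config_strings(f.read())


# Typical use (the path is whatever binary you built):
#   for line in scan_executable("a.out"):
#       print(line)
```

If nothing is printed, the binary either was not linked against GASNet or had the identification strings stripped.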

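
The free()/malloc() scenario described in the quoted response can be pictured with a toy model. To be clear, this is not GASNet, firehose, or ibv code: ToyHeap and ToyRegistrationCache are hypothetical classes, and the model simply assumes that every allocation gets fresh physical backing while a freed virtual address is handed out again. It only shows why a cache keyed by virtual address can end up pointing at the PREVIOUS physical page.

```python
class ToyHeap:
    """Hypothetical allocator: reuses the most recently freed address,
    but backs every allocation with a new physical page id."""

    def __init__(self):
        self._next_addr = 0x1000
        self._next_page = 0
        self._free_list = []
        self.page_of = {}  # virtual address -> current physical page id

    def malloc(self):
        if self._free_list:
            addr = self._free_list.pop()
        else:
            addr = self._next_addr
            self._next_addr += 0x1000
        self.page_of[addr] = self._next_page
        self._next_page += 1
        return addr

    def free(self, addr):
        del self.page_of[addr]
        self._free_list.append(addr)


class ToyRegistrationCache:
    """Hypothetical registration cache that is never told about free()."""

    def __init__(self, heap, enabled=True):
        self.heap = heap
        self.enabled = enabled
        self._cache = {}  # virtual address -> physical page at first use

    def page_for_rdma(self, addr):
        if not self.enabled:
            return self.heap.page_of[addr]  # look up the mapping every time
        if addr not in self._cache:
            self._cache[addr] = self.heap.page_of[addr]
        return self._cache[addr]


def rdma_uses_stale_page(cache_enabled):
    """Return True if a transfer after free()+malloc() hits the old page."""
    heap = ToyHeap()
    cache = ToyRegistrationCache(heap, enabled=cache_enabled)
    first = heap.malloc()
    cache.page_for_rdma(first)   # first transfer registers the buffer
    heap.free(first)
    second = heap.malloc()       # same virtual address, new physical page
    assert first == second
    return cache.page_for_rdma(second) != heap.page_of[second]
```

In this model rdma_uses_stale_page(True) reports the stale lookup and rdma_uses_stale_page(False) does not, which loosely mirrors the "disable firehose" work-around; the real fix options are the ones listed in the bug report.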