Re: [ofiwg] fault-tolerance

2015-09-08 Thread Hefty, Sean
> What's the state of fault-tolerance in OFI? Would it be prudent for > someone to write OFI code that aspired to survive process failures? Are > any implementations known to support this robustly right now? This would be provider specific. I'm not aware of anything that's coded to handle fail

[ofiwg] libfabric 1.1.1rc1 is now available

2015-09-10 Thread Hefty, Sean
I've created the 1.1.1rc1 package for libfabric. This contains bug fixes to the 1.1.0 release. The target release date for 1.1.1 is the end of the month. The package is available from: http://downloads.openfabrics.org/downloads/ofi/ A corresponding fabtests 1.1.1rc1 package should be avai

[ofiwg] agenda items for meeting on 9/22

2015-09-21 Thread Hefty, Sean
I am collecting agenda items for tomorrow's meeting. If you have items to add, please forward them to me and/or the list. Agenda items - 1.1.1 release update - fabtest source code refactoring update - SC '15 BoF topics brainstorming - SC '15 tutorial update Thanks, Sean __

[ofiwg] libfabric 1.1.1rc2

2015-09-25 Thread Hefty, Sean
I've pushed out libfabric 1.1.1rc2 to: http://downloads.openfabrics.org/downloads/ofi/ http://downloads.openfabrics.org/downloads/ofi/libfabric-1.1.1rc2.tar.gz http://downloads.openfabrics.org/downloads/ofi/libfabric-1.1.1rc2.tar.bz2 The official 1.1.1 release is targeted for next week. All prov

Re: [ofiwg] question about FI_CLAIM/FI_DISCARD and CQ event generation

2015-09-30 Thread Hefty, Sean
> In implementing the FI_DISCARD feature for the gni provider, > we are trying to decide whether a call to fi_trecvmsg with flags > set to FI_CLAIM | FI_DISCARD (following a previously issued > fi_trecvmsg with FI_PEEK | FI_CLAIM) should generate a > CQE. > > It appears that the sockets and ps

Re: [ofiwg] libfabric dpa provider

2015-10-02 Thread Hefty, Sean
> Does OFA allow for GPL only (or LGPL) license ? I’m no lawyer but the OFA > ByLaws indicate the following in order: > > 1. Unrestricted > > 2. Dual (GPL/BSD) > > 3. BSD only Libfabric is dual license. Fabtests has integrated a json parser that uses the MIT license, which I

[ofiwg] libfabric and fabtests release 1.1.1 are now available

2015-10-02 Thread Hefty, Sean
Libfabric 1.1.1 is a bug fix only release to 1.1.0. It is available for download from: http://downloads.openfabrics.org/downloads/ofi/ A list of updates is available here: https://github.com/ofiwg/libfabric/blob/v1.1.x/NEWS.md A matching fabtests 1.1.1 release was also created, mainly to prov

Re: [ofiwg] OFI WG, DS/DA - Seeking agenda topics

2015-10-05 Thread Hefty, Sean
For the ofiwg, I've invited Paolo to talk about his DPA provider over A3Cube. There are likely lessons that we can learn on how well libfabric adapted to HW that was not considered during its design. If there is time (doubtful), I would like to begin discussing whether and how libfabric should

Re: [ofiwg] DS/DA discussion

2015-10-06 Thread Hefty, Sean
> Of the possible devices, what do they need that OFI does not yet have? > Flags or operations to indicate that a memory should persisted (I think > Intel gave an example of a new instruction to move data into a > “persistence domain”)? Does it lack a “commit” or “sync” operation to make > the remo

[ofiwg] Travis CI testing of libfabric and fabtests

2015-10-06 Thread Hefty, Sean
There are several github projects available that provide continuous integration testing. Unless there is an objection, I plan on setting up Travis CI to test both libfabric and fabtests. The current intent is just to see what we can do with it. For example, verify the build, test that pull re

Re: [ofiwg] DS/DA discussion

2015-10-06 Thread Hefty, Sean
> The original interface assumes process-to-process communication. I am > simply wondering if that was too narrow for the storage aspect. Can the > current interface handle completely passive resources? There is no need to > “commit” memory in the process-to-process model, but the storage model > m

Re: [ofiwg] Travis CI testing of libfabric and fabtests

2015-10-08 Thread Hefty, Sean
> There are several github projects available that provide continuous > integration testing. Unless there is an objection, I plan on setting up > Travis CI to test both libfabric and fabtests. The current intent is just > to see what we can do with it. For example, verify the build, test that >

Re: [ofiwg] Travis CI testing of libfabric and fabtests

2015-10-09 Thread Hefty, Sean
> > I've also enabled branch protection to the upstream trees. Now that > Travis CI is working, I'd like to propose that we require status checks to > pass before pull requests can be merged. The only drawback that I see to > this is that there will be a small delay (~5-10 minutes) before simple

Re: [ofiwg] Travis CI testing of libfabric and fabtests

2015-10-09 Thread Hefty, Sean
> I've also enabled branch protection to the upstream trees. Now that > Travis CI is working, I'd like to propose that we require status checks to > pass before pull requests can be merged. The only drawback that I see to > this is that there will be a small delay (~5-10 minutes) before simple >

Re: [ofiwg] Travis CI testing of libfabric and fabtests

2015-10-12 Thread Hefty, Sean
> FWIW, we've left such details to the sites running the CI tests. But in this case, the site I'm referring to is github - not Cisco or Intel testing. The Travis CI testing is happening automatically for each pull request. I don't know what physical site is doing the actual testing, or even w

Re: [ofiwg] Travis CI testing of libfabric and fabtests

2015-10-12 Thread Hefty, Sean
> >> FWIW, we've left such details to the sites running the CI tests. > > > > But in this case, the site I'm referring to is github - not Cisco or > Intel testing. > > I'm not sure I understand the distinction...? I'm referring to a script that is *stored in the upstream github tree*. This is a

Re: [ofiwg] Agenda topics for tomorrow?

2015-10-20 Thread Hefty, Sean
> Libfabric – any agenda topics requiring discussion? I have the following items to bring up: - Travis testing with external libraries - Function tracing with parameters - Support for synchronous operations - Utility 'provider' update ___ ofiwg mailing

Re: [ofiwg] A question on FI_DELIVERY_COMPLETE

2015-10-26 Thread Hefty, Sean
> Here’s my understanding of how FI_DELIVERY_COMPLETE works on the > *responder* end: If you are doing an RMA operation, and the requester > uses CQ_REMOTE_DATA to signal the end of the transfer to the responder, > and the responder has FI_DELIVERY_COMPLETE set, then the responder won’t > get a co

Re: [ofiwg] A question on FI_DELIVERY_COMPLETE

2015-10-26 Thread Hefty, Sean
> FI_DELIVERY_COMPLETE is intended only to apply to the initiator of an > operation. > [PG] I suspected as much. > > The generation of a notification at the target is assumed to occur after > the operation has completed -- i.e. any transferred data is available. > This holds whether the completion

Re: [ofiwg] Native provider for OFI

2015-10-27 Thread Hefty, Sean
> Is there any plan for development of native provider for OFI ? Yes - as Jeff mentioned, usNIC is native. I'm aware of others, though I don't know their development schedules. The current layering of providers was the fastest way to provide an implementation that covered a broad range of har

[ofiwg] strided IOVs

2015-10-28 Thread Hefty, Sean
It's been a while, but we've had several discussions around the introduction of 'strided IOVs'. To recall, the goal of a strided IOV is to allow an application to reference every Nth element of an array using a compact data structure, versus the more traditional struct iovec. In order to define
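To make the goal concrete, here is a minimal sketch of what "every Nth element in a compact structure" could look like. The struct strided_iov type and the strided_gather() helper are hypothetical names invented for illustration; they are not libfabric definitions, and the interface discussed in this thread may differ.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor, not a libfabric type: 'count' elements of
 * 'elem_size' bytes each, where each element starts 'stride' bytes
 * after the start of the previous one. */
struct strided_iov {
    void   *base;
    size_t  elem_size;
    size_t  stride;
    size_t  count;
};

/* Gather the described elements into a contiguous buffer.
 * Returns the number of bytes copied; stops early if dst is too small. */
static size_t strided_gather(const struct strided_iov *siov,
                             void *dst, size_t dst_len)
{
    const char *src = siov->base;
    char *out = dst;
    size_t copied = 0;

    for (size_t i = 0; i < siov->count; i++) {
        if (copied + siov->elem_size > dst_len)
            break;
        memcpy(out + copied, src + i * siov->stride, siov->elem_size);
        copied += siov->elem_size;
    }
    return copied;
}
```

Describing the same selection with a plain struct iovec array would take one entry per element, so the descriptor size grows with the element count, while the strided form stays a fixed size.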

Re: [ofiwg] strided IOVs

2015-10-28 Thread Hefty, Sean
> Unfortunately, I don't think one can rule out a use case for strided iov's > for tagged messages. > MPI in particular supports the notion of derived data types, including > strided types, which > can be used for send/receive of messages. Hardware providers that have > optimizations > for these ty

Re: [ofiwg] strided IOVs

2015-10-28 Thread Hefty, Sean
> I need more context to understand how writing X bytes every N bytes into a > one-time use receive buffer is useful. FWIW, someone sent me this link: http://people.csail.mit.edu/fred/ghost_cell.pdf as an example. ___ ofiwg mailing list ofiwg@lists.ope

Re: [ofiwg] strided IOVs

2015-10-29 Thread Hefty, Sean
> Exactly. However, we should be careful about what we are adding here. A > strided interface is a very specific special case that applies to regular > problems. There is a breadth of MPI derived datatype capabilities that are > not covered by a simple stride. What are the objectives for coverin

[ofiwg] libfabric 1.2-rc1 schedule reminder

2015-10-29 Thread Hefty, Sean
This is just an FYI, mainly for the provider developers. As discussed in the ofiwg, we are planning on creating a 1.2rc1 release prior to SC 15. I will likely create this around Nov. 12th, with the 1.2 release hopefully coming mid-December. - Sean _

Re: [ofiwg] strided IOVs

2015-11-09 Thread Hefty, Sean
> Does any current hardware support strides natively? If not, maybe we have > a little time to work out that interface. Current providers could be made to support strided IOVs, though they use a combination of HW and SW. My target for this is the "2.0" release, which would include several othe

Re: [ofiwg] provider feature matrix

2015-12-03 Thread Hefty, Sean
> Can you add memory regions table, FI_MR_BASIC/SCALABLE support? I agree. Also, the mode section should probably be marked differently. As it is now, an 'X' in that section is actually goodness. Rather than reversing the markings, which may be confusing, maybe add new marks? R - required O

[ofiwg] libfabric 1.2.0rc2 is now available

2015-12-17 Thread Hefty, Sean
Libfabric release 1.2-rc2 is now available at: http://downloads.openfabrics.org/downloads/ofi ___ ofiwg mailing list ofiwg@lists.openfabrics.org http://lists.openfabrics.org/mailman/listinfo/ofiwg

[ofiwg] libfabric release 1.2.0

2016-01-07 Thread Hefty, Sean
Libfabric and fabtests release 1.2.0 are now available from the following location: http://downloads.openfabrics.org/downloads/ofi/ For a list of features and enhancements added since the previous 1.1.1 release see: https://github.com/ofiwg/libfabric/blob/master/NEWS.md - Sean

[ofiwg] ofiwg item: supporting other OS's

2016-01-08 Thread Hefty, Sean
As a discussion item for the next ofiwg, the topic has come up (again) about supporting other operating systems, specifically Windows and Solaris. At least two development teams have asked about support outside of Linux. The desire is to code only to libfabric, with it dealing with differences

[ofiwg] ofiwg item: completion optimizations

2016-01-08 Thread Hefty, Sean
This is a second set of items for the next ofiwg. These were mentioned briefly 1 or 2 meetings ago, but to recap: There is a provider driven need to relax support for completion flags (FI_SEND, FI_RECV, FI_READ, FI_WRITE, etc.). IIRC, I *think* this comes from a limitation on the transmit sid

Re: [ofiwg] ofiwg item: supporting other OS's

2016-01-08 Thread Hefty, Sean
> > As a discussion item for the next ofiwg, the topic has come up (again) > about supporting other operating systems, specifically Windows and > Solaris. > > I would imagine that supporting Solaris with the configury/build/sockets > providers wouldn't be too difficult...? > > I don't know much a

Re: [ofiwg] ofiwg item: supporting other OS's

2016-01-08 Thread Hefty, Sean
> Of course, Intel and others support a C99 compiler for Windows, but if > Windows support implies supporting the platform's default compiler IMO, supporting Windows would mean supplying binaries, not source code. So the problem of how to build it would be ours, not the user's. I might be able to u

Re: [ofiwg] ofiwg item: supporting other OS's

2016-01-11 Thread Hefty, Sean
> 1. We can find someone who actually knows how to develop on Windows who > will work on libfabric with us, including long-term maintenance. FWIW, I have done significant development on Windows. But I would expect most of this work to come from the people making the request. > 2. There's an extreme

Re: [ofiwg] fi_msg : Is it possible to receive partial data from the message?

2016-01-21 Thread Hefty, Sean
> Therefore I wanted to ask, if there is there any way to: > 1) Wait for CQ when the message arrives > 2) Peek the first few bytes (header) > 3) Then read the entire buffer Libfabric is message based, not stream based, and buffering is the responsibility of the app. The behavior that you're aski

Re: [ofiwg] fi_msg : Is it possible to receive partial data from the message?

2016-01-21 Thread Hefty, Sean
> My next approach will be to ask the user to provide me with a > 'maximum' size of the expected message, and allocate buffers of that > size during the reception... indeed it doesn't sound the best, but I > can avoid copying data around. > > If I understand correctly, in order to receive any kind

Re: [ofiwg] OFIWG agenda for 1/26/16

2016-01-26 Thread Hefty, Sean
> I don't think this has been decided yet. I expect that the DS/DA group > will want to meet as well. Do OFIWG folk see value in a f-2-f? I do for whoever may be there. F2F seems to provide the fastest way to resolve open issues. > The most likely candidate is Monday during the day. Two poss

[ofiwg] wait sets

2016-01-26 Thread Hefty, Sean
This is a continuation of the discussion from today's ofiwg and github issue 1645. An attempt to describe the desired application behavior is: 1. Wait for one or more events to occur 2. Get a list of queues that are ready for action 3. Process each queue until empty Assuming this be

Re: [ofiwg] wait sets

2016-01-27 Thread Hefty, Sean
> An attempt to describe the desired application behavior is: > > 1. Wait for one or more events to occur > 2. Get a list of queues that are ready for action > 3. Process each queue until empty I propose a couple of changes to the above sequence: 1. If it is okay to wait 1.1.
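The sequence above can be sketched with a small in-memory model. Everything here is a hypothetical stand-in (struct queue, ok_to_wait(), ready_list(), drain(), progress_once()); none of it is libfabric API. It only illustrates the ordering of the steps: check whether waiting is safe, wait, list the ready queues, then process each one until it is empty.

```c
#include <stddef.h>

#define MAX_QUEUES 16

/* Hypothetical stand-in for a completion queue or counter. */
struct queue {
    int pending;    /* events not yet processed */
    int processed;  /* events handled so far */
};

/* Step 1: it is only okay to wait if no queue already holds events. */
static int ok_to_wait(const struct queue *q, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (q[i].pending > 0)
            return 0;
    return 1;
}

/* Step 2: collect the indices of queues that are ready; returns how many. */
static size_t ready_list(const struct queue *q, size_t n, size_t *ready)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++)
        if (q[i].pending > 0)
            ready[count++] = i;
    return count;
}

/* Step 3: process one queue until it is empty. */
static void drain(struct queue *q)
{
    while (q->pending > 0) {
        q->pending--;
        q->processed++;
    }
}

/* One pass over the sequence. 'block' is a caller-supplied wait function
 * (step 1.1) and is only called when waiting is safe; it may be NULL.
 * Returns the number of queues that were serviced. */
static size_t progress_once(struct queue *q, size_t n, void (*block)(void))
{
    size_t ready[MAX_QUEUES];
    size_t count;

    if (n > MAX_QUEUES)
        n = MAX_QUEUES;
    if (block && ok_to_wait(q, n))
        block();
    count = ready_list(q, n, ready);
    for (size_t i = 0; i < count; i++)
        drain(&q[ready[i]]);
    return count;
}
```

The point of the check in step 1 is that a queue which already holds events is processed right away instead of the caller blocking first.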

Re: [ofiwg] wait sets

2016-01-28 Thread Hefty, Sean
> >1. If it is okay to wait > >1.1. Wait for one or more events to occur > >2. Get list of queues ready for action > >3. Process each queue > > > > We then define step 1. Ba-da-bing, ba-da-boom, and we're done. > > Yes, this seems to help some. To answer Ben's question from his

Re: [ofiwg] wait sets

2016-02-01 Thread Hefty, Sean
> >>>1. If it is okay to wait > >> >1.1. Wait for one or more events to occur > >> >2. Get list of queues ready for action > >> >3. Process each queue > >> > > >> > We then define step 1. Ba-da-bing, ba-da-boom, and we're done. > >> > >> Yes, this seems to help some. > > > >To an

Re: [ofiwg] wait sets

2016-02-04 Thread Hefty, Sean
> Will this still require processing all queues attached to a wait object, No - it should not > or is there going to be an interface call added to somehow retrieve > information about which queues signaled (if possible)? If an app is using a wait set, using a poll set to get this list makes sens

Re: [ofiwg] wait sets

2016-02-05 Thread Hefty, Sean
> It isn’t clear to me how a poll set would be used with a wait set. Could > you clarify or give an example? Any CQ or counter that uses a wait set would be added to a poll set. For each CQ that is needed Allocate and point to wait set W Add to poll set P

[ofiwg] FreeBSD support for libfabric and fabtests

2016-02-05 Thread Hefty, Sean
Libfabric and fabtests have been ported to FreeBSD. Both compile and run with the sockets and udp providers on FreeBSD 10.2-i386 on a VirtualBox VM. If anyone has access to a real FreeBSD system, I would appreciate help in testing the build/install/execution of the code and reporting back with

Re: [ofiwg] OFIWG - Call for Agenda Topics for tomorrow (2/9/16) (EOM)

2016-02-09 Thread Hefty, Sean
I would like to discuss the utility library/provider framework that is being added to libfabric.

[ofiwg] reminder of new libfabrics-users mail list

2016-02-09 Thread Hefty, Sean
Do you like libfabrics? Do you like to help people? Do you need more email? Then the libfabrics-users email list is for you! This is a reminder (at least I hope it's a reminder), that there is a user focused email list for libfabrics. You can sign up for the mailing list here: http://lists.

[ofiwg] input on intra-node implementation

2016-02-09 Thread Hefty, Sean
I want to provide an intra-node communication (i.e. loopback) utility to libfabric. The loopback utility could be part of a stand-alone provider, or incorporated into other providers. For this, I'm looking at selecting a single, easily maintained implementation. These are my choices so far:

Re: [ofiwg] [libfabric-users] sockets provider, number of outstanding reads

2016-02-10 Thread Hefty, Sean
Copying ofiwg > Tests are usually added to the fabtests repo - > https://github.com/ofiwg/fabtests.git. > > Although the reproducer is a little too complicated for fabtests (because > of the MPI dependency), it will be good to have a trimmed down version of > the test in fabtests. Maybe fabtests

Re: [ofiwg] input on intra-node implementation

2016-02-12 Thread Hefty, Sean
> XPMEM appears to be the most powerful option, but since CMA is standard > now, I think that's the better option to pursue. Thanks - this is exactly the information that I'm looking for. > I think respecting security concerns is essential to mainstream adoption. > And in HPC, security isn't just

Re: [ofiwg] Coverity coverage of new providers

2016-02-16 Thread Hefty, Sean
> It looks like we're not building the following providers for Coverity > testing coverage: > > - GNI > - PSM2 I believe this is doable. Copying Jianxin. > - MXM This is more difficult. I'm able to build against mxm on my system, but I had to manually extract out the library files from an in

Re: [ofiwg] prov/util/util_av.c: customization

2016-02-16 Thread Hefty, Sean
> What is the expected integration model -- can we add some hooks (i.e., > opns-like function pointers) into util_av.c?  Or is it expected that the > "real" provider will wrap the util_av functionality behind its own? I think we select whichever model makes the most sense and adapt accordingly.

Re: [ofiwg] OFI WG - Agenda topics for tomorrow?

2016-02-22 Thread Hefty, Sean
We can have a brief discussion on handling wait objects. See pull request 1765, issue 1645, and email thread: https://www.mail-archive.com/ofiwg@lists.openfabrics.org/msg00150.html I have slides for discussion regarding shared memory support. There has been some email discussion on this: ht

Re: [ofiwg] DS/DA Runtime Model Discussion

2016-02-23 Thread Hefty, Sean
> Do I understand correctly that, in a nutshell, the proposal is > that kfabrics becomes semantically richer than current > kernel verbs or any other kernel network interface, which > would allow to (efficiently, we hope) abstract from any underlying > fabrics semantics? Wouldn't that just bloat th

Re: [ofiwg] Access to scurbbed email attachments (digest email)?

2016-03-01 Thread Hefty, Sean
> When I tried to open this one I got a 403 error: > > > > http://lists.openfabrics.org/pipermail/ofiwg/attachments/20160301/dd81696f > /attachment.pptx > > > > Forbidden > > You don't have permission to access > /pipermail/ofiwg/attachments/20160301/dd81696f/attachment.pptx on this > server

[ofiwg] OFIWG F2F in Monterrey

2016-03-14 Thread Hefty, Sean
After looking at the open/known issues, there weren't any that I didn't think we could solve through either GitHub issues or our normal ofiwg meetings. Most require specific proposals beforehand to discuss and analyze. For a F2F meeting, this was the best idea that I could come up with for an

[ofiwg] fi_trywait and providers

2016-03-14 Thread Hefty, Sean
The 1.3 libfabric release will contain the fi_trywait call. For providers, there are a couple of options available for handling this. - The libfabric core can check that a provider supports the 1.3 API, and reject those that don't. This is the easy button. - The fi_trywait call can include a

Re: [ofiwg] OFIWG F2F in Monterrey

2016-03-14 Thread Hefty, Sean
Resending because I don't think this hit the list. > After looking at the open/known issues, there weren't any that I didn't > think we could solve through either GitHub issues or our normal ofiwg > meetings. Most require specific proposals beforehand to discuss and > analyze. For a F2F meeting,

Re: [ofiwg] OFIWG F2F in Monterrey

2016-03-14 Thread Hefty, Sean
Try #3... > Resending because I don't think this hit the list. > > > After looking at the open/known issues, there weren't any that I didn't > > think we could solve through either GitHub issues or our normal ofiwg > > meetings. Most require specific proposals beforehand to discuss and > > analy

Re: [ofiwg] fi_trywait and providers

2016-03-14 Thread Hefty, Sean
ht preference for options #1 or #2, but I don't feel strongly > about it. Other opinions? > > -Dave > > > On Mar 9, 2016, at 6:24 PM, Hefty, Sean wrote: > > > > The 1.3 libfabric release will contain the fi_trywait call. For > providers, there are a couple of

Re: [ofiwg] provider feature matrix

2016-03-15 Thread Hefty, Sean
> I think it would be worthwhile to keep the 1.2.1 data (e.g., make a new > wiki page with "1.2.1" in the filename somehow. I.e., we have the data, > so we might as well keep it -- especially in early days of libfabric, it > may well be useful (user asks, "I'm using libfabric vX.Y, but Z doesn't >

Re: [ofiwg] OFIWG F2F in Monterrey

2016-03-15 Thread Hefty, Sean
> Just to make sure,this is for the Monday of the week of the workshop in > Monterrey, right? Yes - this would be for Monday afternoon.

Re: [ofiwg] v1.3.0 release?

2016-03-28 Thread Hefty, Sean
> Should I make a 1.3.0rc1 tarball? That would be great if you could. I was on vacation last week, and I just finished going through a backlog of emails before starting on rc1. - Sean

Re: [ofiwg] libfabric + fabtests 1.3rc1 is available

2016-03-28 Thread Hefty, Sean
> - The libfabric.so number needs to be sanity checked (I updated it to > 3:0:2 Your sanity is verified.

Re: [ofiwg] fi_alias to fi_ep_alias?

2016-04-12 Thread Hefty, Sean
> int fi_alias(struct fid_ep *ep, fid_t *alias_ep, uint64_t flags); To be clear, this is the man page definition. The source code defines this as: int fi_alias(struct fid *fid, struct fid **alias_fid, uint64_t flags) This in turn translates into fi_control(). > Instead, should we change the AP

Re: [ofiwg] Next release: 1.3.1 or 1.4.0?

2016-04-13 Thread Hefty, Sean
> Is there any thought towards having a v1.3.1 release? Not that I'm aware of. There will likely be more verbs RDM support/fixes added to the next release. > I.e., are there any commits that were held off to get v1.3.0 out the door? I held out one commit from fabtests, which seemed like too bi

Re: [ofiwg] Fwd: Broken: ofiwg/libfabric#1450 (master - 65cf71c)

2016-04-15 Thread Hefty, Sean
We see these sort of failures from Travis occasionally. My best guess is that there are timing issues starting up the server and the client. The test scripts use a simple delay between them.

[ofiwg] OFIWG meeting agenda for 4/19

2016-04-18 Thread Hefty, Sean
This is a call-out for any agenda items for tomorrow's OFIWG. I have the following items to bring up for discussion, which may also be discussed via email. * Fabric/domain name resolution service. Request is to use friendly names to identify a fabric, such as "primary-ib-subnet" * Should

Re: [ofiwg] OFIWG meeting agenda for 4/19

2016-04-20 Thread Hefty, Sean
See below for notes from the meeting. > I have the following items to bring up for discussion, which may also be > discussed via email. > > * Fabric/domain name resolution service. >Request is to use friendly names to identify a fabric, such as >"primary-ib-subnet" The option to use a f

Re: [ofiwg] AF_UNIX,SOCK_DGRAM for shared memory signaling

2016-05-03 Thread Hefty, Sean
> I didn't bring it up on the call because I wanted to check on a couple of > semantics, but I think you could just use a single > "socket(AF_UNIX,SOCK_DGRAM,0)" socket per endpoint RX context. This should > scale -- it doesn't blow up as O(P^2) like RC QPs / SOCK_STREAM sockets / > named pipes wo
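A minimal sketch of the suggestion quoted above: one bound AF_UNIX datagram socket per receive context, with any number of peers signaling it through sendto(). The helper names rx_signal_open() and rx_signal_send(), the one-byte payload, and the use of a filesystem path are assumptions made for illustration; the thread does not specify them.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Create the single datagram socket for one (hypothetical) RX context,
 * bound to 'path'. Returns the fd, or -1 on error. */
static int rx_signal_open(const char *path)
{
    struct sockaddr_un addr;
    int fd;

    if (strlen(path) >= sizeof(addr.sun_path))
        return -1;
    fd = socket(AF_UNIX, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);
    unlink(path);   /* remove a stale socket file, if any */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* A peer signals the RX context by sending a one-byte datagram from its
 * own unbound socket 'tx_fd'. Returns 0 on success, -1 on error. */
static int rx_signal_send(int tx_fd, const char *path, unsigned char id)
{
    struct sockaddr_un addr;

    if (strlen(path) >= sizeof(addr.sun_path))
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);
    if (sendto(tx_fd, &id, 1, 0,
               (struct sockaddr *)&addr, sizeof(addr)) != 1)
        return -1;
    return 0;
}
```

Because the socket is connectionless, the receiver holds one descriptor regardless of how many peers signal it, which is the scaling property the quoted message is after.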

[ofiwg] detecting FABRIC_DIRECT mismatch

2016-05-13 Thread Hefty, Sean
We've run into a situation where libfabric was compiled with FABRIC_DIRECT, but the application was not. What I'm looking for are ideas for how to detect this situation, so that libfabric can report an appropriate error/log message. I believe a run time check is preferable. So far, the ideas

Re: [ofiwg] detecting FABRIC_DIRECT mismatch

2016-05-13 Thread Hefty, Sean
> I don't know the details of how direct works, but if you must build a > special hardware specific version of libfabric.so, then those symbols > must not overlap with the normal full function library symbols. Libfabric exports a minimal set of functions. Those are unchanged between the direct a

Re: [ofiwg] detecting FABRIC_DIRECT mismatch

2016-05-13 Thread Hefty, Sean
> That is still an ABI change, you can't have > > fi_foobar(struct fi_foo *) > and > fi_foobar(struct fi_foo_verbs *) > > With the same symbol name - that is an incorrect way to use dynamic > library symbols. To be clear, what actually gets replaced is the app invoking object->functio
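The distinction being drawn here, an exported symbol versus a call made through an object's function table, can be sketched as follows. All names (struct obj, struct obj_ops, provider_foobar(), call_foobar(), and the SKETCH_DIRECT macro) are hypothetical; this is not how libfabric's headers are actually written, only an illustration of one wrapper compiled two ways.

```c
struct obj;

/* Table of operations attached to an object (the "function_set"). */
struct obj_ops {
    int (*foobar)(struct obj *o, int x);
};

struct obj {
    const struct obj_ops *ops;
    int bias;
};

/* A provider's implementation of the operation. */
static int provider_foobar(struct obj *o, int x)
{
    return x + o->bias;
}

#ifdef SKETCH_DIRECT
/* "Direct" build: the wrapper binds to the provider function at compile
 * time and never looks at the table. */
static inline int call_foobar(struct obj *o, int x)
{
    return provider_foobar(o, x);
}
#else
/* Default build: the wrapper dispatches through the object's table. */
static inline int call_foobar(struct obj *o, int x)
{
    return o->ops->foobar(o, x);
}
#endif
```

In both builds the application source calls call_foobar(), and no library symbol named foobar is involved. The choice is fixed when the application is compiled, so an application and a library built with different settings can disagree about how the call is made.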

Re: [ofiwg] detecting FABRIC_DIRECT mismatch

2016-05-13 Thread Hefty, Sean
> > object->function_set->foobar(struct fi_foo *) > > Okay, but in this case, if I follow correctly, the function that > returns object->function_set is actually what has changed, ie it no > longer returns the foobar pointer, so it should be name mangled and > not present at all. I'm not sure

Re: [ofiwg] [Ofvwg] [ANNOUNCE] Open Fabrics Verbs Working Group (OFVWG) meeting tomorrow - 5/17/2016

2016-05-17 Thread Hefty, Sean
Copying ofiwg for input on how well webex has been working for others. > > The ofiwg uses webex, which has worked fairly well, provided I don't try > calling into the meeting using my lync phone. :) > > I've had inconsistent results with webex on Linux.. Is it still that > awful Java implementat

Re: [ofiwg] [libfabric-users] same process, multiple endpoints

2016-05-23 Thread Hefty, Sean
Re-posting to ofiwg mail list, as this is more of a developer question. > Hi all, > > I am from the alpha group [http://alpha.di.unito.it/] (CS dept. @ Univ. of > Torino, Italy). > We just started to integrate libfabric into FastFlow [link], on the track > of the great job by Paolo Inaudi with th

Re: [ofiwg] [libfabric-users] same process, multiple endpoints

2016-05-24 Thread Hefty, Sean
> Here I don't know how to obtain the two endpoints I need: I need to obtain > different fi_info > structures (by two different calls to fi_getinfo with same IP address and > different TCP ports) but I > want them to share the same domain. The domain itself comes from a > preliminary fi_getinfo

Re: [ofiwg] [libfabric-users] same process, multiple endpoints

2016-05-25 Thread Hefty, Sean
> I actually found a working solution. Not sure is the most canonical way, > but it works: > 1) call fi_getinfo (with some hints to select a provider) without any > address/flag > 2) open a domain > 3) call 2x fi_getinfo (with the same set of hints to get the same > provider as point 1) with specif

Re: [ofiwg] DS/DA Agenda for Tuesday, 6/21/16

2016-06-22 Thread Hefty, Sean
I think it would be better to discuss this proposal during the ofiwg-"classic" meetings, rather than ds/da. I don’t think we want to split libfabric API discussions into different meetings, as that just requires everyone to attend both. - Sean > -Original Message- > From: ofiwg [mail

Re: [ofiwg] EP_RDM question

2016-06-30 Thread Hefty, Sean
> EP_RDM is described thusly in fi_endpoint(3): > > - > FI_EP_RDM > Reliable datagram message. Provides a reliable, unconnected data > transfer service with flow control that maintains message boundaries. > - > > Consider this scenario: > > 1. sender sends message A on EP_RDM endpoint at

Re: [ofiwg] EP_RDM question

2016-06-30 Thread Hefty, Sean
> >> 2a. If receive buffers are eventually posted at the target, message > A > >> will be delivered successfully. > >> 2b. If the target endpoint is closed before receive buffers are > >> available at the target for message A, an error is triggered at the > >> sender indicating that message A was n

Re: [ofiwg] OFI WG - opinions sought on prioritizing agenda items for next meeting (9/20)

2016-09-12 Thread Hefty, Sean
> Our choices are: > > 1. Defer the code walk through of the Collective Offload proposal, > or > > 2. Defer completion of the discussion of the Enhancements to RDMA > for Non-volatile memory. > > Please express your opinions, soon. I prefer to extend my sabbatical until the end of the

Re: [ofiwg] Two-stage completion

2016-09-14 Thread Hefty, Sean
> In my application, I could benefit greatly if I could force generation > of two completion events per fi_tsendmsg – one generated on > FI_TRANSMIT_COMPLETE and the other on FI_DELIVERY_COMPLETE, so I can > block until the buffer is safe to mangle (post-transmit), and also > throw an event in my a

Re: [ofiwg] [libfabric-users] Two-stage completion

2016-09-14 Thread Hefty, Sean
> What if your provider copies to local buffer, then transmits from > there? Local completion happens at the rate of memcpy, which is usually > faster than RDMA. FI_TRANSMIT_COMPLETE is not a local completion. "A completion guarantees that the operation is no longer dependent on the fabric or lo

Re: [ofiwg] [libfabric-users] Two-stage completion

2016-09-15 Thread Hefty, Sean
> ...oh? I thought that FI_TRANSMIT_COMPLETE was the local completion and > FI_DELIVERY_COMPLETE was the remote completion. What does this mean, > then? > > FI_TRANSMIT_COMPLETE > Applies to fi_sendmsg. Indicates that a completion should not be > generated until the operation has been succes

Re: [ofiwg] [libfabric-users] Two-stage completion

2016-09-15 Thread Hefty, Sean
> You looking at the flags discussion for the send/receive operations. You -> you're > This is calling out that those flags only apply to the fi_sendmsg call. > Other send operations do no take flags, and they do not apply to the No -> not Apparently I need more sleep. _

Re: [ofiwg] [libfabric-users] Two-stage completion

2016-09-15 Thread Hefty, Sean
> If I start using the inject call to block until the buffer is safe, how > do I get the kind of completion I'd need for my timeout, if there's no > flags argument in the fi_tinject call? To reuse the buffer immediately but still get a completion, you should call fi_tsendmsg with the FI_INJECT fl

Re: [ofiwg] [libfabric-users] Two-stage completion

2016-09-15 Thread Hefty, Sean
> And then what completion mode should I select? FI_DELIVERY_COMPLETE? I would let the provider select this. FI_DELIVERY_COMPLETE can have significant overhead based on the provider implementation.

Re: [ofiwg] [libfabric-users] Two-stage completion

2016-09-16 Thread Hefty, Sean
> I'm not sure that using fi_tsendmsg with the FI_INJECT flag would meet > Jonathan's requirements - if I'm reading this email chain correctly. > I'll reorder his requirements and add comments: > > > 4. The application doesn't perform extra copy operations on the > message unless it's complete

[ofiwg] ofiwg agenda for 9/20

2016-09-19 Thread Hefty, Sean
We need to defer the discussions around NVM and offloaded collectives. We will still hold a meeting tomorrow to discuss the following items: 1.4 release - schedule and status (there are 28 opened issues marked with the 1.4 milestone) PR 2195/2196, issue 1975 Proposals to handle l

Re: [ofiwg] Question about endpoints and passive endpoints

2016-09-19 Thread Hefty, Sean
> From what I can tell from fi_endpoint.h and fi_cm.h, fid_pep and fid_ep > share the method groups fi_ops_ep and fi_ops_cm. It appears that in > those methods, if fid_t is used, any endpoint can be used and if > fid_pep is used, a passive endpoint must be used. If fid_t is specified, then there

Re: [ofiwg] Question about endpoints and passive endpoints

2016-09-19 Thread Hefty, Sean
> struct fid_pep *pep; > (struct fid_ep *) pep; /* bad casting */ > > However, something like this is okay: > > ... &pep.fid; /* get at the base fid object */ Well, it's okay if you ignore the compiler warning about using '.' rather than '->'

Re: [ofiwg] Blue Gene /Q OFI Libfabric provider open source contribution to OFIWG git repo

2016-09-28 Thread Hefty, Sean
> Argonne has developed an OFI Libfabric implementation running on Blue > Gene /Q, with myself as the designated maintainer. It is currently in- > house proprietary software that we would like to open source and > contribute back to the OFIWG git repo. We intend to continue to > develop and suppo

[ofiwg] [ANNOUNCE] libfabric release v1.4.0rc1

2016-10-03 Thread Hefty, Sean
Libfabric release candidate v1.4.0rc1 is now available. https://github.com/ofiwg/libfabric/releases This is a pre-release for 1.4 testing. Thanks, - Sean

Re: [ofiwg] Pushing bug fixes for 1.1.x and 1.2.x

2016-10-13 Thread Hefty, Sean
This sounds good to me. We can continue to create 'stable' branches as needed, or even support a specific release longer term if there's a demand for it. - Sean

[ofiwg] release candidate 1.4.0rc2 is available

2016-10-18 Thread Hefty, Sean
Download files are available at: https://github.com/ofiwg/libfabric/releases - Sean

[ofiwg] ofiwg face to face meetings

2016-10-19 Thread Hefty, Sean
At the last ofiwg, the possibility of 2 face to face meetings was discussed. The first proposal was to hold a developer focused hack-a-thon. The scheduling for such an event has not been defined; this is still exploratory. My goals for a hack-a-thon would be to address what I would refer to as

[ofiwg] fabtests 1.4.0rc1 now available

2016-10-25 Thread Hefty, Sean
In preparation for the pending libfabric 1.4.0 release, I've published a release candidate for fabtests. It is available from: https://github.com/ofiwg/fabtests/releases I anticipate the libfabric 1.4.0 release by the end of the week. Thanks, - Sean

Re: [ofiwg] sockets fabtests failure in testing for 1.4.0 release

2016-10-28 Thread Hefty, Sean
> I'm seeing a problem in a specific test: > > - Build and install libfabric 1.3.0 > - Build and install fabtests 1.3.0 against the installed libfabric > 1.3.0 > - Remove libfabric 1.3.0 > - Build and install libfabric 1.4.0 in its place > - Run fabtests > > Running the simple loopback fabtests/s

Re: [ofiwg] sockets fabtests failure in testing for 1.4.0 release

2016-10-28 Thread Hefty, Sean
bugs in fabtests. If that's the case, then the failures are okay. - Sean > -Original Message- > From: Jeff Squyres (jsquyres) [mailto:jsquy...@cisco.com] > Sent: Friday, October 28, 2016 12:22 PM > To: Hefty, Sean > Cc: OFIWG Mailing list > Subject: Re: sockets

Re: [ofiwg] sockets fabtests failure in testing for 1.4.0 release

2016-10-28 Thread Hefty, Sean
> Fabtests commit def17d223107244f1d9ce77a440085f717a4b69e occurred > between v1.3 and 1.4. > > This commit applied fixes to both of the tests that are failing. > > I haven't found a relevant change to libfabric yet, but my guess is > that a fix was applied to the sockets provider, which exposed
