Re: [OMPI devel] OpenIB/usNIC errors

2014-06-01 Thread Gilles Gouaillardet
Artem, thanks for the feedback. i committed the patch to the trunk (r31922). as i indicated in the commit log, this patch is likely suboptimal and has room for improvement. Jeff commented about the usnic-related issue, so i will wait for a fix from the Cisco folks. Cheers, Gilles On Sun,

Re: [OMPI devel] OpenIB/usNIC errors

2014-06-01 Thread Jeff Squyres (jsquyres)
This should also be fixed when we stop firing up the usnic connectivity checker when there are no usNICs present. On Jun 1, 2014, at 9:12 AM, Artem Polyakov wrote: > > 2014-06-01 14:24 GMT+07:00 Gilles Gouaillardet > : > export

Re: [OMPI devel] OpenIB/usNIC errors

2014-06-01 Thread Jeff Squyres (jsquyres)
Ah -- I missed the attachment; I only looked at your email text. I'll have a look now... auto-failure: Ah, I found this late last week and sent a fix around internally for review. Should have something soon for trunk/v1.8. If you care: we accidentally still fire up the usnic connectivity

Re: [OMPI devel] OpenIB/usNIC errors

2014-06-01 Thread Artem Polyakov
2014-06-01 14:24 GMT+07:00 Gilles Gouaillardet < gilles.gouaillar...@gmail.com>: > export OMPI_MCA_btl_openib_use_eager_rdma=0 Gilles, I tested your approach. Both: a) export OMPI_MCA_btl_openib_use_eager_rdma=0 b) applying your patch and running without "export OMPI_MCA_btl_openib_use_eager_rdma=0"

Re: [OMPI devel] OpenIB/usNIC errors

2014-06-01 Thread Artem Polyakov
Hello, Jeff. Please check the attached tar ("auto-failure" dir). There I've seen the following message: -- An internal error has occurred in the Open MPI usNIC BTL. This is highly unusual and shouldn't happen. It suggests

Re: [OMPI devel] OpenIB/usNIC errors

2014-06-01 Thread Jeff Squyres (jsquyres)
Just to be clear: it looks like you haven't seen any errors from the usnic BTL, right? (the Cisco VIC uses the usnic BTL only -- it does not use the openib BTL) On Jun 1, 2014, at 2:57 AM, Artem Polyakov wrote: > Hello, while testing new PMI implementation I faced a

Re: [OMPI devel] OpenIB/usNIC errors

2014-06-01 Thread Artem Polyakov
I think I can do that. On Sunday, June 1, 2014, Gilles Gouaillardet wrote: > Artem, > > this looks like the issue initially reported by Rolf > http://www.open-mpi.org/community/lists/devel/2014/05/14836.php > > in http://www.open-mpi.org/community/lists/devel/2014/05/14839.php

Re: [OMPI devel] OpenIB/usNIC errors

2014-06-01 Thread Gilles Gouaillardet
Artem, this looks like the issue initially reported by Rolf http://www.open-mpi.org/community/lists/devel/2014/05/14836.php in http://www.open-mpi.org/community/lists/devel/2014/05/14839.php i posted a patch and a workaround: export OMPI_MCA_btl_openib_use_eager_rdma=0 i do not recall i
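
[Editor's sketch, not part of the original message] The workaround above relies on Open MPI reading MCA parameters from environment variables prefixed with OMPI_MCA_. A minimal sketch of applying it follows; the ./ring binary and the process count of 4 are assumptions made for illustration, not details from the thread.

```shell
#!/bin/sh
# Workaround quoted in the thread: disable eager RDMA in the openib BTL
# for every Open MPI job launched from this shell.
export OMPI_MCA_btl_openib_use_eager_rdma=0

# Assumption: ./ring is a hypothetical test binary and -np 4 is arbitrary.
# The launch is skipped when mpirun or the binary is not available.
if command -v mpirun >/dev/null 2>&1 && [ -x ./ring ]; then
    mpirun -np 4 ./ring
    # Per-run alternative that does not touch the environment:
    # mpirun --mca btl_openib_use_eager_rdma 0 -np 4 ./ring
fi

# Show the value that Open MPI would pick up from the environment.
echo "btl_openib_use_eager_rdma=${OMPI_MCA_btl_openib_use_eager_rdma}"
```

The environment form affects every job started from the shell, while the --mca form applies to a single run, which may be more convenient when comparing behaviour with and without the workaround.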

Re: [OMPI devel] OpenIB/usNIC errors

2014-06-01 Thread Artem Polyakov
P.S. 1. Just to make sure, I tried the same program with the old ompi-1.6.5 that is installed on our cluster, and it ran without any problem. 2. My test program just sends data around a ring. 2014-06-01 13:57 GMT+07:00 Artem Polyakov : > Hello, while testing new PMI implementation I

[OMPI devel] OpenIB/usNIC errors

2014-06-01 Thread Artem Polyakov
Hello, while testing the new PMI implementation I ran into a problem with OpenIB and/or usNIC support. The cluster I use is built on Mellanox QDR. We don't use Cisco hardware, and thus have no Cisco Virtual Interface Card. To rule out any influence from the new PMI code, I used mpirun to launch the job. Slurm job