My feeling is that the current patch hides the symptoms without addressing
the real issue.

As a side note: the compiler incriminated in this thread works perfectly
for 128-bit atomic operations in other projects where I use atomic LIFO &
FIFO (but not the ones from OMPI, as I have already raised my concerns about them).
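
A minimal sketch of the kind of 128-bit compare-and-swap at issue here, assuming a GCC-compatible compiler on a 64-bit target. The names `counted_ptr_t`, `cas128`, and `cas128_selfcheck` are hypothetical; this is neither the reproducer attached to the thread nor OMPI's implementation. When the compiler does not advertise a 16-byte CAS, the sketch falls back to a plain, non-atomic update, roughly what disabling the 128-bit path amounts to in a single-threaded test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical counted pointer: a counter plus an item address,
 * overlaid with a single 128-bit integer so both can be swapped at once. */
typedef union {
    struct {
        uint64_t counter;
        uint64_t item;      /* address stored as an integer to fix the layout */
    } data;
    __int128_t value;
} counted_ptr_t;

/* Compare-and-swap on the whole 16-byte value.
 * Uses the __sync builtin only when the compiler defines
 * __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16 (e.g. gcc -mcx16 on x86-64);
 * otherwise falls back to a plain, NON-atomic compare and assign. */
static bool cas128(counted_ptr_t *addr, counted_ptr_t expected,
                   counted_ptr_t desired)
{
#if defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_16)
    return __sync_bool_compare_and_swap(&addr->value, expected.value,
                                        desired.value);
#else
    if (addr->value != expected.value) {
        return false;
    }
    addr->value = desired.value;
    return true;
#endif
}

/* Returns 1 when a successful CAS leaves the target equal to the
 * desired value, compared through the 128-bit view. */
static int cas128_selfcheck(void)
{
    counted_ptr_t a = { .data = { .counter = 1, .item = 0x1000 } };
    counted_ptr_t old = a;
    counted_ptr_t b = { .data = { .counter = 2, .item = 0x2000 } };

    assert(sizeof(counted_ptr_t) == 16);
    if (!cas128(&a, old, b)) {
        return 0;
    }
    return a.value == b.value;
}
```

Calling `cas128_selfcheck()` from a small `main` and building at several optimization levels is one way to compare compilers; the check should return 1 with a correct compiler.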

  George.

PS: Why are there totally unrelated comments about FIFO in
opal_lifo.h (starting at line 61)?

On Wed, Feb 4, 2015 at 11:30 PM, Gilles Gouaillardet <
gilles.gouaillar...@iferc.org> wrote:

>  Paul and all,
>
> i just pushed
> https://github.com/open-mpi/ompi/commit/b42e3441294e9fe787fe8e9ad7403d5b8e465163
>
> when a buggy compiler is detected, configure now forces
> OPAL_HAVE_CMPXCHG16B=0
> this is enough to make opal_lifo test and make check happy again.
>
> Cheers,
>
> Gilles
>
>
> On 2015/02/04 17:26, Gilles Gouaillardet wrote:
>
> Paul,
>
> my previous email was misleading.
>
> what i really meant is the opal_fifo test works fine with icc 2013u5
> (the release before 2013sp1) and
> icc 2013sp1u2 and later
>
> so even if the reproducer fails with icc older than 2013sp1u2, that
> might not impact ompi
> since for other reasons, the bug is not hit
>
> for example, with icc 2013u5, OPAL_HAVE_CMPXCHG16B=0 so ompi stays away
> from the compiler bug.
>
> Cheers,
>
> Gilles
>
> On 2015/02/04 17:15, Paul Hargrove wrote:
>
>  Gilles,
>
> Who says only 2 versions are affected?
>
> I have access to 9 revisions of icc.
> Using your reduced case I find 7 that fail and only 2 (the latest two) that
> pass.
> Discounting icc-12 (which can't compile the test), that makes 6 versions
> affected by the bug (not 2).
>
> -Paul
>
> $ for x in 12.1.5.339 13.0.0.079 13.0.1.117 13.1.2.183 13.1.3.192
> 14.0.0.080 14.0.1.106 14.0.2.144 15.0.1.133; do module swap intel intel/$x
> ; echo @ Testing Intel compiler version $x; icc conftest.c && ./a.out &&
> echo PASS ; done
> @ Testing Intel compiler version 12.1.5.339
> conftest.c(10): error: identifier "__int128_t" is undefined
>       __int128_t value;
>       ^
>
> compilation aborted for conftest.c (code 2)
> @ Testing Intel compiler version 13.0.0.079
> a.out: conftest.c:36: main: Assertion `a.value == b.value' failed.
> Aborted
> @ Testing Intel compiler version 13.0.1.117
> a.out: conftest.c:36: main: Assertion `a.value == b.value' failed.
> Aborted
> @ Testing Intel compiler version 13.1.2.183
> a.out: conftest.c:36: main: Assertion `a.value == b.value' failed.
> Aborted
> @ Testing Intel compiler version 13.1.3.192
> a.out: conftest.c:36: main: Assertion `a.value == b.value' failed.
> Aborted
> @ Testing Intel compiler version 14.0.0.080
> a.out: conftest.c:36: main: Assertion `a.value == b.value' failed.
> Aborted
> @ Testing Intel compiler version 14.0.1.106
> a.out: conftest.c:36: main: Assertion `a.value == b.value' failed.
> Aborted
> @ Testing Intel compiler version 14.0.2.144
> PASS
> @ Testing Intel compiler version 15.0.1.133
> PASS
>
> On Tue, Feb 3, 2015 at 11:45 PM, Gilles Gouaillardet 
> <gilles.gouaillar...@iferc.org> wrote:
>
>
>   Nathan,
>
> imho, this is a compiler bug and only two versions are affected :
> - intel icc 14.0.0.080 (aka 2013sp1)
> - intel icc 14.0.1.106 (aka 2013sp1u1)
> /* note the bug only occurs with -O1 and higher optimization levels */
>
> here is attached a simple reproducer
>
> a simple workaround is to configure with ac_cv_type___int128=0
>
> Cheers,
>
> Gilles
>
> On 2015/02/04 4:17, Nathan Hjelm wrote:
>
> That's the second report involving icc 14. I will dig into this later
> this week.
>
> -Nathan
>
> On Mon, Feb 02, 2015 at 11:03:41PM -0800, Paul Hargrove wrote:
>
>     I have seen opal_fifo hang on 2 distinct systems
>     + Linux/ppc32 with xlc-11.1
>     + Linux/x86-64 with icc-14.0.1.106
>    I have no explanation to offer for either hang.
>    No "weird" configure options were passed to either.
>    -Paul
>    --
>    Paul H. Hargrove                          phhargr...@lbl.gov
>    Computer Languages & Systems Software (CLaSS) Group
>    Computer Science Department               Tel: +1-510-495-2352
>    Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
>
>   _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2015/02/16911.php