Re: Benchmarking MXNet with different compilers and different OpenMP implementations (results)

2019-04-12 Thread Pedro Larroy
Are there any updates on this?

This is still affecting multiprocessing; some tests hang:

rces. For information on submitting this issue, please see
https://bugs.llvm.org/.
[INFO] Setting test np/mx/python random seeds, use
MXNET_TEST_SEED=2124604270 to reproduce.
Assertion failure at kmp_runtime.cpp(6479): __kmp_thread_pool == __null.
OMP: Error #13: Assertion failure at kmp_runtime.cpp(6479).
OMP: Hint: Please submit a bug report with this message, compile and
run commands used, and machine configuration info including native
compiler and operating system versions. Faster response will be
obtained by including all program sources. For information on
submitting this issue, please see https://bugs.llvm.org/.
Assertion failure at kmp_runtime.cpp(6479): __kmp_thread_pool == __null.
OMP: Error #13: Assertion failure at kmp_runtime.cpp(6479).
OMP: Hint: Please submit a bug report with this message, compile and
run commands used, and machine configuration info including native
compiler and operating system versions. Faster response will be
obtained by including all program sources. For information on
submitting this issue, please see https://bugs.llvm.org/.
^CException ignored in: >
Traceback (most recent call last):
  File "/home/piotr/mxnet_other/python/mxnet/gluon/data/dataloader.py", line 595, in __del__
    self._worker_pool.terminate()
  File "/usr/lib/python3.6/multiprocessing/pool.py", line 567, in terminate
    self._terminate()
  File "/usr/lib/python3.6/multiprocessing/util.py", line 186, in __call__
    res = self._callback(*self._args, **self._kwargs)
  File "/usr/lib/python3.6/multiprocessing/pool.py", line 597, in _terminate_pool
    cls._help_stuff_finish(inqueue, task_handler, len(pool))
  File "/usr/lib/python3.6/multiprocessing/pool.py", line 582, in _help_stuff_finish
    inqueue._rlock.acquire()
KeyboardInterrupt

Pedro.

On Thu, Feb 14, 2019 at 6:30 AM Tsukrov, Stanislav wrote:
>
> Thanks Aaron for the feedback.
>
> > As for your next steps, would you propose that cmake be brought up to 
> > parity?
> Yes. sse2 in cmake vs sse3 in make is a minor example without high impact. 
> There are others.
>
> > It seems strange that it causes slowness and if so, it shouldn't be 
> > recommended for now.
> There are some issues in the cmake files that should be fixed. Some of
> them are worked around for the benchmark.
>
> Best Regards
>
> Stas
>
> On 14.02.19, 14:09, "Anton Chernov"  wrote:
>
> Thank you, Aaron, for your interest in the topic.
>
> My main previous proposal still stands: remove the bundled OpenMP submodule and
> use OpenMP provided by the environment [1]. This might lead to performance
> degradation in some cases where an old OpenMP library is used or thread
> affinity wasn't set properly. But that would be a problem of the
> environment, not MXNet.
>
> I described some alternative solutions in [1] as part of this [2] thread.
> In both cases, tricking the linker with symlinks should make it possible to
> avoid linking multiple OpenMP implementations into MXNet simultaneously.
> Windows questions would still be open.
>
> Best
> Anton
>
> [1] https://github.com/apache/incubator-mxnet/pull/12160
> [2] https://lists.apache.org/thread.html/007d8db15a1782e1b20896a4050b62710d4ff0908c67b94af7cb0f8b@%3Cdev.mxnet.apache.org%3E
> [3] https://lists.apache.org/thread.html/4827f0f742b6e7e070da350ea81226d059401527f3072ce8b33c1fdf@%3Cdev.mxnet.apache.org%3E
>
>
> On Tue, 12 Feb 2019 at 16:39, Aaron Markham wrote:
>
> > This is really great research. I've often wondered what the difference
> > really is, and why it has to be so complicated. It seems the answer is
> > there isn't much difference and it shouldn't be as complex.
> > As for your next steps, would you propose that cmake be brought up to
> > parity? It seems strange that it causes slowness and if so, it shouldn't be
> > recommended for now.
> > Also, testing for Windows compilers might be quite important as install
> > stats suggest a significant portion of Windows users. Wouldn't this nudge
> > the decision of what to use as a rule going forward?
> > I ran into this submodule OpenMP issue on Windows myself. How does that get
> > fixed? Do we have to repackage all of the submodules to make sure they use
> > the recommended implementation or they use what the system expects?
> >
> > Cheers,
> > Aaron
> >
> > On Tue, Feb 12, 2019, 04:37 Anton Chernov  wrote:
> >
> > > Dear MXNet community,
> > >
> > > Due to multiple problems related to OpenMP and stale proposed change [1] we
> > > have been working on gathering performance data on the impact of using
> > > different OpenMP implementations with MXNet (great thanks to Stanislav
> > > Tsukrov for the hard work). The results can be found here [2].
> > >
> > > As a short summary of the investigation: 

Re: Benchmarking MXNet with different compilers and different OpenMP implementations (results)

2019-02-14 Thread Tsukrov, Stanislav
Thanks Aaron for the feedback.

> As for your next steps, would you propose that cmake be brought up to parity? 
Yes. sse2 in cmake vs sse3 in make is a minor example without high impact. 
There are others.

> It seems strange that it causes slowness and if so, it shouldn't be 
> recommended for now.
There are some issues in the cmake files that should be fixed. Some of
them are worked around for the benchmark.

Best Regards

Stas

On 14.02.19, 14:09, "Anton Chernov"  wrote:

Thank you, Aaron, for your interest in the topic.

My main previous proposal still stands: remove the bundled OpenMP submodule and
use OpenMP provided by the environment [1]. This might lead to performance
degradation in some cases where an old OpenMP library is used or thread
affinity wasn't set properly. But that would be a problem of the
environment, not MXNet.

I described some alternative solutions in [1] as part of this [2] thread.
In both cases, tricking the linker with symlinks should make it possible to
avoid linking multiple OpenMP implementations into MXNet simultaneously.
Windows questions would still be open.

Best
Anton

[1] https://github.com/apache/incubator-mxnet/pull/12160
[2] https://lists.apache.org/thread.html/007d8db15a1782e1b20896a4050b62710d4ff0908c67b94af7cb0f8b@%3Cdev.mxnet.apache.org%3E
[3] https://lists.apache.org/thread.html/4827f0f742b6e7e070da350ea81226d059401527f3072ce8b33c1fdf@%3Cdev.mxnet.apache.org%3E


On Tue, 12 Feb 2019 at 16:39, Aaron Markham wrote:

> This is really great research. I've often wondered what the difference
> really is, and why it has to be so complicated. It seems the answer is
> there isn't much difference and it shouldn't be as complex.
> As for your next steps, would you propose that cmake be brought up to
> parity? It seems strange that it causes slowness and if so, it shouldn't be
> recommended for now.
> Also, testing for Windows compilers might be quite important as install
> stats suggest a significant portion of Windows users. Wouldn't this nudge
> the decision of what to use as a rule going forward?
> I ran into this submodule OpenMP issue on Windows myself. How does that get
> fixed? Do we have to repackage all of the submodules to make sure they use
> the recommended implementation or they use what the system expects?
>
> Cheers,
> Aaron
>
> On Tue, Feb 12, 2019, 04:37 Anton Chernov  wrote:
>
> > Dear MXNet community,
> >
> > Due to multiple problems related to OpenMP and stale proposed change [1] we
> > have been working on gathering performance data on the impact of using
> > different OpenMP implementations with MXNet (great thanks to Stanislav
> > Tsukrov for the hard work). The results can be found here [2].
> >
> > As a short summary of the investigation: The difference between different
> > compilers is insignificant. Native OpenMP implementations (more or less
> > recent) perform equally (<5% difference). See more details in the document.
> >
> > Please review the document and share your thoughts on the topic.
> >
> > Thanks!
> >
> > Best
> > Anton
> >
> > [1] https://lists.apache.org/thread.html/4827f0f742b6e7e070da350ea81226d059401527f3072ce8b33c1fdf@
> >
> > [2] https://cwiki.apache.org/confluence/x/2wclBg
> >
>





Re: Benchmarking MXNet with different compilers and different OpenMP implementations (results)

2019-02-14 Thread Anton Chernov
Thank you, Aaron, for your interest in the topic.

My main previous proposal still stands: remove the bundled OpenMP submodule and
use OpenMP provided by the environment [1]. This might lead to performance
degradation in some cases where an old OpenMP library is used or thread
affinity wasn't set properly. But that would be a problem of the
environment, not MXNet.
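
As a rough illustration of what configuring thread count and affinity "in the environment" could look like, the sketch below sets the standard OpenMP variables OMP_NUM_THREADS, OMP_PROC_BIND and OMP_PLACES from Python. The helper name configure_openmp_env and the default values are assumptions made for this example and are not taken from the benchmark; the variables generally have to be set before the OpenMP runtime is loaded, for example before importing mxnet.

```python
import os


def configure_openmp_env(num_threads, proc_bind="close", places="cores", env=None):
    """Set standard OpenMP environment variables (hypothetical helper).

    `env` defaults to os.environ; pass a dict to build an environment
    for a child process instead of modifying the current one.
    """
    target = os.environ if env is None else env
    target["OMP_NUM_THREADS"] = str(num_threads)
    target["OMP_PROC_BIND"] = proc_bind  # e.g. "close", "spread", "true", "false"
    target["OMP_PLACES"] = places        # e.g. "threads", "cores", "sockets"
    return target


if __name__ == "__main__":
    # Illustrative values only; suitable settings depend on the machine.
    configure_openmp_env(num_threads=4)
    print({k: v for k, v in os.environ.items() if k.startswith("OMP_")})
```

Whether a given OpenMP runtime honours these settings, and which values perform best, depends on the runtime version and the hardware.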

I described some alternative solutions in [1] as part of this [2] thread.
In both cases, tricking the linker with symlinks should make it possible to
avoid linking multiple OpenMP implementations into MXNet simultaneously.
Windows questions would still be open.
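
To check whether more than one OpenMP implementation actually ends up in a process, something like the following sketch could be used on Linux. It scans /proc/self/maps for the usual runtime library names (libgomp for GNU, libomp for LLVM, libiomp5 for Intel). The function names are made up for this example, and the name list is an assumption that may not cover every build.

```python
import os
import re

# Basenames of common OpenMP runtimes: GNU (libgomp), LLVM (libomp), Intel (libiomp5).
_OPENMP_NAME = re.compile(r"lib(gomp|iomp5|omp)\b")


def find_openmp_runtimes(maps_text):
    """Return sorted paths of OpenMP runtime libraries named in /proc/<pid>/maps text."""
    found = set()
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping without a backing file
        path = fields[5].strip()
        if _OPENMP_NAME.match(os.path.basename(path)):
            found.add(path)
    return sorted(found)


def loaded_openmp_runtimes(maps_path="/proc/self/maps"):
    """Linux only: list OpenMP runtimes mapped into the current process."""
    if not os.path.exists(maps_path):
        return []
    with open(maps_path) as handle:
        return find_openmp_runtimes(handle.read())


if __name__ == "__main__":
    # Import the library under test (e.g. mxnet) before this point to inspect it.
    runtimes = loaded_openmp_runtimes()
    print(runtimes)
    if len(runtimes) > 1:
        print("warning: more than one OpenMP runtime is loaded")
```

Running the check after importing the library under test shows which runtimes the dynamic linker really loaded, independent of what the build files intended.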

Best
Anton

[1] https://github.com/apache/incubator-mxnet/pull/12160
[2] https://lists.apache.org/thread.html/007d8db15a1782e1b20896a4050b62710d4ff0908c67b94af7cb0f8b@%3Cdev.mxnet.apache.org%3E
[3] https://lists.apache.org/thread.html/4827f0f742b6e7e070da350ea81226d059401527f3072ce8b33c1fdf@%3Cdev.mxnet.apache.org%3E


On Tue, 12 Feb 2019 at 16:39, Aaron Markham wrote:

> This is really great research. I've often wondered what the difference
> really is, and why it has to be so complicated. It seems the answer is
> there isn't much difference and it shouldn't be as complex.
> As for your next steps, would you propose that cmake be brought up to
> parity? It seems strange that it causes slowness and if so, it shouldn't be
> recommended for now.
> Also, testing for Windows compilers might be quite important as install
> stats suggest a significant portion of Windows users. Wouldn't this nudge
> the decision of what to use as a rule going forward?
> I ran into this submodule OpenMP issue on Windows myself. How does that get
> fixed? Do we have to repackage all of the submodules to make sure they use
> the recommended implementation or they use what the system expects?
>
> Cheers,
> Aaron
>
> On Tue, Feb 12, 2019, 04:37 Anton Chernov  wrote:
>
> > Dear MXNet community,
> >
> > Due to multiple problems related to OpenMP and stale proposed change [1] we
> > have been working on gathering performance data on the impact of using
> > different OpenMP implementations with MXNet (great thanks to Stanislav
> > Tsukrov for the hard work). The results can be found here [2].
> >
> > As a short summary of the investigation: The difference between different
> > compilers is insignificant. Native OpenMP implementations (more or less
> > recent) perform equally (<5% difference). See more details in the document.
> >
> > Please review the document and share your thoughts on the topic.
> >
> > Thanks!
> >
> > Best
> > Anton
> >
> > [1] https://lists.apache.org/thread.html/4827f0f742b6e7e070da350ea81226d059401527f3072ce8b33c1fdf@
> >
> > [2] https://cwiki.apache.org/confluence/x/2wclBg
> >
>


Re: Benchmarking MXNet with different compilers and different OpenMP implementations (results)

2019-02-12 Thread Aaron Markham
This is really great research. I've often wondered what the difference
really is, and why it has to be so complicated. It seems the answer is
there isn't much difference and it shouldn't be as complex.
As for your next steps, would you propose that cmake be brought up to
parity? It seems strange that it causes slowness and if so, it shouldn't be
recommended for now.
Also, testing for Windows compilers might be quite important as install
stats suggest a significant portion of Windows users. Wouldn't this nudge
the decision of what to use as a rule going forward?
I ran into this submodule OpenMP issue on Windows myself. How does that get
fixed? Do we have to repackage all of the submodules to make sure they use
the recommended implementation or they use what the system expects?

Cheers,
Aaron

On Tue, Feb 12, 2019, 04:37 Anton Chernov  wrote:

> Dear MXNet community,
>
> Due to multiple problems related to OpenMP and stale proposed change [1] we
> have been working on gathering performance data on the impact of using
> different OpenMP implementations with MXNet (great thanks to Stanislav
> Tsukrov for the hard work). The results can be found here [2].
>
> As a short summary of the investigation: The difference between different
> compilers is insignificant. Native OpenMP implementations (more or less
> recent) perform equally (<5% difference). See more details in the document.
>
> Please review the document and share your thoughts on the topic.
>
> Thanks!
>
> Best
> Anton
>
> [1] https://lists.apache.org/thread.html/4827f0f742b6e7e070da350ea81226d059401527f3072ce8b33c1fdf@
>
> [2] https://cwiki.apache.org/confluence/x/2wclBg
>


Benchmarking MXNet with different compilers and different OpenMP implementations (results)

2019-02-12 Thread Anton Chernov
Dear MXNet community,

Due to multiple problems related to OpenMP and stale proposed change [1] we
have been working on gathering performance data on the impact of using
different OpenMP implementations with MXNet (great thanks to Stanislav
Tsukrov for the hard work). The results can be found here [2].

As a short summary of the investigation: The difference between different
compilers is insignificant. Native OpenMP implementations (more or less
recent) perform equally (<5% difference). See more details in the document.

Please review the document and share your thoughts on the topic.

Thanks!

Best
Anton

[1] https://lists.apache.org/thread.html/4827f0f742b6e7e070da350ea81226d059401527f3072ce8b33c1fdf@

[2] https://cwiki.apache.org/confluence/x/2wclBg