Fwd: [RESTARTING][VOTE] Release Apache MXNet (incubating) version 1.4.0.rc2

2019-02-12 Thread Justin Mclean
forgot to CC dev

> Begin forwarded message:
> 
> From: Justin Mclean 
> Subject: Re: [RESTARTING][VOTE] Release Apache MXNet (incubating) version 
> 1.4.0.rc2
> Date: 13 February 2019 at 6:43:48 pm AEDT
> To: Michael Wall 
> 
> Hi,
> 
>> Option 1:
>> Do nothing.  I don't know how a RESTARTED vote works.
> 
> I don’t believe there is such a concept.
> 
>> Option 2: 
>> Start another vote thread on general@incubator.a.o pointing to the original 
>> vote thread on dev@mxnet.a.o and the canceled vote thread. 
> 
> It may end up with the same outcome.
> 
>> Option 3:
>> 1 - Fix the header issues.  
> 
>> 3 - Start a vote thread on general@incubator.a.o pointing to the new vote 
>> thread from step 2.  Will likely need to be open 72 hours.
> 
> Just be aware it can take longer, sometimes much longer, to get the 3 +1 IPMC 
> votes.
> 
>> Tough position to be in with Horovod being released.  
> 
> Which shows the risk of tying your release cycle to a non-Apache product.
> IMO you need to be independent of 3rd party releases and not tied to their
> milestones. If they wanted to include a particular unreleased version of ASF
> software, you should have started the release well ahead of time in case
> problems were encountered. This probably wouldn't be an issue if you made
> more frequent releases; it’s easier to check compliance with frequent
> releases, so the 3rd party could just take the last good release and go
> with that.
> 
> Thanks,
> Justin



Re: [RESTARTING][VOTE] Release Apache MXNet (incubating) version 1.4.0.rc2

2019-02-12 Thread Sheng Zha
Thanks for the detailed explanation and the help on educating the community, 
Michael.

People on the general list are spending time to help us get the licensing 
right. If possible, I think we should show our thanks by taking their feedback 
more seriously, making the effort to fix the problems quickly, and getting our 
release out when it is ready. Fixes for the issues found during the release are 
already going in as we speak [1][2][3].

One thing the community would benefit from is clarity on which file types 
we should remove from our rat-excludes file [4], so that we make 
the project compliant with the release policy once and for all.

-sz

[1] https://github.com/apache/incubator-mxnet/pull/14138
[2] https://github.com/apache/incubator-mxnet/pull/14141
[3] https://github.com/apache/incubator-mxnet/pull/14043
[4] 
https://github.com/apache/incubator-mxnet/blob/master/tests/nightly/apache_rat_license_check/rat-excludes

On 2019/02/13 01:14:07, Michael Wall  wrote: 
> Hi Qing,
> 
> I see 3 options
> 
> Option 1:
> Do nothing.  I don't know how a RESTARTED vote works.  Steffen counted the
> binding votes from before it was restarted.  Unsure if that actually
> works.  There has been one +1 vote since the restart, but it is
> non-binding as best I can tell even though it is labeled as binding.  To be a
> binding vote for the general@incubator.a.o VOTE you must be on the
> Incubator PMC (IPMC).  Users on the MXNet Podling PMC (PPMC) have a
> binding vote only on the dev@mxnet VOTE thread.   See
> https://incubator.apache.org/policy/incubation.html#releases.  In addition,
> those binding +1 votes may need to be changed based on
> http://www.apache.org/legal/release-policy.html#release-approval which reads
> 
> "Before casting +1 binding votes, individuals are REQUIRED to download all
> signed source code packages onto their own hardware, verify that they meet
> all requirements of ASF policy on releases as described below, validate all
> cryptographic signatures, compile as provided, and test the result on their
> own platform."
> 
> Luciano's -1 was because the release does not meet the licensing policy at
> http://www.apache.org/legal/release-policy.html#license-headers
> 
> For this reason, I can not give a +1 on the general@incubator.a.o VOTE
> thread.  Sorry, that is why I have not voted.
> 
> Option 2:
> Start another vote thread on general@incubator.a.o pointing to the original
> vote thread on dev@mxnet.a.o and the canceled vote thread.  That would likely
> need to be open for 72 hours unless the IPMC agrees otherwise.  I list this
> because I don't know if a RESTART recounting votes from a prior thread is
> valid.  But this option has the same risk of not being approved for the
> reasons listed above.
> 
> Option 3:
> 1 - Fix the header issues.  I dug a little more, and the excludes file at
> https://github.com/apache/incubator-mxnet/blob/v1.4.x/tests/nightly/apache_rat_license_check/rat-excludes
> is
> overly broad and excludes files from the check that should have license
> headers, again per
> http://www.apache.org/legal/release-policy.html#license-headers
> 2 - Start a vote thread on dev@mxnet.a.o.  Doesn't have to be open 72 hours
> according to Justin's note if the PPMC agrees.  Expect this would need to
> be documented on the mailing list, but could be part of the vote I think.
> 3 - Start a vote thread on general@incubator.a.o pointing to the new vote
> thread from step 2.  Will likely need to be open 72 hours.
> 
> Clearly option 1 would be faster, but the risk is the vote not passing.
> Option 2 may not be needed if the restart in option 1 is valid.  Option 3
> is the most correct I think according to what I read in ASF policy.  But
> rushing a vote does have risks, such as less testing on the code being
> released.
> 
> To make this more confusing, the VOTE thread is showing up on both
> dev@mxnet.a.o and general@incubator.a.o.  There is an additional +1 vote on
> the dev@mxnet.a.o list that doesn't show up on the general@incubator, but
> this too is non-binding as best I can tell.
> 
> Tough position to be in with Horovod being released.  Nothing in ASF policy
> makes allowances for such an event that I could find.  Perhaps we should
> ask for more clarification on general@incubator.a.o to get more thoughts
> from the IPMC.
> 
> Mike
> 
> On Tue, Feb 12, 2019 at 5:53 PM Qing Lan  wrote:
> 
> > Hi Michael,
> >
> > Could you please guide how to proceed with this? Given that we have a
> > possibility of announcing MXNet support in Horovod with their next release
> > and this would help MXNet increase our visibility.
> >
> > Thanks,
> > Qing
> >
> > On 2/12/19, 2:16 PM, "Michael Wall"  wrote:
> >
> > Team,
> >
> > Here is my read on the situation.  The vote has been canceled.
> > Justin's
> > point was that a -1 doesn't mean you must cancel a vote for the
> > reasons he
> > outlined.  But here the vote needs to be restarted and the issue
> > Luciano
> > found needs 

RE: [VOTE] Release Apache MXNet (incubating) version 1.4.0.rc2

2019-02-12 Thread Zhao, Patric
Update: the issue is fixed and the new patch release, MKL-DNN 0.17.4, is out.

Tao filed a PR to update the MKLDNN version in the 1.4.x release branch:
https://github.com/apache/incubator-mxnet/pull/14141

Thanks for all your help :)

--Patric


> -Original Message-
> From: Zhao, Patric [mailto:patric.z...@intel.com]
> Sent: Tuesday, February 5, 2019 11:53 AM
> To: dev@mxnet.incubator.apache.org
> Cc: Lv, Tao A ; Ye, Jason Y 
> Subject: RE: [VOTE] Release Apache MXNet (incubating) version 1.4.0.rc2
> 
> Hi Sheng,
> 
> Thanks for raising this important issue. Sorry for the lack of validation; we
> don't have a Mac with an earlier OS version in house.
> 
> I will contact the MKL-DNN team about support for earlier versions of OSX,
> but I'm a little afraid the fix will need some extra time.
> 
> Alternatively, I can think of several workarounds (I know none is a perfect
> solution):
> 
> * use LLVM, which works across HW/OS generations
> https://github.com/apache/incubator-mxnet/blob/master/MKLDNN_README.md#2
> 
> * provide binary builds for different HW/OS, as with CUDA (mxnet-cu90/92)
> 
> * disable MKLDNN support for earlier versions of HW/OS on Mac, with only the
> mxnet build.
> 
> I will update the status when I get feedback and a schedule from the MKL-DNN
> team.
> 
> Feel free to let us know if there is anything we can help with.
> 
> Thanks,
> 
> --Patric
> 
> 
> > -Original Message-
> > From: Sheng Zha [mailto:szha@gmail.com]
> > Sent: Tuesday, February 5, 2019 10:33 AM
> > To: dev@mxnet.incubator.apache.org
> > Subject: Re: [VOTE] Release Apache MXNet (incubating) version
> > 1.4.0.rc2
> >
> > Also, recent MKLDNN upgrade prevents us from offering binary
> > distribution for earlier versions of OSX, as it now requires OSX
> > 10.13. This means we would need to drop the binary distribution
> > support for OSX 10.11 and 10.12 if we are to keep mkldnn as a
> > dependency for mxnet-mkl. I'm inquiring whether Intel could extend the
> > compatibility to earlier OSX [1], but even if this is solved upstream it
> > would require an update on the mkldnn submodule.
> >
> > -sz
> >
> > [1] https://github.com/intel/mkl-dnn/issues/405
> >
> > On Mon, Feb 4, 2019 at 3:47 PM Anirudh Subramanian
> > 
> > wrote:
> >
> > > -0
> > >
> > > Thanks Steffen for your release efforts !
> > >
> > > Build from source works with make but fails with cmake for me.
> > >
> > >  cd build && cmake VERBOSE=1 -DUSE_CUDA=ON -DUSE_CUDNN=ON
> > > -DUSE_OPENMP=ON -DCMAKE_BUILD_TYPE=Debug -DUSE_DIST_KVSTORE=0
> > > -DUSE_OPENCV=1 -GNinja .. && ninja -v
> > >
> > > FAILED: : && /usr/bin/c++   -Wall -Wno-unknown-pragmas -fPIC -g -O0 -msse2
> > > -std=c++11 -fopenmp -g  -pthread
> > > 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_lockfree.cc.o
> > > 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_param.cc.o
> > > 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_parser.cc.o
> > > 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_array_view.cc.o
> > > 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_any.cc.o
> > > 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_config.cc.o
> > > 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_threaditer.cc.o
> > > 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_serializer.cc.o
> > > 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_threaditer_exc_handling.cc.o
> > > 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_inputsplit.cc.o
> > > 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_logging.cc.o
> > > 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_json.cc.o
> > > 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_optional.cc.o
> > > 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_main.cc.o
> > > 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_env.cc.o
> > > 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_thread_group.cc.o -o
> > > 3rdparty/dmlc-core/test/unittest/dmlc_unit_tests  -rdynamic
> > > lib/libgtestd.a 3rdparty/dmlc-core/libdmlc.a -lpthread && :
> > >
> > > 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_logging.cc.o:
> > > In function `Logging_basics_Test::TestBody()':
> > >
> > > /home/ubuntu/experimentals/1.4_release/build/../3rdparty/dmlc-core/test/unittest/unittest_logging.cc:19:
> > > undefined reference to `testing::internal::DeathTest::Create(char

Re: [RESTARTING][VOTE] Release Apache MXNet (incubating) version 1.4.0.rc2

2019-02-12 Thread Aaron Markham
I think I misunderstood the 3rd party reference to imply Uber instead
of the 3rd party folder. I feel the same regardless, and defer to the
experts on what to do about the 3rd party folder.

As for the other license issues, we don't have to add license info to
readme or informational files. It is specifically called out as an
exception [1]:

"Other files may make sense to have no license header. Three examples are:
Short informational text files; for example README, INSTALL files. The
expectation is that these files make it obvious which product they
relate to.
Test data for which the addition of a source header would cause the
tests to fail.
'Snippet' files that are combined as form a larger file where the
larger file would have duplicate licensing headers."

I certainly wouldn't add headers to the markdown files as this would
create havoc in the website rendering until that is configured to
handle it. Besides, we're covered on these files as we have an Apache
copyright footer on the website. Also from the Apache page on headers
[1]:

"...Our web sites do not have an associated NOTICE file. Instead we
may soon be making the terms of such content explicit through a "Terms
of Use" or "Legal Information" link in the footer of web pages. At
this point, no action is required for Apache web sites."

I can think of a few examples where markdown files are not rendered on
the website, but as they're informational text files they're "obvious
which product they relate to" and therefore I think they can be
excluded.

I looked at the rat-exclude, and if pom.xml files (for example) are
supposed to have licenses, then we should probably add that and
tighten up the excludes for .*xml. But if we can do that in the next
release, that would be great. (I'm not sure how to gauge the
importance of these license headers vis-a-vis project usability.) Not
to muddy the waters, but why is the R package excluded entirely?
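
To illustrate the concern about broad entries such as .*xml, here is a rough
Python sketch. It assumes each exclude entry is a regular expression that must
match the whole relative path; that is an assumption made for illustration,
not a statement of how Apache RAT evaluates the file. The paths and the
"tighter" pattern are hypothetical.

```python
import re

# Hypothetical exclude entries. Treating them as regular expressions that
# must match the full relative path is an assumption made for this sketch.
BROAD_EXCLUDES = [r".*xml", r".*md"]
TIGHTER_EXCLUDES = [r".*/test-data/.*\.xml", r".*md"]

# Illustrative paths only; they are not taken from the release tarball.
paths = [
    "scala-package/pom.xml",
    "README.md",
    "tests/test-data/sample.xml",
    "src/engine.cc",
]


def is_excluded(path, patterns):
    """Return True if any exclude pattern matches the whole path."""
    return any(re.fullmatch(p, path) for p in patterns)


def files_needing_headers(candidates, patterns):
    """Return the files that would still be checked for a license header."""
    return [p for p in candidates if not is_excluded(p, patterns)]


# With the broad entry, pom.xml is never checked at all.
print(files_needing_headers(paths, BROAD_EXCLUDES))
# With the tighter entry, pom.xml is checked again; test data stays excluded.
print(files_needing_headers(paths, TIGHTER_EXCLUDES))
```

The point is only that a catch-all entry hides build files along with test
data, while a narrower entry keeps the exception for test data and puts
pom.xml back under the check.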

[1] http://www.apache.org/legal/src-headers.html#headers

Cheers,
Aaron

On Tue, Feb 12, 2019 at 5:23 PM Michael Wall  wrote:
>
> Hi Qing,
>
> I see 3 options
>
> Option 1:
> Do nothing.  I don't know how a RESTARTED vote works.  Steffen counted the
> binding votes from before it was restarted.  Unsure if that actually
> works.  There has been one +1 vote since the restart, but it is
> non-binding as best I can tell even though it is labeled as binding.  To be a
> binding vote for the general@incubator.a.o VOTE you must be on the
> Incubator PMC (IPMC).  Users on the MXNet Podling PMC (PPMC) have a
> binding vote only on the dev@mxnet VOTE thread.   See
> https://incubator.apache.org/policy/incubation.html#releases.  In addition,
> those binding +1 votes may need to be changed based on
> http://www.apache.org/legal/release-policy.html#release-approval which reads
>
> "Before casting +1 binding votes, individuals are REQUIRED to download all
> signed source code packages onto their own hardware, verify that they meet
> all requirements of ASF policy on releases as described below, validate all
> cryptographic signatures, compile as provided, and test the result on their
> own platform."
>
> Luciano's -1 was because the release does not meet the licensing policy at
> http://www.apache.org/legal/release-policy.html#license-headers
>
> For this reason, I can not give a +1 on the general@incubator.a.o VOTE
> thread.  Sorry, that is why I have not voted.
>
> Option 2:
> Start another vote thread on general@incubator.a.o pointing to the original
> vote thread on dev@mxnet.a.o and the canceled vote thread.  That would likely
> need to be open for 72 hours unless the IPMC agrees otherwise.  I list this
> because I don't know if a RESTART recounting votes from a prior thread is
> valid.  But this option has the same risk of not being approved for the
> reasons listed above.
>
> Option 3:
> 1 - Fix the header issues.  I dug a little more, and the excludes file at
> https://github.com/apache/incubator-mxnet/blob/v1.4.x/tests/nightly/apache_rat_license_check/rat-excludes
> is
> overly broad and excludes files from the check that should have license
> headers, again per
> http://www.apache.org/legal/release-policy.html#license-headers
> 2 - Start a vote thread on dev@mxnet.a.o.  Doesn't have to be open 72 hours
> according to Justin's note if the PPMC agrees.  Expect this would need to
> be documented on the mailing list, but could be part of the vote I think.
> 3 - Start a vote thread on general@incubator.a.o pointing to the new vote
> thread from step 2.  Will likely need to be open 72 hours.
>
> Clearly option 1 would be faster, but the risk is the vote not passing.
> Option 2 may not be needed if the restart in option 1 is valid.  Option 3
> is the most correct I think according to what I read in ASF policy.  But
> rushing a vote does have risks, such as less testing on the code being
> released.
>
> To make this more confusing, the VOTE thread is showing up on both
> dev@mxnet.a.o and general@incubator.a.o.  There is an additi

Re: [RESTARTING][VOTE] Release Apache MXNet (incubating) version 1.4.0.rc2

2019-02-12 Thread Michael Wall
Hi Qing,

I see 3 options

Option 1:
Do nothing.  I don't know how a RESTARTED vote works.  Steffen counted the
binding votes from before it was restarted.  Unsure if that actually
works.  There has been one +1 vote since the restart, but it is
non-binding as best I can tell even though it is labeled as binding.  To be a
binding vote for the general@incubator.a.o VOTE you must be on the
Incubator PMC (IPMC).  Users on the MXNet Podling PMC (PPMC) have a
binding vote only on the dev@mxnet VOTE thread.   See
https://incubator.apache.org/policy/incubation.html#releases.  In addition,
those binding +1 votes may need to be changed based on
http://www.apache.org/legal/release-policy.html#release-approval which reads

"Before casting +1 binding votes, individuals are REQUIRED to download all
signed source code packages onto their own hardware, verify that they meet
all requirements of ASF policy on releases as described below, validate all
cryptographic signatures, compile as provided, and test the result on their
own platform."

Luciano's -1 was because the release does not meet the licensing policy at
http://www.apache.org/legal/release-policy.html#license-headers

For this reason, I can not give a +1 on the general@incubator.a.o VOTE
thread.  Sorry, that is why I have not voted.

Option 2:
Start another vote thread on general@incubator.a.o pointing to the original
vote thread on dev@mxnet.a.o and the canceled vote thread.  That would likely
need to be open for 72 hours unless the IPMC agrees otherwise.  I list this
because I don't know if a RESTART recounting votes from a prior thread is
valid.  But this option has the same risk of not being approved for the
reasons listed above.

Option 3:
1 - Fix the header issues.  I dug a little more, and the excludes file at
https://github.com/apache/incubator-mxnet/blob/v1.4.x/tests/nightly/apache_rat_license_check/rat-excludes
is
overly broad and excludes files from the check that should have license
headers, again per
http://www.apache.org/legal/release-policy.html#license-headers
2 - Start a vote thread on dev@mxnet.a.o.  Doesn't have to be open 72 hours
according to Justin's note if the PPMC agrees.  Expect this would need to
be documented on the mailing list, but could be part of the vote I think.
3 - Start a vote thread on general@incubator.a.o pointing to the new vote
thread from step 2.  Will likely need to be open 72 hours.

Clearly option 1 would be faster, but the risk is the vote not passing.
Option 2 may not be needed if the restart in option 1 is valid.  Option 3
is the most correct I think according to what I read in ASF policy.  But
rushing a vote does have risks, such as less testing on the code being
released.

To make this more confusing, the VOTE thread is showing up on both
dev@mxnet.a.o and general@incubator.a.o.  There is an additional +1 vote on
the dev@mxnet.a.o list that doesn't show up on the general@incubator, but
this too is non-binding as best I can tell.

Tough position to be in with Horovod being released.  Nothing in ASF policy
makes allowances for such an event that I could find.  Perhaps we should
ask for more clarification on general@incubator.a.o to get more thoughts
from the IPMC.

Mike
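
The binding-vote rule discussed above can be sketched in Python as follows.
This is only an illustration: the Vote type and tally function are
hypothetical names, the pass rule used is "at least three binding +1 votes
and more binding +1 than -1", and the binding flags are taken as reported in
Steffen's tally elsewhere in this thread rather than verified against the
IPMC roster.

```python
from dataclasses import dataclass


@dataclass
class Vote:
    voter: str
    value: int      # +1, 0 or -1
    binding: bool   # on general@incubator only IPMC members are binding


def tally(votes):
    """Count binding votes and apply the majority-approval rule."""
    binding = [v for v in votes if v.binding]
    plus = sum(1 for v in binding if v.value > 0)
    minus = sum(1 for v in binding if v.value < 0)
    # Assumed rule: at least three binding +1 and more +1 than -1.
    return {
        "binding_plus": plus,
        "binding_minus": minus,
        "passes": plus >= 3 and plus > minus,
    }


# Status as reported in Steffen's restart message (binding flags as he listed).
votes = [
    Vote("Henri", +1, True),
    Vote("Jason", +1, True),
    Vote("Luciano", -1, True),
    Vote("Kellen", +1, False),
]

print(tally(votes))
```

Under those assumptions the reported status has two binding +1 votes, so it
would not yet pass even if the restart recount is valid.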

On Tue, Feb 12, 2019 at 5:53 PM Qing Lan  wrote:

> Hi Michael,
>
> Could you please guide how to proceed with this? Given that we have a
> possibility of announcing MXNet support in Horovod with their next release
> and this would help MXNet increase our visibility.
>
> Thanks,
> Qing
>
> On 2/12/19, 2:16 PM, "Michael Wall"  wrote:
>
> Team,
>
> Here is my read on the situation.  The vote has been canceled.
> Justin's
> point was that a -1 doesn't mean you must cancel a vote for the
> reasons he
> outlined.  But here the vote needs to be restarted and the issue
> Luciano
> found needs to be addressed.
>
> That issue is that there are files in the MXNet source tree that do not have
> the required license headers,
> http://www.apache.org/legal/release-policy.html#license-headers.  For
> example, the top level README.md is missing the header
>
> https://raw.githubusercontent.com/apache/incubator-mxnet/master/README.md.
> Excluding 3rd party files from the RAT check is fine, but not files
> originating from the MXNet repo.
>
> It would be good to know exactly how Luciano ran the RAT check, cc'd.
> Here
> is a link to the thread
>
> https://lists.apache.org/thread.html/51e9ab05edae2089c74a253000a92d5aa5c6406f54e5bd0a0b3c3879@%3Cgeneral.incubator.apache.org%3E
> .
>
> Justin's other point, also cc'd, was that the vote with the podling
> doesn't
> have to take 72 hours before going to the incubator list.
>
> I realize this is not what everyone is pushing for, so interested in
> other's thoughts.  Especially other mentors.
>
> Mike
>
> On Tue, Feb 12, 2019 at 4:47 PM Aaron Markham <
> aaron.s.mark...@gmail.com>
> wrote:
>
> > +1
> > I disagree about 3rd party consideratio

[DISCUSS] upgrade openblas

2019-02-12 Thread Qing Lan
Dear Community,

Recently I observed a performance regression in MXNet after switching from ATLAS 
to OpenBLAS 0.3.3. The OpenBLAS community addressed this issue in their 0.3.5 
release, so I plan to upgrade OpenBLAS in our publish system to regain the 
performance. I'm raising it here in case anyone has concerns.

Currently this upgrade seems to have a negative impact on Julia: 
https://github.com/xianyi/OpenBLAS/issues/1955

Reference:
https://github.com/xianyi/OpenBLAS/issues/1897

Thanks,
Qing


Re: [RESTARTING][VOTE] Release Apache MXNet (incubating) version 1.4.0.rc2

2019-02-12 Thread Qing Lan
Hi Michael, 

Could you please advise on how to proceed with this? We have a 
possibility of announcing MXNet support in Horovod with their next release, and 
this would help MXNet increase its visibility.

Thanks,
Qing

On 2/12/19, 2:16 PM, "Michael Wall"  wrote:

Team,

Here is my read on the situation.  The vote has been canceled.  Justin's
point was that a -1 doesn't mean you must cancel a vote for the reasons he
outlined.  But here the vote needs to be restarted and the issue Luciano
found needs to be addressed.

That issue is that there are files in the MXNet source tree that do not have
the required license headers,
http://www.apache.org/legal/release-policy.html#license-headers.  For
example, the top level README.md is missing the header
https://raw.githubusercontent.com/apache/incubator-mxnet/master/README.md.
Excluding 3rd party files from the RAT check is fine, but not files
originating from the MXNet repo.

It would be good to know exactly how Luciano ran the RAT check, cc'd.  Here
is a link to the thread

https://lists.apache.org/thread.html/51e9ab05edae2089c74a253000a92d5aa5c6406f54e5bd0a0b3c3879@%3Cgeneral.incubator.apache.org%3E
.

Justin's other point, also cc'd, was that the vote with the podling doesn't
have to take 72 hours before going to the incubator list.

I realize this is not what everyone is pushing for, so interested in
other's thoughts.  Especially other mentors.

Mike

On Tue, Feb 12, 2019 at 4:47 PM Aaron Markham 
wrote:

> +1
> I disagree about 3rd party considerations. This is an ecosystem after all.
> The distributed training story is quite nice with Horovod. Given my
> interaction with tensorflow with  Horovod and dynamic training with MXNet
> and the kvstore, this new route is, IMO, easier to setup and manage.
> I see the benefit for getting it out there sooner than later, and market
> timings are important to the project and adoption. If Uber's announcement
> drives traffic to MXNet, but then people can't set it up with a stable
> release package, there's a lost opportunity for growing the community. Why
> miss the opportunity for a RAT license?
>
> On Tue, Feb 12, 2019, 13:14 Dave Fisher  wrote:
>
> > Hi -
> >
> > Third party vendor considerations do not matter. Are you voting +1 with
> > your Apache hat on or your Amazon hat?
> >
> > Regards,
> > Dave
> >
> > > On Feb 11, 2019, at 10:16 PM, Lin Yuan  wrote:
> > >
> > > +1 binding
> > > > Horovod is going to release its 0.16.0 in the coming week with MXNet
> > > integration. We need to release 1.4.0 which includes all the
> dependencies
> > > for Horovod integration.
> > >
> > > Best,
> > >
> > > Lin
> > >
> > > On Mon, Feb 11, 2019 at 9:30 PM Steffen Rochel <
> steffenroc...@gmail.com>
> > > wrote:
> > >
> > >> Dear community -
> > > >> based on Justin's and community feedback I'm suggesting to restart the
> > > >> vote.
> > >> Current status:
> > >> binding votes:
> > >> +1: 2 votes (Henri, Jason)
> > >> -1:  1 vote (Luciano)
> > >>
> > >> non-binding:
> > >> +1: 1 vote (Kellen)
> > >>
> > > >> The community is investigating feedback from Luciano that the exclusion
> > > >> file is too broad and may leave unchecked files which can and must have
> > > >> Apache license headers.
> > >>
> > >> Regards,
> > >> Steffen
> > >>
> > >>
> > >>
> > >>
> > >> On Mon, Feb 11, 2019 at 10:08 AM Hagay Lupesko 
> > wrote:
> > >>
> > >>> Based on Justin's feedback, can we resume the vote instead of
> > cancelling
> > >>> it?
> > >>>
> > >>> On Mon, Feb 11, 2019 at 12:02 AM Justin Mclean <
> > jus...@classsoftware.com
> > >>>
> > >>> wrote:
> > >>>
> >  Hi,
> > 
> >  In future don’t be so hasty to cancel a release vote; people's minds can
> >  be changed and a -1 is not a veto on a release.
> > 
> >  Thanks,
> >  Justin
> > 
> > 
> > 
> -
> >  To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> >  For additional commands, e-mail: general-h...@incubator.apache.org
> > 
> > 
> > >>>
> > >>
> >
> >
>




Re: [RESTARTING][VOTE] Release Apache MXNet (incubating) version 1.4.0.rc2

2019-02-12 Thread Michael Wall
Team,

Here is my read on the situation.  The vote has been canceled.  Justin's
point was that a -1 doesn't mean you must cancel a vote for the reasons he
outlined.  But here the vote needs to be restarted and the issue Luciano
found needs to be addressed.

That issue is that there are files in the MXNet source tree that do not have
the required license headers,
http://www.apache.org/legal/release-policy.html#license-headers.  For
example, the top level README.md is missing the header
https://raw.githubusercontent.com/apache/incubator-mxnet/master/README.md.
Excluding 3rd party files from the RAT check is fine, but not files
originating from the MXNet repo.
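
As a rough illustration of the kind of check involved, the Python sketch
below looks for the opening phrase of the standard ASF source header near the
top of each file. It is a stand-in for explanation only and is not Apache RAT;
the function names, the 30-line window and the suffix list are assumptions.

```python
from pathlib import Path

# Opening phrase of the standard ASF source header.
HEADER_MARKER = "Licensed to the Apache Software Foundation (ASF)"


def has_asf_header(text, max_lines=30):
    """Return True if the marker appears within the first max_lines lines."""
    head = "\n".join(text.splitlines()[:max_lines])
    return HEADER_MARKER in head


def missing_headers(root, suffixes=(".py", ".cc", ".h", ".md")):
    """List files under root (by suffix) that lack the header marker.

    The suffix list is illustrative; a real check would be driven by the
    project's RAT configuration and its excludes file.
    """
    root = Path(root)
    return sorted(
        str(p.relative_to(root))
        for p in root.rglob("*")
        if p.is_file()
        and p.suffix in suffixes
        and not has_asf_header(p.read_text(encoding="utf-8", errors="replace"))
    )


if __name__ == "__main__":
    # Run against the current directory, e.g. an unpacked source release.
    for name in missing_headers("."):
        print(name)
```

Running something like this over an unpacked source release gives a quick list
to compare against what the excludes file is hiding.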

It would be good to know exactly how Luciano ran the RAT check, cc'd.  Here
is a link to the thread
https://lists.apache.org/thread.html/51e9ab05edae2089c74a253000a92d5aa5c6406f54e5bd0a0b3c3879@%3Cgeneral.incubator.apache.org%3E
.

Justin's other point, also cc'd, was that the vote with the podling doesn't
have to take 72 hours before going to the incubator list.

I realize this is not what everyone is pushing for, so interested in
other's thoughts.  Especially other mentors.

Mike

On Tue, Feb 12, 2019 at 4:47 PM Aaron Markham 
wrote:

> +1
> I disagree about 3rd party considerations. This is an ecosystem after all.
> The distributed training story is quite nice with Horovod. Given my
> interaction with tensorflow with  Horovod and dynamic training with MXNet
> and the kvstore, this new route is, IMO, easier to setup and manage.
> I see the benefit for getting it out there sooner than later, and market
> timings are important to the project and adoption. If Uber's announcement
> drives traffic to MXNet, but then people can't set it up with a stable
> release package, there's a lost opportunity for growing the community. Why
> miss the opportunity for a RAT license?
>
> On Tue, Feb 12, 2019, 13:14 Dave Fisher  wrote:
>
> > Hi -
> >
> > Third party vendor considerations do not matter. Are you voting +1 with
> > your Apache hat on or your Amazon hat?
> >
> > Regards,
> > Dave
> >
> > > On Feb 11, 2019, at 10:16 PM, Lin Yuan  wrote:
> > >
> > > +1 binding
> > > > Horovod is going to release its 0.16.0 in the coming week with MXNet
> > > integration. We need to release 1.4.0 which includes all the
> dependencies
> > > for Horovod integration.
> > >
> > > Best,
> > >
> > > Lin
> > >
> > > On Mon, Feb 11, 2019 at 9:30 PM Steffen Rochel <
> steffenroc...@gmail.com>
> > > wrote:
> > >
> > >> Dear community -
> > >> based on Justin's and community feedback I'm suggesting to restart the
> > >> vote.
> > >> Current status:
> > >> binding votes:
> > >> +1: 2 votes (Henri, Jason)
> > >> -1:  1 vote (Luciano)
> > >>
> > >> non-binding:
> > >> +1: 1 vote (Kellen)
> > >>
> > > >> The community is investigating feedback from Luciano that the exclusion
> > > >> file is too broad and may leave unchecked files which can and must have
> > > >> Apache license headers.
> > >>
> > >> Regards,
> > >> Steffen
> > >>
> > >>
> > >>
> > >>
> > >> On Mon, Feb 11, 2019 at 10:08 AM Hagay Lupesko 
> > wrote:
> > >>
> > >>> Based on Justin's feedback, can we resume the vote instead of
> > cancelling
> > >>> it?
> > >>>
> > >>> On Mon, Feb 11, 2019 at 12:02 AM Justin Mclean <
> > jus...@classsoftware.com
> > >>>
> > >>> wrote:
> > >>>
> >  Hi,
> > 
> >  In future don’t be so hasty to cancel a release vote; people's minds can
> >  be changed and a -1 is not a veto on a release.
> > 
> >  Thanks,
> >  Justin
> > 
> > 
> > 
> > 
> > 
> > >>>
> > >>
> >
> >
>


Re: [RESTARTING][VOTE] Release Apache MXNet (incubating) version 1.4.0.rc2

2019-02-12 Thread Aaron Markham
+1
I disagree about 3rd party considerations. This is an ecosystem after all.
The distributed training story is quite nice with Horovod. Given my
interaction with TensorFlow with Horovod and dynamic training with MXNet
and the kvstore, this new route is, IMO, easier to set up and manage.
I see the benefit of getting it out there sooner rather than later, and market
timing is important to the project and adoption. If Uber's announcement
drives traffic to MXNet, but then people can't set it up with a stable
release package, there's a lost opportunity for growing the community. Why
miss the opportunity over a RAT license issue?

On Tue, Feb 12, 2019, 13:14 Dave Fisher  wrote:

> Hi -
>
> Third party vendor considerations do not matter. Are you voting +1 with
> your Apache hat on or your Amazon hat?
>
> Regards,
> Dave
>
> > On Feb 11, 2019, at 10:16 PM, Lin Yuan  wrote:
> >
> > +1 binding
> > Horovod is going to release its 0.16.0 in the coming week with MXNet
> > integration. We need to release 1.4.0 which includes all the dependencies
> > for Horovod integration.
> >
> > Best,
> >
> > Lin
> >
> > On Mon, Feb 11, 2019 at 9:30 PM Steffen Rochel 
> > wrote:
> >
> >> Dear community -
> >> based on Justin's and community feedback I'm suggesting to restart the
> >> vote.
> >> Current status:
> >> binding votes:
> >> +1: 2 votes (Henri, Jason)
> >> -1:  1 vote (Luciano)
> >>
> >> non-binding:
> >> +1: 1 vote (Kellen)
> >>
> >> The community is investigating feedback from Luciano that the exclusion
> >> file is too broad and may leave unchecked files which can and must have
> >> Apache license headers.
> >>
> >> Regards,
> >> Steffen
> >>
> >>
> >>
> >>
> >> On Mon, Feb 11, 2019 at 10:08 AM Hagay Lupesko 
> wrote:
> >>
> >>> Based on Justin's feedback, can we resume the vote instead of
> cancelling
> >>> it?
> >>>
> >>> On Mon, Feb 11, 2019 at 12:02 AM Justin Mclean <
> jus...@classsoftware.com
> >>>
> >>> wrote:
> >>>
> >>>> Hi,
> >>>>
> >>>> In future don’t be so hasty to cancel a release vote, people's minds can
> >>>> be changed and a -1 is not a veto on a release.
> >>>>
> >>>> Thanks,
> >>>> Justin
> >>>>
> >>>>
> >>>> -
> >>>> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> >>>> For additional commands, e-mail: general-h...@incubator.apache.org
> >>>>
> >>>>
> >>>
> >>
>
>


Re: [RESTARTING][VOTE] Release Apache MXNet (incubating) version 1.4.0.rc2

2019-02-12 Thread Dave Fisher
Hi -

Third party vendor considerations do not matter. Are you voting +1 with your 
Apache hat on or your Amazon hat?

Regards,
Dave

> On Feb 11, 2019, at 10:16 PM, Lin Yuan  wrote:
> 
> +1 binding
> Horovod is going to release its 0.16.0 in the coming week with MXNet
> integration. We need to release 1.4.0 which includes all the dependencies
> for Horovod integration.
> 
> Best,
> 
> Lin
> 
> On Mon, Feb 11, 2019 at 9:30 PM Steffen Rochel 
> wrote:
> 
>> Dear community -
>> based on Justin's and community feedback I'm suggesting to restart the
>> vote.
>> Current status:
>> binding votes:
>> +1: 2 votes (Henri, Jason)
>> -1:  1 vote (Luciano)
>> 
>> non-binding:
>> +1: 1 vote (Kellen)
>> 
>> The community is investigating feedback from Luciano that the exclusion
>> file is too broad and potentially keeps files that can and must have
>> Apache license headers from being checked.
>> 
>> Regards,
>> Steffen
>> 
>> 
>> 
>> 
>> On Mon, Feb 11, 2019 at 10:08 AM Hagay Lupesko  wrote:
>> 
>>> Based on Justin's feedback, can we resume the vote instead of cancelling
>>> it?
>>> 
>>> On Mon, Feb 11, 2019 at 12:02 AM Justin Mclean >> 
>>> wrote:
>>> 
>>>> Hi,
>>>>
>>>> In future don’t be so hasty to cancel a release vote, people's minds can
>>>> be changed and a -1 is not a veto on a release.
>>>>
>>>> Thanks,
>>>> Justin
>>>>
>>>>
>>>> -
>>>> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
>>>> For additional commands, e-mail: general-h...@incubator.apache.org
>>>>
>>>>
>>> 
>> 



Re: First class support for MxNet?

2019-02-12 Thread Zach Boldyga
Really interesting stuff, Iblis. Thanks for sharing! I'm excited to stick
around and absorb :D

Zach Boldyga
Scalabull  |  Founder
1 (866) 846-8771 x 101


On Mon, Feb 11, 2019 at 6:25 AM Carin Meier  wrote:

> +100 on Iblis's thoughts:
>
> "We know tools and frameworks keep changing.
> People learn the lesson from making and attempting.
> It's just the path of the human technology evolution.
> The point is the ideas/experiences
> which this community is going to surprise you at."
>
> - Carin
>
>
> On Mon, Feb 11, 2019 at 9:08 AM iblis  wrote:
>
> > well, I'm not going to talk about technical stuff.
> > You can find some design concepts on doc or wiki.
> > (
> >
> https://mxnet.incubator.apache.org/versions/master/architecture/index.html
> > )
> >
> > For me, working on MXNet is a rare chance to verify my ideas of
> > a machine learning framework.
> > While implementing the MXNet Julia package, I can explicitly compare the
> > experience of MXNet with Flux's
> > ...and then start complaining about them. :p
> > I think comparison is a way to move forward.
> > So that's why I said I want to increase the diversity of DL tools in Julia.
> >
> > I like the spirit of portability in the MXNet community.
> > We welcome all language packages and are open-minded.
> > Although some languages might not be considered popular in ML/DL,
> > this community still keeps polishing them day in, day out.
> > Yeah, someone has to try it, compare and gain experience from this
> > process regardless of how the language has been evaluated in ML.
> > The experience is valuable.
> > (e.g. I think lack of function overloading is a disadvantage
> >   of Python; the file-based namespace does help for maintainability
> >   in Python.
> >   After doing some work in Julia, I can clearly point out the pros and cons.)
> >
> > From a long-term view... maybe twenty years from now,
> > none of the languages we are using now will be popular.
> > But I believe the meta-rules extracted from experience will still
> > apply.
> >
> > So.. why not have a Rust lib? maybe Rust's macro can do something crazy,
> > maybe.
> > e.g. the Julia package shows a more elegant way to stack a network than
> > Python, thanks to metaprogramming.
> >
> >   mlp = @mx.chain mx.Variable(:data) =>
> >     mx.FullyConnected(name=:fc1, num_hidden=128) =>
> >     mx.Activation(name=:relu1, act_type=:relu)   =>
> >     mx.FullyConnected(name=:fc2, num_hidden=64)  =>
> >     mx.Activation(name=:relu2, act_type=:relu)   =>
> >     mx.FullyConnected(name=:fc3, num_hidden=10)  =>
> >     mx.SoftmaxOutput(name=:softmax)
> >
> >
> > > Wondering where that leaves MxNet...
> >
> > Actually, I don't care about this issue.
> > We know tools and frameworks keep changing.
> > People learn the lesson from making and attempting.
> > It's just the path of the human technology evolution.
> > The point is the ideas/experiences
> > which this community is going to surprise you at.
> >
> >
> > Iblis Lin
> > 林峻頤
> >
> > On 2/11/19 12:04 PM, Zach Boldyga wrote:
> > > Those are compelling points! There's also another more recent follow-up
> > > from the Julia team:
> > https://julialang.org/blog/2018/12/ml-language-compiler
> > > .
> > >
> > > It seems that Julia will likely have its place in ML regardless of how
> > > other tools progress; the latest offerings from Julia/Flux are really
> > > compelling.
> > >
> > > Wondering where that leaves MxNet...
> > >
> > > Zach Boldyga
> > > Scalabull  |  Founder
> > > 1 (866) 846-8771 x 101
> > >
> >
>


Re: libjpegturbo

2019-02-12 Thread Sheng Zha
The MXNet pip package statically links against a libturbojpeg that's built from
source, not from the Debian package. The script for Linux and macOS can be found here:
https://github.com/apache/incubator-mxnet/blob/master/tools/dependencies/libturbojpeg.sh#L22

-sz

On 2019/02/12 07:46:30, Per da Silva  wrote: 
> Hello everyone,
> 
> I was wondering if there was any particular reason why we are building and
> testing mxnet with USE_LIBJPEG_TURBO=0. I noticed that we are shipping it
> with USE_LIBJPEG_TURBO=1 (eg. make/pip/pip_linux_cpu.mk).
> 
> I ran into issues trying to compile mxnet with the libjpegturbo flag on
> Ubuntu 16.04 (I was wondering if this was the reason). This came from an
> issue with libturbojpeg-dev package. There is a fix described on [1]. I've
> applied it in a PR, which I'm currently testing [2].
> 
> Cheers,
> 
> Per
> 
> [1] https://github.com/HaxeFoundation/hashlink/issues/147
> [2] https://github.com/apache/incubator-mxnet/pull/14127
> 


Re: [Announce] Runtime feature detection

2019-02-12 Thread Lin Yuan
Thanks, Pedro, for contributing this long-awaited feature. I can
immediately use it for the Horovod project.

Bravo!

Lin

On Tue, Feb 12, 2019 at 2:42 AM Pedro Larroy 
wrote:

> An update on this topic: Sheng just merged the refinements to the
> feature detection so it's now a single API call (
> https://github.com/apache/incubator-mxnet/pull/13964 ). Thank you,
> Sheng, for the reviews.
>
> Please use this functionality to check for capabilities of MXNet at
> runtime, such as CUDA, OpenCV, etc. This can simplify tests and
> automation in several places in the code.
>
> Iblis Lin is already preparing Julia support:
> https://github.com/apache/incubator-mxnet/pull/13992
>
> This is a PR that adds documentation on the feature and explains how
> to use it from Python:
> https://github.com/apache/incubator-mxnet/pull/14130
>
> Thanks.
>
> On Fri, Jan 25, 2019 at 7:08 PM Sheng Zha  wrote:
> >
> > Hi Pedro,
> >
> > Happy to help, though I was waiting for PR comments to be addressed.
> Currently the PR is close to complete, with some open comments to be
> resolved.
> >
> > -sz
> >
> > > On Jan 25, 2019, at 9:27 AM, Pedro Larroy <
> pedro.larroy.li...@gmail.com> wrote:
> > >
> > > That's Great! There's a PR that we should merge first which
> > > internalizes the enum inside the library as per Sheng's suggestion.
> > >
> > > https://github.com/apache/incubator-mxnet/pull/13964
> > >
> > > @Sheng could we merge the PR? so we can build on top of this feature?
> > > It's badly needed for tests suites etc.
> > > Thanks a lot!
> > >
> > > Pedro.
> > >
> > >
> > >> On Fri, Jan 25, 2019 at 2:22 PM Iblis Lin 
> wrote:
> > >>
> > >> Hi,
> > >>
> > >> I added the Julia binding for it.
> > >> PR is here:
> > >> https://github.com/apache/incubator-mxnet/pull/13992
> > >>
> > >> Iblis Lin
> > >> 林峻頤
> > >>
> > >>> On 1/23/19 12:39 AM, Pedro Larroy wrote:
> > >>> Hi
> > >>>
> > >>> I'm pleased to announce that runtime feature detection has been
> merged
> > >>> in master, thanks to Aaron for the merge and the many reviewers who
> > >>> gave feedback on the PR.  (
> > >>> https://github.com/apache/incubator-mxnet/pull/13549 )
> > >>>
> > >>> As the functionality matures and is exposed through other bindings,
> > >>> please feel free to try and use it to build on it, for example for
> > >>> easier test suite selection depending on what's compiled in the
> > >>> engine.
> > >>>
> > >>> Usage examples:
> > >>>
> > >>> $ ipython
> > >>> In [4]: import mxnet.mxfeatures
> > >>>
> > >>> In [5]: mxnet.mxfeatures.features_enabled()
> > >>> Out[5]:
> > >>> [... 11 Feature enum values; their reprs were lost in the archive ...]
> > >>>
> > >>> In [6]: mxnet.mxfeatures.features_enabled_str()
> > >>> Out[6]: 'CPU_SSE, CPU_SSE2, CPU_SSE3, CPU_SSE4_1, CPU_SSE4_2,
> CPU_AVX,
> > >>> F16C, BLAS_OPEN, LAPACK, SIGNAL_HANDLER, DEBUG'
> > >>>
> > >>> see also: help(mxnet.mxfeatures)
> > >>>
> > >>> Regards.
> > >>>
>


Berlin recurring user group

2019-02-12 Thread Per da Silva
Hello Dev,

This is a friendly reminder that Berlin office hours will be held today at
6pm-7pm (CET) / 9am-10am (PST). More info here:
https://cwiki.apache.org/confluence/display/MXNET/Apache+MXNet+%28Incubating%29+User+Groups+recurring+meetings

https://chime.aws/1611907695

Thanks!

Per


Re: Benchmarking MXNet with different compilers and different OpenMP implementations (results)

2019-02-12 Thread Aaron Markham
This is really great research. I've often wondered what the difference
really is, and why it has to be so complicated. It seems the answer is that
there isn't much difference and it shouldn't be as complex.
As for your next steps, would you propose that CMake be brought up to
parity? It seems strange that it causes slowness, and if so, it shouldn't be
recommended for now.
Also, testing with Windows compilers might be quite important, as install
stats suggest a significant portion of users are on Windows. Wouldn't this
nudge the decision of what to use as a rule going forward?
I ran into this submodule OpenMP issue on Windows myself. How does that get
fixed? Do we have to repackage all of the submodules to make sure they use
the recommended implementation or what the system expects?

Cheers,
Aaron

On Tue, Feb 12, 2019, 04:37 Anton Chernov  wrote:

> Dear MXNet community,
>
> Due to multiple problems related to OpenMP and a stale proposed change [1], we
> have been working on gathering performance data on the impact of using
> different OpenMP implementations with MXNet (great thanks to Stanislav
> Tsukrov for the hard work). The results can be found here [2].
>
> As a short summary of the investigation: The difference between different
> compilers is insignificant. Native OpenMP implementations (more or less
> recent) perform equally (<5% difference). See more details in the document.
>
> Please review the document and share your thoughts on the topic.
>
> Thanks!
>
> Best
> Anton
>
> [1]
>
> https://lists.apache.org/thread.html/4827f0f742b6e7e070da350ea81226d059401527f3072ce8b33c1fdf@
> 
> [2] https://cwiki.apache.org/confluence/x/2wclBg
>


Re: [Discussion] Remove bundled llvm OpenMP

2019-02-12 Thread Anton Chernov
I would like to propose a possible alternative solution for consideration.

If keeping LLVM OpenMP as a submodule is unavoidable, one could make the
following adjustments:

Since compilers try to find their own OpenMP library implicitly, MXNet
needs to ensure that only the bundled version is found. Therefore, during
the build and also during deployment, this library has to provide symlinks
for each possible compiler that would link to the built artifact, i.e.

libiomp.so -> libgomp.so -> libomp.so

The MKLML iomp would need to be hidden and removed as well.
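
The symlink idea above can be sketched in Python as follows. This is only an
illustration under assumptions: the file names are taken from the chain as
written (real runtimes typically carry version suffixes), and
`link_openmp_aliases` is a hypothetical helper, not part of the MXNet build.

```python
import os
import tempfile


def link_openmp_aliases(lib_dir, chain=("libiomp.so", "libgomp.so", "libomp.so")):
    """Make every name in `chain` resolve to the last entry (the built artifact).

    Assumption: the names mirror the chain written above and are placeholders.
    """
    # Walk the chain pairwise so that each alias points at the next name.
    for alias, target in zip(chain, chain[1:]):
        alias_path = os.path.join(lib_dir, alias)
        if os.path.islink(alias_path):
            os.remove(alias_path)  # replace a stale link from an earlier run
        elif os.path.exists(alias_path):
            # A real library with this name would shadow the bundled one;
            # refuse to delete it silently.
            raise FileExistsError(alias_path)
        # Relative target, so the directory can be relocated as a whole.
        os.symlink(target, alias_path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as lib_dir:
        # Empty stand-in file for the built OpenMP runtime.
        open(os.path.join(lib_dir, "libomp.so"), "wb").close()
        link_openmp_aliases(lib_dir)
        for name in ("libiomp.so", "libgomp.so"):
            resolved = os.path.realpath(os.path.join(lib_dir, name))
            print(name, "->", os.path.basename(resolved))
```

The helper refuses to overwrite a regular file on purpose: a competing
runtime already sitting under one of the alias names is exactly the mixing
problem being discussed, so it seems safer to surface it than to hide it.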

On Windows it would be a different story, but as can be seen in [1], the bundled
OpenMP was not included in the Windows build anyway.

Alternatively: always use the iomp provided by the MKLML distribution [2] (with
the same symlinking trick, though). This could potentially work on Windows as well.

Best
Anton

[1]
https://github.com/apache/incubator-mxnet/blob/8a63bdecf2d9f12d34fe5874957ae4c867eb5f5b/CMakeLists.txt#L408-L410
[2] https://github.com/intel/mkl-dnn/releases

On Tue, Feb 12, 2019 at 11:22, Anton Chernov wrote:

> Recent benchmarking results have been published here [1]. Experiments
> compare different OpenMP implementations as well as binaries compiled with
> different compilers including GCC, Clang and ICC.
>
> During experimentation, another issue with mixing libraries was
> identified; it is described here [2].
>
> Best
> Anton
>
> [1] https://cwiki.apache.org/confluence/x/2wclBg
> [2]
> https://github.com/apache/incubator-mxnet/issues/14087#issuecomment-461734041
>
>
> On Sun, Dec 9, 2018 at 16:28, Anton Chernov wrote:
>
>> Hi Chris,
>>
>> Following up on the issue, are all things resolved in the discussion?
>>
>> If yes, I kindly ask you to reopen this PR and remove ‘requesting
>> changes’ status:
>> https://github.com/apache/incubator-mxnet/pull/12160
>>
>> Thank you.
>>
>>
>> Best
>> Anton
>>
>>
>> On Tue, Nov 27, 2018 at 17:15, Anton Chernov wrote:
>>
>>> Another thing to take into consideration:
>>>
>>> All python artefacts that are created (PyPi) are built with make and are
>>> not using the bundled OpenMP library.
>>>
>>> One step for the switch to CMake to happen is the approval and merging
>>> of the mentioned PR:
>>>
>>> https://github.com/apache/incubator-mxnet/pull/12160
>>>
>>> If there are no other objections I kindly ask Chris Olivier to remove
>>> his 'requesting changes' veto on it to unblock the CMake overhaul work.
>>>
>>> Thank you.
>>>
>>> Best
>>> Anton
>>>
>>> On Thu, Nov 22, 2018 at 17:11, Anton Chernov wrote:
>>>

>>>> Thank you for your answer, Chris.
>>>>
>>>> > The whole “mixing omp libraries” is something that occurs in production
>>>> > every day and certainly in everything that uses mkl.
>>>>
>>>> I'm afraid this statement is wrong. Intel MKL-DNN strictly ensures that
>>>> this mixture is not happening:
>>>>
>>>> "Intel MKL-DNN uses OpenMP* for parallelism and requires an OpenMP
>>>> runtime library to work. As different OpenMP runtimes may not be binary
>>>> compatible it's important to ensure that only one OpenMP runtime is used
>>>> throughout the application. Having more than one OpenMP runtime initialized
>>>> may lead to undefined behavior resulting in incorrect results or crashes."
>>>> [1]
>>>>
>>>> That is why 2 different MKLML libraries are provided:
>>>>
>>>> lib/libmklml_gnu.so   | Intel MKL small library for GNU* OpenMP runtime
>>>> lib/libmklml_intel.so | Intel MKL small library for Intel(R) OpenMP runtime
>>>>
>>>> > is the suggestion that libiomp be removed from mkl?
>>>>
>>>> That is certainly not my suggestion.
>>>>
>>>> > have you spoken with intel? have you consulted Intel at all?
>>>>
>>>> Yes, I have asked for comments on the issue.
>>>>
>>>> > “hard to debug random crash”. you’re seeing an assertion which is
>>>> > probably ...
>>>>
>>>> I'm seeing the result of undefined behaviour. And I want to put
>>>> emphasis on the following statement:
>>>>
>>>> Regardless of whether there is a particular reason for the assert,
>>>> it is a result of behaviour that should not happen. There are valid ways
>>>> to use llvm OpenMP in MXNet and the current way is not one of them.
>>>>
>>>> > The lack of root-causing the problem and knee-jerk solution here
>>>> > makes me uncomfortable.
>>>>
>>>> I hope that my efforts to highlight the problems reach you and mitigate
>>>> your discomfort.
>>>>
>>>> > if you want to see performance differences there’s an environment
>>>> > variable you can set in the mxnet omp tuning code that will print
>>>> > overhead and execution times for the current omp library.
>>>>
>>>> I don't want to see performance differences in the current OpenMP
>>>> library. I want to remove the current OpenMP library and use the one
>>>> provided by the compiler.
>>>>
>>>>
>>>>
>>>> Best
>>>> Anton
>>>>
>>>> [1] https://github.com/intel/mkl-dnn/blame/master/README.md#L261-L265
>>>>
>>>> On Thu, Nov 22, 2018 at 16:50, Chris Olivier wrote:
>>>>
>>>>> Do you not work on CI mostly? My apologies for thinking that was some
>>>>> sort

Benchmarking MXNet with different compilers and different OpenMP implementations (results)

2019-02-12 Thread Anton Chernov
Dear MXNet community,

Due to multiple problems related to OpenMP and a stale proposed change [1], we
have been working on gathering performance data on the impact of using
different OpenMP implementations with MXNet (great thanks to Stanislav
Tsukrov for the hard work). The results can be found here [2].

As a short summary of the investigation: The difference between different
compilers is insignificant. Native OpenMP implementations (more or less
recent) perform equally (<5% difference). See more details in the document.

Please review the document and share your thoughts on the topic.

Thanks!

Best
Anton

[1]
https://lists.apache.org/thread.html/4827f0f742b6e7e070da350ea81226d059401527f3072ce8b33c1fdf@

[2] https://cwiki.apache.org/confluence/x/2wclBg


Re: [Announce] Runtime feature detection

2019-02-12 Thread Pedro Larroy
An update on this topic: Sheng just merged the refinements to the
feature detection so it's now a single API call (
https://github.com/apache/incubator-mxnet/pull/13964 ). Thank you,
Sheng, for the reviews.

Please use this functionality to check for capabilities of MXNet at
runtime, such as CUDA, OpenCV, etc. This can simplify tests and
automation in several places in the code.
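
As a rough illustration of the test-selection idea, here is a minimal Python
sketch. It assumes a comma-separated feature string in the format shown in
the quoted `features_enabled_str()` output further down this thread;
`parse_features`, `select_tests`, the test names and the `CUDA` feature name
are hypothetical and not part of MXNet.

```python
def parse_features(feature_str):
    """Split a comma-separated feature string into a set of names."""
    return {name.strip() for name in feature_str.split(",") if name.strip()}


def select_tests(enabled, requirements):
    """Return the tests whose required features are all enabled.

    `requirements` maps a test name to the feature names it needs.
    """
    return sorted(
        test for test, needed in requirements.items() if set(needed) <= enabled
    )


if __name__ == "__main__":
    # Feature string copied from the usage example quoted in this thread.
    enabled = parse_features(
        "CPU_SSE, CPU_SSE2, CPU_SSE3, CPU_SSE4_1, CPU_SSE4_2, CPU_AVX, "
        "F16C, BLAS_OPEN, LAPACK, SIGNAL_HANDLER, DEBUG"
    )
    # Hypothetical test names and requirements, for illustration only.
    requirements = {
        "test_lapack_ops": ["LAPACK"],
        "test_gpu_ops": ["CUDA"],
    }
    print(select_tests(enabled, requirements))  # ['test_lapack_ops']
```

In a real test suite the string would come from the runtime feature API
rather than a literal, and the selection would more likely be expressed as
skip markers than as a list of names.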

Iblis Lin is already preparing Julia support:
https://github.com/apache/incubator-mxnet/pull/13992

This is a PR that adds documentation on the feature and explains how
to use it from Python:
https://github.com/apache/incubator-mxnet/pull/14130

Thanks.

On Fri, Jan 25, 2019 at 7:08 PM Sheng Zha  wrote:
>
> Hi Pedro,
>
> Happy to help, though I was waiting for PR comments to be addressed. 
> Currently the PR is close to complete, with some open comments to be resolved.
>
> -sz
>
> > On Jan 25, 2019, at 9:27 AM, Pedro Larroy  
> > wrote:
> >
> > That's Great! There's a PR that we should merge first which
> > internalizes the enum inside the library as per Sheng's suggestion.
> >
> > https://github.com/apache/incubator-mxnet/pull/13964
> >
> > @Sheng could we merge the PR? so we can build on top of this feature?
> > It's badly needed for tests suites etc.
> > Thanks a lot!
> >
> > Pedro.
> >
> >
> >> On Fri, Jan 25, 2019 at 2:22 PM Iblis Lin  wrote:
> >>
> >> Hi,
> >>
> >> I added the Julia binding for it.
> >> PR is here:
> >> https://github.com/apache/incubator-mxnet/pull/13992
> >>
> >> Iblis Lin
> >> 林峻頤
> >>
> >>> On 1/23/19 12:39 AM, Pedro Larroy wrote:
> >>> Hi
> >>>
> >>> I'm pleased to announce that runtime feature detection has been merged
> >>> in master, thanks to Aaron for the merge and the many reviewers who
> >>> gave feedback on the PR.  (
> >>> https://github.com/apache/incubator-mxnet/pull/13549 )
> >>>
> >>> As the functionality matures and is exposed through other bindings,
> >>> please feel free to try and use it to build on it, for example for
> >>> easier test suite selection depending on what's compiled in the
> >>> engine.
> >>>
> >>> Usage examples:
> >>>
> >>> $ ipython
> >>> In [4]: import mxnet.mxfeatures
> >>>
> >>> In [5]: mxnet.mxfeatures.features_enabled()
> >>> Out[5]:
> >>> [... 11 Feature enum values; their reprs were lost in the archive ...]
> >>>
> >>> In [6]: mxnet.mxfeatures.features_enabled_str()
> >>> Out[6]: 'CPU_SSE, CPU_SSE2, CPU_SSE3, CPU_SSE4_1, CPU_SSE4_2, CPU_AVX,
> >>> F16C, BLAS_OPEN, LAPACK, SIGNAL_HANDLER, DEBUG'
> >>>
> >>> see also: help(mxnet.mxfeatures)
> >>>
> >>> Regards.
> >>>


Re: [Discussion] Remove bundled llvm OpenMP

2019-02-12 Thread Anton Chernov
Recent benchmarking results have been published here [1]. Experiments
compare different OpenMP implementations as well as binaries compiled with
different compilers including GCC, Clang and ICC.

During experimentation, another issue with mixing libraries was
identified; it is described here [2].

Best
Anton

[1] https://cwiki.apache.org/confluence/x/2wclBg
[2]
https://github.com/apache/incubator-mxnet/issues/14087#issuecomment-461734041


On Sun, Dec 9, 2018 at 16:28, Anton Chernov wrote:

> Hi Chris,
>
> Following up on the issue, are all things resolved in the discussion?
>
> If yes, I kindly ask you to reopen this PR and remove ‘requesting changes’
> status:
> https://github.com/apache/incubator-mxnet/pull/12160
>
> Thank you.
>
>
> Best
> Anton
>
>
> On Tue, Nov 27, 2018 at 17:15, Anton Chernov wrote:
>
>> Another thing to take into consideration:
>>
>> All python artefacts that are created (PyPi) are built with make and are
>> not using the bundled OpenMP library.
>>
>> One step for the switch to CMake to happen is the approval and merging of
>> the mentioned PR:
>>
>> https://github.com/apache/incubator-mxnet/pull/12160
>>
>> If there are no other objections I kindly ask Chris Olivier to remove his
>> 'requesting changes' veto on it to unblock the CMake overhaul work.
>>
>> Thank you.
>>
>> Best
>> Anton
>>
>> On Thu, Nov 22, 2018 at 17:11, Anton Chernov wrote:
>>
>>>
>>> Thank you for your answer, Chris.
>>>
>>> > The whole “mixing omp libraries” is something that occurs in production
>>> every day and certainly in everything that uses mkl.
>>>
>>> I'm afraid this statement is wrong. Intel MKL-DNN strictly ensures that
>>> this mixture is not happening:
>>>
>>> "Intel MKL-DNN uses OpenMP* for parallelism and requires an OpenMP
>>> runtime library to work. As different OpenMP runtimes may not be binary
>>> compatible it's important to ensure that only one OpenMP runtime is used
>>> throughout the application. Having more than one OpenMP runtime initialized
>>> may lead to undefined behavior resulting in incorrect results or crashes."
>>> [1]
>>>
>>> That is why 2 different MKLML libraries are provided:
>>>
>>> lib/libmklml_gnu.so  | Intel MKL small library for GNU* OpenMP runtime
>>> lib/libmklml_intel.so | Intel MKL small library for Intel(R) OpenMP
>>> runtime
>>>
>>> > is the suggestion that libiomp be removed from mkl?
>>>
>>> That is certainly not my suggestion.
>>>
>>> > have you spoken with intel? have you consulted Intel at all?
>>>
>>> Yes, I have asked for comments on the issue.
>>>
>>> > “hard to debug random crash”. you’re seeing an assertion which is
>>> probably ...
>>>
>>> I'm seeing the result of undefined behaviour. And I want to put emphasis
>>> on the following statement:
>>>
>>> Regardless of whether there is a particular reason for the assert, it
>>> is a result of behaviour that should not happen. There are valid ways
>>> to use llvm OpenMP in MXNet and the current way is not one of them.
>>>
>>> > The lack of root-causing the problem and knee-jerk solution here makes
>>> me
>>> uncomfortable.
>>>
>>> I hope that my efforts to highlight the problems reach you and mitigate
>>> your discomfort.
>>>
>>> > if you want to see performance differences there’s an environment
>>> variable
>>> you can set in the mxnet omp tuning code that will print overhead and
>>> execution times for the current omp library.
>>>
>>> I don't want to see performance differences in the current OpenMP
>>> library. I want to remove the current OpenMP library and use the one
>>> provided by the compiler.
>>>
>>>
>>>
>>> Best
>>> Anton
>>>
>>> [1] https://github.com/intel/mkl-dnn/blame/master/README.md#L261-L265
>>>
>>> On Thu, Nov 22, 2018 at 16:50, Chris Olivier wrote:
>>>
>>>> Do you not work on CI mostly? My apologies for thinking that was some
>>>> sort of team effort between you and a few others that were passionate
>>>> about CI keeping the CI system running smoothly.
>>>>
>>>> You have source code, you have the line the assertion is on. If you
>>>> can’t describe what’s going wrong that causes the assertion, then I
>>>> don’t really have anything more to add to this conversation beyond
>>>> what’s below:
>>>>
>>>> The whole “mixing omp libraries” is something that occurs in production
>>>> every day and certainly in everything that uses mkl.  It may
>>>> occasionally cause problems for some edge cases when there is
>>>> super-complex linking strategies and dynamic loading.  But this is not
>>>> one of those edge cases. Mostly blaming this is a red herring for other
>>>> thread-related problems and people switch omp library and the timing of
>>>> their code changes and they stop seeing the problem. I’ve spent my
>>>> entire career doing heavily multithreaded c++ development and I’ve seen
>>>> that a million times.  is the suggestion that libiomp be removed from
>>>> mkl? have you spoken with intel? have you consulted Intel at all?
>>>>
>>>> and what you are seeing isn’t some “hard to debu

Re: RE: Third-party package tests for MXNet nightly builds

2019-02-12 Thread Felix Hieber
Thank you all for your responses so far.
Sheng, I understand your concerns about tight coupling, but would this
separate CI pipeline really create such a problematic coupling, given that
I was proposing this pipeline to be entirely non-blocking to MXNet?
Monitoring the upstream library (MXNet) would still be up to the
maintainers of downstream projects, as well as the creation of
corresponding issues (and/or fixes). Nevertheless, there would be benefit
from a *timely* (and automated) notification that some recent change in
MXNet broke your tests. This is mostly motivated by the fact that the MXNet
release process is slow, and once a regression is part of a release it
takes quite a while to be removed afterwards.
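
To make "non-blocking" concrete, here is a minimal Python sketch of what such
a job could look like. Everything in it is an assumption for illustration:
`run_downstream_checks` and `report` are hypothetical helpers, the demo
commands are stand-ins, and a real job would first install the nightly MXNet
build and then call each project's own test suite.

```python
import subprocess
import sys


def run_downstream_checks(commands, timeout=600):
    """Run each downstream test command and collect failures without raising.

    `commands` maps a project name to an argv list. Returns a list of
    (project, reason) pairs for the runs that failed.
    """
    failures = []
    for project, argv in commands.items():
        try:
            result = subprocess.run(
                argv, capture_output=True, text=True, timeout=timeout
            )
        except (OSError, subprocess.TimeoutExpired) as exc:
            failures.append((project, f"could not complete: {exc}"))
            continue
        if result.returncode != 0:
            failures.append((project, f"exit code {result.returncode}"))
    return failures


def report(failures):
    """Format a notification; the caller decides where to send it."""
    if not failures:
        return "all downstream checks passed"
    lines = [f"{project}: {reason}" for project, reason in failures]
    return "downstream regressions detected:\n" + "\n".join(lines)


if __name__ == "__main__":
    # Stand-in commands that only simulate a passing and a failing suite.
    demo = {
        "passing_project": [sys.executable, "-c", "pass"],
        "failing_project": [sys.executable, "-c", "raise SystemExit(1)"],
    }
    # The script itself finishes normally either way, so a failing downstream
    # suite produces a notification instead of failing the upstream pipeline.
    print(report(run_downstream_checks(demo)))
```

Keeping the reporting separate from the running leaves the ownership question
open: the same failures list could go to the downstream maintainers, to an
issue tracker, or to both.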

In my opinion the main question is about resources and ownership. If MXNet
does not want to spend the resources to run tests for downstream projects
on a regular basis, then I agree with you that making the MXNet CI solution
available to other users (maintainers of downstream projects) would indeed
be very useful.

Thanks,
Felix

On Mon, Feb 11, 2019 at 8:58 PM Sheng Zha  wrote:

> Thanks for the proposal, Felix. On one hand, I agree that richer workload
> from the ecosystem helps find issues in MXNet early. On the other hand, I'm
> concerned about tightly coupling the development of projects.
>
> Monitoring the upstream library and addressing problems for upgrading
> dependency should be the concern of the downstream projects. These projects
> own the effort of having proper testing for any changes needed, including
> version upgrade. Having these projects in MXNet CI means the responsibility
> of maintaining these projects partly transfers to the MXNet's contributors,
> which doesn't seem right. It blurs the line of who's responsible for
> debugging, isolating the problem, making minimum reproducible sample code,
> and posting the fix.
>
> That said, I think there's much opportunity for reusing the current code
> for MXNet CI. Projects in MXNet's ecosystem would likely benefit from
> MXNet's CI solution so that each individual community project can identify
> issues early. (And from offline chats with Chance and his team members, I
> think this is what's already on their minds.)
>
> -sz
>
> On 2019/02/11 16:46:06, "Zhao, Patric"  wrote:
> > Agree to track the 3rd party packages which make MXNet more prosperous :)
> >
> > Before building the CI, I suggest creating the related labels, like
> > sockeye, gluonCV, gluonNLP, etc., on GitHub and giving high priority
> > to these issues/PRs.
> > So the issues/PRs can be fixed quickly and these important applications
> > would not be blocked again.
> >
> > We can help with the performance/backend/operator related issues as well :)
> >
> > Thanks,
> >
> > --Patric
> >
> >
> >
> > > -Original Message-
> > > From: Chance Bair [mailto:chanceb...@gmail.com]
> > > Sent: Monday, February 11, 2019 11:28 PM
> > > To: dev@mxnet.incubator.apache.org
> > > Cc: d...@mxnet.apache.org
> > > Subject: Re: Third-party package tests for MXNet nightly builds
> > >
> > > Hi Felix,
> > >
> > > Thank you for the request!  The CI team is currently working on
> improving
> > > our benchmarking platform and will evaluate this request carefully.
> > >
> > > Chance Bair
> > >
> > >
> > >
> > > On Mon, Feb 11, 2019 at 3:59 PM Carin Meier 
> > > wrote:
> > >
> > > > Can't speak for the CI team, but in general I think that it is good
> idea.
> > > >
> > > > On a separate note, I've been playing around with Sockeye recently
> and
> > > > it's great! Awesome work and glad to see MXNet used for such cutting
> > > > edge use cases.
> > > > I'd love to see closer collaboration with the Sockeye team and MXNet
> > > > for innovation, cross pollination, and evangelization of what MXNet
> can
> > > do .
> > > >
> > > > Best,
> > > > Carin
> > > >
> > > > On Mon, Feb 11, 2019 at 6:01 AM Felix Hieber  >
> > > > wrote:
> > > >
> > > > > Hello dev@,
> > > > >
> > > > >
> > > > >
> > > > > I would like to ask around whether there is interest in the
> > > > > community to test nightly builds of MXNet with third-party packages
> > > > > that depend on
> > > > MXNet
> > > > > and act as early adopters. The goal is to catch regressions in
> MXNet
> > > > early,
> > > > > allowing time for bug fixes before a new release is cut.
> > > > >
> > > > >
> > > > >
> > > > > For example, Sockeye  is a
> > > > > customer
> > > > of
> > > > > new MXNet releases and aims to upgrade to latest MXNet as soon as
> > > > possible.
> > > > > Typically, we update our dependency on MXNet once a new release
> > > > > becomes available (through pip). However, there have been cases
> > > > > where new
> > > > releases
> > > > > of MXNet introduced regressions undetected by MXNet tests (hence
> > > > > passing the release process): the latest example is this issue
> > > > > , which
> may
> > > > > have been introduced already back in