possible bug in gpu_topology.h ComputeDepth ?

2018-11-01 Thread Pedro Larroy
Hi

I'm investigating this issue:
https://github.com/apache/incubator-mxnet/issues/12994

To me this code seems suspicious, as it doesn't do what is stated in the
comment.

https://github.com/apache/incubator-mxnet/blob/master/src/kvstore/gpu_topology.h#L577

I don't think the depth of the binary tree is calculated correctly. For
example, a tree of three nodes should have two levels, but a tree of four
nodes should have three. A tree of 0 should have 0.

Any ideas whether this is indeed buggy, or is there something hidden I'm missing?

Test code to check:

#include <iostream>
using namespace std;

// Function under test: returns depth + 1 for the first depth with n <= 2 << depth.
inline int ComputeDepth(int n) {
  for (int depth = 0; depth < 16; ++depth) {
    int num = 2 << depth;
    if (n <= num)
      return depth + 1;
  }
  return 0;
}

int main(int argc, char *argv[])
{
  // Print the computed depth for n = 0..63.
  for (int i = 0; i < 64; ++i)
    cout << "ComputeDepth(" << i << ") = " << ComputeDepth(i) << endl;
}


ComputeDepth(0) = 1
ComputeDepth(1) = 1
ComputeDepth(2) = 1
ComputeDepth(3) = 2
ComputeDepth(4) = 2
ComputeDepth(5) = 3
ComputeDepth(6) = 3
ComputeDepth(7) = 3
ComputeDepth(8) = 3
ComputeDepth(9) = 4
ComputeDepth(10) = 4
ComputeDepth(11) = 4
ComputeDepth(12) = 4
ComputeDepth(13) = 4
ComputeDepth(14) = 4
ComputeDepth(15) = 4
ComputeDepth(16) = 4
ComputeDepth(17) = 5
ComputeDepth(18) = 5
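
For comparison, here is a minimal Python sketch of the depth the message expects, assuming n counts every node of a complete binary tree (so 3 nodes give 2 levels, 4 nodes give 3, and 0 nodes give 0). The names `expected_depth` and `compute_depth_as_posted` are illustrative only; this is not the code in gpu_topology.h, and the intended meaning of n there may differ.

```python
def expected_depth(n: int) -> int:
    """Number of levels in a complete binary tree holding n nodes."""
    if n < 0:
        raise ValueError("node count must be non-negative")
    # int.bit_length() equals floor(log2(n)) + 1 for n > 0, and 0 for n == 0.
    return n.bit_length()


def compute_depth_as_posted(n: int) -> int:
    """Python transcription of the C++ test code in the message."""
    for depth in range(16):
        if n <= (2 << depth):
            return depth + 1
    return 0
```

Calling both functions for n = 0, 3 and 4 shows where they disagree: the posted code yields 1, 2, 2, while the expectation stated in the message is 0, 2, 3.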


Re: MXNet - Label Bot functionality

2018-11-01 Thread Anirudh Acharya
Hi Harsh,

Thanks for working on this. This will be very helpful for people who triage
issues and review PRs regularly.

A few concerns about this design document -
https://cwiki.apache.org/confluence/display/MXNET/Machine+Learning+Based+GitHub+Bot
and
the conversation in the comment section

   1. As the scope of the label bot increases, the need for safety checks
   on who the label bot listens to becomes important. Currently the bot just
   adds labels. You have proposed that the bot also be allowed to remove and
   update labels. And I think allowing the bot to close issues (with a command
   like @mxnet-label-bot close) is under discussion in the comments section. This
   opens up a serious security flaw: anyone who wishes to abuse this system
   can randomly start closing issues or removing labels. You need to come up
   with a solution so that the bot does not listen to random strangers on the
   internet.
   2. Also, you seem to have linked a document that goes to an internal
   Amazon website; please remove that.
   [image: Screen Shot 2018-11-01 at 2.31.13 PM.png]
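
One possible shape for the safety check in point 1 is an allow-list consulted before any destructive command runs. The sketch below is only an illustration in Python: `TRUSTED_USERS`, `parse_command` and `is_allowed` are hypothetical names, the user names are placeholders, and the choice to leave `add` open to everyone while restricting `remove`, `update` and `close` is an assumption, not something the design document specifies.

```python
import ast
import re

# Hypothetical allow-list; a real deployment might derive this from the
# project's committer list instead of hard-coding names.
TRUSTED_USERS = {"committer-a", "committer-b"}

# Assumption: adding labels stays open, destructive actions are restricted.
OPEN_ACTIONS = {"add"}
RESTRICTED_ACTIONS = {"remove", "update", "close"}

# Matches e.g. "@mxnet-label-bot, remove ['label1', 'label2']"
# and "@mxnet-label-bot close".
COMMAND_RE = re.compile(r"@mxnet-label-bot,?\s+(\w+)\s*(\[.*?\])?")


def parse_command(comment: str):
    """Return (action, labels) from a comment, or None if no valid command is found."""
    match = COMMAND_RE.search(comment)
    if match is None:
        return None
    action = match.group(1).lower()
    labels = []
    if match.group(2):
        try:
            # literal_eval only accepts Python literals, so no code is executed.
            parsed = ast.literal_eval(match.group(2))
        except (ValueError, SyntaxError):
            return None
        if not isinstance(parsed, list) or not all(isinstance(x, str) for x in parsed):
            return None
        labels = parsed
    return action, labels


def is_allowed(user: str, action: str) -> bool:
    """Decide whether the bot should act on this user's request."""
    if action in OPEN_ACTIONS:
        return True
    if action in RESTRICTED_ACTIONS:
        return user in TRUSTED_USERS
    return False  # unknown actions are ignored
```

With this sketch, a comment from an unknown user asking to remove labels parses correctly but is rejected by `is_allowed`, while the same comment from a listed user is accepted.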


Thanks
Anirudh


On Thu, Oct 18, 2018 at 1:51 PM Harsh Patel 
wrote:

> Hey,
> After having my PR vetted and reviewed by some contributors, I would like
> to move forward into the stage of putting this into production. I am asking
> for MXNet committers to take a look at my PR regarding the Label Bot.
> https://github.com/MXNetEdge/mxnet-infrastructure/pull/15. This will also
> require access for a webhook - let's set this into motion. Thanks.
>
> Best,
> -Harsh
>
> On Mon, Oct 15, 2018 at 4:05 PM Piyush Ghai  wrote:
>
> > Hi Harsh,
> >
> > Good job! This is super cool! Especially bringing down the response time
> > to under 20 seconds.
> >
> > Thanks,
> > Piyush
> >
> >
> > > On Oct 15, 2018, at 3:49 PM, Qing Lan  wrote:
> > >
> > > Hi Harsh,
> > >
> > > This new label bot design looks great! I would like to encourage people
> > to review it and move forward to benefit the MXNet community.
> > > Since this new design needs webhook support from Apache, let's go
> > through the following steps to get this done:
> > >
> > > 1. Demo and contributors review stage: all contributors are encouraged
> > to review the PR here:
> > > https://github.com/MXNetEdge/mxnet-infrastructure/pull/15 and leave
> > your thoughts so Harsh can apply them in his design.
> > > 2. Committers review stage: Once all contributors think the design is
> > good to go, let's get committers involved to get a review.
> > > 3. Committers send request to Apache Infra to get the webhook setup.
> > > > 4. Harsh finally deploys the model and all of us can use it in
> > > the incubator-mxnet repo!
> > >
> > > > Some fun facts I would like to share:
> > > > 1. This new bot can recommend labels and reply to the people who file issues!
> > > > 2. Its response time went from 5 minutes to less than 20 seconds
> > >
> > > Thanks,
> > > Qing
> > >
> > > On 10/15/18, 11:11 AM, "Harsh Patel" 
> > wrote:
> > >
> > >Hey,
> > > >I have a demo available that users and developers can play around with --
> > > >this is in regards to the post I had made regarding the updated label bot
> > > >functionality. This is available on my fork (
> > > >https://github.com/harshp8l/incubator-mxnet) if the developers would be
> > > >able to provide feedback that would be great.
> > > >The updated usage of this label bot:
> > > >To add labels: @mxnet-label-bot, add ['label1', 'label2']
> > > >To remove labels: @mxnet-label-bot, remove ['label1', 'label2']
> > > >To update labels: @mxnet-label-bot, update ['label3', 'label4']
> > > >(warning: with update - this will remove all other labels for a specific
> > > >issue and update only with the labels the user specifies). Thanks.
> > >
> > >My PR for reference:
> > >https://github.com/MXNetEdge/mxnet-infrastructure/pull/15
> > >
> > >My Design:
> > >
> >
> https://cwiki.apache.org/confluence/display/MXNET/Machine+Learning+Based+GitHub+Bot
> > >
> > >Best,
> > >-Harsh
> > >
> > >On Mon, Oct 15, 2018 at 12:54 AM Hagay Lupesko 
> > wrote:
> > >
> > >> +1
> > >> Thanks for the contribution!
> > >>
> > >> On Fri, Oct 12, 2018 at 1:41 AM kellen sunderland <
> > >> kellen.sunderl...@gmail.com> wrote:
> > >>
> > >>> Awesome work!  Many thanks.
> > >>>
> > >>> On Fri, Oct 12, 2018, 1:19 AM Harsh Patel <
> harshpatel081...@gmail.com>
> > >>> wrote:
> > >>>
> >  Hey,
> >  I am looking to contribute to MXNet. I have a working implementation
> > >>> based
> >  on my proposed design structure according to this wiki page (
> > 
> > 
> > >>>
> > >>
> >
> https://cwiki.apache.org/confluence/display/MXNET/Machine+Learning+Based+GitHub+Bot
> >  )
> >  - under 7.
> >  I have provided users with functionality allowing for adding,
> > updating,
> > >>> and
> >  deleting labels for our issues. The response time with the bot to
> > >> provide
> >  the aforementioned functionality has been reduced as well. 

Re: [Discuss] Feature detection at runtime / test skipping depending on features

2018-11-01 Thread Anton Chernov
Great idea. I could see the following checks being added:

* Is OpenMP enabled?
* Is CUDA enabled? (though already available as 0 for GPU count)
* Is NCCL enabled?
* CuDNN?
* What BLAS / LAPACK math library is used?
* F16 support enabled?
* KVStore enabled?
* TensorRT?

It would help to structure the tests better.
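
As a rough illustration of how such checks could drive test skipping, here is a small Python sketch. Everything in it is hypothetical: the `FEATURES` table is hard-coded rather than queried from the compiled library, and `is_enabled` and `requires` are made-up helper names. The BLAS / LAPACK entry is left out because it is a "which library" question rather than an on/off flag.

```python
import unittest

# Hypothetical feature table. In a real build these values would be reported
# by the compiled library; here they are hard-coded for illustration.
FEATURES = {
    "OPENMP": True,
    "CUDA": False,
    "NCCL": False,
    "CUDNN": False,
    "F16": True,
    "KVSTORE": True,
    "TENSORRT": False,
    "OPENCV": False,
}


def is_enabled(name: str) -> bool:
    """Return True if the named feature is compiled in; unknown names raise KeyError."""
    return FEATURES[name.upper()]


def requires(*names: str):
    """Decorator that skips a test unless every named feature is enabled."""
    missing = [n for n in names if not is_enabled(n)]
    return unittest.skipIf(bool(missing), "missing features: " + ", ".join(missing))


class ExampleTests(unittest.TestCase):
    @requires("OPENMP")
    def test_runs(self):
        self.assertTrue(is_enabled("openmp"))

    @requires("CUDA", "NCCL")
    def test_skipped(self):
        self.fail("should not run without CUDA and NCCL")
```

Run under `python -m unittest`, the first test executes and the second is reported as skipped with the list of missing features as the reason.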

Best
Anton


On Thu, Nov 1, 2018 at 15:30, Pedro Larroy wrote:

> Hi
>
> There are some tests that fail when some features are not compiled in, such
> as Opencv.
>
> In some cases we skip the test according to some precondition such as:
>
> @unittest.skipIf(not graphviz_exists(),
>
>
> I would propose that we have a Python module that exports a set of methods
> to check what features are compiled in to skip tests which need this
> feature.
>
>
>
> test_gluon_data.test_recordimage_dataset ... [INFO] Setting test
> np/mx/python random seeds, use MXNET_TEST_SEED=1883419283 to reproduce.
> ERROR
> test_gluon_data.test_recordimage_dataset_with_data_loader_multiworker ...
> Process Process-1:
> Traceback (most recent call last):
>   File "/usr/lib/python3.5/multiprocessing/process.py", line 249, in
> _bootstrap
> self.run()
>   File "/usr/lib/python3.5/multiprocessing/process.py", line 93, in run
> self._target(*self._args, **self._kwargs)
>   File
> "/usr/local/lib/python3.5/dist-packages/mxnet/gluon/data/dataloader.py",
> line 189, in worker_loop
> batch = batchify_fn([dataset[i] for i in samples])
>   File
> "/usr/local/lib/python3.5/dist-packages/mxnet/gluon/data/dataloader.py",
> line 189, in 
> batch = batchify_fn([dataset[i] for i in samples])
>   File
>
> "/usr/local/lib/python3.5/dist-packages/mxnet/gluon/data/vision/datasets.py",
> line 261, in __getitem__
> return image.imdecode(img, self._flag), header.label
>   File "/usr/local/lib/python3.5/dist-packages/mxnet/image/image.py", line
> 147, in imdecode
> return _internal._cvimdecode(buf, *args, **kwargs)
>   File "", line 36, in _cvimdecode
>   File "/usr/local/lib/python3.5/dist-packages/mxnet/_ctypes/ndarray.py",
> line 92, in _imperative_invoke
> ctypes.byref(out_stypes)))
>   File "/usr/local/lib/python3.5/dist-packages/mxnet/base.py", line 252, in
> check_call
> raise MXNetError(py_str(_LIB.MXGetLastError()))
> mxnet.base.MXNetError: [19:21:42] /work/mxnet/src/io/image_io.cc:211: Build
> with USE_OPENCV=1 for image io.
>
>
> Pedro
>


[RESULT][LAZY VOTE] Next MXNet release

2018-11-01 Thread Steffen Rochel
There have been no objections, so the lazy vote passed.
Anton volunteered to manage the 1.3.1 release and Naveen will support him
as co-manager to handle the release tasks requiring committer powers.
Please support Anton for a smooth 1.3.1 release process.

I'm still looking for volunteers to manage / co-manage the 1.4.0 release.

Regards,
Steffen

On Sun, Oct 28, 2018 at 7:33 PM Steffen Rochel 
wrote:

> I am calling a lazy vote to release MXNet
> 1.3.1 (patch release) and 1.4.0 (minor release).
>
> Release content: release proposal page
> 
>
> Target milestones:
> *1.3.1*
>
>- Code Freeze: 10/31
>- Release published: 11/13
>
> *1.4.0:*
>
>- Code Freeze: 11/13
>- Release published: 12/13 (if possible announce during NIPS)
>
>
> The vote will be open until Wednesday October 31, 2018 8.00pm PDT.
>
> Regards,
> Steffen
>
> On Fri, Oct 26, 2018 at 7:56 AM Steffen Rochel 
> wrote:
>
>> During the Hangout on Wednesday multiple release proposals were
>> discussed. I summarized the discussion here and
>> updated the release proposal page.
>> Please review, provide feedback and propose changes.
>> I plan to start a lazy vote on Sunday regarding the release proposal.
>>
>> Calling for volunteers to manage the 1.3.1 and 1.4.0 release.
>>
>> Regards,
>> Steffen
>>
>> On Tue, Oct 9, 2018 at 7:20 AM kellen sunderland <
>> kellen.sunderl...@gmail.com> wrote:
>>
>>> Hey Steffen,
>>>
>>> Recommend these be merged into patch release:
>>>
>>> https://github.com/apache/incubator-mxnet/pull/12631
>>> https://github.com/apache/incubator-mxnet/pull/12603
>>> https://github.com/apache/incubator-mxnet/pull/12499
>>>
>>> -Kellen
>>>
>>> On Tue, Oct 2, 2018 at 7:17 AM Zhao, Patric 
>>> wrote:
>>>
>>> > Thanks for letting us know about this discussion.
>>> > We don't have enough bandwidth to track the different sources,
>>> > such as the discussion forum.
>>> >
>>> > I think the best way is to open an issue on GitHub so that we can
>>> > answer/solve the issue in time :)
>>> >
>>> > Thanks,
>>> >
>>> > --Patric
>>> >
>>> > > -Original Message-
>>> > > From: Afrooze, Sina [mailto:sina@gmail.com]
>>> > > Sent: Tuesday, October 2, 2018 1:14 AM
>>> > > To: dev@mxnet.incubator.apache.org
>>> > > Cc: Ye, Jason Y ; Zai, Alexander
>>> > > ; Zheng, Da 
>>> > > Subject: Re: [Discuss] Next MXNet release
>>> > >
>>> > > This post suggests there is a regression from 1.1.0 to 1.2.1 related to
>>> > > MKLDNN integration:
>>> > > https://discuss.mxnet.io/t/mxnet-1-2-1-module-get-outputs/1882
>>> > >
>>> > > The error is related to MKLDNN layout not being converted back to MXNet
>>> > > layout in some operator: "!IsMKLDNNData() We can’t generate TBlob for
>>> > > MKLDNN data. Please use Reorder2Default() to generate a new NDArray
>>> > > first"
>>> > >
>>> > > Sina
>>> > >
>>> > >
>>> > >
>>> > >
>>> > > On 9/30/18, 6:55 PM, "Steffen Rochel" 
>>> wrote:
>>> > >
>>> > > Thanks Patrick.
>>> > > Updated roadmap and next release content.
>>> > >
>>> > > Patrick - suggest to send a reminder to review the design doc and collect
>>> > > feedback.
>>> > > Are there still known issues or gaps before we declare MKL-DNN integration
>>> > > as GA?
>>> > >
>>> > > Regards,
>>> > > Steffen
>>> > >
>>> > > On Sat, Sep 29, 2018 at 1:31 AM Zhao, Patric <
>>> patric.z...@intel.com>
>>> > > wrote:
>>> > >
>>> > > > Thanks, Steffen.
>>> > > >
>>> > > > Regarding the next release note, two items from our side:
>>> > > >
>>> > > > 1. (-remove) MKL-DNN integration is done. I think we can remove this item.
>>> > > > 2. (+add) MKL-DNN based graph optimization and quantization by subgraph
>>> > > > Design doc:
>>> > > > https://cwiki.apache.org/confluence/display/MXNET/MXNet+Graph+Optimization+and+Quantization+based+on+subgraph+and+MKL-DNN
>>> > > > Lead Contributor: Patric Zhao, https://github.com/pengzhao-intel/
>>> > > >
>>> > > > Regarding the Roadmap
>>> > > > (+add) Q1 2019: MKL-DNN RNN API supports
>>> > > >
>>> > > > BR,
>>> > > >
>>> > > > Thanks,
>>> > > >
>>> > > > --Patric
>>> > > >
>>> > > >
>>> > > > > -Original Message-
>>> > > > > From: kellen sunderland [mailto:kellen.sunderl...@gmail.com]
>>> > > > > Sent: Saturday, September 29, 2018 11:31 AM
>>> > > > > To: dev@mxnet.incubator.apache.org
>>> > > > > Subject: Re: [Discuss] Next MXNet release
>>> > > > >
>>> > > > > Sorry I meant to say next 'Regarding the *minor* release'.
>>> > > > >
>>> > > > > On Sat, Sep 29, 2018 at 5:27 AM kellen sunderland <
>>> > > > 

Re: [VOTE] - Adopt "Become a Committer and PPMC Member" Document

2018-11-01 Thread Pedro Larroy
+1 non-binding. Thanks for driving this, looking forward to seeing the
positive impact.

On Mon, Oct 29, 2018 at 11:47 PM Carin Meier  wrote:

> This vote is to adopt the document
>
> https://cwiki.apache.org/confluence/display/MXNET/Become+an+Apache+MXNet+%28incubating%29+Committer+and+PPMC+Member+Proposal
> to replace the current document
> https://cwiki.apache.org/confluence/display/MXNET/Becoming+a+Committer
>
> The dev discussion thread is here
>
> https://lists.apache.org/thread.html/e61ffa26af374de7a99c475d406e462a00b26cfc1155e232198dd53e@%3Cdev.mxnet.apache.org%3E
>
> The vote will be a procedural issue vote as defined
> https://www.apache.org/foundation/voting.html
>
> Votes on procedural issues follow the common format of majority rule unless
> otherwise stated. That is, if there are more favourable votes than
> unfavourable ones, the issue is considered to have passed -- regardless of
> the number of votes in each category. (If the number of votes seems too
> small to be representative of a community consensus, the issue is typically
> not pursued. However, see the description of lazy consensus
>  for a
> modifying factor.)
>
> The vote will run until Friday Nov 2nd at 6:00 am EST
>
> Thanks,
> Carin
>


[Discuss] Feature detection at runtime / test skipping depending on features

2018-11-01 Thread Pedro Larroy
Hi

There are some tests that fail when some features are not compiled in, such
as OpenCV.

In some cases we skip the test according to some precondition such as:

@unittest.skipIf(not graphviz_exists(),


I would propose that we have a Python module that exports a set of methods
to check which features are compiled in, so that tests which need a missing
feature can be skipped.



test_gluon_data.test_recordimage_dataset ... [INFO] Setting test
np/mx/python random seeds, use MXNET_TEST_SEED=1883419283 to reproduce.
ERROR
test_gluon_data.test_recordimage_dataset_with_data_loader_multiworker ...
Process Process-1:
Traceback (most recent call last):
  File "/usr/lib/python3.5/multiprocessing/process.py", line 249, in
_bootstrap
self.run()
  File "/usr/lib/python3.5/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
  File
"/usr/local/lib/python3.5/dist-packages/mxnet/gluon/data/dataloader.py",
line 189, in worker_loop
batch = batchify_fn([dataset[i] for i in samples])
  File
"/usr/local/lib/python3.5/dist-packages/mxnet/gluon/data/dataloader.py",
line 189, in 
batch = batchify_fn([dataset[i] for i in samples])
  File
"/usr/local/lib/python3.5/dist-packages/mxnet/gluon/data/vision/datasets.py",
line 261, in __getitem__
return image.imdecode(img, self._flag), header.label
  File "/usr/local/lib/python3.5/dist-packages/mxnet/image/image.py", line
147, in imdecode
return _internal._cvimdecode(buf, *args, **kwargs)
  File "", line 36, in _cvimdecode
  File "/usr/local/lib/python3.5/dist-packages/mxnet/_ctypes/ndarray.py",
line 92, in _imperative_invoke
ctypes.byref(out_stypes)))
  File "/usr/local/lib/python3.5/dist-packages/mxnet/base.py", line 252, in
check_call
raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: [19:21:42] /work/mxnet/src/io/image_io.cc:211: Build
with USE_OPENCV=1 for image io.
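
The error text at the end of the traceback above already names the missing build flag. As a stopgap until explicit feature checks exist, a test helper could turn such an error into a skip. The Python sketch below assumes the "Build with USE_...=1" wording seen in this log; `missing_build_flag` and `skip_if_feature_missing` are hypothetical helpers, not MXNet APIs.

```python
import re
import unittest

# Assumption: missing-feature errors follow the "Build with USE_X=1" wording
# seen in the log above (the message may be wrapped across lines).
_BUILD_FLAG_RE = re.compile(r"Build\s+with\s+(USE_\w+)=1")


def missing_build_flag(message: str):
    """Return the build flag named in an error message, or None."""
    match = _BUILD_FLAG_RE.search(message)
    return match.group(1) if match else None


def skip_if_feature_missing(exc: Exception) -> None:
    """Raise unittest.SkipTest if exc reports a missing build flag; otherwise re-raise exc."""
    flag = missing_build_flag(str(exc))
    if flag is None:
        raise exc
    raise unittest.SkipTest("built without " + flag)
```

A test would call `skip_if_feature_missing(e)` inside an `except` block around the failing call, so that unrelated errors still surface as failures.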


Pedro


Re: [VOTE] - Adopt "Become a Committer and PPMC Member" Document

2018-11-01 Thread Chris Olivier
+1

On Thu, Nov 1, 2018 at 6:08 AM Carin Meier  wrote:

> Reminder - vote ends tomorrow morning at 6:00 am EST
>
> On Mon, Oct 29, 2018 at 6:46 PM Carin Meier  wrote:
>
> > This vote is to adopt the document
> >
> https://cwiki.apache.org/confluence/display/MXNET/Become+an+Apache+MXNet+%28incubating%29+Committer+and+PPMC+Member+Proposal
> > to replace the current document
> > https://cwiki.apache.org/confluence/display/MXNET/Becoming+a+Committer
> >
> > The dev discussion thread is here
> >
> https://lists.apache.org/thread.html/e61ffa26af374de7a99c475d406e462a00b26cfc1155e232198dd53e@%3Cdev.mxnet.apache.org%3E
> >
> > The vote will be a procedural issue vote as defined
> > https://www.apache.org/foundation/voting.html
> >
> > Votes on procedural issues follow the common format of majority rule
> > unless otherwise stated. That is, if there are more favourable votes than
> > unfavourable ones, the issue is considered to have passed -- regardless
> of
> > the number of votes in each category. (If the number of votes seems too
> > small to be representative of a community consensus, the issue is
> typically
> > not pursued. However, see the description of lazy consensus
> >  for a
> > modifying factor.)
> >
> > The vote will run until Friday Nov 2nd at 6:00 am EST
> >
> > Thanks,
> > Carin
> >
> >
>


Re: [VOTE] - Adopt "Become a Committer and PPMC Member" Document

2018-11-01 Thread Naveen Swamy
+1
Thanks everyone for your input and participation. Thanks to Carin for driving 
this.

> On Nov 1, 2018, at 6:07 AM, Carin Meier  wrote:
> 
> Reminder - vote ends tomorrow morning at 6:00 am EST
> 
>> On Mon, Oct 29, 2018 at 6:46 PM Carin Meier  wrote:
>> 
>> This vote is to adopt the document
>> https://cwiki.apache.org/confluence/display/MXNET/Become+an+Apache+MXNet+%28incubating%29+Committer+and+PPMC+Member+Proposal
>> to replace the current document
>> https://cwiki.apache.org/confluence/display/MXNET/Becoming+a+Committer
>> 
>> The dev discussion thread is here
>> https://lists.apache.org/thread.html/e61ffa26af374de7a99c475d406e462a00b26cfc1155e232198dd53e@%3Cdev.mxnet.apache.org%3E
>> 
>> The vote will be a procedural issue vote as defined
>> https://www.apache.org/foundation/voting.html
>> 
>> Votes on procedural issues follow the common format of majority rule
>> unless otherwise stated. That is, if there are more favourable votes than
>> unfavourable ones, the issue is considered to have passed -- regardless of
>> the number of votes in each category. (If the number of votes seems too
>> small to be representative of a community consensus, the issue is typically
>> not pursued. However, see the description of lazy consensus
>>  for a
>> modifying factor.)
>> 
>> The vote will run until Friday Nov 2nd at 6:00 am EST
>> 
>> Thanks,
>> Carin
>> 
>> 


Re: [VOTE] - Adopt "Become a Committer and PPMC Member" Document

2018-11-01 Thread Carin Meier
Reminder - vote ends tomorrow morning at 6:00 am EST

On Mon, Oct 29, 2018 at 6:46 PM Carin Meier  wrote:

> This vote is to adopt the document
> https://cwiki.apache.org/confluence/display/MXNET/Become+an+Apache+MXNet+%28incubating%29+Committer+and+PPMC+Member+Proposal
> to replace the current document
> https://cwiki.apache.org/confluence/display/MXNET/Becoming+a+Committer
>
> The dev discussion thread is here
> https://lists.apache.org/thread.html/e61ffa26af374de7a99c475d406e462a00b26cfc1155e232198dd53e@%3Cdev.mxnet.apache.org%3E
>
> The vote will be a procedural issue vote as defined
> https://www.apache.org/foundation/voting.html
>
> Votes on procedural issues follow the common format of majority rule
> unless otherwise stated. That is, if there are more favourable votes than
> unfavourable ones, the issue is considered to have passed -- regardless of
> the number of votes in each category. (If the number of votes seems too
> small to be representative of a community consensus, the issue is typically
> not pursued. However, see the description of lazy consensus
>  for a
> modifying factor.)
>
> The vote will run until Friday Nov 2nd at 6:00 am EST
>
> Thanks,
> Carin
>
>