Re: [Announcement] New Committer -- Da Zheng

2018-12-17 Thread Hagay Lupesko
Congrats Da!

On Mon, Dec 17, 2018 at 5:44 PM Marco de Abreu 
wrote:

> Welcome Da, great to have you on board!
>
> On Tue., Dec. 18, 2018, 02:35, Lv, Tao A wrote:
>
> > Congrats Da! Thank you for your effort in bringing MKL-DNN to MXNet. It's
> > really the foundation for later work and improvements.
> >
> > -Original Message-
> > From: Tianqi Chen [mailto:tqc...@apache.org]
> > Sent: Tuesday, December 18, 2018 1:02 AM
> > To: dev@mxnet.incubator.apache.org
> > Subject: [Announcement] New Committer -- Da Zheng
> >
> > Dear Community:
> >
> > Please join me in welcoming Da Zheng as a new committer of MXNet.
> >
> > Da is the main author of the MKL-DNN integration, and recently he has
> > championed the control flow support. He is one of the few "explorer
> > style" contributors in the community, whom we desperately need in the
> > fast-changing landscape of deep learning systems.
> >
> > PRs https://github.com/apache/incubator-mxnet/commits?author=zheng-da
> > reviews  *
> >
> https://github.com/apache/incubator-mxnet/pulls?utf8=%E2%9C%93=is%3Apr+reviewed-by%3Azheng-da+
> > <
> >
> https://github.com/apache/incubator-mxnet/pulls?utf8=%E2%9C%93=is%3Apr+reviewed-by%3Azheng-da+
> > >*
> > dev@  https://lists.apache.org/list.html?d...@mxnet.apache.org:lte=3y:da-
> > zheng
> >
> > Tianqi
> >
>


Re: [Announcement] New Committer -- Da Zheng

2018-12-17 Thread Marco de Abreu
Welcome Da, great to have you on board!

On Tue., Dec. 18, 2018, 02:35, Lv, Tao A wrote:

> Congrats Da! Thank you for your effort in bringing MKL-DNN to MXNet. It's
> really the foundation for later work and improvements.
>
> -Original Message-
> From: Tianqi Chen [mailto:tqc...@apache.org]
> Sent: Tuesday, December 18, 2018 1:02 AM
> To: dev@mxnet.incubator.apache.org
> Subject: [Announcement] New Committer -- Da Zheng
>
> Dear Community:
>
> Please join me in welcoming Da Zheng as a new committer of MXNet.
>
> Da is the main author of the MKL-DNN integration, and recently he has
> championed the control flow support. He is one of the few "explorer style"
> contributors in the community, whom we desperately need in the
> fast-changing landscape of deep learning systems.
>
> PRs https://github.com/apache/incubator-mxnet/commits?author=zheng-da
> reviews  *
> https://github.com/apache/incubator-mxnet/pulls?utf8=%E2%9C%93=is%3Apr+reviewed-by%3Azheng-da+
> <
> https://github.com/apache/incubator-mxnet/pulls?utf8=%E2%9C%93=is%3Apr+reviewed-by%3Azheng-da+
> >*
> dev@  https://lists.apache.org/list.html?d...@mxnet.apache.org:lte=3y:da-
> zheng
>
> Tianqi
>


RE: [Announcement] New Committer -- Da Zheng

2018-12-17 Thread Lv, Tao A
Congrats Da! Thank you for your effort in bringing MKL-DNN to MXNet. It's
really the foundation for later work and improvements.

-Original Message-
From: Tianqi Chen [mailto:tqc...@apache.org] 
Sent: Tuesday, December 18, 2018 1:02 AM
To: dev@mxnet.incubator.apache.org
Subject: [Announcement] New Committer -- Da Zheng

Dear Community:

Please join me in welcoming Da Zheng as a new committer of MXNet.

Da is the main author of the MKL-DNN integration, and recently he has
championed the control flow support. He is one of the few "explorer style"
contributors in the community, whom we desperately need in the fast-changing
landscape of deep learning systems.

PRs https://github.com/apache/incubator-mxnet/commits?author=zheng-da
reviews  
*https://github.com/apache/incubator-mxnet/pulls?utf8=%E2%9C%93=is%3Apr+reviewed-by%3Azheng-da+
*
dev@  https://lists.apache.org/list.html?d...@mxnet.apache.org:lte=3y:da-
zheng

Tianqi


Re: [DISCUSS] About the usage of CUDA/CUDNN

2018-12-17 Thread Marco de Abreu
Hi,

sorry for derailing the discussion another time (I'm slow to respond,
sorry), but I just wanted to give a short statement about the security
aspect of Jenkins.

Kellen is right that Jenkins is by its nature quite insecure due to the
high frequency of discovered security vulnerabilities. These attacks go
either against slaves, via arbitrary code execution (which we have
mitigated with the restricted slaves), or against the master, whose attack
surface is large due to the high number of plugins in use combined with
infrequent updates on our side.

The decision to use the CI system for CD purposes was made because no
alternative was available, which led me to quickly come up with the
restricted-slave idea. This was, from the beginning, intended as a
temporary solution, and I still stand by that. That's why Chance Bair and I
are currently working on refactoring our CI deployment pipeline. This will
give us three things:
1. Frequent security updates
2. Fully automated redeployment
3. Individual deployments

#3 is the point of interest for this discussion. In the future, it will be
possible to deploy different environments - these could be Alpha, Beta,
Gamma, Prod or - from another perspective - CI and CD. Each deployment
could then also have its own security setup. What I've been thinking of
(and please note that this is just some loud thinking on my side, not an
agreed-on or reviewed proposal) is that we could then have a CD instance of
Jenkins that is not accessible from the internet, or one behind an Nginx
proxy that requires authentication before reaching the actual Jenkins
master. Before we can get to that point, though, we first have to finish
that CI deployment pipeline and then see where we stand. We will then
follow up with a proposal so the community can weigh in on the different
approaches.

But one thing is clear: we won't be running a mixed CI and CD setup
forever. Still, Jenkins - if designed and managed properly - should not be
an attack vector, and should therefore not be discarded as an option for
further consideration.

Again, sorry for derailing the topic.

Best regards,
Marco

On Tue, Dec 18, 2018 at 1:07 AM Hen  wrote:

> Please raise on legal-discuss or in a legal jira.
>
> On Mon, Dec 17, 2018 at 3:59 PM Naveen Swamy  wrote:
>
> > Attempting to answer Qing's question
> > --
> > If you can digest the legal terms:
> > https://docs.nvidia.com/cuda/eula/index.html#distribution-requirements.
> > It sounds like it's OK ("
> >
> >1. Your application must have material additional functionality,
> >beyond the included portions of the SDK.")
> >
> > but I don't fully understand the legal lingo.
> >
> > @Hen: Could you provide input on this?
> >
> > Thanks, Naveen
> >
> > On Mon, Dec 17, 2018 at 3:29 PM Davydenko, Denis <
> > dzianis.davydze...@gmail.com> wrote:
> >
> >> Kellen, please see conversation [1] on the previously published
> >> proposal re: the maven publishing pipeline. I think your concerns are
> >> valid and we should look into the security aspect of running our CI at
> >> a broader scope, not bound to just artifact publishing.
> >>
> >> I believe right now Qing's question is whether it is OK from a legal
> >> perspective to download CUDA by literally running wget during one of
> >> the jobs in the publishing pipeline. The fact that it is not available
> >> via a simple URL download raises a concern: is it a protective measure
> >> against downloads by unauthenticated users, or just an inconvenience
> >> that NVIDIA has not yet addressed?
> >>
> >> [1]:
> >>
> https://lists.apache.org/thread.html/464712f0136fb51916ca9f1b702b99847e108dbdbd0b6a2b73fc91f1@%3Cdev.mxnet.apache.org%3E
> >>
> >>
> >> On 12/17/18, 2:48 PM, "kellen sunderland"  >
> >> wrote:
> >>
> > Restricted nodes may provide enough security for some use cases, but in
> > my opinion they don't provide enough for artifact publishing. An example
> > would be if there were an exploit available that worked against a
> > Jenkins master. In that case I think an attacker could still pivot to a
> > secure node (correct me if I'm wrong).
> >>
> >> To your second point, it shouldn't be too hard for us to maintain
> all
> >> the
> >> deps for our packages in Dockerfiles which are checked into source
> and
> >> built on a regular basis.  To publish these artifacts I'd recommend
> >> doing
> >> this from a separate, secure environment.  The flow I'd recommend
> >> would be
> >> something like: (1) Developers commit PRs with verification that the
> >> artifacts build properly on a continual basis from the CI. (2) In a
> >> separate, secure environment we do the same artifact build
> generation
> >> again, but this time we publish to various repos as a convenience to
> >> our
> >> MXNet users.
> >>
> >> On Mon, Dec 17, 2018 at 2:34 PM Qing Lan 
> wrote:
> >>
> >> > Hi Kellen,
> >> >
> >> > Firstly the restricted node is completely 

RE: [Announcement] New Committer -- Da Zheng

2018-12-17 Thread Zhao, Patric
Congratulations, Da!

Many thanks for your great support, and looking forward to more cooperation
together :)

> -Original Message-
> From: Tianqi Chen [mailto:tqc...@apache.org]
> Sent: Tuesday, December 18, 2018 1:02 AM
> To: dev@mxnet.incubator.apache.org
> Subject: [Announcement] New Committer -- Da Zheng
> 
> Dear Community:
> 
> Please join me in welcoming Da Zheng as a new committer of MXNet.
>
> Da is the main author of the MKL-DNN integration, and recently he has
> championed the control flow support. He is one of the few "explorer style"
> contributors in the community, whom we desperately need in the
> fast-changing landscape of deep learning systems.
> 
> PRs https://github.com/apache/incubator-mxnet/commits?author=zheng-da
> reviews  *https://github.com/apache/incubator-
> mxnet/pulls?utf8=%E2%9C%93=is%3Apr+reviewed-by%3Azheng-da+
>  mxnet/pulls?utf8=%E2%9C%93=is%3Apr+reviewed-by%3Azheng-da+>*
> dev@  https://lists.apache.org/list.html?d...@mxnet.apache.org:lte=3y:da-
> zheng
> 
> Tianqi


Re: [DISCUSS] About the usage of CUDA/CUDNN

2018-12-17 Thread Naveen Swamy
Attempting to answer Qing's question
--
If you can digest the legal terms:
https://docs.nvidia.com/cuda/eula/index.html#distribution-requirements.
It sounds like it's OK ("

   1. Your application must have material additional functionality, beyond
   the included portions of the SDK.")

but I don't fully understand the legal lingo.

@Hen: Could you provide input on this?

Thanks, Naveen

On Mon, Dec 17, 2018 at 3:29 PM Davydenko, Denis <
dzianis.davydze...@gmail.com> wrote:

> Kellen, please see conversation [1] on the previously published proposal
> re: the maven publishing pipeline. I think your concerns are valid and we
> should look into the security aspect of running our CI at a broader scope,
> not bound to just artifact publishing.
>
> I believe right now Qing's question is whether it is OK from a legal
> perspective to download CUDA by literally running wget during one of the
> jobs in the publishing pipeline. The fact that it is not available via a
> simple URL download raises a concern: is it a protective measure against
> downloads by unauthenticated users, or just an inconvenience that NVIDIA
> has not yet addressed?
>
> [1]:
> https://lists.apache.org/thread.html/464712f0136fb51916ca9f1b702b99847e108dbdbd0b6a2b73fc91f1@%3Cdev.mxnet.apache.org%3E
>
>
> On 12/17/18, 2:48 PM, "kellen sunderland" 
> wrote:
>
> Restricted nodes may provide enough security for some use cases, but in
> my opinion they don't provide enough for artifact publishing. An example
> would be if there were an exploit available that worked against a Jenkins
> master. In that case I think an attacker could still pivot to a secure
> node (correct me if I'm wrong).
>
> To your second point, it shouldn't be too hard for us to maintain all
> the
> deps for our packages in Dockerfiles which are checked into source and
> built on a regular basis.  To publish these artifacts I'd recommend
> doing
> this from a separate, secure environment.  The flow I'd recommend
> would be
> something like: (1) Developers commit PRs with verification that the
> artifacts build properly on a continual basis from the CI. (2) In a
> separate, secure environment we do the same artifact build generation
> again, but this time we publish to various repos as a convenience to
> our
> MXNet users.
>
> On Mon, Dec 17, 2018 at 2:34 PM Qing Lan  wrote:
>
> > Hi Kellen,
> >
> > Firstly, the restricted node is completely isolated from the
> > PR-checking CI system (physically), which is explained here:
> > https://cwiki.apache.org/confluence/display/MXNET/Restricted+jobs+and+nodes
> > .
> > What you are describing applies to public CI systems, which all run
> > into trouble if they are publicly accessible. I am not sure how secure
> > the restricted node is. However, the only approach I can think of on
> > your end is to download all deps onto a single machine and run
> > everything there (disconnected from the internet). That would give us
> > the best security available.
> >
> > Thanks,
> > Qing
> >
> > On 12/17/18, 2:06 PM, "kellen sunderland" <
> kellen.sunderl...@gmail.com>
> > wrote:
> >
> > I'm not in favour of publishing artifacts from any Jenkins based
> > systems.
> > There are many ways to bundle artifacts and publish them from an
> > automated
> > system.  Why would we use a CI system like Jenkins for this task?
> > Jenkins
> > frequently has security vulnerabilities and is designed to run
> > arbitrary
> > code from the internet.  It is a real possibility that an
> attacker
> > could
> > pivot from any Jenkins based CI system to infect artifacts which
> would
> > then
> > potentially be pushed to repositories our users would consume.
> I would
> > consider any system using Jenkins as insecure-by-design, and
> encourage
> > us
> > to air-gap any artifact generation (websites, jars, PyPi
> packages)
> > completely from a system like that.
> >
> > An alternative I could see is a simple Dockerfile (no Jenkins)
> that
> > builds
> > all artifacts end-to-end and can be run in an automated account
> well
> > outside our CI account.
> >
> > On Mon, Dec 17, 2018 at 1:53 PM Qing Lan 
> wrote:
> >
> > > Dear community,
> > >
> > Currently Zach and I are working on the automated publish pipeline on
> > Jenkins, which is used to publish nightly Maven and pip packages. We
> > are trying to use the NVIDIA deb packages, which would let us build
> > different CUDA/cuDNN versions in the publish system. Sheng has
> > provided a script here:
> > https://github.com/apache/incubator-mxnet/pull/13646.
> > This provides a very concrete and automatic solution from downloading to
> >   

Re: [DISCUSS] About the usage of CUDA/CUDNN

2018-12-17 Thread Davydenko, Denis
Kellen, please see conversation [1] on the previously published proposal re:
the maven publishing pipeline. I think your concerns are valid and we should
look into the security aspect of running our CI at a broader scope, not
bound to just artifact publishing.

I believe right now Qing's question is whether it is OK from a legal
perspective to download CUDA by literally running wget during one of the
jobs in the publishing pipeline. The fact that it is not available via a
simple URL download raises a concern: is it a protective measure against
downloads by unauthenticated users, or just an inconvenience that NVIDIA
has not yet addressed?

[1]: 
https://lists.apache.org/thread.html/464712f0136fb51916ca9f1b702b99847e108dbdbd0b6a2b73fc91f1@%3Cdev.mxnet.apache.org%3E


On 12/17/18, 2:48 PM, "kellen sunderland"  wrote:

Restricted nodes may provide enough security for some use cases, but in my
opinion they don't provide enough for artifact publishing. An example would
be if there were an exploit available that worked against a Jenkins master.
In that case I think an attacker could still pivot to a secure node (correct
me if I'm wrong).

To your second point, it shouldn't be too hard for us to maintain all the
deps for our packages in Dockerfiles which are checked into source and
built on a regular basis.  To publish these artifacts I'd recommend doing
this from a separate, secure environment.  The flow I'd recommend would be
something like: (1) Developers commit PRs with verification that the
artifacts build properly on a continual basis from the CI. (2) In a
separate, secure environment we do the same artifact build generation
again, but this time we publish to various repos as a convenience to our
MXNet users.

On Mon, Dec 17, 2018 at 2:34 PM Qing Lan  wrote:

> Hi Kellen,
>
> Firstly, the restricted node is completely isolated from the PR-checking
> CI system (physically), which is explained here:
> https://cwiki.apache.org/confluence/display/MXNET/Restricted+jobs+and+nodes
> .
> What you are describing applies to public CI systems, which all run into
> trouble if they are publicly accessible. I am not sure how secure the
> restricted node is. However, the only approach I can think of on your end
> is to download all deps onto a single machine and run everything there
> (disconnected from the internet). That would give us the best security
> available.
>
> Thanks,
> Qing
>
> On 12/17/18, 2:06 PM, "kellen sunderland" 
> wrote:
>
> I'm not in favour of publishing artifacts from any Jenkins based
> systems.
> There are many ways to bundle artifacts and publish them from an
> automated
> system.  Why would we use a CI system like Jenkins for this task?
> Jenkins
> frequently has security vulnerabilities and is designed to run
> arbitrary
> code from the internet.  It is a real possibility that an attacker
> could
> pivot from any Jenkins based CI system to infect artifacts which would
> then
> potentially be pushed to repositories our users would consume.  I 
would
> consider any system using Jenkins as insecure-by-design, and encourage
> us
> to air-gap any artifact generation (websites, jars, PyPi packages)
> completely from a system like that.
>
> An alternative I could see is a simple Dockerfile (no Jenkins) that
> builds
> all artifacts end-to-end and can be run in an automated account well
> outside our CI account.
>
> On Mon, Dec 17, 2018 at 1:53 PM Qing Lan  wrote:
>
> > Dear community,
> >
> > Currently Zach and I are working on the automated publish pipeline on
> > Jenkins, which is used to publish nightly Maven and pip packages. We
> > are trying to use the NVIDIA deb packages, which would let us build
> > different CUDA/cuDNN versions in the publish system. Sheng has provided
> > a script here: https://github.com/apache/incubator-mxnet/pull/13646.
> > This provides a very concrete, automated solution from downloading to
> > installing on the system. The only issue we are facing is that NVIDIA
> > seems to have restrictions on distributing CUDA, and we are not sure
> > whether it is legally safe for us to use this in public.
> >
> > We would be grateful if somebody has a better context on it and help
> us
> > out!
> >
> > Thanks,
> > Qing
> >
>
>
>





Re: [DISCUSS] About the usage of CUDA/CUDNN

2018-12-17 Thread kellen sunderland
Restricted nodes may provide enough security for some use cases, but in my
opinion they don't provide enough for artifact publishing. An example would
be if there were an exploit available that worked against a Jenkins master.
In that case I think an attacker could still pivot to a secure node (correct
me if I'm wrong).

To your second point, it shouldn't be too hard for us to maintain all the
deps for our packages in Dockerfiles which are checked into source and
built on a regular basis.  To publish these artifacts I'd recommend doing
this from a separate, secure environment.  The flow I'd recommend would be
something like: (1) Developers commit PRs with verification that the
artifacts build properly on a continual basis from the CI. (2) In a
separate, secure environment we do the same artifact build generation
again, but this time we publish to various repos as a convenience to our
MXNet users.

On Mon, Dec 17, 2018 at 2:34 PM Qing Lan  wrote:

> Hi Kellen,
>
> Firstly, the restricted node is completely isolated from the PR-checking
> CI system (physically), which is explained here:
> https://cwiki.apache.org/confluence/display/MXNET/Restricted+jobs+and+nodes
> .
> What you are describing applies to public CI systems, which all run into
> trouble if they are publicly accessible. I am not sure how secure the
> restricted node is. However, the only approach I can think of on your end
> is to download all deps onto a single machine and run everything there
> (disconnected from the internet). That would give us the best security
> available.
>
> Thanks,
> Qing
>
> On 12/17/18, 2:06 PM, "kellen sunderland" 
> wrote:
>
> I'm not in favour of publishing artifacts from any Jenkins based
> systems.
> There are many ways to bundle artifacts and publish them from an
> automated
> system.  Why would we use a CI system like Jenkins for this task?
> Jenkins
> frequently has security vulnerabilities and is designed to run
> arbitrary
> code from the internet.  It is a real possibility that an attacker
> could
> pivot from any Jenkins based CI system to infect artifacts which would
> then
> potentially be pushed to repositories our users would consume.  I would
> consider any system using Jenkins as insecure-by-design, and encourage
> us
> to air-gap any artifact generation (websites, jars, PyPi packages)
> completely from a system like that.
>
> An alternative I could see is a simple Dockerfile (no Jenkins) that
> builds
> all artifacts end-to-end and can be run in an automated account well
> outside our CI account.
>
> On Mon, Dec 17, 2018 at 1:53 PM Qing Lan  wrote:
>
> > Dear community,
> >
> > Currently Zach and I are working on the automated publish pipeline on
> > Jenkins, which is used to publish nightly Maven and pip packages. We
> > are trying to use the NVIDIA deb packages, which would let us build
> > different CUDA/cuDNN versions in the publish system. Sheng has provided
> > a script here: https://github.com/apache/incubator-mxnet/pull/13646.
> > This provides a very concrete, automated solution from downloading to
> > installing on the system. The only issue we are facing is that NVIDIA
> > seems to have restrictions on distributing CUDA, and we are not sure
> > whether it is legally safe for us to use this in public.
> >
> > We would be grateful if somebody has a better context on it and help
> us
> > out!
> >
> > Thanks,
> > Qing
> >
>
>
>


Re: [DISCUSS] About the usage of CUDA/CUDNN

2018-12-17 Thread Qing Lan
Hi Kellen,

Firstly, the restricted node is completely isolated from the PR-checking CI
system (physically), which is explained here:
https://cwiki.apache.org/confluence/display/MXNET/Restricted+jobs+and+nodes.
What you are describing applies to public CI systems, which all run into
trouble if they are publicly accessible. I am not sure how secure the
restricted node is. However, the only approach I can think of on your end is
to download all deps onto a single machine and run everything there
(disconnected from the internet). That would give us the best security
available.

Thanks,
Qing

On 12/17/18, 2:06 PM, "kellen sunderland"  wrote:

I'm not in favour of publishing artifacts from any Jenkins based systems.
There are many ways to bundle artifacts and publish them from an automated
system.  Why would we use a CI system like Jenkins for this task?  Jenkins
frequently has security vulnerabilities and is designed to run arbitrary
code from the internet.  It is a real possibility that an attacker could
pivot from any Jenkins based CI system to infect artifacts which would then
potentially be pushed to repositories our users would consume.  I would
consider any system using Jenkins as insecure-by-design, and encourage us
to air-gap any artifact generation (websites, jars, PyPi packages)
completely from a system like that.

An alternative I could see is a simple Dockerfile (no Jenkins) that builds
all artifacts end-to-end and can be run in an automated account well
outside our CI account.

On Mon, Dec 17, 2018 at 1:53 PM Qing Lan  wrote:

> Dear community,
>
> Currently Zach and I are working on the automated publish pipeline on
> Jenkins, which is used to publish nightly Maven and pip packages. We are
> trying to use the NVIDIA deb packages, which would let us build different
> CUDA/cuDNN versions in the publish system. Sheng has provided a script
> here: https://github.com/apache/incubator-mxnet/pull/13646. This provides
> a very concrete, automated solution from downloading to installing on the
> system. The only issue we are facing is that NVIDIA seems to have
> restrictions on distributing CUDA, and we are not sure whether it is
> legally safe for us to use this in public.
>
> We would be grateful if somebody has a better context on it and help us
> out!
>
> Thanks,
> Qing
>




Re: [DISCUSS] About the usage of CUDA/CUDNN

2018-12-17 Thread kellen sunderland
I'm not in favour of publishing artifacts from any Jenkins based systems.
There are many ways to bundle artifacts and publish them from an automated
system.  Why would we use a CI system like Jenkins for this task?  Jenkins
frequently has security vulnerabilities and is designed to run arbitrary
code from the internet.  It is a real possibility that an attacker could
pivot from any Jenkins based CI system to infect artifacts which would then
potentially be pushed to repositories our users would consume.  I would
consider any system using Jenkins as insecure-by-design, and encourage us
to air-gap any artifact generation (websites, jars, PyPi packages)
completely from a system like that.

An alternative I could see is a simple Dockerfile (no Jenkins) that builds
all artifacts end-to-end and can be run in an automated account well
outside our CI account.

On Mon, Dec 17, 2018 at 1:53 PM Qing Lan  wrote:

> Dear community,
>
> Currently Zach and I are working on the automated publish pipeline on
> Jenkins, which is used to publish nightly Maven and pip packages. We are
> trying to use the NVIDIA deb packages, which would let us build different
> CUDA/cuDNN versions in the publish system. Sheng has provided a script
> here: https://github.com/apache/incubator-mxnet/pull/13646. This provides
> a very concrete, automated solution from downloading to installing on the
> system. The only issue we are facing is that NVIDIA seems to have
> restrictions on distributing CUDA, and we are not sure whether it is
> legally safe for us to use this in public.
>
> We would be grateful if somebody has a better context on it and help us
> out!
>
> Thanks,
> Qing
>
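
Kellen's two-step flow above (build artifacts in CI, publish them only from a
separate secure account) could be sketched roughly as below. The image tag,
Dockerfile name, and `publish.sh` helper are illustrative assumptions for the
sketch, not real MXNet infrastructure.

```python
# Sketch of the proposed CI/CD split: CI only *builds* artifacts inside a
# pinned Docker image; publishing runs in a separate, locked-down account.
# Image names and the publish.sh script are invented for illustration.
import shlex

def build_cmd(image_tag, dockerfile="Dockerfile.publish"):
    # Step run on every PR in CI: verify artifacts build reproducibly.
    return ["docker", "build", "-f", dockerfile, "-t", image_tag, "."]

def publish_cmd(image_tag, repo):
    # Step run only in the isolated account: push artifacts to a repo.
    return ["docker", "run", "--rm", image_tag, "./publish.sh", "--repo", repo]

ci_step = build_cmd("mxnet/artifacts:nightly")           # runs in CI
cd_step = publish_cmd("mxnet/artifacts:nightly", "pypi") # runs air-gapped
print(shlex.join(ci_step))
print(shlex.join(cd_step))
```

Keeping the two steps as separate commands means the CI account never needs
publishing credentials, which is the core of the proposal.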


[DISCUSS] About the usage of CUDA/CUDNN

2018-12-17 Thread Qing Lan
Dear community,

Currently Zach and I are working on the automated publish pipeline on
Jenkins, which is used to publish nightly Maven and pip packages. We are
trying to use the NVIDIA deb packages, which would let us build different
CUDA/cuDNN versions in the publish system. Sheng has provided a script here:
https://github.com/apache/incubator-mxnet/pull/13646. This provides a very
concrete, automated solution covering everything from downloading to
installing on the system. The only issue we are facing is that NVIDIA seems
to have restrictions on distributing CUDA, and we are not sure whether it is
legally safe for us to use this in public.

We would be grateful if somebody with better context on this could help us out!

Thanks,
Qing
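
For illustration only, the kind of scripted download-and-install flow under
discussion might look like the sketch below (the actual script is the one in
PR #13646). The repository URL layout, deb file name, and package name here
are assumptions about NVIDIA's repositories, not verified paths, and the wget
step is exactly the part whose legal status is being asked about.

```python
# Illustrative sketch of a scripted CUDA setup; URLs and package names are
# assumptions, NOT NVIDIA's verified paths. Commands are built but not run.
BASE = "https://developer.download.nvidia.com/compute/cuda/repos"  # assumed

def cuda_install_cmds(version, distro="ubuntu1604"):
    deb = f"cuda-repo-{distro}_{version}_amd64.deb"  # illustrative file name
    return [
        ["wget", f"{BASE}/{distro}/x86_64/{deb}"],   # the download at issue
        ["dpkg", "-i", deb],                         # register NVIDIA's repo
        ["apt-get", "update"],
        ["apt-get", "install", "-y", "cuda-toolkit-10-0"],  # assumed package
    ]

for cmd in cuda_install_cmds("10.0.130-1"):
    print(" ".join(cmd))
```

Whether the first command may run inside a public pipeline is the legal
question raised in this thread; the sketch only shows where it would sit.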


Re: [Announcement] New Committer -- Da Zheng

2018-12-17 Thread Alex Zai
Congrats Da.

On Mon, Dec 17, 2018, 7:20 PM Emani, Ashok wrote:

> Congratulations Da, well deserved!
>
> On 12/17/18, 9:02 AM, "Tianqi Chen"  wrote:
>
> Dear Community:
>
> Please join me in welcoming Da Zheng as a new committer of MXNet.
>
> Da is the main author of the MKL-DNN integration, and recently he has
> championed the control flow support. He is one of the few "explorer
> style" contributors in the community, whom we desperately need in the
> fast-changing landscape of deep learning systems.
>
> PRs https://github.com/apache/incubator-mxnet/commits?author=zheng-da
> reviews  *
> https://github.com/apache/incubator-mxnet/pulls?utf8=%E2%9C%93=is%3Apr+reviewed-by%3Azheng-da+
> <
> https://github.com/apache/incubator-mxnet/pulls?utf8=%E2%9C%93=is%3Apr+reviewed-by%3Azheng-da+
> >*
> dev@
> https://lists.apache.org/list.html?d...@mxnet.apache.org:lte=3y:da-
> zheng
>
> Tianqi
>
>
>


Re: [Announcement] New Committer -- Da Zheng

2018-12-17 Thread Emani, Ashok
Congratulations Da, well deserved!

On 12/17/18, 9:02 AM, "Tianqi Chen"  wrote:

Dear Community:

Please join me in welcoming Da Zheng as a new committer of MXNet.

Da is the main author of the MKL-DNN integration, and recently he has
championed the control flow support. He is one of the few "explorer style"
contributors in the community, whom we desperately need in the fast-changing
landscape of deep learning systems.

PRs https://github.com/apache/incubator-mxnet/commits?author=zheng-da
reviews  
*https://github.com/apache/incubator-mxnet/pulls?utf8=%E2%9C%93=is%3Apr+reviewed-by%3Azheng-da+

*
dev@  https://lists.apache.org/list.html?d...@mxnet.apache.org:lte=3y:da-
zheng

Tianqi




Re: [Announcement] New Committer -- Da Zheng

2018-12-17 Thread Lin Yuan
Congrats!

On Mon, Dec 17, 2018 at 9:19 AM Steffen Rochel 
wrote:

> Congratulations, Da!
>
> On Mon, Dec 17, 2018 at 9:02 AM Tianqi Chen  wrote:
>
> > Dear Community:
> >
> > Please join me in welcoming Da Zheng as a new committer of MXNet.
> >
> > Da is the main author of the MKL-DNN integration, and recently he has
> > championed the control flow support. He is one of the few "explorer
> > style" contributors in the community, whom we desperately need in the
> > fast-changing landscape of deep learning systems.
> >
> > PRs https://github.com/apache/incubator-mxnet/commits?author=zheng-da
> > reviews  *
> >
> https://github.com/apache/incubator-mxnet/pulls?utf8=%E2%9C%93=is%3Apr+reviewed-by%3Azheng-da+
> > <
> >
> https://github.com/apache/incubator-mxnet/pulls?utf8=%E2%9C%93=is%3Apr+reviewed-by%3Azheng-da+
> > >*
> > dev@  https://lists.apache.org/list.html?d...@mxnet.apache.org:lte=3y:da-
> > zheng
> >
> > Tianqi
> >
>


Re: API change discussion to resolve inconsistencies in Gluon Model Zoo

2018-12-17 Thread Skalicky, Sam
I think it would be a good idea to do this in all the language bindings, so
that error messages can be appropriate and familiar in the user's language
rather than confusing exceptions coming from the C++ backend. Here's an issue
I filed for tracking: https://github.com/apache/incubator-mxnet/issues/12286

Sam


On Dec 17, 2018, at 12:42 AM, Chaitanya Bapat wrote:

Hello everyone,

As a contributor to Apache MXNet project, I wanted to ask the developer
community a few questions pertaining to Gluon Model Zoo API.

1. The APIs for networks such as mobilenets, densenets, resnets, and
squeezenets (note the case sensitivity of these APIs) are inconsistent
compared to alexnet and the inception variants.

By inconsistency I mean they take **kwargs instead of listing all the
parameters the function requires (e.g. pretrained=False, ctx=cpu(0)).

To state just one function as an example:

  - mxnet.gluon.model_zoo.vision.mobilenet_v2_0_25(**kwargs)


2. What does the community feel about this? Should we resolve this, or is it
right the way it is? (Since this would be an API-breaking change, it seemed
best to ask the community before submitting a PR with changes.)

3. What is the difference between APIs bearing the same name but in
Titlecase vs. lowercase?
e.g. mxnet.gluon.model_zoo.vision.AlexNet(classes=1000, **kwargs)
vs
mxnet.gluon.model_zoo.vision.alexnet(pretrained=False, ctx=cpu(0),
root='/home/jenkins_slave/.mxnet/models', **kwargs)

For reference and to track the above inconsistency, I have created GitHub
Issue #13661.

I would highly appreciate your replies to any or all of the above questions.
Thanks,
Chai
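
To make the inconsistency concrete, here is a small pure-Python illustration
using stand-in definitions that mimic the signatures quoted above; these are
not the real mxnet.gluon.model_zoo implementations, just a sketch of why a
**kwargs-only signature is less discoverable.

```python
# Stand-in definitions mimicking the two API shapes in the model zoo.
import inspect

class AlexNet:
    """Titlecase name: the network class itself, with explicit arguments."""
    def __init__(self, classes=1000, **kwargs):
        self.classes = classes

def alexnet(pretrained=False, ctx="cpu(0)", **kwargs):
    """lowercase name: a factory that can also load pretrained weights."""
    return AlexNet(**kwargs)

def mobilenet_v2_0_25(**kwargs):
    """The inconsistent style: **kwargs only, so help() shows no parameters."""
    return AlexNet(**kwargs)

# The explicit signature documents its defaults; the **kwargs one hides them.
print(inspect.signature(alexnet))            # (pretrained=False, ctx='cpu(0)', **kwargs)
print(inspect.signature(mobilenet_v2_0_25))  # (**kwargs)
```

Spelling out the parameters, as alexnet does, lets users see defaults via
help() or inspect.signature(); the **kwargs-only variants do not.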


--
*Chaitanya Prakash Bapat*
*+1 (973) 953-6299*





Re: [Announcement] New Committer -- Da Zheng

2018-12-17 Thread Steffen Rochel
Congratulations, Da!

On Mon, Dec 17, 2018 at 9:02 AM Tianqi Chen  wrote:

> Dear Community:
>
> Please join me in welcoming Da Zheng as a new committer of MXNet.
>
> Da is the main author of the MKL-DNN integration, and recently he has
> championed control flow support. He is one of the few "explorer style"
> contributors in the community, whom we desperately need in the fast-changing
> landscape of deep learning systems.
>
> PRs: https://github.com/apache/incubator-mxnet/commits?author=zheng-da
> reviews: https://github.com/apache/incubator-mxnet/pulls?utf8=%E2%9C%93=is%3Apr+reviewed-by%3Azheng-da+
> dev@: https://lists.apache.org/list.html?d...@mxnet.apache.org:lte=3y:da-zheng
>
> Tianqi
>


[Announcement] New Committer -- Da Zheng

2018-12-17 Thread Tianqi Chen
Dear Community:

Please join me in welcoming Da Zheng as a new committer of MXNet.

Da is the main author of the MKL-DNN integration, and recently he has
championed control flow support. He is one of the few "explorer style"
contributors in the community, whom we desperately need in the fast-changing
landscape of deep learning systems.

PRs: https://github.com/apache/incubator-mxnet/commits?author=zheng-da
reviews: https://github.com/apache/incubator-mxnet/pulls?utf8=%E2%9C%93=is%3Apr+reviewed-by%3Azheng-da+
dev@: https://lists.apache.org/list.html?d...@mxnet.apache.org:lte=3y:da-zheng

Tianqi


Re: Apache MXNet v1.4.0 release status

2018-12-17 Thread Pedro Larroy
Hi Steffen

Added some notes in your PR for the release notes.

In particular, I'm a bit concerned about the status of topology-aware
communication, since it has open issues and is not being tested in CI
(the tests also fail). I think we should announce it when it's working
properly and is well tested.

Pedro.

On Sat, Dec 15, 2018 at 11:06 AM Steffen Rochel  wrote:
>
> Dear MXNet community -
> all issues besides one
>  have been
> addressed. I suggest we document the last remaining issue as a known problem
> and move forward with the release.
> Please communicate if you have concerns or know about critical issues to be
> addressed before starting the vote on releasing 1.4.0 as soon as possible.
> Please also have a look at the release notes
> 
> and provide feedback.
>
> I'm planning to start voting at the beginning of next week.
> Steffen
>
> On Sat, Dec 8, 2018 at 8:31 PM Steffen Rochel 
> wrote:
>
> > Hi Pedro - these are indeed the draft release notes for v1.4.0. Please add
> > the description as you suggested.
> >
> > All - please have a look at the release notes and provide feedback and
> > suggestions.
> > Steffen
> > On Sun, Dec 9, 2018 at 3:30 AM Zhao, Patric  wrote:
> >
> >> Hi Steffen,
> >>
> >> I saw the draft of 1.4 release notes in here (
> >> https://cwiki.apache.org/confluence/display/MXNET/Apache+MXNet+%28incubating%29+1.4.0+Release+Notes
> >> ).
> >>
> >> Is this near the final version?  I'd like to add some descriptions of new
> >> quantization features enabled in 1.4.
> >>
> >> Is it OK?
> >>
> >> Thanks,
> >>
> >> --Patric
> >>
> >>
> >> > -Original Message-
> >> > From: Steffen Rochel [mailto:steffenroc...@gmail.com]
> >> > Sent: Saturday, December 8, 2018 1:12 AM
> >> > To: dev@mxnet.incubator.apache.org
> >> > Subject: Apache MXNet v1.4.0 release status
> >> >
> >> > Dear MXNet community -
> >> > I would like to provide an update on v1.4.0 status; details are tracked
> >> > here:
> >> > https://cwiki.apache.org/confluence/display/MXNET/Apache+MXNet+%28incubating%29+1.4.0+Release+Plan+and+Status
> >> >
> >> > Thank you very much for everybody effort to resolve the identified
> >> issues.
> >> > We are down to 3 open issues - for details please see
> >> > https://cwiki.apache.org/confluence/display/MXNET/Apache+MXNet+%28incubating%29+1.4.0+Release+Plan+and+Status#ApacheMXNet(incubating)1.4.0ReleasePlanandStatus-OpenPRstotrack
> >> > Please help to resolve the remaining issues and integrate to v1.4.x
> >> branch.
> >> > The current estimate for addressing the identified security
> >> > vulnerabilities in the Scala/Java package and merging into the v1.4.x
> >> > branch is the end of next week (December 14th). I will communicate as
> >> > soon as I have more information.
> >> >
> >> > Regards,
> >> > Steffen
> >>
> >


Re: Cambricon MLU support for MXNet.

2018-12-17 Thread Pedro Larroy
Hi Haochong

Welcome to MXNet! It's exciting to have additional hardware platforms
added and supported in the project.

The CI system for MXNet is donated by AWS to the project. We have a
small hardware lab with embedded physical hardware such as ARM boards,
including the NVIDIA Jetson, which we are connecting to the CI system
(it's a work in progress).

However, the bulk of the CI system runs in the AWS Cloud using Jenkins
and EC2 GPU and CPU instances. So even though any of the options you
mention are possible and could work, I think the order in which you
listed them is also their order of preference. Connecting a remote
server or cloud instance to the MXNet Jenkins would be the easiest,
as it wouldn't involve hardware shipping and maintenance.

I think once you have the contribution merged and the changes ready to
be tested we can make a plan on how to best integrate with CI. For
that, the recommendation that Hagay gave (Design proposal in the Wiki)
is a good path forward, so other members of the community and the
engineers contributing to the CI system can contribute.

Pedro.

On Mon, Dec 17, 2018 at 3:33 AM 张昊翀  wrote:
>
> Dear MXNet community,
>
> We are from Cambricon, a leading supplier of artificial intelligence chips. 
> We have two product lines, including IP products (e.g., Cambricon 1A/1H) and 
> chip products (e.g., MLU100, released in May 2018).
>
> We are now adapting MXNet to Cambricon products. As a follow-up, we plan to 
> open-source this work, and hope to merge these new features into the master 
> branch of MXNet and become part of MXNet's long-term support. We firmly 
> believe that these MLU features will promote the development of the MXNet 
> community.
> To this end, we are ready to accept the rigorous inspection of MXNet 
> community. In addition, we need advice from the community to achieve high 
> quality implementation. On this basis, we very much hope to reach a 
> full-scale long-term cooperation with the community.
>
> In order to achieve the above goals, we hope to keep in touch with the 
> community on some issues. Looking forward to your valuable feedback.
>
> 1. MLU100 mainly focuses on inference, and we plan to first support the 
> inference part of MXNet. The training part of MXNet on MLU will be released 
> in the future. Is that acceptable for MXNet community?
>
> 2. Though MLU can support various operators/networks, to guarantee high 
> quality, all supported operators submitted to the community should undergo 
> rigorous stress tests. Thus, at the beginning, we plan to release a small 
> number of supported operators and networks, and more of them will be 
> continuously added. Is that acceptable or do we have to support all networks 
> in the ModelZoo in the first release?
>
> 3. Currently we plan to support both Python and C++ APIs. More details on 
> supported APIs will be provided in a follow-up proposal.
>
> 4. We need to modify the mShadow in order to support tensor memory operations.
>
> 5. In order to enable the community to run and fully test our code, we want 
> to provide the community with a complete test environment. At present, we are 
> considering the following three ways.
> A) Provide several remote servers for the community and integrate them with 
> the community's Jenkins.
> B) Provide a cloud platform to the community.
> C) Donate MLU100 hardware to the community's testing platform. However, we 
> don't know the specific process for such a donation and hope to get help; 
> we are wondering how MXNet's test servers are managed.
>
> About more technical details, a proposal will be submitted to the community 
> before releasing the code.
>
> In addition to the above points, the remaining questions and suggestions are 
> also welcome. Thanks!
>
> More about Cambricon:
> Cambricon is the artificial intelligence computing pioneer that engineered 
> and successfully commercialized the world's first dedicated machine learning 
> processor. Bringing its unique AI processors from edge to cloud, enriching 
> and advancing human life, is the firm mission of the company. Dr. Tianshi 
> Chen is the founder and CEO of Cambricon, where he brings over 10 years of 
> experience in the fields of micro-processor architecture and artificial 
> intelligence.
> In 2016, Cambricon released Cambricon 1A processor, the first commercial 
> machine learning specific processor in the world. Later, during the 3rd World 
> Internet Conference, Cambricon 1A processor was elected as one of “World 
> Leading Internet Scientific and Technological Achievements“. In May 2018, 
> Cambricon released MLU100, a machine learning chip which is in mass 
> production now. By offering revolutionary technology and products, Cambricon 
> has established and maintains active relationships with various companies in 
> the AI industry.
>
>
> Regards,
> Haochong Zhang
> Cambricon MXNet Development Team
>
>


RE: Cambricon MLU support for MXNet.

2018-12-17 Thread Lv, Tao A
"mshadow is being deprecated." 

Surprised to know that. Was it discussed before? Do we have any document to 
tell contributors and developers about that?

-tao

-Original Message-
From: Chris Olivier [mailto:cjolivie...@gmail.com] 
Sent: Monday, December 17, 2018 3:10 PM
To: dev@mxnet.incubator.apache.org; 张昊翀 
Cc: d...@mxnet.apache.org; solomon.zhc 
Subject: Re: Cambricon MLU support for MXNet.

small point: mshadow is being deprecated. probably you shouldn’t invest too 
much time on it. just an FYI

On Sun, Dec 16, 2018 at 6:33 PM 张昊翀  wrote:

> Dear MXNet community,
>
> We are from Cambricon, a leading supplier of artificial intelligence 
> chips. We have two product lines, including IP products (e.g., Cambricon 
> 1A/1H) and chip products (e.g., MLU100, released in May 2018).
>
> We are now adapting MXNet to Cambricon products. As a follow-up, we plan to 
> open-source this work, and hope to merge these new features into the master 
> branch of MXNet and become part of MXNet's long-term support.
> We firmly believe that these MLU features will promote the development of 
> the MXNet community.
> To this end, we are ready to accept the rigorous inspection of MXNet 
> community. In addition, we need advice from the community to achieve 
> high quality implementation. On this basis, we very much hope to reach 
> a full-scale long-term cooperation with the community.
>
> In order to achieve the above goals, we hope to keep in touch with the 
> community on some issues. Looking forward to your valuable feedback.
>
> 1. MLU100 mainly focuses on inference, and we plan to first support 
> the inference part of MXNet. The training part of MXNet on MLU will be 
> released in the future. Is that acceptable for the MXNet community?
>
> 2. Though MLU can support various operators/networks, to guarantee 
> high quality, all supported operators submitted to the community 
> should undergo rigorous stress tests. Thus, at the beginning, we plan 
> to release a small number of supported operators and networks, and 
> more of them will be continuously added. Is that acceptable or do we 
> have to support all networks in the ModelZoo in the first release?
>
> 3. Currently we plan to support both Python and C++ APIs. More details 
> on supported APIs will be provided in a follow-up proposal.
>
> 4. We need to modify the mShadow in order to support tensor memory 
> operations.
>
> 5. In order to enable the community to run and fully test our code, we 
> want to provide the community with a complete test environment. At 
> present, we are considering the following three ways.
> A) Provide several remote servers for the community and integrate them with 
> the community's Jenkins.
> B) Provide a cloud platform to the community.
> C) Donate MLU100 hardware to the community's testing platform. However, we 
> don't know the specific process for such a donation and hope to get help; 
> we are wondering how MXNet's test servers are managed.
>
> About more technical details, a proposal will be submitted to the 
> community before releasing the code.
>
> In addition to the above points, the remaining questions and 
> suggestions are also welcome. Thanks!
>
> More about Cambricon:
> Cambricon is the artificial intelligence computing pioneer that 
> engineered and successfully commercialized the world's first dedicated 
> machine learning processor. Bringing its unique AI processors from 
> edge to cloud, enriching and advancing human life, is the firm mission 
> of the company. Dr. Tianshi Chen is the founder and CEO of Cambricon, 
> where he brings over 10 years of experience in the fields of 
> micro-processor architecture and artificial intelligence.
> In 2016, Cambricon released Cambricon 1A processor, the first 
> commercial machine learning specific processor in the world. Later, 
> during the 3rd World Internet Conference, Cambricon 1A processor was 
> elected as one of “World Leading Internet Scientific and Technological 
> Achievements“. In May 2018, Cambricon released MLU100, a machine 
> learning chip which is in mass production now. By offering 
> revolutionary technology and products, Cambricon has established and 
> maintains active relationships with various companies in the AI industry.
>
>
> Regards,
> Haochong Zhang
> Cambricon MXNet Development Team
>
>
>


API change discussion to resolve inconsistencies in Gluon Model Zoo

2018-12-17 Thread Chaitanya Bapat
Hello everyone,

As a contributor to the Apache MXNet project, I wanted to ask the developer
community a few questions pertaining to the Gluon Model Zoo API.

1. With regard to the APIs for networks like mobilenets, densenets,
resnets, and squeezenets (note the casing of these APIs), they are
inconsistent compared to alexnet and the inceptions.

By inconsistency I mean they take **kwargs instead of spelling out all the
parameters the function requires (e.g. pretrained=False, ctx=cpu(0)).

Stating just 1 function here for example -

   - mxnet.gluon.model_zoo.vision.mobilenet_v2_0_25(**kwargs)
   


2. What does the community feel about this? Should we resolve it, or is it
right the way it is? (Since this would be an API-breaking change, it seemed
best to ask the community before submitting a PR with changes.)

3. What is the difference between APIs bearing the same name but differing in
Titlecase vs lowercase?
e.g. mxnet.gluon.model_zoo.vision.AlexNet(classes=1000, **kwargs)
vs
mxnet.gluon.model_zoo.vision.alexnet(pretrained=False, ctx=cpu(0),
root='/home/jenkins_slave/.mxnet/models', **kwargs)
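
For readers unfamiliar with the convention, here is a minimal sketch of the
pattern the question refers to. This is a simplified illustration, not the
real model-zoo code: the Titlecase name is the network class itself, while
the lowercase name is a factory function that constructs the class and, when
pretrained=True, would additionally fetch weights.

```python
class AlexNet:
    """Titlecase API: the network class itself (simplified stand-in)."""
    def __init__(self, classes=1000):
        self.classes = classes

def alexnet(pretrained=False, classes=1000, **kwargs):
    """lowercase API: factory that builds an AlexNet and, when
    pretrained=True, would also download weights (elided in this sketch)."""
    net = AlexNet(classes=classes, **kwargs)
    if pretrained:
        pass  # real code would load parameters from the model zoo here
    return net

net = alexnet(classes=10)
print(type(net).__name__, net.classes)  # -> AlexNet 10
```

Under this convention the factory is where arguments like `pretrained`,
`ctx`, and `root` naturally live, which is also why hiding them behind
**kwargs in some factories (question 1 above) reads as inconsistent.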

For reference and to track the above inconsistency, I have created a Github
Issue #13661 

I would highly appreciate your replies to any or all of the above questions.
Thanks,
Chai


-- 
*Chaitanya Prakash Bapat*
*+1 (973) 953-6299*
