Re: [VOTE] Release Apache MXNet (incubating) version 1.5.0.rc0

2019-06-11 Thread Pedro Larroy
A bit more background into this: While tuning a model using LSTM and convolutions we find that using hybridize with static_alloc and static_shape is 15% slower in the latest revision vs in version 1.4.1 in which using hybridize with static_alloc and static_shape is 10% faster than without.
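A comparison like the one described above could be measured with a small timing harness. This is only a sketch, not the benchmark used in the thread: `median_latency`, `percent_slower`, and `stand_in_forward` are hypothetical names, the stand-in workload replaces the real LSTM/convolution model, and it assumes "15% slower" means the timed run takes 15% longer.

```python
import time
from statistics import median


def median_latency(fn, warmup=5, iters=20):
    """Return the median wall-clock seconds of fn() over `iters` timed runs."""
    for _ in range(warmup):
        fn()  # warm-up runs are not timed
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return median(samples)


def percent_slower(baseline_s, candidate_s):
    """Positive result: candidate takes that many percent longer than baseline."""
    return (candidate_s / baseline_s - 1.0) * 100.0


def stand_in_forward():
    # Hypothetical placeholder workload. With MXNet this would be a forward
    # pass followed by a blocking call such as mx.nd.waitall(), because
    # MXNet executes operators asynchronously.
    sum(i * i for i in range(10_000))


latency = median_latency(stand_in_forward)
print(f"median latency: {latency:.6f} s")
```

To reproduce the reported numbers, the same harness would be run twice per MXNet version: once after `net.hybridize()` and once after `net.hybridize(static_alloc=True, static_shape=True)`, with the two medians passed to `percent_slower`.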

Re: Does internal quality matters to users?

2019-06-11 Thread Pedro Larroy
Thanks for the good discussion. I actually wasn't referring particularly to our conversations on GitHub with respect to the refactors, but it's nice of you to bring them up. And it's ok to disagree on small things, hopefully we can align on the big ones. I understand that for TVM you might

Re: Does internal quality matters to users?

2019-06-11 Thread Pedro Larroy
To give a recent, specific example and focus the discussion (there are as many as there are attributes): the shapes in the graph are a vector of Shape set as an attribute using dmlc::any, which makes it very difficult to debug the shapes when you have a graph object. I would have it as a typed
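The trade-off under discussion can be illustrated with a loose Python analogue. This is not the nnvm or MXNet API: `ErasedGraph`, `TypedGraph`, `get_shapes`, and the "shape" key are hypothetical names chosen for illustration, and Python's dynamic typing hides most of the debugger pain that the C++ type erasure causes.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple

Shape = Tuple[int, ...]


@dataclass
class ErasedGraph:
    # Analogue of a string -> any attribute map: open to new attributes,
    # but the type of each value is only known to the code that fetches it.
    attrs: Dict[str, Any] = field(default_factory=dict)


@dataclass
class TypedGraph:
    # Analogue of a typed member: the type is visible to tools and readers.
    shapes: List[Shape] = field(default_factory=list)


def get_shapes(graph: ErasedGraph) -> List[Shape]:
    """Fetch the shape vector and check its type at the point of use."""
    value = graph.attrs["shape"]
    if not isinstance(value, list) or not all(isinstance(s, tuple) for s in value):
        raise TypeError("attribute 'shape' is not a list of shape tuples")
    return value


erased = ErasedGraph()
erased.attrs["shape"] = [(1, 3, 224, 224), (1, 1000)]
typed = TypedGraph(shapes=[(1, 3, 224, 224), (1, 1000)])

# Both designs yield the same data; they differ in when the type is known.
print(get_shapes(erased) == typed.shapes)
```

In the erased variant the value becomes typed only after the fetch, which keeps the attribute set extensible. In the typed variant the shape vector is inspectable directly on the graph object, at the cost of fixing the attribute set in the class definition.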

Re: [VOTE] Release Apache MXNet (incubating) version 1.5.0.rc0

2019-06-11 Thread Zhi Zhang
On 2019/06/11 17:36:09, Pedro Larroy wrote:
> A bit more background into this:
>
> While tuning a model using LSTM and convolutions we find that using
> hybridize with static_alloc and static_shape is 15% slower in the
> latest revision vs in version 1.4.1 in which using hybridize with
>

Re: [VOTE] Release Apache MXNet (incubating) version 1.5.0.rc0

2019-06-11 Thread Pedro Larroy
The stack trace doesn't seem to come from MXNet, do you have more info?
On Tue, Jun 11, 2019 at 11:46 AM Zhi Zhang wrote:
>
>
>
> On 2019/06/11 17:36:09, Pedro Larroy wrote:
> > A bit more background into this:
> >
> > While tuning a model using LSTM and convolutions we find that using
> >

Re: [VOTE] Release Apache MXNet (incubating) version 1.5.0.rc0

2019-06-11 Thread Zhang Zhi
-1. Built from source; importing mxnet in Python causes a segfault. Backtrace:
Thread 1 "python3" received signal SIGSEGV, Segmentation fault.
0x7fff3e8a9f20 in ?? ()
(gdb) bt
#0 0x7fff3e8a9f20 in ?? ()
#1 0x7fffebbf440c in ReadConfigFile(Configuration&, std::__cxx11::basic_string,

Re: Does internal quality matters to users?

2019-06-11 Thread Tianqi Chen
> Re that particular case.
>
> The shape of vector will be typed after being fetched and won’t affect the
> general effort for programming. Getting the shape vector out contains
> around one line of code.
>
> The str to any map is defined to enable future compatibility of the
> general set of

Re: Does internal quality matters to users?

2019-06-11 Thread Pedro Larroy
Another data point. While working with a contributor, he/she is asking to get access to the graph and values of the NDArray (me too) to be able to reason more effectively about enhancements to the operators: https://github.com/apache/incubator-mxnet/pull/15120 I think gathering in the wiki

Re: [VOTE] Release Apache MXNet (incubating) version 1.5.0.rc0

2019-06-11 Thread Lai Wei
Hi guys, Thanks for the updates. Currently, we are able to confirm Lin's issue with Horovod, and there is a fix pending. [1] Will update later today to see if we need to cancel this vote for the fix. As for the hybridize with static alloc performance regression, IMO it does not need to be a

Re: [VOTE] Release Apache MXNet (incubating) version 1.5.0.rc0

2019-06-11 Thread Aaron Markham
-1 There's an autogenerated file that doesn't get cleaned up in the scala-package folder when you run make clean. This causes the scaladoc step to fail. I'm putting in workaround messaging in the error message and that'll go into master, but if anyone wants to specifically run the scaladocs for

Re: [VOTE] Release Apache MXNet (incubating) version 1.5.0.rc0

2019-06-11 Thread Pedro Larroy
Correction, I wanted to say: 1.5 is 33% faster than 1.4.1 when using hybridize without static_alloc and static_shape. We are claiming that static_alloc should improve speed and in this case it makes it worse. Is that a blocker for the release? Pedro. On Tue, Jun 11, 2019 at 10:36 AM Pedro

Re: [VOTE] Release Apache MXNet (incubating) version 1.5.0.rc0

2019-06-11 Thread Zhi Zhang
On 2019/06/11 18:53:56, Pedro Larroy wrote:
> The stack trace doesn't seem to come from MXNet, do you have more info?
>
> On Tue, Jun 11, 2019 at 11:46 AM Zhi Zhang wrote:
> >
> >
> >
> > On 2019/06/11 17:36:09, Pedro Larroy wrote:
> > > A bit more background into this:
> > >
> > > While

Re: Does internal quality matters to users?

2019-06-11 Thread Tianqi Chen
We have thought very carefully when introducing type-erasures, including considering the concerns you raised, and nevertheless have made the decision that resulted in the current design, which strikes the balance of type-erasure and typing. The original intention of the current design is to

Re: [VOTE] Release Apache MXNet (incubating) version 1.5.0.rc0

2019-06-11 Thread Pedro Larroy
Tested on CPU: 2.6x slower when comparing master vs 1.4.1. Looks like a general regression.
On Tue, Jun 11, 2019 at 2:31 PM Lai Wei wrote:
>
> Hi guys,
>
> Thanks for the updates. Currently, we are able to confirm Lin's issue with
> Horovod, and there is a fix pending. [1]
> Will update later