+1 for Sandeep's suggestion

On Wed, Aug 16, 2017 at 11:21 AM, YiZhi Liu <[email protected]> wrote:

> I agree with Sandeep, though I guess the performance won't change. But yes, benchmarks talk.
>
> Moreover, in the Scala package we use macros to generate operators automatically, which will require more effort if we switch to pure Java.
>
> 2017-08-17 2:12 GMT+08:00 sandeep krishnamurthy <[email protected]>:
>
> > The fastest way to get a Java binding is to build Java native wrappers on top of the Scala package.
> > Disadvantages would be:
> > * *Bloated library size:* May not be suitable for users planning to use the Java APIs on Android or other such small systems.
> > * *Performance:* Performance may not be as good as building directly over JNI and implementing from the ground up. For example, taking NDArray dimensions as a Java ArrayList and then converting it to a Scala Seq to fit the Scala NDArray API, plus more such adapters.
> >
> > However, building from the ground up on JNI would be a huge effort without actually getting feedback from users early.
> >
> > *My Plan:*
> > 1. Build a Java interface on top of the Scala package.
> > 2. Get early feedback from users. It may turn out Java is not a great candidate for DL training jobs.
> > 3. Solidify the interface (APIs) for Java users.
> > 4. Do performance benchmarks comparing the Scala native and Java interfaces. This gives us comparable numbers on performance in Java.
> > 5. Over time, replace the underlying Scala usage with a JNI base and native Java implementation, provided feedback from users is positive.
> >
> > Comments/suggestions?
> >
> > Regards,
> > Sandeep
> >
> > On Wed, Aug 16, 2017 at 10:56 AM, YiZhi Liu <[email protected]> wrote:
> >
> >> What Nan and I are worried about is the re-implementation of something like https://github.com/apache/incubator-mxnet/blob/master/scala-package/core/src/main/scala/ml/dmlc/mxnet/Model.scala#L246, and the executorManager, NDArray, KVStore ... it uses.
> >>
> >> The C API stays at a very low level. If this is the purpose, we can simply move ml.dmlc.mxnet.LibInfo to a 'java' folder and compile without Scala, with no need to introduce JavaCPP. But I don't think this is what users want.
> >>
> >> 2017-08-17 1:41 GMT+08:00 Joern Kottmann <[email protected]>:
> >> > There will be a new Scala version one day, and the story we had going from 2.10 to 2.11 might just repeat. In the end, if you publish a dependency that uses Scala, you just end up building it for the currently popular Scala versions. That might be OK for projects with developers who are familiar with these issues, but it is not OK for Java projects, where people might not expect it or know about these problems. It just makes it harder to use.
> >> >
> >> > To me it looks like the C API is very stable and used by all/most other APIs. If we have a Java API that accesses the C API via JavaCPP, then we should end up with a pretty stable solution, and a lot of the code that is duplicated with the Scala API is the generated code.
> >> >
> >> > I think we should explore this possible way of implementing it with a proof-of-concept.
> >> >
> >> > And if we have a well-made Java API, it might not need a lot of additions to be pleasant to use from Scala.
> >> >
> >> > Jörn
> >> >
> >> > On Wed, Aug 16, 2017 at 6:45 PM, Nan Zhu <[email protected]> wrote:
> >> >> I don't think there will be problems under "11". Did the user see concrete errors?
> >> >>
> >> >> Best,
> >> >>
> >> >> Nan
> >> >>
> >> >> On Wed, Aug 16, 2017 at 9:30 AM, YiZhi Liu <[email protected]> wrote:
> >> >>
> >> >>> Hi Nan,
> >> >>>
> >> >>> Users have 2.11, but with a different minor version. Will it cause conflicts?
> >> >>>
> >> >>> 2017-08-17 0:19 GMT+08:00 Nan Zhu <[email protected]>:
> >> >>> > Hi, Yizhi,
> >> >>> >
> >> >>> > You mean users have a 2.10 env while we assemble 2.11 in it?
> >> >>> >
> >> >>> > Best,
> >> >>> >
> >> >>> > Nan
> >> >>> >
> >> >>> > On Wed, Aug 16, 2017 at 9:08 AM, YiZhi Liu <[email protected]> wrote:
> >> >>> >
> >> >>> >> Hi Joern,
> >> >>> >>
> >> >>> >> The point is that the front end is not a simple wrapper of c_api.h, as you mentioned, which can be easily achieved by JavaCPP.
> >> >>> >>
> >> >>> >> I have noticed the potential conflicts between the assembled Scala library and the one in users' environments. Can we remove the Scala library from the assembly jar? @Nan It wouldn't be a problem, since Scala libraries with the same major version are compatible.
> >> >>> >>
> >> >>> >> 2017-08-16 23:49 GMT+08:00 Joern Kottmann <[email protected]>:
> >> >>> >> > Hello,
> >> >>> >> >
> >> >>> >> > I personally had quite some issues with Scala dependencies in different versions and Spark, where one version is not compatible with the other version. Then you need to debug the dependency tree to find the places where the versions don't match. Every project which would like to use MXNet then has to depend on Scala and might also get conflicts if other dependencies depend on different Scala versions. That is probably something which will cause issues for some of your users. Users who want to use Java might not be familiar with Scala dependency problems and will have a hard time resolving them when they get strange error messages.
> >> >>> >> >
> >> >>> >> > The JNI layer could be generated with JavaCPP; then we would not need to write/maintain the C and the JVM side for that ourselves.
> >> >>> >> > A good example of JavaCPP and Scala usage is Apache Mahout [1].
> >> >>> >> >
> >> >>> >> > Even if we don't use JavaCPP, the JNI layer should be easy to get into a state where both can share it. The LibInfo classes of the current Scala JNI layer could be converted to Java classes, and in most cases that would require only minor changes in the Scala code.
> >> >>> >> >
> >> >>> >> > Jörn
> >> >>> >> >
> >> >>> >> > [1] https://github.com/apache/mahout/tree/master/viennacl/src/main
> >> >>> >> >
> >> >>> >> > On Wed, Aug 16, 2017 at 5:30 PM, Nan Zhu <[email protected]> wrote:
> >> >>> >> >> I agree with Yizhi.
> >> >>> >> >>
> >> >>> >> >> My major concern is the duplicate implementations, which are usually one of the major sources of bugs, especially with two languages which are naturally interoperable (OK, calling Scala from Java might need some more effort). It is just like providing the C++ and C APIs of MXNet in two separate packages.
> >> >>> >> >>
> >> >>> >> >> About the dependency problem: when you say "As far as I see this has the great disadvantage that the Java API would force Scala as a dependency onto the java users.", would you please give a concrete example that causes critical issues?
> >> >>> >> >>
> >> >>> >> >> Best,
> >> >>> >> >>
> >> >>> >> >> Nan
> >> >>> >> >>
> >> >>> >> >> On Wed, Aug 16, 2017 at 8:19 AM, YiZhi Liu <[email protected]> wrote:
> >> >>> >> >>
> >> >>> >> >>> Hi,
> >> >>> >> >>>
> >> >>> >> >>> If we build the Java API from the very beginning, i.e. from the JNI part, we have to rewrite the code for training, predict, inferShape, etc. It would be too heavy to maintain a totally new front-end language.
> >> >>> >> >>>
> >> >>> >> >>> As far as I can see, the Scala library dependency would not be a big problem in most cases, unless we are going to use it on embedded devices. Could you illustrate some use cases where you cannot involve Scala dependencies?
> >> >>> >> >>>
> >> >>> >> >>> 2017-08-16 22:13 GMT+08:00 Joern Kottmann <[email protected]>:
> >> >>> >> >>> > Hello,
> >> >>> >> >>> >
> >> >>> >> >>> > the approach which is taken by Spark is described here [1].
> >> >>> >> >>> >
> >> >>> >> >>> > As far as I see this has the great disadvantage that the Java API would force Scala as a dependency onto the java users.
> >> >>> >> >>> > For a library it is always a great advantage if it doesn't have many dependencies, or has zero dependencies. In our case it could be quite realistic to have a thin wrapper around the C API without needing any other dependencies (or only dependencies which can't be avoided).
> >> >>> >> >>> >
> >> >>> >> >>> > The JNI layer could easily be shared between the Java and Scala APIs. As far as I understand, the JNI layer in the Scala API is private anyway, and a change to it wouldn't require that the public part of the Scala API be changed.
> >> >>> >> >>> >
> >> >>> >> >>> > What do you think?
> >> >>> >> >>> >
> >> >>> >> >>> > Jörn
> >> >>> >> >>> >
> >> >>> >> >>> > [1] https://cwiki.apache.org/confluence/display/SPARK/Java+API+Internals
> >> >>> >> >>> >
> >> >>> >> >>> > On Wed, Aug 16, 2017 at 3:39 PM, YiZhi Liu <[email protected]> wrote:
> >> >>> >> >>> >> Hi Joern,
> >> >>> >> >>> >>
> >> >>> >> >>> >> I suggest building the Java API as a wrapper of the Scala API, re-using most of the procedures, following the example of the Java API in Apache Spark.
> >> >>> >> >>> >>
> >> >>> >> >>> >> 2017-08-16 18:21 GMT+08:00 Joern Kottmann <[email protected]>:
> >> >>> >> >>> >>> Hello all,
> >> >>> >> >>> >>>
> >> >>> >> >>> >>> I would like to propose the addition of a Java API to MXNet.
> >> >>> >> >>> >>>
> >> >>> >> >>> >>> There has been some previous work done for the Scala API, and it makes sense to at least share the JNI layer between the two.
> >> >>> >> >>> >>>
> >> >>> >> >>> >>> The Java API probably should be aligned with the Python API (and others which exist already), with a few changes to give it a native Java feel.
> >> >>> >> >>> >>>
> >> >>> >> >>> >>> As far as I understand, there are multiple people interested in working on this, and it would be good to maybe come up with a written proposal on how things should be.
> >> >>> >> >>> >>>
> >> >>> >> >>> >>> My motivation is to get a Java API which can be used by Apache OpenNLP to solve various NLP tasks using Deep Learning based approaches, and I am also interested in working on MXNet.
> >> >>> >> >>> >>>
> >> >>> >> >>> >>> Jörn
> >> >>> >> >>> >>
> >> >>> >> >>> >> --
> >> >>> >> >>> >> Yizhi Liu
> >> >>> >> >>> >> DMLC member
> >> >>> >> >>> >> Technical Manager
> >> >>> >> >>> >> Qihoo 360 Inc, Shanghai, China
> >
> > --
> > Sandeep Krishnamurthy
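
One way to picture the plan discussed in this thread (a Java interface first, with the implementation underneath swapped later) is a small facade over a replaceable backend. The sketch below is only an illustration under assumptions: `Backend`, `ScalaPackageBackend`, `DirectJniBackend`, and `JavaNDArrayApi` are hypothetical names, not MXNet classes, and neither backend calls Scala or JNI. They are plain Java stand-ins that compute an element count from a shape.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical seam between the public Java API and whatever implements it. */
interface Backend {
    String name();
    long elementCount(int[] shape);
}

/** Stand-in for a backend that would delegate to the Scala package (no Scala calls here). */
final class ScalaPackageBackend implements Backend {
    @Override public String name() { return "scala-package"; }

    @Override public long elementCount(int[] shape) {
        // A real wrapper would adapt the shape to a Scala Seq at this point;
        // that adapter is the kind of overhead mentioned in the thread.
        return Arrays.stream(shape).asLongStream().reduce(1L, (a, b) -> a * b);
    }
}

/** Stand-in for a backend that would call the C API through JNI directly (no JNI here). */
final class DirectJniBackend implements Backend {
    @Override public String name() { return "direct-jni"; }

    @Override public long elementCount(int[] shape) {
        long n = 1L;
        for (int d : shape) {
            n *= d;
        }
        return n;
    }
}

/** Hypothetical public facade; user code depends only on this class. */
final class JavaNDArrayApi {
    private final Backend backend;

    JavaNDArrayApi(Backend backend) { this.backend = backend; }

    /** Accepts a Java collection and converts it once at the API boundary. */
    long elementCount(List<Integer> shape) {
        int[] dims = shape.stream().mapToInt(Integer::intValue).toArray();
        return backend.elementCount(dims);
    }

    String backendName() { return backend.name(); }
}

final class FacadeDemo {
    public static void main(String[] args) {
        List<Integer> shape = Arrays.asList(2, 3, 4);
        Backend[] backends = { new ScalaPackageBackend(), new DirectJniBackend() };
        for (Backend b : backends) {
            JavaNDArrayApi api = new JavaNDArrayApi(b);
            System.out.println(api.backendName() + " -> " + api.elementCount(shape));
        }
    }
}
```

Because callers only see the facade, the same user code and the same benchmark can run against either backend, which is what steps 4 and 5 of the plan rely on.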
