I have been using MXNet in production for a while: I train the model in Python
and deploy it in a Java environment by calling a Scala object. I think that is
enough for my use case.
import ml.dmlc.mxnet.{FeedForward, NDArray, Shape}
import ml.dmlc.mxnet.io.NDArrayIter

class MXNetPredictor(prefix: String, epoch: Int, batchSize: Int) {
  // Load the model checkpoint identified by prefix and epoch.
  val model = FeedForward.load(prefix, epoch)

  /**
   * Run the model on one batch.
   *
   * @param flat  A flat feature input vector.
   * @param shape Shape of input data; shape(0) is the batch size.
   * @return The model outputs, or null if the batch size does not match.
   */
  def predict(flat: Array[Float], shape: Array[Int]): Array[NDArray] = {
    val ndArray = NDArray.array(flat, Shape(shape))
    val data: IndexedSeq[NDArray] = IndexedSeq(ndArray)
    val label: IndexedSeq[NDArray] = IndexedSeq()
    val inputBatchSize = shape(0)
    if (inputBatchSize != batchSize) {
      // xxx (handling of a mismatched batch size is left out here)
      null
    } else {
      val valData: NDArrayIter = new NDArrayIter(data, label, batchSize)
      model.predict(valData)
    }
  }

  /**
   * Top-1 prediction results for a batch of data.
   * Assumes the batch size matches; predict returns null otherwise.
   *
   * @param flat  A flat feature input vector.
   * @param shape Shape of input data.
   * @return Top-1 prediction results for the batch.
   */
  def predictTop1(flat: Array[Float], shape: Array[Int]): NDArray = {
    val prediction = predict(flat, shape)(0)
    NDArray.argmaxChannel(prediction)
  }
}
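
A minimal sketch of one way the `// xxx` branch in predict could be handled, assuming the incoming batch has fewer rows than batchSize and that zero-padding is acceptable for the model. The object name BatchPadding and the method padToBatch are hypothetical helpers for illustration only; they are not part of MXNet and not what the snippet above does.

```scala
object BatchPadding {
  /**
   * Zero-pads a flat, row-major batch so its first dimension equals batchSize.
   * Hypothetical helper: returns (padded data, padded shape, number of real rows).
   */
  def padToBatch(flat: Array[Float],
                 shape: Array[Int],
                 batchSize: Int): (Array[Float], Array[Int], Int) = {
    val rows = shape(0)
    require(rows > 0 && rows <= batchSize, s"expected 1..$batchSize rows, got $rows")
    // Number of elements in one sample (product of the non-batch dimensions).
    val rowSize = shape.drop(1).product
    require(flat.length == rows * rowSize, "flat length does not match shape")
    // Arrays.copyOf fills the extra tail with 0.0f.
    val padded = java.util.Arrays.copyOf(flat, batchSize * rowSize)
    (padded, batchSize +: shape.drop(1), rows)
  }
}
```

The padded data and shape could then go through the same path as a full batch, and only the predictions for the first `rows` samples would be kept.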
> On Aug 17, 2017, at 5:32 AM, Joern Kottmann <[email protected]> wrote:
>
> By "Java API" I mean a set of classes I can use from Java. I tried
> this with the current Scala API but wasn't very successful. If you
> know a bit about Scala internals you can probably figure it all out,
> but that makes it rather unpleasant to use. You don't necessarily need
> to write Java code to build a Java API; you can also write Scala code
> and stick to certain rules so that it is callable from Java code
> without magic tricks.
>
> So yeah, maybe we should just take a look at the Scala API, come up
> with a list of things that are difficult when used from Java code and
> see how it can be improved. That approach probably at least gives you
> the advantages mentioned here before: quick to do, no duplication,
> etc.
>
> Afterwards we could still work on an approach for Java which goes
> beyond "build a Scala API wrapper".
>
> If you look at the quick wins, maybe a good approach would just be to
> give the following advice to people who need to access MXNet from Java
> code:
> - Integrate MXNet with custom Scala code
> - Use a maven/gradle build to create a module of the integration
> which can be called from your Java code
>
> Jörn
>
>
>
> On Wed, Aug 16, 2017 at 10:02 PM, Nan Zhu <[email protected]> wrote:
>> Hi, Joern,
>>
>> When you say "Java API", does it share the Scala implementation or not?
>>
>> Best,
>>
>> Nan
>>
>> On Wed, Aug 16, 2017 at 12:46 PM, Joern Kottmann <[email protected]> wrote:
>>
>>> Seems like we all agree on the idea of adding a Java API.
>>>
>>> Maybe it is just me, but it wouldn't make sense at all for me (OpenNLP
>>> use case) to use the Java API when it requires a Scala dependency,
>>> because at that point I would be better off just using the Scala API
>>> and ensuring that the things I build are compatible with Java.
>>>
>>> So if I don't want to add Scala as a dependency then I am better off
>>> building something on top of a generated JNI layer. As far as I can
>>> tell from my tests with the scala-package you can get quite far with
>>> MXNet using NDArray and the Symbol API.
>>>
>>> Maybe we could work on this from two sides as described by Pracheer.
>>> If we have a well defined Java API you could look at the work I have
>>> done by then and see how it can be plugged in or what can be learnt
>>> from it.
>>>
>>> Jörn
>>>
>>> On Wed, Aug 16, 2017 at 9:05 PM, Nan Zhu <[email protected]> wrote:
>>>> +1 for Sandeep's suggestion
>>>>
>>>> On Wed, Aug 16, 2017 at 11:21 AM, YiZhi Liu <[email protected]> wrote:
>>>>
>>>>> Agree with Sandeep, though I guess the performance won't change. But
>>>>> yes, benchmarks talk.
>>>>>
>>>>> Moreover, in the Scala package we use macros to generate operators
>>>>> automatically, which will require more effort if we switch to pure
>>>>> Java.
>>>>>
>>>>> 2017-08-17 2:12 GMT+08:00 sandeep krishnamurthy <[email protected]>:
>>>>>> The fastest way to get a Java binding is to build Java native
>>>>>> wrappers on the Scala package.
>>>>>> Disadvantages would be:
>>>>>> * *Bloated library size:* May not be suitable for users planning to
>>>>>> use Java APIs on Android or such smaller systems.
>>>>>> * *Performance:* Performance may not be as good as building directly
>>>>>> over JNI and implementing from the ground up. For example, taking
>>>>>> NDArray dimensions as a Java ArrayList, then converting it to a Scala
>>>>>> Seq to adapt to the Scala NDArray API, and more such adapters.
>>>>>>
>>>>>> However, building ground up from JNI would be a huge effort without
>>>>>> actually getting feedback from users early.
>>>>>>
>>>>>> *My Plan:*
>>>>>> 1. Build a Java interface on top of the Scala package.
>>>>>> 2. Get early feedback from users. It may turn out Java is not a great
>>>>>> candidate for DL training jobs.
>>>>>> 3. Solidify the interface (APIs) for Java users.
>>>>>> 4. Do performance benchmarks to compare the Scala native and Java
>>>>>> interfaces. This gives us comparable numbers on performance in Java.
>>>>>> 5. Over a period of time, replace the underlying Scala usage with a
>>>>>> JNI base and native Java implementation, provided feedback from users
>>>>>> is positive.
>>>>>>
>>>>>> Comments/Suggestions?
>>>>>>
>>>>>> Regards,
>>>>>> Sandeep
>>>>>>
>>>>>>
>>>>>> On Wed, Aug 16, 2017 at 10:56 AM, YiZhi Liu <[email protected]> wrote:
>>>>>>
>>>>>>> What Nan and I worry about is the re-implementation of something
>>>>>>> like https://github.com/apache/incubator-mxnet/blob/master/scala-package/core/src/main/scala/ml/dmlc/mxnet/Model.scala#L246,
>>>>>>> and the executorManager, NDArray, KVStore ... it uses.
>>>>>>>
>>>>>>> The C API stays at a very low level. If this is the purpose, we can
>>>>>>> simply move ml.dmlc.mxnet.LibInfo to a 'java' folder and compile without
>>>>>>> Scala; no need to introduce JavaCPP. But I don't think this is what
>>>>>>> users want.
>>>>>>>
>>>>>>> 2017-08-17 1:41 GMT+08:00 Joern Kottmann <[email protected]>:
>>>>>>>> There will be a new Scala version one day, and the story we had with
>>>>>>>> going from 2.10 to 2.11 might just repeat. In the end, if you make a
>>>>>>>> dependency using Scala you just end up making it for the currently
>>>>>>>> popular Scala versions. That might be OK for projects with
>>>>>>>> developers who are familiar with these issues, but it is not OK for
>>>>>>>> Java projects, where people might not expect it or know about these
>>>>>>>> problems. It just makes it harder to use.
>>>>>>>>
>>>>>>>> To me it looks like the C API is very stable and used by all/most
>>>>>>>> other APIs. If we have a Java API that accesses the C API via JavaCPP,
>>>>>>>> then we should end up with a pretty stable solution, and a lot of the
>>>>>>>> code that is duplicated with the Scala API is the generated code.
>>>>>>>>
>>>>>>>> I think we should explore this possible way of implementing it with a
>>>>>>>> proof-of-concept.
>>>>>>>>
>>>>>>>> And if we have a well-made Java API, it might not need a lot of
>>>>>>>> additions to be pleasant to use from Scala.
>>>>>>>>
>>>>>>>> Jörn
>>>>>>>>
>>>>>>>> On Wed, Aug 16, 2017 at 6:45 PM, Nan Zhu <[email protected]> wrote:
>>>>>>>>> I don't think there will be problems within "11". Did the user see
>>>>>>>>> concrete errors?
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>>
>>>>>>>>> Nan
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Aug 16, 2017 at 9:30 AM, YiZhi Liu <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Nan,
>>>>>>>>>>
>>>>>>>>>> Users have 2.11 but with a different minor version. Will it cause
>>>>>>>>>> conflicts?
>>>>>>>>>>
>>>>>>>>>> 2017-08-17 0:19 GMT+08:00 Nan Zhu <[email protected]>:
>>>>>>>>>>> Hi, Yizhi,
>>>>>>>>>>>
>>>>>>>>>>> You mean users have a 2.10 environment while we assemble 2.11 into it?
>>>>>>>>>>>
>>>>>>>>>>> Best,
>>>>>>>>>>>
>>>>>>>>>>> Nan
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Aug 16, 2017 at 9:08 AM, YiZhi Liu <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Joern,
>>>>>>>>>>>>
>>>>>>>>>>>> The point is that the front end is not a simple wrapper of c_api.h,
>>>>>>>>>>>> as you mentioned, which can be easily achieved by JavaCPP.
>>>>>>>>>>>>
>>>>>>>>>>>> I have noticed the potential conflicts between the assembled Scala
>>>>>>>>>>>> library and the one in users' environment. Can we remove the Scala
>>>>>>>>>>>> library from the assembly jar? @Nan It wouldn't be a problem, since
>>>>>>>>>>>> Scala libraries with the same major version are compatible.
>>>>>>>>>>>>
>>>>>>>>>>>> 2017-08-16 23:49 GMT+08:00 Joern Kottmann <[email protected]>:
>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I personally had quite some issues with Scala dependencies in
>>>>>>>>>>>>> different versions and Spark, where one version is not compatible with
>>>>>>>>>>>>> the other version. Then you need to debug the dependency tree to find
>>>>>>>>>>>>> the places where the versions don't match. Every project which would
>>>>>>>>>>>>> like to use MXNet then has to depend on Scala and might also get
>>>>>>>>>>>>> conflicts if other dependencies depend on different Scala versions.
>>>>>>>>>>>>> This will probably cause issues for some of your users.
>>>>>>>>>>>>> Users who want to use Java might not be familiar with Scala dependency
>>>>>>>>>>>>> problems and have a hard time resolving them when they get strange
>>>>>>>>>>>>> error messages.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The JNI layer could be generated with JavaCPP; then we would not need
>>>>>>>>>>>>> to write/maintain the C and the JVM side for that ourselves.
>>>>>>>>>>>>> A good example of JavaCPP and Scala usage is Apache Mahout [1].
>>>>>>>>>>>>>
>>>>>>>>>>>>> Even if we don't use JavaCPP, the JNI layer should be easy to get into
>>>>>>>>>>>>> a state where both can share it. The current Scala JNI layer's LibInfo
>>>>>>>>>>>>> classes could be converted to Java classes, which would in most cases
>>>>>>>>>>>>> require only minor changes in the Scala code.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Jörn
>>>>>>>>>>>>>
>>>>>>>>>>>>> [1] https://github.com/apache/mahout/tree/master/viennacl/src/main
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Aug 16, 2017 at 5:30 PM, Nan Zhu <[email protected]> wrote:
>>>>>>>>>>>>>> I agree with Yizhi
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> My major concern is the duplicate implementations, which are usually
>>>>>>>>>>>>>> one of the major sources of bugs, especially with two languages which
>>>>>>>>>>>>>> naturally interoperate (OK, calling Scala from Java might need some
>>>>>>>>>>>>>> more effort). It is just like providing the C++ and C APIs of MXNet in
>>>>>>>>>>>>>> two separate packages.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> About the dependency problem: when you say "As far as I see this has
>>>>>>>>>>>>>> the great disadvantage that the Java API would force Scala as a
>>>>>>>>>>>>>> dependency onto the java users.", would you please give a concrete
>>>>>>>>>>>>>> example causing critical issues?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Nan
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Aug 16, 2017 at 8:19 AM, YiZhi Liu <[email protected]> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If we build the Java API from the very beginning, i.e. the JNI part,
>>>>>>>>>>>>>>> we have to rewrite the code for training, predict, inferShape, etc.
>>>>>>>>>>>>>>> It would be too heavy to maintain a totally new front-end language.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> As far as I see, I don't think the Scala library dependency would be a
>>>>>>>>>>>>>>> big problem in most cases, unless we are going to use it on embedded
>>>>>>>>>>>>>>> devices. Could you illustrate some use cases where you cannot
>>>>>>>>>>>>>>> involve Scala dependencies?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 2017-08-16 22:13 GMT+08:00 Joern Kottmann <[email protected]>:
>>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> the approach which is taken by Spark is described here [1].
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> As far as I see this has the great disadvantage that the Java API
>>>>>>>>>>>>>>>> would force Scala as a dependency onto the java users.
>>>>>>>>>>>>>>>> For a library it is always a great advantage if it doesn't have many
>>>>>>>>>>>>>>>> dependencies, or zero dependencies. In our case it could be quite
>>>>>>>>>>>>>>>> realistic to have a thin wrapper around the C API without needing any
>>>>>>>>>>>>>>>> other dependencies (or only dependencies which can't be avoided).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The JNI layer could easily be shared between the Java and Scala APIs.
>>>>>>>>>>>>>>>> As far as I understand, the JNI layer in the Scala API is private
>>>>>>>>>>>>>>>> anyway, and a change to it wouldn't require that the public part of
>>>>>>>>>>>>>>>> the Scala API is changed.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> What do you think?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Jörn
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> [1] https://cwiki.apache.org/confluence/display/SPARK/Java+API+Internals
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, Aug 16, 2017 at 3:39 PM, YiZhi Liu <[email protected]> wrote:
>>>>>>>>>>>>>>>>> Hi Joern,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I suggest building the Java API as a wrapper of the Scala API, re-using
>>>>>>>>>>>>>>>>> most of the procedures; see the Java API in Apache Spark for reference.
>>> Spark.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 2017-08-16 18:21 GMT+08:00 Joern Kottmann <[email protected]>:
>>>>>>>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I would like to propose the addition of a Java API to MXNet.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> There has been some previous work done for the Scala API, and it makes
>>>>>>>>>>>>>>>>>> sense to at least share the JNI layer between the two.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The Java API probably should be aligned with the Python API (and
>>>>>>>>>>>>>>>>>> others which exist already) with a few changes to give it a native
>>>>>>>>>>>>>>>>>> Java feel.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> As far as I understand, there are multiple people interested in working
>>>>>>>>>>>>>>>>>> on this, and it would be good to come up with a written proposal on
>>>>>>>>>>>>>>>>>> how things should be.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> My motivation is to get a Java API which can be used by Apache OpenNLP
>>>>>>>>>>>>>>>>>> to solve various NLP tasks using deep-learning-based approaches, and I
>>>>>>>>>>>>>>>>>> am also interested in working on MXNet.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Jörn
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>> Yizhi Liu
>>>>>>>>>>>>>>>>> DMLC member
>>>>>>>>>>>>>>>>> Technical Manager
>>>>>>>>>>>>>>>>> Qihoo 360 Inc, Shanghai, China
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Sandeep Krishnamurthy
>>>