Have you looked at ikvm?
http://www.ikvm.net/devguide/java2net.html
________________________________
From: Kenneth Tran<mailto:[email protected]>
Sent: 12/16/2013 7:43 PM
To: user<mailto:[email protected]>
Subject: Re: Best ways to use Spark with .NET code
Hi Matei,
1. If I understand pipe correctly, I don't think that it can solve the problem
if the algorithm is iterative and requires a reduction step in each iteration.
Consider this simple linear regression example

    // Example: batch-gradient-descent linear regression, ignoring biases
    for (int i = 0; i < NIter; i++) {
        var gradient = data.Sum(p => (w dot p.x - p.y) * p.x);
        w -= rate * gradient;
    }
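For concreteness, here is a runnable Python sketch of the same loop on tiny synthetic 1-D data (the data, learning rate, and iteration count are illustrative, not from the original); it highlights that each iteration needs a full reduction (the Sum) over the dataset before the weight update:

```python
# Batch gradient descent for 1-D linear regression (no bias term),
# mirroring the C# loop above: every iteration performs a full
# reduction over the data before updating w.

data = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]  # (x, y) pairs with y = 3x
w = 0.0
rate = 0.05
n_iter = 200

for _ in range(n_iter):
    # The reduction step: a sum over the whole dataset, per iteration.
    gradient = sum((w * x - y) * x for x, y in data)
    w -= rate * gradient

print(round(w, 4))  # converges toward 3.0
```

In Spark terms, that per-iteration sum is a reduce on the cluster, which is exactly the step that is awkward to express when the driver loop lives outside the JVM.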
In order to use pipe as you said, one would need to move the for loop into the
calling code (in Java), which may not be simple for more complex code and
would still require major refactoring of the ML libraries. Furthermore,
there would be I/O at each iteration, which makes Spark no different from
Hadoop MapReduce.
2. Before asking this, I had also looked at jni4net. Besides the usage
complexity, jni4net has a few red flags:
* It hasn't been developed since 2011, although its latest status is alpha
* Its license terms (and code integrity) may not pass our legal department
* Its robustness and efficiency are dubious
Anyway, I'm looking at some other alternatives (e.g. JNBridge).
Thanks.
-Ken
On Mon, Dec 16, 2013 at 12:04 PM, Matei Zaharia
<[email protected]<mailto:[email protected]>> wrote:
Hi Kenneth,
Try using the RDD.pipe() operator in Spark, which lets you call out to an
external process by passing data to it through standard in/out. This will let
you call programs written in C# (e.g. that use your ML libraries) from a Spark
program.
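For anyone unfamiliar with pipe's contract: each element of a partition is written to the external process's stdin as a line of text, and each line the process writes to stdout becomes an element of the resulting RDD. A rough stand-in for what Spark does per partition, using a trivial line-doubling child process (the child script is illustrative, standing in for a real C# program):

```python
import subprocess
import sys

# A stand-in "external program": reads numbers from stdin and writes
# their doubles to stdout, one per line -- the same line-oriented
# protocol RDD.pipe() uses to talk to a subprocess.
child_script = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    print(int(line) * 2)\n"
)

partition = [1, 2, 3]

# Spark does roughly this for each partition: write the elements to the
# child's stdin, then collect its stdout lines as the output elements.
proc = subprocess.run(
    [sys.executable, "-c", child_script],
    input="\n".join(str(x) for x in partition),
    capture_output=True,
    text=True,
    check=True,
)
result = [int(line) for line in proc.stdout.splitlines()]
print(result)  # [2, 4, 6]
```

The external program can be any executable, including a compiled .NET binary, as long as it speaks this line-in/line-out protocol.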
I believe there are other projects enabling communication from Java to .NET,
e.g. http://jni4net.sourceforge.net, but I’m not sure how easy they’ll be to
use.
Matei
On Dec 16, 2013, at 10:54 AM, Kenneth Tran
<[email protected]<mailto:[email protected]>> wrote:
Hi,
We have a large ML code base in .NET. Spark seems cool and we want to leverage
it. What would be the best strategies to bridge our .NET code and Spark?
1. Initiate a Spark .NET project
2. A lightweight bridge between .NET and Java
While (1) sounds too daunting, it's not clear to me how to do (2) easily and
efficiently.
I'm willing to contribute to (1) if there's already an existing effort.