The small number of use cases is an important reason. I see many people interested in Julia & Spark integration, but almost nobody interested *enough* to invest time in its development.
Another reason is that the Julia infrastructure (and especially Julia-Java integration) is not mature enough for integrations of this depth. Instability of JNI, inconsistencies between Java and Scala, serialization issues in Julia: these are just a few of the difficulties I faced while working on Sparta.jl. Many people are doing great work to fix such issues, but at the moment Julia is far behind, say, Python.

Finally, it's just a huge amount of work. I don't mean basic functionality like map and reduce operations over a text file, but the whole variety of supported data formats, DataFrames, subprojects like Spark Streaming and MLlib, etc. And without these features we are back to the first point: nobody is interested enough to invest time when there are already PySpark and SparkR.

All of this makes me think that a similar framework for big data analytics written in pure Julia could bypass many of these issues and generate more interest in the Julia community. I wonder if somebody would want to take part in such a challenge.

On Sat, Nov 14, 2015 at 2:20 PM, Christof Stocker <[email protected]> wrote:

> Personally, I think the most progress is made if some person has a huge
> interest in doing it. I for one have a big interest in using Julia for ML,
> but I myself am not particularly interested in using Spark from Julia. I
> just don't feel like it would be useful to me for anything. In the
> situations where I do use Spark I don't feel like I would gain anything from
> using it from Julia. That of course doesn't mean that it wouldn't be very
> useful to others, but it does mean that it is unlikely that I will spend
> any of my time on it in the near future. Maybe other people are in similar
> situations.
>
> What I can leave you with is this: I think open source is a place in which
> one person can make all the difference in the world, if he/she sets his/her
> mind to it. So if someone is interested in doing it, go for it. I don't
> think it's too far-fetched to assume that once the functionality is
> available (and reasonably mature), people will gravitate towards it.
>
>
> On 2015-11-14 11:51, Frank wrote:
>
> Hi,
>
> I would have expected more interest in a Spark & Julia integration. Is the
> lack of interest due to
> a) missing use cases
> b) the fact that both Spark and Julia are very new, relatively speaking
>
> What do you think?
> Thanks
> Frank
>
>
> On Wednesday, April 15, 2015 at 11:37:50 AM UTC+2, Tanmay K. Mohapatra
> wrote:
>>
>> This thread is to discuss Julia - Spark integration further.
>>
>> This is a continuation of discussions from
>> https://groups.google.com/forum/#!topic/julia-users/LeCnTmOvUbw (the
>> thread topic was misleading and we could not change it).
>>
>> To summarize briefly, here are a few interesting packages:
>> - https://github.com/d9w/Spark.jl
>> - https://github.com/jey/Spock.jl
>> - https://github.com/benhamner/MachineLearning.jl
>> - packages at https://github.com/JuliaParallel
>>
>> We can discuss approaches and coordinate efforts towards whichever looks
>> promising.
>>
