Hi, Kazuaki,

It's very similar to my requirement, thanks! It seems they want to write to a C++ process with zero copy, while I want to do both reads and writes with zero copy. Does anyone know how to obtain more information, such as the current status of this JIRA entry?
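For concreteness, here is a minimal sketch of what I have in mind on the JVM side, assuming the C++ process exposes a fixed-size region as a file under /dev/shm. The path, offsets, and layout below are placeholders, not anything that exists today:

import java.io.RandomAccessFile;
import java.nio.ByteOrder;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class SharedRegionSketch {
    public static void main(String[] args) throws Exception {
        // Map a region that a co-located C++ process has published as a file on
        // tmpfs, so both processes see the same physical pages without copying.
        try (RandomAccessFile raf = new RandomAccessFile("/dev/shm/cxx_region", "rw");
             FileChannel ch = raf.getChannel()) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, ch.size());
            buf.order(ByteOrder.LITTLE_ENDIAN);  // both sides must agree on byte order

            long firstValue = buf.getLong(0);    // zero-copy read of data written by C++
            buf.putLong(8, firstValue + 1);      // zero-copy write, visible to the C++ side
        }
    }
}

Since /dev/shm is tmpfs on Linux, the mapping is effectively shared memory, and no JNI is needed for the plain read/write path; the open question for me is how to hook something like this into Spark so that each executor only maps the region owned by the C++ process on its own node.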
Best Regards,
Jia

On Dec 7, 2015, at 12:26 PM, Kazuaki Ishizaki <ishiz...@jp.ibm.com> wrote:

> Is this JIRA entry related to what you want?
> https://issues.apache.org/jira/browse/SPARK-10399
>
> Regards,
> Kazuaki Ishizaki
>
> From: Jia <jacqueline...@gmail.com>
> To: Dewful <dew...@gmail.com>
> Cc: "user @spark" <u...@spark.apache.org>, dev@spark.apache.org, Robin East <robin.e...@xense.co.uk>
> Date: 2015/12/08 03:17
> Subject: Re: Shared memory between C++ process and Spark
>
> Thanks, Dewful!
>
> My impression is that Tachyon is a very nice in-memory file system that can connect to multiple storage backends.
> However, because our data is also held in memory, I suspect that connecting to Spark directly may give better performance.
> But I definitely need to look at Tachyon more carefully, in case it has a very efficient C++ binding mechanism.
>
> Best Regards,
> Jia
>
> On Dec 7, 2015, at 11:46 AM, Dewful <dew...@gmail.com> wrote:
>
> Maybe looking into something like Tachyon would help; I see some sample C++ bindings, but I'm not sure how much of the current functionality they support...
>
> Hi, Robin,
> Thanks for your reply and for copying my question to the user mailing list.
> Yes, we have a distributed C++ application that stores data on each node in the cluster, and we hope to leverage Spark to do more advanced analytics on that data. But we need high performance, which is why we want shared memory.
> Suggestions will be highly appreciated!
>
> Best Regards,
> Jia
>
> On Dec 7, 2015, at 10:54 AM, Robin East <robin.e...@xense.co.uk> wrote:
>
> -dev, +user (this is not a question about development of Spark itself, so you'll get more answers on the user mailing list)
>
> First up, let me say that I don't really know how this could be done - I'm sure it would be possible with enough tinkering, but it's not clear what you are trying to achieve. Spark is a distributed processing system; it has multiple JVMs running on different machines that each run a small part of the overall processing. Unless you have some sort of idea to have multiple C++ processes collocated with the distributed JVMs, using named memory mapped files doesn't make architectural sense.
> -------------------------------------------------------------------------------
> Robin East
> Spark GraphX in Action, Michael Malak and Robin East
> Manning Publications Co.
> http://www.manning.com/books/spark-graphx-in-action
>
> On 6 Dec 2015, at 20:43, Jia <jacqueline...@gmail.com> wrote:
>
> Dears, for one project, I need to implement something so Spark can read data from a C++ process.
> To provide high performance, I really hope to implement this through shared memory between the C++ process and the JVM process.
> It seems it may be possible to use named memory mapped files and JNI to do this, but I wonder whether there are any existing efforts or a more efficient approach to do this?
> Thank you very much!
>
> Best Regards,
> Jia
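P.S. On the Spark side, here is a rough sketch of how each executor might scan its node-local region inside a job. Again, the path and record layout are placeholders, and this ignores the real problem of pinning each task to the node whose C++ process owns that region (which would need preferred locations or a custom RDD):

import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.util.Arrays;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class LocalRegionJobSketch {
    public static void main(String[] args) throws Exception {
        SparkConf conf = new SparkConf().setAppName("shared-memory-sketch");
        try (JavaSparkContext jsc = new JavaSparkContext(conf)) {
            // One entry per node-local region; in practice these would come from
            // whatever coordination the C++ side provides.
            List<String> regionPaths = Arrays.asList("/dev/shm/cxx_region");

            JavaRDD<Long> perNodeSums = jsc.parallelize(regionPaths, regionPaths.size())
                .map(path -> {
                    try (RandomAccessFile raf = new RandomAccessFile(path, "r");
                         FileChannel ch = raf.getChannel()) {
                        MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
                        long sum = 0;
                        // Assume the region is a packed array of 8-byte longs.
                        for (int off = 0; off + 8 <= buf.capacity(); off += 8) {
                            sum += buf.getLong(off);  // read directly out of the mapping
                        }
                        return sum;
                    }
                });

            System.out.println("per-node sums: " + perNodeSums.collect());
        }
    }
}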