I like the sound of where this is going - seems like a good idea to me.

On Thu, Dec 3, 2015 at 8:20 PM, Ran Magen <[email protected]> wrote:
> After digging some more in the code, I retract my ill-informed question.
>
> Apologies,
> Ran
>
>
> On Thu, 3 Dec 2015 at 23:11 Ran Magen <[email protected]> wrote:
>
> > This would be great for me.
> > In Unipop we want to enable running heavy queries in a distributed
> > manner.
> > We figured we could implement some kind of UnipopSparkComputer that
> > utilizes the current Spark implementation, but from a quick check we
> > didn't find an obvious way to do that.
> >
> > Might DefaultInputRDD be a good solution for us?
> >
> > Cheers,
> > Ran
> >
> > On Wed, 2 Dec 2015 at 22:23 Marko Rodriguez <[email protected]>
> > wrote:
> >
> >> Hello,
> >>
> >> It is possible for us to provide a DefaultInputRDD and
> >> DefaultInputFormat to allow any OLTP graph system to easily load
> >> the data into Giraph/Spark/etc.
> >>
> >> https://issues.apache.org/jira/browse/TINKERPOP3-1015
> >>
> >> This is "quick and dirty" as it's single threaded -- no splits. It
> >> uses Graph.vertices() to stream in the vertices one at a time.
> >>
> >> Would people be interested in this feature? It would allow you to,
> >> for example, use Spark with Neo4j. Also, another thing we could do
> >> to make this efficient is:
> >>
> >>   List<Iterator<Vertex>> Graph.vertexSplits(int numberOfSplits)
> >>
> >> Then each graph provider can specify how to do parallel reads. The
> >> default implementation would be:
> >>
> >>   List<Iterator<Vertex>> splits = new ArrayList<>(numberOfSplits);
> >>   splits.add(this.vertices());
> >>   return splits;
> >>
> >> Anywho…. random idea as I was doing some Spark InputRDD test suite
> >> stuff.
> >>
> >> Take care,
> >> Marko.
> >>
> >> http://markorodriguez.com
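
For anyone following along, here is a rough, self-contained Java sketch of the vertexSplits idea from Marko's mail. It is only an illustration: SplittableGraph, ListGraph and RangeSplitGraph are made-up names, plain strings stand in for vertices, and none of this is the actual TinkerPop Graph API. The default method returns a single split (the single-threaded case), while a provider that knows how to partition its data can override it.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a graph interface; NOT the TinkerPop Graph API.
interface SplittableGraph<V> {
    // Streams every vertex, one at a time.
    Iterator<V> vertices();

    // Default: ignore the requested count and return one split with all vertices.
    default List<Iterator<V>> vertexSplits(int numberOfSplits) {
        List<Iterator<V>> splits = new ArrayList<>();
        splits.add(this.vertices());
        return splits;
    }
}

// Toy in-memory provider that relies on the single-split default.
class ListGraph implements SplittableGraph<String> {
    private final List<String> data;

    ListGraph(List<String> data) {
        this.data = data;
    }

    @Override
    public Iterator<String> vertices() {
        return data.iterator();
    }
}

// Toy provider that overrides the default with contiguous range splits.
class RangeSplitGraph implements SplittableGraph<String> {
    private final List<String> data;

    RangeSplitGraph(List<String> data) {
        this.data = data;
    }

    @Override
    public Iterator<String> vertices() {
        return data.iterator();
    }

    @Override
    public List<Iterator<String>> vertexSplits(int numberOfSplits) {
        int n = Math.max(1, numberOfSplits);
        int chunk = (data.size() + n - 1) / n; // ceiling division
        List<Iterator<String>> splits = new ArrayList<>();
        for (int start = 0; start < data.size(); start += chunk) {
            int end = Math.min(start + chunk, data.size());
            splits.add(data.subList(start, end).iterator());
        }
        return splits;
    }
}
```

With this sketch, new ListGraph(...).vertexSplits(4) still hands back one iterator, whereas a RangeSplitGraph over five elements asked for two splits hands back two iterators that a parallel reader could consume independently.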
