[ https://issues.apache.org/jira/browse/SPARK-25994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16857833#comment-16857833 ]
Martin Junghanns commented on SPARK-25994:
------------------------------------------
Hi [~RBerenguel]. Thanks a lot for your interest. I think a good way to get
involved is to join the discussion on the PRs. As soon as we have
https://issues.apache.org/jira/browse/SPARK-27300 merged, we'll open a PR that
includes the API for property graph construction. I think this is the perfect
opportunity to discuss your ideas and get an understanding of how the Python
API should behave. wdyt?
> SPIP: Property Graphs, Cypher Queries, and Algorithms
> -----------------------------------------------------
>
> Key: SPARK-25994
> URL: https://issues.apache.org/jira/browse/SPARK-25994
> Project: Spark
> Issue Type: Epic
> Components: Graph
> Affects Versions: 3.0.0
> Reporter: Xiangrui Meng
> Assignee: Martin Junghanns
> Priority: Major
> Labels: SPIP
>
> Copied from the SPIP doc:
> {quote}
> GraphX was one of the foundational pillars of the Spark project, and is the
> current graph component. This reflects the importance of the graph data
> model, which naturally pairs with an important class of analytic function,
> the network or graph algorithm.
> However, GraphX is not actively maintained. It is based on RDDs, and cannot
> exploit Spark 2’s Catalyst query engine. GraphX is only available to Scala
> users.
> GraphFrames is a Spark package, which implements DataFrame-based graph
> algorithms, and also incorporates simple graph pattern matching with fixed
> length patterns (called “motifs”). GraphFrames is based on DataFrames, but
> has a semantically weak graph data model (based on untyped edges and
> vertices). The motif pattern matching facility is very limited by comparison
> with the well-established Cypher language.
> The Property Graph data model has become quite widespread in recent years,
> and is the primary focus of commercial graph data management and of graph
> data research, both for on-premises and cloud data management. Many users of
> transactional graph databases also wish to work with immutable graphs in
> Spark.
> The idea is to define a Cypher-compatible Property Graph type based on
> DataFrames; to replace GraphFrames querying with Cypher; to reimplement
> GraphX/GraphFrames algos on the PropertyGraph type.
> To achieve this goal, a core subset of Cypher for Apache Spark (CAPS),
> reusing existing proven designs and code, will be employed in Spark 3.0. This
> graph query processor, like CAPS, will overlay and drive the SparkSQL
> Catalyst query engine, using the CAPS graph query planner.
> {quote}