Hi Spark community, I'd like to announce a new release of GraphFrames, a Spark Package for DataFrame-based graphs!

*We strongly encourage all users to use this latest release for the bug fix described below.*

*Critical bug fix*

This release fixes a bug in indexing vertices. This may have affected your results if:
* your graph uses non-Integer IDs and
* you use ConnectedComponents and other algorithms which are wrappers around GraphX.

The bug occurs when the input DataFrame is non-deterministic. E.g., running an algorithm on a DataFrame just loaded from disk should be fine in previous releases, but running that algorithm on a DataFrame produced using shuffling, unions, and other operators can cause incorrect results. This issue is fixed in this release.

*New features*

* Python API for aggregateMessages for building custom graph algorithms
* Scala API for parallel personalized PageRank, wrapping the GraphX implementation. This is only available when using GraphFrames with Spark 2.1+.

Support for Spark 1.6, 2.0, and 2.1

*Special thanks to Felix Cheung for his work as a new committer for GraphFrames!*

*Full release notes*: https://github.com/graphframes/graphframes/releases/tag/release-0.5.0
*Docs*: http://graphframes.github.io/
*Spark Package*: https://spark-packages.org/package/graphframes/graphframes
*Source*: https://github.com/graphframes/graphframes

Thanks to all contributors and to the community for feedback!

Joseph

--
Joseph Bradley
Software Engineer - Machine Learning
Databricks, Inc.
<http://databricks.com/>