GitHub user rdblue opened a pull request: https://github.com/apache/spark/pull/21242
[SPARK-23657][SQL] Document and expose the internal data API

## What changes were proposed in this pull request?

This makes the `InternalRow`, `ArrayData`, and `MapData` classes public and adds package documentation for the internal representation used by Spark SQL. The motivation for this change is to document the internal API because it has been leaked as a public API and used in the v2 DataSource classes.

## How was this patch tested?

Existing tests. This is a refactor and adds documentation.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rdblue/spark SPARK-23657-expose-internal-data-apis

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21242.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21242

----
commit f738127bc287632ede185daa3345b1f0de8b0ccc
Author: Ryan Blue <blue@...>
Date: 2018-05-04T21:12:22Z

Move InternalRow to org.apache.spark.sql.catalyst.data.

commit 747867c719029ddc86d44e7149e98e031a3468ea
Author: Ryan Blue <blue@...>
Date: 2018-05-04T21:20:31Z

Move ArrayData to org.apache.spark.sql.catalyst.data.

commit 7d108e9f1e603977b21773c664ba506b65ed0682
Author: Ryan Blue <blue@...>
Date: 2018-05-04T21:34:25Z

Move MapData to org.apache.spark.sql.catalyst.data.

commit 45a5c8d5e90df5e659bea107c73fc6e1a015ad7d
Author: Ryan Blue <blue@...>
Date: 2018-05-04T21:36:05Z

Move SpecializedGetters to org.apache.spark.sql.catalyst.data.

commit 465852a7f32c568b89d4ee988262b53a26710f97
Author: Ryan Blue <blue@...>
Date: 2018-05-04T22:38:18Z

Clean up the public API for InternalRow, ArrayData, and MapData.

This adds SpecializedSetters, with helper functions that are common between InternalRow and ArrayData. This class also defines a default get method that was previously implemented by UnsafeArrayData and UnsafeRow.
The package.scala for data has a high-level overview of the internal data API.

----
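
The idea of a default `get` method shared by several containers, as mentioned in the last commit message, can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this pull request: `FieldType`, `TypedGetters`, and `ArrayBackedRow` are hypothetical names invented for the example, and the real Spark classes have different signatures and many more data types.

```java
import java.util.Objects;

// Hypothetical stand-in for a data type descriptor; only three cases are modeled.
enum FieldType { INT, LONG, STRING }

// Hypothetical interface with type-specialized getters (an assumption, not Spark's API).
interface TypedGetters {
    boolean isNullAt(int ordinal);
    int getInt(int ordinal);
    long getLong(int ordinal);
    String getString(int ordinal);

    // Generic accessor written once as a default method, so each container
    // does not need to carry its own copy of the type dispatch.
    default Object get(int ordinal, FieldType type) {
        Objects.requireNonNull(type, "type");
        if (isNullAt(ordinal)) {
            return null;
        }
        switch (type) {
            case INT: return getInt(ordinal);
            case LONG: return getLong(ordinal);
            case STRING: return getString(ordinal);
            default: throw new IllegalArgumentException("Unsupported type: " + type);
        }
    }
}

// A simple container backed by an Object[]; it implements only the typed
// getters and inherits the generic get(...) from the interface.
final class ArrayBackedRow implements TypedGetters {
    private final Object[] values;

    ArrayBackedRow(Object... values) {
        this.values = values.clone();
    }

    @Override public boolean isNullAt(int ordinal) { return values[ordinal] == null; }
    @Override public int getInt(int ordinal) { return (Integer) values[ordinal]; }
    @Override public long getLong(int ordinal) { return (Long) values[ordinal]; }
    @Override public String getString(int ordinal) { return (String) values[ordinal]; }
}
```

With this layout, a second container (for example an array-like one) would implement only the four typed methods and pick up the same `get` behavior, which is the kind of duplication the commit message describes removing from `UnsafeArrayData` and `UnsafeRow`.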