GitHub user marmbrus opened a pull request:
https://github.com/apache/spark/pull/3063
[SPARK-3572] [SQL] Internal API for User-Defined Types
This PR adds User-Defined Types (UDTs) to Spark SQL. It is a precursor to using
SchemaRDD as a Dataset for the new MLlib API. The UDT API is currently private
because support is incomplete (e.g., there is no Java or Python support yet).
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/marmbrus/spark udts
Alternatively, you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/3063.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #3063
----
commit 105c5a366501a8ef6957cba43968f477b56f9a45
Author: Joseph K. Bradley <[email protected]>
Date: 2014-10-03T02:06:49Z
Adding UserDefinedType to SQL, not done yet.
commit 0eaeb8187342048287b4ada1400a750ca6742f80
Author: Joseph K. Bradley <[email protected]>
Date: 2014-10-06T16:54:51Z
Still working on UDTs
commit 19b2f60cf337c6e3e332ee8cddfeb818dad2b699
Author: Joseph K. Bradley <[email protected]>
Date: 2014-10-06T20:18:26Z
still working on UDTs
commit 982c03561d8b1b41d14c061e704910b582f91703
Author: Joseph K. Bradley <[email protected]>
Date: 2014-10-07T02:10:43Z
still working on UDTs
commit 53de70f3a38bde5a1efec9234ee72aa87055cfdc
Author: Joseph K. Bradley <[email protected]>
Date: 2014-10-07T19:25:45Z
more udts...
commit 8bebf24ad16f63034cb049c718e3d1b6070eea80
Author: Joseph K. Bradley <[email protected]>
Date: 2014-10-07T22:51:07Z
commented out convertRowToScala for debugging
commit 273ac9627b4acaced95521dc9ce2f1ef0eab7305
Author: Joseph K. Bradley <[email protected]>
Date: 2014-10-08T02:22:10Z
basic UDT is working, but deserialization has yet to be done
commit 39f870732aedc70dc7b6f7509f3c18c2dc1964c6
Author: Joseph K. Bradley <[email protected]>
Date: 2014-10-08T02:31:59Z
removed old udt suite
commit 04303c9b1c179b6bb08b7e0c5987ebffadc65c92
Author: Joseph K. Bradley <[email protected]>
Date: 2014-10-09T19:39:44Z
udts
commit 50f97269654b859f1babdef526c9fdebb9fa78f2
Author: Joseph K. Bradley <[email protected]>
Date: 2014-10-09T20:09:15Z
udts
commit 893ee4cacefcfd8d6516481bc15166a6f3aced60
Author: Joseph K. Bradley <[email protected]>
Date: 2014-10-09T21:18:41Z
udt finallly working
commit 964b32e532c2949976d764239298064dea9c5081
Author: Joseph K. Bradley <[email protected]>
Date: 2014-10-09T21:48:33Z
some cleanups
commit fea04af0cf855149c6bed75792942ed6081e1995
Author: Joseph K. Bradley <[email protected]>
Date: 2014-10-09T21:56:09Z
more cleanups
commit b226b9e56687bbd31b3ea2cdd0c4dc0e9609fb8e
Author: Joseph K. Bradley <[email protected]>
Date: 2014-10-10T17:33:37Z
Changing UDT to annotation
commit 357903547e731bfc1a15ead6fa8903737c547316
Author: Joseph K. Bradley <[email protected]>
Date: 2014-10-10T18:53:27Z
udt annotation now working
commit 2f40c02a8891e3466fdf1cbb10120cc17b961b96
Author: Joseph K. Bradley <[email protected]>
Date: 2014-10-10T20:13:35Z
renamed UDT types
commit e1f7b9cd053b53550a23909bd5a9ace5074a9066
Author: Joseph K. Bradley <[email protected]>
Date: 2014-10-10T21:02:32Z
blah
commit 34a5831eb28b1422b1562f256e7be1f290a70c40
Author: Joseph K. Bradley <[email protected]>
Date: 2014-10-10T22:14:29Z
Added MLlib dependency on SQL.
commit cd60cb48d36142a152cd02a263212f5c041e6c23
Author: Joseph K. Bradley <[email protected]>
Date: 2014-10-21T18:57:14Z
Trying to get other SQL tests to run
commit dff99d6b29b02b33be24dae00d8da7122e0d7d2f
Author: Joseph K. Bradley <[email protected]>
Date: 2014-10-22T02:26:41Z
Added UDTs for Vectors in MLlib, plus DatasetExample using the UDTs
commit 85872f6e2fbb2385793b645a629ed26ee2e98cbc
Author: Michael Armbrust <[email protected]>
Date: 2014-10-23T21:17:55Z
Allow schema calculation to be lazy, but ensure its available on executors.
commit f025035b77b6fa21a3cda3f05f2875895013bdfa
Author: Joseph K. Bradley <[email protected]>
Date: 2014-10-24T00:46:52Z
Cleanups before PR. Added new tests
commit 51e5282c346d58c9e433c90d019fe35825a2fec1
Author: Joseph K. Bradley <[email protected]>
Date: 2014-10-24T18:10:21Z
fixed 1 test
commit 63626a4f2a62e03f22c9b0bc453b754ef5858988
Author: Joseph K. Bradley <[email protected]>
Date: 2014-10-24T18:25:44Z
Updated ScalaReflectionsSuite per @marmbrus suggestions
commit 759af7ac349579967ea5929f5b019097318c28ee
Author: Joseph K. Bradley <[email protected]>
Date: 2014-10-27T21:25:50Z
Added more doc to UserDefineType
commit db16139ba5030fc0e6c4455a3adc1abd853211a5
Author: Joseph K. Bradley <[email protected]>
Date: 2014-10-28T01:05:34Z
Added more doc for UserDefinedType. Removed unused code in Suite
commit cfbc3215332571a4cb033a27de495d9865c8a4dc
Author: Xiangrui Meng <[email protected]>
Date: 2014-10-28T10:44:11Z
support UDT in parquet
commit 3143ac304a9cbb81e766d915863d343cd81dc673
Author: Xiangrui Meng <[email protected]>
Date: 2014-10-28T19:19:10Z
remove unnecessary changes
commit 87264a5aa500f3d44c3a806893fcc9df6b5e0e90
Author: Xiangrui Meng <[email protected]>
Date: 2014-10-28T19:20:06Z
remove debug code
commit 4500d8a3bd04c515af29fa42d803ee29e415e8a8
Author: Xiangrui Meng <[email protected]>
Date: 2014-10-28T19:25:57Z
update example code
----