Hi,

First time posting here, so apologies if this topic belongs elsewhere.

I'm the author of RasterFrames, and a contributor to GeoMesa's Spark SQL
module. Both make use of fairly low-level Catalyst constructs, including
custom UDTs: RasterFrames introduces a geospatial raster type, and GeoMesa
a geometry type.

To make this work we've circumvented the [`package private`](
https://bit.ly/3pr0fVv) restriction on `UDTRegistration` by inserting
sibling classes into Spark's package namespace. It's a hack that works fine
on JVM 8, but it violates the [much more restrictive](https://bit.ly/3aadO5g)
module constructs in JVM 9+.
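
For anyone unfamiliar with the workaround, here is a minimal sketch of the
idea. It is not the actual RasterFrames or GeoMesa code: the package
`org.apache.spark.sql.shim`, the object name `UDTRegistrationShim`, and the
`com.example` class names in the usage note are hypothetical. It assumes
`UDTRegistration` exposes `register(userClass, udtClass)` and
`exists(userClassName)`, both taking fully qualified class names.

```scala
// Lives in *our* source tree, but is declared under org.apache.spark so the
// Scala compiler treats it as being inside the `spark` package and therefore
// allows access to private[spark] members.
package org.apache.spark.sql.shim // hypothetical package name

import org.apache.spark.sql.types.UDTRegistration

/** Hypothetical public facade that forwards to the package-private registry. */
object UDTRegistrationShim {

  /** Associates a user class with its UDT, both given as fully qualified names. */
  def register(userClass: String, udtClass: String): Unit =
    UDTRegistration.register(userClass, udtClass)

  /** Reports whether a UDT has been registered for the given user class name. */
  def exists(userClass: String): Boolean =
    UDTRegistration.exists(userClass)
}
```

Library code outside Spark's namespace would then call something like
`UDTRegistrationShim.register("com.example.Tile", "com.example.TileUDT")`.
Because the shim and Spark's own classes share the `org.apache.spark`
package prefix across different jars, this is the kind of arrangement the
JVM 9+ module system objects to.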

We've been monitoring [SPARK-7768](
https://issues.apache.org/jira/browse/SPARK-7768) (filed in 2015) and its
[associated PR](https://github.com/apache/spark/pull/16478) for years now,
but it keeps getting kicked down the road(map).

As authors of open source systems we completely understand how and why this
happens, but we are at a critical juncture in our projects' lifecycle,
anchored to JVM 8 while other systems have moved on to later versions. We'd
also like to enjoy the benefits of later JVMs.

So I'm here to ask: how might I, and others who critically need public
access to `UDTRegistration`, better advocate for it?

I think (but am not 100% sure) the PR linked above is more extensive than
what we need, since it also addresses usability around Encoders, for which we
have our own type class solution. My assumption to date has been that all we
need is line 32 of `UDTRegistration` deleted (if there's folly therein,
please say so!).
While I understand a reluctance to promote `UDTRegistration` to `public`, I
note that it has not been changed since 2016, perhaps a good indicator that
the API is stable enough. Marking it as `@Experimental` could be a
compromise option.

Thanks for reading this far and giving this consideration. Any and all
advice is appreciated.

Simeon (@metasim)


-- 
Simeon Fitch
Co-founder & VP of R&D
Astraea, Inc.
