I think the biggest blocker here is the way we pass extension types through
IPC. An extension type is sent as its underlying storage type, with two
specific key-value pairs in the field metadata: "ARROW:extension:name" and
"ARROW:extension:metadata". Since you can't have multiple values for the
same key in the metadata, you can't define an extension type in terms of
another extension type: there would be nowhere to put the metadata for the
second-level extension.

For example, you could have "ARROW:extension:name" => "HLLSKETCH", but you
couldn't *also* have "ARROW:extension:name" => "JSON" for its storage type.
For this reason, the storage type needs to be a valid core Arrow data type.
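
To make the collision concrete, here is a rough sketch in plain Python. It
does not use any Arrow library: the dict is only a stand-in for a field's
metadata, encode_field is a hypothetical helper, and "utf8" as the storage
type for JSON is my assumption.

```python
# The two reserved metadata keys used for extension types over IPC.
EXT_NAME_KEY = "ARROW:extension:name"
EXT_META_KEY = "ARROW:extension:metadata"


def encode_field(storage_type, extension_layers):
    """Hypothetical helper: flatten extension layers onto one metadata map.

    extension_layers is a list of (name, serialized_metadata) tuples,
    innermost layer first. Each layer writes the same two keys, so a
    later layer overwrites the earlier one.
    """
    metadata = {}
    for name, ext_metadata in extension_layers:
        metadata[EXT_NAME_KEY] = name
        metadata[EXT_META_KEY] = ext_metadata
    return {"storage_type": storage_type, "metadata": metadata}


# HLLSKETCH layered on JSON, layered on an assumed utf8 storage type.
field = encode_field("utf8", [("JSON", ""), ("HLLSKETCH", "")])
```

Only the outer layer survives: field["metadata"] ends up with a single
"ARROW:extension:name" entry, "HLLSKETCH", and a reader has no way to tell
that a JSON layer was ever intended.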

On Tue, Apr 30, 2024 at 10:16 AM Ian Cook <ianmc...@apache.org> wrote:

> The vote on adding a JSON canonical extension type [1] got me wondering: Is
> it possible to define an extension type that is based on a canonical
> extension type? If so, how?
>
> For example, say I wanted to define a (non-canonical) HLLSKETCH extension
> type that corresponds to the type that Redshift uses for HyperLogLog
> sketches and is represented as JSON [2]. Is there a way to do this by
> building on the JSON canonical extension type?
>
> [1] https://lists.apache.org/thread/4dw3dnz6rjp5wz2240mn299p51d5tvtq
> [2] https://docs.aws.amazon.com/redshift/latest/dg/r_HLLSKTECH_type.html
>
> Ian
>
