I think the biggest blocker to doing this is the way we pass extension types through IPC. An extension type is sent as its underlying storage type, with two specific key-value pairs in the field's metadata: "ARROW:extension:name" and "ARROW:extension:metadata". Since the metadata can't hold multiple values for the same key, you can't define an extension type in terms of another extension type: there would be nowhere to put the metadata for the second-level extension.
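To make the collision concrete, here is a minimal sketch that models a field's metadata as a plain Python dict. The dict layout and the `wrap_as_extension` helper are hypothetical, made up for illustration; this is not the pyarrow API or the actual IPC code path. The type names "JSON" and "HLLSKETCH" are just the ones used in this thread.

```python
# Toy model: field metadata as a string-to-string map (an assumption for
# illustration only, not how any Arrow library represents it internally).
EXT_NAME_KEY = "ARROW:extension:name"
EXT_META_KEY = "ARROW:extension:metadata"


def wrap_as_extension(field, name, ext_metadata=""):
    """Return (new_field, overwritten_name) for a hypothetical field dict.

    `field` looks like {"storage": <core type name>, "metadata": {...}}.
    `overwritten_name` is the extension name that was already present, if any.
    """
    metadata = dict(field["metadata"])  # copy so the input is not mutated
    overwritten_name = metadata.get(EXT_NAME_KEY)
    metadata[EXT_NAME_KEY] = name  # only one slot exists for this key
    metadata[EXT_META_KEY] = ext_metadata
    return {"storage": field["storage"], "metadata": metadata}, overwritten_name


# A JSON extension type stored as a core utf8 string: this works.
utf8_field = {"storage": "utf8", "metadata": {}}
json_field, _ = wrap_as_extension(utf8_field, "JSON")

# Trying to layer HLLSKETCH on top of JSON replaces the first name.
hll_field, overwritten_name = wrap_as_extension(json_field, "HLLSKETCH")

print(hll_field["metadata"][EXT_NAME_KEY])  # HLLSKETCH
print(overwritten_name)                     # JSON
```

After the second call, the only thing left on the wire is utf8 storage tagged as HLLSKETCH; a reader has no way to tell that JSON was ever part of the definition.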
Concretely, you'd be able to have "ARROW:extension:name" => "HLLSKETCH", but you wouldn't be able to *also* have "ARROW:extension:name" => "JSON" for its storage type. So the storage type needs to be a valid core Arrow data type.

On Tue, Apr 30, 2024 at 10:16 AM Ian Cook <ianmc...@apache.org> wrote:

> The vote on adding a JSON canonical extension type [1] got me wondering: Is
> it possible to define an extension type that is based on a canonical
> extension type? If so, how?
>
> For example, say I wanted to define a (non-canonical) HLLSKETCH extension
> type that corresponds to the type that Redshift uses for HyperLogLog
> sketches and is represented as JSON [2]. Is there a way to do this by
> building on the JSON canonical extension type?
>
> [1] https://lists.apache.org/thread/4dw3dnz6rjp5wz2240mn299p51d5tvtq
> [2] https://docs.aws.amazon.com/redshift/latest/dg/r_HLLSKTECH_type.html
>
> Ian