paleolimbot commented on issue #18223:
URL: https://github.com/apache/datafusion/issues/18223#issuecomment-3439065685

   Thank you for writing this up, and thank you @tobixdev for 
https://github.com/apache/datafusion/pull/15106 and the reviews along the way 
(I think that PR hits the mark in a number of ways!)
   
   I wonder if a low-impact and reasonably satisfying way to start would be:
   
   ```rust
   // Maybe the lifetimes will be too annoying here, but the idea is that this can
   // cheaply represent any of the combinations of
   // ArrayRef/FieldMetadata/FieldRef/ScalarValue/DataType
   // without cloning anything. This is basically nanoarrow's ArrowSchemaView :)
   pub struct SerializedTypeView<'a, 'b, 'c> {
       arrow_type: &'a DataType,
       extension_name: Option<&'b str>,
       extension_metadata: Option<&'c str>,
   }
   
   pub trait TypeExtensions {
       // None means just use the storage type implementation.
       // Maybe Box<> or &'static could work here
       fn pretty_print_extension(&self, extension_name: &str) -> Option<Arc<dyn PrettyPrintExtension>> { None }
       // ...a future version could also return an
       // Option<Arc<dyn CustomOrdering>> from #18124
       // ...and in some magical future maybe we can just do
       fn logical_type(&self, type_view: &SerializedTypeView) -> Result<Arc<dyn ThePerfectLogicalTypeTrait>>;
       // ...where the LogicalType knows how to create LogicalArray/Scalars
       // that can do this stuff on their own
       // with convenient APIs so people actually use them.
   }
   
   pub trait PrettyPrintExtension {
       // Probably there's a more established pretty print API.
       fn pretty_print_serialized(&self, type_view: &SerializedTypeView, storage: &ArrayRef, options_of_some_kind) -> Result<ArrayRef>;
       // ...a separate trait allows this API to evolve without affecting
       // TypeExtensions
   }
   ```
   
   It might be particularly satisfying to implement that for variant so that 
queries against the test files in the CLI show nice pretty JSON. It would be 
less satisfying, but possibly easier, to implement that for UUID. It also means 
we don't have to come up with `ThePerfectLogicalTypeTrait` first: that is hard, 
and right now we represent type information in a lot of different ways.
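
To make the UUID case concrete, here is a rough, std-only sketch of how the lookup and printing could fit together. Everything in it is a simplified stand-in rather than a DataFusion or arrow-rs API: `DataType` is a two-variant placeholder enum, `SerializedTypeView` uses a single lifetime, the hypothetical `pretty_print_value` formats one stored value as a `String` instead of mapping an `ArrayRef`, and `SimpleRegistry`/`UuidPrinter` are invented names. The only external facts assumed are that the canonical extension name is `arrow.uuid` and that its storage is `FixedSizeBinary(16)`.

```rust
use std::collections::HashMap;
use std::sync::Arc;

// Placeholder for arrow's DataType; only the variants this sketch needs.
#[derive(Debug, PartialEq)]
pub enum DataType {
    FixedSizeBinary(i32),
    Utf8,
}

// Borrowed view of a type; one lifetime here to keep the sketch short.
pub struct SerializedTypeView<'a> {
    pub arrow_type: &'a DataType,
    pub extension_name: Option<&'a str>,
    pub extension_metadata: Option<&'a str>,
}

pub trait PrettyPrintExtension {
    // Hypothetical simplification: format a single stored value.
    fn pretty_print_value(
        &self,
        type_view: &SerializedTypeView<'_>,
        storage: &[u8],
    ) -> Result<String, String>;
}

pub trait TypeExtensions {
    // None means "fall back to the storage type's formatting".
    fn pretty_print_extension(
        &self,
        _extension_name: &str,
    ) -> Option<Arc<dyn PrettyPrintExtension>> {
        None
    }
}

// Formats 16 storage bytes as the usual 8-4-4-4-12 hex groups.
pub struct UuidPrinter;

impl PrettyPrintExtension for UuidPrinter {
    fn pretty_print_value(
        &self,
        type_view: &SerializedTypeView<'_>,
        storage: &[u8],
    ) -> Result<String, String> {
        if *type_view.arrow_type != DataType::FixedSizeBinary(16) || storage.len() != 16 {
            return Err("arrow.uuid expects FixedSizeBinary(16) storage".to_string());
        }
        let hex: String = storage.iter().map(|b| format!("{:02x}", b)).collect();
        Ok(format!(
            "{}-{}-{}-{}-{}",
            &hex[0..8],
            &hex[8..12],
            &hex[12..16],
            &hex[16..20],
            &hex[20..32]
        ))
    }
}

// Invented registry: maps extension names to printers.
pub struct SimpleRegistry {
    printers: HashMap<String, Arc<dyn PrettyPrintExtension>>,
}

impl SimpleRegistry {
    pub fn with_uuid() -> Self {
        let mut printers: HashMap<String, Arc<dyn PrettyPrintExtension>> = HashMap::new();
        printers.insert("arrow.uuid".to_string(), Arc::new(UuidPrinter));
        SimpleRegistry { printers }
    }
}

impl TypeExtensions for SimpleRegistry {
    fn pretty_print_extension(
        &self,
        extension_name: &str,
    ) -> Option<Arc<dyn PrettyPrintExtension>> {
        self.printers.get(extension_name).cloned()
    }
}
```

A caller would look up the printer by `extension_name` from the view and fall back to the storage formatting when the lookup returns `None`, which is the behavior the default method on `TypeExtensions` is meant to give for free.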


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

