wjones127 commented on issue #4472:
URL: https://github.com/apache/arrow-rs/issues/4472#issuecomment-1615258370

   > I might be missing something here, but why would it be lost, schema 
metadata should roundtrip over C data interface?
   
   This works well for `RecordBatch`, but not for an individual array transported 
independently of any batch. Basically, arrays themselves have no way to be 
tagged as an extension array, since they don't contain a field where that 
metadata could be stored; they are only extension arrays in the context of a batch.
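
   To make the problem concrete, here is a minimal sketch using simplified stand-in types, not the actual arrow-rs structs. `Field`, `Array`, `uuid_column`, `send_array_alone`, and the extension name `example.uuid` are all hypothetical; only the metadata key `ARROW:extension:name` comes from the Arrow format.

```rust
use std::collections::HashMap;

// Simplified stand-ins for illustration only; these are NOT the arrow-rs types.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int32,
    FixedSizeBinary(i32),
}

// An array only knows its storage (physical) type.
#[derive(Debug, Clone, PartialEq)]
struct Array {
    data_type: DataType,
    len: usize,
}

// A field carries the key/value metadata, including the extension tag.
#[allow(dead_code)]
#[derive(Debug, Clone)]
struct Field {
    name: String,
    data_type: DataType,
    metadata: HashMap<String, String>,
}

// Metadata key the Arrow format uses for the extension type name.
const EXTENSION_NAME_KEY: &str = "ARROW:extension:name";

// Returns the extension name recorded on the field, if any.
fn extension_name(field: &Field) -> Option<&str> {
    field.metadata.get(EXTENSION_NAME_KEY).map(String::as_str)
}

// Builds a hypothetical UUID column: a tagged field plus its storage array.
fn uuid_column() -> (Field, Array) {
    let mut metadata = HashMap::new();
    metadata.insert(EXTENSION_NAME_KEY.to_string(), "example.uuid".to_string());
    let field = Field {
        name: "id".to_string(),
        data_type: DataType::FixedSizeBinary(16),
        metadata,
    };
    let array = Array {
        data_type: DataType::FixedSizeBinary(16),
        len: 3,
    };
    (field, array)
}

// Passing the array on without its field drops the only place the tag lived.
fn send_array_alone(_field: Field, array: Array) -> Array {
    array
}
```

   After `send_array_alone`, the receiver sees only `FixedSizeBinary(16)`; nothing on the array says it was a UUID.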
   
   > I feel quite strongly that only codepaths explicitly concerned with 
extension types should need concern themselves with them, for example the take 
or arithmetic kernel should not need to know about extension types.
   
   I definitely agree, and don't want to make these operations more complex 
than they ought to be. 
   
   If we can think of another place to put this information, I'm open to that.
   
   (A bit of a tangent, but...) In my ideal world, there would be a logical 
type enum and a physical type enum. Physical types would be the current 
`DataType`. Then logical types would be things like `String` (just one, 
regardless of offset size and encoding) and then a generic `ExtensionType` 
variant. Sort of like what Sasha was talking about a long time ago: 
https://lists.apache.org/thread/357z4587dczho4x1257ttf0b4o9302co
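
   A rough sketch of what that split could look like, purely as an illustration. `PhysicalType`, `LogicalType`, and `logical_type` are hypothetical names, and the variants are a small made-up subset, not a proposal for the real API.

```rust
// Hypothetical sketch; none of these types exist in arrow-rs.

// Physical types: roughly what `DataType` describes today (tiny subset).
#[derive(Debug, Clone, PartialEq)]
enum PhysicalType {
    Int32,
    Utf8,
    LargeUtf8,
    FixedSizeBinary(i32),
}

// Logical types: what the values mean, independent of layout.
#[derive(Debug, Clone, PartialEq)]
enum LogicalType {
    Int32,
    // One string type, regardless of offset size.
    String,
    FixedSizeBinary(i32),
    // Generic extension variant wrapping a storage type.
    Extension { name: String, storage: PhysicalType },
}

// Derives a logical type from a physical type plus an optional extension name.
fn logical_type(physical: &PhysicalType, extension: Option<&str>) -> LogicalType {
    if let Some(name) = extension {
        return LogicalType::Extension {
            name: name.to_string(),
            storage: physical.clone(),
        };
    }
    match physical {
        PhysicalType::Int32 => LogicalType::Int32,
        PhysicalType::Utf8 | PhysicalType::LargeUtf8 => LogicalType::String,
        PhysicalType::FixedSizeBinary(width) => LogicalType::FixedSizeBinary(*width),
    }
}
```

   With something like this, kernels such as take could keep dispatching on the physical type alone, and only extension-aware code would look at the logical one.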

