zeroshade commented on code in PR #34454:
URL: https://github.com/apache/arrow/pull/34454#discussion_r1128364805


##########
go/arrow/datatype_extension.go:
##########
@@ -122,7 +124,9 @@ type ExtensionType interface {
        // If the storage type is incorrect or something else is invalid with 
the data this should
        // return nil and an appropriate error.
        Deserialize(storageType DataType, data string) (ExtensionType, error)
-
+       // this should return array.Builder interface but we cannot import due 
to cycle import, so we use 
+       // interface{} instead. At least for 
+       NewBuilder(mem memory.Allocator, dt ExtensionType) interface{}

Review Comment:
   For the second point: Note that you had to add the `NewBuilder` method to 
the existing extension types. By adding a new method to the interface, anyone 
who has their own extension types defined that doesn't already have this method 
defined will have a compile error when they upgrade to this version of Arrow 
which adds the method because their structs will no longer meet the interface 
for `ExtensionType`.
   
   >  But actually I think this should 1) provide better performance, no 
reflection. 2) much better type safety, and better developer experience as it 
is clear what should be the type of this function (I also wanted to return 
Builder but had to fallback to interface due to some cyclic import that didn't 
want to fix part of this PR).
   
   In general, I wouldn't expect creating a builder to be a performance 
bottleneck as consumers shouldn't create builders repeatedly. That said, a 
couple ideas I've had so far:
   
   * Like was previously done for Arrays, we could shift the definition of the 
`Builder` interface to the `arrow` package directly and then add an alias in 
the `array` package to ensure we don't break any consumers (with a deprecated 
message telling people to point at `arrow.Builder` instead.
   * If we're going in this direction, rather than passing the allocator and 
the extension type, we should pass an `ExtensionBuilder` to the method and have 
this just wrap it and return the wrapped builder. The consumer can also 
retrieve the extension type and the allocator from the builder directly if they 
need to. So perhaps something like `WrapBuilder(ExtensionBuilder) Builder`. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to