kflansburg commented on pull request #8971:
URL: https://github.com/apache/arrow/pull/8971#issuecomment-751163846


   @nevi-me, looking at PyArrow's implementation of extension types, it 
appears that they create a new type (`UuidType`) which wraps the underlying 
array, much like I'm doing here. Based on this, I think the following would 
match the spirit of the Extension specification: 
   
   1. Add a default-empty metadata map to `Field`. 
   1. Possibly define an `ExtensionType: Array` trait which, at the very 
least, outputs the required metadata.
   1. Move `JSONArray` to a new module called `extension` and rename to 
`JSONType`. 
   1. Ensure that `Field`s for `JSONArray`s have `ARROW:extension:name=json` 
and `ARROW:extension:metadata=` (empty). 
   1. Update the various pieces of Arrow reading code to capture `Field` 
metadata and produce `JSONType` when appropriate. 
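
   To make the steps above concrete, here is a minimal, self-contained sketch. 
It is not the arrow crate's API: `Field` is a hypothetical stand-in reduced to a 
name and a metadata map, the `Array` supertrait is dropped so the example 
compiles on its own, and the method names (`extension_name`, 
`extension_metadata`, `field`, `is_json_field`) are assumptions made for 
illustration. Only the two `ARROW:extension:*` keys come from the format 
specification.

```rust
use std::collections::BTreeMap;

// Metadata keys the Arrow columnar format reserves for extension types.
const EXTENSION_NAME_KEY: &str = "ARROW:extension:name";
const EXTENSION_METADATA_KEY: &str = "ARROW:extension:metadata";

/// Hypothetical stand-in for `Field` with a default-empty metadata map (step 1).
#[derive(Debug, Clone, Default, PartialEq)]
pub struct Field {
    pub name: String,
    pub metadata: BTreeMap<String, String>,
}

/// Hypothetical trait (step 2). The proposal is `ExtensionType: Array`;
/// the supertrait is omitted here to keep the sketch self-contained.
pub trait ExtensionType {
    /// Value stored under `ARROW:extension:name`.
    fn extension_name(&self) -> &str;

    /// Value stored under `ARROW:extension:metadata`; empty by default.
    fn extension_metadata(&self) -> String {
        String::new()
    }

    /// Build a `Field` carrying the required extension metadata (step 4).
    fn field(&self, name: &str) -> Field {
        let mut metadata = BTreeMap::new();
        metadata.insert(
            EXTENSION_NAME_KEY.to_string(),
            self.extension_name().to_string(),
        );
        metadata.insert(
            EXTENSION_METADATA_KEY.to_string(),
            self.extension_metadata(),
        );
        Field { name: name.to_string(), metadata }
    }
}

/// Hypothetical `JSONType` (step 3); the wrapped array is left out.
pub struct JSONType;

impl ExtensionType for JSONType {
    fn extension_name(&self) -> &str {
        "json"
    }
}

/// Reader-side check (step 5): does this field's metadata mark it as JSON?
pub fn is_json_field(field: &Field) -> bool {
    field.metadata.get(EXTENSION_NAME_KEY).map(String::as_str) == Some("json")
}
```

   With this shape, `JSONType.field("payload")` yields a `Field` whose metadata 
holds both keys, and a reader could call `is_json_field` on each incoming 
`Field` to decide whether to produce a `JSONType`.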
   
   The remaining concern I have here is that there appear to be no other 
libraries implementing this (a JSON extension type), so interoperability seems 
unlikely. 
   
   Finally, you mention using `StringArray` as the underlying type. However, 
there are a number of use cases where a `BinaryArray` will be the input, and 
since `serde_json` supports parsing `&[u8]`, it would be nice to be able to 
skip the extra UTF-8 validation step. Thoughts?
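
   For reference, the validation step in question looks roughly like the 
sketch below, which uses only the standard library. `validate_utf8` is a 
hypothetical helper name, not part of Arrow or `serde_json`. A 
`StringArray`-backed type would need a check like this before raw input bytes 
could be treated as `&str`, whereas a byte-oriented entry point such as 
`serde_json::from_slice` accepts the `&[u8]` directly.

```rust
/// Hypothetical helper: the up-front check needed to view raw bytes as `&str`.
/// `std::str::from_utf8` scans the whole buffer and borrows it on success,
/// so the cost is one extra pass over every value, not a copy.
pub fn validate_utf8(bytes: &[u8]) -> Result<&str, std::str::Utf8Error> {
    std::str::from_utf8(bytes)
}
```

   Valid input such as `br#"{"a": 1}"#` comes back as a borrowed `&str`, while 
a buffer containing a stray `0xff` byte is rejected before any JSON parsing 
happens.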
   
   



