adriangb commented on issue #7715: URL: https://github.com/apache/arrow-rs/issues/7715#issuecomment-3058466627
> What should the "path" argument be? A String? A JSON path? Some structured thing (`Vec`)?

I expect something structured (maybe `Vec<VariantPathSegment>` would make sense, if possible), since this will be called repeatedly and parsing the string for every batch is overhead. But it may not be possible today with DataFusion APIs to use a kernel like this. In the worst case DataFusion parses the string into a structured path for every batch; if we ever introduce something like `PhysicalExpr::optimize` into DataFusion, it could maybe handle that.

> Should we also provide a "requested data type" field? Similar to the Databricks function

I suspect yes: when we parse the binary un-shredded data, I'm guessing we'll at some point have some JSON-typed thing (number, string, object, array). What we do in https://github.com/datafusion-contrib/datafusion-functions-json is rewrite `get_field(col, 'a.b')::int` into `get_field_as_int(col, 'a.b')` or, I guess, if it were a parameter, `get_field(col, 'a.b', 'Int32')`.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
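To make the "parse once, reuse for every batch" idea concrete, here is a minimal sketch of what a structured path could look like. Only the name `VariantPathSegment` comes from the comment above; the enum variants, the `parse_path` helper, and the `a.b[0]` path syntax are assumptions made for illustration and are not an existing arrow-rs or DataFusion API.

```rust
/// One step in a path into a variant value.
/// Hypothetical definition: the thread only proposes the name.
#[derive(Debug, Clone, PartialEq)]
pub enum VariantPathSegment {
    /// Look up a named field of an object, e.g. `a` in `a.b`.
    Field(String),
    /// Look up an array element by position, e.g. `0` in `a[0]`.
    Index(usize),
}

/// Parse a dotted path such as `a.b[0]` into segments (assumed syntax).
/// Intended to run once, with the result reused for every batch.
pub fn parse_path(path: &str) -> Result<Vec<VariantPathSegment>, String> {
    let mut segments = Vec::new();
    for part in path.split('.') {
        // Split `name[0][1]` into the field name and the bracketed indexes.
        let (name, mut rest) = match part.find('[') {
            Some(pos) => part.split_at(pos),
            None => (part, ""),
        };
        if name.is_empty() {
            return Err(format!("empty field name in path {path:?}"));
        }
        segments.push(VariantPathSegment::Field(name.to_string()));

        // Consume zero or more `[index]` suffixes.
        while !rest.is_empty() {
            if !rest.starts_with('[') {
                return Err(format!("unexpected text after ']' in path {path:?}"));
            }
            let close = rest
                .find(']')
                .ok_or_else(|| format!("missing ']' in path {path:?}"))?;
            let index = rest[1..close]
                .parse::<usize>()
                .map_err(|e| format!("bad index in path {path:?}: {e}"))?;
            segments.push(VariantPathSegment::Index(index));
            rest = &rest[close + 1..];
        }
    }
    Ok(segments)
}
```

With something along these lines, a caller would run `parse_path` once when the expression is planned and hand the resulting `&[VariantPathSegment]` to the per-batch kernel, so the string is never re-parsed on the hot path.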
