raulcd commented on issue #49368: URL: https://github.com/apache/arrow/issues/49368#issuecomment-3944380857
Oh, I understand now! Columns that were previously read from Parquet as "strings" are now read as "string_view", and there are no kernels for those. Got it. This is probably due to:
- https://github.com/apache/arrow/issues/43041

Do you control how those Parquet files are read, or is deltalake managing it?

There's a new option, `binary_type`, which should drive this behavior: set `opts.binary_type = pa.binary()` instead of `opts.binary_type = pa.binary_view()`. This should restore the previous behavior unless, from what I understand, the files are stored with an Arrow schema that says those columns are `string view`.

@pitrou is the above correct, i.e. does the new `binary_type` option also drive the string behavior (not only binary columns)?

The real solution would be to provide kernels, though.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
