alamb commented on issue #6692: URL: https://github.com/apache/arrow-rs/issues/6692#issuecomment-2462548490
@tustvold

> I'm not sure the proposed builder API makes sense, as the typing for nested types like ListBuilder and DictionaryBuilder is not what we want here, and they can't easily be type erased. We also ideally want to avoid overly bloating the arrow-array crate with kernel logic.

This is reasonable -- though I could imagine adding type-erased builders like `DynListBuilder` for this use case.

> This isn't even touching on the fact these kernels don't use the builders for performance reasons.

Is there some fundamental reason the builders can't be made faster? If we could make the builders fast enough to use for filter, that would seem to be valuable in its own right. But I am likely just dreaming here.

> The only remaining challenge concerns dictionaries, as the output dictionary needs to be computed up front. Simply not supporting dictionaries could potentially be a valid workaround though.

A builder-based approach could help (e.g. optimize for the case where the input batches had the same dictionary and handle the case where they didn't, either via deferred computation, on the fly, or something else).
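To make the dictionary idea concrete, here is a minimal std-only Rust sketch of the "same dictionary fast path, remap on the fly otherwise" strategy. It does not use any arrow-rs API: `DictBatch` and `DictConcatBuilder` are hypothetical names invented for illustration, dictionary values are plain `String`s, nulls are ignored, and "same dictionary" is assumed to mean "same shared allocation" (checked with `Rc::ptr_eq`).

```rust
use std::collections::HashMap;
use std::rc::Rc;

/// Hypothetical stand-in for a dictionary-encoded batch:
/// `keys[i]` is an index into `values`.
struct DictBatch {
    values: Rc<Vec<String>>,
    keys: Vec<usize>,
}

/// Hypothetical builder that concatenates dictionary-encoded batches
/// into one output dictionary plus one output key buffer.
#[derive(Default)]
struct DictConcatBuilder {
    /// Output dictionary, in first-seen order.
    values: Vec<String>,
    /// Lookup from value to its key in the output dictionary.
    index: HashMap<String, usize>,
    /// Output keys, indexing into `values`.
    keys: Vec<usize>,
    /// Dictionary of the most recent batch and its input-key -> output-key mapping.
    last: Option<(Rc<Vec<String>>, Vec<usize>)>,
}

impl DictConcatBuilder {
    fn append(&mut self, batch: &DictBatch) {
        // Fast path: the batch shares its dictionary allocation with the
        // previous batch, so the cached key mapping can be reused as-is.
        let reuse = matches!(&self.last, Some((d, _)) if Rc::ptr_eq(d, &batch.values));

        if !reuse {
            // Slow path: intern every dictionary value on the fly and
            // record where each input key lands in the output dictionary.
            let mut mapping = Vec::with_capacity(batch.values.len());
            for v in batch.values.iter() {
                let next = self.values.len();
                let key = *self.index.entry(v.clone()).or_insert(next);
                if key == next {
                    // The value was not present before, so extend the output dictionary.
                    self.values.push(v.clone());
                }
                mapping.push(key);
            }
            self.last = Some((Rc::clone(&batch.values), mapping));
        }

        // Translate the batch's keys through the (cached or fresh) mapping.
        let mapping = &self.last.as_ref().unwrap().1;
        self.keys.extend(batch.keys.iter().map(|&k| mapping[k]));
    }
}
```

Appending two batches that share one `Rc` dictionary and then a third with a different dictionary interns the shared values only once; only the third batch pays for rebuilding the mapping. A deferred variant could instead buffer the batches and compute the output dictionary in a single pass at finish time.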
