XiangpengHao commented on issue #5904: URL: https://github.com/apache/arrow-rs/issues/5904#issuecomment-2174392899
The current gc function does not deduplicate strings; it only uses GenericByteViewBuilder to create a new instance of the array. I think implementing the deduplication logic would be a great addition. A straightforward approach is to use a hash table to track the location of each string while building the GenericByteView. It is not at the top of my priority list, but I might give it a try when I have time.
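
To illustrate the hash-table idea, here is a minimal, std-only sketch. It is not the arrow-rs implementation: `View`, `DedupViewBuilder`, and `build_deduplicated` are hypothetical names, and the model assumes a single data buffer with views stored as (offset, length) pairs. It also ignores the inlining of short strings that real view arrays perform.

```rust
use std::collections::HashMap;

/// Simplified stand-in for a view: where a value lives in the shared data buffer.
/// (Hypothetical type; real view arrays use a packed 16-byte layout.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct View {
    offset: u32,
    len: u32,
}

/// Hypothetical model of a view builder that deduplicates values.
#[derive(Default)]
struct DedupViewBuilder {
    /// Shared buffer holding each distinct value exactly once.
    data: Vec<u8>,
    /// One view per appended value, in append order.
    views: Vec<View>,
    /// Hash table tracking the location of every value already stored.
    seen: HashMap<Vec<u8>, View>,
}

impl DedupViewBuilder {
    /// Append a value, reusing the existing bytes if the value was seen before.
    fn append(&mut self, value: &[u8]) {
        if let Some(&view) = self.seen.get(value) {
            // Duplicate: push another view pointing at the same bytes.
            self.views.push(view);
            return;
        }
        // First occurrence: copy the bytes and remember their location.
        let view = View {
            offset: self.data.len() as u32,
            len: value.len() as u32,
        };
        self.data.extend_from_slice(value);
        self.seen.insert(value.to_vec(), view);
        self.views.push(view);
    }

    /// Read back the i-th value through its view.
    fn value(&self, i: usize) -> &[u8] {
        let v = self.views[i];
        &self.data[v.offset as usize..(v.offset + v.len) as usize]
    }
}

/// Build a deduplicated model array from a slice of strings.
fn build_deduplicated(values: &[&str]) -> DedupViewBuilder {
    let mut builder = DedupViewBuilder::default();
    for v in values {
        builder.append(v.as_bytes());
    }
    builder
}
```

In this sketch, appending "apple", "banana", "apple" stores the bytes of "apple" only once, and the first and third views are identical. The trade-off is the extra memory and hashing cost of the table, which is why deduplication would probably be opt-in rather than the default for gc.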
