nealrichardson commented on pull request #8365: URL: https://github.com/apache/arrow/pull/8365#issuecomment-726237202
What you describe (including using `GetView`) is essentially what we now have on master: https://github.com/apache/arrow/blob/master/r/src/array_to_vector.cpp#L290-L321

The difference is that we moved back to `Rf_mkCharLenCE`, which is much faster because the length is known, and which (correctly) errors when there are embedded nuls. It replaces `Rf_mkCharCE`, which IIUC was the reason for the performance regression in #8356 and which silently truncated at an embedded nul.

If `Rf_mkCharLenCE` is what is raising the "embedded nul in string" error, then we have the option of catching that and falling back to a slower `skipNul` path to strip the nuls. If the error comes from somewhere else later, we may need to consider other options.
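
To make the fast-path/fallback idea concrete, here is a rough, standalone C++ sketch. It does not call R or Arrow at all: `make_from_cstr`, `make_from_view`, `strip_nuls`, and `convert` are hypothetical names that only model the behaviors described above (truncating at a nul, erroring on a nul, and stripping nuls). In particular, the real "embedded nul in string" error is raised by R rather than thrown as a C++ exception, so the `try`/`catch` here is an assumption standing in for whatever mechanism would actually catch it.

```cpp
#include <algorithm>
#include <cstring>
#include <stdexcept>
#include <string>
#include <string_view>

// Hypothetical stand-in for a NUL-terminated constructor (Rf_mkCharCE style):
// the length comes from strlen, so the result stops at the first embedded nul.
std::string make_from_cstr(const char* data) {
  return std::string(data, std::strlen(data));
}

// Hypothetical stand-in for a length-aware constructor (Rf_mkCharLenCE style):
// the length comes from the view, and an embedded nul is reported as an error.
std::string make_from_view(std::string_view view) {
  if (view.find('\0') != std::string_view::npos) {
    throw std::runtime_error("embedded nul in string");
  }
  return std::string(view);
}

// Slower fallback (the skipNul idea): copy the bytes and drop every nul.
std::string strip_nuls(std::string_view view) {
  std::string out(view);
  out.erase(std::remove(out.begin(), out.end(), '\0'), out.end());
  return out;
}

// Try the fast path first; fall back only if it reports an embedded nul
// and the caller opted in via skip_nul. Otherwise the error propagates.
std::string convert(std::string_view view, bool skip_nul) {
  try {
    return make_from_view(view);
  } catch (const std::runtime_error&) {
    if (!skip_nul) throw;
    return strip_nuls(view);
  }
}
```

With a five-byte input such as `std::string_view("ab\0cd", 5)`, the `strlen`-based stand-in keeps only `ab`, while `convert(..., true)` goes through the fallback and keeps `abcd`. Strings without nuls never leave the fast path, so the extra cost is only paid on the (presumably rare) inputs that need it.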
